
HA DNS solution

 
UNIXGRUPPEN
Advisor



I'm about to set up a DNS service that requires high availability (max 4h downtime/year).
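For scale, the 4h/year budget can be converted into an availability percentage. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and a year is assumed to be 365 days.

```python
# Convert between a yearly downtime budget and an availability fraction.
# Assumption: a 365-day year (8760 hours), leap years ignored.
HOURS_PER_YEAR = 365 * 24


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year the service is up for a given downtime budget."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


def downtime_hours(avail: float) -> float:
    """Yearly downtime budget in hours for a given availability fraction."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # The requirement from this post: at most 4 hours of downtime per year.
    print(f"4h/year  -> {availability(4) * 100:.3f}% availability")
    # For comparison, the downtime budget of "five nines", in minutes.
    print(f"99.999%  -> {downtime_hours(0.99999) * 60:.2f} minutes/year")
```

So 4h/year is a bit better than "three nines" and far looser than "five nines".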

DNS is built to be somewhat redundant, with primary and secondary DNS servers, but client implementations lack the features needed for failover to be 'transparent' when the primary DNS server is down.

I.e., in MS operating systems the default failover timeout is 30 sec. It's adjustable, but the next lookup will start at the primary DNS server again and has to time out before trying the secondary.

So my idea was to cluster BIND (using MC/ServiceGuard) for DNS usage.
I read an HP whitepaper on setting up BIND in a ServiceGuard environment, but it only covered one BIND package on two cluster nodes.

My question is: is it possible (and smart?) to build two BIND packages to run on two cluster nodes, one on each?

Would it be a problem, in case of a failover, to run two BIND instances on the same node, even though they're bound to different network interfaces?

Is this the best way to solve the need for an HA DNS solution?

TIA,
Johan
9 REPLIES
John Waller
Esteemed Contributor

Re: HA DNS solution

Hi Johan,

I can't quite understand why you would want two DNS packages. Do you have to support two separate networks?

I personally don't believe it is possible to run two separate instances of named on the same server, as you will always have to reference the same master zone file (named.conf). The best I can suggest is that you create just one package to support all.

UNIXGRUPPEN
Advisor

Re: HA DNS solution

Hi John,

The main reason to run 1 named/bind instance in each node is to get some sort of load balancing.

The named.boot file will not be a problem since it's a modified version of bind (Netid) that'll be running.

And both dns packages will be located in a shared disk enclosure.

Since this doesn't seem to be a common solution, I'm quite curious how companies handle a DNS service that _always_ has to be up.

Which means that you cannot afford the 30sec timeout for the clients.

steven Burgess_2
Honored Contributor

Re: HA DNS solution

Hi

Please forgive me if I've got the wrong end of the stick here.

The whole reason for the primary and secondary is to have them both configured in /etc/resolv.conf. If you can't contact the primary server, you simply use the secondary?

Are you using the server for DHCP services also? If so, I would look at a product called QIP. You can switch services to the secondary node by killing the

root 16582 1 9 May 5 ? 19:52 /appl/qip52/usr/bin/dhcpd -f/appl/qip52/dhcp

process

I'm not exactly sure how this is configured; it looks like a simple listener-type agent/process.

HTH

Steve
take your time and think things through
Geoff Wild
Honored Contributor

Re: HA DNS solution

I too have looked at this.

The issue is the resolver - you can speed up the timeout in HP-UX by adding the following to resolv.conf:

retrans 2500
retry 2
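To see roughly what those two settings buy, here is a simplified model of the waiting time. It assumes `retrans` is in milliseconds, that the resolver walks the nameserver list in order, and that every attempt uses the same timeout; real resolvers may back off between rounds, so treat the numbers as estimates. The function names are hypothetical.

```python
# Simplified model of resolver waiting time. Assumptions (not taken from
# any resolver source): constant per-attempt timeout, servers tried in
# the order listed, `retrans` expressed in milliseconds.


def delay_until_answer_ms(retrans_ms: int, dead_before_live: int) -> int:
    """Wait before the first live server answers, when `dead_before_live`
    dead servers are listed ahead of it."""
    return retrans_ms * dead_before_live


def time_to_give_up_ms(retrans_ms: int, retry: int, servers: int) -> int:
    """Total wait before the lookup fails when every listed server is dead."""
    return retrans_ms * retry * servers


if __name__ == "__main__":
    # Values from the resolv.conf lines above: retrans 2500, retry 2.
    print(delay_until_answer_ms(2500, dead_before_live=1), "ms with a dead primary")
    print(time_to_give_up_ms(2500, retry=2, servers=2), "ms with both servers dead")
```

Under these assumptions a dead primary costs each lookup about 2.5 seconds, which is much better than 30 seconds but still not transparent.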

Even in a failover, there is still downtime while the package fails over...

The best solution for HA DNS is to go with something like Cisco Content Switches... this would give you five 9s...

http://www.cisco.com/en/US/products/hw/contnetw/ps792/products_data_sheet09186a008007ca3c.html

Basically, they are load-balancing/failover devices: put 2, 3, 4, or however many servers behind them, but point your clients at a single IP. If one box goes down, it is transparent to the clients...
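A device like that has to decide which backend servers are alive. The sketch below shows one way such a check could work, using only the Python standard library; it is not how any Cisco product does it. It sends a single A-record query over UDP and treats any matching reply as "alive". The function names and the default probe name `example.com` are assumptions for illustration.

```python
import socket
import struct


def build_query(name: str, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (RFC 1035) for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.strip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def is_alive(server: str, name: str = "example.com",
             timeout: float = 1.0, port: int = 53) -> bool:
    """Return True if `server` answers a DNS query within `timeout` seconds."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and "port unreachable" errors
            return False
    # A valid reply echoes the query ID and has the QR (response) bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)
```

A balancer would run something like `is_alive("192.0.2.10")` (a documentation address, used here as a placeholder) against each backend every few seconds and stop sending queries to servers that fail the check.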

Rgds...Geoff
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
Sachin Patel
Honored Contributor

Re: HA DNS solution

Hi Johan,

I am going to ask, like the others: why? But if you can do it, great.

Here is what we have.
We have hundreds of production systems, each accessing the others. We can't have our DNS down, and that is why we have one master and a number of slaves, and all systems have three secondary servers listed in resolv.conf.
If our DNS goes down then no one can access the internet, and the whole company will start crying.

Our servers are up for days and days. One of my DNS servers (serving only DNS) was up for 500 days before I had to shut it down for a move. Since then it has been up:

#uptime
9:05am up 99 days, 22:20, 1 user, load average: 0.00, 0.03, 0.06

They are not even high-end systems. All three major secondary servers were 712s, and I have just upgraded to B132s.

And if you want more reliability, put two disks in the system and use dd to copy disk to disk. If something goes wrong, you still have the other disk.

Sachin
Is photography a hobby or another way to spend $
Robert Gamble
Respected Contributor

Re: HA DNS solution

I do not think it is wise to have two BIND packages, or really even possible. As stated earlier, you can't have two masters running on the same machine, which is what could happen if one failed over to the other node in the cluster.

Your DNS environment should have *one* master and many secondaries. If the client seats never point at the master, they *should* never notice whether it has downtime. In my large environment, even with some of the clients pointing at the master, they pick up immediately on the secondary when we do maintenance.

Hope this helps!
Christopher Caldwell
Honored Contributor

Re: HA DNS solution

BIND is probably the last thing I'd attack with ServiceGuard, since resiliency features are already built into BIND.

If we could understand the problem a little better, we might be able to suggest a topology.

On the client side, it's generally wise to list more than one DNS server - the failure then becomes effectively transparent, with trivial delays if a server goes down.

If the DNS "fails" - i.e. returns incorrect answers - you won't be able to fix that with ServiceGuard. You'll want to find the reason that the service fails and fix that reason.

For scaling and performance, I have seen folks stick BIND behind a load balancer.

Mark Greene_1
Honored Contributor

Re: HA DNS solution

As others have stated, you really don't want two different DNS servers. But, more to the point, this won't address the MS timeout.

Here's some info, including links to MS's site, on DNS client setup in Windows:

http://groups.google.com/groups?selm=esjxO9hFDHA.2100%40TK2MSFTNGP12.phx.gbl&oe=UTF-8&output=gplain

HTH
mark
the future will be a lot like now, only later
Decio Miname
Frequent Advisor

Re: HA DNS solution


I've got a simple suggestion; I don't know if it will work for your case.

Get two MCSG nodes to be primary and secondary DNS servers, completely outside the MCSG configuration (i.e. BIND itself would not be part of a package).

Create a package with your valid DNS server's IP address ("1st choice" for DNS queries).

As long as your package is up on either of the two nodes, BIND will be up on it. This will give you typical HA downtime (improved by the fact that you don't have to wait for VGs to come up, etc.) and minimum administration effort (you have to maintain only the records in the primary DNS node).

In case you have bit-brusher instincts, you could also tune MCSG timings to improve the availability even more, but I don't think that would be necessary.

You've mentioned load balancing between the two nodes. Are you really sure you need that? I've seen DNS servers supporting thousands of clients on small workstations, and performance is hardly an issue. When problems do appear, they usually come down to whole-server load and NIC/LAN/WAN issues, not the DNS service itself.

Regards,

D.