
Re: SG and resolver.

 
brian_31
Super Advisor

SG and resolver.

Team:

We have a single-node cluster (no failover) running on 11.0. Currently our nsswitch.conf is as follows:
hosts: dns [NOTFOUND=continue UNAVAIL=continue TRYAGAIN=continue] files

In our resolv.conf we have three DNS servers defined. We hardly ever have a DNS outage: even if the primary fails, the secondary will be up, and we have a third one too.

Looking at various posts here, I am convinced that I should have files first in nsswitch. Given my scenario, could you please tell me what the downsides of my setup are (nsswitch with dns first, as above) if the primary DNS fails in an MCSG environment?

Best Regards

Brian.
9 REPLIES
Stuart Abramson
Trusted Contributor

Re: SG and resolver.

I don't understand why you have a single node cluster?

We set ours like this:

hosts: files [NOTFOUND=continue] dns

The logic is that a lot of the references in the cluster (we have multi-node clusters) are between cluster members, so we put all of the cluster members, the package floating IPs, etc. in /etc/hosts. Those references are answered immediately and don't have to wait for the DNS server.

We have a DNS server that has "High Availability" somehow. I don't know how; the DNS team maintains it on a Windows server.
Alex Lavrov.
Honored Contributor

Re: SG and resolver.

You don't want your cluster to fail because of DNS server problems. A cluster should be a closed unit that depends on its environment as little as possible.

Since the nodes are listed in /etc/hosts or cmclnodes files anyway, it's better to have name resolution use files first and then DNS. Of course these are recommendations and you don't have to follow them, but it's the right way to set up the cluster.

About the previous reply asking why to have a single-node cluster: it's very convenient to run programs as packages. You can bring up a whole logical package, for example db+webserver, with one command, in the order you choose, and let something monitor both and take action when something fails, such as restarting the whole thing. It's a common use of SG.

Alex.
I don't give a damn for a man that can only spell a word one way. (M. Twain)
brian_31
Super Advisor

Re: SG and resolver.

Hi:

We have a single-node cluster for exactly the reasons Alex explained, and it has been very useful. My question was more along these lines: what would happen if the primary DNS fails and lookups have to go through the secondary DNS? What would the impact be in my scenario?

Thanks

Brian.
Victor Fridyev
Honored Contributor

Re: SG and resolver.

Hi,

In addition to Alex's message above:

The sequence
files [notfound=continue] dns
should be used, especially if you put frequently used hostnames into /etc/hosts (e.g. the nodes' hostnames).
The problem with the
dns [notfound=continue] files
sequence is the following: if the primary server does not answer, the next server is asked only after about a minute, so each name resolution request will take about one minute. Some applications will not work in these conditions.

HTH
Entities are not to be multiplied beyond necessity - RTFM
B. Hulst
Trusted Contributor

Re: SG and resolver.

Hi,

Having hostname lookups done by DNS first could delay operations in your cluster.
In the worst case a lookup could time out and stall the cluster.

Files before dns is best.

Regards,
Bob
brian_31
Super Advisor

Re: SG and resolver.

Thanks. Is there any way I can test how long it takes in case of a DNS failure? I want to find out the DNS timeout before the lookup falls through to files.

Thanks

Brian
Todd Whitcher
Esteemed Contributor
Solution

Re: SG and resolver.

Hi Brian,

If you have a non-responsive/down DNS server configured in your /etc/resolv.conf, the resolver will retry 4 times after its initial request and then give an error similar to this:

"Cant find server name for address X.X.X.X: No response from server"

Then it will move on to the next nameserver in your resolv.conf file. If that server is not available either, it repeats the retries, times out, and goes to the next server.

This can happen up to 3 times, depending on the number of nameserver entries you have in your /etc/resolv.conf.

The timeout is doubled each time a retry is attempted:

timeout 1 ( 5 seconds)
timeout 2 (10 seconds)
timeout 3 (20 seconds)
timeout 4 (40 seconds)

So that is a total of 75 seconds (5 + 10 + 20 + 40) for each nameserver that is unavailable.

If all 3 are unavailable, that's 225 seconds, or 3.75 minutes, which is forever.
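
The arithmetic above can be checked with a minimal Python sketch. The function name and its parameters are made up for illustration; the 5-second first timeout, the doubling, and the four attempts are simply the figures quoted in this post, not values read from the resolver.

```python
def worst_case_seconds(nameservers, first_timeout=5.0, attempts=4):
    """Estimate the worst-case wait when every nameserver is down.

    Assumes the behaviour described in the post: each attempt against one
    server waits twice as long as the previous attempt.
    """
    # 5 + 10 + 20 + 40 with the default arguments
    per_server = sum(first_timeout * (2 ** i) for i in range(attempts))
    return per_server * nameservers


print(worst_case_seconds(1))  # 75.0
print(worst_case_seconds(3))  # 225.0
```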

You can adjust these timeouts in the /etc/resolv.conf file with the retry and retrans options. See the man page for resolv.conf.

For example, to have a system wait 1 second for a reply and retry 1 time after a timeout:

retrans 1000
retry 1

/etc/resolv.conf
domain hp.com
nameserver 15.152.153.154
nameserver 15.155.156.157
nameserver 15.158.159.160
retry 1
retrans 1000

So how long it takes to fall through from DNS to /etc/hosts via /etc/nsswitch.conf depends on how many DNS servers you have in /etc/resolv.conf and on how retrans and retry are set (default or modified).

That's why it's good practice to put files first in /etc/nsswitch.conf and to add entries to /etc/hosts for your more critical systems, so that in the event of a DNS issue you won't experience delays.
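
To see at a glance which of these values a given resolv.conf actually sets, something like the following Python sketch could be used. The function name is hypothetical, and the parsing is deliberately simple: it only looks at the nameserver, retry, and retrans keywords used in the example above and ignores everything else.

```python
def parse_resolv_conf(text):
    """Collect nameserver entries and the retry/retrans values from resolv.conf text."""
    settings = {"nameservers": [], "retry": None, "retrans": None}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and keywords without a value.
        if len(fields) < 2 or fields[0][0] in "#;":
            continue
        keyword, value = fields[0], fields[1]
        if keyword == "nameserver":
            settings["nameservers"].append(value)
        elif keyword in ("retry", "retrans"):
            settings[keyword] = int(value)
    return settings


# Usage on a live system (path as given in the post):
#     with open("/etc/resolv.conf") as f:
#         print(parse_resolv_conf(f.read()))
```

A value of None for retry or retrans means the file does not set it, so the resolver defaults apply.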

Hope that helps
Jim Keeble
Trusted Contributor

Re: SG and resolver.

I agree that all the nodes in the cluster should be in /etc/hosts and files should be first in nsswitch.conf.

The DNS timeout for a down nameserver is controlled by the retrans and retry params in /etc/resolv.conf. See the man page for resolv.conf.

This requires a later libc patch on 11.0; it is built in to later HP-UX versions.

The timeout should be approximately retrans (in milliseconds) * retry.

Test the timeout by inserting a non-existent IP on your subnet as the first nameserver in resolv.conf. Then look up a hostname that is not in files, but is in your DNS server(s). For example:

"timex nsquery hosts hostname"

or

"ping hostname"

The ping won't start until after the lookup completes.
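
If Python happens to be available on the box, the same measurement can be sketched as below. This is only an illustration, not part of the HP-UX tooling: the function name is made up, and it times whatever the system resolver does through socket.gethostbyname, so the result reflects the full nsswitch.conf order rather than DNS alone.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Return (elapsed_seconds, address); address is None if the lookup fails."""
    start = time.monotonic()
    try:
        address = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        address = None
    elapsed = time.monotonic() - start
    return elapsed, address


# Usage: replace "hostname" with a name that is in DNS but not in /etc/hosts.
#     elapsed, address = time_lookup("hostname")
#     print("%.1f s -> %s" % (elapsed, address))
```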

Geoff Wild
Honored Contributor

Re: SG and resolver.

Todd is correct - I do the same.

For ServiceGuard, I set the following in nsswitch.conf:

hosts: files [NOTFOUND=CONTINUE] dns


And for resolv.conf:

retrans 2500
retry 2

Rgds...Geoff
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.