Strange hostname resolution issue

 
Rikki hinn Ogurlegi
Frequent Advisor

After applying the latest 11.23 patch bundle to my cluster, I'm seeing strange things happening on it.

basalt# swlist | grep QPK
QPKAPPS B.11.23.0912.082 Applications Patches for HP-UX 11i v2, December 2009
QPKBASE B.11.23.0912.082 Base Quality Pack Bundle for HP-UX 11i v2, December 2009

Here is the setup (both nodes are identical):

granit# cat /etc/nsswitch.conf
hosts: files
services: files

relevant parts from /etc/hosts:

127.0.0.1 localhost loopback
10.7.6.20 granit granit.foo.com
10.99.99.20 granit granit.foo.com
10.7.6.21 basalt basalt.foo.com
10.99.99.21 basalt basalt.foo.com

(10.99.99.X is SG heartbeat only)

granit# ifconfig lan0
lan0: flags=1843
inet 10.7.6.20 netmask ffffff00 broadcast 10.7.6.255

basalt# ifconfig lan0
lan0: flags=1843
inet 10.7.6.21 netmask ffffff00 broadcast 10.7.6.255

After applying the patch I get the feeling that some parts of the OS can't convert IPs to hostnames any more. I'm getting issues in SG, NFS and backup. The NFS issue is the simplest, so I'll list that here:

granit# cat /etc/exports
/test -access=basalt

basalt# mount granit:/test /test
Permission denied

Now I change exports to -root=basalt. This removes the access requirement, so anyone can mount, but only node basalt gets root access. Then this happens:

granit# cat /etc/exports
/test -root=basalt

granit# exportfs -ua
granit# exportfs -a

basalt# showmount -e granit
/test (everyone)

basalt# mount granit:/test /test
basalt# cd /test
basalt# bdf .
Filesystem kbytes used avail %used Mounted on
granit:/test 1048576 374952 668384 36% /test
basalt# touch foobar
basalt# ls -l foobar
-rw-r--r-- 1 root sys 0 jan 19 11:50 foobar


So root=basalt is working but access=basalt is not?

and:

granit# showmount -a
(anon):/test

This is not a new setup. The NFS mounts I used on these nodes were there long before the patch bundle was installed and always worked fine.


The ServiceGuard problem is much stranger: it makes a node unable to join the cluster on reboot, because the other node only gets an IP address and not a hostname, and so refuses to recognise root on that machine as a legitimate admin. Thankfully I had added my own user account as a full admin and was able to run cmrunnode as that user to finally get the cluster up again.

USER_NAME ra
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE FULL_ADMIN

root, however, was and still is completely unable to.

I'm not sure what the backup issue is, because it has only just come to light, but backups also worked fine until I patched.
6 REPLIES
Rikki hinn Ogurlegi
Frequent Advisor

Re: Strange hostname resolution issue

I restarted mountd on the server (granit) and added "-l /var/adm/syslog/mountd.log -t2" to the command line. Then I changed the exports line back to -access=basalt and attempted to remount. I got a whole lot of stuff in the logfile that I don't understand, but I do see the following:

rpc.mountd: getclientsnames: Entering procedure
01.19 12:54:00 granit pid=24414 /usr/sbin/rpc.mountd
rpc.mountd: getclientsnames: calling getnetconfigent
01.19 12:54:00 granit pid=24414 /usr/sbin/rpc.mountd
rpc.mountd: getclientsnames: calling svc_getrpccaller
01.19 12:54:00 granit pid=24414 /usr/sbin/rpc.mountd
rpc.mountd: getclientsnames: calling netdir_getbyaddr
01.19 12:54:00 granit pid=24414 /usr/sbin/rpc.mountd
rpc.mountd: getclientsnames: cannot find name for host [10.7.6.21.2.222]
01.19 12:54:00 granit pid=24414 /usr/sbin/rpc.mountd
rpc.mountd: getclientsnames: returning anon_hsl

granit# grep 10.7.6.21 /etc/hosts
10.7.6.21 basalt basalt.foo.com







Rikki hinn Ogurlegi
Frequent Advisor

Re: Strange hostname resolution issue

I changed resolv.conf and nsswitch.conf to activate DNS on the server. The hosts entry in nsswitch.conf was "files [NOTFOUND=continue] dns". With that, NFS was working.

I can't switch to DNS on these machines because of a strange software package developed here.
Rita C Workman
Honored Contributor

Re: Strange hostname resolution issue

I'm sure some really smart NFS folks will get on this, so I'm just going to hit some little things...

Generally hostfile format is:
IP FQDN alias

I might first change hostfile to read:
10.7.6.21 basalt.foo.com basalt
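
To see why the order matters, here is a minimal sketch of what a files-only reverse lookup is generally expected to hand back: the first name listed after the address. The function name reverse_from_hosts is made up for illustration, and the real resolver does more than this (it only approximates the behaviour by reading a hosts-format file).

```shell
#!/bin/sh
# reverse_from_hosts IP [FILE]
# Hypothetical helper: print the first name listed for IP in a hosts-format file.
# Assumption: that first name is what a files-only reverse lookup reports
# as the host's canonical name.
reverse_from_hosts() {
    ip=$1
    file=${2:-/etc/hosts}
    # Drop comments, match the address field exactly, print the first name only.
    sed 's/#.*//' "$file" | awk -v ip="$ip" '$1 == ip { print $2; exit }'
}
```

With the line "10.7.6.21 basalt basalt.foo.com" this prints basalt; with the order swapped it prints basalt.foo.com. Whatever name comes back is presumably the one that the entry in /etc/exports has to match.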

You mention you edited resolv.conf & nsswitch.conf and set up DNS on the server. But... I'm thinking you didn't set up the server to 'run' DNS, you just pointed your HPUX server TO the 'actual' DNS servers, hence you resolved.

MCSG, backup software like Data Protector, and yes, NFS all require solid reverse lookups/resolution. Here DNS runs on Windows.
Now what that means is that Windows DNS and my hostfiles had better resolve the same, and perform reverse lookups exactly the same. So, whenever there is a new host or pkg, they have to do reverse lookups down to the aliases for me.

If your company's DNS is not controlled by you... then you might want to make sure your hostfile and their DNS are in sync. Otherwise, things on HPUX will have issues.

Just a thought,
Rita
Steven Schweda
Honored Contributor

Re: Strange hostname resolution issue

> I might first change hostfile to read:
> 10.7.6.21 basalt.foo.com basalt

And/or change to use the FQDN in
"/etc/exports".
Rikki hinn Ogurlegi
Frequent Advisor

Re: Strange hostname resolution issue

I set up the hosts file according to the ServiceGuard documentation. I will try switching it around but that still means that a patch from HP apparently broke existing behavior. All was well before the patch.

When I edited nsswitch.conf and resolv.conf I was not setting up a DNS server on these machines, just pointing them at our in-house DNS servers.
Rikki hinn Ogurlegi
Frequent Advisor

Re: Strange hostname resolution issue

I figured out what the issue was. It turns out that it is suddenly no longer enough to just set the "hosts:" field in /etc/nsswitch.conf; there is now an "ipnodes:" field that also has to be set.

My nsswitch.conf files did not have that entry, so it defaulted to "dns [NOTFOUND=return] files". That caused some tools to use /etc/hosts (ping, for example) and others to use DNS (traceroute, for example).
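
For anyone wanting to check their own nodes, here is a minimal sketch of that check. The function name check_nsswitch is made up for illustration. It only reads the file and reports whether "hosts:" and "ipnodes:" entries exist, and it does not try to work out what the built-in default is on a given patch level.

```shell
#!/bin/sh
# check_nsswitch [FILE]
# Hypothetical helper: show the hosts and ipnodes policies in a
# nsswitch.conf-style file, flagging any entry that is missing.
check_nsswitch() {
    file=${1:-/etc/nsswitch.conf}
    for db in hosts ipnodes; do
        # Match the entry at the start of a line, allowing leading whitespace.
        line=$(grep "^[[:space:]]*${db}:" "$file" 2>/dev/null)
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "${db}: (no entry, system default applies)"
        fi
    done
}
```

On a files-only node like mine, the expectation would be to see both "hosts: files" and "ipnodes: files". A missing ipnodes line is the case described above.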

Thanks for your help :)