
Re: NFS/rpc issue in HP-UX 11.11 with MC ServiceGuard

 
Tan Hean Seng
Occasional Contributor

NFS/rpc issue in HP-UX 11.11 with MC ServiceGuard

I ran into the same problem twice on all NFS clients just now. Each time, the NFS-mounted directories were not accessible from the clients, and "bdf" hung before printing the line for an NFS mount point. After the first failure I rebooted the server, and NFS worked for about 45 minutes before it failed again.

When it failed the second time, I ran "showmount" and "rpcinfo" from a client. "showmount -e node1" and "rpcinfo -p node1" showed good results, but "showmount -e pkg1" gave "pkg1: RPC_PMAP_FAILURE - RPC_PMAP_FAILURE" and "rpcinfo -p pkg1" gave "rpcinfo: can't contact portmapper: RPC_SYSTEM_ERROR - Connection refused". Here node1 is the fixed IP address of the server and pkg1 is the package (virtual) IP address of the server. Both the fixed and the package IP addresses responded to "ping".

I then TOC'd node1 and let node2 serve pkg1, and it has been working since then, 3 hours after the second failure.

I would appreciate it if someone could point out what might have gone wrong.
Brian Hackley
Honored Contributor

Re: NFS/rpc issue in HP-UX 11.11 with MC ServiceGuard

Hello,

These symptoms on the NFS client indicate that one or more NFS mount points are no longer responding. There can be many reasons for this. The HP IT Resource Center knowledge base has some documents that can help you debug that part of the issue. Among the ones I recommend: NETUXKBRC00006283 and KBAN00000261.

It may also help to have HP Support examine the TOC on node1, together with the log files: rc.log, rc.log.old, OLDsyslog.log, syslog.log, /var/adm/nettl.LOG0*, the diagnostic logs, and the package log files.

One thing I thought of that may be easy to check: you said that the "real" IP worked and the "package" IP did not. I wonder if there is a possibility of a duplicate IP address on the network at the time of failure? Or perhaps the package was just beginning to fail over?

Hope this helps,
-> Brian Hackley
Ask me about telecommuting!
Todd Whitcher
Esteemed Contributor
Solution

Re: NFS/rpc issue in HP-UX 11.11 with MC ServiceGuard

Hello,

In addition to what Brian stated, I would suggest checking the package control (cntl) log, /etc/cmcluster/nfs/pkg1.cntl.log, for errors or warnings. Also check the cmcld messages in syslog.log or OLDsyslog.log on both nodes.

# grep -i cmcld /var/adm/syslog/syslog.log > cmcld.out

On the system that had the TOC:

# grep -i cmcld /var/adm/syslog/OLDsyslog.log > oldcmcld.out

Since you could contact the portmapper with rpcinfo through the static IP but not through the pkg1 IP address on the same server, I suspect a duplicate IP issue, as Brian stated, especially since you could still ping the pkg1 IP address. The message RPC_PMAP_FAILURE - RPC_PMAP_FAILURE indicates that the portmapper was not running on the system with the pkg1 IP address, which contradicts the fact that you got a response from the static IP address. All IPs active on the server should respond to the rpcinfo call, so this seems very suspect.
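If the problem comes back, it may be handy to run the same two checks against both names in one go. The following is only a rough sketch: check_rpc is a made-up helper name, and node1 and pkg1 are the names from the original post.

```shell
# Hypothetical helper: run the two RPC checks from the original post
# against one host name and print one status line per check.
check_rpc() {
    host=$1
    if rpcinfo -p "$host" >/dev/null 2>&1; then
        echo "$host portmapper: ok"
    else
        echo "$host portmapper: FAILED"
    fi
    if showmount -e "$host" >/dev/null 2>&1; then
        echo "$host exports: ok"
    else
        echo "$host exports: FAILED"
    fi
}

# Usage, from an NFS client:
#   for h in node1 pkg1; do check_rpc "$h"; done
```

If node1 reports ok while pkg1 reports FAILED at the same moment, that is the same mismatch described in the original post.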

You can look in the /var/adm/nettl.LOG000 file for messages that may help.

# netfmt -Nnlf /var/adm/nettl.LOG000 > net.out

Open the net.out file in vi and start at the bottom, looking for messages, warnings, and errors.
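As an alternative to paging through the file by hand, something like the sketch below could pull out the most recent suspicious lines. It assumes the lines of interest contain the words error, warning, or disaster, which may not cover everything netfmt writes, and recent_problems is just an illustrative name.

```shell
# Hypothetical helper: show the last 20 lines of a text file that mention
# error, warning, or disaster (case-insensitive), with line numbers.
# The keyword list is an assumption; adjust it to what your logs contain.
recent_problems() {
    grep -i -n -E 'error|warning|disaster' "$1" | tail -n 20
}

# Usage:
#   recent_problems net.out
#   recent_problems cmcld.out
```

The line numbers in the output make it easy to jump to the surrounding context in vi.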

I'd also examine the ARP cache (arp -an) on the system you were pinging from, to see whether the MAC address it holds for the pkg1 IP address matches the MAC of the SG node.
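To pick out the entry for a single address, a small filter along these lines might help. It is a sketch that assumes the arp output contains entries of the form "(IP) at MAC"; mac_for_ip and pkg1_ip are placeholder names, not anything from the original post.

```shell
# Hypothetical helper: read arp-style output on stdin and print the MAC
# address listed for the IP address given as $1.
# Assumes each entry contains the fields "(IP) at MAC".
mac_for_ip() {
    awk -v ip="($1)" '{
        for (i = 1; i <= NF; i++)
            if ($i == ip && $(i + 1) == "at") { print $(i + 2); exit }
    }'
}

# Usage on the client (pkg1_ip stands for the package IP address):
#   arp -an | mac_for_ip "$pkg1_ip"
```

Compare the result with the station address that lanscan reports on the node currently running the package. The two commands may print the address in different formats, so compare the hex digits rather than the exact strings.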

Hope This helps !

Todd