NFS/RPC errors over WAN

Dave Sak.
Occasional Visitor

I'm trying to troubleshoot a mysterious NFS/RPC error that just started happening.
The problem occurs when the client machine tries to connect to an NFS server over a WAN connection. The same machine has no problems over the LAN. These HP-UX machines previously had no problems connecting over the WAN, and our AIX machines still have no problems connecting to the NFS server over the WAN.

B.11.11 U 9000/785 3960557040 unlimited-user license

Error message from syslog:
May 19 15:32:53 client1 vmunix: NFS read failed for server server1: RPC: Timed out
May 19 15:34:55 client1 vmunix: NFS getattr failed for server server1: RPC: Timed out
May 19 15:35:55 client1 vmunix: NFS getattr failed for server server1: RPC: Timed out
May 19 15:33:53 client1 vmunix: NFS read failed for server server1: RPC: Timed out
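
To see which operations time out and how often, messages like these can be tallied with a short script. This is only a minimal Python sketch: the regular expression assumes the exact syslog layout shown above, and the `summarize` helper and `sample` list are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   <Mon DD HH:MM:SS> <host> vmunix: NFS <op> failed for server <server>: RPC: <reason>
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) vmunix: "
    r"NFS (?P<op>\w+) failed for server (?P<server>\S+): RPC: (?P<reason>.+)$"
)


def summarize(lines):
    """Count NFS failures per (server, operation, reason); skip non-matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["server"], match["op"], match["reason"])] += 1
    return counts


# The four syslog lines quoted in this post.
sample = [
    "May 19 15:32:53 client1 vmunix: NFS read failed for server server1: RPC: Timed out",
    "May 19 15:34:55 client1 vmunix: NFS getattr failed for server server1: RPC: Timed out",
    "May 19 15:35:55 client1 vmunix: NFS getattr failed for server server1: RPC: Timed out",
    "May 19 15:33:53 client1 vmunix: NFS read failed for server server1: RPC: Timed out",
]

for (server, op, reason), count in sorted(summarize(sample).items()):
    print(f"{server}: NFS {op} failed ({reason}) x{count}")
```

Running the same function over the full syslog file, rather than the four sample lines, would show whether the timeouts cluster around particular operations or times of day.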

I tried setting NFS_TCP=1 in /etc/rc.config.d/nfsconf.

I also installed PHNE_39167 and related patches, which as far as I can tell is the latest NFS patch. I suspect something changed in the network configuration (none of these clients had issues in the past), but the fact remains that the AIX clients still use the same type of connection with no problems. The connections where the HP-UX machines fail are mostly T1s, but the networks are not overly saturated.

Any advice would be appreciated.

-Dave
4 REPLIES
Mel Burslan
Honored Contributor

Re: NFS/RPC errors over WAN

Dave,

When you are talking about WANs, usually all bets are off, but are you sure your well-functioning AIX servers and non-functional HPUX servers are using the same WAN connection? What I mean is, are they on the same network and governed by the same network rules (firewalls, traffic shaping and whatnot)? The symptom sure sounds like your networking gear does not like the RPC traffic and is not letting it through, but again, this is just a guess.

________________________________
UNIX because I majored in cryptology...
Dave Sak.
Occasional Visitor

Re: NFS/RPC errors over WAN

I would agree it sounds like a firewall or network issue, but when we ran the AIX vs. HP test, we ran it from the same location (same subnet, same T1 line, same firewall rules). It's also worth noting that not all remote locations are having this problem. The app is a 'standard' app for our suppliers, and the firewall rules are generic for all subnets/remote sites on this extranet. Also, this app has been in use for ~10 years without these issues.

I'm going to test again today with a sniffer connected to the server. Using netstat, I can't see any 'abnormal' connections to the server, but that doesn't mean there aren't any.
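
Alongside the sniffer trace, a quick reachability check from an affected client can show whether the standard RPC ports answer at all across the WAN. The following is a rough Python sketch, not a definitive diagnostic: it only tests TCP connections to port 111 (portmapper/rpcbind) and port 2049 (nfsd), it says nothing about UDP or about the dynamically assigned mountd port, and the function names `tcp_port_open` and `probe` are made up for this example.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe(host, ports=(111, 2049), timeout=5.0):
    """Check each port in turn and return a {port: reachable} mapping.

    The defaults are the well-known portmapper (111) and NFS (2049) ports.
    """
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `probe("server1")` from a failing HP-UX site and from a working AIX client at the same site would give a first comparison. A port that is open from one client and closed from the other would point at filtering in between rather than at the NFS client itself.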

I see some other threads on here with people getting 'NFS read failed' and 'NFS getattr failed' messages, but none of them have the same type of scenario I have.
Mel Burslan
Honored Contributor

Re: NFS/RPC errors over WAN

Dave,

We also run a heterogeneous OS environment, with HPUX servers alongside AIX servers. I have found that AIX networking is more resilient to minor changes or disruptions in the network than HPUX networking is. Of course, this does not explain why NFS doesn't work on HPUX while everything seems to be fine in the AIX world, but when you are looking at networking problems, putting HPUX in the same basket as AIX on the grounds that both are UNIX variants is quite inaccurate.

Having said that, it looks like you have the same setup at multiple locations, and I am assuming this is some sort of extranet connection. Is there any difference in the physical layer of the WAN between this troubled site and a functional site? For example, does one run off a T1 and the other off an OC-3, or does one run over a dedicated link whereas the other connects through ATM?
Dave Sak.
Occasional Visitor

Re: NFS/RPC errors over WAN

There are differences in the physical layer between the working and non-working scenarios. However, at the sites where it's currently broken (but used to work), the physical layer has not changed.