totally bizarre NFS anomaly
09-09-2004 02:16 PM
I have come upon a very very strange NFS problem. I need help here. Let me describe the problem to you:
we have a filesystem on server1 (named guam) exported to server2, and server2 has soft mounted it as read only.
when I go into the mount on server2, I can peruse the filesystem, cat files etc as per normal.
However, while I was doing this I came upon one file I could not cat, it returned a NFS error as follows:
NFS read failed for server guam: RPC: Timed out
cat: read error: No such file or directory
Initially, I thought that someone had umounted the filesystem or something, but no, I could cat every other file except this one. I've never heard of NFS failing on a single file, so it was strange.
I next thought it might be a permissions problem, so I examined the permissions on the file, but they were the same as every other file in the directory. I ran fuser on the file on server1 and no one was using it.
I ran the file command on the file, and it was like every other file in the directory, "commands text" (it is a ksh script, just like every other file in the directory).
So, here I am with a file I cannot cat, but I can cat every other file in the directory. Now we will skip some time, because I tried many things that didn't work, but there was one thing I found that seemed to fix the problem:
If I deleted one (1) character from anywhere in the file, and saved it, I found that I was then able to cat the file on the NFS mount. How strange is that? I then found that if I added 5 more characters to the file (4 more than it originally had), then I could also cat the file. But if I only added, like, 2 or 3 characters it would fail.
So in essence, there was a range of 5 characters/bytes whereby NFS would fail to read the file, but at any size outside that range, either higher or lower, it would work fine.
ie: the original file size was: 2825 bytes.
a file size of 2824 bytes would work
a file size of 2829 bytes would work
a file size of anything between 2824-2829 would fail (not including 2824 or 2829)
a file size greater than (or equal to) 2829 or less than (or equal to) 2824 would work.
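One way to probe a size boundary like this is to generate filler files on either side of it and try to read each one through the NFS mount. The following is only a minimal sketch: the helper names (`make_sized_file`, `make_size_range`, `check_readable`) and the `size_<N>.txt` naming are made up for illustration, and it assumes a POSIX shell with `dd` and `tr` available.

```shell
# Hypothetical helper: create a file of exactly $2 bytes, filled with 'a', at path $1.
make_sized_file() {
    dd if=/dev/zero bs=1 count="$2" 2>/dev/null | tr '\000' 'a' > "$1"
}

# Hypothetical helper: create size_<N>.txt in directory $1 for each N from $2 to $3.
make_size_range() {
    range_dir=$1
    range_size=$2
    range_last=$3
    while [ "$range_size" -le "$range_last" ]; do
        make_sized_file "$range_dir/size_$range_size.txt" "$range_size"
        range_size=$((range_size + 1))
    done
}

# Hypothetical helper: try to read every size_*.txt under $1 and report failures.
check_readable() {
    for probe_file in "$1"/size_*.txt; do
        cat "$probe_file" > /dev/null 2>&1 || echo "read failed: $probe_file"
    done
}
```

The idea would be to run `make_size_range` in the exported directory on the server (for example over sizes 2820 to 2832) and then run `check_readable` against the same directory as seen through the client's mount; any size that prints "read failed" falls inside the bad range.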
Can someone please tell me why this would be the case? I am completely stumped...
- Andrew Gray
(I am running hpux 11.00, path level is at March 2004 bundle.)
09-09-2004 02:30 PM
Re: totally bizarre NFS anomaly
I found that I was always successful in doing a cat on the file when I was doing it on an automounted directory. ie cat /net/guam/myscripts/scriptwithproblem.ksh
would work (this was automounted)
but: cat /myscripts/scriptwithproblem.ksh
would fail (this was normal nfs mounted), even though they were the same file.
I meant to say the patch level was march 2004.
- Andrew Gray
09-09-2004 03:05 PM
Re: totally bizarre NFS anomaly
That is a strange problem, but I've seen stranger. :)
I assume when you say you're running 11.0 that you mean the NFS client is 11.0. What type of system is the NFS server? Are both systems running HP-UX? If so, please issue the following command on each system for me:
# swlist -l product | grep ONC
and give me the output.
As for your specific problem, you said that when you try to cat the file from the manually mounted NFS filesystem it fails unless you first modify it, but when you cat the file from the automounted directory it works. Which automounter are you running on the 11.0 client - the legacy automounter or the ONC 1.2 AutoFS? If you're not sure, please copy/paste your /etc/rc.config.d/nfsconf file into this thread so that I can see how the client is configured. If the server is also an HP-UX system, give me that system's nfsconf file contents as well.
Finally, I'd like to see the output of the command:
# nfsstat -m
on the NFS client, and the output of the command:
# cat /etc/xtab
on the NFS server - assuming it is an HP-UX system.
All of this information will give me some ideas of where to go next.
Regards,
Dave
I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]

09-09-2004 03:30 PM
Re: totally bizarre NFS anomaly
Can you go to server1, move the particular file to some other directory, and see if it works?
regards
SK
09-09-2004 03:51 PM
Re: totally bizarre NFS anomaly
I will attach the information you wanted. As for your question re automount vs autofs. It appears we are using the old automount, since AUTOFS=0 in the nfsconf file. Perhaps we should change it to use the new autofs?? is it heaps better?
Both the client and server are running HP-UX 11.00.
Let me know if you still need the /etc/rc.config.d/nfsconf files from the client/server
Also, the directory I've been playing with is /apps/dv. that is the mount which contains the file I'm having trouble with.
I've also tried mounting the /apps/dv directory on other HP-UX hosts I have around, and tested to see if they exhibit the same behaviour. The result so far is that the other hosts do have the same behaviour, and some of these hosts are hpux 11.11 (11i) hosts too.
However, if I copy the file to another server, create an export on that server, and then mount it around the place, then it will work, and I am then able to cat the file. So perhaps the problem lies with the NFS server - guam. Don't know what though.
What do you think?
- Andy
09-09-2004 03:54 PM
Re: totally bizarre NFS anomaly
Can you do a 'cat -v file' on the NFS server and see if there are any interesting characters? Or what if you do
#cat file > file1
and then try on file1?
-Sri
09-09-2004 04:05 PM
I didn't see the automounted filesystem in your nfsstat -m output, so I assume automounter simply unmounted it (as it is supposed to).
Since you're using the old automounter, it would have mounted the filesystem using NFS Version 2, whereas a manual mount will default to NFS Version 3. Both will use an 8K rsize/wsize by default on 11.0, and it appears you either have not enabled NFS/TCP on these systems or you're choosing to use UDP.
Should you use AutoFS? The ONC 1.2 AutoFS is not a very good version of AutoFS, but it is the only version available on 11.0. If you could update the client to 11i then you could download the ONC 2.3 version of AutoFS from http://software.hp.com, and that AutoFS is a far superior version of AutoFS than the ONC 1.2 version on 11.0 or the legacy automounter that you're using.
I'd still like to see the client and server's nfsconf files just to be safe.
The fact that other clients show the same behavior would seem to eliminate the possibility of a client cache corruption issue, which is what this problem originally sounded like.
I would be curious if this client, and the other clients, can successfully cat the file if you manually mount the filesystem with NFS Version 2. If you add the "vers=2" option to your mount syntax it should force NFS version 2.
You could even try creating a new empty directory on the initial client and mount the same filesystem from the server again using NFS version 2 into the new empty directory so that you'll have the same filesystem mounted twice on the same client - one with NFS V2 and one with NFS V3. That would be really interesting to see if the problem shows up in the V3 mount but not the V2 mount.
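As a rough sketch of that side-by-side test, the two mount commands might look like the output of the helper below. The helper name `nfs_mount_cmd` and the mount points `/mnt/dv_v2` and `/mnt/dv_v3` are made up for illustration; the `ro,soft` options mirror the soft, read-only mount described earlier in the thread, and `guam:/apps/dv` is the export mentioned above. The helper only prints the commands so they can be reviewed before being run as root.

```shell
# Hypothetical helper: print (but do not run) an HP-UX style NFS mount command.
# $1 = NFS version (2 or 3), $2 = server:/export, $3 = local mount point
nfs_mount_cmd() {
    echo "mount -F nfs -o ro,soft,vers=$1 $2 $3"
}

# Same export mounted twice, once per protocol version (mount points are placeholders).
nfs_mount_cmd 2 guam:/apps/dv /mnt/dv_v2
nfs_mount_cmd 3 guam:/apps/dv /mnt/dv_v3
```

With both mounts in place, running cat on the problem file under each mount point would show whether the failure follows the NFS version.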
Let me know what this test reveals.
Thanks,
Dave
09-09-2004 04:05 PM
Re: totally bizarre NFS anomaly
- Andy
09-09-2004 04:12 PM
Re: totally bizarre NFS anomaly
I just took a look at the latest 11.0 ONC patch and found this fix in it:
librpc.a
SR: 8606347226
DTS: JAGaf08050
Commands operating on an NFS file system mounted as a soft mount over UDP transport protocol fail with the error message: "RPC: Unable to receive".
This error is very similar to yours and you are using soft mounts and UDP, so this could be a match. If you are able to, I'd like you to install PHNE_30377 on the 11.0 NFS client (along with any dependent patches) and see if this affects the behavior of the client.
Again, I'm not one to typically "throw patches" at a problem, but the symptoms described in the patch text are pretty close to what you're seeing.
Regards,
Dave
09-09-2004 04:36 PM
Re: totally bizarre NFS anomaly
as I've mentioned, I get the same problem when I copy the file to different directories, and when I copy the file to different nfs shares. I also get the same problem when I create a file from scratch which comes up to between 2824 and 2829 bytes. (I filled the file with 2825 a's).
As for patching:
I'll schedule some patching to be done. However, note that (as I said before), when I copy the file to other hosts and create an export on them, it seems to work fine. So we'll see.
Any other ideas?
- Andy
09-09-2004 04:50 PM
Re: totally bizarre NFS anomaly
thanks for your help so far. here is an update:
I don't know what did it, but I can no longer seem to duplicate the behaviour I was describing above. What I'm saying is that it seems to be working!
I was experimenting with adding the vers=2 option on the client and trying mounts, and I know I also did an /sbin/init.d/nfs.client stop and start on the client. But anyway, I can now cat the file. I didn't stop or restart anything on the server, but I can't get it to error like it was before. I don't know what I did to make it work properly again.
This is very strange! It behaves as if nothing was ever wrong! I'm stumped on this one.
- Andy
09-09-2004 04:53 PM
Re: totally bizarre NFS anomaly
Dave
09-09-2004 04:53 PM
Re: totally bizarre NFS anomaly
- Andy
09-09-2004 04:54 PM
Re: totally bizarre NFS anomaly
very strange.
- Andy
09-09-2004 04:57 PM
Re: totally bizarre NFS anomaly
[guam]:/root # ps -ef | grep -e nfs -e biod -e rpc
root 1507 1 0 Jul 26 ? 0:21 /usr/sbin/biod 4
root 1483 1 0 Jul 26 ? 0:04 /usr/sbin/rpcbind
root 1488 0 0 Jul 26 ? 0:00 nfskd
root 1525 1 0 Jul 26 ? 0:04 /usr/sbin/rpc.lockd
root 1508 1 0 Jul 26 ? 0:21 /usr/sbin/biod 4
root 1509 1 0 Jul 26 ? 0:21 /usr/sbin/biod 4
root 1510 1 0 Jul 26 ? 0:21 /usr/sbin/biod 4
root 1519 1 0 Jul 26 ? 0:03 /usr/sbin/rpc.statd
root 1939 1 0 Jul 26 ? 4:36 /opt/dce/sbin/rpcd
root 12890 12885 0 Jul 31 ? 2:36 /usr/sbin/nfsd 4
root 12887 12885 0 Jul 31 ? 2:41 /usr/sbin/nfsd 4
root 12874 1 0 Jul 31 ? 0:02 /usr/sbin/rpc.mountd
root 3821 29041 1 14:56:29 pts/9 0:00 grep -e nfs -e biod -e rpc
root 12888 12885 0 Jul 31 ? 2:37 /usr/sbin/nfsd 4
root 12885 1 0 Jul 31 ? 2:34 /usr/sbin/nfsd 4
daemon 1859 1567 0 Jul 26 ? 0:03 rpc.cmsd
root 1585 1567 0 Jul 26 ? 0:03 /usr/dt/bin/rpc.ttdbserver
root 12889 12885 0 Jul 31 ? 2:38 /usr/sbin/nfsd 4
root 12900 1 0 Jul 31 ? 0:02 /usr/sbin/rpc.pcnfsd
root 12886 12885 0 Jul 31 ? 2:41 /usr/sbin/nfsd 4
09-09-2004 05:10 PM
Re: totally bizarre NFS anomaly
- Andy
09-09-2004 05:11 PM
Re: totally bizarre NFS anomaly
09-09-2004 05:14 PM
Re: totally bizarre NFS anomaly
Any ideas as to what is going on?
- Andy
09-09-2004 05:51 PM
Re: totally bizarre NFS anomaly
The times I've seen behavior like this in the past, where a specific sized file fails, it has been a couple of things:
1. Client cache corruption
This wouldn't explain why it fails on more than one client
2. UDP checksum failures
Check the "netstat -p udp" output on all of the systems involved to see if any UDP checksum failures are logged.
3. Network problem
Some piece of intermediate network hardware is corrupting specific packets - usually based on a certain size of the packet or byte alignment. This would explain why adding a few bytes or deleting a few bytes from the file in question would get it to start working again.
If you don't see any UDP checksum failures on any of the systems, my best guess (without looking at any other data) would be #3 - that some piece of network equipment was dropping packets based on a byte alignment issue. This would explain why multiple clients saw the same behavior - assuming they use the same networking hardware (i.e. switches, hubs, routers) to communicate with the NFS server.
Again, pure conjecture at this point since the problem isn't reproducing any more so we can't collect network traces to verify whether packets are arriving intact between the clients and servers.
Hint - that would have been my next suggestion if the problem was still happening.
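For the checksum check in item 2, a small filter can pull the counter out of the `netstat -p udp` output so it is easy to compare across hosts. This is only a sketch: the function name `udp_bad_checksums` is made up, and it assumes the output contains a line of the form "<count> bad checksums", which should be confirmed against the real output on each system.

```shell
# Hypothetical helper: read `netstat -p udp` style output on stdin and print
# the number at the start of the first line mentioning "bad checksum".
udp_bad_checksums() {
    awk '/bad checksum/ { print $1; exit }'
}

# Intended use on each host involved (not run here):
#   netstat -p udp | udp_bad_checksums
```

Because the counter is cumulative since boot, taking a reading before and after a failing cat would show whether that specific read is what increments it.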
Regards,
Dave
09-09-2004 06:23 PM - last edited on 09-16-2024 02:12 AM by support_s
Re: totally bizarre NFS anomaly
An interesting point to ponder though. You know when I went out and tried mounting on various other hosts around the site to see if it happened on those also? Well, all the boxes I did this on have like 5 or 6 bad checksums, all the ones I didn't test this on have 0 bad checksums, and of course cook has heaps.
So this would indicate that it was a networking issue, which is quite possible since the network guys had done some networking work last night, which may have stuffed things up.
This is very interesting, do you have any more information on this available to you?
Thanks for the help!!! I Really appreciate it.
- Andrew Gray
09-09-2004 06:39 PM
Re: totally bizarre NFS anomaly
The fact that all of the clients you tested with have logged checksum failures is a strong indication that something in the network is/was corrupting UDP packets. This would definitely explain the problem, and it fits with other cases I've seen in the past where the UDP checksum failures only occur for certain packets and not others.
You said:
__________________________________________
So this would indicate that it was a networking issue, which is quite possible since the network guys had done some networking work last night, which may have stuffed things up.
__________________________________________
What kind of networking "work" did the network guys do last night? Were they still making changes earlier this evening (or morning, depending upon where you are)? Did someone from the network team reset a router/hub/bridge/switch during your test and that's why the problem went away?
Bottom line, I really doubt you'll be able to pinpoint the exact cause of the problem unless it occurs again. If it does, I recommend taking a series of network traces to see which packets make it from the client to server and back and which ones don't.
You'll likely need to take traces at various points in the network to figure out which hop in the network is causing the packet corruption (assuming there are multiple hops between the client and server). After enough tracing, you should be able to identify the device causing the failures and then get someone to correct it. However, it appears to have corrected itself (or someone corrected it without your knowledge) already.
Best regards,
Dave
09-09-2004 06:51 PM
Re: totally bizarre NFS anomaly
One other suggestion to consider for the future...
HP-UX 11.0 supports NFS/TCP. It is very possible that a TCP mount would not have shown this same problem since some networking hardware tends to treat UDP and TCP traffic differently in these cases. Also, since UDP and TCP headers are different sizes, it's likely that a TCP packet wouldn't have hit the same "magic" byte size that caused the UDP corruption to occur.
If you're interested in trying NFS/TCP on your 11.0 systems, you would simply issue the following command on both your NFS client and server systems:
# setoncenv NFS_TCP 1
Once you issue this command you either need to reboot the systems or stop/restart all NFS services on the system. If you're not using any NFS services at the time you can do an:
# /sbin/init.d/nfs.server stop
# /sbin/init.d/nfs.client stop
# /sbin/init.d/nfs.client start
# /sbin/init.d/nfs.server start
This will halt and restart all of the necessary NFS daemons with support for NFS/TCP. Also, once you enable TCP on an 11.0 system it becomes the default protocol used for future NFS mounts, unless of course you're either overriding the protocol by using the "proto=udp" mount option, or by using the legacy automounter, which only supports NFS Version 2 mounts using UDP.
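After remounting, one way to confirm which version and transport each mount actually negotiated is to look at the flags reported by `nfsstat -m`. The filter below is a sketch under an assumption: that each mount's options appear on a line containing "Flags:" with comma-separated `vers=` and `proto=` entries. The function name `nfs_mount_flags` is made up for illustration.

```shell
# Hypothetical helper: read `nfsstat -m` style output on stdin and print the
# vers= and proto= settings found on each "Flags:" line.
nfs_mount_flags() {
    awk '/Flags:/ {
        n = split($0, parts, "[ ,\t]+")
        v = ""; p = ""
        for (i = 1; i <= n; i++) {
            if (parts[i] ~ /^vers=/)  v = parts[i]
            if (parts[i] ~ /^proto=/) p = parts[i]
        }
        print v, p
    }'
}

# Intended use on the NFS client (not run here):
#   nfsstat -m | nfs_mount_flags
```

A mount that still reports proto=udp after NFS/TCP is enabled would point at one of the overrides mentioned above, such as an explicit "proto=udp" option or the legacy automounter.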
Best of luck,
Dave
09-09-2004 07:08 PM
Re: totally bizarre NFS anomaly
I talked to the network fellas. Apparently they had upgraded the Cisco IOS (operating system) on the Layer 3 switch that the nfs server and client are both connected to. They had to reboot the L3 switch last night. They also said that they have done nothing with it all day, so I don't know why it would suddenly start working.
We wondered if maybe it was a duplex issue, since we have had them in the past, but no, everything seems to be running full-duplex where it should.
Does NFS have anything to do with SNMP? The only other thing I can think of that happened about the time it started working was someone restarted SNMP on the nfs-server. I don't know how such a thing could affect network transmission check-sums though.
The NFS server and NFS client are both connected to the switch, but I don't see how a switch would be manipulating packets??? How does that happen? I could almost understand it if it was a router, but a switch??? Do you know how a switch could manipulate packets? The NFS server and client are actually plugged in right next to each other on the switch, same subnet, same vlan etc.
Any other information?
Thanks.
- Andrew Gray
09-09-2004 07:18 PM
Re: totally bizarre NFS anomaly
What are the advantages of NFS/TCP over NFS/UDP ?
We use NFS a lot, and I don't know the difference, and I never bothered until now. Should I?
FWIW all our servers (HP-UX 10.20, HP-UX 11.00, HP-UX 11.11, AIX-4.3.3, and AIX-5.2.0) have cross mounted NFS all their file systems in both directions :) Sometimes Linux clients also mount any of those.
No auto-mount involved. All manual mounts.
Enjoy, Have FUN! H.Merijn
09-09-2004 07:19 PM
Re: totally bizarre NFS anomaly
Then I suggest you start doing some tracing at this time..
Look at loading 'ethereal' on the client (maybe your workstation, so you won't affect your production servers) and mount the filesystem.
You can get ethereal from HP's porting center. You will need to look at its dependencies. It may be a little hard to get it up and running, but once it is running, it's a beauty. You can trace the packets only between these two servers (using tcpdump's packet filters) and see what's happening.
http://hpux.connect.org.uk/hppd/hpux/Networking/Admin/tcpdump-3.8.3/
From the server side, use 'tusc' with the 'cat' command and see if you get any clues out of it.
http://hpux.connect.org.uk/hppd/hpux/Sysadmin/tusc-7.5/
Even if you don't use them, they are good tools to have.
You can use tcpdump or the built-in nettl (nettladm) to capture the packets, but I personally like ethereal as it has a nice GUI.
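A minimal sketch of a capture limited to the two machines might look like the following. The helper name `pair_filter`, the interface `lan0`, the output path and the client host name `nfsclient` are all placeholders, not details from this thread. The filter matches on hosts rather than on the NFS port because a UDP reply larger than the MTU is sent as IP fragments, and only the first fragment carries the port numbers, so a port-based filter can miss the later fragments.

```shell
# Hypothetical helper: build a tcpdump filter expression that matches
# traffic between two hosts, fragments included.
pair_filter() {
    echo "host $1 and host $2"
}

# Intended use, run as root while reproducing the failing cat (not run here):
#   tcpdump -i lan0 -s 0 -w /tmp/nfs_trace.pcap $(pair_filter guam nfsclient)
```

The resulting capture file can then be opened in ethereal on a workstation, and a second capture taken on the server side would show whether the packets leave intact and arrive damaged.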
-Sri