NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

 
Pablo Noya Noya
Occasional Advisor

Hello all! As a forum newcomer, here is my first issue. We are having problems with a filesystem mounted via NFS. We lose the connection intermittently, both day and night, and it restores itself after several minutes, as in the example below. We are running HP-UX 11i on both machines. Is there a log, apart from syslog, where we can find further information? I will be mounting the same filesystem via NFS on another machine to test, but we are not sure whether it is a network issue. Will network problems be recorded in a log?

Thanks in advance,

Pablo

Jan 17 12:56:26 cchp39 vmunix: NFS server 10.254.6.66 not responding still trying
Jan 17 13:04:29 cchp39 vmunix: NFS server 10.254.6.66 ok
Jan 17 13:07:37 cchp39 vmunix: NFS server 10.254.6.66 not responding still trying
Jan 17 13:16:40 cchp39 vmunix: NFS server 10.254.6.66 ok
Jan 17 13:16:41 cchp39 vmunix: NFS server 10.254.6.66 not responding still trying
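
A minimal sketch for measuring how long each outage lasts, in case the durations turn out to be regular (which would point at something periodic, such as a switch resetting a port). It assumes the syslog line format shown above and that no outage spans midnight; the function name nfs_outages is made up for this example.

```shell
# Pair each "not responding" line with the next "ok" line and print
# the start time, end time and length of the outage.
# Assumes field 3 of every line is the HH:MM:SS timestamp.
nfs_outages() {
    awk '
    function secs(t,  p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
    /NFS server .* not responding/ { if (!down) { start = $3; down = 1 } next }
    /NFS server .* ok/ && down {
        d = secs($3) - secs(start)
        printf "%s -> %s %dm%02ds\n", start, $3, int(d/60), d%60
        down = 0
    }'
}

# Usage (the standard HP-UX syslog location):
#   grep 'NFS server' /var/adm/syslog/syslog.log | nfs_outages
```

On the five lines above this reports two completed outages, of 8m03s and 9m03s; the last "not responding" line has no matching "ok" yet and is ignored.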
Pete Randall
Outstanding Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Hi Pablo,

I'd start by verifying connectivity during a "not ok" period. After you get the "not responding" message, try pinging the server: "ping 10.254.6.66".

Pete

Bill McNAMARA_1
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Use nfsstat to check the NFS statistics.
Attach the results in your next post.

Bill
It works for me (tm)
Steven E. Protter
Exalted Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

If you have been playing with pfs_mount, perhaps trying to mount Oracle CDs, you can get into a little trouble with this issue.

Without the nfsstat data, here is a pfs_mount script that might help.

Attached.

It's an /sbin/init.d script, but it's really good at managing those tricky little Oracle mounts.

P
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Jannik
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

I have seen this on machines that have a lot of network traffic, so my guess is that it just loses the connection for a short period of time.
Another thing could be your network settings, such as 10 or 100 Mbit and half or full duplex:
netstat -in to find the right interface.
lanscan to find the number of the interface.
lanadmin -sx 0 (if it is interface 0)
Output should be something like this:
pollux:/#lanadmin -sx 0
Speed = 100000000
Speed = 100 Half-Duplex.
Autonegotiation = On.
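
The check above can be repeated for every interface with a small helper. This is only a sketch: the function name link_summary is invented here, it matches only the two output formats that appear in this thread, and the usage loop assumes lanscan -p prints one PPA number per line.

```shell
# Read "lanadmin -sx <ppa>" output on stdin and print the line that
# carries the duplex setting, prefixed with the PPA passed as $1.
link_summary() {
    awk -v ppa="$1" '/-Duplex/ {
        sub(/^.*= */, "")    # drop the "Speed =" / "Current Config =" label
        sub(/\.$/, "")       # drop a trailing full stop, if any
        print "ppa " ppa ": " $0
    }'
}

# Usage on the HP-UX box (run it on both client and server):
#   for ppa in $(lanscan -p); do lanadmin -sx "$ppa" | link_summary "$ppa"; done
```

Compare the result with what the switch port is configured for; a half-duplex or autonegotiated setting on one side against a forced full-duplex setting on the other is the classic mismatch.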

BR,
Jannik
jaton
Pablo Noya Noya
Occasional Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Dear all,

I am able to ping the machine when the NFS filesystem is down.
Here is the output of the nfsstat.
Oleg Zieaev_1
Regular Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Hello.

Please post the output of nfsstat and nfsstat -m.
This will help to pinpoint the problem.

Good luck,
0leg
Professionals will prevail ...
Pablo Noya Noya
Occasional Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

I am currently unable to access NFS server 10.254.6.66 and lanscan returns a state of 'UP' for all my lans.
Here is the output for lanadmin:

Thanks again!!!!
Pablo

lanadmin -sx 1
Speed = 100000000
Current Config = 100 Full-Duplex AUTONEG
cchp39:root:/usr/sap/trans# lanadmin -sx 7
Speed = 100000000
Current Config = 100 Full-Duplex MANUAL
Pablo Noya Noya
Occasional Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

With the filesystem still down I am unable to do a
nfsstat -m
with the error
'Unable to open mount table'

Cheers!!

Pablo
Kelli Ward
Trusted Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Hi Pablo,
Have you had a chance to NFS mount another client yet?
Has the failure followed?
Kel
The more I learn, the more I realize how much more I have to learn. Isn't it GREAT!
Ron Kinner
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Do
lanadmin
lan
display

and look for errors (especially on the second page)

Do this on both machines. Possibly a duplex mismatch or bad cable is causing the switch to reset the interface periodically.

Also run

netstat -s

on both machines. See if you see anything suspicious.

Ron
Rammig Claus
Frequent Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Hi Pablo,

Maybe the number of your NFS daemons (nfsd) is too low.
Look at NUM_NFSD in /etc/rc.config.d/nfsconf
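
A small sketch for reading that value, assuming the file contains a plain NUM_NFSD=<number> shell assignment (optionally quoted). The function name num_nfsd is made up for this example.

```shell
# Print the NUM_NFSD value from an rc config file.
# $1: path to the file (defaults to the one named above).
num_nfsd() {
    sed -n 's/^NUM_NFSD="\{0,1\}\([0-9][0-9]*\).*/\1/p' "${1:-/etc/rc.config.d/nfsconf}"
}

# Usage on the NFS server:
#   num_nfsd
#   ps -e | grep nfsd     # rough view of the daemons actually running
```

The ps line is only a rough cross-check; a change to the file normally takes effect only after the NFS server subsystem is restarted.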

Best regards ...
Claus
No risc no fun
Pablo Noya Noya
Occasional Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Hello all again!
I've had a look at the LAN status on both machines with lanadmin, and I see that on cchp39 I have an Operation Status (value) = down(2). This differs from the value of 'up' on the NFS server 10.254.6.66, where the filesystem is exported.
I've also had a look at the NUM_NFSD parameter in the nfsconf file, which is set to 16.
Does any of this make sense?

Thanks again to everyone who is helping me out with this. I'm only a DBA with Unix admin skills, and LAN stuff just kills me.

Pablo
Jannik
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Please post the output of netstat -in from the machine where you have the down connection.

BR,
Jannik
jaton
Ron Kinner
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Pablo,

I assume that you have only the one NIC in your box. Your Op Stat Down could be the result of having a second NIC installed but not connected. If you have more than one, then after you enter the lan command, type ppa 1 (or one plus whatever the first one was).

Op stat down essentially means the same as no link light. Either the cable is bad, the NIC is bad or the switch/hub port is not working. What kind of switch/hub do you have?

I have seen NICs where the RJ45 connector was a bit loose and you could move the cable one way and it would work. A little the other way and it stopped. However, I lean toward blaming the switch. Some switches will periodically reset or busy out a port which has too many errors. Too many errors is often the result of a duplex mismatch, so it is usually a good idea to set the duplex to 100 full on both sides of the link (or whatever both sides can support). You might also try a different port on the hub/switch, just on the off chance that the current one is bad.

Ron

Pablo Noya Noya
Occasional Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

Thanks to all for your help. The NFS mounted filesystem has been working correctly for the last couple of days and so I am unable to try several things. The alternate server to mount the filesystems is also unavailable until next week (you know, firewall people and all that).
We shall see on Monday!!!!

Thanks again
Wai Kiong Choy
Advisor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc

I encountered the same problem in my NFS environment. Our Unix and network guys spent a few days fixing it. Below are our findings:

1. Investigate whether the kernel parameter values recommended by David Olker in his book "Optimizing nfs performance ..." apply to you. A document based on the book is available for free download on the HP website.

2. Do you have multiple NICs on the NFS server/client? If yes, are these NICs connected to the same switch/hub? We experienced problems when we connected the NICs to different switches. Are these NICs advertising the same MAC address to other machines? (More on how to check this below.)

3. Check the speed/duplex settings of the NIC, and check that the same settings are configured on the switch/hub. Don't use auto-negotiation, because different vendors implement the negotiation differently and it may cause intermittent problems. If the machine is connected to a Cisco switch, check whether the port has an unusually high number of FCS errors. FCS errors point to a speed/duplex mismatch between the NIC and the port.

4. Write a script to do a continuous ping between the NFS server and client, preferably with a timestamp attached to each ping output. When the "NFS server not responding" error occurs, check the ping output and see whether ping dropped packets at the same time.

5. Write a script that runs the following ARP sequence continually (at approx. 30-second intervals) to resolve the MAC address of the NFS server on the client, preferably with a timestamp attached to the results too.

- arp nfsserver >> output
- arp -d nfsserver # delete the entry from arp table
- arp nfsserver >> output

ARP should resolve to the same MAC address all the time. If the MAC address of the NFS server changes, it will cause problems for the clients.
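
Steps 4 and 5 could be sketched roughly as below. This is not the script we used, just an illustration: all function names are invented, the ping call assumes HP-UX syntax (host first, then -n count), and arp -d needs root. If nothing talks to the server between the flush and the next lookup, the entry shows up as "unresolved".

```shell
# Prefix each line read on stdin with the current date and time.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Print the first field on stdin that looks like a MAC address
# (six colon-separated hex groups); print nothing if there is none.
mac_of() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+$/) {
                print $i
                exit
            }
    }'
}

# Step 4: one ping per interval ($2, default 5 s), logging the
# timestamped "packet loss" summary line for host $1.
ping_watch() {
    while :; do
        ping "$1" -n 1 2>&1 | grep 'packet loss' | stamp
        sleep "${2:-5}"
    done
}

# Step 5: resolve, log, flush, wait ($2, default 30 s), repeat.
# A change of MAC address for host $1 is flagged in the log line.
arp_watch() {
    last=
    while :; do
        mac=$(arp "$1" 2>/dev/null | mac_of)
        if [ -n "$last" ] && [ -n "$mac" ] && [ "$mac" != "$last" ]; then
            echo "$1 MAC CHANGED $last -> $mac" | stamp
        else
            echo "$1 ${mac:-unresolved}" | stamp
        fi
        if [ -n "$mac" ]; then last=$mac; fi
        arp -d "$1" >/dev/null 2>&1   # flush so the next lookup is fresh
        sleep "${2:-30}"
    done
}

# Usage on the NFS client, left running in the background:
#   ping_watch nfsserver >> ping.log &
#   arp_watch  nfsserver >> arp.log &
```

When the next "NFS server not responding" message appears in syslog, line up its timestamp against both logs.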
Sanjiv Sharma_1
Honored Contributor

Re: NFS not responding, NFS ok, NFS not responding, NFS ok,...etc.etc