
RH Linux 5: NFS hangs and users cant log in

 
JM asks
Advisor

The server is a DL380G5 running Red Hat Linux 5 using NIS with NFS mounts. The server does encounter heavy loads.

$ uname -a
Linux . . . 2.6.18-8.el5PAE #1 SMP Fri Jan 26 14:28:43 EST 2007 i686 i686 i386 GNU/Linux

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5 (Tikanga)

$ rpm -qa | grep -i nfs
system-config-nfs-1.3.23-1.el5
nfs-utils-lib-1.0.8-7.2
nfs-utils-1.0.9-16.el5

$ ssh -V
OpenSSH_4.3p2, OpenSSL 0.9.8b 04 May 2006

Users log on from one server:
HP-UX . . . B.11.23 U ia64 3735928559



User mounts are:

type nfs (rw,nosuid,soft,addr=nn.nnn.nnn.nnn)

Two other non-user mounts use:
type nfs (rw,nosuid,nodev,addr=nn.nnn.nnn.nnn)

Different users log in to this target server from the same source server, where their home directories reside. The home directories may be on different filesystems (if that matters).


When the issue arises, symptoms include:
- Some users are able to log in, though one user I talked to stated, I think, that their home directory did not get mounted. Circumstances may change as time goes on.
- Some users, like me, cannot log in via ssh: "ssh_exchange_identification: Connection closed by remote host"
- At times I could log in using telnet, which leads me to think it is not an NIS issue. At another time, it asked for a password and then returned "Connection closed by foreign host."
- I had an ssh session running when the event occurred overnight. That session continued to function and its mounted home directory appeared to stay OK, but when I tried to log in with an additional new session, that one got bounced.
- When the df command is executed, it hangs.
- NFS hangs on reboot, while the server is going down.
- The server had to be rebooted on back-to-back days; it then went 2 weeks before the issue occurred twice within 36 hours. Again, there are times the server encounters heavy loads.
- FYI, the message log generates about 250k messages a day to go through, and without knowing the specific moment when a user cannot log in or the df command hangs, it is difficult to determine exactly when the event occurs.
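
Because the hang is hard to pin to a moment in a log of that size, one option is a small probe that writes a timestamp whenever df fails to come back. The following is only a sketch, assuming bash; the function names `run_with_deadline` and `probe_df` and the 10-second default are illustrative choices, not anything taken from the server described above. It avoids the `timeout` utility, which older coreutils releases may not ship.

```shell
#!/bin/bash
# run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND in the background and polls once per second.
# Returns the command's exit status, or 124 if it is still
# running after SECONDS.
run_with_deadline() {
    local limit=$1
    shift
    "$@" >/dev/null 2>&1 &
    local pid=$!
    local waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # A process blocked on a dead NFS server may ignore this
            # signal, so do not wait for it; just report the timeout.
            kill "$pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
}

# probe_df [LIMIT]: print one timestamped line saying whether df
# returned successfully within LIMIT seconds (default 10, an assumption).
probe_df() {
    local limit=${1:-10}
    local stamp
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    if run_with_deadline "$limit" df -k; then
        echo "$stamp df ok"
    else
        echo "$stamp df failed or did not return within ${limit}s"
    fi
}
```

Called once a minute from cron with its output appended to a file, `probe_df` would leave a list of timestamps that narrows down which part of the message log to read.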
6 REPLIES
Steven E. Protter
Exalted Contributor

Re: RH Linux 5: NFS hangs and users cant log in

Shalom,

1) Has this server ever been patched?
2) Try and see if there is a pattern, e.g. the failures are all with home directories being served by a particular host.

I suspect communication is being lost to a particular host in the NFS/home infrastructure.

You may need to create or elevate an existing server to be a management server. It will need the ability to manage what users are on what NFS home server.

You can run a few utilities on the various NFS hosts to gather data.

df -kh # if there are nfs mounts the hang will include the name of the hung server.

showmount -e hostname

# This will show available NFS shares and should be run over time.
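
To look for a pattern across home directory servers, it can help to list which NFS servers the client is actually using and query each in turn. A rough sketch, assuming bash and the usual `mount` output layout (`server:/export on /mnt type nfs (options)`); the helper name `nfs_servers` is made up for illustration.

```shell
#!/bin/bash
# nfs_servers: read `mount` output on stdin and print each NFS
# server name once.
# Expects lines shaped like: server:/export on /mnt type nfs (options)
nfs_servers() {
    awk '$4 == "type" && ($5 == "nfs" || $5 == "nfs4") {
        split($1, part, ":")   # part[1] is the text before the first colon
        print part[1]
    }' | sort -u
}

# Possible use, repeated over time:
#   mount | nfs_servers | while read -r host; do
#       echo "== $host $(date)"
#       showmount -e "$host"
#   done
```

If the same server name keeps turning up next to a slow or failing showmount, that points at the host to investigate first.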


If no host is going down, then you may simply be overloading the NFS server.

sar can be used to measure network traffic and you may need to load balance.

NFS can be clustered, active-active, so you have options to lower load factors.

Network teaming can also help with load issues. Two NIC cards bonded with 1 IP address can in practice handle 75% more traffic than a single NIC card.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
TwoProc
Honored Contributor

Re: RH Linux 5: NFS hangs and users cant log in

As for the users who have logon problems all the time: check the permissions in /etc/hosts.allow, especially if your ftp is run from xinetd or inetd. You may consider moving ftp from an inetd-type service to one that runs full time and is started from /etc/rcX.d, to lower the load from having those services spin up and down constantly.
We are the people our parents warned us about --Jimmy Buffett
JM asks
Advisor

Re: RH Linux 5: NFS hangs and users cant log in

John/Steven

Thanks for the quick response. I have only been here 3 weeks and am trying to get familiar with the environment. I will have to do a little more investigating, with some of the others more familiar with the server, to learn its patch history, but I do want to have more tools to work with if the problem occurs again, and some better answers.

I am going to have to continue investigating and familiarizing myself with things. One thing I noticed with "mount -l" is that one of the less problematic users showed the server alias rather than the server name itself.

The only other numbers I have at the moment are:

The server has only one NIC.
The only measurement that I have been following is PTYs which max out at 840.
MAX pty for the Linux server is 4096.
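
For what it is worth, the two PTY figures above work out to roughly a fifth of the limit. A small sketch of that arithmetic, assuming bash; the function names are hypothetical, and the `/proc/sys/kernel/pty` files are an assumption about what the running kernel exposes, so the second helper fails quietly if they are missing.

```shell
#!/bin/bash
# pty_usage_percent USED MAX: integer percentage of the PTY limit in use
# (rounded down by shell integer division).
pty_usage_percent() {
    local used=$1 max=$2
    echo $(( used * 100 / max ))
}

# current_pty_usage: same calculation from the kernel's live counters,
# if this kernel exposes them (an assumption, checked at run time).
current_pty_usage() {
    local nr=/proc/sys/kernel/pty/nr max=/proc/sys/kernel/pty/max
    if [ -r "$nr" ] && [ -r "$max" ]; then
        pty_usage_percent "$(cat "$nr")" "$(cat "$max")"
    else
        echo "pty counters not available" >&2
        return 1
    fi
}

# With the figures quoted above:
#   pty_usage_percent 840 4096    # prints 20
```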
Gerardo Arceri
Trusted Contributor

Re: RH Linux 5: NFS hangs and users cant log in

I'd start by mounting the NFS share with the TCP option (it defaults to UDP): mount -o tcp server:/share /mountpoint
If you are having network issues, at least this will mitigate them.
I totally agree that this system badly needs to be patched.
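
Before remounting, it may be worth checking which transport each NFS mount is using now. A minimal sketch, assuming bash and input in the `/proc/mounts` layout (device, mount point, type, options); the helper name `nfs_transports` is illustrative, and because the option is spelled differently across kernel versions (assumed here to be either `tcp`/`udp` or `proto=tcp`/`proto=udp`), anything else is reported as unknown.

```shell
#!/bin/bash
# nfs_transports: read /proc/mounts-style lines on stdin and print
# "<mount point> <transport>" for each NFS mount.
# Fields: device mountpoint fstype options dump pass
nfs_transports() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        proto = "unknown"
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++) {
            if (opt[i] == "tcp" || opt[i] == "proto=tcp") proto = "tcp"
            if (opt[i] == "udp" || opt[i] == "proto=udp") proto = "udp"
        }
        print $2, proto
    }'
}

# Possible use:
#   nfs_transports < /proc/mounts
```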

JM asks
Advisor

Re: RH Linux 5: NFS hangs and users cant log in

Users/app owners ramped down the workload on the server, which helped clear the problem.
JM asks
Advisor

Re: RH Linux 5: NFS hangs and users cant log in

Users modified the application workload on the server, which helped clear the problem.