ML530 Hangs on RH 72

 
Dhiresh Vyas
Occasional Advisor

Our ML530 server hangs when performing multiple large file transfers.

The current setup is as follows -

Hardware
ML530 G2
5304/256 raid controller
Drives 1-2 Raid 1
Drives 3-6 Raid 5 with online spare

O/S
Red Hat 7.2 - We are limited to this because we run Veritas NetBackup, and that requires 7.2

We are also running Veritas File system on the Raid 5 drives.

Problems

Some of the data folders are exported using exportfs, and when one tries to copy several files to the NFS client, the server freezes after 10 minutes.

When the server freezes, all the drive lights are a steady green. The keyboard and mouse are not responsive, but the server still responds to ping. The only way out of this is a power cycle.

If we copy the same files via scp (as opposed to using the mounts) the server does not crash.

Any help in this matter will be greatly appreciated.

6 REPLIES
Huc_1
Honored Contributor

Re: ML530 Hangs on RH 72


If you find nothing in dmesg, in /var/log/messages, in the other log files under /var/log/*, or even in your sar reports, I would consider using a tool like tcpdump.

#tcpdump -lvv > tcpdump.dat

Beware: this could generate a lot of data in "10 minutes" on a busy network, so you may need a lot of space.

Also check the logs on the other node if you can. If all else fails, you could run tcpdump from there too.

This kind of problem is sometimes hard to track down; the above is only a way to determine whether it is a network resource problem or more of a RAID/disk/NFS problem.
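
If disk space is a concern, a narrower capture may help. The following is only a sketch, assuming NFS is on its standard port 2049; the interface name and output path are assumptions too, and the function name is made up for illustration. It prints the command instead of running it, so it can be reviewed first. Portmap and mountd traffic use other ports and would not be captured this way.

```shell
#!/bin/sh
# Sketch: build a tcpdump command limited to NFS traffic.
# Assumptions (not from the thread): interface name, output path,
# and NFS running on its standard port 2049.
nfs_capture_cmd() {
    iface="$1"      # e.g. eth0 (assumed)
    outfile="$2"    # where to write the raw capture
    # -i: interface to listen on
    # -w: write raw packets to a file instead of printing decoded text
    # "port 2049": keep only traffic to or from the NFS port
    printf 'tcpdump -i %s -w %s port 2049\n' "$iface" "$outfile"
}

nfs_capture_cmd eth0 /var/tmp/nfs-hang.pcap
```

Run the printed command as root. The resulting file can be read back later with `tcpdump -r`. Older tcpdump versions truncate packets to a short default length, so you may also want to raise it with `-s`.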

Hope it helps, and keep us informed.

J-P

Smile I will feel the difference
Dhiresh Vyas
Occasional Advisor

Re: ML530 Hangs on RH 72

I have tcpdump running for now. Will be able to post some results later.

Thanks
Mark Grant
Honored Contributor

Re: ML530 Hangs on RH 72

I would be interested in the client's logs myself. What OS is the client and what kernel version does Red Hat 7.2 run?

NFS was a bit odd on Linux a while ago.
Never precede any demonstration with anything more predictive than "watch this"
Dhiresh Vyas
Occasional Advisor

Re: ML530 Hangs on RH 72

>I would be interested in the client's logs myself. What OS is the client and what kernel version does Red Hat 7.2 run?

The NFS client runs RH 8, kernel 2.4.18-14smp.
The server runs 2.4.7-10smp.

The client log always shows

kernel: nfs: server 10.10.10.20 not responding, timed out,

umount gives me -
kernel: nfs_statfs: statfs error = 5

>NFS was a bit odd on Linux a while ago.

We are trying to install an nfsfix from Veritas, but there seem to be version conflicts with the patch: after applying it, nfsd does not start, so we are forced to uninstall it and go back to the native nfsd module.
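
For anyone wanting to check how often the client logged these timeouts, here is a minimal sketch. The function name is made up for illustration, and /var/log/messages is only assumed to be where the client's kernel messages end up. For reference, "statfs error = 5" is errno 5, which is EIO (I/O error) on Linux.

```shell
#!/bin/sh
# Sketch: count NFS timeout messages in a syslog-style file.
# count_nfs_timeouts is a hypothetical helper, not part of any package.
count_nfs_timeouts() {
    logfile="$1"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'nfs: server .* not responding' "$logfile"
}
```

On the client this could be called as `count_nfs_timeouts /var/log/messages`, adjusting the path if syslog is configured differently.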

Dhiresh Vyas
Occasional Advisor

Re: ML530 Hangs on RH 72

Here is an update -

Veritas has given us a patch to fix this issue, but we cannot load it because of a version mismatch between the kernel and the patch.

The patch supposedly fixes an nfsd bug which causes the system to panic.

I have now been escalated to a higher level of support and am awaiting a call from Veritas.
Paulo A G Fessel
Trusted Contributor

Re: ML530 Hangs on RH 72

This is bizarre. We've been told that NetBackup would run regardless of the kernel version - something that does make sense, as there are no parts of NetBackup that are kernel dependent.

So my bet is that you're depending on Veritas because of VxFS. And if it's so, I have some bad news:

* the kernels that VxFS supports have sluggish SCSI performance. I noticed this while certifying VxFS and VxVM for Linux. Upgrading from 2.4.7-10smp to a newer, clean 2.4.19 raised SCSI throughput roughly fourfold (2400 kB/s in 2.4.7 versus 9800 kB/s in 2.4.19).

* it's not necessary to use VxFS. Linux has other journalled filesystems with excellent performance. I personally recommend using XFS in light of recent benchmarks; if you do need the maximum flexibility of VxFS then ReiserFS is also a great choice.
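
Taking the two throughput figures quoted in the first point at face value, the improvement can be worked out with a quick sketch like this (the function name is made up for illustration; the two numbers are the ones from the post):

```shell
#!/bin/sh
# Sketch: ratio and percentage increase between two throughput figures.
# speedup is a hypothetical helper; arguments are old and new kB/s.
speedup() {
    awk -v o="$1" -v n="$2" 'BEGIN {
        printf "ratio: %.2fx\n", n / o
        printf "increase: %.0f%%\n", (n - o) / o * 100
    }'
}

# 2400 kB/s on 2.4.7-10smp versus 9800 kB/s on 2.4.19
speedup 2400 9800
# prints:
#   ratio: 4.08x
#   increase: 308%
```
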

Talk with your Veritas representative. There's no sense in forcing customers to use an old (and soon to be EOL) version of Red Hat.

Good luck,
Paulo Fessel
L'employé propose, le boss dispose.