Re: Weird 11.11 NFS Client Behaviour to a NAS - tcp, udp or cache Issues?

Alzhy · 01-21-2010

I have an HP-UX 11.11 NFS client mounting a Celerra NAS. It uses NFS v3 over UDP (the client and the NAS are on the same Gigabit LAN):

celerra:/bills /bills nfs soft,rsize=32768,wsize=32768,proto=udp,NFSv3 0 0

We have an application that was reportedly suffering very slow writes to this NAS. The writes are not that intensive, but trussing the process showed it was indeed trickling on writes.

So I checked the performance of the NFS mount by writing a 1 GB file to it with the "prealloc" command:

prealloc testfile 1000000000

After that test (which finished in 7 seconds), the application's write throughput sped up and everything is back to normal.

Do any of the NFS gurus here have a clue? I'm not much of an NFS expert, and I don't even know what I should be looking for in nfsstat.

Would changing to proto=tcp be a good move? The server is a fairly large 40-CPU nPar running HP-UX 11.11. Buffer cache is at 2 GB. It is a mixed-load environment with one massive database on it, and it also runs custom middleware apps that transact with the local database, as well as WebLogic infrastructure. The server has a single Gigabit network connection.

The NAS is an EMC Celerra. We have other similarly sized servers with the same NFSv3-over-UDP mount options where this issue does not seem to occur, but their system cache is far larger than what we have on this problematic environment.

TIA for any insights, views and suggestions on what to look for.

Hakuna Matata.

mvpel · 01-21-2010

Definitely use TCP - here's a good rundown of why: http://www.sunmanagers.org/archives/1997/1586.html

In particular, using a 32k wsize/rsize on UDP NFS is not great: each 32k request goes out as one UDP datagram split into about 22 packets, and if a single one of them is dropped or lost, all of them have to be retransmitted. With TCP, only the missing packet needs to be resent.
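To put rough numbers on the fragment count, here is a minimal sketch. It assumes a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header with no options, an 8-byte UDP header, and a hypothetical 128 bytes of RPC/NFS header overhead per request. The 0.1% fragment loss rate is purely illustrative, not something measured on this network.

```python
import math

MTU = 1500          # assumed: standard Ethernet MTU, no jumbo frames
IP_HEADER = 20      # IPv4 header without options
UDP_HEADER = 8
RPC_OVERHEAD = 128  # hypothetical RPC/NFS header size, for illustration only


def fragments_per_datagram(nfs_payload, mtu=MTU):
    """IP fragments needed for one UDP datagram carrying nfs_payload bytes."""
    datagram = nfs_payload + RPC_OVERHEAD + UDP_HEADER
    per_fragment = mtu - IP_HEADER  # 1480 bytes of data per fragment at MTU 1500
    return math.ceil(datagram / per_fragment)


def datagram_loss_probability(fragments, fragment_loss):
    """Chance that at least one fragment is lost, so the whole request is resent."""
    return 1 - (1 - fragment_loss) ** fragments


if __name__ == "__main__":
    n = fragments_per_datagram(32768)
    p = datagram_loss_probability(n, 0.001)  # illustrative loss rate
    print(f"{n} fragments per 32k request")
    print(f"{p:.2%} of 32k requests would need a full retransmit")
```

Under these assumptions the count lands in the low twenties, in line with the "about 22" figure. The point is that the per-request failure rate grows with the number of fragments, so a smaller rsize/wsize on UDP reduces the cost of each lost packet.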

Further, UDP doesn't do congestion avoidance like TCP does, so you can crush your network with large UDP transfers, whereas TCP will dial back gracefully.

You may also want to take a look at your TCP tuning parameters as you migrate your NFS links to TCP. Are you using jumbo frames? Most people don't. The default transmit and receive windows may be mismatched with your wire speed: the default TCP window size of 32k can be filled in about a quarter of a millisecond at gigabit speeds. I also find that turning on selective ACK is helpful with larger windows.
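The quarter-millisecond figure is easy to check, and the same arithmetic gives a rough target for the window size. This is only a sketch: the 0.5 ms round-trip time is a hypothetical LAN value, so measure your own before tuning anything.

```python
def window_fill_time_ms(window_bytes, link_bits_per_sec):
    """Time in milliseconds to put one full window on the wire."""
    return window_bytes * 8 / link_bits_per_sec * 1000


def bandwidth_delay_product_bytes(link_bits_per_sec, rtt_seconds):
    """Bytes that must be in flight to keep the link busy for one round trip."""
    return link_bits_per_sec * rtt_seconds / 8


if __name__ == "__main__":
    gigabit = 1_000_000_000
    fill = window_fill_time_ms(32 * 1024, gigabit)
    bdp = bandwidth_delay_product_bytes(gigabit, 0.0005)  # hypothetical 0.5 ms RTT
    print(f"32k window fills in {fill:.3f} ms at gigabit speed")
    print(f"window needed to cover a 0.5 ms RTT: {bdp:.0f} bytes")
```

If the round-trip time is longer than the fill time, the sender sits idle waiting for ACKs, which is why a window larger than the default can help on a fast link.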

Alzhy · 01-21-2010

Thanks mvpel,

Any reason why my write test seemed to have "nudged" those apps to start writing faster? My clients now want us to have a "nudger" script running alongside the apps that write to the NAS to ensure their apps write efficiently -- which is absurd ;^)

The nfsstat output seemed clean.
My buffers at 2GB seem to be sufficient: 100% average buffer utilization on reads. Write cache performance varies widely, from 50% to 100%.

Average packets per second during the episode ranged from 1500 to 5000.

Hakuna Matata.

IT Response · 01-22-2010

If you have not rebooted since you saw the problem, can you post the "nfsstat -c" output here?

If the problem is reproducible, a before and after snapshot of the same might be useful.
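For comparing the two snapshots, something like the sketch below could help. It assumes the output is laid out as a row of counter names followed by a row of values, with section titles ending in a colon. The sample text is invented for illustration and is not real output from this system, so check the layout against what your nfsstat actually prints.

```python
def parse_counters(text):
    """Pair each row of counter names with the row of numbers below it."""
    counters = {}
    section = ""
    previous = []
    for line in text.splitlines():
        fields = line.split()
        if line.strip().endswith(":"):
            section = line.strip().rstrip(":")  # e.g. "Client rpc"
        # Keep plain integers only, so percentage columns like "5%" are ignored.
        numbers = [f for f in fields if f.isdigit()]
        names_row = bool(previous) and all(not p[0].isdigit() for p in previous)
        if names_row and numbers and len(previous) == len(numbers):
            for name, value in zip(previous, numbers):
                counters[f"{section}/{name}"] = int(value)
        previous = fields
    return counters


def counter_deltas(before, after):
    """Return only the counters that changed between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] != before.get(key, 0)
    }


# Invented sample snapshots, for illustration only.
SAMPLE_BEFORE = """\
Client rpc:
Connectionless:
calls      badcalls   retrans    timeouts
1000       2          5          7
"""

SAMPLE_AFTER = """\
Client rpc:
Connectionless:
calls      badcalls   retrans    timeouts
1500       2          45         47
"""

if __name__ == "__main__":
    deltas = counter_deltas(parse_counters(SAMPLE_BEFORE), parse_counters(SAMPLE_AFTER))
    for name, change in sorted(deltas.items()):
        print(f"{name}: +{change}")
```

Retransmit and timeout counters that climb much faster than the call count during a slow episode would point at packet loss rather than at the cache.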

If you were writing to the pre-alloc'd file with the app, maybe it was pre-caching the whole file on the NAS? Otherwise, no idea on the nudging theory. Maybe those big round nulls cleared the pipes? ;-)
