NFSv4 slow performance

 
EricIrvin
Member

We recently migrated our datacenter to a new location with newer servers, more CPUs, double the memory, and so on. We took this opportunity to move from NFSv3 to NFSv4, and from day one all NFS transactions between server and client have been much slower. Reverting to NFSv3 fixed the slowdown.

 

I know many factors changed with the move to the new datacenter, but the fact that reverting to NFSv3 fixed the slowdown suggests that NFSv4 itself is slower than NFSv3. Why would that be? Are there tunables that we have to change?

 

Basic system specs:

NFS Servers are SD2 vPars

NFS Clients are primarily Integrity BL860c i4's

OS: HP-UX 11i v3, March 2013 with ONCplus  B.11.31.16 ONC+ 2.3

 

Server share options via SG and HANFS: XFS[3]="-o fsid=16389,root=<host_list.fqdn> /interfaces"

 

Client mount options: sapuxr3p.ci.root:/interfaces /interfaces nfs vers=3,defaults 0 0

 

dmesg error: WARNING: Synchronous Page I/O error occurred while paging to/from NFS server sapuxr3p.ci.root
file system is /interfaces

 

I'm looking for some guidance as to why NFSv4 is performing worse than NFSv3.

Dave Olker
HPE Pro

Re: NFSv4 slow performance

Hard to say without data.  You could try mounting the same filesystem via v3 and v4 at the same time and then collect nfsstat -m to compare mount options.  Aside from mount options and kernel tunables (of which there are dozens), a network trace comparing v3 and v4 traffic would be the most definitive way of understanding why things are slower with v4.

 

If you are able to mount the same filesystem with v3 and v4, post nfsstat -m output here.

 

Dave

I work for HPE

[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
EricIrvin
Member

Re: NFSv4 slow performance

Hi Dave,

   Thank you for the quick response.  I've added some more detail to my original post.  As requested, here is the output from nfsstat -m with the same filesystem mounted via both v3 and v4 simultaneously:

 

esbd0606:/homeroot/root> nfsstat -m
/tmp_mnt/usr/sap/transSR from sapuxsrp.ci.root:/usr/sap/transSR
Flags: vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,devs,rsize=32768,wsize=32768,retrans=5,timeo=600
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

 

/usr/sap/transSR from sapuxsrp.ci.root:/usr/sap/transSR
Flags: vers=4,proto=tcp,sec=sys,hard,intr,link,symlink,acl,devs,rsize=32768,wsize=32768,retrans=5,timeo=600
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

 

**NOTE** I have taken this output from a less heavily used production system, but the settings should be the same.

Dave Olker
HPE Pro

Re: NFSv4 slow performance

      **NOTE** I have taken this output from a less heavily used production system, but the settings should be the same.

 

Not necessarily.  Some NFS mount options will change based on kernel settings.  If you can get the nfsstat -m output from the real system experiencing the problem that would be best.

 

Assuming the other system matches this one, there don't appear to be any differences in the mount options, other than the version itself, that would explain v4 performing worse than v3.
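
For anyone wanting to check this mechanically rather than by eye, here is a minimal sketch that compares two "Flags:" lines from nfsstat -m. It assumes Python 3 is available somewhere you can paste the output; the helper names parse_flags and diff_flags are made up for this illustration and are not part of any HP-UX tool. The two input lines are the ones posted above.

```python
def parse_flags(line):
    """Turn an nfsstat -m 'Flags:' line into a dict of option -> value."""
    text = line.split(":", 1)[1].strip()
    options = {}
    for item in text.split(","):
        key, sep, value = item.partition("=")
        # Options without '=' (hard, intr, ...) are stored as True.
        options[key] = value if sep else True
    return options


def diff_flags(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# The two Flags lines posted earlier in this thread.
v3 = parse_flags(
    "Flags: vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,devs,"
    "rsize=32768,wsize=32768,retrans=5,timeo=600"
)
v4 = parse_flags(
    "Flags: vers=4,proto=tcp,sec=sys,hard,intr,link,symlink,acl,devs,"
    "rsize=32768,wsize=32768,retrans=5,timeo=600"
)

print(diff_flags(v3, v4))  # prints {'vers': ('3', '4')}
```

The same comparison can be applied to the "Attr cache:" lines. It only covers what nfsstat -m reports, so kernel tunables still need to be checked separately.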

 

To be clear, are you saying *every* NFS operation takes longer with v4 or only *certain* operations?  In other words, does it take longer to mount the filesystem with v4?  Once both filesystems are mounted, does it take longer to read, write, traverse directories, etc?  If you can isolate exactly which operations are slower with v4 it will help determine a course of action.
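
One rough way to isolate the slow operations is to time the same basic calls against the v3 mount and the v4 mount. The sketch below assumes Python 3 on the client; probe and the sample file name are hypothetical, and the timings include client-side caching effects, so treat the numbers as a pointer to where to look, not as a benchmark.

```python
import os
import time


def timed(fn, *args):
    """Run fn(*args) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def read_file(path):
    """Read the whole file so the data actually crosses the wire."""
    with open(path, "rb") as f:
        return f.read()


def probe(mount_point, sample_file):
    """Time a few basic operations under one mount point.

    sample_file is a file name relative to mount_point (hypothetical;
    pick a small file that really exists on the share).
    """
    sample = os.path.join(mount_point, sample_file)
    return {
        "stat dir": timed(os.stat, mount_point),
        "list dir": timed(os.listdir, mount_point),
        "stat file": timed(os.stat, sample),
        "read file": timed(read_file, sample),
    }


def report(label, timings):
    """Print one line per operation for a given mount."""
    for operation, seconds in timings.items():
        print(f"{label:>4}  {operation:<10} {seconds:8.4f}s")
```

With the two mounts shown earlier, usage would look something like report("v3", probe("/tmp_mnt/usr/sap/transSR", "somefile")) followed by report("v4", probe("/usr/sap/transSR", "somefile")), where "somefile" stands in for a real file on the share. Running each probe a few times helps separate first-access cost from cached access.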

 

Dave

EricIrvin
Member

Re: NFSv4 slow performance

Dave,

   Here is the output from the main problem system:

/interfaces from sapuxr3p.ci.root:/interfaces
Flags: vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,devs,rsize=32768,wsize=32768,retrans=5,timeo=600
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

 

The operations we have seen the most problems with are as follows:

- A scripted ftp from a server to one of the NFS clients does a "cd" within the ftp session which hangs for ~20 seconds.  This particular script transfers many small files, so the hangs wind up compounding.

 

- Reading a small file from a directory containing a large number of small files.

 

- Running a transport of a medium-to-large file into a client mount point. The transport has to copy some data and write log files across multiple client mount points.

 

These systems are all running SAP, and the client chosen for any given operation is determined by the central SAP instance's load-balancing mechanism.

 

 

Dave Olker
HPE Pro

Re: NFSv4 slow performance

You can tackle these one at a time.  The most egregious one seems to be:

 

   - A scripted ftp from a server to one of the NFS clients does a "cd" within the ftp session which stops responding for ~20 seconds.

 

This should be fairly easy to reproduce manually.  Once you're able to reproduce this at will, a network trace of this operation should show you where the 20 second delay is.
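
Once a trace exists, finding the delay mostly comes down to looking for the largest gap between consecutive packets. A minimal sketch of that step, assuming the trace has been exported to (timestamp in seconds, summary) pairs by whatever capture tool is in use; the function name largest_gaps and the sample events are invented for illustration and do not come from a real trace.

```python
def largest_gaps(events, threshold=1.0):
    """Return gaps of at least `threshold` seconds, largest first.

    events: list of (timestamp_seconds, summary) tuples in capture order.
    Each result is (gap_seconds, summary_before, summary_after).
    """
    gaps = []
    for (t0, before), (t1, after) in zip(events, events[1:]):
        delta = t1 - t0
        if delta >= threshold:
            gaps.append((delta, before, after))
    return sorted(gaps, reverse=True)


# Made-up sample data, only to show the shape of the input.
sample = [
    (0.0, "request A"),
    (0.5, "reply A"),
    (20.5, "request B"),
    (21.0, "reply B"),
]

for gap, before, after in largest_gaps(sample):
    print(f"{gap:.1f}s between '{before}' and '{after}'")
```

The packets on either side of the biggest gap show who was waiting on whom: a long pause after a request points at the server or something it depends on, while a long pause before a request points at the client.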

 

Dave

Bill Hassell
Honored Contributor

Re: NFSv4 slow performance

>>  - A scripted ftp from a server to one of the NFS clients does a "cd" within the ftp session which hangs for ~20 seconds.

 

20 secs sounds like a possible DNS resolution delay. Have you tried using the IP address rather than the hostname of the remote system?
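
A quick way to test the DNS theory is to time forward and reverse lookups from the hosts involved. This is a minimal sketch assuming Python 3 is available there; timed_lookup, forward and reverse are names made up for this example, and the calls underneath are the standard socket.getaddrinfo and socket.gethostbyaddr.

```python
import socket
import time


def timed_lookup(resolver, target):
    """Call resolver(target); return (elapsed_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        resolver(target)
        succeeded = True
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        succeeded = False
    return time.perf_counter() - start, succeeded


def forward(name):
    """Hostname -> addresses, using the system resolver."""
    return socket.getaddrinfo(name, None)


def reverse(ip_address):
    """IP address -> hostname, using the system resolver."""
    return socket.gethostbyaddr(ip_address)
```

For example, timed_lookup(forward, "sapuxr3p.ci.root") times the forward lookup of the NFS server name from this thread, and timed_lookup(reverse, ip) does the same for a reverse lookup of an address you supply. A lookup that takes several seconds, or fails only after a long wait, would be consistent with a resolver timeout.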



Bill Hassell, sysadmin