
Re: NFS performance problems - large file writes slow system severely

 
SOLVED
Barry C. Dowell
Frequent Advisor

NFS performance problems - large file writes slow system severely

Hopefully I can describe the issue here well enough to get a good start towards a solution or places to look for solving the problem.

System number 1, HP-UX {hostname} B.11.11 U 9000/785

System is an NFS client of a Windows Server 2003 R2 server running Server for NFS. Now, before you start to push me towards Microsoft for support on the issue, please understand that I've not finished describing the problem and that the problem doesn't exist at all on another HP-UX box that is running 11.23, nor on another HP-UX box running 10.20, so it would seem to be directly related to the host I'm asking about here.

On the 11.23 system I run a test to check system write speeds where I copy a very large file from one NFS mounted directory to another directory on the same mount point. Example:

time cp /path/filename /path2/filename2

On that system that very large file is written in approximately 15 seconds.


Same command run on the 11.11 system mounting the same mount point basically ties up all I/O on that system until the cp has been completed. That wouldn't seem that bad except that the cp takes literally many minutes (40+) if not over an hour to complete the same thing that took 15 seconds on the 11.23 box.

I have the Dave Olker / NFS Performance Tuning for HP-UX 11.0 and 11i systems document and am trying to make my way through it but have to say up front I'm more familiar with Windows than with HP-UX. I recall seeing something in that document (but it may have been elsewhere as I've been googling away trying to find some things to check on the system) about a configuration that would result in the type of issue I mention, but am not sure where I saw it and am leery of making tuning changes to the system without more understanding of what I'm changing and/or why.

I'd greatly appreciate any tips and suggestions on where to look for solving this problem. Clearly something isn't right on the 11.11 system, but I'm not sure what.

It seems (basically) to be a case of the cp command overwhelming the I/O system to a point that nothing else I/O related will happen on the box until the original I/O has been completed. I know that isn't quite true as other processes do get their I/O servicing but at a much slower rate, and with much more delay in the response time.

Thanks in advance for any assistance you can provide.
51 REPLIES
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Adding a data point or two here: thanks to some help from one of my partner system admins, we determined that the problem doesn't occur only during data writes. It also happens when reading large files and copying them to local drives on the HP-UX workstations.
A. Clay Stephenson
Acclaimed Contributor

Re: NFS performance problems - large file writes slow system severely

Before you do anything else, make sure that you don't simply have network problems, specifically mismatched speed/duplex settings between the host and the switch port. Surprisingly, a mismatched duplex setting will almost work, and you probably wouldn't see any problems in, for example, a telnet session, but a file transfer (either NFS or ftp) would crawl.
If it ain't broke, I can fix that.
Dave Olker
Neighborhood Moderator
Solution

Re: NFS performance problems - large file writes slow system severely

Hi Barry,

In addition to Clay's suggestion, I'd also suggest trying an application that, unlike cp, doesn't need both reads and writes to accomplish its work.

I'd suggest getting a copy of iozone (www.iozone.org) and try running a simple read test or a simple write test against the NFS filesystem. That will at least show if the problem occurs with different applications and it helps isolate which types of I/O work and which ones cause problems.

If you need any help running iozone let me know and I can offer some suggested syntax.

Regards,

Dave


I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Steven E. Protter
Exalted Contributor

Re: NFS performance problems - large file writes slow system severely

Shalom,

After making sure the network is not the problem, there are some server configuration optimizations that can be played with in /etc/exports.

Disable asynchronous writes:

http://slackware.osuosl.org/slackware-3.3/docs/NFS-HOWTO
Ignore the portmapper stuff; that's Linux-specific. Anything that applies to /etc/exports can be tried on HP-UX.

In particular, these two parameters can be tuned to improve write performance, though it will have to be by trial and error:

size=4096,wsize=4096

Optimize those sizes and you can improve throughput dramatically.

Don't forget to check the server for i/o wait issues:

http://www.hpux.ws/system.perf.sh

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Yogeeraj_1
Honored Contributor

Re: NFS performance problems - large file writes slow system severely

hi,

Allow me to add that you can also monitor the global NFS activity by running glance:

glance -N

with this, you will be able to read realtime values for:
Read Rate
Write Rate
Read Byte Rate
Write Byte Rate
NFS Call Count
Bad Call Count
Service Time
Network Time
Read/Write Qlen
Idle biods

hope this helps too!

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Steven E. Protter
Exalted Contributor

Re: NFS performance problems - large file writes slow system severely

I made a mistake:

rsize=4096,wsize=4096

These are client mount options, not /etc/exports options.

They still can help.
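
For readers unsure where client mount options go, here is a minimal sketch of how rsize and wsize are passed on an HP-UX NFS client. The helper name nfs_mount_cmd, the server name winserver and both paths are made up for illustration; the function only prints the command so it can be reviewed before anything is remounted.

```shell
#!/bin/sh
# nfs_mount_cmd SERVER:/EXPORT MOUNTPOINT SIZE_BYTES
# Prints (does not run) an HP-UX style NFS mount command that uses the
# same value for rsize and wsize.
# Assumption: server name and paths below are placeholders, not values
# taken from this thread.
nfs_mount_cmd() {
    printf 'mount -F nfs -o rsize=%s,wsize=%s %s %s\n' "$3" "$3" "$1" "$2"
}

nfs_mount_cmd winserver:/export/data /mnt/data 4096
```

This prints `mount -F nfs -o rsize=4096,wsize=4096 winserver:/export/data /mnt/data`. Trying other sizes is a matter of changing the last argument and remounting.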

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
V. Nyga
Honored Contributor

Re: NFS performance problems - large file writes slow system severely

Hi,

one way for a quick test:
'lanscan' - 'lan' - 'dis'
After the last command you see the lan interface with the ppa=0.
If
'Administration Status (value) = up(1)'
and
'Operation Status (value) = up(1)'
then it's your running interface.
'Description = lan0 HP PCI 10/100Base-TX Core [100BASE-TX,FD,MANUAL,TT=1500]'
shows you the mode it is running.

After
'Press <Return> to continue'
you see the 'Ethernet-like Statistics'.
If there's anything except '0' (except the first line), then you've a network problem.
Most likely - as already said - a HalfDuplex/FullDuplex mismatch.

HTH
Volkmar
*** Say 'Thanks' with Kudos ***
Christian Tremblay
Trusted Contributor

Re: NFS performance problems - large file writes slow system severely

Although it looks a lot like a network problem (switch ports speed and duplex settings) a useful tool to investigate NFS problems is nfsstat.

man nfsstat
KapilRaj
Honored Contributor

Re: NFS performance problems - large file writes slow system severely

Can you verify whether the NFS client versions are the same on both 11.23 and 11.11, and whether there are any patches for the 11.11 NFS client?

Regards,

Kaps
Nothing is impossible
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Hello everyone and thanks for the suggestions so far.

Couple of things to report back and update a bit.

My co-worker and I tried a few things to see what was going on.

First, on the network side we don't think we have problems there but might. We will of course continue to check there as well as other areas to keep trying to resolve the issue.

Second, on the benchmarking side we followed one of the benchmark tests shown in Dave Olker's presentation/paper: NFS Performance Tuning for HP-UX 11.0 and 11i systems and checked results with same.

Running dd to check performance shows numbers *way* off for the problematic 11.11 system(s), while the 11.23 works fine.

For example:

time dd if=/dev/zero of=/server/path bs=32k count=35

That run (note the small count) took 35s on the 11.11 system.

The same command with a much larger count (3500 records), run on the 11.23 system, completes in about 15s.
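
To put those two timings on the same scale, here is a minimal sketch that converts a dd run into an average rate. It assumes each record is 32 KB (bs=32k) and uses the rounded times quoted above; throughput_kbs is a name invented for this illustration.

```shell
#!/bin/sh
# throughput_kbs RECORDS RECORD_KB SECONDS
# Prints the average transfer rate in KB/s, to one decimal place.
throughput_kbs() {
    awk -v n="$1" -v kb="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (n * kb) / s }'
}

# Figures reported in the thread (approximate timings):
throughput_kbs 35 32 35      # 11.11 client: 35 records in about 35 s
throughput_kbs 3500 32 15    # 11.23 client: 3500 records in about 15 s
```

The two calls print 32.0 and 7466.7, so on these numbers the 11.23 client moves data roughly 230 times faster than the 11.11 client.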


Now, on the 11.23 system I could see that the system was making use of biods to do the job and could see (via top) where the biods were firing up and doing their thing as the system moved the data around. Even after the system came back to a prompt ready to run, I could see the biods showing in top as the cache was flushing, but again within about 10 - 15 seconds that activity cleared up and the system process list (top 30 give or take) showed everything else running and biods nowhere to be seen (the system wasn't active for anyone at the time).


Following up on the rsize and wsize, those values are currently 32k on all systems involved (the server side and client side on all of the above).


One other data point from something we tried -- we added 1 GB of RAM to one of the problematic systems. We had recently retired another system and had the memory available from same so we added the memory to the system that had been experiencing problems. Since the system is heavily used, and that particular system is also serving up NFS for other systems, we figured adding the memory wouldn't hurt and should help the performance on that box.


Anyway, thanks again everyone for the suggestions and tips and hopefully more info will come here soon as we check versions, patches and such.
Dave Olker
Neighborhood Moderator

Re: NFS performance problems - large file writes slow system severely

Hi Barry,

_____________________________________

Now, on the 11.23 system I could see that the system was making use of biods to do the job and could see (via top) where the biods were firing up and doing their thing as the system moved the data around. Even after the system came back to a prompt ready to run, I could see the biods showing in top as the cache was flushing, but again within about 10 - 15 seconds that activity cleared up and the system process list (top 30 give or take) showed everything else running and biods no where to be seen (the system wasn't active for any one at the time).
______________________________________


Are you saying you didn't see this behavior on the 11.11 system? If you're not seeing the biod activity on 11.11 then I'm guessing you may be running out of buffer cache on the 11.11 system since that's where the biods would be pulling requests from and sending them over the wire.

How big is the buffer cache on your 11.23 system and your 11.11 system? How much free buffer cache resources do you have on the 11.11 system vs. the 11.23 system? Is buffer cache smaller on 11.11 or more consumed by other processes on the 11.11 system?

Regards,

Dave


Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

V. Nyga said:

Hi,

one way for a quick test:
'lanscan' - 'lan' - 'dis'


I'm not sure how that command is really supposed to be input or if that is a separate command.

(Sorry, I'm not as fluent in HP-UX as I am in Windows :-) )

I can do lanscan and get back results which list station addresses, tell me the MAC type, the NM ID and Hdw State and such. If I enter:

lanscan lan

I get scolded for using a bad parameter.

I'm shown that the default is lanscan /stand/vmunix /dev/kmem



Following up a bit on the network information - all of the systems involved are cabled directly to a Cisco 8 port gigabit switch. That includes the Windows Server 2003 R2 box that is serving up NFS, and all of the clients that are trying to access that shared space.

I'm not currently aware of the settings for the interfaces though my co-worker should be aware of the HP side of the house there. On the Windows side I'm sure that the settings are basically "auto" to get Gigabit performance (they are gig cards). Cabling is all Cat5e with short runs to the switch.
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Dave Olker said:
{
Are you saying you didn't see this behavior on the 11.11 system? If you're not seeing the biod activity on 11.11 then I'm guessing you may be running out of buffer cache on the 11.11 system since that's where the biods would be pulling requests from and sending them over the wire.

How big is the buffer cache on your 11.23 system and your 11.11 system? How much free buffer cache resources do you have on the 11.11 system vs. the 11.23 system? Is buffer cache smaller on 11.11 or more consumed by other processes on the 11.11 system?

Regards,

Dave}

Yes, I don't see that activity on the 11.11 system (or at least not that I could tell).

Again, I'm not fluent enough in HP-UX to really know where/how to check the Buffer settings you are mentioning, but am more than happy to check further if you could provide some tips on how/where to look for those settings.

Thanks again
Dave Olker
Neighborhood Moderator

Re: NFS performance problems - large file writes slow system severely

Hi Barry,

There are numerous tools to tell the size of buffer cache and the cache hit rates. sar and Glance are probably the most common and likely to be on your system.

If you have Glance installed (and if you don't you can download a free trial version from HP) you can go to the System Tables screens and look at screen 2 of 2. This can be done by hitting the lower-case "t" and then hitting the space bar to get to screen 2. In those metrics you'll see:

Buffer Cache 3.63gb na 3.63gb na
Buffer Cache Min 3.20gb
Buffer Cache Max 32.0gb

This will show you the minimum and maximum sizes of the cache as well as the current size of the cache. There are other metrics in Glance to show what your cache hit rate is, but for now we can at least verify the size of the cache on your 11.11 and 11.23 systems.

Regards,

Dave


Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Thanks for the assistance so far folks. I'm going to be out of the office on Thursday and Friday of the current week, but back in on Monday. Meanwhile my co-workers may add some information to this thread if they have information to add that will help get a resolution.

Thanks again all.
V. Nyga
Honored Contributor

Re: NFS performance problems - large file writes slow system severely

ARRGGGG!

Sorry Barry!

My fault. I don't know where the mix-up came from; I tested it on my workstation but typed something different here.
'landiag' is the command.
Then you have the option 'lan', then 'dis'.
To terminate the administration tool type 'q' for quit.
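
If the 'dis' output is saved to a text file, the non-zero check can be done mechanically. This is a minimal sketch, assuming each counter sits on one line in "Name = value" form; check_lan_counters and the file name in the usage note are made up for illustration, and the match on Errors, Discards and Collision is a guess at which counters matter rather than a complete list.

```shell
#!/bin/sh
# check_lan_counters: read saved landiag "dis" output on stdin and print
# every error-type counter whose value is not zero.
# Assumption: counters appear one per line as "Name = value".
check_lan_counters() {
    awk -F'=' '
        /Errors|Discards|Collision/ {
            name = $1; value = $2
            # trim surrounding blanks from both fields
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (value + 0 != 0) { print name " = " value; bad = 1 }
        }
        END { if (!bad) print "all checked counters are zero" }
    '
}
```

Usage would look like `check_lan_counters < landiag_lan0.txt`, where landiag_lan0.txt is a hypothetical capture of the display. Anything it prints other than the all-zero message is worth comparing against the switch port settings.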

HTH
Volkmar
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Greetings again all (especially Dave Olker who had suggested getting these benchmarks).

I'm gonna try to add the results from each system (once I've gotten Glance installed on all of the above).

--

Here are the numbers from one of the 11.11 problem systems:
Buffer Cache 256.0mb na 256.0mb na
Buffer Cache Min 25.6mb
Buffer Cache Max 256.0mb


Note that system has 512 MB of physical memory.
Other system details:

Model         : 9000/785/B2600   Phys Memory             : 512mb    Network Interfaces : 2
OS Name       : HP-UX            Number CPUs             : 1        Number Swap Areas  : 2
OS Release    : B.11.11          Number Disks            : 1        Avail Volume Groups: 2
OS Kernel Type: 64 bits          Mem Region Max Page Size: 1024mb   Page 2 of 2


More details to come once Glance is installed on the second problem system.
Dave Olker
Neighborhood Moderator

Re: NFS performance problems - large file writes slow system severely

Hi Barry,

Please be sure to post the same data from the working systems so we can compare a working system against a failing system.

Thanks,

Dave


Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Here's the second "non-working" system (slow performance) results:

Buffer Cache 1024mb na 1024mb na
Buffer Cache Min 102.4mb
Buffer Cache Max 1024mb


And more details about that system:
Model         : 9000/785/B2600   Phys Memory             : 2.0gb    Network Interfaces : 3
OS Name       : HP-UX            Number CPUs             : 1        Number Swap Areas  : 2
OS Release    : B.11.11          Number Disks            : 2        Avail Volume Groups: 2
OS Kernel Type: 64 bits          Mem Region Max Page Size: 1024mb   Page 2 of 2


I'm gonna refer to this system as "H", with the earlier system referred to as "R" (so I can keep names straight for myself).

The working system will be referred to as "F" (same reasons). Please note that the "working" system is running HP-UX 11.23, not 11.11 as these other systems are.


More to come as I get Glance/Glance Plus installed on the other system.
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Ok, this is from the "working" system, one I'll call "F"
--

Buffer Cache 409.6mb na 409.6mb na
Buffer Cache Min 409.6mb
Buffer Cache Max 409.6mb

System details:
Model         : 9000/785/C3700   Phys Memory             : 2.0gb    Network Interfaces : 3
OS Name       : HP-UX            Number CPUs             : 1        Number Swap Areas  : 2
OS Release    : B.11.23          Number Disks            : 2        Avail Volume Groups: 2
OS Kernel Type: 64 bits          Mem Region Max Page Size: 1024mb   Page 2 of 2


That covers the requested information from Glance.
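
The three Glance readings are easier to compare as a share of physical memory. This is a minimal sketch, assuming 2.0gb means 2048 MB and using the current buffer cache size reported for each system; cache_pct is a name made up for this illustration.

```shell
#!/bin/sh
# cache_pct CACHE_MB PHYS_MB
# Prints the buffer cache size as a whole-number percentage of RAM.
cache_pct() {
    awk -v c="$1" -v p="$2" 'BEGIN { printf "%.0f\n", 100 * c / p }'
}

cache_pct 256 512      # system "R" (11.11, slow)
cache_pct 1024 2048    # system "H" (11.11, slow)
cache_pct 409.6 2048   # system "F" (11.23, working)
```

This prints 50, 50 and 20: both slow 11.11 systems currently have half of RAM in buffer cache, while the 11.23 system is fixed at a fifth. These figures alone do not establish whether that difference is related to the NFS slowdown.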

I'm gonna try to get the landiag (suggested above) done also and see what I can report back from same.
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Here's the landiag output from system "F" (the system that performs well)
--

PPA Number = 0
Description = lan0 HP PCI 10/100Base-TX Core [100BASE-TX,FD,AUTO,TT=1500]
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x306e48db7f
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 698
Inbound Octets = 1540672468
Inbound Unicast Packets = 18282357
Inbound Non-Unicast Packets = 45709
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 23
Outbound Octets = 3408036617
Outbound Unicast Packets = 26097421
Outbound Non-Unicast Packets = 2685
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367


Same output from the other two systems momentarily
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

This is the landiag from system "H" - note that PPA is 1 for that system (not 0)
--

LAN INTERFACE STATUS DISPLAY
Mon, Nov 20,2006 16:36:25

PPA Number = 1
Description = lan1 HP A5230A/B5509BA PCI 10/100Base-TX Addon [100BASE-TX,FD,M
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x12799d2403
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 3538
Inbound Octets = 1074036813
Inbound Unicast Packets = 3279633
Inbound Non-Unicast Packets = 43739
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 18
Outbound Octets = 1887633585
Outbound Unicast Packets = 4106193
Outbound Non-Unicast Packets = 4415
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

landiag output for system "R"
--

LAN INTERFACE STATUS DISPLAY
Mon, Nov 20,2006 16:39:25

PPA Number = 0
Description = lan0 HP PCI 10/100Base-TX Core [100BASE-TX,FD,MANUAL,TT=1500]
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x306e481b5d
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 3757
Inbound Octets = 443751177
Inbound Unicast Packets = 914440
Inbound Non-Unicast Packets = 44815
Inbound Discards = 0
Inbound Errors = 0
Inbound Unknown Protocols = 11
Outbound Octets = 491816315
Outbound Unicast Packets = 924207
Outbound Non-Unicast Packets = 2625
Outbound Discards = 0
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
Barry C. Dowell
Frequent Advisor

Re: NFS performance problems - large file writes slow system severely

Thanks again to everyone for the suggestions and the follow up on this. I really appreciate the patience in letting me get back into the office to get Glance installed (and also appreciate the followup on the landiag command :-D )

Hopefully these stats and reports will help provide some suggestions on where to make some adjustments.