Operating System - HP-UX

Strange performance issue

 
James Murtagh
Honored Contributor

Re: Strange performance issue

Well, I couldn't replicate this, although I didn't have an 11.00 system. I tested using 11i (well patched) and Solaris 2.8 (unpatched from base), and the throughput was steady before and after 2 GB in any configuration. I suspect it is a patch issue; make sure you are on the latest patches for the relevant subsystems, especially VxFS if that is what you are using. Also, as mentioned, Dave Olker's online guide is excellent, and his book is even better. I don't know much about the NetApp product, but it may be worth looking into their software too. If the software is on a Unix host, you can use tusc to trace the server process; use the timestamp option to check for a delay in a system call. If there is a delay there, HP can trace the process stack to find the kernel function that is causing the problem.
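For anyone who wants to reproduce the "steady before and after 2 GB" check independently of the backup software, here is a minimal sketch in Python. It is not something used in this thread: the function name, chunk size and window size are all assumptions chosen for illustration. It writes a file in fixed-size chunks and reports the throughput for each window, so a drop past the 2 GB mark shows up as a lower number in the later windows.

```python
import os
import time


def measure_write_throughput(path, total_bytes, chunk_bytes=1024 * 1024,
                             window_bytes=256 * 1024 * 1024):
    """Write total_bytes to path; return a list of (bytes_written_so_far, MB_per_s).

    One sample is recorded per window of window_bytes (hypothetical defaults).
    """
    chunk = b"\0" * chunk_bytes
    samples = []
    written = 0
    window_written = 0
    window_start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
            window_written += n
            if window_written >= window_bytes or written == total_bytes:
                f.flush()  # push Python's own buffer to the OS before timing
                elapsed = time.monotonic() - window_start
                mb = window_written / (1024 * 1024)
                rate = mb / elapsed if elapsed > 0 else float("inf")
                samples.append((written, rate))
                window_written = 0
                window_start = time.monotonic()
        os.fsync(f.fileno())  # not included in any window's timing
    return samples
```

Pointing `path` at a file on the NFS mount and setting `total_bytes` to something like 4 GB should give roughly eight samples on each side of the 2 GB boundary. The numbers include client-side caching effects, so treat them as a relative comparison between windows rather than as absolute throughput.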

- James.
Elmar P. Kolkman
Honored Contributor

Re: Strange performance issue

I'm still looking at the given info... A. Clay Stephenson's document is proving to be useful, but it takes time to apply it to our problem. I have now found that I can get the performance after 2 GB to match the performance before 2 GB by running without biod, but then the speed stays down at 17Mb/s the whole way. Running with 1 biod gives about the same speed; running with 2 biods gives the old results: the speed drops after 2 GB. I haven't tried concurrent backups yet. I'm afraid they will drop in speed with 1 biod...

Some information I forgot to give that was asked for: we don't have this issue on 11i systems (used as storage node). James, thanks for testing, but we had already done this too.

One thing to bear in mind: the problem does not exist when using cp, so it cannot (or at least should not) be directly network related!

Jeff, I will look into the patches.
Every problem has at least one solution. Only some solutions are harder to find.
Elena Leontieva
Esteemed Contributor

Re: Strange performance issue

Just a quick note. Normally your biod processes should not take a lot of CPU: 1-3%, depending on the NFS activity. When we had a problem with an application being very slow (a server running HP-UX 11.00 with a financial application and an Oracle DB on a Network Appliance Filer), our four biods consumed almost all the time on all four CPUs.
You may want to check the biods and run nfsstat -m during your tests.
I'd install the latest NFS patches first.
A. Clay Stephenson
Acclaimed Contributor

Re: Strange performance issue

I would try running with 0 biods. I would also bump the number up to about 16 and get that data point. I assume you started with the default of 4. Boxes running NFS also typically benefit from a larger-than-normal buffer cache (maybe as much as 800-1200MB). So that you are not altering too many things at once, I would disable the dynamic buffer cache by setting bufpages to a non-zero value; that's my preference anyway for almost all HP-UX boxes. Finally, even though it is not obvious, especially for an NFS client, I would make sure that all VxFS/HFS performance patches are installed, because many of those affect buffer cache usage, and that in turn affects even NFS clients and biods.
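To turn the 800-1200MB suggestion into a bufpages value, the arithmetic can be sketched as below. This assumes bufpages is counted in 4096-byte pages; check the tunable's documentation for your release before using the numbers. The helper name is made up for illustration.

```python
PAGE_BYTES = 4096  # assumption: bufpages is expressed in 4 KB pages


def bufpages_for_mb(cache_mb, page_bytes=PAGE_BYTES):
    """Return the number of pages needed for a fixed buffer cache of cache_mb megabytes."""
    return cache_mb * 1024 * 1024 // page_bytes


for mb in (800, 1200):
    print(f"{mb} MB -> bufpages = {bufpages_for_mb(mb)}")
# 800 MB -> bufpages = 204800
# 1200 MB -> bufpages = 307200
```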



If it ain't broke, I can fix that.
Elmar P. Kolkman
Honored Contributor

Re: Strange performance issue

We have tested with 0, 1, 2, 4 and 16 biods.

With 16 biods (the situation I started with), the performance dropped after 2 GB.

With 4 biods and 2 biods, the same.
With 1 biod, the speed is a bit better with 1 stream, but totally unacceptable with multiple streams.

With 0 biods, we have the most stable solution, right now.

I'm going to test patch 28995 (GE-LAN patch) now. I haven't yet had time to look into Jeff's patch list.

Since I don't know yet what solves the problem, I can not yet assign points.
Every problem has at least one solution. Only some solutions are harder to find.
Elmar P. Kolkman
Honored Contributor

Re: Strange performance issue

Of the list of patches in Jeff's post, only the LVM and JFS patches are not installed. I know that the parent directories of the NFS mounts are covered by those, but for now I will not install those patches. The machine needs to be available as a failover node for other software too, so I don't want the patch levels of the two environments to drift too far apart.
Every problem has at least one solution. Only some solutions are harder to find.
James Murtagh
Honored Contributor

Re: Strange performance issue

It would be interesting if you could provide the server model, physical memory installed and buffer cache kernel settings. We know that only 25% of the buffer cache can be used with biods running, but that doesn't explain why having one gives you steady performance while with more than one it drops... unless they are pulling data too quickly over your gigabit LAN to the server and saturating the cache, causing constant invalidations/flushes. From your posts it appears(?) that you are always testing the write case to the NFS-mounted filesystem. If you haven't already tried it, a read test may throw up some interesting results, i.e. copy a file from the NFS mount to the local disk. I'm not quite sure whether the buffers marked as writeable to the local disk will still fall under the NFS window, but I think it's worth testing. Even if they do, the writes to the local disks should hopefully be a lot quicker and you should see different results.
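The read test described above (copy a file from the NFS mount to local disk and watch the speed) can be timed with a small sketch like the following. It is an illustration only: the function name and the window size are assumptions, and a plain cp timed with a stopwatch would answer the same question more crudely.

```python
import time


def timed_copy(src, dst, chunk_bytes=1024 * 1024, window_bytes=256 * 1024 * 1024):
    """Copy src to dst; return a list of (bytes_copied_so_far, MB_per_s) per window."""
    samples = []
    copied = 0
    window_copied = 0
    window_start = time.monotonic()

    def record():
        # Close the current window and store its throughput.
        elapsed = time.monotonic() - window_start
        mb = window_copied / (1024 * 1024)
        samples.append((copied, mb / elapsed if elapsed > 0 else float("inf")))

    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_bytes)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            window_copied += len(chunk)
            if window_copied >= window_bytes:
                fout.flush()
                record()
                window_copied = 0
                window_start = time.monotonic()
        if window_copied:
            fout.flush()
            record()  # final partial window
    return samples
```

With `src` on the NFS mount and `dst` on a local filesystem, comparing the samples before and after 2 GB shows whether the read path has the same drop as the write path.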

- James.
Stefan Farrelly
Honored Contributor

Re: Strange performance issue

Not sure if this will help, as you may have already seen this doc, but we had a few problems with our NetApp filers too until we followed their documentation exactly:

http://www.netapp.com/tech_library/3146.html

Do everything - hidden kernel parameters etc. Read it thoroughly.
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Elmar P. Kolkman
Honored Contributor

Re: Strange performance issue

Stefan, this document had already been brought to my attention. It was the reason for the 29210 patch. But thanks.

James, I also tried to read from the disk when copying to another NFS disk. But since I'm going to use those disks for backup, write performance is the most important. When restoring, the network to the client trying to restore will be the bottleneck... So it may be a bit slower, though I assume using biods will improve performance then.

We're still looking into it... And HP and Legato are too. But no solution yet...
Every problem has at least one solution. Only some solutions are harder to find.
Elmar P. Kolkman
Honored Contributor

Re: Strange performance issue

Problem still exists... What I find strange about it, is that apparently no one on this forum has run into this problem... Or nobody noticed it.

HP still has not come up with a solution. Neither has Legato.
Every problem has at least one solution. Only some solutions are harder to find.