Operating System - HP-UX

10.20 and Network Appliance

 
Jim Kriegel
Advisor

Hi,
Is anybody else using a Network Appliance? I've got a client who is experiencing intermittent slowness when creating Oracle data files. Sometimes it takes 3 minutes, sometimes it takes more than an hour.
Thanks,
Lasse Knudsen
Esteemed Contributor

Re: 10.20 and Network Appliance

Works fine for us - however, we have some network-related problems that are drastically reducing NFS performance. You need some tricks in order to get around those.

What is your output from 'nfsstat -rc' - any timeouts?
In a world without fences - who needs Gates ?
Jim Kriegel
Advisor

Re: 10.20 and Network Appliance

Yes, I do have some timeouts and also badxids, but they don't seem to be growing all that fast.
The filer and the HP system are connected via fiber, and I don't see any errors at the interface.
Thanks for any help,

Lasse Knudsen
Esteemed Contributor

Re: 10.20 and Network Appliance

If your timeouts are growing, then there is probably something wrong. At exactly what rate do they grow?
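
One way to put a number on that rate is to take two samples of the client RPC counters and divide the difference by the elapsed time. The following is only a minimal Python sketch: it assumes that 'nfsstat -rc' prints a line of counter names followed by a line of values, and both the sample text and the helper names (parse_client_rpc, growth_per_minute) are made up for illustration, not captured from a real system.

```python
def parse_client_rpc(text):
    """Return {counter_name: value} from the first header/value line pair
    that has a 'timeout' column. The two-line layout is an assumption."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    for header, values in zip(lines, lines[1:]):
        if "timeout" in header and len(header) == len(values):
            try:
                return dict(zip(header, (int(v) for v in values)))
            except ValueError:
                continue  # value line was not purely numeric, keep looking
    raise ValueError("no client RPC counter block found")


def growth_per_minute(before, after, seconds, counter="timeout"):
    """Average increase of one counter per minute between two samples."""
    return (after[counter] - before[counter]) * 60.0 / seconds


# Hypothetical samples taken 600 seconds apart (illustrative numbers only).
SAMPLE_T0 = """Client rpc:
calls      badcalls   retrans    badxid     timeout    wait       newcred
100000     0          10         50         1500       0          0
"""
SAMPLE_T1 = """Client rpc:
calls      badcalls   retrans    badxid     timeout    wait       newcred
120000     0          12         51         1530       0          0
"""

if __name__ == "__main__":
    t0 = parse_client_rpc(SAMPLE_T0)
    t1 = parse_client_rpc(SAMPLE_T1)
    print("timeouts per minute:", growth_per_minute(t0, t1, 600))
    print("badxids per minute:", growth_per_minute(t0, t1, 600, "badxid"))
```

Sampling while the slow operation is actually running is what makes the figure useful; a counter that is flat during normal work says little about the bad periods.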

How many biods are running on the HP, and is the NetApp running NFS version 3 (try rpcinfo -p)?

When you say fiber, is that giga-giga, giga-100, 100-100 ... (fill in the blank :-)?
In a world without fences - who needs Gates ?
Jim Kriegel
Advisor

Re: 10.20 and Network Appliance

Giga-giga. The filer is reporting NFS versions 3 and 2; there may be another system running version 2, I'll check.
The timeouts have not changed for the last hour, and may or may not have changed for some time before that.
The system is running 4 biods; raising that is my next step as soon as I get permission, probably to 16.

Lasse Knudsen
Esteemed Contributor

Re: 10.20 and Network Appliance

Hmmmm - this is beginning to look weird. Does your problem only occur when creating Oracle data files? If you are able to reproduce the error, you should monitor with nfsstat while it happens.

badxids occur when the NetApp is busy and unable to respond to an NFS request within the timeout (7/10 of a second); the client retransmits, and when the late answer finally arrives the client marks it as 'bad'. A badxid usually also counts on the timeout counter. Timeouts also occur when the network equipment drops packets (but those don't count as badxids) - packet loss is a problem at our site. Maybe your client's network drops packets too at certain times.


What is the relationship between your badxids and timeouts?
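
The reasoning above can be turned into a rough rule of thumb: if nearly every timeout has a matching badxid, the server is answering late; if timeouts far outnumber badxids, requests or replies are being lost on the way. The Python sketch below only illustrates that rule. The function name and the two thresholds (0.5 and 0.1) are arbitrary assumptions, not values from any NFS documentation.

```python
def classify(badxid, timeout, high=0.5, low=0.1):
    """Rough reading of client RPC counters.

    badxid / timeout near 1 -> most timeouts were followed by a late reply
                               (slow server).
    badxid / timeout near 0 -> most timeouts never got a duplicate reply
                               (packets probably dropped).
    The 'high' and 'low' thresholds are illustrative assumptions.
    """
    if timeout == 0:
        return "no timeouts recorded"
    ratio = badxid / timeout
    if ratio >= high:
        return "mostly slow server replies"
    if ratio <= low:
        return "mostly lost packets"
    return "mixed"


if __name__ == "__main__":
    # Figures reported later in this thread: 54 badxids, 1547 timeouts.
    print(classify(54, 1547))
```

With the counters reported later in this thread (54 badxids against 1547 timeouts) the ratio is about 0.03, which by this rule points toward packet loss rather than a slow filer.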

I don't think increasing the number of biods will have a positive effect on this problem though - more biods put a bigger load on the NetApp and the network equipment.
In a world without fences - who needs Gates ?
Lasse Knudsen
Esteemed Contributor

Re: 10.20 and Network Appliance

Jim - I think you might have hit the 'reload' button on one of your last answers.
In a world without fences - who needs Gates ?
Jim Kriegel
Advisor

Re: 10.20 and Network Appliance

Sorry for the multiple replies; my connection was slow (dialup) and it "stalled". I thought the posting didn't make it.
The badxids are at 54 and the timeouts are at 1547.
Normal queries and transactions are fine. Running big reports and creating data files is intermittently slow/fast. I'll probably up the number of biods (that won't solve this, but it will be a high-use client) and get the DBA to create files as I watch.

Thanks again for the help.
Lasse Knudsen
Esteemed Contributor

Re: 10.20 and Network Appliance

Hi again Jim,

Timeouts could be an indication of a lot of things (an NFS server being shut down while NFS clients are still running, for one thing).

If such issues are not the case, your figures seem very much like what we experience. I have tracked our problem down to a buffer overrun in our Cisco equipment (a 64K buffer for each port is not much). However, we only see the problems going from giga to 100 (or 10 for that matter).

Our Ciscos were clearly dropping packets, causing NFS timeouts - and this became very obvious when I tried to increase biods to 16. With 16 biods you can have 16 "outstanding" NFS requests - a typical NFS read is 8K, totaling 128K. When 128K arrives at gigabit speed and has to pass through a 100 Mbit connection with a 64K buffer, it is bound to go wrong. This 128K window is not used for normal reads, only when using memory-mapped files. Memory-mapped files are seldom used by applications to read files (except for shared libs), but Oracle could be using them.
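
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch under stated assumptions: all read replies arrive back to back, protocol headers are ignored, and the switch port is modelled as a simple queue that fills at the incoming rate and drains at the outgoing rate. The function names are made up for the example.

```python
def burst_bytes(biods, read_size=8 * 1024):
    """Data in flight if every biod has one read outstanding."""
    return biods * read_size


def peak_queue_bytes(burst, in_mbit, out_mbit):
    """Largest backlog while the burst arrives (simple fluid model).

    The queue fills at in_mbit and drains at out_mbit, so only the
    fraction (1 - out/in) of the burst has to be buffered.
    """
    if out_mbit >= in_mbit:
        return 0.0
    return burst * (1.0 - out_mbit / in_mbit)


def overflows(biods, buffer_bytes=64 * 1024, in_mbit=1000, out_mbit=100):
    """True if the modelled backlog exceeds the per-port buffer."""
    backlog = peak_queue_bytes(burst_bytes(biods), in_mbit, out_mbit)
    return backlog > buffer_bytes


if __name__ == "__main__":
    for n in (4, 16):
        print(n, "biods:", burst_bytes(n), "bytes in flight,",
              "overflow" if overflows(n) else "fits",
              "in a 64K buffer going giga to 100")
```

Under this model 4 biods (32K in flight) fit in the 64K buffer, while 16 biods (128K in flight) do not, which matches what was observed when going from giga to 100. On a giga-giga path the model predicts no backlog at all, so it cannot by itself explain drops there.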

But you say you are running giga-giga, so maybe your problem is within your switch (it may be overloaded).

Just some thoughts for you to go on with. Finding (and proving) packet loss in a network is a cumbersome task.
In a world without fences - who needs Gates ?