Operating System - HP-UX

Disk IO Performance issue ?

 
Eric Theroux
Occasional Contributor

Hi all,

I have 5 servers in cluster mode (using MC/SG) connected to a SAN (EMC Symmetrix through Brocade switches). The cluster is built with 3 rp7400 and 2 rp7410.

At some point users complained about performance, and when I run the sar -d command I get very HUGE numbers. I'm just wondering whether these numbers make any sense or whether sar is showing me something wrong!

If I can trust the sar output, it looks like I'm in trouble!

bash-2.05# sar -d 10 4

A. Clay Stephenson
Acclaimed Contributor

Re: Disk IO Performance issue ?

You can't trust sar, and you can't even trust high-end performance tools like Glance for this. I should say that you can trust them, but you have to understand what you are seeing. All the host-based tools can know is that an awful lot of I/O is going through what they think is one disk. They have no way of knowing that this is really a high-end disk array.

If it ain't broke, I can fix that.
Victor_5
Trusted Contributor

Re: Disk IO Performance issue ?

Your data doesn't look too bad. Here are some thoughts of mine:

1. sar -u
%idle low? This is the percentage of time that the CPU is not running processes. If yes, it is possibly an IO bottleneck.

%usr high? Many systems normally operate with 80% of the CPU time spent as user time and 20% spent as system time. If no, it is possibly an IO bottleneck.

%wio > 15? If yes, it is possibly a disk IO problem.

2. sar -d
%busy > 50? If yes, you may have an IO bottleneck on disk; check which disk is having the problem.

Since your %busy is not too high, you may need to check your network; use vmstat.

You can also use iostat or glance to help you determine the cause. Good luck.
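
The thresholds above can be sketched as a small check. This is only an illustration of the rules of thumb from this post: the function name `triage` and its arguments are hypothetical, and the values would have to be copied by hand from `sar -u` and `sar -d` output.

```python
def triage(wio_pct, disk_busy):
    """Apply the rule-of-thumb thresholds quoted in the post.

    wio_pct   -- %wio as reported by `sar -u`
    disk_busy -- dict mapping device name to %busy from `sar -d`
    Both the function and its inputs are illustrative assumptions.
    """
    findings = []
    # Rule 1: %wio above 15 hints at a disk IO problem.
    if wio_pct > 15:
        findings.append("%wio above 15: possible disk IO problem")
    # Rule 2: any device with %busy above 50 may be a bottleneck.
    for device, busy in sorted(disk_busy.items()):
        if busy > 50:
            findings.append(f"{device}: %busy above 50, possible IO bottleneck")
    return findings
```

For example, `triage(20, {"c17t1d1": 66.0})` would report both conditions, while `triage(5, {"c17t1d1": 30.0})` returns an empty list.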

John Poff
Honored Contributor
Solution

Re: Disk IO Performance issue ?

Hi,

Those 'avque' numbers look too big. There is a patch for 'sar' on 11.11 that addresses that problem. Here are the details:

Symptoms:
PHKL_27200:
( SR:8606249217 CR:JAGae15611 )

"sar -d" reports incorrect values for avque and avwait.

Example output:
device    %busy     avque  r+w/s  blks/s         avwait  avserv
c17t1d1   66.00  60178.29    284   27206  2124620672.00    0.00
c25t1d1   67.00  32767.50    296   28525           4.91    2.91
c33t1d1   69.00  65531.50    294   28413           4.99    2.87
c41t1d1   67.60  65534.50    316   29669           5.04    2.68
c49t1d1   67.80  60426.86    310   29845  1295032832.00    0.00
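
To spot this symptom in your own output, a rough filter like the one below might help. It is only a sketch: the function name `suspicious_rows` and both cut-off values are arbitrary assumptions of mine, not anything from the patch text, and it assumes the seven-column `sar -d` layout shown in the example above.

```python
def suspicious_rows(sar_text, max_avque=1000.0, max_avwait=1_000_000.0):
    """Return (device, avque, avwait) for rows with implausibly large values.

    The default limits are arbitrary assumptions chosen for illustration.
    """
    rows = []
    for line in sar_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # blank or short line
        # A leading timestamp column, if present, is dropped here.
        fields = fields[-7:]
        try:
            busy, avque, rw, blks, avwait, avserv = (float(x) for x in fields[1:])
        except ValueError:
            continue  # header line or other non-numeric row
        if avque > max_avque or avwait > max_avwait:
            rows.append((fields[0], avque, avwait))
    return rows
```

Fed the example output above, this flags all five devices, since every avque value is far beyond the assumed limit.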




JP
Martin Johnson
Honored Contributor

Re: Disk IO Performance issue ?

I agree with the others who point out that Unix-based utilities can be unreliable about performance on disk arrays, SANs, etc.

EMC has their own monitoring tool for monitoring their Symmetrix system performance. I would recommend using this tool. It easily points out "hot" spots on the EMC disks.

EMC also has an OVO SPI for their product that will allow you to automatically send performance alerts to OVO. You can set thresholds to send alerts before problems occur. Nice feature.

HTH
Marty
Eric Theroux
Occasional Contributor

Re: Disk IO Performance issue ?

Thanks a lot to everybody.

The first thing I'll look at is the patch for sar -d not showing reliable data; that matches exactly the symptoms I've noticed.

But I also agree with those who pointed out that I should never trust those tools blindly. I'll try to collect data using the EMC tools instead, but I'm not very familiar with them.

Thanks again, folks,
Eric