Operating System - HP-UX

Finding I/O bottleneck on L2000

 
Steve Blackwell
Occasional Advisor

Finding I/O bottleneck on L2000

We are currently in the process of sizing a new system. We have a standard benchmark that is normally CPU intensive, and its output can be used to scale the new system.

For some reason we have an L-Class (2 x 440) that has an I/O bottleneck. We have installed the trial version of Glance and have determined that I/O is the problem on this server. It does not seem to matter whether the test runs on local disks or on an external disk pack (SC10).

The filesystems are mirrored and have a PVG-strict policy.
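For reference, the mirror allocation policy and which physical volumes hold each copy can be checked with LVM commands like these (vg01, lvol1 and the device file are placeholder names, not our actual configuration):

# Show the allocation policy and the PV holding each mirror copy of an LV
lvdisplay -v /dev/vg01/lvol1 | more

# Show which logical volumes have extents on a given physical disk
pvdisplay -v /dev/dsk/c1t5d0 | more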

Does anyone have any ideas on how we can investigate the I/O problem and find the cause? We are currently stumped.

3 REPLIES
Steven E. Protter
Exalted Contributor

Re: Finding I/O bottleneck on L2000

If licensed, I recommend Glance.

If not, I recommend you collect some data with the script set I'm attaching and look it over to find the logical volume or physical disk that is bottlenecked. Then you can take action.
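In the same spirit, a minimal collection sketch (not the attached script itself; the interval, count, output file and the vg01/c1t5d0 names are arbitrary placeholders):

# Log per-disk activity every 5 seconds for an hour, in the background
sar -d 5 720 > /tmp/sar_d.out 2>&1 &

# Once a busy c#t#d# device stands out, map it back to its volume group
# and logical volumes
pvdisplay -v /dev/dsk/c1t5d0 | more
vgdisplay -v vg01 | more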

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Steve Blackwell
Occasional Advisor

Re: Finding I/O bottleneck on L2000

As I said in the opening question, we installed Glance and it points to an I/O problem. But we cannot pin the problem down to a logical volume or a disk. The problem seems to reside somewhere else: the SCSI controller or the kernel?
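One thing worth checking at that level is how the disks are distributed across the SCSI controllers, which ioscan will show (hardware paths and device names will differ on your system):

# List the SCSI/FC interface cards, then the disks behind each of them
ioscan -fnC ext_bus
ioscan -fnC disk

If every busy device in Glance or sar -d sits behind the same card, the controller becomes the prime suspect.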
Bill Hassell
Honored Contributor

Re: Finding I/O bottleneck on L2000

Other than adding all the current patches from your Support Plus CD-ROM, the only thing you can do is look at the I/O rate (I/Os per second). Don't pay too much attention to 'bottleneck' warnings, as these are very misleading labels. If your system is generating 500 to 1000 I/Os per second, it is running at perfectly normal rates, but Glance might report a bottleneck simply because the I/O rate is high. A high rate is a problem only if the I/O is unexpected; otherwise, you want I/O to run as fast as possible.
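For the raw numbers outside Glance, sar reports the per-device I/O rate directly (the 5-second, 12-sample run here is arbitrary):

# r+w/s is the I/O rate per device; %busy, avwait and avserv show queuing
sar -d 5 12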

Now, if your test is normally CPU intensive but is now disk intensive (assuming the test data and program environment are the same), then I would look at the I/O rate on the old machines. If it is significantly lower, then the old system is avoiding I/O, probably through the buffer cache. Use Glance to check the size of the buffer cache (larger is better, up to about 500 MB), but this all assumes that your test is using files and not raw disk.
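Two quick ways to look at buffer-cache behaviour outside Glance (this assumes a dynamic buffer cache on 11.x, i.e. the dbc_* tunables rather than a fixed nbuf/bufpages setting):

# Read/write cache hit ratios; a low %rcache means reads are going to disk
sar -b 5 12

# Dynamic buffer cache limits, as a percentage of RAM
kmtune -q dbc_min_pct
kmtune -q dbc_max_pct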

To accurately compare disk rates between the two machines, use dd against the raw disks as in:

time dd if=/dev/rdsk/c1t5d0 of=/dev/null bs=128k count=2000

Run this 2-3 times and average the results. Now, to see how the buffer cache can change the results (in this example, for the worse), change ../rdsk/.. to ../dsk/.. and you'll see about a 50% increase in elapsed (real) time and a 10-20x increase in CPU time.
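Something along these lines makes the comparison repeatable (device names are examples; each pass reads about 250 MB):

# Three timed passes each on the raw and the block device
for dev in /dev/rdsk/c1t5d0 /dev/dsk/c1t5d0
do
    echo "=== $dev ==="
    for i in 1 2 3
    do
        time dd if=$dev of=/dev/null bs=128k count=2000
    done
done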

Now, the dd test is artificial in that it is purely sequential and bypasses all filesystem code, but it will tell you how fast a sequential read can be.


Bill Hassell, sysadmin