Operating System - HP-UX

Disk queue high with low disk utilization

 
Jack Fan
Regular Advisor

All,

Can someone help me explain this problem and how to solve it?

My hardware specification:
Server : rp7410/8 cpu/16g ram
Disk Array : EMC 3330-18 / 18GB*20
DB : ORACLE 7.3.4.5
ERP : BaanIII

Before this ERP system was migrated from an HP V2500 server (12 CPU/12GB RAM) to the rp7410 (8 CPU/16GB RAM), everything was okay and performance was good.
After the migration, performance is very bad.

Why do we get a high disk queue (> 5) when disk utilization is around 80~90?

Any suggestion?
8 REPLIES
Michael Steele_2
Honored Contributor

Re: Disk queue high with low disk utilization

Please post the results of the following:

vmstat 5 5
sar -d 5 5
sar -v 5 5
sar -u 5 5
swapinfo -tam
fcmsutil /dev/td? (* I/O reads and writes *)
queue_depth (* use power link *)
ioscan -fnkC disk
Support Fatherhood - Stop Family Law
Tim Adamson_1
Honored Contributor

Re: Disk queue high with low disk utilization

Also post kmtune, sysdef and cat /stand/system

Tim
Yesterday is history, tomorrow is a mystery, today is a gift. That's why it's called the present.
Jack Fan
Regular Advisor

Re: Disk queue high with low disk utilization

Sir,

The kmtune, sysdef and /stand/system output is in the attached file (zip).

Thanks,

Jack Fan
Jack Fan
Regular Advisor

Re: Disk queue high with low disk utilization

Sir,

The sar, fcmsutil and ioscan results are in the attached file.

Thanks,

Jack Fan
Michael Steele_2
Honored Contributor

Re: Disk queue high with low disk utilization

If you refer to sar -d and check %busy, you'll see EMC LUNs with values greater than 50%, indicating a disk bottleneck. Your %wio values greater than 15 support this.
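
As a rough sketch of that check, the snippet below scans sar -d style text and lists devices whose %busy is above 50. It assumes the usual seven-column layout (device, %busy, avque, r+w/s, blks/s, avwait, avserv), with an optional timestamp in front. The function name and the sample rows are made up for illustration and are not taken from the attached output.

```python
def flag_busy_devices(sar_text, busy_threshold=50.0):
    """Return (device, %busy) pairs whose %busy exceeds the threshold."""
    flagged = []
    for line in sar_text.splitlines():
        fields = line.split()
        # Rows may start with an HH:MM:SS timestamp; drop it if present.
        if fields and ":" in fields[0]:
            fields = fields[1:]
        # Assumed layout: device %busy avque r+w/s blks/s avwait avserv.
        # Anything else (blank lines, "Average" rows) is skipped.
        if len(fields) != 7:
            continue
        try:
            busy = float(fields[1])
        except ValueError:
            continue  # header row, "%busy" is not a number
        if busy > busy_threshold:
            flagged.append((fields[0], busy))
    return flagged


# Hypothetical sample, for illustration only.
SAMPLE = """\
10:00:05 device %busy avque r+w/s blks/s avwait avserv
10:00:05 c9t0d1 72.5 6.1 88 1402 9.8 24.3
         c10t0d2 12.0 0.5 15 240 0.4 6.1
"""

if __name__ == "__main__":
    for device, busy in flag_busy_devices(SAMPLE):
        print(device, busy)
```

Save the real sar -d output to a file and pass its contents in place of SAMPLE.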

I/O under 'fcmsutil' is not listed, and I suspect all your I/O is going out one HBA (td0, td1, td2, etc.). To correct this, use a round-robin PV link ordering: c9 primary with c10 alternate, c10 primary with c9 alternate, etc.

You can also use LVM striping to load balance, but try round robin first.
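
To make the round-robin ordering concrete, here is a small sketch that only computes which controller should be the primary path for each LUN. The target suffixes (t0d1 and so on) and the function name are hypothetical, and the script does not touch LVM itself.

```python
def round_robin_pv_links(targets, controllers=("c9", "c10")):
    """Alternate the primary controller across LUNs.

    targets: target/disk suffixes such as "t0d1" (hypothetical names).
    Returns a list of (primary_path, [alternate_paths]) tuples.
    """
    plan = []
    for i, target in enumerate(targets):
        # LUN 0 gets the first controller as primary, LUN 1 the second, etc.
        primary = controllers[i % len(controllers)]
        alternates = [c for c in controllers if c != primary]
        plan.append((primary + target, [a + target for a in alternates]))
    return plan


if __name__ == "__main__":
    for primary, alternates in round_robin_pv_links(["t0d1", "t0d2", "t0d3", "t0d4"]):
        print("primary:", primary, "alternate:", ", ".join(alternates))
```

With two controllers, half of the LUNs end up with c9 as primary and half with c10, which is the point of the ordering.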

Support Fatherhood - Stop Family Law
Michael Kelly_5
Valued Contributor

Re: Disk queue high with low disk utilization

The average service (avserv) times for your disks appear to be quite high. This would explain why your disk utilisation is high.
You can see that there appears to be a correlation between the %busy and avserv figures (the more time it takes the disk to complete each request, the more time it is busy).
I would start by checking the disk array configuration.
Greater than 20 milliseconds to complete a disk request is an indication of a problem somewhere in the disk subsystem.
Perhaps caching has been turned off?
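
The link between avserv and %busy can be sketched with the standard utilization law (busy fraction = request rate x service time). This is an approximation that treats each LUN as a single server, and the figures in the example are illustrative, not taken from the posted sar output.

```python
def estimated_busy_percent(requests_per_sec, avserv_ms):
    """Estimate %busy from the request rate and average service time.

    Utilization law: busy fraction = requests/s * seconds per request.
    The result is capped at 100 because a device cannot be more than fully busy.
    """
    busy = requests_per_sec * avserv_ms / 1000.0 * 100.0
    return min(100.0, busy)


if __name__ == "__main__":
    # The same 40 requests/s at two different service times (made-up numbers).
    print(estimated_busy_percent(40, 5))   # fast service time
    print(estimated_busy_percent(40, 20))  # slow service time
```

At the same request rate, a fourfold increase in service time gives a fourfold increase in estimated %busy, which is why slow service times alone can push a disk towards saturation.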

HTH,
Michael.
The nice thing about computers is that they do exactly what you tell them. The problem with computers is that they do EXACTLY what you tell them.
Alzhy
Honored Contributor

Re: Disk queue high with low disk utilization

Stripe across your LUNs on different HBAs. Whether you are using LVM or VxVM, you should be able to stripe across them, but put your redo logs on a non-striped LUN. Also check your mount options so that you minimise buffering: the VxFS mount options convosync, direct??

You may also check your throttling configuration and maxIO kernel parameters. What about your block sizes: does the file system block size equal your Oracle DB block size? The VxFS block size should be 8192 (8k), which should match Oracle's DB block size of 8192 as well.
Hakuna Matata.
Tim Sanko
Trusted Contributor

Re: Disk queue high with low disk utilization

Two words: "striped metas" from EMC, if you are also short on EMC cache.

Do you have workload analyzer on your server?

If you have EMC's PowerPath, you could eliminate the bottlenecks by adding a fibre HBA, if you have an available slot in the 3830.

If you had a failure and are using PowerPath, you would see high I/O on one path and the I/O would deadlock. Since PowerPath handles failover, you might miss it.

Do a powermt display dev=all | grep dead.

If you have multiple HBAs and have lost paths, that would explain the high I/O rates and may be the problem. We saw the same symptoms.

I have a V2600 production box and a 7410 as a test box. If you use PV links and a path failed, you could see this as well.