Wait IO

Global Server Operation
Frequent Advisor

Wait IO

Specs:
OS: HP UX 11.11 64 bit
MODEL: 9000/800/S16K-A
CPU: 4 @ 875MHz
Memory: 12096MB
AUTO RAID: HP Sure Store Virtual Array
Problem from Oracle DBA:
My understanding is that report processing is taking longer (extracting data from disk and/or processing it). There may also be OLTP slowness (things like navigating forms, manipulating old data, and inserting new data).
From the DBA perspective, especially during heavy-use times like next week, the wait io contributes significantly to the maxed-out CPU utilization for extended periods. Database statistics indicate that our greatest addressable problem is IO contention. This week I'm also noticing more rollback waits than I've seen previously.

Wait IO is consistently higher on this system than I've experienced on other UNIX servers.

I would like to know if there are any ideas on how or where to tackle this. The DBA is recommending that we ditch HP's AutoRAID and do straight striping and mirroring.


7 REPLIES
Jean-Luc Oudart
Honored Contributor

Re: Wait IO

Hi,

Usually we're better off when we have a baseline for system performance (and database performance), as this gives hints about where to look.
Do you have such performance results/reports you could use?

Also, how is the system performing when users complain?

Here are a few threads on the VA; there are more around.

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=616937

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=628299

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=183732


Regards
Jean-Luc
fiat lux
Sridhar Bhaskarla
Honored Contributor

Re: Wait IO

Hi,

This is most likely a bottleneck on the system itself.

See if you can put some of the LUNs on the alternate path. Look at the 'sar -d 5 20' output and see if only one path is active all the time. In that case, make some of the alternate paths primary and see if you get any relief. This doesn't qualify as true load balancing, but you are trying to use the alternate path as much as possible. Say the primary path is now c6t0d1 and the alternate path is c8t0d1. To make c8t0d1 the primary path, remove c6t0d1 from the volume group, which promotes c8t0d1 to primary, then add c6t0d1 back so that it becomes the alternate:

vgreduce vgxx /dev/dsk/c6t0d1
vgextend vgxx /dev/dsk/c6t0d1

Post your 'sar -d 5 5' output followed by 'sar -b 5 5' and 'sar 5 5'.
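
To check whether only one path is carrying the load, the 'sar -d' samples can be averaged per controller. Here is a minimal sketch, assuming the usual HP-UX 'sar -d' column layout (device, %busy, avque, r+w/s, blks/s, avwait, avserv). The sample lines and the helper name busy_by_controller are made up for illustration and are not from this system.

```python
import re
from collections import defaultdict

# Made-up sample in the style of HP-UX `sar -d` output (NOT from the real system).
SAMPLE = """\
10:00:05   c6t0d1   92.10   4.50   310   4960   12.30   9.80
           c6t0d2   88.40   3.90   295   4720   10.10   9.10
           c8t0d1    0.00   0.50     0      0    0.00   0.00
10:00:10   c6t0d1   95.00   5.10   322   5152   13.00  10.20
           c6t0d2   90.20   4.20   301   4816   11.00   9.40
           c8t0d1    0.00   0.50     0      0    0.00   0.00
"""

# Device names such as c6t0d1; the leading cN part identifies the controller/path.
DEVICE = re.compile(r"^(c\d+)t\d+d\d+$")


def busy_by_controller(text):
    """Return the average %busy per controller (c6, c8, ...) over all samples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in text.splitlines():
        fields = line.split()
        for i, field in enumerate(fields):
            match = DEVICE.match(field)
            # Assumption: %busy is the column right after the device name.
            if match and i + 1 < len(fields):
                totals[match.group(1)] += float(fields[i + 1])
                counts[match.group(1)] += 1
                break
    return {ctrl: totals[ctrl] / counts[ctrl] for ctrl in totals}


if __name__ == "__main__":
    for ctrl, busy in sorted(busy_by_controller(SAMPLE).items()):
        print(f"{ctrl}: average %busy {busy:.1f}")
```

If one controller sits near 100% while the other stays at zero, as in the made-up sample, that is the case where switching some primary paths is worth trying.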

-Sri

You may be disappointed if you fail, but you are doomed if you don't try
Steve Lewis
Honored Contributor

Re: Wait IO

We had a similar problem with a VA7410 and rp8440 (12x875MHz).
Basically the computer is too fast for the disk array and is capable of hammering individual LUNs at too fast a rate for the array to keep up.
Typical causes of this are: using too few disks of too large a size; having a full array so that all the data access is at the centre of all the disks; using slower spin-speed disks, such as 10,000 rpm; sending all the I/O down one LUN at once.
Stick with 15k rpm disks at all times and don't accept financial arguments to the contrary. Having mixed spin speeds caused major performance problems for us and it ended up costing more. A 146GB disk at 10k rpm has to do the work of two 73GB disks at 15k rpm, and it has only two thirds of the spin speed. That's quite a penalty.
A good way to alleviate your bottleneck and I/O wait problem is to re-create each logical volume as striped across two LUNs, one LUN in each RG.
Next time I won't get anything less than an XP array.
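
The spin-speed penalty can be put into rough numbers. This is only a sketch: it looks at average rotational latency (half a revolution) and the number of spindles behind the same 146GB, and it ignores seek time, cache, and how the array actually lays out data. The function names are made up for illustration.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: half of one revolution."""
    return 0.5 * 60_000.0 / rpm


def compare(capacity_gb=146):
    """Compare one 146GB 10k rpm disk with two 73GB 15k rpm disks."""
    one_big = {"spindles": 1, "latency_ms": avg_rotational_latency_ms(10_000)}
    two_small = {"spindles": 2, "latency_ms": avg_rotational_latency_ms(15_000)}
    for name, cfg in (("1 x 146GB 10k", one_big), ("2 x 73GB 15k", two_small)):
        gb_per_spindle = capacity_gb / cfg["spindles"]
        print(f"{name}: {gb_per_spindle:.0f} GB per spindle, "
              f"{cfg['latency_ms']:.1f} ms average rotational latency")
    return one_big, two_small


if __name__ == "__main__":
    compare()
```

The single large disk has to serve twice the data per spindle while each access waits longer on rotation, which is the penalty described above.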


Bharat Katkar
Honored Contributor

Re: Wait IO

Hi,
Go through this doc as well...
Regards,
You need to know a lot to actually know how little you know
RAC_1
Honored Contributor

Re: Wait IO

AutoRAID performs well when you take care of certain things. It is always a good idea to leave roughly 30-40% of the space unconfigured on AutoRAID. With that free space, the data that is read most of the time is kept in one RAID level, while the remaining data is kept in another RAID level.

So you need to check how much space you have configured and how much is free. The other thing that you need to check is how the paths have been configured. Check the alternate and primary paths and try adjusting them as Sridhar suggested.
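
The free-space check is simple arithmetic. Here is a small sketch, assuming you have read the total and configured capacity from the array's management tool. The numbers in the example are hypothetical, and the 30% threshold is just the lower end of the range mentioned above.

```python
def unallocated_fraction(total_gb, allocated_gb):
    """Fraction of the array's capacity that is not configured into LUNs."""
    if total_gb <= 0 or not 0 <= allocated_gb <= total_gb:
        raise ValueError("expected 0 <= allocated_gb <= total_gb and total_gb > 0")
    return (total_gb - allocated_gb) / total_gb


def enough_headroom(total_gb, allocated_gb, minimum=0.30):
    """True if at least `minimum` of the capacity is left unconfigured."""
    return unallocated_fraction(total_gb, allocated_gb) >= minimum


if __name__ == "__main__":
    # Hypothetical figures, not taken from the system in this thread.
    total, allocated = 1000, 900
    free = unallocated_fraction(total, allocated)
    print(f"{free:.0%} unconfigured, enough headroom: {enough_headroom(total, allocated)}")
```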

Re-arranging the RAID would be a big exercise. You would need to back up the data, set the RAID level, and then restore. We have a 7400 configured in 0+1 (striping and mirroring). With this, half of the space is unused. The cost per GB is high, but it offers good performance. You may also want to look at the SQL code: is it efficient, or is it doing unnecessary reads/writes?

You may also want to move frequently read data to a separate VG. Glance would be helpful in this regard; glance -i would give you IO by file system.

Hope this helps.

Anil
There is no substitute to HARDWORK
Alzhy
Honored Contributor

Re: Wait IO


The name of the game for Oracle storage is S.A.M.E. (Stripe And Mirror Everything). In your case, since the LUNs are already protected, try striping them a minimum of 4 ways with a stripe width of 64K to 128K.
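
To see why striping spreads the load, here is a minimal sketch of a plain round-robin stripe map with 4 LUNs and a 64K stripe. It is a simplified model for illustration, not how LVM actually lays out extents, and the function name lun_for_offset is made up.

```python
def lun_for_offset(offset_bytes, stripe_kb=64, luns=4):
    """Map a logical byte offset to (LUN index, byte offset within that LUN)
    under simple round-robin striping."""
    stripe = stripe_kb * 1024
    chunk = offset_bytes // stripe            # which stripe-sized chunk this is
    lun = chunk % luns                        # chunks rotate across the LUNs
    lun_offset = (chunk // luns) * stripe + offset_bytes % stripe
    return lun, lun_offset


if __name__ == "__main__":
    # A 1MB sequential read touches every LUN instead of hammering one.
    touched = {lun_for_offset(off)[0] for off in range(0, 1024 * 1024, 64 * 1024)}
    print("LUNs touched by a 1MB sequential read:", sorted(touched))
```

With a single unstriped LUN the same read would land on one LUN only, which is the hot-spot situation described earlier in the thread.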

Hakuna Matata.
Alwyn Santos
Advisor

Re: Wait IO

Can you send a copy of the statspack report that the DBA is looking at? Maybe we can isolate the issue to just a particular set of data files. Once we know which ones they are, we can then see which disks they're on. We can also try to trace the process's UNIX system calls using the tusc utility to show exactly where the slowness is being experienced.

Alwyn