Operating System - HP-UX

Tim Nelson
Honored Contributor

Performance and %wio info

After reading many replies about disk performance and sar statistics, I have a perplexing question: a lot of responses state that a high %wio in sar indicates a disk bottleneck. I am looking for a definitive answer on what %wio really is.
The definition I read is that it is the percentage of time the CPU or process is waiting on an I/O request. An I/O request could be disk, memory, network, etc. (just about anything that does not involve instructions actually executing on the CPU). Is this true?
When I use glance after seeing high %wio, there is sometimes little disk activity, i.e. disk %busy is only 10%-20%. If I look at global wait states, 90% of the processes are waiting on STREAMS (which is a type of I/O, but not disk), and %IO is small. In my mind this does not indicate a disk problem but a performance problem in the STREAMS subsystem.

Does anyone have a good, definitive description to offset all the answers I see on the forum that always seem to tell people that high %wio is a disk problem? I personally do not think that is true all the time, and maybe the responses to this inquiry will help others down the road.
Thanks to all in this forum for its collaborative efforts!


Jose Mosquera
Honored Contributor
Solution

Re: Performance and %wio info

Hi,

Please check whether swapper appears on the first page of top. If it is there, the system is handling deactivation and reactivation of processes, which it does when free memory falls below minfree or when the system appears to be "thrashing".

The dbc_max_pct kernel parameter fixes this problem; the recommendations range from 5% to 10%, depending on the total amount of memory in your system.

The minimum value must be 300MB, so you can try a low percentage. I suggest that you first try 25% and look at the backup performance to see whether this is a good percentage. After you make the change and reboot, you do not want to see a USED value greater than 0 on the dev lines from swapinfo -mt. That means no memory pressure and no paging, which will increase performance considerably.
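
The swapinfo check described above can be scripted. This is only a minimal Python sketch, not part of the original reply: it assumes the usual swapinfo -mt column layout (TYPE, AVAIL, USED, ...), and the sample text, the device names, and the function name paged_out_devices are made up for illustration.

```python
def paged_out_devices(swapinfo_text):
    """Return {device name: Mb used} for every 'dev' line whose USED is > 0."""
    used = {}
    for line in swapinfo_text.splitlines():
        fields = line.split()
        # Device swap lines start with "dev"; USED is the third column.
        if len(fields) >= 3 and fields[0] == "dev":
            mb_used = int(fields[2])
            if mb_used > 0:
                used[fields[-1]] = mb_used
    return used


# Illustrative sample only (hypothetical, not captured from a real system).
SAMPLE = """\
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        4096       0    4096    0%       0       -    1  /dev/vg00/lvol2
dev        4096     212    3884    5%       0       -    1  /dev/vg01/lvswap
reserve       -     936    -936
memory     3156    1040    2116   33%
total     11348    2188    9160   19%       -       0    -
"""

for device, mb in paged_out_devices(SAMPLE).items():
    print(f"{device}: {mb} Mb of device swap in use (paging has occurred)")
```

An empty result corresponds to the "USED is 0 on every dev line" state the reply is aiming for.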

Rgds.
Stefan Farrelly
Honored Contributor

Re: Performance and %wio info

Very interesting question.

Firstly, the manpage for sar is specific:
%wio = [CPU] idle with some process waiting for I/O (only block I/O, raw I/O, or VM pageins/swapins indicated).

So it should be easy enough to confirm this.
Let's take a brand new server I haven't commissioned yet:

> sar 1 10
15:40:01    %usr    %sys    %wio   %idle
15:40:02       0       2       0      98
15:40:03       0       0       0     100
15:40:04       0       1       0      99
15:40:05       0       0       0     100
15:40:06       0       0       0     100
15:40:07       0       0       0     100
15:40:08       0       2       0      98
15:40:09       0       1       0      99
15:40:10       0       0       0     100
15:40:11       0       0       0     100

So there is no %wio, as you would expect, because nothing is running apart from the base OS.

Next, I kick off a dd of the system disk (dd if=/dev/rdsk/c0t6d0 of=/dev/null &), then run sar again:

15:38:47    %usr    %sys    %wio   %idle
15:38:48      58      13      29       0
15:38:49      56      16      28       0
15:38:50      48      19      33       0
15:38:51      51      16      33       0
15:38:52      51      11      38       0
15:38:53      49      17      34       0
15:38:54      56      15      29       0
15:38:55      52      14      34       0
15:38:56      48      19      33       0
15:38:57      42      20      38       0

As you would expect from a raw dd (which hammers the disk, or the controller it's on, to the max), %wio is now 30-odd percent, i.e. the CPU could have run the job faster if it wasn't waiting on I/O (disk/controller).
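
To put a number on "30-odd percent", the samples can be averaged. The following is a minimal Python sketch, not something from the original post: the function name average_sar_cpu is hypothetical, and it assumes the plain four-column CPU report shown above (a header line followed by one timestamped line per sample).

```python
def average_sar_cpu(report):
    """Average each CPU column (%usr, %sys, %wio, %idle) of a sar report."""
    columns = None
    totals = []
    samples = 0
    for line in report.strip().splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[1].startswith("%"):
            # Header line: remember the column names.
            columns = fields[1:]
            totals = [0.0] * len(columns)
            continue
        # Only timestamped sample lines count (skips any "Average" trailer).
        if columns is None or ":" not in fields[0]:
            continue
        if len(fields) != len(columns) + 1:
            continue
        for i, value in enumerate(fields[1:]):
            totals[i] += float(value)
        samples += 1
    if not samples:
        return {}
    return {name: total / samples for name, total in zip(columns, totals)}


# The sar samples taken while the raw dd was running (copied from the post).
DD_RUN = """\
15:38:47 %usr %sys %wio %idle
15:38:48 58 13 29 0
15:38:49 56 16 28 0
15:38:50 48 19 33 0
15:38:51 51 16 33 0
15:38:52 51 11 38 0
15:38:53 49 17 34 0
15:38:54 56 15 29 0
15:38:55 52 14 34 0
15:38:56 48 19 33 0
15:38:57 42 20 38 0
"""

averages = average_sar_cpu(DD_RUN)
print({name: round(value, 1) for name, value in averages.items()})
```

For this run the %wio column averages 32.9, with %idle at 0 throughout.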

On a large server with many disks, they don't all have to be busy for %wio to be high: a raw dd on a single disk (or a single disk being hammered by a job) is enough. You need to check all disks. During this raw dd, sar -d showed the disk in question at only around 35-40% busy (nowhere near 100%), so don't read too much into sar -d output as an indication of high %wio. On a fast disk device, even a low sar -d %busy could mean the device is maxed out, causing %wio to go up.

I guess it is possible that a high %wio could also be caused by queues in the single-threaded part of the kernel which handles locking, as this is a bottleneck in itself (until, hopefully, HP moves to an OSF spinlock kernel, aka Tru64! :-) ). But this may simply be a reflection of the large amount of locking needed as part of a heavy disk I/O operation, or indeed of something like heavy STREAMS usage (anything that requires locking). But how to prove it?


I'm from Palmerston North, New Zealand, but somehow ended up in London...
Stefan Farrelly
Honored Contributor

Re: Performance and %wio info

I forgot to add: in my 13 years' experience I haven't seen any cause of high %wio other than disk/controller. But that doesn't mean it's not possible :-)
Tim Nelson
Honored Contributor

Re: Performance and %wio info

Interesting replies.
If sar is accurate, then why does glance not agree, or top for that matter? Top does not even report on WIO. If I start to investigate an I/O problem and find all disks at a minimal %busy, disk avwait and avserv around 5-6, and no swapping taking place, then I start to look at why processes are blocked waiting for I/O. If I find that blocked on IO is low (0.5%) and blocked on STREAMS is 90+%, I begin to believe that %WIO is not exclusively a disk issue. I hope all can read the output below.
Example:
Load averages: 1.89, 1.82, 1.68
1544 processes: 1534 sleeping, 10 running
Cpu states:
CPU   LOAD   USER   NICE    SYS   IDLE  BLOCK  SWAIT   INTR   SSYS
0     2.24  13.1%   5.7%   2.2%  79.0%   0.0%   0.0%   0.0%   0.0%
1     1.83  10.2%   7.1%   2.9%  79.8%   0.0%   0.0%   0.0%   0.0%
2     1.88   8.8%   4.9%   1.4%  84.9%   0.0%   0.0%   0.0%   0.0%
3     1.63   9.0%   4.5%   3.7%  82.7%   0.0%   0.0%   0.0%   0.0%
4     1.61   7.8%  20.0%   1.2%  71.0%   0.0%   0.0%   0.0%   0.0%
5     1.50  12.9%   6.7%   2.4%  78.0%   0.0%   0.0%   0.0%   0.0%
6     1.48  10.2%  11.6%   2.5%  75.7%   0.0%   0.0%   0.0%   0.0%
7     2.95  11.2%   6.9%   1.6%  80.4%   0.0%   0.0%   0.0%   0.0%
---   ----  -----  -----  -----  -----  -----  -----  -----  -----
avg   1.89  10.4%   8.4%   2.3%  78.9%   0.0%   0.0%   0.0%   0.0%
Memory: 1003884K (472576K) real, 1289252K (796552K) virtual, 654440K free Page# 1/56
sar 10 10
10:08:27    %usr    %sys    %wio   %idle
10:08:37      16       2      63      19
10:08:47      17       2      64      17
10:08:57      15       3      62      20
10:09:07      16       2      64      19
10:09:17      14       2      71      13

Glance wait states
Event           %     Time Threads   Blocked On      %     Time Threads
-----------------------------------------------------------------------
IPC           0.0     0.00     0.0   Cache         0.0     0.09     0.1
Job Control   0.0     0.00     0.0   CDROM IO      0.0     0.00     0.0
Message       0.1     1.69     1.1   Disk IO       0.0     0.00     0.0
Pipe          0.1     3.00     2.0   Graphics      0.0     0.00     0.0
RPC           0.0     0.00     0.0   Inode         0.0     0.00     0.0
Semaphore     5.5   160.68   107.8   IO            0.6    16.25    10.9
Sleep        16.7   486.75   326.7   LAN           0.0     0.00     0.0
Socket        0.2     6.00     4.0   NFS           0.0     0.00     0.0
Stream       63.8  1855.48  1245.3   Priority      0.0     0.22     0.1
Terminal      0.3     7.39     5.0   System       11.3   328.72   220.6
Other         1.2    36.00    24.2   Virtual Mem   0.0     0.36     0.2
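
To make the comparison between wait reasons easier to read, the Glance figures above can be ranked. This is a minimal Python sketch, not part of the original post: the names parse_wait_states and top_wait_states are hypothetical, and the parsing assumes the two-entries-per-line layout of the listing above, where every entry is a name followed by three decimal numbers.

```python
import re

# Glance wait-state data copied from the post (two entries per line).
GLANCE_WAIT_STATES = """\
Event % Time Threads Blocked On % Time Threads
IPC 0.0 0.00 0.0 Cache 0.0 0.09 0.1
Job Control 0.0 0.00 0.0 CDROM IO 0.0 0.00 0.0
Message 0.1 1.69 1.1 Disk IO 0.0 0.00 0.0
Pipe 0.1 3.00 2.0 Graphics 0.0 0.00 0.0
RPC 0.0 0.00 0.0 Inode 0.0 0.00 0.0
Semaphore 5.5 160.68 107.8 IO 0.6 16.25 10.9
Sleep 16.7 486.75 326.7 LAN 0.0 0.00 0.0
Socket 0.2 6.00 4.0 NFS 0.0 0.00 0.0
Stream 63.8 1855.48 1245.3 Priority 0.0 0.22 0.1
Terminal 0.3 7.39 5.0 System 11.3 328.72 220.6
Other 1.2 36.00 24.2 Virtual Mem 0.0 0.36 0.2
"""

# One entry: a name (letters and spaces) followed by %, Time and Threads.
ENTRY = re.compile(
    r"([A-Za-z][A-Za-z ]*?)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)"
)


def parse_wait_states(text):
    """Map each wait reason to its percentage of blocked time."""
    return {name: float(pct) for name, pct, _time, _threads in ENTRY.findall(text)}


def top_wait_states(states, count=3):
    """Return the `count` wait reasons with the highest percentage."""
    return sorted(states.items(), key=lambda item: item[1], reverse=True)[:count]


states = parse_wait_states(GLANCE_WAIT_STATES)
for name, pct in top_wait_states(states):
    print(f"{name:<12} {pct:5.1f}%")
print(f"Disk IO      {states['Disk IO']:5.1f}%")
```

Ranked this way, Stream, Sleep and System lead the list, while Disk IO sits at 0.0% for the same interval in which sar reported %wio above 60.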
Stefan Farrelly
Honored Contributor

Re: Performance and %wio info

Hi Tim,

From your last reply I'm having difficulty seeing how you worked out that STREAMS is 90+% (Glance wait states shows 63.8), and there is no blocked-on-IO figure of 0.5 in the output. Where did you get those figures?

And what was the sar -d output over the same time interval?

Certainly your server is doing some work: load avg is 1.8, and %wio is over 60%!
Tim Nelson
Honored Contributor

Re: Performance and %wio info

I guess the point was that blocked on IO was very low compared to the overall amount of time the system seems to be blocked on STREAMS (which is also a form of I/O?). Notice that no other performance tool reports on WIO: not glance, not top.
When watching system performance, there are times when some disks are very busy, and I would attribute the high WIO to that. There are also times when not a single disk is very busy (below 20% utilization) but WIO is still shown as extremely high by sar.
What I am really attempting to get out of this thread is the following. For example:
We ask the application vendor to look at some application performance issues. The first thing the vendor notices is high WIO in sar. Hence they blame all application performance issues on that single sar statistic and a disk bottleneck. I am attempting to disprove the idea that sar's high %WIO metric is always a disk issue, especially when avwait and avserv are low (5-6) and there are no queues.
I appreciate any and all responses.
David Bell_1
Honored Contributor

Re: Performance and %wio info

Tim,

Just something you may want to look at:

PHKL_28003 sar avwait, avque patch
PHKL_23901 load average patch
PHCO_25174 sar cumulative patch

I don't know that these will help but it's worth looking into.

HTH,

Dave