Operating System - HP-UX

What do these SAR and VMSTAT Stats Tell ?

 
Alzhy
Honored Contributor

20-CPU SD32000 nPar, 60GB memory, 4x2 (8 total) FC HBAs to an EVA5K, SecurePath, LVM, OJFS/VxFS 3.3, Oracle DB server (11i/9i). This nPar has been consistently averaging close to 50% CPU wait I/O (%wio). vmstat consistently reports more than 10 processes on the run queue and more than 10 blocked for resources.

What do these stats tell?

SAR:
          %usr  %sys  %wio  %idle
06:17:00    12     3    51     34
06:18:00    15     4    53     28
06:19:00    17     4    56     23
06:20:00    19     5    51     25
06:21:00    21     7    45     28
06:22:00    15     4    46     35
06:23:00    17     4    50     29
06:24:00    13     3    44     39
06:25:00    14     4    48     34
06:26:00    15     5    45     35
06:27:00    14     5    52     30
06:28:00    14     4    46     36
06:29:00    16     4    46     34
06:30:00    16     4    53     27
06:31:00    19     6    46     29

VMSTAT:
procs      memory             page                           faults              cpu
 r   b  w       avm     free    re   at  pi  po  fr  de  sr     in     sy    cs  us  sy  id
 6  16  0  10503581  3771919   184   36   1   0   0   0  12   8628  37326  3047  11   2  87
14  13  0  10344219  3776626  1043  100   0   0   0   0   0  13986  93984  7469  35   6  59
14  13  0  10344219  3774923   567   74   0   0   0   0   0  13326  73862  6725  25   4  71
16  18  0  10589222  3769612   310   58   0   0   0   0   0  13133  78457  8066  27  14  59
17  18  0  10589222  3771224   188   39   0   0   0   0   0  13372  72626  7318  25   7  67
Hakuna Matata.
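
For anyone who wants to put numbers on the "greater than 10" claim, here is a minimal Python sketch that averages the r (run queue) and b (blocked) columns of the five vmstat samples quoted in the post and divides the run queue by the CPU count. The CPU count of 20 comes from the post; the function and variable names are made up for illustration, and the script only assumes that r and b are the first two whitespace-separated fields of each line.

```python
# Average the run-queue (r) and blocked (b) columns of the vmstat
# samples quoted in the post, and relate the run queue to the CPU count.
VMSTAT_SAMPLE = """\
6 16 0 10503581 3771919 184 36 1 0 0 0 12 8628 37326 3047 11 2 87
14 13 0 10344219 3776626 1043 100 0 0 0 0 0 13986 93984 7469 35 6 59
14 13 0 10344219 3774923 567 74 0 0 0 0 0 13326 73862 6725 25 4 71
16 18 0 10589222 3769612 310 58 0 0 0 0 0 13133 78457 8066 27 14 59
17 18 0 10589222 3771224 188 39 0 0 0 0 0 13372 72626 7318 25 7 67
"""
NUM_CPUS = 20  # from the post: 20-CPU nPar


def queue_averages(text):
    """Return (average r, average b) over all non-empty sample lines."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    r_values = [int(row[0]) for row in rows]  # first field: run queue
    b_values = [int(row[1]) for row in rows]  # second field: blocked
    return sum(r_values) / len(r_values), sum(b_values) / len(b_values)


avg_r, avg_b = queue_averages(VMSTAT_SAMPLE)
print(f"avg run queue: {avg_r:.1f} ({avg_r / NUM_CPUS:.2f} per CPU)")
print(f"avg blocked:   {avg_b:.1f}")
```

On these five samples the run queue works out to less than one runnable process per CPU, while the blocked count is higher than the run queue. As a rule of thumb (not a formal threshold), that pattern points at waiting on resources rather than at a shortage of CPU.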
Jeff Schussele
Honored Contributor

Re: What do these SAR and VMSTAT Stats Tell ?

Hi Nelson,

Well they tell me one of two things:

1) You could use a faster disk subsystem, including a better LUN layout.

2) Clumsy, inefficient SQL code is causing a lot of disk access, and the CPUs are waiting on its completion.

My next step would be to use array performance software to pinpoint one or the other. If the drives are slow, the software will show high access times and bottlenecks everywhere. If the disk space is laid out poorly, it will consistently show *hot* LUNs. If the code is inefficient, it may show both of the above.

My 2 cents,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Alzhy
Honored Contributor

Re: What do these SAR and VMSTAT Stats Tell ?

Thanks Jeff. That matches my initial analysis. What is striking is that although Glance, sar and vmstat all seem to agree, "sar -d" does not: it shows no individual EVA LUN as "hot", with average access and wait times below 10 ms. Isn't "sar -d" the most authoritative metric for determining whether particular LUNs/disks are hot?


The EVA in question is shared with other I/O-heavy servers (also HP-UX).
Hakuna Matata.
Jeff Schussele
Honored Contributor

Re: What do these SAR and VMSTAT Stats Tell ?

Yes - but you almost have to ignore the "snapshot" sar lines. I always focus on the "Averages" section at the end of sar reports.
So *always* run sar reports for quite a few iterations and at different times of the day to get a good feel for "overall" average disk performance. sar can be misleading, much like top, in those "moment in time" values.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
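
To illustrate the point about looking at averages rather than single snapshots, here is a small Python sketch that averages the fifteen one-minute sar samples from the original post. Nothing here is HP-UX specific: it just assumes each line is a timestamp followed by %usr, %sys, %wio and %idle, as in the post, and the names are illustrative only.

```python
# Average the per-minute sar CPU samples quoted in the original post.
SAR_SAMPLE = """\
06:17:00 12 3 51 34
06:18:00 15 4 53 28
06:19:00 17 4 56 23
06:20:00 19 5 51 25
06:21:00 21 7 45 28
06:22:00 15 4 46 35
06:23:00 17 4 50 29
06:24:00 13 3 44 39
06:25:00 14 4 48 34
06:26:00 15 5 45 35
06:27:00 14 5 52 30
06:28:00 14 4 46 36
06:29:00 16 4 46 34
06:30:00 16 4 53 27
06:31:00 19 6 46 29
"""
COLUMNS = ("%usr", "%sys", "%wio", "%idle")


def sar_averages(text):
    """Return {column name: mean value} over all non-empty sample lines."""
    rows = [
        [int(value) for value in line.split()[1:]]  # drop the timestamp
        for line in text.splitlines()
        if line.strip()
    ]
    # zip(*rows) turns the list of rows into one tuple per column.
    return {name: sum(col) / len(rows) for name, col in zip(COLUMNS, zip(*rows))}


averages = sar_averages(SAR_SAMPLE)
for name in COLUMNS:
    print(f"{name:>6}: {averages[name]:.1f}")
```

Over this window %wio averages a little under 50, which is consistent with the "near 50%" figure in the original post. A longer collection at different times of day would be needed before treating that as the overall picture.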
Alzhy
Honored Contributor

Re: What do these SAR and VMSTAT Stats Tell ?

Jeff, these stats are the "average" for this server's 0800 to 2000 grind.
Hakuna Matata.
Jeff Schussele
Honored Contributor

Re: What do these SAR and VMSTAT Stats Tell ?

Oh - OK, very good.
And re-reading your 2nd post I just noticed that the array is shared with other big-hitters.
That's another reason why you need to be looking at the array as a whole picture.
You may not have "hot LUNs" but may be bottlenecking at an FCA board or even a switch port. Those kinds of bottlenecks can appear to be "slow LUNs" when in actuality the LUN performance is above average. It's just that the round trip takes a while, kinda like commuting on the Eisenhower or Kennedy in Chicago. You can get there and back; it just takes a while.

Cheers,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
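
One rough way to look for a path-level bottleneck from the host side is to roll per-LUN figures up by controller instance instead of reading them LUN by LUN. The Python sketch below does that for device names of the form cXtYdZ. The figures in SAMPLES are hypothetical and are not from this thread, the 512-byte block size is an assumption, and whether a controller instance maps one-to-one to a physical HBA (especially with SecurePath in the stack) has to be confirmed on the host. The array and switch statistics remain the real authority.

```python
import re
from collections import defaultdict

# HYPOTHETICAL per-LUN figures: (device, I/Os per second, blocks per second).
# These values are invented for illustration; they are not from the thread.
SAMPLES = [
    ("c4t0d1", 150, 2400),
    ("c4t0d2", 180, 3000),
    ("c6t0d1", 900, 52000),
    ("c6t0d3", 850, 48000),
]
BLOCK_BYTES = 512  # assumption: block counts are in 512-byte units

DEVICE_RE = re.compile(r"c(\d+)t\d+d\d+$")


def per_controller(samples):
    """Sum I/O rate and throughput (MB/s) per controller instance (cX)."""
    totals = defaultdict(lambda: {"iops": 0, "mb_s": 0.0})
    for device, iops, blocks in samples:
        match = DEVICE_RE.match(device)
        if not match:
            continue  # skip names that do not look like cXtYdZ
        controller = "c" + match.group(1)
        totals[controller]["iops"] += iops
        totals[controller]["mb_s"] += blocks * BLOCK_BYTES / 1e6
    return dict(totals)


totals = per_controller(SAMPLES)
for controller, figures in sorted(totals.items()):
    print(f"{controller}: {figures['iops']} IO/s, {figures['mb_s']:.1f} MB/s")
```

If one controller carries most of the traffic while the individual LUNs behind it all look healthy, that is at least a hint that the path, not the LUN, is where the time goes.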