Disk Utilization at 100%

NDO · ‎04-20-2011

Hi All

My setup is a 2-node MC/ServiceGuard cluster running HP-UX 11.31, and the storage is NetApp. Glance shows disk utilization at 100% most of the time. File system usage averages 87%. We are using VxVM, which I am not familiar with. The storage admin says his NetApp is fine. Can somebody help me investigate this problem?

F.R.

Duncan Edmonstone · ‎04-20-2011

Disk utilization of 100% simply means that during the measurement interval the disk was doing some sort of I/O operation 100% of the time. That's not necessarily a bad thing. The more important measurement is the average service time for the disk(s): values below 10 ms are good, values below 20 ms may be "acceptable", and values over 20 ms could be a problem. Use "sar -d" to look at this, e.g. "sar -d 10 10" to sample disk performance 10 times with 10 seconds between each measurement.

HTH

Duncan

I am an HPE Employee
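
The rule of thumb in the post above can be written down as a small Python sketch. The function name is made up for illustration, and the post does not say how to treat exactly 20 ms, so the choice below is an assumption.

```python
def classify_service_time(avserv_ms: float) -> str:
    """Map an average service time (ms) to the rule-of-thumb bands from the post."""
    if avserv_ms < 10.0:
        return "good"
    if avserv_ms < 20.0:
        return "acceptable"
    # Assumption: exactly 20 ms is not covered by the post; treat it as a possible problem.
    return "possible problem"


if __name__ == "__main__":
    for value in (6.5, 15.0, 25.0):
        print(f"{value:5.1f} ms -> {classify_service_time(value)}")
```

The bands are guidance rather than hard limits, so a value just over a boundary deserves a closer look, not an automatic alarm.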

NDO · ‎04-20-2011

Hi!

OK, when I run sar -d 10 10, "avwait" is always lower than "avserv" (close to 0), so I assume that everything is fine. However, the %busy column shows some disks at 100%:

Average disk5 0.77 0.50 2 19 0.00 7.64
Average disk7 0.76 0.50 2 19 0.00 8.00
Average disk9 0.90 0.50 2 70 0.00 7.39
Average disk10 0.52 0.50 1 9 0.00 7.90
Average disk37 53.19 0.50 89 15008 0.00 6.14
Average disk38 100.00 0.50 205 16983 0.00 6.62
Average disk39 100.00 0.50 171 15536 0.00 6.28
Average disk40 0.22 0.50 0 22 0.00 21.75
Average disk41 0.02 0.50 0 0 0.00 9.79
Average disk42 16.88 0.50 23 7421 0.00 7.31
Average disk43 0.19 0.50 3 99 0.00 0.58
Average disk45 14.40 0.50 21 6479 0.00 6.97
Average disk46 14.90 0.50 21 6586 0.00 7.26
Average disk44 0.06 0.50 1 18 0.00 0.68
Average disk81 0.11 0.50 1 11 0.00 0.77

F.R.
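
One way to sanity-check these numbers is the utilization law: busy fraction is roughly I/O rate times service time. The Python sketch below applies it to a few of the Average lines above. It assumes the usual HP-UX "sar -d" column order (device, %busy, avque, r+w/s, blks/s, avwait, avserv), since the header was not pasted, and the function names are illustrative only.

```python
# (device, reported %busy, r+w/s, avserv in ms) copied from the Average lines above.
ROWS = [
    ("disk37", 53.19, 89, 6.14),
    ("disk38", 100.00, 205, 6.62),
    ("disk39", 100.00, 171, 6.28),
    ("disk42", 16.88, 23, 7.31),
    ("disk45", 14.40, 21, 6.97),
]


def outstanding_ios(io_per_sec: float, avserv_ms: float) -> float:
    """Average number of I/Os in service at once (rate x service time)."""
    return io_per_sec * avserv_ms / 1000.0


def estimated_busy_pct(io_per_sec: float, avserv_ms: float) -> float:
    """Estimated %busy, capped at 100 because a device cannot be more than always busy."""
    return min(100.0, 100.0 * outstanding_ios(io_per_sec, avserv_ms))


if __name__ == "__main__":
    for device, reported, rate, avserv in ROWS:
        print(
            f"{device}: reported {reported:6.2f}%  "
            f"estimated {estimated_busy_pct(rate, avserv):6.2f}%  "
            f"in flight {outstanding_ios(rate, avserv):.2f}"
        )
```

For the disks below 100% the estimate lands within a couple of points of the reported %busy (r+w/s is rounded to whole numbers, so an exact match is not expected). For disk38 the product works out to about 1.36 I/Os in flight on average, which is why %busy pins at 100 even though the service time stays under 7 ms.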

Bill Hassell · ‎04-20-2011

Your applications require 100% of your disk time. If you stop your applications, the usage will go down. If the applications are not designed to use 100% of the disk bandwidth, then they need to be reconfigured or redesigned. You can't reduce the percentage of disk usage unless you increase the transfer rate (change fibre, HBAs, switch and controllers to 8Gbit) and reduce latency with a much larger disk cache with wide striping. All of these fixes are fairly expensive and may not improve performance if you have a small and/or slow machine. If possible, maximize your system (64GB RAM, 32 processors or larger), especially if you have large databases running.

But these are very broad recommendations. You need to examine performance with Glance and Measureware logs to better understand what is normal.

Bill Hassell, sysadmin

NDO · ‎04-20-2011

Hi

I have 2 rx7640s with 4 CPUs and 32 GB of RAM. The HBAs that connect to the NetApp are 4 Gb.

I used Measureware to produce this output at 5-minute intervals. This is just an extract:

OVPA Export 04/01/11 08:06 Logfile: /var/opt/perf/datafiles/logglob SCOPE/UX C.04.73.410(3) dbnode0
                      Active            Phys    Phys    Free  Memory    User          Pg Out  Pg Out  Pg Out    Swap   VM Pg
Date        Time        CPUs   CPU %   IO Rt     IOs     Mem       %   Mem %  Swap %      KB   KB Rt    Rate  Out Rt Scan Rt
03/01/2011  07:00:00       4   83.74   174.8   52422   11112   64.92   47.71   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:05:00       4   81.87   125.6   37730   11166   64.75   47.54   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:10:00       4   81.37   114.9   34458   11160   64.77   47.56   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:15:00       4   83.39   116.5   34973   11179   64.71   47.50   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:20:00       4   17.16    45.4   13610   11135   64.83   47.63   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:25:00       4    5.49    41.0   12300   11114   64.91   47.70   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:30:00       4    4.71    32.7    9826   11093   64.97   47.76   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:35:00       4    5.51    42.8   12827   11054   65.08   47.88   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:40:00       4    4.69    44.0   13186   11029   65.17   47.96   44.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:45:00       4    5.84    47.6   14284   11044   65.08   47.88   45.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:50:00       4    4.83    37.0   11103   10989   65.25   48.05   45.00       0     0.0     0.0     0.0     0.0
03/01/2011  07:55:00       4    6.37    43.6   13084   10930   65.44   48.23   45.00       0     0.0     0.0     0.0     0.0
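
A quick way to read this extract is to split the intervals by CPU load and compare the physical I/O rate in each group. The Python sketch below does that with the rows above. The 50% CPU threshold and the 300-second interval length are assumptions chosen for illustration, and the helper names are hypothetical.

```python
# (time, CPU %, physical I/O rate per second, physical I/Os) copied from the extract above.
ROWS = [
    ("07:00:00", 83.74, 174.8, 52422),
    ("07:05:00", 81.87, 125.6, 37730),
    ("07:10:00", 81.37, 114.9, 34458),
    ("07:15:00", 83.39, 116.5, 34973),
    ("07:20:00", 17.16, 45.4, 13610),
    ("07:25:00", 5.49, 41.0, 12300),
    ("07:30:00", 4.71, 32.7, 9826),
    ("07:35:00", 5.51, 42.8, 12827),
    ("07:40:00", 4.69, 44.0, 13186),
    ("07:45:00", 5.84, 47.6, 14284),
    ("07:50:00", 4.83, 37.0, 11103),
    ("07:55:00", 6.37, 43.6, 13084),
]
INTERVAL_SECONDS = 300  # assumption: each row covers one 5-minute interval


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def split_by_cpu(rows, cpu_threshold=50.0):
    """Return (busy, quiet) row lists, split on an arbitrary CPU % threshold."""
    busy = [r for r in rows if r[1] >= cpu_threshold]
    quiet = [r for r in rows if r[1] < cpu_threshold]
    return busy, quiet


def rate_matches_count(row, tolerance=0.01):
    """Check that I/O rate x interval length agrees with the I/O count within tolerance."""
    _, _, rate, ios = row
    return abs(rate * INTERVAL_SECONDS - ios) <= tolerance * ios


if __name__ == "__main__":
    busy, quiet = split_by_cpu(ROWS)
    print(f"busy intervals:  {len(busy)}, mean I/O rate {mean(r[2] for r in busy):.1f}/s")
    print(f"quiet intervals: {len(quiet)}, mean I/O rate {mean(r[2] for r in quiet):.1f}/s")
    print("rate and count consistent:", all(rate_matches_count(r) for r in ROWS))
```

In this extract the high-CPU intervals before 07:20 also carry roughly three times the physical I/O rate of the quiet ones, which suggests a workload that finished around that time. The extract covers a different day than the sar output, so it cannot be matched to individual disks.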

Michael Steele_2 · ‎04-20-2011

100% disk utilization is meaningless. The definition of a disk bottleneck is just as you stated above: any disk where 'avwait' > 'avserv'.

And typically you never see a disk bottleneck anymore due to SAN technologies, except on local SCSI disks, which are usually where the O/S is located.

So in general, you always see a disk bottleneck on the O/S disks, and never on a LUN.

In fact, I would say that disk bottlenecks on LUNs have basically been made obsolete.

And one comment about using OpenView and Measureware and those types of applications:

a) Analysis is often subjective, meaning what you see might not be what someone else sees (* as opposed to an error in a syslog *)

b) Analysis is also heuristic, meaning that you're usually taking your best guess at fixing the problem (* again, as opposed to an error in syslog with a fix that matches up one to one *)

Support Fatherhood - Stop Family Law
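
The avwait > avserv test from the post above is easy to apply mechanically. The Python sketch below runs it over three of the Average lines posted earlier in the thread. The class and function names are hypothetical, and the column positions assume the usual HP-UX "sar -d" layout.

```python
from typing import NamedTuple


class DiskSample(NamedTuple):
    device: str
    avwait_ms: float  # average time requests wait in the queue
    avserv_ms: float  # average time to service a request


def is_queue_bound(sample: DiskSample) -> bool:
    """Bottleneck test from the post: requests wait longer than they take to service."""
    return sample.avwait_ms > sample.avserv_ms


# avwait and avserv values copied from the Average lines earlier in the thread.
SAMPLES = [
    DiskSample("disk38", 0.00, 6.62),
    DiskSample("disk39", 0.00, 6.28),
    DiskSample("disk40", 0.00, 21.75),
]

if __name__ == "__main__":
    flagged = [s.device for s in SAMPLES if is_queue_bound(s)]
    print("queue-bound disks:", flagged or "none")
```

By this test none of the disks in the posted averages is flagged, including the two at 100% busy. Averages can hide short bursts, so the per-interval lines are worth checking the same way.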

chris huys_4 · ‎04-20-2011

Hi,

For disk I/O performance problems on HP-UX 11.31, always log the complete "sar -LdR 1 10" output, not only the averages.

As for the rest: if the NetApp guy says that everything is fine on the NetApp, that means the NetApp disk array isn't being worked hard enough.

Double the max_q_depth for the affected disks and check whether the number of I/Os per second increases, as explained in the following thread:

http://h30499.www3.hp.com/t5/System-Administration/avwait-and-avserv-are-normal-but-Disk-is-100-busy-on-HP-UX-lance/m-p/4644647#M379752

Greetz,
Chris

NDO · ‎04-20-2011

Hi!

I have seen from the "sar -d" output that some disks have avwait > avserv, and I have identified those disks as disk5, disk7 and disk10. The first two are vg00 disks, and disk10 is in vgora, which I believe is where the Oracle binaries reside (file system U01). I also ran the following:
scsimgr get_attr -D /dev/rdisk/disk5 -a state -a max_q_depth

SCSI ATTRIBUTES FOR LUN : /dev/rdisk/disk5

name = state
current = ONLINE
default =
saved =

name = max_q_depth
current = 8
default = 8
saved =

The output is the same for the other two disks.

F.R.
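
To judge whether a queue depth of 8 could be capping throughput, Little's law gives a rough ceiling: with at most N requests outstanding and a service time of S, the device cannot complete more than N / S requests per second. The Python sketch below is only an estimate under that assumption. The function name is hypothetical, and it pairs the max_q_depth of 8 shown above with the 6.62 ms avserv reported earlier for disk38, whose own max_q_depth was not posted.

```python
def max_iops_at_depth(queue_depth: int, avserv_ms: float) -> float:
    """Upper bound on I/Os per second from Little's law: depth / service time."""
    if queue_depth <= 0 or avserv_ms <= 0:
        raise ValueError("queue_depth and avserv_ms must be positive")
    return queue_depth * 1000.0 / avserv_ms


if __name__ == "__main__":
    # Assumption: disk38 uses the same default depth of 8 as the disks queried above.
    ceiling = max_iops_at_depth(8, 6.62)
    observed = 205  # r+w/s reported for disk38 in the earlier sar output
    print(f"estimated ceiling: {ceiling:.0f} IO/s, observed: {observed} IO/s")
```

If the observed rate sits far below the estimated ceiling and avwait stays near zero, the queue depth is probably not the limit on average, although bursts can still fill the queue. Raising max_q_depth and re-measuring, as suggested earlier in the thread, is the way to confirm.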

NDO · ‎04-26-2011

Hi

Please HP experts, be nice and respond to this post

F.R.

Vivek_Pendse · ‎04-27-2011

Hi,

In Glance, check the logical volume status.
There you will find the lvol that is heavily used.
In Glance, even if a single disk/lvol is 100% used, the total disk utilization is shown as 100%.
Furthermore, you can drill down to %wio, and if anything looks serious, check the avwait time.

Also, at the same time, ask the storage admin to run a performance analysis on the storage for the required time and sample rate, and compare the report with the OS data.
Note: the sampling period must be the same for the storage and the OS in order to pinpoint the exact issue.

Hope this helps you.

Thanks,
Vivek

NDO · ‎04-28-2011

Hi

> I have identified those disks as disk5, disk7 and disk10. The first two are vg00 disks, and disk10 is in vgora, which I believe is where the Oracle binaries reside (file system U01).

This is where I see the problem: for vg00, it is the usr lvol.

F.R.

NDO · ‎05-02-2011

closing

F.R.
