Operating System - HP-UX

SOLVED
hpuxsa
Frequent Advisor

High Disk Utilization

Hi,

We have an RP7410 server running an Oracle CRM application; the OS is 11.11, connected to an EMC disk array. Glance reports a disk I/O bottleneck, and most of the time it's at 100%.
I was tracing one of the highest disk-utilizing processes with tusc, and the output is below. Most of the time it's in "times".
[4643] times(0x800003ffc00072c0) .. = 385213206
[4643] times(0x800003ffc00072c0) .. = 385213207
[4643] times(0x800003ffc00060a0) .. = 385213207
[4643] times(0x800003ffc00060a0) .. = 385213207
[4643] times(0x800003ffc00072c0) .. = 385213207
[4643] times(0x800003ffc00072c0) .. = 385213207
[4643] times(0x800003ffc00060a0) .. = 385213207
[4643] times(0x800003ffc00060a0) .. = 385213207

Could someone explain this, please? The EMC engineers confirmed that the disk array is fine, and we do not have a problem on other servers connected to the array.
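For what it's worth, a tusc trace like the one above can be summarized to confirm which syscalls actually dominate before drawing conclusions. A small awk sketch (the sample file below is hypothetical, modeled on the trace lines above):

```shell
# Count syscall frequency in a tusc-style trace.
# Trace lines look like: [pid] syscall(args) .. = retval
cat > /tmp/tusc_sample.txt <<'EOF'
[4643] times(0x800003ffc00072c0) .. = 385213206
[4643] times(0x800003ffc00060a0) .. = 385213207
[4643] read(14, "...", 8192) .. = 8192
EOF

# Field 2 is "syscall(args...)"; strip from the "(" onward to get the name.
awk '{ sub(/\(.*/, "", $2); count[$2]++ }
     END { for (c in count) print count[c], c }' /tmp/tusc_sample.txt | sort -rn
```

On a real trace you would feed the tusc output file in directly; a process that is mostly in times() is usually burning CPU on timing calls, not necessarily doing disk I/O.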

Thanks,
Franklin.
hpuxsa
Frequent Advisor

Re: High Disk Utilization

Forgot to mention the process name, it's oracleCRM (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
Sridhar Bhaskarla
Honored Contributor
Solution

Re: High Disk Utilization

Hi Franklin,

It would be more helpful if you could post your 'sar -d 2 5', 'sar -b 2 5' and 'sar 2 5' outputs. Are you running any load-balancing software like PowerPath? Are you seeing one or two LUNs that are highly active?

1. Do not use a few large disks; try smaller LUNs instead. It's not really because of your disk system: the OS will only queue a certain number of requests per disk at any point in time. If the number of queued requests for a disk exceeds 8, the disk utilization is shown as 100%.
2. See if you can split the data from the busier disks onto less busy ones.
3. If you see more than one disk busy at any point in time and you don't have load-balancing software, put one of the disks on its alternate path. This is easily done with a pair of vgreduce and vgextend commands. For example, to make cyt0d0 the primary and cxt0d0 (currently primary) the secondary, do

vgreduce vgxx /dev/dsk/cxt0d0
vgextend vgxx /dev/dsk/cyt0d0

4. If 1 and 2 are not possible, then try setting a better queue depth using the 'scsictl' command; see 'man scsictl' for more information. Check with your EMC engineers to find an optimum setting.
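To spot the queue saturation described in point 1, the avque column of 'sar -d' can be filtered with awk. A sketch, assuming the usual HP-UX 'sar -d' column layout (device, %busy, avque, r+w/s, blks/s, avwait, avserv) and hypothetical sample numbers:

```shell
# Flag devices whose average request queue (avque, column 3) exceeds 8,
# i.e. devices that will show up as 100% utilized in Glance.
# Sample data is made up; in practice pipe in real 'sar -d 2 5' output.
cat > /tmp/sar_d_sample.txt <<'EOF'
c4t0d0   100   12.4   310   4960   18.2   6.1
c5t0d0    35    1.2    80   1280    2.0   4.8
EOF

awk '$3 > 8 { printf "%s: avque=%s (queue saturated)\n", $1, $3 }' /tmp/sar_d_sample.txt
```

Here only c4t0d0 would be flagged; that is the disk whose data you would consider splitting or moving to the alternate path.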

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Hein van den Heuvel
Honored Contributor

Re: High Disk Utilization

>> Glance reports disk I/O bottle neck, most of the time it's at 100%.

Is that really a problem though, or just something that looks ugly? Is the application performance satisfactory?
How is the CPU usage? If the disk activity were to evaporate, how much better would the system become?

>> Most of the time it's in "times".

That is not uncommon for an Oracle task... IF you have TIMED STATISTICS enabled. For yucks, just try "SQL> ALTER SYSTEM SET timed_statistics = FALSE;".

>> the highest disk utilizing process

If, as you say, the disk activity is mostly generated by an Oracle slave process (oracleCRM...), then please do not waste your time with system tools.
Just (learn to) ask Oracle what it is doing.
There are LOTS of web pages and books with hints on that.
Personally I like to start with STATSPACK and simple queries like:

select BUFFER_GETS, EXECUTIONS, DISK_READS, ROWS_PROCESSED,
       substr(sql_text,0,240) "SQL_TEXT"
from v$SQL
where BUFFER_GETS > 50000000
order by buffer_gets
/

Change the selected field name and threshold value depending on your findings
(try disk_reads; try 10x more or less if you get too many or too few results).


A little more elaborate:
(Sorry, I forgot where I found that one. Untested in a while).

SELECT
  ses.sid,
  DECODE(ses.action, NULL, 'online', 'batch') "User",
  MAX(DECODE(sta.statistic#, 9, sta.value, 0)) / greatest(3600*24*(sysdate - ses.logon_time), 1) "Log IO/s",
  MAX(DECODE(sta.statistic#, 40, sta.value, 0)) / greatest(3600*24*(sysdate - ses.logon_time), 1) "Phy IO/s",
  round(60*24*(sysdate - ses.logon_time), 0) "Minutes"
FROM V$SESSION ses, V$SESSTAT sta
WHERE ses.status = 'ACTIVE'
  AND sta.sid = ses.sid
  AND sta.statistic# IN (9, 40)
GROUP BY ses.sid, ses.action, ses.logon_time
ORDER BY SUM(DECODE(sta.statistic#, 40, 100*sta.value, sta.value))
/


Once you've identified the heavy user and its heavy query, you'll probably need an 'explain plan', and can hope to find a table scan you can avoid by adding an index or some such.

Good luck!

Hein.




Bharat Katkar
Honored Contributor

Re: High Disk Utilization

Hi Franklin,
Even though Glance reports a "Disk Bottleneck", it may not be related to the disk or disk array at all.
Before jumping straight to the disk, analyze your memory and CPU, because bottlenecks in either can show up as disk bottlenecks.
So use sar to collect information on these, and then you will be able to draw a conclusion.

Also find attached the performance cookbook, which may be helpful in diagnosing the issue.

Hope that helps.
Regards,
You need to know a lot to actually know how little you know
Bill Hassell
Honored Contributor

Re: High Disk Utilization

Glance's disk bottleneck message is not very useful at all. Even the 100% usage figure is not nearly as simple as it sounds. Glance reports %disk busy based on the total time waiting on I/O versus elapsed time. If the elapsed time is 5 seconds and one disk was constantly busy (for instance, a dd reading a LUN), then the "disk busy" is 100% even though not a single other disk/LUN was touched during the measurement interval. You must isolate the busiest disks/LUNs, relate the busy periods to what Oracle (and any other middleware) is doing, and then go after the code that is creating the disk I/Os.

There is nothing wrong with 100% disk usage on every disk, just as there is nothing wrong with 100% CPU usage--unless the I/Os and CPU time are being wasted by badly designed code. Hence the instructions given above to measure Oracle's performance, and especially to use an explain plan for lengthy SQL queries.
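As a toy illustration of the arithmetic behind that 100% figure (the numbers are made up; the point is that one constantly busy device saturates the metric for the whole interval):

```shell
# busy% = (time the device had at least one I/O outstanding) / (interval length)
interval_ms=5000      # 5-second measurement interval
busy_ms=5000          # one disk had I/O in flight for the whole interval
pct=$(( busy_ms * 100 / interval_ms ))
echo "disk busy: ${pct}%"   # prints "disk busy: 100%"
```

So a single dd stream against one LUN reports 100% "disk busy" for the interval, regardless of how idle every other disk is.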


Bill Hassell, sysadmin