Disk I/O issue

 
Vidyasankar Narayanan
Occasional Contributor

Hi All,
I strongly suspect a disk I/O bottleneck on one of our servers. I constantly get critical messages from Glance. When I took a closer look at the system, it appeared that a few disks (I could identify 7) are slammed, with usage up to 100%. With sar -u, %wio averages about 28%, and in vmstat the run queue is 7-10 with about 4-8 blocked processes. What would be the best approach to take in this case?
Thanks in advance.

-Vidya.
The future belongs to those who believe in the beauty of their dreams.
5 REPLIES
Rita C Workman
Honored Contributor

Re: Disk I/O issue

You may want to continue monitoring...check what is running and hitting those drives (I use Glance and defined out the processes so I could check for things like this).
Depending on what is causing it, you may be able to move things to another drive (and, I would suggest, another controller). For example, with Oracle you may have put several tables on the same disk that are all being hit severely. By moving some of the tables to different drives you help reduce the I/O activity on that disk (and, like I said, watch which disks are on which controller; that has an impact on I/O). See if there is something like this that may be "fairly simple" to rearrange...
Just one idea,
Pramod_4
Trusted Contributor

Re: Disk I/O issue

Hi,

Simple answer is that you need to distribute the load...

The concept of teamwork applies here too...!

I mean you need to consider disk striping/RAID-5 solutions...

If you can provide the current disk space configuration/allocation I would be able to help you out in detail.

Pramod

Vidyasankar Narayanan
Occasional Contributor

Re: Disk I/O issue

In the 2 logical volumes there is no software striping, but all the disks are EMC RAID-S. I'm sure I have to do some striping with these.
Tim Malnati
Honored Contributor

Re: Disk I/O issue

Don't bother attempting to stripe RAID-S. A variety of benchmarks that I ran a while back showed that nothing is gained (and sometimes performance is worse). Striping fully mirrored EMC drives is a totally different situation though; there is a lot to be gained. You need to keep the internal EMC architecture in mind when you build the stripes. I found that a 4-way stripe across 4 different internal controllers seems to be optimal. Typically EMC splits a physical drive into two volumes in a full mirror arrangement. With this in mind, you need to be careful that the other side of the split is not heavily used by another machine. If you are attaching via SCSI, be sure to balance the load evenly across the SCSI buses as well. If you do it right there will be no need for PowerPath (just another processing load).
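
To make the "one stripe member per internal controller" idea concrete, here is a rough Python sketch. It is only an illustration: the disk names, controller labels, and both function names are made up, and nothing here talks to LVM or to the EMC box. It simply picks one disk behind each of 4 different controllers and shows how consecutive stripe units then rotate across those members.

```python
from collections import defaultdict


def pick_stripe_members(disks, width=4):
    """Pick `width` disks, each behind a different controller.

    `disks` is a list of (disk_name, controller_id) pairs. Both values
    are hypothetical labels, not names reported by any EMC or LVM tool.
    """
    by_controller = defaultdict(list)
    for name, controller in disks:
        by_controller[controller].append(name)
    if len(by_controller) < width:
        raise ValueError("not enough distinct controllers for the stripe")
    # Take the first disk listed for each of the first `width` controllers.
    return [by_controller[c][0] for c in sorted(by_controller)[:width]]


def disk_for_block(block, stripe_unit_blocks, members):
    """Return the stripe member holding a given logical block.

    Consecutive stripe units rotate round-robin across the members,
    so sequential I/O is spread over every controller in the stripe.
    """
    return members[(block // stripe_unit_blocks) % len(members)]


if __name__ == "__main__":
    inventory = [("d0", "A"), ("d1", "A"), ("d2", "B"), ("d3", "C"), ("d4", "D")]
    members = pick_stripe_members(inventory)
    for block in (0, 16, 32, 48, 64):
        print(block, disk_for_block(block, 16, members))
```

In practice you would also check, by hand, that the other half of each split physical drive is not busy serving another host; the sketch does not model that.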

Also check the hit ratios on your buffer cache. Often a small increase in the memory buffer can make a dramatic improvement if the application is primarily read-oriented. Where the cache is best placed (on the EMC, in the Unix buffer cache, or in a database-controlled buffer) depends heavily on the specific database engine you are running and how it is configured.
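
For reference, a read hit ratio is just the share of logical reads that did not need a physical read. A minimal Python sketch of the arithmetic follows; the function name and the sample counts are made up for illustration, and you would feed it whatever physical and logical read counts your monitoring tool reports.

```python
def read_cache_hit_ratio(physical_reads, logical_reads):
    """Percentage of logical reads satisfied from the buffer cache.

    Returns None when there were no logical reads in the interval,
    because the ratio is undefined in that case.
    """
    if logical_reads == 0:
        return None
    return 100.0 * (logical_reads - physical_reads) / logical_reads


if __name__ == "__main__":
    # Hypothetical counts: 100 logical reads, 5 of which went to disk.
    print(read_cache_hit_ratio(5, 100))
```

A ratio that stays low on a read-heavy workload is the case where a larger buffer is most likely to pay off.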

Another thing to consider is the application. All too often application programmers don't consider machine impact in their code. Many of them don't have a clue what their code is actually doing. DBAs can be just as guilty. In a lot of cases the load is added by batch-oriented processing that could just as easily be performed off hours. I've found that the best way to educate them is to allow them all kinds of latitude on a test platform. When a developer requests disk space, explain how overworked you are right now with production work, but say that you can slip in a drive in the next few minutes and let them configure it themselves (sudo works great for this). This, combined with some hands-on education with GlancePlus, can be a real eye-opener for people who want to learn. Others complain that it's not their job, but so be it; they will need to wait a while. The cardinal rule to follow if you play this game is to NEVER get upset if they hose the test machine. Just keep in mind that every screw-up in test is one less to deal with in production.
Dave Wherry
Esteemed Contributor

Re: Disk I/O issue

Are you working in the character version of Glance or the GUI (gpm)?
If you are using the character version and you are talking about the Disk Util bar at the top of the screen it may be nothing to worry about. That bar shows the most active disk during the last sample interval. It is not system disk usage. So 100% there may not be an issue.
You need to drill down and see which disk is so busy. Type u to get to the IO By Disk screen. Here you can see if it is the same disk that is constantly busy or if the activity moves around. If it is moving around it is probably not a problem. If it is staying on one disk all of the time you need to distribute your data a little better.
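
If you would rather not eyeball the screen, the same check can be sketched in a few lines of Python. This assumes you have already collected per-disk utilization percentages for several sample intervals by some means; the data layout, the device names, the function name, and the 90% / 80% thresholds are all assumptions chosen for illustration, not Glance defaults.

```python
def classify_busy_disks(samples, busy_pct=90.0, persistent_share=0.8):
    """Split busy disks into persistently hot and transiently hot.

    `samples` is a list of dicts, one per interval, mapping a disk
    name to its utilization percentage in that interval.
    A disk is "persistent" if it was at or above `busy_pct` in at
    least `persistent_share` of all intervals, otherwise "transient".
    """
    busy_counts = {}
    for sample in samples:
        for disk, util in sample.items():
            if util >= busy_pct:
                busy_counts[disk] = busy_counts.get(disk, 0) + 1

    persistent, transient = [], []
    for disk, count in sorted(busy_counts.items()):
        if count / len(samples) >= persistent_share:
            persistent.append(disk)
        else:
            transient.append(disk)
    return persistent, transient


if __name__ == "__main__":
    # Hypothetical samples for two disks over five intervals.
    samples = [
        {"c1t0d0": 100, "c2t0d0": 20},
        {"c1t0d0": 95, "c2t0d0": 97},
        {"c1t0d0": 99, "c2t0d0": 10},
        {"c1t0d0": 100, "c2t0d0": 5},
        {"c1t0d0": 98, "c2t0d0": 30},
    ]
    hot, moving = classify_busy_disks(samples)
    print("persistently busy:", hot)
    print("busy only now and then:", moving)
```

Disks in the first list are the ones whose data is worth redistributing; disks in the second list are the "activity moves around" case and are probably not a problem.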
In glance type ? and you will see a list of the different screens you can use to drill down for more information.
Without knowing more about your hardware it is difficult to offer many suggestions. I agree very much with Tim that striping across your interfaces is a big winner in most cases.