System Administration
Disk Queuing on LINUX (sar -d or OVPA/Glance)

Alzhy
Honored Contributor

Disk Queuing on LINUX (sar -d or OVPA/Glance)

Should this be treated and addressed in the same way as UNIX?

I am seeing disk queues in excess of 12, sometimes breaching 20, on a database (using ASM) that we recently migrated to a big-iron Linux server.

The server currently has 2x 4Gbit FC HBAs, all driver modules at their default settings, runs HP Device Mapper 4.3 on RHEL 5.4, and is hooked up to an XP12000 array. The server is the lone occupant of the XP12K's front-end CHIP ports.

I am recommending adding 2 more FC HBAs and, correspondingly, another pair of CHIP ports on the XP12K array.

Hakuna Matata.
3 REPLIES
JBR
Frequent Advisor

Re: Disk Queuing on LINUX (sar -d or OVPA/Glance)

Hi, I am running Oracle RAC with ASM on three Linux servers against an XP24000, and my multipath configuration is round-robin. Can you check this setting?

# multipath -l

disk_asm1 (360060e80152649000001264900000367) dm-2 HP,OPEN-V*4
[size=100G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=0][active]
\_ 0:0:0:28 sdac 65:192 [active][undef]
\_ 1:0:0:28 sdcg 69:64 [active][undef]
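
For reference, the round-robin behaviour comes from the device settings in /etc/multipath.conf. Here is a rough sketch of such a device section; the attribute values are illustrative only, not a verified recommendation for the XP12000:

# Illustrative /etc/multipath.conf fragment spreading I/O across both HBAs
devices {
    device {
        vendor                 "HP"
        product                "OPEN-.*"
        path_grouping_policy   multibus
        path_selector          "round-robin 0"
        rr_min_io              100
        no_path_retry          12
    }
}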

You can try to obtain more information about disk performance using this tool: FIO
http://www.linux.com/archive/feature/131063
It is one of the best tools for identifying latency and I/O rates.
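
A minimal fio invocation along these lines can show latency and IOPS under a given queue depth; the file name, size, and parameters below are only an example to adapt to your environment:

# Example: 8k random-read test with a moderate queue depth (adjust to taste)
fio --name=randread-test --filename=/tmp/fio.test --size=1G \
    --rw=randread --bs=8k --direct=1 --ioengine=libaio \
    --iodepth=16 --runtime=60 --time_based --group_reporting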

Best Regards
Ivan Ferreira
Honored Contributor

Re: Disk Queuing on LINUX (sar -d or OVPA/Glance)

For both Unix and Linux, I normally use the average service time as the disk performance metric.

If the problem is in the back-end processors, array cache, or parity groups, then adding more FC HBAs won't help.

Besides the disk queues, do you see high front end processor usage on your disk array? What about your service time?
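
If the sysstat package is installed, one quick way to watch per-device service and wait times is, for example:

# Extended per-device statistics every 5 seconds; 'await' includes time spent queueing,
# 'svctm' approximates the pure service time
iostat -x 5
# The same kind of data via sar (three 5-second samples, with readable device names)
sar -d -p 5 3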
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
TwoProc
Honored Contributor

Re: Disk Queuing on LINUX (sar -d or OVPA/Glance)

If you are using the "big LUN" approach (one or two huge LUNs per volume group), then the above is pretty common; just keep in mind that the default SCSI queue depth is 8, so you may have to increase it if you haven't already done so (see the sketch below).
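
As a sketch, you can inspect and raise the per-LUN queue depth through sysfs; the device name below is illustrative, and the persistent setting belongs in the HBA driver options (e.g. qla2xxx/lpfc module parameters):

# Check the current queue depth for one path, then raise it
# (takes effect immediately, but does not survive a reboot)
cat /sys/block/sdac/device/queue_depth
echo 32 > /sys/block/sdac/device/queue_depth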

If you use the "lotsa LUNs" approach (lots of OPEN-9s, OPEN-Es, etc.), then you've got more of a problem. How large is the cache at the front of the XP? Are you using RAID 5 or RAID 0/1?

Have you checked with the DBA, and his run of Statspack, to see whether a hot tablespace is what's got you? What buffer_cache size is he/she using? What is the hit ratio?
If you've got high I/O and your DBA believes that a hit ratio of 93% or higher is good enough (which is the rule of thumb most use), it may not be high enough. To raise the buffer hit ratio and keep I/O at bay, you may need a much, much larger buffer cache.
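
As a rough way to check that ratio yourself, the classic calculation from v$sysstat looks something like this (run it with a suitably privileged account; it's a sketch, not a substitute for Statspack):

# Check the buffer cache hit ratio: 1 - physical reads / (db block gets + consistent gets)
sqlplus -s "/ as sysdba" <<'EOF'
SELECT ROUND(1 - (phy.value / (db.value + con.value)), 4) AS buffer_hit_ratio
  FROM v$sysstat phy, v$sysstat db, v$sysstat con
 WHERE phy.name = 'physical reads'
   AND db.name  = 'db block gets'
   AND con.name = 'consistent gets';
EOF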

I've *clearly* seen, when moving a database from a slower server to a faster one, that I absolutely must increase the available Oracle buffer_cache. This is most evident when a DBA tries to use mostly the *same* init.ora file for startup as on the slower server, believing that it is *already tuned*.

The truth is, while your processor speed and capabilities may be 2, 3, or 6 times faster, your I/O subsystem is probably largely the same as before. Right?

Well, that means that since *you can* now process more data, your bottleneck has moved from being (presumably) processor bound to being I/O bound. Which means you need a) more I/O bandwidth or b) more RAM to cache away most of your I/O problems.

In all cases that I've seen, it is always much, much cheaper to add RAM to the server and allocate it to the Oracle SGA than to increase I/O bandwidth, where that is possible.

One more gotcha: your DBA may be very reluctant to vastly increase the size of the SGA, because most Oracle-trained DBAs are taught that you only need to increase this area if your hit ratio is low. That is absolutely wrong: you increase your buffer_cache if your hit ratio is low and your I/O is high, AND you increase your buffer_cache again if your hit ratio is now better but total I/O is still too high. Naturally, like most things, the law of diminishing returns is at work here. A 4 GB increase in buffer_cache does a lot of good if you're starting out with only 4 GB of cache, but if you add 4 GB when you've already grown it to 24 GB, the reduction in I/O per GB of RAM consumed is much smaller.

Have a chat with your DBA; buy him/her lunch and make friends. I think your problems may be there.
We are the people our parents warned us about --Jimmy Buffett