
IO/sec and performance

 
SOLVED
Tim Hempstead
Frequent Advisor

IO/sec and performance

We are currently having discussions at work over one of our systems which appears to be having a few performance issues related to its usage of disk storage.

We are pretty sure that the problem lies with the number of IOs/sec being requested rather than the amount of data throughput. Looking in Glance (-i) we often see one of the filesystems with figures in the 500-700 range in the physical IO column (and many more than that in the logical IO column). Looking at the -u screen in Glance frequently shows ~200+ physical IOs per second on the individual disks housing the busy filesystem(s).
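For reference, we can also sample the same per-disk rates outside Glance with sar; the interval and count below are just an example:

# 12 samples at 5-second intervals; r+w/s is the physical IO rate per device,
# avwait/avserv show queueing and service times
sar -d 5 12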

We have asked our applications people to look into the IOs which are going on and see whether any tuning can be performed to reduce them. They have queried why we think these figures are something to be concerned about.

From what we have seen in the past, our rule of thumb has been that if a disk is being hit with more than ~100 IOs/sec it may start to be an issue, and ~200 IOs/sec is heading towards a real issue. In this case, however, the "disks" are actually LUNs on a SAN-attached disk array. We have asked our SAN guys to research what the disk unit is capable of in terms of IOs/sec, but we were wondering what the forum's opinion is. How many IOs/sec should we consider to be an issue worth looking into, either with normal disks or with SAN-attached LUNs (HDS)?

Cheers

Tim
7 REPLIES
BUPA IS
Respected Contributor

Re: IO/sec and performance

Hello Tim,
The performance of a particular disk depends on a whole lot of things, not just the I/O rate. I have some old 7200rpm disks which get indigestion at 40 or 50 I/Os per second, so picking a rule of thumb can be difficult.
Can you tell us what model the system is?
What sort of array is behind the LUNs, and how is the array connected to the system?
Is there a one-to-one mapping from a physical disk to a LUN?

From Glance option u (or the gpm graph), are you getting any queueing on the LUNs, indicated under Qlen?

Are you using any mechanism to spread the load down the channels?

Is the data in Unix files (cooked) or is it in raw, database-managed disk space?
Does the disk array have its own cache?
regards
Mike
Help is out there always!!!!!
Tim Hempstead
Frequent Advisor

Re: IO/sec and performance

Mike,

The system is a vPar (10x CPU, 26GB memory) running on a PA-RISC SD32000 Superdome. The disk attached over the SAN is contained in an HDS 9570 disk array; this will have a cache, but I do not know how much is installed. SAN connectivity is provided by 2x 1Gb links from the server to the SAN switch. There is not a one-to-one mapping at the array end between physical disks and LUNs. The LUNs and their alternate path(s) are added to the volume group in the usual manner. Data is held on normal VxFS largefile-enabled filesystems.

Glance -u shows Qlen at 0 against the LUNs most of the time, with occasional peaks up to ~10.

Cheers

Tim



Steve Lewis
Honored Contributor
Solution

Re: IO/sec and performance

High ops per second with low throughput is often caused by transactional databases with a small page size in high-load TP environments.
Another cause is widely spaced data requests, at the inner and outer edges of the disks. This can be mitigated by increasing the amount of striping and by spreading the load over more interfaces.
HDS arrays are often configured 2+2 striped and mirrored, which isn't as wide a stripe as you sometimes need with TP. People often use LVM or VxVM to cross-stripe their HP-UX logical volumes over multiple LUNs to increase the bandwidth and reduce the queues. I find that 4-LUN stripes work well and give better manageability than PVG distributed allocation. You just need to ensure that your LUNs are actually on different ACP processors and different groups at the back end of your disk array.
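For illustration only, a minimal sketch of what such a 4-LUN LVM stripe looks like (the volume group, LV name, size and stripe width below are made-up placeholders, not a recommendation for your array):

# 4-way stripe, 64 KB stripe size, 8 GB logical volume; LVM picks 4 PVs (LUNs) from vg01
lvcreate -i 4 -I 64 -L 8192 -n lvstripe01 /dev/vg01
# largefile-enabled VxFS on top, as you are already using
newfs -F vxfs -o largefiles /dev/vg01/rlvstripe01

Whether 64 KB is the right stripe size depends on the database IO size, so treat it as a placeholder.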

My XP12K (basically a big HDS) is hardly ticking over at 17,000 ops/second into an 8-way machine over dual fibre. 200 ops/second on a single LUN is nothing at all.

1Gbit fibre is slow by today's standards and you should consider upgrading to 2Gbit.
Secondly, consider 4 interface cards, not just 2.
You also need to start learning more about your SAN storage.
Tim Hempstead
Frequent Advisor

Re: IO/sec and performance

Steve,

Yep, I agree I need to know more about the SAN environment. The trouble is that where I am, Unix Support does not support the SAN at all; that is handled by another specific team, so getting time and training budgets is an issue ... which means in this situation we do not have the knowledge or visibility of that end of the problem.

We are looking at upgrading from 1Gbit fibre to 2Gbit or 4Gbit (along with a possible new server), but as with all things that involves spending money, which means it takes time.

We will have a look into striping the LV across LUNs on different parts of the backend as you suggest.

Cheers

Tim
TwoProc
Honored Contributor

Re: IO/sec and performance

Tim, I agree with and like Steve's last post, especially his two suggestions: 1) convert to 2Gbit and 2) use four cards instead of two.

However, at any given number of I/Os per second, whether or not you're going to see problems depends on how many disks (and ACPs) are behind the LUN you're looking at. Also, if you're configured on 1Gbit cards, I suspect it's an older XP (HDS) system. If that's so, the price of cache for those units on the used market has dropped to a fraction of what it used to be - you should look at maxing out the cache on the server as well.
We are the people our parents warned us about --Jimmy Buffett
BUPA IS
Respected Contributor

Re: IO/sec and performance

I agree with Steve and Tim; they have said most of what I would say anyway.

My preference is to use the distributed extent method (lvcreate -D) of spreading the LV about; it makes using LVM mirroring easier if you need it later.
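Purely as a sketch (the PVG layout and all names, device files and sizes below are made-up placeholders; check lvmpvg(4) and lvcreate(1M) for your own setup):

# /etc/lvmpvg - two physical volume groups, e.g. one per array path/controller
VG /dev/vg01
PVG PVG0
/dev/dsk/c8t0d1
/dev/dsk/c8t0d2
PVG PVG1
/dev/dsk/c10t0d1
/dev/dsk/c10t0d2

# PVG-strict, distributed allocation: extents go round-robin across the PVs,
# and mirror copies (if added later) must land on a different PVG
lvcreate -D y -s g -L 8192 -n lvdata01 /dev/vg01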

Find out how big your LUN sizes are in relation to the physical spindle size on the array. For best performance you want to try and get the VG composed of LUNs on different spindles; your disk supplier should be able to help, as adjacent LUNs tend to be on the same spindle sets.
Also, your disk supplier should be able to run a performance tool against the array itself and tell you if there is a major bottleneck in there, e.g. not enough cache or all the I/O going through one internal bus.
If you do not have a tool to automatically load balance the FC paths, you can cheat by using
pvchange -s -A y /dev/dsk/cxxtydn
to move some of the I/O to the alternate link; you may have to rerun it after each reboot.
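A crude way to keep that across reboots (the device files below are placeholders for your own alternate paths) is to put the pvchange calls in a small script run at startup:

# re-point some of the busy LUNs at their alternate FC path after each boot
pvchange -s -A y /dev/dsk/c10t0d1
pvchange -s -A y /dev/dsk/c10t0d3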
If the disks contain data from a database, get the DBAs to have a look at the database cache and page sizes.

Not that I want to compete or anything, but I found one of our disks exhibiting silly queue lengths and 100% busy, see below. This disk will show physical I/O above 1100 on occasion. (The end users are not actually complaining about this system yet!)
An analysis of the content shows that the DBA has placed the database rollback logs on the same disk as a major part of the database.
We are going to move the rollback logs elsewhere as a quick fix, but a complete reorganisation of this volume group is on the cards.
N.B. Glance is fibbing: this disk only has extents from lvol7 on it. Use lvdisplay and pvdisplay to confirm (example commands follow the listing below).

glance option u line

Idx  Device              Util     Qlen   KB/Sec          Logl IO   Phys IO
--------------------------------------------------------------------------------
 75  1/0/4/0/0.33.27.19   100/100  630.2  4988.2/5287.1   na/na     627.8/640.7


B3692A GlancePlus C.03.72.00 15:37:57 9000/800 Current Avg High
--------------------------------------------------------------------------------
CPU Util S SU U | 82% 73% 88%
Disk Util F F |100% 92% 100%
Mem Util S SU UB B | 80% 80% 80%
Swap Util U UR R | 36% 36% 37%
--------------------------------------------------------------------------------
Disk Detail for: 1/0/4/0/0.33.27.19.0.1.5 Bus Type: scsi
Controller: na Product ID: Vendor: EMC
--------------------------------------------------------------------------------
Log Reads :     na    Phy Reads :  607.1    Util (%) :  100    q=0 (%) :    0
   Writes :     na       Writes :    0.6    Q Len    : 710.0
 Rd Bytes :     na     Rd Bytes : 4840.4
 Wt Bytes :     na     Wt Bytes :    4.8
--------------------------------------------------------------------------------
Mapping Information
--------------------------------------------------------------------------------
Partition :/dev/dsk/c8t1d5
Partition :/dev/rdsk/c8t1d5
File Sys :/oradata/vg02_lvol9
File Sys :/oradata/vg02_lvol8
File Sys :/oradata/vg02_lvol7
File Sys :/oradata/vg02_lvol10
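For completeness, the sort of commands I mean for confirming what is really on that disk (using the device and LV names from the listing above):

# which logical volumes actually have extents on this physical volume
pvdisplay -v /dev/dsk/c8t1d5 | grep /dev/vg02
# and where lvol7's extents live
lvdisplay -v /dev/vg02/lvol7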

I hope this helps.
Good luck,
Mike
Help is out there always!!!!!
Tim Hempstead
Frequent Advisor

Re: IO/sec and performance

Many thanks to those who have replied. We now have quite a few things to look into to see if we can stop what we are seeing from being a problem.

Cheers

Tim