Operating System - HP-UX

LVM striping - sar -d horror

 
SOLVED
Steve Lewis
Honored Contributor

LVM striping - sar -d horror

Hi, just thought I'd see if anybody can shed any light on my customer's problem. They manually striped 19 disks in a VG using 1 MB extents, but they striped each of the 4 LVs one at a time, rather than interleaving the LVs across the disks as you can do. Files get ftp'd into the first LV and are worked on before being copied to the others. The c5... disks are mirrored to the c8... disks, and c4 is mirrored to c7. Check out these horrific sar stats!
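For reference, the interleaved layout they skipped would normally be set up with distributed, PVG-strict allocation, so each LV's extents rotate across all the disks while the mirror copies stay on the other bus. A rough sketch from memory only - the device names, sizes and VG name are made up, so check lvmpvg(4) and lvcreate(1M) before using any of it:

# /etc/lvmpvg - put the primary disks in one PVG and their mirrors in another
VG /dev/vgdata
PVG PVG0
/dev/dsk/c5t0d0
/dev/dsk/c5t1d0
/dev/dsk/c5t2d0
PVG PVG1
/dev/dsk/c8t0d0
/dev/dsk/c8t1d0
/dev/dsk/c8t2d0

# distributed (-D y), PVG-strict (-s g) allocation interleaves extents across
# the disks of PVG0; the mirror copy should then land on PVG1
lvcreate -n lvol1 -D y -s g /dev/vgdata
lvextend -L 4096 /dev/vgdata/lvol1      # grow to 4 GB, extents spread round-robin
lvextend -m 1 /dev/vgdata/lvol1         # add the mirror copy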
5 REPLIES
Steve Lewis
Honored Contributor

Re: LVM striping - sar -d horror

It's a T500 with lots of disks, 6 optical libraries and a couple of DLT libraries.

          device     %busy    avque   r+w/s   blks/s    avwait   avserv
09:48:13  c11t15d0    1,98     0,50       1        4      7,86    12,62
          c4t5d0     56,44    31,05      64     1018    179,21    58,77
          c5t0d0     99,01   639,62      50      796  12538,25   179,90
          c5t1d0     99,01   479,50      61      982  12745,78   137,88
          c5t2d0     99,01   588,44      65     1046   8305,22   132,22
          c5t3d0     99,01   261,00      70     1125   4386,75   112,49
          c5t6d0     99,01    41,50      77     1236  11801,90   107,25
          c5t15d0    99,01   773,50      40      634  13234,55   229,68
          c7t5d0     53,47    31,52      63     1014    177,56    59,01
          c8t2d0     56,44    71,84      69     1109    194,67    58,20
          c8t3d0     99,01   101,00     108     1727    690,48    73,17
09:48:14  c4t5d0     11,00     0,50       2       22   2519,70   469,95
          c5t0d0    100,00   597,19      62      992  13286,15   156,79
          c5t1d0    100,00   412,45      73     1146  12663,76   114,06
          c5t2d0    100,00   576,50      74     1174   7737,88   108,86
          c5t3d0    100,00   187,00      77     1222   8128,33   105,96
          c5t4d0     54,00    76,03      50      788    156,04    67,25
          c5t6d0     14,00     1,00       3       48  44560,91   430,15
          c5t15d0   100,00   731,00      45      720  13594,93   186,19
          c7t5d0      9,00     0,50       2       22   2466,86   456,96
          c8t0d0     26,00    12,11      28      448     90,27    61,39
          c8t2d0     59,00    28,50      58      928    885,74    82,84
          c8t3d0     52,00    23,00      47      752   1686,35    90,84
          c8t4d0     51,00    76,98      65     1040    187,39    56,07
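For what it's worth, that looks like one-second sar -d samples; something like the line below would collect the same thing (the interval and count are just a guess):

sar -d 1 60     # disk activity, one-second samples, 60 intervals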
Rick Garland
Honored Contributor

Re: LVM striping - sar -d horror

Looks like a rebuild of the VG using the correct striping configuration is in order.
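If they do rebuild it, true LVM striping is the other option. A rough sketch only - the stripe count, stripe size, LV size and names are illustrative, and as far as I recall the older LVM releases will not mirror a striped (-i) LV, which is exactly why extent-distributed layouts get used when MirrorDisk/UX is involved:

lvcreate -i 6 -I 64 -L 20480 -n lvol_data /dev/vgdata   # 6-way stripe, 64 KB stripe size, 20 GB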
Stefan Farrelly
Honored Contributor
Solution

Re: LVM striping - sar -d horror


I don't see any evidence that the striping method used is causing a problem; all your sar stats show is that the disks listed are very busy. Why are they busy? The system must be using them very heavily. The striping itself doesn't put any load on the disks; it is designed to even the load out, and your sar output shows all the disks being used to some degree, which is good. Without striping and mirroring, some of the disks would be flat out while others sat idle or carried much less load. HP's internal OpenMail email servers all use manual striping and mirroring, and if I showed you a sar output from one of them you would see all the disks running at around the same level, none of them idle or doing little work.

To determine your I/O situation and how it can be improved, you need to delve into a hardware diagram: how many SCSI controllers do you have, how many disks are on each, how fast are the disks, and what is the best way to get optimal performance out of them? Then you can look into stripe methods, do some tests, and work out the best stripe layout for your type of application.
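A starting point for that mapping, using the standard tools (the VG and LV names here are assumptions):

ioscan -fnC disk                        # every disk and the controller/hardware path it sits on
vgdisplay -v /dev/vgdata                # PVs in the VG, plus allocated/free extents per PV
lvdisplay -v /dev/vgdata/lvol1 | more   # how an LV's extents are actually laid out across the PVs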

I'm from Palmerston North, New Zealand, but somehow ended up in London...
Anthony deRito
Respected Contributor

Re: LVM striping - sar -d horror

Steve, your disks are busy. This is normal for disks. Disks that run at 100% are simply busy for 100% of the measurement interval. You need to check out many other possible bottlenecks. Use vmstat, swapinfo, sar -u, sar -b and, above all, GlancePlus. Look at this problem from the processes' viewpoint: use the process list screens and check out wait states and system calls. Need help, just ask.
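For example, with the stock tools listed above (intervals are arbitrary):

vmstat 5 12       # run queue, paging and memory pressure
swapinfo -tam     # swap usage in MB
sar -u 5 12       # CPU utilisation
sar -b 5 12       # buffer cache read/write hit rates
glance            # GlancePlus - then the process list and wait-state screens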

Tony
Steve Lewis
Honored Contributor

Re: LVM striping - sar -d horror

OK, I got the disk map and indeed half of the controllers have few disks, so I told them to move some of the disks about and rebuild the VG. The PB controllers only move 20 MB/sec. Secondly, it turns out that they are just moving files with mv, so it would make sense to create one big filesystem and mv within it (a rename) rather than copying the data between LVs. Thirdly, I noticed process contention on the disk resources between archiving and queries. Setting fs_async may help, and they have a UPS, so I will suggest that. The disk wait and service times are high - worse than the number of blocks and %busy would suggest - indicating high latency. Lastly, it may help to move lv01 (the temporary area) to a different VG.
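For the fs_async change, the procedure I'd expect (from memory, on 11.x with kmtune - it is a static tunable, so it needs a kernel rebuild and a reboot, and it still trades filesystem metadata integrity for speed even with the UPS in place):

kmtune -q fs_async      # show the current value (0 = synchronous metadata writes)
kmtune -s fs_async=1    # stage the change, then rebuild the kernel and reboot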