Disk optimisation

 
Tim D Fulford
Honored Contributor

Disk optimisation

Hi

This could be a mammoth question, and I suspect the answer is "Perform tests", which costs money, time, people... Anyway, we are really looking to reduce the number of iterations needed to get this right!

Hardware layout:
We have an FC60 with 30x 18 GB disks and two FC60 controllers. There are 3x SC10s in split-bus mode, so all 6 FC60 buses are used. The disks are split up using hardware mirroring (RAID 1) over 13 LUNs, leaving 4 disks as hot spares. Of the 13 LUNs, we split them into 2 volume groups: 12 LUNs for the database [vgdbase] and 1 LUN for the filesystem stuff [vgfsystem]. We then use LVM. The database logical volumes are striped across all 12 LUNs of vgdbase with a stripe size of 4 kB.
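For reference, the current vgdbase layout can be reproduced with something like the following sketch. The device file names and LV size are placeholders (use whatever ioscan shows on your system); only the -i/-I striping options are the point here:

```shell
# Sketch of the current setup on HP-UX; device files are hypothetical.

# Initialise the 12 database LUNs as LVM physical volumes
for pv in /dev/rdsk/c5t0d0 /dev/rdsk/c5t0d1   # ... plus the other 10 LUNs
do
    pvcreate -f $pv
done

# Create the database volume group from the LUNs
mkdir /dev/vgdbase
mknod /dev/vgdbase/group c 64 0x010000
vgcreate vgdbase /dev/dsk/c5t0d0 /dev/dsk/c5t0d1   # ... plus the other 10

# A database LV striped across all 12 LUNs with a 4 kB stripe size
lvcreate -i 12 -I 4 -L 2048 -n lvdata vgdbase
```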

Database:
This supports an OLTP workload, reading/writing say 1 page of information (2 kB) at a time. However, it also does some considerable bulk loads. Optimising the system for one will severely impact the other (OLTP vs DSS).

Question:
There are many ways of configuring the above hardware, and a heated argument has arisen about how best to optimise the storage.
o Get rid of LVM stripes & use H/W stripes (expand disks into at least 4 SC10s + 2 disks)
o Keep LVM stripes, but lose the 4 kB stripes & use extent-based stripes [4 MB]
o Keep the current config
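The second option (extent-based striping) is done on HP-UX with distributed allocation over a physical volume group rather than with -i/-I. A rough sketch, assuming the default 4 MB extent size and hypothetical device names; successive 4 MB extents then land on successive LUNs:

```shell
# Sketch of option 2: extent-based (distributed) striping.
# Device files and the PVG name "dbluns" are placeholders.

# Define one PV group holding the database LUNs
cat >> /etc/lvmpvg <<EOF
VG /dev/vgdbase
PVG dbluns
/dev/dsk/c5t0d0
/dev/dsk/c5t0d1
EOF

# Create the LV with distributed allocation (PVG-strict) instead
# of block-level -i/-I striping; the "stripe" unit is now the
# physical extent size (4 MB by default from vgcreate -s 4)
lvcreate -D y -s g -L 2048 -n lvdata vgdbase
```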

I'll stop here; any pointers would be most appreciated

Tim
-
9 REPLIES
Tim D Fulford
Honored Contributor

Re: Disk optimisation

I'm replying to my own question (as no one else will).

For anyone who is interested, the FC60 gave a better average block size (bandwidth) with extent-based stripes (5.5 kB, as opposed to 2.5 kB with the kilobyte stripes). There was generally a small drop in throughput, though the disks were busier, so ultimate throughput (IO/s) will have dropped. From this I guess going to 8 kB or 16 kB stripes would be the "optimum".....

OR...
Drop it, & go to a VA7410; striping is not an issue as it is done internally in the VA!!

I'd award myself 10 points but can't!!

Tim
-
Pete Randall
Outstanding Contributor

Re: Disk optimisation

Tim,

The word I'm getting from my vendors is that I need to look into replacing my (less than 3 year old) FC60, as HP is going to be discontinuing it. I'm not sure of the time frame. If this is true, I would lean toward your last option and go with a VA solution. Of course that costs money! If money is an issue, you seem to have almost convinced yourself (and therefore me) to go with 8 kB or 16 kB stripes.

For what it's worth, we've got almost exactly the same configuration: a dual-controller FC60 with 3 SC10s in split-bus mode containing 30 18 GB disks configured as RAID 0/1 (which is mirrored and striped). We've always been satisfied with the performance and have therefore left it alone.


Pete
Wayne Green
Frequent Advisor

Re: Disk optimisation

Tim,

Slightly different here. We have an FC60, 2 controllers, 6 SC10s, 36 GB disks, fully populated. All LUNs are now 6-disk RAID5 for maximum storage capacity; the config would only allow 16 kB stripes. We are running Oracle OLTP, which had response problems during large I/O (a copy of the DB, LUN to LUN).

Concentrating on the I/O, as this was critical, we tried both LVM striping across 3x RAID5 LUNs and a H/W 12-disk RAID1/0 LUN, and got the same increase in I/O from both: measured I/O time halved.

We also nailed the buffer cache at 20% on a 4 GB L3000 system as a compromise.

Other stuff we did which may or may not apply -

We were also told about the scsictl -m queue_depth setting. This is the number of requests that can be queued to a device and defaults to 8, but that default has a single disk in mind, not a LUN. We upped it to 64 on a RAID5 LUN and got noticeable improvements.
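For anyone wanting to try this, the commands look roughly like the following (the device file is a placeholder for your own RAID5 LUN; note the setting does not persist across reboots unless reapplied):

```shell
# Raise the per-LUN SCSI queue depth from the default 8 to 64
scsictl -m queue_depth=64 /dev/rdsk/c4t0d0

# Display the current mode parameters to verify the change
scsictl -a /dev/rdsk/c4t0d0
```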


We also used the -S switch of cp, which bypasses the buffer cache, so there is no hit on OLTP. It doubles the I/O time, though.

There is also a kernel parameter, disksort_seconds, which is supposed to add a timing factor to the disk I/O algorithm, but this had no effect at all.

I hope to get off the FC60s and onto VAs soon, but I am suspicious of the AutoRAID. We had worse problems on a 12H before.

Cheers
Wayne
I'll have a beer, thanks
Tim D Fulford
Honored Contributor

Re: Disk optimisation

Pete

I posted this question over a year ago. I just happened to be able to answer it after some tests, and the game has moved on.

To give you an idea, the FC60 was getting service times of about 4-5 ms, say 4.5. The VA7410 with 30x 15k rpm disks gives 1.5-2.5 ms; this is about 125% more performance (throughput).

The VA7410s are cheaper (comparing "from new" prices); unfortunately this gain is obliterated by the fact that you need fibre switches, which cost more than the FC hubs. We did a direct swap (even keeping the kilobyte striping, which I objected to, but hey ho) and have been very pleased with the performance. Previously we upgraded the CPUs, only to find that the 750 MHz parts were 50% better, which pushed the bottleneck onto the disks! Now the bottleneck is firmly in the CPU camp (which is where we want it).
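The service-time numbers above translate to throughput as follows (a back-of-envelope check, assuming throughput scales inversely with service time and taking 2.0 ms as the VA7410 midpoint):

```shell
# Throughput ~ 1/service_time; work in tenths of a ms to stay in integers
fc60_svc=45    # 4.5 ms on the FC60
va_svc=20      # 2.0 ms on the VA7410 (midpoint of 1.5-2.5)

gain=$(( fc60_svc * 100 / va_svc - 100 ))
echo "VA7410 gives ${gain}% more throughput"   # -> 125% more
```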

Many thanks for your interest.

Tim
-
Pete Randall
Outstanding Contributor

Re: Disk optimisation

Tim,

Sorry to say I never even noticed the date on your original post. It definitely sounds like the VA is the way to go. Now if only we had the money . . .


Thanks,
Pete
Tim D Fulford
Honored Contributor

Re: Disk optimisation

Wayne

Many thanks for your input.

I'm surprised that RAID5 gives good OLTP performance; I thought it would hammer small-write performance. But you got what you got. The kernel pointers are also nice to hear about. I wish I had known about these some months back, as we ran into similar problems (another system, upgrading an FC60 from 30 disks to 60 disks, and performance went through the floor!).

On the VA we use RAID1+0. From what I hear, the AutoRAID SHOULD give you the best of both worlds: it assigns a portion of the disks to RAID1+0 for random, small-write performance and the rest to RAID5DP (RAID6 in old money).

As for the old AutoRAIDs, I have had some experience with them, with no problems personally. However, I did hear about the performance being really slow under certain conditions. But I think that taking two-generation-old technology and inferring that the newest-generation product will suffer from the same problems is really stretching a point. I would hope HP has learned from past mistakes.

Many thanks again

Tim

-
Stefan Farrelly
Honored Contributor

Re: Disk optimisation

Very interesting argument. In my experience, RAID subsystems which stripe at the hardware level (as RAID does across all disks in the RAID group) will always benefit from LVM striping in addition; I've seen this result many times. I've never yet seen a hardware disk subsystem which does perfect RAID10 (the best possible I/O throughput!), so LVM striping can always help, and it is easier to control since you can do it yourself regardless of how the disk subsystem was set up (which is usually a lot harder to reconfigure).
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Tim D Fulford
Honored Contributor

Re: Disk optimisation

Stefan

One reason (I believe) that LVM striping is better than H/W is that you tend to have more LUNs, which reduces the HP-UX disk queues. For example, compare a 60-disk subsystem (an imaginary system based on, say, an FC60):
o This could be split as 30 RAID1 LUNs (2 disks per LUN) & striped across at the LVM level
OR
o 10 LUNs of 6 disks per LUN in RAID1+0, then stripe across these
OR
o 1 LUN with all 60 disks in RAID1+0

At the HP-UX level, if it sends out 60 IOs: on the first layout, 30 can be dealt with and the other 30 queue (disk q=1); on the second, 10 IOs are dealt with and the other 50 queue (disk q=5); on the third, 1 is dealt with and 59 queue (disk q=59).
I know the upside is that the service times for the IOs will get smaller, but HP-UX still needs to store these queues somewhere, which is an overhead.
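The back-of-envelope model above can be written down directly (assuming, as in the example, one IO in service per LUN at a time and the remainder spread evenly over the per-LUN queues):

```shell
# 60 outstanding IOs spread over N LUNs; one in service per LUN,
# the rest sit in the per-LUN disk queue
total_ios=60
for luns in 30 10 1
do
    queued=$(( (total_ios - luns) / luns ))
    echo "$luns LUNs -> disk queue ~ $queued per LUN"
done
```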

I think this is the crux of what Wayne was describing in tuning the kernel parameter scsi_max_qdepth (and the per-device scsictl queue_depth setting).

Many regards

Tim


-
Wayne Green
Frequent Advisor

Re: Disk optimisation

Tim,

I think you're right about RAID5 and OLTP performance, but our issues were with OLTP performance while a high I/O load was going through the FC60, and with the I/O rates achieved.
To improve the rate, the reasoning from HP was: the more spindles the better.

With the AutoRAID we put the performance down to the working-set size. Wasn't it supposed to be up to 10% of the space configured, or the data that changed in 24 hours? Trouble was, loads of data changed every 24 hours, so it must have been moving various bits of it between RAID5 and RAID10. It was quicker to put it on tape. I'm told this would not be a problem on the VAs, but basically the real issue was the intransigence of the customer to change their working practices.

After explaining that improving sequential I/O rates would impact random I/O rates and vice versa, I was told this was a fault in the OS and should be addressed by HP. I was a bit stumped by that.
I'll have a beer, thanks