Server Storage Performance

SOLVED
Larry Precure
Occasional Advisor

Server Storage Performance

I've just set up an ML150G2 with the optional hot-swap SATA RAID controller. At present, I'm using three 250GB drives in a RAID 5 array. The server has 2GB RAM and is running SuSE 9.

The server is going to be used as a file storage server, and virtually all of its use will be to store backup images of my customers' hard drives. At present, I haven't copied much data to the server. Total size of the backups I've made is around 200K files, and around 40GB of space. But I intend to put a LOT more information on the system. (I've purchased three more drives, but haven't installed them yet.)

Performance is MUCH worse than I expected. Friday, as a test (to narrow the bottleneck down somewhat), I created a new folder on the array, and copied all the backup files into this new folder. (Copying data from the array back to the array, simply on the server itself.)

Time for the server to copy 40GB from the array to the array: 3 1/2 hours.
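A test like this one can be repeated with a small script so that each configuration change produces a comparable number. The following is only a sketch: the function name `timed_copy` and the throwaway demo directory are illustrative assumptions, not anything from the original setup, and on small data sets the operating system's cache will make results look better than the disks really are.

```python
import shutil
import tempfile
import time
from pathlib import Path


def timed_copy(src: Path, dst: Path) -> float:
    """Copy the directory tree at src to dst and return throughput in MB/s."""
    # Add up the size of every regular file under src before copying.
    total_bytes = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())
    start = time.perf_counter()
    shutil.copytree(src, dst)  # dst must not exist yet
    elapsed = time.perf_counter() - start
    return total_bytes / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Demo on a temporary directory; point src at the real backup folder instead.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        (src / "sample.bin").write_bytes(b"\0" * (8 * 1024 * 1024))
        rate = timed_copy(src, Path(tmp) / "dst")
        print(f"{rate:.1f} MB/s")
```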

This performance is not acceptable for the task I'm attempting. I'm willing to try spending more money on the server if it will help. (More server RAM? Or is the bottleneck the RAID card? Or is this just the way SATA drives perform?)

HP tech support told me that this is just the normal performance hit caused by RAID 5, and that I need to go to RAID 0. But I'm reluctant to sacrifice half of my capacity, and I've just begun a test where I install two of the unused drives as a RAID 0 array, and it looks like it's going to take the system about 8 hours just to CREATE the array.

I'm really hoping that I'm not going to have to simply trashcan my $3K server, and start over looking for a $12K server with SCSI drives and controller, and so forth.
10 REPLIES
Steven E. Protter
Exalted Contributor

Re: Server Storage Performance

Raid five parity 9 gives you an enormous performance hit on writes.

It writes the data nine times. On big transfers, that is gonna hurt.

That's why you should go with RAID1.

Raid 0 offers no protection for your data.

Your server is salvageable. You just need to get it some disk.

I've built a few very nice RAID 1 Linux boxes that do nothing other than run like a tank and serve data. 40 GB copies on those babies in much less than 3 hours.

Regards,

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Larry Precure
Occasional Advisor

Re: Server Storage Performance

Actually, as I understand it, RAID 5 (I've never heard of "Raid five parity 9") means that everything that used to be a single disc write becomes one read and one write on each of two drives.

The system has to read the parity data, and the data that's about to be overwritten. It then subtracts the old data from the parity, and adds the new data to the parity, to calculate the NEW parity, writes the new data, and writes the new parity.

In short, what would have been read, seek, write, seek (to get ready for the next read), now becomes (I'm assuming worst-case timing) read, two seeks (on two drives, so they can be done "in parallel"), two reads (also in parallel), calculation, two writes (parallel), and one seek (to get ready for the next read).
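The parity update described above can be sketched in a few lines. In RAID 5 both "subtracting" the old data and "adding" the new data are the same operation, a bytewise XOR. The block contents and the helper names `xor_bytes` and `update_parity` below are made up for illustration; a real controller does this in firmware on whole stripe units.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: remove the old data from the parity, then add the new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# A three-drive stripe: two data blocks and one parity block (illustrative values).
d0 = b"\x0f\xf0\xaa"
d1 = b"\x33\x55\x01"
parity = xor_bytes(d0, d1)

# Overwrite d0. Only d0 and the parity block are read and rewritten; d1 is untouched.
new_d0 = b"\xff\x00\x10"
new_parity = update_parity(parity, d0, new_d0)

# The shortcut gives the same parity as recomputing from all the data blocks.
assert new_parity == xor_bytes(new_d0, d1)
# If the drive holding d1 fails, its contents can be rebuilt from the survivors.
assert xor_bytes(new_parity, new_d0) == d1
```

This is why a small write costs two reads and two writes regardless of how many drives are in the array.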

But it still seems really slow to me. I would have thought, if these had been just plain-old drives, without RAID, that this operation (copying 40GB) would have taken about 10 minutes. There's a big difference between 10 minutes and 200.

I could understand a 4:1 performance hit, but roughly 20:1 seems excessive.

(Just for comparison, on my workstation (P4 1.9, 1GB RAM, WinXP, one IDE drive), I just copied 1GB from the hard drive to itself. Time: 4 minutes.)
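Converting the timings quoted in this thread into throughput makes them easier to compare. This is only arithmetic on the figures given above (40GB in 3 1/2 hours, 1GB in 4 minutes, and the hoped-for 40GB in 10 minutes); it assumes 1GB = 1024MB, and the function name `throughput_mb_s` is just for illustration.

```python
def throughput_mb_s(gigabytes: float, minutes: float) -> float:
    """Average throughput in MB/s for a copy of the given size and duration."""
    return gigabytes * 1024 / (minutes * 60)


array_copy = throughput_mb_s(40, 210)   # RAID 5 array, 3 1/2 hours
workstation = throughput_mb_s(1, 4)     # single IDE drive, 4 minutes
hoped_for = throughput_mb_s(40, 10)     # the 10-minute estimate

print(f"RAID 5 array copy : {array_copy:6.2f} MB/s")
print(f"Workstation copy  : {workstation:6.2f} MB/s")
print(f"10-minute estimate: {hoped_for:6.2f} MB/s")
print(f"Measured vs. estimate: {210 / 10:.0f}:1")
```

Seen this way, the array-to-array copy is only somewhat slower than the single-drive workstation copy, which at the same rate would need about 160 minutes for 40GB. The 10-minute estimate may simply be optimistic for a copy that reads and writes on the same spindles.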

David Claypool
Honored Contributor
Solution

Re: Server Storage Performance

The old saying 'there's no such thing as a free lunch' applies here, both in performance costs and in pricing.

First, while SATA sounds impressive with its 1.5 gigabit speed, the drive in question is a 7200RPM drive, and its head-data rate is about 750 megabits per second, meaning you're not getting anywhere near the 1.5 gigabit speed (although because there are 6 SATA ports, each at 1.5Gb, at least you aren't saturating the controller as you would with a shared bus approach like parallel ATA or SCSI). However, this controller has read cache only, and 64MB isn't a lot for that at all, especially when you're planning on copying gigabytes at a time.

Comparatively, Ultra320 SCSI is 320 megaBYTES per second, or approximately 2.6 gigabits/sec. Most SCSI drives these days are 10,000 RPM, so you're looking at about a 25% improvement in head/data rate, although as mentioned above, a single channel would be saturated with anything more than 3 drives.
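The unit conversions in the comparison above can be checked with a short calculation. The 750 megabit head rate and the 25% improvement are the figures quoted in this reply, not measured values. The one fact added here is that first-generation SATA uses 8b/10b encoding, so the 1.5 gigabit line rate carries at most 150 megabytes of data per second.

```python
def sata_payload_mb_s(line_rate_gbit: float) -> float:
    """Usable MB/s on a SATA link: 8b/10b encoding puts 10 line bits on the wire per byte."""
    return line_rate_gbit * 1000 / 10


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


def mbyte_to_gbit(mbyte_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mbyte_per_s * 8 / 1000


sata_link = sata_payload_mb_s(1.5)    # per-port ceiling
sata_head = mbit_to_mbyte(750)        # quoted 7200 RPM head rate
scsi_head = sata_head * 1.25          # quoted "about 25%" improvement
u320_gbit = mbyte_to_gbit(320)        # shared Ultra320 bus
drives_to_fill_u320 = 320 / scsi_head

print(f"SATA link payload     : {sata_link:.0f} MB/s per port")
print(f"7200 RPM head rate    : {sata_head:.2f} MB/s")
print(f"10K RPM head rate     : {scsi_head:.2f} MB/s")
print(f"Ultra320 bus          : {u320_gbit:.2f} Gbit/s")
print(f"Drives to fill the bus: {drives_to_fill_u320:.1f}")
```

On these numbers, a single drive cannot fill its own SATA port, while about three 10K drives streaming at once would fill one Ultra320 channel.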

I can't tell from the 250GB SATA data sheet what the native specs of the drive are. You might want to check the model number of the OEM and do some research on it. One optimization feature that SCSI drives come standard with is command queueing (tagged command queueing on SCSI; the SATA counterpart is native command queueing), which not all SATA hard drives have (one reason to do some research). It lets the drive reorder requests that the OS issues in random order into a logical sequence, eliminating unnecessary seeks that would otherwise degrade performance.

The bottom line is that there's a lot of voodoo involved in understanding and optimizing storage performance. RAID 5 vs RAID 1 vs RAID 1+0 (all of which this controller is capable of) can be very significant. RAID 1+0 not only mirrors the data, it also stripes the data from one drive to the other, meaning that particularly during sequential read operations, instead of reading track 11 and then repositioning for track 12, hard drive A can be transferring track 11 and hard drive B has already been told to reposition to track 12 and is ready with its data as soon as track 11 on hdA has completed. Remember that your biggest enemy in storage performance is MECHANICS, not ELECTRONICS.
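The alternation between drive A and drive B described above comes from how a striped set maps logical chunks onto its members. The following is a sketch of a simple round-robin layout. The function name `locate_chunk` and the layout itself are illustrative assumptions; the actual mapping used by this controller is not documented in this thread.

```python
def locate_chunk(logical_chunk: int, mirror_pairs: int) -> tuple[int, int]:
    """Return (mirror pair index, chunk offset within that pair) for a logical chunk.

    Assumes plain round-robin striping across the mirror pairs.
    """
    if mirror_pairs < 1:
        raise ValueError("need at least one mirror pair")
    return logical_chunk % mirror_pairs, logical_chunk // mirror_pairs


# With two mirror pairs, consecutive chunks land on alternating pairs, so one
# pair can be positioning for chunk 12 while the other is transferring chunk 11.
for chunk in range(10, 14):
    pair, offset = locate_chunk(chunk, mirror_pairs=2)
    print(f"logical chunk {chunk} -> pair {pair}, offset {offset}")
```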

All is not lost, though. One thing that could really make a big performance difference is to minimize the times that the overhead is incurred. You could consider using a pair of smaller drives configured as RAID 1+0 and put the system and swap volumes there, keeping your pure data volumes on a RAID 5 set. Since the OS has to keep up with its housekeeping even while primarily being told to do a file copy, this could have a big impact. Since swap data isn't all that critical, you could even contemplate connecting a JBOD (non-RAID) drive to the built-in SATA controller and placing the swap volume there, not only eliminating RAID overhead, but also splitting that stream onto another system bus. An additional possibility is to acquire a second RAID controller and use it for the system RAID 1+0 volume and keep it separate from the RAID 5 volume (it's possible that trying to ask your current controller to handle both a RAID 1+0 set and a RAID 5 set at the same time might be beyond the performance of the controller's processor and bog you down).

Without personal experience with this configuration, I'm guessing, but in ascending order of performance I would try:

- Split swap volume off to a non-RAID volume using the system SATA controller
- Configure system and swap on RAID 1+0 and data on RAID 5 all on the same controller
- Configure system on RAID 1+0, swap on a non-RAID volume and data on RAID 5
- Configure system and swap on RAID 1+0 on a separate controller from the RAID 5 data
- Configure system on RAID 1+0 on a dedicated SATA RAID controller, swap on a non-RAID volume (using the standard SATA controller) and data on RAID 5 on its own dedicated RAID controller
Larry Precure
Occasional Advisor

Re: Server Storage Performance

I'd considered RAID 1+0, if I HAD to fall back on mirroring, but all I've seen says that this controller from HP only supports 0, 1, and 5. (If I have to go to straight RAID 1, then that also means my six drives have to be split into 3 volumes, partitioned separately, and so forth.)

(The card seems to be a variant of an Adaptec card, which DOES support 1+0. So, maybe that feature's been disabled on the HP version, or maybe it's supported and I just missed it in the docs.)
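The capacity trade-off between these RAID levels is easy to tabulate. The following sketch uses the standard capacity rules for each level with the six 250GB drives mentioned in this thread; the function name `usable_gb` is illustrative, and real arrays lose a little more space to metadata and formatting.

```python
def usable_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity of an array of identical drives, ignoring metadata overhead."""
    if level == "0":
        return drives * size_gb              # striping only, no redundancy
    if level in ("1", "1+0"):
        if drives % 2:
            raise ValueError("mirroring needs an even number of drives")
        return (drives // 2) * size_gb       # half the drives hold copies
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drives - 1) * size_gb        # one drive's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


for level in ("0", "1", "1+0", "5"):
    print(f"RAID {level:<3} with 6 x 250GB: {usable_gb(level, 6, 250)} GB usable")
```

For RAID 1 the total shown is the sum of three separate two-drive mirrors, which is the split described above.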

I'd wondered if adding more server RAM would help, either by giving the RAID controller more room to work (I don't know if it uses system RAM) or by reducing seeks through a bigger hard drive cache.

(I'd also wondered, if the server's CPU is actually doing the parity calculations, would a second CPU help? I don't expect this server to be doing a lot of multi-tasking, but if RAID takes up one CPU all to its own, that still might help.)

Thanks for the info that faster RPMs equated to faster sustained transfer rates. I'd always assumed that faster RPMs simply meant less rotational latency. (Which could be a factor in this case, since RAID writes involve reading a sector and then waiting a full rotation so that it can be written.)
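The rotational delays mentioned above follow directly from the spindle speed. This is a small sketch of the arithmetic for the two speeds discussed in this thread; the function names are illustrative.

```python
def rotation_ms(rpm: int) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000 / rpm


def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for a sector to come around: half a rotation."""
    return rotation_ms(rpm) / 2


for rpm in (7200, 10_000):
    print(
        f"{rpm:>6} RPM: full rotation {rotation_ms(rpm):.2f} ms, "
        f"average latency {avg_rotational_latency_ms(rpm):.2f} ms"
    )
```

If a read-modify-write really does wait one full rotation between reading a sector and rewriting it, that wait is one rotation per small write on each drive involved.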

In any case, thanks.
Karsten Breivik_1
Frequent Advisor

Re: Server Storage Performance

David makes a few good points. Listen to him.

Separate data from system files. At least consider placing the swap on a separate disk. I personally would not even bother using RAID for the system and swap files, because: RAID IS NOT BACKUP!!!

I have seen both software and hardware based RAIDS being wrecked beyond repair. This is important so I'll say it again: RAID IS NOT BACKUP!!!

Also pay attention to what he says about native command queueing. This is what really makes the SCSI disks stand out performance wise. However I did not know that some SATA disks also had this feature...

- k

Uwe Zessin
Honored Contributor

Re: Server Storage Performance

Of course RAID is no replacement for BACKUP - it is about continuous computing. And I would not be happy to learn that my server went down just because the page/swapfile was not protected.
Larry Precure
Occasional Advisor

Re: Server Storage Performance

Attempting to implement the suggestions. (The more I think about it, the more having RAID on a swap file sounds dumb.)

Current fun problem: The system appears unable to recognise a slave device on the (only) IDE controller. Whichever device is jumpered as master (or CS and plugged into the last connector) is recognised, and the slave (or CS in the middle) is not.

Since I need a CD drive in order to install the OS on the boot drive, this is making things tough.

I've noticed that there's a BIOS upgrade for this model, and I'm hoping this is a BIOS problem.
Larry Precure
Occasional Advisor

Re: Server Storage Performance

So much for "upgrading the BIOS may fix it".

I already HAVE the latest BIOS.

Why this system will not recognise 2 IDE devices I have no idea.

Kodjo Agbenu
Honored Contributor

Re: Server Storage Performance

Hi Larry,

First, you need to accurately identify whether the bottleneck is on the hardware or software (filesystem).

250GB is quite a large size, and getting good performance from it may require some tricky configuration.

=> Check the filesystem block size (generally: 4K)

=> Check the RAID block size (generally: 32K)

=> ...
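The filesystem side of these checks can be done from a script. The following is a sketch for a Unix-like system: `os.statvfs` is a standard library call, but the RAID block size has to be read from the controller's configuration utility, so the 32K value below is just the typical figure quoted above, and the helper names are illustrative.

```python
import os


def fs_block_size(path: str = "/") -> int:
    """Block size in bytes reported by the filesystem that holds path."""
    return os.statvfs(path).f_bsize


def fs_blocks_per_raid_block(fs_block: int, raid_block: int) -> int:
    """How many filesystem blocks fit in one RAID block, if they divide evenly."""
    if raid_block % fs_block:
        raise ValueError("RAID block size is not a multiple of the filesystem block size")
    return raid_block // fs_block


if __name__ == "__main__":
    block = fs_block_size("/")      # replace "/" with the array's mount point
    raid_block = 32 * 1024          # assumed: take the real value from the controller
    print(f"filesystem block size: {block} bytes")
    print(f"filesystem blocks per RAID block: {fs_blocks_per_raid_block(block, raid_block)}")
```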

Good luck.
Kodjo
Learn and explain...
Steven E. Protter
Exalted Contributor

Re: Server Storage Performance

Raid 5 can be configured on a disk array with different parity levels. I have never heard of anyone using parity above 9, nor do I know if it's possible.

Parity tells you how many copies of each block (data piece, whatever) are being kept. Hence my earlier statement.

I'm not sure what parity raid 5 software raid on Linux uses. It may depend on how many physical drives are in the LUN.

Your server has a disk preparation disk, which I imagine you used to prepare storage prior to installing the OS; it may even explain some of the intricacies.

A very interesting thread and discussion btw.

Good Luck, and let us know what you decide and how it turns out.

SEP