Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

fschicker
Advisor

We are a small hosting company in Austria and use P400 controllers in about 20 HP servers (DL320(s), DL160, DL180).
Now that more of the servers have much more to do, we are seeing very strange performance problems when writing to a logical drive
on the P400 controllers, even when the server is idle.

Some tests we ran show the performance problems:

OS: Debian Linux; we also tried a 2.6.25 kernel, Ubuntu, CentOS and FreeBSD

Controller: P400 Rev 1 or 3, with BBWC and without, all with 256 MB RAM, latest FW or older (see the attached details from hpacucli controller details and hpaducli output)

Server: DL320 (FW upgraded to the latest version with Firmware Maintenance 8.10)

If we write to the disk you can see the problem:

time sh -c "dd if=/dev/zero of=/tmp/test bs=1024k count=4000; sync"
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 325.159 seconds, 12.9 MB/s
real 7m49.905s
user 0m0.004s
sys 0m12.121s
(As you can see, the sync needs another 2 minutes after dd has finished; afterwards the server load is about 5, before it was 0.01.)
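
To put numbers on the sync remark: dd's own figure only covers the time until dd returned, while the `real` time also includes the final sync. A minimal sketch that recomputes both rates from the output above (the helper name `mb_per_s` is made up for illustration; MB means 10^6 bytes, as in dd's output):

```shell
#!/bin/sh
# Hypothetical helper: throughput in decimal MB/s from a byte count and seconds.
mb_per_s() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Rate as reported by dd itself (time until dd returned).
mb_per_s 4194304000 325.159    # prints 12.9
# Rate over the whole run including the final sync (real 7m49.905s = 469.905 s).
mb_per_s 4194304000 469.905    # prints 8.9
```

Measured over the whole run, the sustained write rate is therefore closer to 9 MB/s than to the 12.9 MB/s dd printed.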

We tried almost everything to find the problem; here is a list of what we tried:

- updated p400 fw to latest version
- upgraded servers fw to latest version with fw-maintenance 8.10
- tried other Linux distributions
- tried other linux kernel (latest 2.6.25)
- other HP Server: we tried DL320s, DL320, DL160 and DL180
- tried other p400

Now we don't know what to do; we have no more ideas how to solve the problem :(

It would be very very very nice if anybody could help us with this issue!

Greetings from Austria (the country without kangaroos),

Florian Schicker
33 REPLIES
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

i could only attach 1 file, here is the second!
Hein van den Heuvel
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

>> with BBWC and without

I see no sign of BBWC in the attached listings, and see 3 signs it is not there:
1) RAID 6 (ADG) Status: Disabled
2) Accelerator Ratio: 100% Read / 0% Write
3) Battery Pack Count: 0

Now I know next to nothing about the P400, so maybe I'm not looking in the right place, but the reported performance is consistent with no BBWC.

For the Battery Backed Write Cache to work you need cache (present!) and... drumroll... a battery.
The controller does not seem to see a battery, so it does not do the write cache (0% write), and refuses to do ADG as that performance would be totally horrible without cache assist.

The system appears to be running bare-metal RAID 1+0, for which the I/O response time of each write is the worst case across the disks involved: both physical disk I/Os need to be complete before the OS is informed that the I/O is done. It seems to me that 13 MB/sec is a reasonable performance under those conditions.
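
The indicators above can be checked mechanically. A rough sketch, assuming the controller details use the same `Key: value` layout as the listings in this thread; `bbwc_active` is a hypothetical helper name, not an HP tool:

```shell
#!/bin/sh
# Hypothetical helper: reads controller details on stdin and reports whether
# a battery-backed write cache appears to be in use. Exit status 0 means yes.
bbwc_active() {
    awk -F': *' '
        /Battery Pack Count:/ { batteries = $2 + 0 }
        /Accelerator Ratio:/  { split($2, parts, "/"); write_pct = parts[2] + 0 }
        END {
            if (batteries > 0 && write_pct > 0) { print "write cache active"; exit 0 }
            print "no write cache"
            exit 1
        }'
}

# Sample input taken from the values quoted above; prints "no write cache".
printf '%s\n' 'Accelerator Ratio: 100% Read / 0% Write' 'Battery Pack Count: 0' \
    | bbwc_active || true
```

On a live system the input would presumably come from something like `hpacucli ctrl slot=2 show config detail | bbwc_active`, with the slot number adjusted to the machine.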

http://h18006.www1.hp.com/products/quickspecs/12400_div/12400_div.HTML

Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting


Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

I kind of solved this problem in my setup. I enabled the write cache via hpacucli:

=> ctrl slot=2 modify drivewritecache=enable

It gave quite a performance bump: from 8 MB/s to a semi-sustained rate of 124 MB/s.
Hein van den Heuvel
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

yabut... The "drive write cache" setting goes out to the mode page in the drive, enabling the unprotected per-drive (8 MB) write cache.

As per ACU Ref Manual:

"On controllers and drives that support physical drive write cache, you can use this command to enable or disable the write cache for all drives on the controller.

CAUTION: Because physical drive write cache is not battery-backed, you could lose data if a power failure occurs during a write process. To minimize this possibility, use a backup power supply"

btw... that same manual confirms the r/w cache ratio of 100/0 indeed indicates that there is no writeback cache:

"Changing the controller cache ratio.
... You can change the ratio only if the controller has a battery-backed cache (because only battery-backed cache can be used for write cache)"

Hein.
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Hein:
the files I attached are from a DL320, but we also have the problem on a DL320s with BBWC, the same setup that Ulrik has.

A write performance of 8 MB/s is surely not caused by the wrong RAID level :) We have many other systems with other controllers and we always get at least 40 MB/s when writing.

greets
Hein van den Heuvel
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx


And is that BBWC truly active?
Show me the money!

Did you try the "modify dwc=enable", just as a problem finding technique, or as a longer term solution if the risks are acceptable (UPS) for the application?

hth,
Hein.
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

yes it is!

Battery Pack Count: 1
Battery Status: OK

I don't want to try dwc on our DL320s because it is a live system. On our DL320, hpacucli doesn't let me set it:

=> controller slot=2 modify drivewritecache=enable

Warning: Without the proper safety precautions, use of write cache on physical
drives could cause data loss in the event of power failure. To ensure
data is properly protected, use redundant power supplies and
Uninterruptible Power Supplies. Also, if you have multiple storage
enclosures, all data should be mirrored across them. Use of this
feature is not recommended unless these precautions are followed.
Continue? (y/n) y


Error: This operation is not supported with the current configuration. Use the
"show" command on devices to show additional details about the
configuration.

- config details:

controller slot=2 show config detail

Smart Array P400 in Slot 2
Bus Interface: PCI
Slot: 2
Serial Number: PA5360J9STU1LA
Cache Serial Number: PA82C0D9STJ12S
RAID 6 (ADG) Status: Disabled
RAID 6 (ADG) Enabler Status: Enabled
Controller Status: OK
Chassis Slot:
Hardware Revision: Rev B
Firmware Version: 5.20
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 sec
Cache Board Present: True
Cache Status: OK
Accelerator Ratio: 100% Read / 0% Write
Total Cache Size: 256 MB
Battery Pack Count: 0
SATA NCQ Supported: False

Array: A
Interface Type: SATA
Unused Space: 0 MB
Status: OK

Logical Drive: 1
Size: 74.5 GB
Fault Tolerance: RAID 1+0
Heads: 255
Sectors Per Track: 32
Cylinders: 19146
Stripe Size: 128 KB
Status: OK
Array Accelerator: Enabled
Unique Identifier: 600508B100104A39535455314C410000
Disk Name: /dev/cciss/c0d0
Mount Points: / 70.7 GB, none 3.8 GB

physicaldrive 1:1
Port: 1I
Box: 1
Bay: 1
Status: OK
Drive Type: Data Drive
Interface Type: SATA
Size: 80.0 GB
Firmware Revision: HPF0
Serial Number: 5RW0RNZV
Model: ATA FB080C4080
SATA NCQ Capable: False
physicaldrive 1:2
Port: 1I
Box: 1
Bay: 2
Status: OK
Drive Type: Data Drive
Interface Type: SATA
Size: 80.0 GB
Firmware Revision: HPF0
Serial Number: 9RW08J3W
Model: ATA FB080C4080
SATA NCQ Capable: False
Hein van den Heuvel
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

>>yes it is!
>> Battery Pack Count: 1
>> Battery Status: OK

Ok. That looks good.

I assume you also checked the "Accelerator Ratio" for more than 0% Write


>> Error: This operation is not supported with the current configuration.

Sorry. Dunno.

Hein.
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

yes, we tried other values, no effect.
John Kufrovich
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Florian,

I'll agree with Hein, there is no battery.

Your last output shows that.

controller slot=2 show config detail

Smart Array P400 in Slot 2
Bus Interface: PCI
Slot: 2
Serial Number: PA5360J9STU1LA
Cache Serial Number: PA82C0D9STJ12S
RAID 6 (ADG) Status: Disabled
RAID 6 (ADG) Enabler Status: Enabled
Controller Status: OK
Chassis Slot:
Hardware Revision: Rev B
Firmware Version: 5.20
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 sec
Cache Board Present: True
Cache Status: OK
Accelerator Ratio: 100% Read / 0% Write
Total Cache Size: 256 MB
Battery Pack Count: 0
SATA NCQ Supported: False



Battery Pack Count: 0

jk
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

John, please read my post carefully ;)

I have one DL320s WITH BBWC and one WITHOUT BBWC, and both have the same problem (we have about 30 HP servers).
The output I pasted is from the one without BBWC, but the speed is just as bad with BBWC.

(I hope it's clear now)
John Kufrovich
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Florian,

Then why attach two outputs, acucli and adu, for the same system without BBWC? My concern would have been the system with BBWC and its slow performance.

Could you supply reports from the server having poor performance and BBWC?

Let's see if that block size is too large for those SATA drives. Instead of 1M blocks, can you try something smaller, 128k or 256k?

Have you looked into the default elevator queues for those OSes?
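
The block-size comparison suggested above could be scripted roughly as follows. This is only a sketch: `sweep_block_sizes` is a made-up name, the directory and the total size are placeholder values, and `conv=fsync` assumes GNU dd.

```shell
#!/bin/sh
# Hypothetical sweep: write the same amount of data with several block sizes.
# Arguments: target directory, total size in KB. Both are illustration values;
# for a real test, point the directory at the P400 volume and use a size well
# above the controller cache.
sweep_block_sizes() {
    dir="$1"
    total_kb="$2"
    for bs_kb in 128 256 512 1024; do
        count=$((total_kb / bs_kb))
        file="$dir/ddtest.$$.$bs_kb"
        printf 'bs=%sk: ' "$bs_kb"
        # conv=fsync asks dd to flush the file before it reports, so the
        # printed rate should include the time needed to reach the disk.
        dd if=/dev/zero of="$file" bs="${bs_kb}k" count="$count" conv=fsync 2>&1 | tail -n 1
        rm -f "$file"
    done
}

sweep_block_sizes "${TMPDIR:-/tmp}" 4096
```

Each output line pairs a block size with dd's summary line, so a block size that the drives handle badly should stand out in the rate column.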

jk

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Have you tried installing Windows Server on one of the servers, just to see whether it is a driver or a hardware issue?
fschicker
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

After more than a month of hunting for the fault with HP tech support (yes, we got 2nd-level support... but the person there didn't even know the hpacucli commands...) we found the problem ourselves.

Here is what we did:
- we opened a new support case at HP claiming that our P400 controller was defective, in the hope of getting a newer hardware revision
- 2 days later the new controller arrived, and yes, with a newer HW revision!
- swapped the controller in one DL320s
- tested again, and got much better performance:

sync; time sh -c "dd if=/dev/zero of=/mnt/test2 bs=512k count=4000; sync"
4000+0 records in
4000+0 records out
2097152000 bytes (2.1 GB) copied, 40.734 seconds, 51.5 MB/s

real 0m48.770s
user 0m0.016s
sys 0m6.072s

(Before, we got 8 MB/s max.)

So, HP, please believe us NOW that this is not an issue of RAID 6 or a "normal" problem of SATA disks. No, it's your crappy P400 hardware revision that has major problems...

Now I will try to explain this issue to the HP support guys (oh no, that will be a horror trip...).

Thanks to everyone who tried to help. I think you will have to change your controllers as well.

Greets from Austria,

Florian
Hein van den Heuvel
Honored Contributor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Thanks for the update! Much appreciated.

I hope we at least helped to convince you that you are doing the right tests and looking at the right settings and thus to pursue the real cause (that HW revision).

Cheers,
Hein.
Chirs Werfel
Occasional Visitor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

As a follow-up, I had a similar problem with an AIO1200 storage server (DL320s with P400) running Windows Storage Server. The hardware revision on the card is newer than what you posted, I think, but the symptom was UNUSABLE write performance without enabling the physical disk write cache. I am posting my summary below in case anyone else experiences this problem. I am unable to go to firmware 5.20 due to a known incompatibility issue with the AIO ASM software.

1. AIO1200 w/ASM 1.4, P400 controller w/fw 4.12, SATA drives: exhibits slow write performance and widely varying estimated transfer times during large file copies in Windows. Perfmon counters for Physical Disk Current Queue Length and Network Adapter Bytes/Sec show that the disk queue rapidly rises to a fixed number (21 on a three-disk RAID 5 volume) while the network peaks at 300-500 Mbps and then drops off to zero. Once the disk queue drops to 0, network activity jumps back up. This cycle continues until the transfer is completed.
2. Turning on the P400 Advanced Setting, Physical Disk Write Cache, improves file write times by 100%. Network activity stays consistent even though the disk queue length pegs at 21.
3. Rebooting after changing the write cache setting further improves file write times by another 100% (dropping total transfer time to one fourth of the original time). Presumably the OS is then aware of this setting and optimizes writes further.
4. As per the documentation, Physical Drive Write Cache should only be enabled when the host server has an uninterruptible power supply.

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Hi Florian,

Just a quick note to add my experiences and maybe help explain what happened in both yours and my case.

I did extensive work on a customer machine, an ML350 G5 with E200 controller and SATA disks. Write performance (and disk I/O in general) was terrible and would sometimes crash the server (Windows 2003). After updating the system BIOS and E200 firmware many times, nothing changed.

The problem was that the combination of E200 driver and firmware would not allow the disks' hardware cache to be enabled on SATA drives - the driver would always disable the cache after I tried to enable it.

Finally HP released a new version of the E200 firmware which allowed SATA drive cache to be enabled (with warnings about UPS etc.) and the problem went away for good - the server is now fast and very reliable.

This issue sounds to me the same as the one you had with the P400. I assume the new controller shipped with the updated firmware and therefore allowed the drive cache to be enabled.

Glad you got it working anyway. I'm looking to buy a DL360 with P400 controller + SATA disks so I hope this is now fixed on new machines!

Cheers.

George
Ed Roche
Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

We have 2 DL380 G5s with this P400 and see the same strange disk behavior even when the server is unused. On one of them we had weird performance issues where the network share would hang for 7 to 10 seconds every so often during the day. After a final shot at solving the year-old problem, it seems to have been fixed by one of two things: either upgrading the cache from the standard 128 MB to 512 MB plus battery (cache ratio 50/50), or upgrading Veritas Backup Exec from 10d to 11d. After a lot of research and speaking to a ProLiant consultant, it seems that the P400 is a poor piece of kit.
Ste B
Occasional Visitor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Hi George,

I was wondering how you were able to determine that the drives' cache has been switched off? I can't get hdparm to look at the drives at all.

I've managed to switch on the controllers' write cache (dwc=enabled) on my card - and now my speed has been lifted to the heady heights of 14Mb/sec!

Thanks,

Ste

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Oh crap...

We bought another server (DL180 G5) with SATA and P400 (512 MB BBWC) and we have the same problem again.

The previous server, the DL320s, is still running and performing, though we still need to have the drives' write cache enabled. This trick doesn't work on the DL180, and this time I don't have time to go through the same troubleshooting again. This time HP will have to fix it for me.
Chirs Werfel
Occasional Visitor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Ulrik,

I have noticed that I have had to reboot after changing the cache setting AND it often took several hours or a day for write performance to improve. I think this is based on how the controller performs parity and optimization operations.

Chris
ldoodle
Frequent Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

Just got one of these cards WITH 128MB BBWC, physical write-cache enabled and accelerator enabled for both arrays.

Interestingly though, Windows Server 2008 won't let me turn on its own write cache: I tick the box and apply, then go back in and it's unticked!!!

Anyway, I have 2 80GB SATA drives in RAID 1 for the OS, and 4 250GB SATA drives in RAID 5 for data. I also have a 500GB USB drive plugged in for backups.

When writing to the RAID 5 array, write speeds are ~20 MB/s, but when copying from the RAID 5 array to the USB drive, speeds peak at about 170 MB/s, 8.5 times faster!!

What's wrong with the write speeds on this card?
PhilipHarvey
Occasional Advisor

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

We have six HP DL380 G5 servers with P400 controllers with 256MB of non-battery backed cache. Each server has 2x SAS disks in RAID1 and 4x SAS disks in RAID5.

The read and write performance on RAID1 is good: about 30MB/sec write and about 35MB/sec read.
The read performance from RAID5 is OK but lower than you would expect, about 25MB/sec.
The write performance for RAID5 is crap, totally unusable, worse than consumer grade junk: 10MB/sec.

I am going to test if the 5.22 firmware makes any difference, and also test if installing and enabling a battery backed controller helps and will post back the results.
It looks like a firmware/hardware fault, and if firmware doesn't fix this then I will be sending them back to HP.

Re: Horrible Write Performance with Smartarray P400 in DL320, DL320s and DL1xx

@Ste B

Hi Ste,

As I remember, the driver logged an entry in the Windows Application log (or was that the System log? probably...) saying that the change I had made was being disabled by the driver.

I just had a look on the server now, and the differences between the earlier and later versions of driver and firmware are very apparent from within the ACU. I'm running ACU v8.0.14.0; if I click on the E200 controller, then Advanced Settings (read: ESSENTIAL SETTINGS!), there's an option labelled Physical Drive Write Cache Settings from which you can enable the disks' caches.

Once I had done that, the machine worked as it should have out of the box. Never mind it nearly lost me a customer and all that....

Good luck.

George