ProLiant Servers (ML,DL,SL)

T. Hiukkanen
Advisor

P400 recurring random disk failure

I have a new 12-bay DL185 G5 loaded with 6 x 500 GB SATA drives (GB0500EAFYL), connected to a P400 (512 MB BBWC).

The drives are configured as a single RAID 1+0 array (also tried two RAID5 arrays).

The problem is that one or several of the drives "fail" at random intervals. Mostly the array reports the drive in question as failed, sometimes as removed (Hot-plug drive removed, port=1I Box=1 Bay=n). Putting load on the disks (with bonnie++) seems to have no effect on the problem (the disks fail even on an idle system).

Mostly only one drive is reported as failed, sometimes several drives at once. If left to run without rebooting, eventually the server becomes unreachable due to multiple disk failure. (Recently, there's also been a persistent warning on drive 1 predictive failure.)
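
For anyone trying to see whether the failures cluster on particular bays, here is a minimal Python sketch that tallies the "Hot-plug drive removed" events per bay. The message pattern is taken from the text quoted above; where those messages end up (syslog, IML export, etc.) and the function name `count_removals` are assumptions, so adjust to your own logs.

```python
import re
from collections import Counter

# Pattern modelled on the message quoted in the post:
#   "Hot-plug drive removed, port=1I Box=1 Bay=n"
# The exact spacing and case in a real log may differ (assumption).
EVENT_RE = re.compile(
    r"Hot-plug drive removed, port=(?P<port>\w+) Box=(?P<box>\d+) Bay=(?P<bay>\d+)",
    re.IGNORECASE,
)


def count_removals(log_lines):
    """Count 'drive removed' events per (port, box, bay) tuple."""
    counts = Counter()
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match:
            key = (match.group("port"), int(match.group("box")), int(match.group("bay")))
            counts[key] += 1
    return counts
```

Feed it the lines of whatever log file holds the controller events, e.g. `count_removals(open(path))`, and compare the counts across bays.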

After a reboot, the controller BIOS reports the drive(s) as working again and starts rebuilding the array in the background. Mostly the rebuild finishes OK, but sometimes the percentage gets stuck at a number and never completes. After a second reboot, the rebuild starts over and (mostly) finishes.

The server, including the controller and the drives, has the latest firmware installed. The drives report NCQ capable & enabled = true.
The problem has occurred with different stripe size, RAID level and drive cache (on/off) settings.

HP has previously replaced the mainboard (which fixed a separate iLO problem), the P400 controller and the SATA backplane board.

I'll attach a settings dump (hpacucli ctrl all show config detail).

I'd really appreciate any information on similar issues.

Thanks, again, for your time.
Roberto Colmegna
New Member

Re: P400 recurring random disk failure

I'm in a very similar situation: a DL180 G5 with a P400 (512 MB BBWC). I have a 3-disk RAID 5 for my Exchange databases. Last month it failed two of the three drives. I restarted and they came back up fine and rebuilt, but they failed again soon after I tried to restore the database. This happened shortly after updating my P400 to firmware 6.86. I replaced my non-HP drives with HP-branded drives, thinking that perhaps HP was punishing me. It worked fine for a month and a half and now it's doing the same thing. It failed 2 of the 3 drives in my database RAID 5. I let it rebuild, and now it has failed the third drive of my 3-disk RAID 5 and is rebuilding using a hot spare. This is unacceptable!! I'm currently on the phone hoping to get some help from HP, but if anyone knows what is causing this or has a fix, PLEASE post it! Thanks!!
T. Hiukkanen
Advisor

Re: P400 recurring random disk failure

Good luck. My drives, controller and, come to think of it, every other part have been replaced again and again.

Two of my drives are now GB0500EAFJH (HPG6), which have not failed. Four GB0500EAFYL (HPG1) remain, which fail under any OS (RHEL, ESXi, OpenSUSE).

You can see from the age of the thread that this has not exactly been a model case for HP support either.

If you can see your drive model number with 'hpacucli ctrl all show config detail', I'd appreciate any info on the drives you're using.
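
To make comparing drive models and firmware easier, here is a rough Python sketch that pulls the model, firmware revision and status of each physical drive out of saved 'hpacucli ctrl all show config detail' output. It assumes the usual layout of that output (a "physicaldrive <id>" line followed by indented "Key: value" lines such as "Model:" and "Firmware Revision:"); the function name `parse_drives` and the section-reset rule are illustrative, so check the result against your own dump.

```python
import re

# Fields to keep for each drive (names as they are assumed to appear in the dump).
WANTED = {"Model", "Firmware Revision", "Status"}

# Lines that are assumed to start a non-drive section and end a drive block.
SECTION_RE = re.compile(r"^(Smart Array|array |logicaldrive )")


def parse_drives(text):
    """Return one dict per physical drive found in the hpacucli output."""
    drives = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"physicaldrive\s+(\S+)", line)
        if match:
            drive_id = match.group(1)
            # Reuse the entry if the same drive is listed more than once.
            current = drives.setdefault(drive_id, {"id": drive_id})
            continue
        if SECTION_RE.match(line):
            current = None
            continue
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            if key in WANTED:
                # Keep the first value seen and collapse runs of spaces.
                current.setdefault(key, " ".join(value.split()))
    return list(drives.values())
```

Save the dump to a file, pass its contents to `parse_drives`, and print the resulting dicts to get a one-line-per-drive summary you can paste into the thread.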
TarunJain
Respected Contributor

Re: P400 recurring random disk failure

Hi,
Maybe this issue is resolved in firmware version 6.86 (16 Jul 2009) for Smart Array P400.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=1157689&prodTypeId=329290&prodSeriesId=1157687&swLang=8&taskId=135&swEnvOID=1005

Has anyone tried it?

I am an HPE Employee


T. Hiukkanen
Advisor

Re: P400 recurring random disk failure

Thanks, but no luck there either. The same problem persists after going through 6.86 (B), 5.26, 5.26(B) and 5.26(C).
marcus1234
Honored Contributor

Re: P400 recurring random disk failure

Update the system board BIOS, then the disk firmware too, as per these links:

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00305257&prodTypeId=15351&prodSeriesId=397642


and
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c01612429

Good luck.

By the way, I have seen many issues with 6-series firmware on the P400. Update the disk firmware and use 5-series firmware for the P400 controller.


This is how we thank in the forum

http://forums11.itrc.hp.com/service/forums/helptips.do?#33
marcus1234
Honored Contributor

Re: P400 recurring random disk failure


A special note on this firmware (it is in this link): HPG2, 18 Nov 2009.

As mentioned under its fixes, this revision is to be used with 5-series firmware on the controller. Do not use it while you have 6-series firmware on the controller, or it will cause more headaches.

