
 
SOLVED
Rancher
Honored Contributor

Excessive Failed Hard Drives

We have two ML370 G5 servers, each with four MSA 60 storage shelves, running Server 2008 with the HP firmware and software for Insight Manager support. These servers have only been online since the first of December, and we have already had 5 failed hard drives. It is true that we are running a lot of drives, but this seems excessive to me. Any thoughts? Thanks!
35 REPLIES
Michal Kapalka (mikap)
Honored Contributor

Re: Excessive Failed Hard Drives

Hi,

I think that with so many drives, the number of failed drives is not that large. The question is whether the failed drives were all in one shelf, or whether these were random disk failures.

mikap
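
A rough way to put numbers on "a lot of drives" versus "excessive" is to compare the 5 failures with what a constant failure rate would predict. The sketch below is only an illustration: the thread does not give the drive count, the drive model's failure rate, or the exact time in service, so DRIVES, AFR and MONTHS are all assumed values (96 drives assumes every 12-bay MSA 60 shelf is fully populated and ignores internal drives).

```python
import math

def expected_failures(drive_count, annual_failure_rate, months):
    """Expected number of failures over the period, assuming a constant failure rate."""
    return drive_count * annual_failure_rate * months / 12.0

def prob_at_least(k, mean):
    """P(X >= k) for a Poisson-distributed failure count with the given mean."""
    p_below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - p_below

# All three values are assumptions for illustration, not figures from the thread.
DRIVES = 96   # assumed: 2 servers x 4 shelves x 12 bays, fully populated
AFR = 0.03    # assumed annual failure rate of 3% per drive
MONTHS = 3    # assumed time in service

mean = expected_failures(DRIVES, AFR, MONTHS)
print(f"expected failures in {MONTHS} months: {mean:.2f}")
print(f"chance of 5 or more failures: {prob_at_least(5, mean):.4f}")
```

Under these assumed numbers the expected count is below one, so 5 failures would be well outside what a steady failure rate predicts. A higher assumed failure rate or a longer period changes the picture, so plug in the real drive count and dates before drawing a conclusion.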
Roy Main
Valued Contributor

Re: Excessive Failed Hard Drives

It does sound excessive. You'll want to troubleshoot a little more the next time you have a failure. Start collecting ADU reports. Run the drive diagnosis in Insight Diagnostics before you replace the next one and see if it agrees with the failure. Also take a look at the environment, specifically heat, and make sure you are operating within the correct temperature range.
Rancher
Honored Contributor

Re: Excessive Failed Hard Drives

They are random disk failures. As a matter of fact, I just got an alert this morning that another drive is in the predictive failure mode. I thought maybe it might just be a problem with the HP management agents reporting false failures. This time I ran diagnostics and sure enough, it also showed the drive as failing.
cnb
Honored Contributor

Re: Excessive Failed Hard Drives

Can you run ADU and review or attach the report here?


Rgds,
Rancher
Honored Contributor

Re: Excessive Failed Hard Drives

Last night I updated the controller firmware as cnb advised me to do in a different thread.

The last time I ran ADU, it did fail. I have a screenshot but I don't know how to attach it.
Rancher
Honored Contributor

Re: Excessive Failed Hard Drives

I just had another failed hard drive. Here is what Diagnostics reported:

Failed
Error: 640001: Controller has reported a SMART error on this drive
Error: 640006: The Read and/or Write HARD error rate is above threshold
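
If diagnostics output is being saved for each failed drive, the error codes can be pulled out and tallied to see whether the same codes keep recurring. This is a minimal sketch that assumes every error appears as `Error: <code>: <message>`, which is inferred only from the two lines quoted above; the real report layout may differ, and `parse_errors` is a hypothetical helper, not part of any HP tool.

```python
import re
from collections import Counter

# Pattern inferred from the two lines quoted in this post (an assumption).
ERROR_LINE = re.compile(r"Error:\s*(\d+):\s*(.+)")

def parse_errors(report_text):
    """Return (code, message) pairs for every 'Error: <code>: <message>' found."""
    return [(m.group(1), m.group(2).strip()) for m in ERROR_LINE.finditer(report_text)]

# The two error lines reported in this post.
sample = (
    "Error: 640001: Controller has reported a SMART error on this drive\n"
    "Error: 640006: The Read and/or Write HARD error rate is above threshold\n"
)

errors = parse_errors(sample)
counts = Counter(code for code, _ in errors)  # occurrences per error code

for code, message in errors:
    print(code, "-", message)
```

Feeding in the saved text from several failed drives and looking at `counts` would show whether it is always the same SMART condition being tripped.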
wobbe
Respected Contributor

Re: Excessive Failed Hard Drives

I guess most hard disk failures take place during the first few months, so I would expect failure rates to drop.
cnb
Honored Contributor

Re: Excessive Failed Hard Drives

Don't forget to upgrade the disk firmware. Some predictive failure issues were resolved with certain disk drive firmware updates.

Rgds,
cnb
Honored Contributor

Re: Excessive Failed Hard Drives

>The last time I ran ADU, it did fail. I have a screenshot but I don't know how to attach it.

To attach files:
https://forums13.itrc.hp.com/service/forums/helptips.do?#15


Rgds,