ProLiant Servers (ML,DL,SL)

Kimberly Joyner
Advisor

Conflicting reports of hard drive failure

When I log in to my server and run the HP System Management Homepage, it tells me there are no issues. It provides no further information and just says:

Overall System Status
no failed/degraded items
---------------------------------
HP System Management Homepage v2.1.2.127

When I run the HP Insight diagnostics it tells me one of my drives has failed.

Physical Hard Drive 1, Controller Slot 3
Passed
This drive IS functioning within the proper operating specifications and should NOT be replaced. If you feel you are experiencing problems in your storage system, refer to the HP Insight Diagnostics User Guide or the HP ProLiant Servers Troubleshooting Guide located on hp.com.

Physical Hard Drive 2, Controller Slot 3
Passed
This drive IS functioning within the proper operating specifications and should NOT be replaced. If you feel you are experiencing problems in your storage system, refer to the HP Insight Diagnostics User Guide or the HP ProLiant Servers Troubleshooting Guide located on hp.com.

Physical Hard Drive 3, Controller Slot 3
Failed Wed Feb 27 14:00:57 2008
640004: Controller has reported a critical performance threshold error on this drive
This drive has FAILED and should be replaced.
Perform the following steps before replacing a drive:

Have a known good data backup.
Replace the failed drive with one that is the same as, or functionally equivalent to, the original drive.
Do not replace the drive if there is another drive that is failed, offline, or in the process of being rebuilt.
On non-Hot Plug drives make sure the replacement drive ID matches the original drive ID.
For further replacement procedures or help with identifying drive status, see the HP Smart Array Controller User Guide for your controller. The User Guide is available from hp.com.
To identify the failed drive, click on the 'Flash LED' button below to flash the LED on this drive.

When I look at the three drives they all look fine, all are green and appear to be working.

I ran the HP Array Diagnostic Utility and it tells me there are no problems:

Array Diagnostic Utility Version 7.40.7.0
Array Diagnostic Utility Inspection Report Version 7.40.1.0

Date/Time: Wednesday, February 27, 2008 1:46:59PM
Computer Model: HP Server


SLOT SUMMARY:
Slot Num Slot Type Array Controllers and Host Adapters Detected
-------- --------- --------------------------------------------
SLOT 3 PCI Smart Array 641 Controller

SLOT 3 Smart Array 641 Controller ERROR REPORT:

No problems detected



Any idea which of these I should believe? Is there another method to check and see if the hard drive is really going bad or failed?

Many thanks.



4 REPLIES
Brian_Murdoch
Honored Contributor
Solution

Re: Conflicting reports of hard drive failure

Hi Kimberly,

ADU is usually the best tool for checking the status of the drives. Could you attach the ADU report file (report.txt) to a reply? I will run the ADU analyser on it to double-check.

If the report is clean, that would suggest an issue with Insight Diagnostics.

You could also try a newer version of ADU. Here is a link to the Windows version (7.85):
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=uk&prodTypeId=15351&prodSeriesId=397646&prodNameId=3279717&swEnvOID=1005&swLang=8&mode=2&taskId=135&swItem=MTX-4b253263b07b4ed799f8ea1010


What happens if you boot the SmartStart CD and run Insight Diagnostics in stand-alone mode (i.e. without the O/S loaded)?

Attach your ADU report (old and new if running the later ADU version) please.

Regards,

Brian
Kimberly Joyner
Advisor

Re: Conflicting reports of hard drive failure

I ran the CD version of the drive diagnostics and it also told me the drive was bad. I have now removed it and ordered a new one, then put it back in. Now the array is rebuilding with what it thinks is a new drive. The Windows version of Insight Diagnostics still says the drive is bad even though it is rebuilding.

I guess my next step is to wait for the array to complete the rebuild process and swap out the drive to see if the new replacement is shown as bad.

Is there a way to tell how long the rebuild is going to take?
Joshua Small_2
Valued Contributor

Re: Conflicting reports of hard drive failure

Look at the drive status on the System Management Homepage.

If it shows "predictive failure" there, it means the drive has reported a SMART error and WILL fail, but hasn't yet. This would account for the behaviour you're seeing: one tool reports the drive as failed while the others still see a working drive.

HP will replace drives with predictive failures.
Kimberly Joyner
Advisor

Re: Conflicting reports of hard drive failure

Now I appear to have an even bigger problem. After rebooting the server, the array is rebuilding the "bad" drive because I pulled it to get the p/n for the replacement (which I now have). The array started rebuilding the old drive but has been stuck at 8% since last night and does not appear to be making any progress. All three drives are flashing the amber X and the server does appear to be working.

What will happen if I yank the bad drive? Will I destroy the entire array?

My System Management Homepage does not work properly, I think because I can't figure out how to configure SNMP correctly, but the Array Configuration says:

#778 The current array controller is rebuilding logical drive 1 ( RAID 5 ).

Configuration changes to this logical drive or any other logical drive in array A are not allowed until rebuilding is complete. Also, configuration changes to any other array that is waiting for expansion or rebuild are not possible until this process completes. If unused space exists, additional logical drives can be created. Otherwise, most configuration changes are not allowed until this process is complete.

Thanks.
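
On the earlier question of how long the rebuild will take: nothing in this thread shows a tool that reports a time remaining, so one rough option is to note the percentage by hand at a couple of points and extrapolate. The Python sketch below does only that arithmetic. The function names and the two-hour stall window are hypothetical choices for illustration, the sample readings are made up, and it assumes the rebuild rate is roughly constant, which may not hold on a busy server.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# One hand-noted observation: (time it was read, percent complete shown).
Reading = Tuple[datetime, float]


def estimate_remaining(readings: List[Reading]) -> Optional[timedelta]:
    """Linearly extrapolate time left from the first and last readings.

    Returns None when there are fewer than two readings or no progress
    was made between them, since no rate can be computed in that case.
    """
    if len(readings) < 2:
        return None
    (t_first, p_first), (t_last, p_last) = readings[0], readings[-1]
    gained = p_last - p_first
    elapsed = t_last - t_first
    if gained <= 0 or elapsed <= timedelta(0):
        return None
    return elapsed * ((100.0 - p_last) / gained)


def is_stalled(readings: List[Reading],
               window: timedelta = timedelta(hours=2)) -> bool:
    """True if the percentage has not changed for at least `window`.

    The two-hour default is an arbitrary illustrative threshold.
    """
    if len(readings) < 2:
        return False
    t_last, p_last = readings[-1]
    unchanged_since = t_last
    # Walk backwards while the percentage is the same as the latest one.
    for t, p in reversed(readings):
        if p != p_last:
            break
        unchanged_since = t
    return t_last - unchanged_since >= window


if __name__ == "__main__":
    # Illustrative values only, not taken from a real controller.
    start = datetime(2008, 2, 27, 14, 0)
    progressing = [(start, 2.0), (start + timedelta(hours=2), 8.0)]
    print("Estimated time left:", estimate_remaining(progressing))

    stuck = [(start, 8.0), (start + timedelta(hours=12), 8.0)]
    print("Stalled:", is_stalled(stuck))
```

A rebuild that sits at the same percentage overnight, as described above, would come back as stalled with no estimate. That only means extrapolation cannot say anything; it does not diagnose why the rebuild stopped.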