
HP ML350 Gen9 > HDD Degraded (Predictive failure)

 
SOLVED
juergen-account
Occasional Advisor


Hello,

Is the following the best procedure for the incident described above?

Problem: iLO 4 reports:
Degraded (Predictive failure)
Physical Drive in Port 1I Box 4 Bay 3
The array is RAID 5 (I think without a spare), and I think all drives are online. A backup to the NAS has been done. The server runs VMware ESXi.

a)
Hot-swap the drive for a new HDD (e.g. the same HDD P/N as before) and wait for the rebuild to report complete. No need to shut down the whole server.

b)
Cache Module Status = Failed << Replacing it requires a server shutdown. I see no particular advantage in, or reason for, doing this before the HDD replacement in a).

 

++++

Details:

57   POST Message   11/01/2022 15:13   [NOT SET]   1   Option ROM POST Error: Errors will be corrected when the sector(s) are overwritten. Action: Backup and Restore recommended.

56   POST Message   11/01/2022 15:13   [NOT SET]   1   Option ROM POST Error: 1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface scan.

55   POST Message   11/01/2022 15:13   [NOT SET]   1   Option ROM POST Error: Ensure all other drives in the array are online! Back up data before replacing drive(s) if using RAID 0. Action: Replace drive.

54   POST Message   11/01/2022 15:13   [NOT SET]   1   Option ROM POST Error: Port: 1I, box:4, bay: 3 (SAS)

53   POST Message   11/01/2022 15:13   11/01/2022 15:13   1   Option ROM POST Error: 1720-Slot 0 Drive Array - S.M.A.R.T. Hard Drive(s) imminent failure:

52   Drive Array   11/01/2022 16:33   11/01/2022 11:56   2   Internal Storage Enclosure Device Failure (Bay 3, Box 4, Port 1I, Slot 0)
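
For anyone who wants to pull the numeric POST codes out of a log export like this one, here is a minimal Python sketch. It is only an illustration: the sample text is copied from the entries in this post, while the pattern "four digits, then -Slot n" and the function name `extract_post_codes` are assumptions made for this example and not an HPE-defined format or tool.

```python
import re

# Sample text copied from the log entries in this post.
LOG = """\
56 POST Message 11/01/2022 15:13 [NOT SET] 1 Option ROM POST Error: 1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface scan.
54 POST Message 11/01/2022 15:13 [NOT SET] 1 Option ROM POST Error: Port: 1I, box:4, bay: 3 (SAS)
53 POST Message 11/01/2022 15:13 11/01/2022 15:13 1 Option ROM POST Error: 1720-Slot 0 Drive Array - S.M.A.R.T. Hard Drive(s) imminent failure:
"""

# Assumption: a POST error code looks like "<4 digits>-Slot <n> Drive Array - <text>".
CODE_RE = re.compile(r"POST Error: (\d{4})-Slot (\d+) Drive Array - (.+)")


def extract_post_codes(text):
    """Return a list of (code, slot, message) tuples found in the log text."""
    found = []
    for line in text.splitlines():
        match = CODE_RE.search(line)
        if match:
            code, slot, message = match.groups()
            # Drop trailing ':' or '.' left over from the log line.
            found.append((int(code), int(slot), message.strip().rstrip(":.")))
    return found


for code, slot, message in extract_post_codes(LOG):
    print(f"{code} (slot {slot}): {message}")
```

Lines without a numeric code (such as the port/box/bay line) are simply skipped, so the result lists only the 1716 and 1720 entries.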

 

 

 

+++++

 

Controller

Controller Status: OK
Serial Number: (not shown)
Model: Smart Array P440ar Controller
Firmware Version: 6.60
Controller Type: HPE Smart Array
Cache Module Status: Failed
Cache Module Serial Number: (not shown)
Cache Module Memory: 2097152 KB
Encryption Status: Not Enabled
Encryption ASIC Status: OK
Encryption Critical Security Parameter NVRAM Status: OK

 

 

+++

Logical Drive 01

Status: Degraded
Capacity: 3353 GiB
Fault Tolerance: RAID 5
Logical Drive Type: Data LUN
Encryption Status: Not Encrypted

Physical Drive in Port 1I Box 4 Bay 3

Status: Degraded (Predictive failure)
Serial Number: (not shown)
Model: EG000600JWEBH
Media Type: HDD
Capacity: 600 GB
Location: Port 1I Box 4 Bay 3
Firmware Version: HPD4
Drive Configuration: Configured
Encryption Status: Not Encrypted
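
As a rough cross-check of the array layout, the number of member drives can be estimated from the two capacities reported above. This is only a sketch under stated assumptions: all members are 600 GB drives like the one shown, the logical drive fills the whole array, and the helper name `raid5_member_count` is made up for this example. It cannot tell whether a spare is configured.

```python
GIB = 2**30  # bytes per GiB (binary unit, used for the logical drive capacity)
GB = 10**9   # bytes per GB (decimal unit, used for the physical drive capacity)


def raid5_member_count(logical_gib, drive_gb):
    """Estimate the number of drives in a RAID 5 array.

    RAID 5 uses one drive's worth of capacity for parity, so the usable
    capacity is (n - 1) * drive size. Assumes equal-size drives and a
    logical drive that fills the whole array.
    """
    data_drives = round(logical_gib * GIB / (drive_gb * GB))
    return data_drives + 1


print(raid5_member_count(3353, 600))
```

With the values from this post (3353 GiB logical, 600 GB per drive) the estimate is 7 drives: six drives' worth of data plus one of parity. A RAID 5 array of that size tolerates exactly one drive failure, which is why a second media problem during the rebuild matters.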
Tam92
HPE Pro
Solution

Re: HP ML350 Gen9 > HDD Degraded (Predictive failure)

Hello,

 

Since there is a 1716 POST error, an HPE advisory applies to this issue.

The error is raised when unrecoverable media errors are detected on the drives. The drive in the predictive failure state definitely needs to be replaced.

In addition, the unrecoverable media errors on the other drives in the array require extensive analysis. Once the faulty drives have been replaced, follow the advisory below.

https://support.hpe.com/hpesc/public/docDisplay?docId=a00104843en_us&docLocale=en_US

The array needs to be deleted and recreated.

As for the failed cache module status: yes, the replacement procedure you mentioned is correct.

I recommend logging a case with HPE to have the storage subsystem checked and corrective action taken.

 

Thanks,

TAM

 

 



I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
juergen-account
Occasional Advisor

Re: HP ML350 Gen9 > HDD Degraded (Predictive failure)

Report:

a) Server shut down
b) Cache module battery swapped
c) Server started
d) Hot-swapped the HDD that showed "predictive failure"
e) Rebuild started automatically
f) Cache module still shows "Failed", but no further reboot has been done yet.
At the moment it is OK; all drives are healthy.