ProLiant Servers (ML,DL,SL)

RAID5 disk with Degraded Predictive failure() and Unrecoverable Media Errors Detected on Drives

 
ApolloPT
Occasional Advisor


Hello everyone,

I have an HP ProLiant DL380p with a 4-disk RAID5 array that hosts 2 logical drives. Yesterday I noticed that one disk had the following status: Degraded Predictive failure()

After that, I checked the Integrated Management Log and found several other Caution entries. I don't know whether they are related to this disk, since the slot number is different. Because the disk is part of a RAID5 array, I also don't know whether the disk that failed the SMART test is already affecting the other disks:

"id","Severity","Class","Last Update","Initial Update","Count","Description",
"111"," Caution","POST Message","08/17/2019 13:23","08/17/2019 13:23","1","POST Error: 1720-Slot X Drive Array - S.M.A.R.T. Hard Drive(s) Detect imminent failure",
"110"," Caution","POST Message","08/17/2019 13:23","08/17/2019 13:23","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"109"," Caution","POST Message","08/17/2019 12:55","08/17/2019 12:46","2","POST Error: 1720-Slot X Drive Array - S.M.A.R.T. Hard Drive(s) Detect imminent failure",
"108"," Caution","POST Message","08/17/2019 12:55","08/17/2019 12:46","2","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"107"," Caution","POST Message","08/16/2019 17:52","08/16/2019 17:08","4","POST Error: 1720-Slot X Drive Array - S.M.A.R.T. Hard Drive(s) Detect imminent failure",
"106"," Caution","POST Message","08/16/2019 17:52","08/16/2019 17:08","4","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"105"," Caution","Drive Array","08/17/2019 13:24","07/28/2019 05:32","16","Internal Storage Enclosure Device Failure (Bay 1, Box 2, Port 1I, Slot 0)",
"104"," Caution","POST Message","07/23/2019 18:40","07/23/2019 18:40","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"103"," Caution","POST Message","07/23/2019 10:40","07/23/2019 10:29","2","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"102"," Caution","POST Message","07/12/2019 13:23","07/12/2019 13:23","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"101"," Repaired","Network","06/26/2019 13:13","06/26/2019 13:11","1","Network Adapter Link Down (Slot 0, Port 2)",
"100"," Repaired","Network","06/26/2019 13:13","06/26/2019 13:11","1","Network Adapter Link Down (Slot 0, Port 1)",
"99"," Repaired","Network","06/26/2019 12:15","06/26/2019 12:13","1","Network Adapter Link Down (Slot 0, Port 2)",
"98"," Repaired","Network","06/26/2019 12:15","06/26/2019 12:13","1","Network Adapter Link Down (Slot 0, Port 1)",
"97"," Repaired","Network","06/26/2019 13:14","06/25/2019 18:13","3","Network Adapters Redundancy Reduced (Slot 0, Port 3)",
"96"," Repaired","Network","06/25/2019 18:15","06/25/2019 18:13","1","Network Adapter Link Down (Slot 0, Port 2)",
"95"," Repaired","Network","06/26/2019 13:14","06/25/2019 18:11","3","Network Adapters Redundancy Reduced (Slot 0, Port 2)",
"94"," Repaired","Network","06/25/2019 18:13","06/25/2019 18:09","1","Network Adapter Link Down (Slot 0, Port 1)",
"93"," Caution","POST Message","06/03/2019 19:45","06/03/2019 19:45","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"92"," Caution","POST Message","05/31/2019 13:13","05/31/2019 13:13","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"91"," Caution","POST Message","04/30/2019 13:16","04/30/2019 13:09","2","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"90"," Caution","POST Message","03/20/2019 13:55","03/20/2019 13:55","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"89"," Repaired","Network","03/19/2019 19:07","03/19/2019 19:05","1","Network Adapters Redundancy Reduced (Slot 0, Port 2)",
"88"," Repaired","Network","03/19/2019 19:07","03/19/2019 19:05","1","Network Adapters Redundancy Reduced (Slot 0, Port 3)",
"87"," Repaired","Network","03/19/2019 19:05","03/19/2019 19:04","1","Network Adapter Link Down (Slot 0, Port 2)",
"86"," Repaired","Network","03/19/2019 19:05","03/19/2019 19:04","1","Network Adapter Link Down (Slot 0, Port 1)",
"85"," Caution","POST Message","03/13/2019 11:37","03/13/2019 11:37","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"83"," Caution","POST Message","03/09/2018 16:27","03/09/2018 16:27","1","POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array.",

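A quick way to see how often each event recurs is to summarize the log export with a short script. The following is a minimal Python sketch, assuming the log is available as CSV text in the format shown above. The function name `summarize` and the three-row `SAMPLE` excerpt are illustrative only and are not part of any HPE tool.

```python
import csv
import io
import re
from collections import defaultdict

# Three rows copied from the log above; in practice, read the full export from a file.
SAMPLE = '''"id","Severity","Class","Last Update","Initial Update","Count","Description",
"111"," Caution","POST Message","08/17/2019 13:23","08/17/2019 13:23","1","POST Error: 1720-Slot X Drive Array - S.M.A.R.T. Hard Drive(s) Detect imminent failure",
"110"," Caution","POST Message","08/17/2019 13:23","08/17/2019 13:23","1","POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.",
"105"," Caution","Drive Array","08/17/2019 13:24","07/28/2019 05:32","16","Internal Storage Enclosure Device Failure (Bay 1, Box 2, Port 1I, Slot 0)",
'''


def summarize(text):
    """Group storage-related Caution entries by event and sum their counts."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        # The export pads the severity with a leading space, so strip it.
        if row["Severity"].strip() != "Caution":
            continue
        if row["Class"] not in ("POST Message", "Drive Array"):
            continue
        # Use the POST error number as the key when there is one,
        # otherwise fall back to the full description.
        match = re.match(r"POST Error: (\d+)", row["Description"])
        key = "POST " + match.group(1) if match else row["Description"]
        totals[key] += int(row["Count"])
    return dict(totals)


if __name__ == "__main__":
    for event, count in summarize(SAMPLE).items():
        print(count, event)
```

Grouping by POST error number separates the SMART warnings (1720) from the media error warnings (1716), which makes it easier to see when each one first appeared.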
Regarding the disk with the predictive failure: since the controller supports hot swap, I think I just need to pull that disk and insert an identical disk of the same model, which should start the RAID5 rebuild, whether during production hours or not. If it is less risky to do this with the server offline, I can do it that way.

In summary, can anyone help with these two questions?

Thanks in advance!

4 REPLIES
ApolloPT
Occasional Advisor


It seems that this warning,

"Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten."

occurs only when I reboot this server.

SanjeevGoyal
HPE Pro


Hello

Thank you for contacting us.

I recommend replacing both predictive-failure disks rather than swapping them one at a time.

  1. Delete the array and re-create it.

Why? Both drives have excessive media errors. Replacing the drives one by one is not recommended, because the rebuild process would copy the errors to the newly installed drives.

  2. Update the Smart Array (SA) controller firmware and the firmware of all hard disks to the latest versions before putting the server into production.
  3. Restore the data from the backup.

NOTE: Make sure you have a complete data backup before performing the above recommendations.

Please let me know if you have further queries.


Regards,

  


I am a HPE Employee


ApolloPT
Occasional Advisor


Thanks for the response, @SanjeevGoyal!

This server is already in production and is not part of a cluster, so I cannot carry out those suggestions without some downtime.

Regarding the severity of having media errors in this RAID5 array, I don't know whether it is something I should be worried about, since the message mentions that:

"Errors will be fixed automatically when the sector(s) are overwritten"

I assume that those sectors have not been overwritten yet, or that there are several still waiting to be overwritten.

The other question is about the disk with the predictive failure: can I simply unplug it and plug in a new one during production hours? Or is it safer to shut down the server and install it while powered off?

 

SanjeevGoyal
HPE Pro


Hello,

Apologies for the delayed response.

You can replace the predictive-failure drives one by one. Replace the second predictive-failure drive only once the rebuild onto the first replacement drive has completed.
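The one-at-a-time rule can be sketched as a simple gate. This is only an illustration, assuming the array state is available as a plain string from whatever tool reports it. The function name, the bay labels, and the status strings ("OK", "Rebuilding") are hypothetical and are not taken from any HPE utility.

```python
def next_drive_to_replace(pending_bays, logical_drive_status):
    """Return the next bay to swap, or None if the swap should wait.

    pending_bays: ordered list of bays still awaiting replacement
        (hypothetical labels).
    logical_drive_status: status string for the array; "OK" is assumed
        here to mean that no rebuild is in progress.
    """
    if not pending_bays:
        return None  # nothing left to replace
    if logical_drive_status != "OK":
        return None  # a rebuild is still running, or the array is not healthy
    return pending_bays[0]


if __name__ == "__main__":
    # Before any swap: the first flagged drive may be replaced.
    print(next_drive_to_replace(["bay A", "bay B"], "OK"))
    # While the first replacement is rebuilding: wait.
    print(next_drive_to_replace(["bay B"], "Rebuilding"))
    # After the rebuild has finished: the second drive may be replaced.
    print(next_drive_to_replace(["bay B"], "OK"))
```

The point of the gate is that a RAID5 array tolerates only one missing disk, so pulling the second drive before the first rebuild finishes would take the array offline.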

Please let me know if you have any other queries.

Regards,

Sanjeev Goyal

 

 

