
Rapid Parity Initialization leading to multiple drive failures

 
kbtpt
Senior Member


Hello,

I have a number of ProLiant DL380 Gen10 servers with HPE Smart Array P408i-a SR controllers that are having issues with their logical drives. They have been logging error 1716 in the IML: "Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface scan. Errors will be corrected when the sector(s) are overwritten. Action: Backup and Restore recommended."

I've read a couple of KB articles that suggest updating the firmware and setting the surface scan analysis priority to High. However, once the firmware is updated, the 1716 error clears but is replaced with error 1915, which has basically the same description and the same resolution steps. I've still tried enabling the high-priority surface scan analysis, but it hasn't resolved the issue.

I've been following the directions in the KB that HPE put out about this issue, which involve migrating the VMs to another host and then rebuilding the array with rapid parity initialization. What seems unusual to me is that on each machine where I run this process, 2 or 3 of the 7 drives in the array fail. Is this normal? I still have several more servers to go, and I've seen this enough times that I'm starting to wonder if I'm doing something incorrectly.

BPSingh
HPE Pro


Greetings!

I believe you are referring to the below advisory. 

https://support.hpe.com/hpesc/public/docDisplay?docId=a00104843en_us&docLocale=en_US

Ideally, deleting and recreating the array with RPI (for parity RAID levels) should fix the issue. However, before you do so on a server that is showing unrecoverable errors, it is recommended to first check the drives for media errors. If any drive has a high count of media errors, it is best to replace it. Please log a support case and provide the AHS logs so that the errors can be reviewed.



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
kbtpt
Senior Member


Thanks for the reply. Unfortunately, none of these systems are under support anymore, so I don't think I can open a support case...

Yes, that is the advisory I'm referring to. Deleting and re-creating the array with RPI has cleared the error every time; however, it usually leads to 1-3 drives failing during the process.

How can I check the media error count on a drive? I assume it's somewhere in the iLO, but I'm not finding anything.

kbtpt
Senior Member


Actually, if I show more details on a given drive, I see a line for Uncorrected Read Errors and a line for Uncorrected Write Errors. What would be considered a high number of media errors?
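
For anyone collecting these two counters across several servers, here is a minimal Python sketch that pulls them out of drive-detail text copied from the management interface. The two labels come from the post above; the "label: number" layout, the function name, and the sample values are assumptions made up for illustration, so adjust the pattern to whatever your system actually prints.

```python
import re

# Assumed line layout: "<label>: <integer>" (colon optional).
# Only the two labels are taken from the thread; the rest is a guess.
_PATTERN = re.compile(
    r"^\s*(Uncorrected (?:Read|Write) Errors)\s*:?\s*(\d+)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_uncorrected_errors(detail_text: str) -> dict:
    """Return the counters found in pasted drive details, keyed 'read' / 'write'."""
    counts = {}
    for label, value in _PATTERN.findall(detail_text):
        key = "read" if "read" in label.lower() else "write"
        counts[key] = int(value)
    return counts


# Made-up sample text, not real output from any HPE tool.
sample = """\
Status: OK
Uncorrected Read Errors: 212
Uncorrected Write Errors: 0
"""

print(parse_uncorrected_errors(sample))  # {'read': 212, 'write': 0}
```

Lines that do not match the pattern are ignored, so a drive whose details lack either counter simply yields a dictionary without that key.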

BPSingh
HPE Pro


During RAID creation, performing RPI typically blocks bad sectors. However, if a drive has an excessive number of errors, it is likely compromised and should be replaced promptly, especially if the error count continues to increase. Generally, drives with hundreds of errors should be considered for replacement. 
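
As a rough illustration of that rule of thumb, the Python sketch below classifies drives from two readings of their error count taken at different times. It is not an HPE tool or an official threshold: "hundreds of errors" is interpreted here as 100 or more, and the class name, function name, bay labels, and counts are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: "hundreds of errors" is read as 100 or more. Adjust as needed.
REPLACE_THRESHOLD = 100


@dataclass
class DriveReading:
    bay: str               # drive location label (made up for this example)
    previous_errors: int   # total uncorrected errors at the earlier check
    current_errors: int    # total uncorrected errors at the latest check


def assess(drive: DriveReading, threshold: int = REPLACE_THRESHOLD) -> str:
    """Return 'replace', 'watch', or 'ok' for one drive."""
    if drive.current_errors >= threshold:
        return "replace"   # error count is already in the hundreds
    if drive.current_errors > drive.previous_errors:
        return "watch"     # below the threshold, but still increasing
    return "ok"


# Hypothetical readings, not data from the servers in this thread.
readings = [
    DriveReading("1I:1:1", 0, 0),
    DriveReading("1I:1:2", 12, 40),
    DriveReading("1I:1:3", 250, 310),
]

for d in readings:
    print(d.bay, assess(d))
```

Running this check before deleting and recreating the array gives a list of drives to swap first, rather than discovering them when they fail during initialization.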


