Array rebuild

 
Engineer2
Occasional Advisor

Array rebuild

One of our servers has a problem with the discs. One drive has totally failed, and two more are in a predictive fail state.  The server is configured RAID 5.

I got new discs shipped out by HP, which arrived today. I swapped out the failed drive for one of the new drives, but I cannot see in the array configuration utility whether it is actually rebuilding the array. It still registers the disc as failed. However, when I plugged the new disc in, all the discs (including the new one) lit up blue.

About 15 minutes later, only disc 3 was blue. Another few minutes after that, all the discs went blue again. I was under the impression the array should rebuild automatically when the failed disc was replaced?  It shows in the configuration utility as being in "interim recovery mode" but other than that, I cannot see a way to confirm if the array is rebuilding.

The server in question is an HP Proliant DL380 Gen 8.

Can anyone offer any advice? Thanks

parnassus
Honored Contributor

Re: Array rebuild

Which exact HPE Smart Array RAID controller is your server equipped with (just to have a complete picture to start from)?


I'm not an HPE Employee
Engineer2
Occasional Advisor

Re: Array rebuild

Hi, it's an HP Smart Array P420i array controller.

Thanks

parnassus
Honored Contributor

Re: Array rebuild

Assuming the array member disks (including the failed one and the new one you ordered as a replacement) are hot-plug capable, and all are the correct size (especially the replacement), was the swap (failed disk out, new replacement disk in) done online or offline (with the server powered off)?

Read the "Replacing drives" section of the HPE Smart Array Controllers User Guide:

For systems that support hot-pluggable drives, if you replace a failed drive that belongs to a fault-tolerant configuration while the system power is on, all drive activity in the array pauses for 1 or 2 seconds while the new drive is initializing. When the drive is ready, data recovery to the replacement drive begins automatically.
 
If you replace a drive belonging to a fault-tolerant configuration while the system power is off, a POST message appears when the system is next powered up. This message prompts you to press the F1 key to start automatic data recovery. If you do not enable automatic data recovery, the logical volume remains in a ready-to-recover condition and the same POST message appears whenever the system is restarted.

Engineer2
Occasional Advisor

Re: Array rebuild

Hi, the server was powered on and running when the swap-over occurred. It was not powered down. I took out the disc that had failed completely and simply replaced it with one of the new discs from HP.

parnassus
Honored Contributor

Re: Array rebuild

Isn't Automatic Data Recovery (ADR) enabled? ADR should be enabled by default, so the HPE Smart Array RAID controller will start the rebuild automatically (you just swapped the failed drive for the new one, without changing the disk bay where the failed one originally was):

When you replace a drive in an array, the controller uses the fault-tolerance information on the remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced failed drive) and then write the data to the replacement drive
 
so basically, if the replacement disk type matches the type used in the array (SATA with SATA, SAS with SAS, SSD with SSD, without any mix) and the capacity requirement is fulfilled (the replacement disk must be equal to or larger than the failed drive), the rebuild should start automatically if ADR is left enabled.
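
As a rough illustration of the two conditions just described (matching drive type, and capacity equal to or greater than the failed drive), here is a minimal Python sketch. The `Drive` class, the `replacement_problems` function and the 300 GB / 146 GB figures are hypothetical and exist only for this example; they are not an HPE API and do not describe the server in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical drive description for this example (not an HPE structure)."""
    kind: str           # "SATA", "SAS" or "SSD", as grouped in the post above
    capacity_gb: float


def replacement_problems(failed: Drive, replacement: Drive) -> list[str]:
    """List the reasons a replacement would not meet the two conditions."""
    problems = []
    if replacement.kind != failed.kind:
        problems.append("drive type differs from the array members")
    if replacement.capacity_gb < failed.capacity_gb:
        problems.append("replacement is smaller than the failed drive")
    return problems


# Illustrative values only.
failed = Drive("SAS", 300)
print(replacement_problems(failed, Drive("SAS", 300)))   # empty list: both conditions met
print(replacement_problems(failed, Drive("SATA", 146)))  # lists both problems
```

A drive that passes both checks can still be unusable for other reasons (for example, a faulty unit), so this only covers the compatibility rules quoted here.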
 
The logical drive will remain in a "Ready to recover" condition and the same status will be displayed at the next system restart until the rebuild happens (automatically or invoked manually).
 
It could be a physical drive firmware issue (does it need to be updated and the server rebooted?) if you see the disk Locate LED flashing blue (solid blue means the disk is recognized).

[Image: HP_ProLiant_Gen8_Hot-Plug_disk_LEDs.png]

What does the HPE Smart Storage Administrator application report?

Engineer2
Occasional Advisor

Re: Array rebuild

Where can I check if ADR is enabled?

TTr
Honored Contributor

Re: Array rebuild

What is the status of the logical disks in the array? Are they rebuilding or still degraded?

You may have to set the unused drive as an "auto-replace" hot spare in order for it to be used to rebuild the array.

Engineer2
Occasional Advisor

Re: Array rebuild

The disc which I replaced is still showing as failed in the Storage Administration utility. I cannot see anything that indicates whether it's rebuilding the array or not.

Of the other three discs, two are still in predictive fail mode, while the third is OK.

Because the other discs haven't completely failed yet I am sure the array can still be rebuilt on to the new discs.
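
To illustrate why a RAID 5 set can be rebuilt while every remaining member is still readable, here is a minimal Python sketch of XOR parity, the general scheme RAID 5 is based on. It is a toy model under stated assumptions: one stripe, four drives, 4-byte blocks and made-up contents. It says nothing about how the P420i lays out data internally.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across a four-drive set: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# Drive 1 is lost: XOR of the three surviving blocks reproduces its contents.
survivors = [block for index, block in enumerate(stripe) if index != 1]
rebuilt = xor_blocks(survivors)
print(rebuilt == stripe[1])  # True

# If a second drive were lost as well, the survivors would only give the XOR
# of the two missing blocks, which is not enough to recover either one.
two_left = xor_blocks([stripe[0], stripe[3]])
print(two_left == xor_blocks([stripe[1], stripe[2]]))  # True
```

The second check is the reason drives in predictive fail state matter during a rebuild: the rebuild has to read every surviving member in full, and single parity cannot cover a second outright failure.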
TTr
Honored Contributor

Re: Array rebuild

So the new disk does not show up as unused/unassigned? What does it show up as?

TTr
Honored Contributor

Re: Array rebuild

I see above you said it shows up as failed. Is it possible that the new disk was DOA? Where did it come from?

parnassus
Honored Contributor

Re: Array rebuild

If the replacement drive you plugged in shows up as "Failed" (a screenshot of the SSA interface would help, wouldn't it?), no rebuild is possible, whether or not ADR is enabled and whether or not the Smart Array controller tried to perform it.

Setting the replacement disk as a Global Hot Spare (GHS) shouldn't be a requirement since, as written, ADR takes care of rebuilding the array. You can eventually plan to add another identical drive as a GHS in another free bay, if you have one, once the rebuild ends successfully and the array exits its degraded state; that is advisable, not mandatory.

The disk LED status will also help.

HPE SSA (which version are you running?) will help you identify the firmware versions of the Smart Array controller and disks, plus a ton of other diagnostic and status information.


Engineer2
Occasional Advisor

Re: Array rebuild

The disc was sent out by HP under a warranty call. The server is still under warranty for another couple of months, so I got them to send out replacement discs.

I haven't tried using any of the other discs yet as I didn't want to interrupt the array rebuilding (if it was rebuilding, although it just doesn't look like it is happening).

Would you recommend trying another disc, just to rule out one of the new discs being dead on arrival?

parnassus
Honored Contributor

Re: Array rebuild

I would recommend that you check the status of your array, volume and disks through the HPE Smart Storage Administrator (SSA) application; it's visually very easy. Start by providing answers to all the questions we asked you. The more detailed you are, the more help you may receive from Community users, and the more details you volunteer, the higher the probability that you will fix your issue on your own in a few steps or with minimal help from Community users.

So if I were you, I would start by answering these questions:

  • What is the LED status (see the screenshot provided above) of all the disks involved?
  • What type of disks (provide the Product Number) are you working with?
  • What firmware versions do the disks involved have (important)?
  • What firmware version is your HPE Smart Array P420i controller running (that can be very important)?
  • What OS is your server running?
  • What software version is your HPE SSA (important too)?
  • What does HPE SSA report (you can provide various screenshots or sanitized logs if you want)?
  • Have you read the relevant HPE Smart Array Controllers User Guide in order to better troubleshoot what is going on with your disks?
  • Have you read the relevant HPE Smart Storage Administrator User Guide in order to be able to diagnose and act properly on your disk issue?
  • How large is your volume (to understand how long a rebuild could take)?
  • and so on...

Moderator: [above link is no longer valid, please visit https://support.hpe.com/connect/s/ to find the latest info]
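
For the question about what HPE SSA reports, a text status listing is often easier to share than screenshots. Below is a minimal Python sketch that summarises per-drive status lines. The sample text is illustrative only and was not captured from the server in this thread. The line format, and the idea that such text comes from the SSA command-line tool (something like `ssacli ctrl slot=0 pd all show status`), are assumptions that should be checked against the SSA user guide for the installed version.

```python
import re

# Illustrative text only; drive IDs, sizes and states are made up for the example.
SAMPLE = """\
   physicaldrive 1I:1:1 (port 1I:box 1:bay 1, 300 GB): Failed
   physicaldrive 1I:1:2 (port 1I:box 1:bay 2, 300 GB): Predictive Failure
   physicaldrive 1I:1:3 (port 1I:box 1:bay 3, 300 GB): OK
   physicaldrive 1I:1:4 (port 1I:box 1:bay 4, 300 GB): Predictive Failure
"""

# Assumed layout: "physicaldrive <id> (<details>): <status>"
LINE = re.compile(r"^\s*physicaldrive\s+(\S+)\s+\(.*\):\s*(.+?)\s*$")


def drive_states(text):
    """Map each drive ID to the status string at the end of its line."""
    states = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


def needs_attention(states):
    """Return the sorted IDs of drives whose status is anything other than OK."""
    return sorted(drive for drive, status in states.items() if status != "OK")


states = drive_states(SAMPLE)
print(states["1I:1:1"])
print(needs_attention(states))
```

A replacement drive that still reports a failed state after the swap, as in this thread, would show up in the `needs_attention` list and point to the new drive itself rather than to the rebuild settings.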

I would not recommend that you do anything (like removing the replacement disk you inserted in the failed disk's bay) if you don't have a clear picture of what is going on (automatic rebuilding is probably not happening for the reason given in the post above).


Engineer2
Occasional Advisor

Re: Array rebuild

Update: Looks like the replacement disc HP sent out was faulty!
I took the disc out, replaced it with another new disc, and within 40 seconds the array started to rebuild!

parnassus
Honored Contributor

Re: Array rebuild

Lucky guy. That's better than (just as an example) discovering that an automatic rebuild started correctly once you inserted the new replacement disk, only to see it exit abnormally after some time. Glad you found the culprit.

