RAID6 doesn't rebuild

SOLVED
a13x
Occasional Advisor

RAID6 doesn't rebuild

Hello!

DL580 G4
RAID6
8 Physical Disks 146GB

Two weeks ago I replaced the drive in Bay 4, which was blinking amber. Four days ago I replaced a second drive, in Bay 1, which was also blinking amber.

Then I restarted the server (OS is RHEL5.5) and hit F1 to start data recovery.
After this I booted from SmartStart.
At first ACU showed the disks in Bays 1 and 4 as recovering, but then it reported:
(Ready for Recovery) The logical drive is queued for rebuilding.

If I restart the server, the situation simply repeats.

How can I force the recovery of the array?
I changed the controller settings to "High Rebuild" priority but that didn't seem to do much.

I've attached a copy of the ADU report.

Any help would be greatly appreciated.

I'm looking forward to your prompt reply.
Alex.

Best regards,
Alexander Popov,
IPG
12 REPLIES
a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

Hello again!
Is there anyone who can help me?

Thanks in advance!
Alex
Johan Guldmyr
Honored Contributor

Re: RAID6 doesn't rebuild

Hey!

After you press F1 - does it just reboot or what happens?
a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

Hey!
Thank you for your reply!

When I press F1, it says that automatic data recovery is enabled, and the server starts to boot the OS (or SmartStart, if I insert the SmartStart CD).
Then I look at ACU. At first it shows the disks in Bays 1 and 4 as recovering, but then it reports:
(Ready for Recovery) The logical drive is queued for rebuilding.

Johan Guldmyr
Honored Contributor
Solution

Re: RAID6 doesn't rebuild

Hey, the ADU report says that three drives are in predictive failure:

1:6
1:7
1:8

And as far as I can see you only have one logical drive.

Can you run ACU inside the OS? Maybe it behaves differently once the server has booted to the OS?
a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

OK! I'll try and report the results!
cnb
Honored Contributor

Re: RAID6 doesn't rebuild

Hi,

First, you can't force it because it still has an error.

Second, your P400 subsystem is missing several mandatory *critical* firmware updates and some that address issues with rebuilds and data loss, but this is not the main reason for the failure:


Smart Array P400 in slot 4 : Identify Controller

Configured Logical Drives 1 (0x01)
Configuration Signature 0xa35bd66d
RAM Firmware Revision 2.08
ROM Firmware Revision 2.08

The latest Firmware Update DVD for your Server and components:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=1137825&swItem=MTX-3380ae888f86436aac54c149eb&prodNameId=3288122&swEnvOID=4006&swLang=8&taskId=135&mode=3

The rebuild is failing due to a Critical Read Error:

0 *Rebuild Aborted From Read Error Critical*
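Why a single read error can stop this rebuild: RAID 6 keeps two parity blocks per stripe, so a stripe can be reconstructed only while at most two of its blocks are unreadable. The sketch below is a simplified model, not controller logic; it assumes both replaced drives (Bays 1 and 4) are still rebuilding, as ACU showed, and the function name is hypothetical.

```python
def stripe_recoverable(unreadable_blocks, parity_blocks=2):
    """Return True while a stripe can still be reconstructed.

    Simplified model: RAID 6 stores two parity blocks per stripe, so it
    tolerates at most two unreadable blocks in any one stripe.
    """
    return unreadable_blocks <= parity_blocks


# Assumption: two replaced drives are not yet rebuilt, so every stripe
# already has two blocks that must be reconstructed.
rebuilding_drives = 2

# No read errors on the remaining drives: the stripe can be rebuilt.
print(stripe_recoverable(rebuilding_drives))

# One unreadable block on a third drive: three unknowns, rebuild stops.
print(stripe_recoverable(rebuilding_drives + 1))
```

Under that assumption the array has no redundancy left during the rebuild, which is consistent with a read error on one more drive aborting it.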

In addition to the predictive fault codes:
1I:1:6 Predictive Failure
1I:1:7 Predictive Failure
1I:1:8 Predictive Failure

Other drives reported faults, and one looks like it is still reporting a fault, or this is latched from the last read failure:

1I:1:6 Last Failure Reason Hardware Error (0x0d)
2I:1:1 Last Failure Reason Hot Removed (0x14)
2I:1:4 Last Failure Reason Hot Removed (0x14)
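To cross-check lists like these, the entries can be grouped by drive address. This is a minimal sketch: the sample lines are copied from the excerpts above, the one-entry-per-line layout is an assumption (a full ADU report is formatted differently), and the helper names are hypothetical.

```python
import re

# Sample lines copied from the ADU excerpts quoted in this post.
ADU_EXCERPT = """\
1I:1:6 Predictive Failure
1I:1:7 Predictive Failure
1I:1:8 Predictive Failure
1I:1:6 Last Failure Reason Hardware Error (0x0d)
2I:1:1 Last Failure Reason Hot Removed (0x14)
2I:1:4 Last Failure Reason Hot Removed (0x14)
"""

# Drive address in port:box:bay form, for example "1I:1:6".
LINE_RE = re.compile(r"^(\d+[IE]:\d+:\d+)\s+(.*)$")


def summarize(text):
    """Group the reported conditions by drive address."""
    drives = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            drives.setdefault(match.group(1), []).append(match.group(2))
    return drives


def suspects(drives):
    """Return drives that show a predictive failure or a hardware error."""
    flags = ("Predictive Failure", "Hardware Error")
    return sorted(
        address
        for address, notes in drives.items()
        if any(flag in note for note in notes for flag in flags)
    )


if __name__ == "__main__":
    for address in suspects(summarize(ADU_EXCERPT)):
        print(address)
```

The "Hot Removed" entries are left out of the suspect list because they match the two drives that were pulled and replaced in Bays 1 and 4.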

The drives are also missing important firmware updates that affect hot removal and premature failure reporting of drives.

You're also mixing drive types which isn't the best practice and may cause rebuild failures. Try to keep the same drive type *and* firmware across the volume:

HP DG146ABAB4
HP DG146ABAB4
HP DG146ABAB4
HP DG146ABAB4
HP EG0146FAWHU *
HP DG146ABAB4
HP DG146ABAB4
HP DG0146FARVU *
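A quick way to spot the odd models in a list like this is to count them and report everything that is not the majority model. A minimal sketch, using the model strings listed above; the function name is hypothetical, and the list order is not assumed to match bay numbers.

```python
from collections import Counter

# Drive models as listed above (the "HP" vendor prefix is dropped).
MODELS = [
    "DG146ABAB4",
    "DG146ABAB4",
    "DG146ABAB4",
    "DG146ABAB4",
    "EG0146FAWHU",
    "DG146ABAB4",
    "DG146ABAB4",
    "DG0146FARVU",
]


def minority_models(models):
    """Return {model: count} for every model other than the most common one."""
    counts = Counter(models)
    majority, _ = counts.most_common(1)[0]
    return {model: n for model, n in counts.items() if model != majority}


if __name__ == "__main__":
    for model, n in minority_models(MODELS).items():
        print(f"{model}: {n} drive(s) differ from the majority model")
```

The same check could be repeated on firmware revisions, since the advice here is to match both model and firmware across the volume.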

1) Fix the drive at 1I:1:6
2) Update system BIOS and backplane firmware
3) Update controller firmware
4) Update drive firmware
Try the rebuild, but it probably won't work due to the many firmware issues noted.
5) Update any Linux drivers (check ADU version too!)
6) Recreate the RAID6 volume and restore from backup


Rgds,


-cnb
cnb
Honored Contributor

Re: RAID6 doesn't rebuild

Sorry, this got cut out from my last post...

"Check the Predictive or Read Failure Errors in ADR after updating everything. If you still get errors replace the disks with like drives, update firmware and then try the rebuild again.

If it still fails with known good replacement disks and firmware is all up to date, recreate the raid and restore."

Rgds,
a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

Hello CNB and Johan Guldmyr!

The firmware is updated and the array seems to have rebuilt, but I'm not sure that it is OK.

HPACUCLI shows some new errors now.
I've attached a new ADU report to this message.

I'm hoping to save this array and its data. Is that possible?

Thank you for your great help!
Looking forward to your reply.
A13x.
cnb
Honored Contributor

Re: RAID6 doesn't rebuild

Your Controller Firmware is still low. 7.22 is the latest!
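Firmware revisions like these are easier to compare as numbers than as strings. A minimal sketch, assuming revisions are plain dot-separated integers; 2.08 is the revision from the first ADU report and 7.22 is the one named here, and the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a revision such as '2.08' into a tuple of integers, (2, 8)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_outdated(installed, latest):
    """Return True if the installed revision is older than the latest one."""
    return parse_version(installed) < parse_version(latest)


if __name__ == "__main__":
    print(is_outdated("2.08", "7.22"))
```

Comparing tuples avoids the string-comparison trap where "10.0" would sort before "7.22".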

Your drives are still mixed, so not sure what you want analyzed. They are still reporting Predictive failures and mixed firmware.

The data should be available. Back it up, fix the drives and update the firmware.

Rgds,


a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

Hey, CNB and Johan Guldmyr!

1. We've done a full backup of data on the array
2. I've changed the defective disk at the Bay 6
3. I've updated the controller firmware (using Smart Update Firmware DVD 9.10C)
4. I've updated the firmware of all the disks (using Smart Update Firmware DVD 9.30)
5. I've cleared the controller configuration

But there are still S.M.A.R.T. errors!

Moreover, I've noticed that if I change the disk when the server is running (hot plugging), the new disk starts to show S.M.A.R.T. errors.
But if I shut down the server before changing an old scsi disk to a new sas disk, the new disk works fine and doesn't blink yellow.
Did I damage my new disks by hot plugging them?
Should I create RAID6 and not worry about these errors?

I'm attaching a new ADU report to my message.

I hope for your help, CNB and Johan Guldmyr.
Thank you in advance!
a13x
Johan Guldmyr
Honored Contributor

Re: RAID6 doesn't rebuild

"But if I shut down the server before changing an old scsi disk to a new sas disk, the new disk works fine and doesn't blink yellow.
Did I damage my new disks by hot plugging them?
Should I create RAID6 and not worry about these errors? "

Maybe a reseat of all the disks will clear the condition.
a13x
Occasional Advisor

Re: RAID6 doesn't rebuild

Hey, Johan Guldmyr and CNB!

I've used "Clear Configuration" in ACU from SmartStart, but 6 of 8 disks are still blinking yellow and show SMART errors.

How should I reset all the disks to clear the condition?