ProLiant Servers (ML,DL,SL)

Raid 6 Array won't rebuild drive

 
Chuck Slabaugh
Occasional Contributor

Raid 6 Array won't rebuild drive

I have an HP Server with a P400 Controller running Raid 6 with 8 72GB drives.

 

One of the drives failed and could not be rebuilt because 2 other drives have Hard Read errors.

 

Am I understanding correctly that the drive cannot be rebuilt and I will have to back up and restore using 2 more new drives in order to fix the problem?

 

If so, is replacing the drives and doing a restore correct, or will I have to break the array and recreate it in order to fix the problem?

 

I need some direction on what to do. The Server is running OK right now, but in a degraded mode - essentially Raid 5.
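The "essentially Raid 5" remark can be checked with simple arithmetic. The following is a minimal sketch, assuming the usual RAID 6 layout in which two drives' worth of space holds parity; the function name `raid6_summary` is made up for illustration and is not part of any HP tool.

```python
def raid6_summary(drive_count, drive_size_gb, failed_drives):
    """Return (usable capacity in GB, further drive failures the array can absorb)."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    parity_drives = 2  # RAID 6 keeps two drives' worth of parity
    usable_gb = (drive_count - parity_drives) * drive_size_gb
    # A negative result would mean more drives are gone than parity can cover.
    remaining_tolerance = parity_drives - failed_drives
    return usable_gb, remaining_tolerance


# The array described in this thread: 8 x 72GB drives, 1 drive failed.
usable, tolerance = raid6_summary(drive_count=8, drive_size_gb=72, failed_drives=1)
print(usable, tolerance)  # 432 1
```

With one drive out, the array can only absorb one more whole-drive failure, which is the same margin a healthy RAID 5 has.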

 

I could just wait for the Server to fail or go ahead and back up and restore, but I want to make sure I do the right thing, and I do not want to lose any data. I do not have a Backup Server to restore and test on. I am using Acronis Backup and Restore Virtual Edition.

 

I have 2 Logical Drives. The System boots into Windows Server 2008 with Drive C containing the operating system and Drive G containing the Virtual Drives. There are 3 Virtual Servers there.

 

Thank you for any suggestions on the course to take.

5 REPLIES
anthony11
Regular Advisor

Re: Raid 6 Array won't rebuild drive

That sure sounds to me like you'll need to restore from backup -- if you even have a good backup. 

 

You haven't been monitoring drive status?

 

Goutham_Sabala
Esteemed Contributor

Re: Raid 6 Array won't rebuild drive

As the server is booting into Windows now, take a backup and recreate the logical drive. I would not advise you to wait for the server to go down, as you never know when that is going to happen. Instead of waiting, back up, recreate, reinstall and restore.
Goutham Sabala
Chuck Slabaugh
Occasional Contributor

Re: Raid 6 Array won't rebuild drive

Thank you for your replies.

 

Here is the situation.

 

I had 1 hard drive fail in the 8 drive Raid 6 Array. That is when I found out I could not rebuild the replacement drive.

 

Raid Diagnostics shows a Raid Parity Initialization Error. I have Read errors on 2 other drives; however, they are still participating in the Raid Array. The Raid array configuration shows everything as good, but the diagnostics show the error.

 

I suspect the hard read errors are on areas of the drive containing no data because the Server is performing fine and backups are 100%.

 

However, I have been using HP Data Protection Express and it will not create a disaster recovery CD.

 

I have switched to Acronis Backup and Recovery Virtual Edition. I want to make sure I have a good backup.

 

I just finished the offsite data backup. Acronis does not see the tape drive because, from what they say, I need to uninstall HP DPExpress. I will do that next, and get a good tape backup.

 

It is a HP Server with an HP P400 Controller.

 

Question 1: Will a chkdsk /r on the drives do any good? My thinking is, Raid works on the Physical layer and Windows on the Logical layer; therefore a chkdsk /r would not do any good. Maybe I am wrong, and since it appears the read errors are on areas of the disk that do not contain data, it might work. What do you think?

 

My next step is to get a good backup to tape using Acronis.

 

I am not sure what my next step is.

 

Question 2: I have a Raid Array with 2 Logical Drives. Can I replace drives 5 and 7 (Read Errors) and do a restore from backup?

 

Or will I have to break the array, blast all of the data and recreate the array, and do a restore in order to fix the problem?

 

It's scary because everything is fine now and I do not have a backup server to test with. It's a matter of trusting a good backup.

 

This Server is used around the clock. My 1st test will be to test the bootable CD created by Acronis and make sure it sees the Array controller and drives.

 

I have time to think about this and I want to get it right.

 

Question 3: There are 7 drives participating in the array now, and 1 is not (the new drive).

 

What if I purchased 7 new drives (or 8) and went through the restore process? If the restore worked, great - if not, I could put the original drives back and go from there. The problem with that, I think, is that I would have to recreate the array using the new drives. Putting back the old drives in case of failure may not work...

 

Purchasing another Server is the only other way I know to test the backup/restore.

 

What do you think?

 

I am looking for expertise with Raid and I thank you for your suggestions.

Goutham_Sabala
Esteemed Contributor

Re: Raid 6 Array won't rebuild drive

Question 1: Will a chkdsk /r on the drives do any good? My thinking is, Raid works on the Physical layer and Windows on the Logical layer; therefore a chkdsk /r would not do any good. Maybe I am wrong, and since it appears the read errors are on areas of the disk that do not contain data, it might work. What do you think?

 

chkdsk /r - In my experience, chkdsk /r does not do any good.


Question 2: I have a Raid Array with 2 Logical Drives. Can I replace drives 5 and 7 (Read Errors) and do a restore from backup?
Or will I have to break the array, blast all of the data and recreate the array, and do a restore in order to fix the problem?
It's scary because everything is fine now and I do not have a backup server to test with. It's a matter of trusting a good backup.
This Server is used around the clock. My 1st test will be to test the bootable CD created by Acronis and make sure it sees the Array controller and drives.
I have time to think about this and I want to get it right.


> You would essentially be making the server unbootable by replacing drives 5 and 7, as there is already one failed drive waiting to rebuild.
The rebuild happens from parity data on the rest of the drives. As there are read errors on 5 and 7, like you said, I doubt the rebuild is ever going to start.
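To illustrate why read errors on other drives can block a rebuild, here is a rough sketch of a stripe-level model. It assumes only the general RAID 6 property that each stripe can be reconstructed when at most two of its blocks are missing. It is not the P400 firmware logic, which may stop a rebuild under conditions this model does not capture, and the stripe numbers below are invented for the example.

```python
def unrecoverable_stripes(failed_drives, bad_stripes_by_drive, stripe_count, max_erasures=2):
    """Return the stripes in which more blocks are missing than dual parity can rebuild.

    failed_drives: set of drive numbers that are entirely missing.
    bad_stripes_by_drive: dict of drive number -> set of stripe indexes that
        cannot be read on that drive (hypothetical data for illustration).
    """
    lost = []
    for stripe in range(stripe_count):
        # Every failed drive contributes one missing block to every stripe.
        erasures = len(failed_drives)
        for drive, bad in bad_stripes_by_drive.items():
            if drive not in failed_drives and stripe in bad:
                erasures += 1
        if erasures > max_erasures:
            lost.append(stripe)
    return lost


# Drive 6 is out; drives 5 and 7 each have a few unreadable stripes (made-up numbers).
print(unrecoverable_stripes({6}, {5: {10, 42}, 7: {42, 77}}, stripe_count=100))  # [42]
```

In this model only a stripe where both drives 5 and 7 are unreadable is lost outright, but a controller-level rebuild has to read every stripe whether or not Windows stores data there, so bad sectors in "empty" areas can still get in the way.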


Or will I have to break the array, blast all of the data and recreate the array, and do a restore in order to fix the problem?

> Recreate and restore is the best solution. The server may continue to work fine until a restart, so it is important to have a backup before a restart.


Question 3: There are 7 drives participating in the array now, and 1 is not (the new drive).

What if I purchased 7 new drives (or 8) and went through the restore process. If the restore worked, great - if not I could replace the original drives back and go from there. The problem with that I am thinking is that I would have to recreate the array using the new drives. Putting back the old drives in case of failure may not work...

Yes, you can purchase new drives and restore, but you have to be 100% sure that the backup has completed successfully. Like I said, if you restart the server in the current situation with the current drive configuration ("could replace the original drives back and go from there"), the server may just not come back on.

 

Recreating the logical drives and restoring the data, either with all new drives or by just replacing the hard drives with read errors, will work. Like you mentioned, it all comes down to the backup job and trusting that it will work when restored.
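One way to rely a little less on trust is to restore some files to a scratch location and compare checksums against the originals. The sketch below is a generic Python approach, not a feature of Acronis or HP Data Protection Express, and the function names are made up for illustration. Files that change while the server runs (such as live virtual disks) will differ, so it is only meaningful for static files or powered-off Virtual Machines.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, restored_root):
    """Return relative paths that are missing or differ in the restored copy."""
    source_root, restored_root = Path(source_root), Path(restored_root)
    problems = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = restored_root / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(str(rel))
    return problems


# Example call with hypothetical paths:
# print(compare_trees(r"C:\Data", r"E:\RestoreTest\Data"))
```

An empty list means every file under the source folder was found in the restored copy with identical contents.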

 


Goutham Sabala
Chuck Slabaugh
Occasional Contributor

Re: Raid 6 Array won't rebuild drive

I bought an HP Care Pack for 1 year and am awaiting registration.

 

All Drives except Drive 3 (replaced 6 months ago) and Drive 6 have the same model and firmware level (HPD9).

 

Drive 3 is a different model with firmware at HPD8.

 

Drive 6 - the drive that will not rebuild - is model DH072BAAKN - Firmware HPD3 - the latest available for that drive model. I could not find the drive model for Drive 3 (EH0072FARUS - which is at HPD8).

 

Acronis Backup and Recovery 11 - Virtual Edition has been installed and a backup completed.

 

Primary Server has 2 drives - Drive C - Operating System and Drive G - Virtual Drives are located there.

 

I have taken a full backup of Drive C and all 3 Virtual Drives - skipping the backup of Drive G. They said that was not necessary since the Virtual Machines have been backed up.

 

Including Drive G exceeded the 400GB Tape so it was skipped - Does that seem correct?

 

What do you think about the model/firmware level?

 

Server is working fine and full backups complete 100%. Hard Read Errors remain on 5 and 7.

 

Could replacing Drive 6 with the same model/firmware level drive help?

 

What if I took one of the drives with hard read errors (5 or 7) and replaced it with a new drive?

 

My understanding was that the P400 Raid Controller should be able to deal with Hard Read errors on the drive.

 

This problem has a lot of people stumped and I've received conflicting advice.

 

Could the drive models/firmware versions have anything to do with this or is breaking the array, recreating it and restoring the only hope?

 

Is the Backup I'm making OK (skipping Drive G, but making Virtual Machine Backups)?