ProLiant Servers (ML,DL,SL)

ML110 RAID auto-rebuild

Kevin Chamberlain
Occasional Visitor

I'm trying to understand how a server managed to lose a week's worth of files - any suggestions?

Scenario:
HP ML110 G5 with SATA RAID driver - single C: system drive and RAID 1 pair for data storage.

The client relocated their server, but it would not restart and reported a "missing OS" error. So they disconnected the data drives (thinking it was booting from them), then reconnected them, and it booted correctly.
The following weekend I got access to the server to find out why it would not boot (it turned out to be an external backup HDD they had left connected). However, after the reboot to correct the BIOS, the RAID controller automatically started a RAID rebuild (although the system had worked normally for a week, presumably on 1 HDD).
This is where the problem started: the client reported they had lost files from Friday. On investigation, ALL files created since the original relocation were lost. I can only assume this is because the RAID was rebuilt from the wrong drive (i.e. the one which had been flagged as failed and so not updated).
Have had to restore from backup, but I (and the client) would like to know how this can happen?
3 REPLIES
TTr
Honored Contributor

Re: ML110 RAID auto-rebuild

> I can only assume because the RAID was rebuilt from the wrong drive

Are you sure the customer did not play around with the RAID 1 pair? They definitely pulled drives out and then probably plugged them in on a live system. I suspect this is where things went wrong: the array came up with one drive as soon as it saw it. Did they check the status of the RAID 1 pair after plugging the data drives back in?

What you are suspecting is probably what happened, but there are too many unknowns here. It is possible that the array controller came up with one data drive and the next time it came up with the other drive. The customer might not be telling you all the details either.
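
To make the suspected sequence concrete, here is a toy Python model of a two-disk mirror. It is only a sketch of the scenario described above, not the ML110 controller's actual logic; the disk names "A" and "B" and the helper functions are made up for illustration.

```python
# Toy model of a two-disk RAID 1 mirror. This is NOT the ML110 controller's
# real logic; disk names and helpers are hypothetical.

def write(disks, online, name, data):
    """Write a file to every disk that is currently online in the mirror."""
    for d in online:
        disks[d][name] = data

def rebuild(disks, source, target):
    """Rebuild the mirror by copying the source disk over the target."""
    disks[target] = dict(disks[source])

disks = {"A": {}, "B": {}}

# Before the relocation both disks are in sync.
write(disks, ["A", "B"], "old.doc", "v1")

# Assumption: after the drives were unplugged and reconnected,
# only disk A came online, so a week of writes landed on A alone.
write(disks, ["A"], "friday.doc", "v1")

# Assumption: on the next reboot the controller treats the stale
# disk B as the rebuild source and copies it over A.
rebuild(disks, source="B", target="A")

print(sorted(disks["A"]))  # ['old.doc'] - the week's files are gone
```

If something like this happened, the array would look healthy after the rebuild while everything written during the degraded week is silently overwritten.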
Kevin Chamberlain
Occasional Visitor

Re: ML110 RAID auto-rebuild

I totally agree it's a bit of a mystery. My real concern is that everything was apparently working OK for a week (although I didn't have the Windows-based storage manager installed then), yet after a simple reboot the RAID array somehow wiped out all the files created since the previous reboot. I'd have thought that once a HDD is flagged as failed it should not be brought back online until it has been rebuilt, rather than the controller randomly selecting which drive is the 'master' on each reboot?
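
For what it's worth, the behaviour I would have expected looks roughly like the Python sketch below. It assumes each member disk carries a persistent "failed" flag and a generation counter that is bumped whenever the array state changes (similar in spirit to the event count Linux software RAID stores per member). Whether the ML110's SATA RAID keeps anything like this is exactly what I don't know; the `Member` class and `pick_rebuild_source` function are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Member:
    name: str
    generation: int  # hypothetical counter, bumped on every array state change
    failed: bool     # hypothetical persistent "failed" flag

def pick_rebuild_source(members):
    """Return the member a rebuild should copy FROM.

    Failed members are never candidates, and among the remaining
    members the one with the newest generation wins.
    """
    candidates = [m for m in members if not m.failed]
    if not candidates:
        raise RuntimeError("no healthy member to rebuild from")
    return max(candidates, key=lambda m: m.generation)

# Disk A kept running for a week; disk B dropped out at the relocation.
disk_a = Member("A", generation=12, failed=False)
disk_b = Member("B", generation=7, failed=True)
print(pick_rebuild_source([disk_a, disk_b]).name)  # A

# Even if the failed flag were lost, the older generation still marks B as stale.
disk_b.failed = False
print(pick_rebuild_source([disk_a, disk_b]).name)  # A
```

With a check like this the stale drive could only ever be the rebuild target, never the source.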
robert_baird
Occasional Visitor

Re: ML110 RAID auto-rebuild

I had this problem too. I know this is a community KB, but you would think HP would take an interest and respond to questions like this.

Same scenario: a disk failed, I replaced it, powered on the server, and saw that the drive was no longer marked failed, the array was OK, and it was rebuilding.

I looked on the disk: it's blank.

I never had a problem like this on an HP ML350. Why would a device designed to protect data repair the array by copying a blank disk, one that did not even belong to the array, over the top of the online drive?

HP, I cannot trust your entry-level servers for anything important. I am very disappointed.

My conclusion: DO NOT replace the failed drive on the ML110. Instead, install the new drive as a hot spare, allow the array to fail over to the spare, then remove the failed drive.