NetRAID RAID Lvl 5 main system hdd Failure all addresses

Jeff Bowman
Occasional Visitor

I'm using a mothballed HP LH II server until the FY. I rebuilt the server for data with the following configuration: 12 hdds, 6 in the main system and 6 in the external subsystem. The main system is set to channel 0 and the subsystem is running on channel 1, both configured for RAID 5.

This morning the RAID alarm started, so I brought the machine down (unfortunately the OS, Novell, was locked up, so I had to bring the system down hard). I restarted the system, went into the config tools, turned the alarm off, went into add/view config and found the following:

All drives on Channel 0, addresses 0-6, were listed as failed.
All drives on Channel 1 (external subsystem), addresses 0-6, were listed as online.

To make matters worse, I have discovered that a backup had not been run on the system, so I have nothing to fall back on.

At this point I need to make some decisions so I can get the system back up. Therefore I need some advice.

I have considered the following:
1. Manually putting all drives on Channel 0 back online and attempting to get as much data off as possible before the system fails completely again.
2. Removing the drive at address 0 on channel 0, inserting a new drive and rebuilding the array, in the hope that the lead drive caused the array failure (never seen such a thing, but willing to try anything).
3. Blowing away the array config and letting the RAM config recreate it, in case the array config I am using now is corrupt (grasping at straws now).
4. Any other suggestion that may be out there. I need to treat this as a one-shot deal, as I may lose all the data.

Any suggestions? Thanks in advance for any help!

Marco Hogeveen
Honored Contributor

It might be possible that 1 bad drive in your array caused the other 5 drives to fail.
You might want to check the physical drives for errors.
Go into Express Tools (Ctrl-M during POST) and select: ->Object->Physical disks.
Now select every disk on channel 0 and look at its properties. The errors are displayed at the bottom of the screen.

If you find a disk with errors, do option 1 (manually putting the disks online), but leave out the disk with errors.
Your system should be able to boot now and you can replace the disk afterwards (always do it when the server is powered up, otherwise you will get an unresolved mismatch) and rebuild the RAID 5.

If there are multiple disks with errors, then you'd better start crying.... you've lost your data.
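
To illustrate why one bad disk is survivable and two are not, here is a minimal sketch of RAID 5 style XOR parity on a single stripe. It is not how the NetRAID controller is implemented; the block contents, the 5 data + 1 parity layout and the function names are all assumptions made up for the example, and real RAID 5 rotates the parity block across the drives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe on a 6-drive set: 5 data blocks plus 1 parity block.
data = [bytes([i] * 4) for i in range(1, 6)]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild_missing(stripe, missing_index):
    """Recompute the block of one missing drive from the five survivors."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)

# Any single missing block can be recomputed from the rest.
for i in range(len(stripe)):
    assert rebuild_missing(stripe, i) == stripe[i]

# With two drives missing, the survivors only give the XOR of the two
# lost blocks, which is not enough to separate them again.
survivors = stripe[2:]
assert xor_blocks(survivors) == xor_blocks([stripe[0], stripe[1]])
```

That is the reason for leaving out only the one disk with errors: the other five still hold everything needed to rebuild it.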

Good luck!

Jeff Bowman
Occasional Visitor

Marco, Thanks for the suggestion. Unfortunately I had to solve the problem last week with or without suggestions.

Long story short: I manually brought the drives online, rebooted (there were some config problems with the drives), cleared the entire configuration, reset the drives to the same settings they had prior to the incident, and rebooted the machine. No more issues at the moment. And, of course, I immediately ensured the system got backed up and stays backed up!

Thanks again Marco!