MSA 1000 failed HDD replaced, rebuild not starting

Joe Conner
Occasional Advisor

MSA 1000 failed HDD replaced, rebuild not starting

I know this has been asked before with similar configs, but I haven't seen it addressed for my specific hardware:
Upon startup, bay #12 in one of my MSA1000s lit red. I've hot-replaced the failed drive. The MSA1000 sees the drive hot added, and Array Configuration Utility v7.50.23.0 (Windows) now shows the new HDD as unassigned. Nowhere can I find a place to assign the drive and begin the rebuild process. I thought it starts automatically, but it's been quite a while since I've been through this scenario.
How do I nudge the MSA1000 to rebuild the array back to RAID using the replaced drive?

I plan to restart the MSA1000 and attached server in a few hours. I hope that does the trick.
14 REPLIES
Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

I am going to need some help getting the logical drive to start rebuilding.
I have restarted the server and the MSA1000 - and tried two different new drives as replacements, and it just won't start the rebuild.

Physically inserting the new replacement drives shows they are hot added, and I can hear them spin up. The green hard disk light on the replaced drive does not stay lit, however - maybe because it's just not part of anything yet? It does blink orange when selected in ACU (and also blinks along with the others when the array is selected in ACU).

I can see the drive in ACU, it shows up in Box1 Bay 12 like it should, status "OK", and is unassigned. But I don't see anywhere where I can assign it to be a spare or anything.

Any other option to manipulate the logical drive gives a warning that data will be lost, so that scared me away from trying anything.
There are no action menus on the MSA1000 itself that I could find to start the rebuild.

Help!
Víctor Cespón
Honored Contributor

Re: MSA 1000 failed HDD replaced, rebuild not starting

What SCSI ID does the disk get?
What SCSI ID did the failed disk have?

They should be the same, but in most cases like this what happened was a corruption of the RIS: the MSA suddenly sees a disk as missing from the RAID, and that disk has a SCSI ID that does not exist. So even if you replace the disk, it will not get the SCSI ID that the controller is waiting for.
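
To make the explanation above concrete, here is a toy Python model of it. This is only a sketch of the reasoning in this reply, not the MSA1000 firmware's actual logic; the `DriveAddress` type, the `rebuild_target` function, and the example IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DriveAddress:
    """Hypothetical address the controller records for an array member."""
    port: int
    scsi_id: int


def rebuild_target(expected: DriveAddress,
                   present: List[DriveAddress]) -> Optional[DriveAddress]:
    """Return the drive a rebuild would start on, or None.

    Assumption for illustration: a rebuild starts only when a present
    drive has exactly the address recorded for the missing member.
    """
    for drive in present:
        if drive == expected:
            return drive
    return None


# Hypothetical IDs: the replacement sits at the address the metadata expects.
healthy = rebuild_target(DriveAddress(2, 12), [DriveAddress(2, 12)])

# Hypothetical IDs: corrupted metadata expects an address no drive can occupy,
# so the replacement is never picked up and stays unassigned.
stuck = rebuild_target(DriveAddress(2, 99), [DriveAddress(2, 12)])
```

In the second case no replacement disk can ever satisfy the match, which is the situation described in this reply.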
marsh_1
Honored Contributor

Re: MSA 1000 failed HDD replaced, rebuild not starting

hi,

i'm surprised you can't see spare management if you select the array (not the logical drive).
have you tried through the cli ? i take it this was not a raid 0 array ?

fwiw


Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Well, I don't see a "SCSI ID" anywhere. The failed drive was in Box 1 Bay 12, so is the replacement drive. The replacement drive is the same specs (capacity and all that), and as I said the MSA1000 does see it arrive into the enclosure.
I'm not sure where I would check (or change) the SCSI ID. "More Information" shows all the data about the replacement drive, but SCSI ID is not listed.

I'm surprised I don't see any options to add a spare as well. I suspect changes to the array are not allowed because it's running in "interim recovery mode". I did read that changes are not allowed while operating with a failed drive (it is RAID 5, btw).

I've never used the CLI, and I don't think I even have the cable. I've always just used ACU.

Do I need to dismount the volume from the host O/S, "delete" the array, and recreate it again? Although I'm warned about data loss, does that really delete the data on the array?

I've got to resolve this soon, as I'm unprotected now, if another drive fails I really will lose the data on it.
marsh_1
Honored Contributor

Re: MSA 1000 failed HDD replaced, rebuild not starting

hi,

there is always :-

backup - delete array - recreate array - restore

fwiw

Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Yeah, it's beginning to look that way. Not trivial, since it's quite a bit of data. Hopefully, the data will survive deleting and recreating the array - but if it isn't seen the same way as before, I doubt that will be the case.

It's disappointing the array didn't work as planned, and allow a hot rebuild without all of that. But at least thus far, no data has been lost.

Still, the question is open: is there a way to entice the MSA1000 to start the rebuild onto the replaced drive?
marsh_1
Honored Contributor

Re: MSA 1000 failed HDD replaced, rebuild not starting

hi,

you can use the acucli , it's not as friendly as the serial cli , but useable -

C:\Program Files\Compaq\Hpacucli\Bin\hpacucli.exe

type in help at the prompt to get the list of commands up. if you've got the acu utility user guide, there's a section in there with some examples.

fwiw
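
As a rough illustration of the tip above, the following Python sketch wraps a single read-only hpacucli query. It assumes the installed hpacucli version accepts `ctrl all show config` and that the executable lives at the path quoted in this reply; the function names are hypothetical, and nothing here changes the array configuration.

```python
import os
import subprocess

# Path quoted in the reply above; adjust if hpacucli is installed elsewhere.
HPACUCLI = r"C:\Program Files\Compaq\Hpacucli\Bin\hpacucli.exe"


def show_config_command(exe=HPACUCLI):
    """Build a read-only query listing controllers, arrays and drives.

    Assumption: this hpacucli version supports 'ctrl all show config'.
    """
    return [exe, "ctrl", "all", "show", "config"]


def run_show_config(exe=HPACUCLI):
    """Run the query and return its text output, or None if hpacucli is absent."""
    if not os.path.isfile(exe):
        return None
    result = subprocess.run(show_config_command(exe),
                            capture_output=True, text=True)
    return result.stdout
```

If `run_show_config()` returns `None`, the utility is not installed at that path on the host.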

Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Thanks for the Cli tip, I'll try that out also!
Greybeard
Esteemed Contributor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Hi, you don't say what size the disks are, but if they're all the same, just make the new disk a global hot spare; the degraded array should then grab it.
Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

They are all "73GB" disks, the new one is a match to the existing disks in the array.

How do I assign the replacement disk as a "global spare"? I don't see that option anywhere (in ACU anyway). That's pretty much my problem: I can't seem to change anything to get the rebuild to begin.

I'll be looking at the Cli options today, and at least one full backup of the array should be done by then, at least throwing a safety net out there in case it gets worse not better.
Johnny von Heimerall
Frequent Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Hi Joe,

Can you still access the LUN?

Best regards,
Johnny
Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Well, no Cli utility on the server. I didn't find any installed that I could use (it's not a Compaq server). I do have ACU, and I installed SANsurfer, but that just seems to deal with the host adapters and not the arrays attached.

I'm not sure what you mean by whether I can still access the LUN. The drive is still mounted and working on the server (Windows 2003), so it's obviously running off the RAID with the failed drive. ACU sees the array just fine and complains about the failed drive:

"#274 The current array controller has a bad or missing physical drive attached to Port 2: SCSI ID 12. To correct the problem, check the data and power connections to the physical drive."

"#272 The current array controller has a bad or missing drive. Logical drive 1 (RAID 5 in array A) is operating with reduced performance and a further physical drive failure may result in data loss" ... "... Configuration changes to this logical drive or any other logical drive in array A are not allowed until the problem is corrected. To correct the problem, check the data and power connections to the physical drives or replace the failed drive. For more information, run the Array Diagnostics Utility".

I don't have the "Array Diagnostics Utility".

I've replaced the drive - tried two different NEW drives, one at a time. When the old drive is in, it lights red as failed. When a new drive goes in, the panel on the MSA1000 indicates something akin to "drive hot added bay 12". The green hard disk light on the drive does not stay lit. In ACU, there is then a drive listed under MSA1000 Controller as "72.8 GB Parallel SCSI Unassigned Drive at Box 1:Bay12". I just can't seem to assign it to anything to get the rebuild to begin. Under that, in "Parallel SCSI Array A", there is an item "??? Parallel SCSI Drive at Port 2:SCSI ID 12", whose icon is a red "X" over the hard drive icon. The other "normal" drives show as "Bay 1" thru "Bay 13", skipping "Bay 12".

So the MSA1000 knows drive 12 failed, and sees the replacement drive arrive (when I removed the failed drive, the panel indicated that also). It just won't start the rebuild, which if memory serves me, should start automatically. Rebuild priority is set to medium btw.

How would you go about assigning the drive as a spare? I suspect I cannot, since no changes are allowed since the array is in "interim recovery mode", as indicated on the MSA panel as part of the startup sequence. Am I missing a key step in the process to cause the rebuild to begin? Right-clicking anything in ACU only brings up "More Information".

BTW, right-clicking the failed drive under the array shows its status as failed. Right-clicking the new drive at the top of the MSA1000 tree shows its status as "OK".

"Show physical view" and "show logical view" don't seem to add any options, nor have I found anything in the "Configuration Wizards" that doesn't talk about deleting data.

Once backups are done, I'm going to try updating the firmware to 5.20 (it's 4.24 right now), and if that doesn't work, try deleting and recreating the array - might lose the data, might not.

I'm still stumped as to why the rebuild didn't start automatically, and why I can't seem to start it myself. What good is RAID if you can't recover from a failed drive without losing all the data, which I assume I will if I delete and recreate the array?
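
The two ACU labels quoted in this post use different addressing schemes: Box/Bay for the new unassigned drive and Port/SCSI ID for the failed array member. A minimal Python sketch for pulling both out of the ACU text so they can be compared side by side follows; the regular expressions and the `parse_location` name are assumptions for illustration, based only on the strings quoted in this post.

```python
import re

# Labels copied from the ACU tree as quoted in this post.
ACU_LINES = [
    "72.8 GB Parallel SCSI Unassigned Drive at Box 1:Bay12",
    "??? Parallel SCSI Drive at Port 2:SCSI ID 12",
]

# Patterns are guesses that fit the quoted strings, not a documented format.
BOX_BAY = re.compile(r"Box\s*(\d+)\s*:\s*Bay\s*(\d+)")
PORT_ID = re.compile(r"Port\s*(\d+)\s*:\s*SCSI ID\s*(\d+)")


def parse_location(label):
    """Return (scheme, first_number, second_number), or None if no match."""
    match = BOX_BAY.search(label)
    if match:
        return ("box/bay", int(match.group(1)), int(match.group(2)))
    match = PORT_ID.search(label)
    if match:
        return ("port/id", int(match.group(1)), int(match.group(2)))
    return None


locations = [parse_location(line) for line in ACU_LINES]
```

Here the new drive parses as box 1, bay 12, while the failed member parses as port 2, SCSI ID 12, so the text alone cannot confirm that the controller treats the two as the same slot.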
Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Firmware flash to 5.30 didn't help.

I just deleted the array, and recreated it from scratch. No more playing around with it.

Thanks everyone for your advice!
Joe Conner
Occasional Advisor

Re: MSA 1000 failed HDD replaced, rebuild not starting

Closing thread. There was no resolution on how to get the array to rebuild.

It should have somehow, and not required a do-over to get it back to proper operation. But at least RAID functioned well enough I didn't lose the data.