Good drives, but MSA1500 kills unit "VOLUME FAILED (Media_exchanged)" after power outage+PSU failure

SOLVED
Gunther S.
Occasional Advisor

I have an MSA1500 with 4 MSA30 shelves; one unit is a RAID 5 using all 12 disks on a single shelf. I would never do this again, BUT ...

We had a power outage here and the UPS could not hold for long enough. When power came back on, one of the shelves did not power up because of a failed PSU, but the MSA1500 did start. When it tried to access this shelf, it found that all drives had failed and declared the unit dead. BUT in truth the unit is not dead; the shelf just didn't power up!

Now the status is:

Unit 2:
In PDLA mode, Unit 2 is Lun 3; In VSA mode, Unit 2 is Lun 2.
Unit Identifier :
Device Identifier : 600508B3-0091FB70-E907EF7D-B1C00013
Preferred Path : Controller 2 (other controller)
Cache Status : Enabled
Max Boot Partition: Enabled
Volume Status : VOLUME FAILED (Media_exchanged)
Parity Init Status: waiting for first write
12 Data Disk(s) used by lun 2:
Disk108: Box 1, Bay 08, (B:T:L 0:09:00) REPLACED MARKED OK
Disk109: Box 1, Bay 09, (B:T:L 0:10:00) REPLACED MARKED OK
Disk110: Box 1, Bay 10, (B:T:L 0:11:00) REPLACED MARKED OK
Disk111: Box 1, Bay 11, (B:T:L 0:12:00) REPLACED MARKED OK
Disk112: Box 1, Bay 12, (B:T:L 0:13:00) REPLACED MARKED OK
Disk101: Box 1, Bay 01, (B:T:L 0:00:00) REPLACED MARKED OK
Disk102: Box 1, Bay 02, (B:T:L 0:01:00) REPLACED MARKED OK
Disk103: Box 1, Bay 03, (B:T:L 0:02:00) REPLACED MARKED OK
Disk104: Box 1, Bay 04, (B:T:L 0:03:00) REPLACED MARKED OK
Disk105: Box 1, Bay 05, (B:T:L 0:04:00) REPLACED MARKED OK
Disk106: Box 1, Bay 06, (B:T:L 0:05:00) REPLACED MARKED OK
Disk107: Box 1, Bay 07, (B:T:L 0:08:00) REPLACED MARKED OK
Spare Disk(s) used by lun 2:
Disk201: Box 2, Bay 01, (B:T:L 1:00:00)
Logical Volume Raid Level: DISTRIBUTED PARITY FAULT TOLERANCE (RAID 5)
stripe_size=64kB
Logical Volume Capacity : 764,107MB

And I ask myself: don't I have any option to get this volume back? The whole bank of disks is just fine now that it has power. There has got to be a way of getting the MSA head to give it another chance. I have a backup on tape, but still, I think it's silly to kill a perfectly fine unit just because the shelf powered up too late.

Thanks much for the advice!

-Gunther

9 REPLIES
gregersenj
Honored Contributor

In most cases you can get it back very easily :)

When a Smart Array suffers from multiple disk failures, it disables the logical drive.
Once all disks are back online, all you need to do is enable the logical drive again.
You can do that from the ACU.
I can't remember exactly how, but it's fairly straightforward.

I had to do it myself more than a year ago.
No data loss at all :)


BR
/jag
Gunther S.
Occasional Advisor

Thank you, Gregersenj! This makes me hopeful. However, I can only use the CLI to configure the array, and I am sure that any tool like the ACU ends up using the CLI to carry out its actions anyway. So if you or anyone could point me to the trick to re-enable the unit, I'd be ecstatic! Thanks.
gregersenj
Honored Contributor
Solution

Page 30:
http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c01183955/c01183955.pdf

Recognizing a failed unit.

I believe that is the answer.

I have never used the CLI myself.

BR
/jag
Clarete Riana
Valued Contributor

Yes, as pointed out in the document, "ACCEPT UNIT #" is the command, where # is the unit/LUN number. If you have multiple LUNs that you want to enable, use "ACCEPT UNITS" instead. This resets the status of all failed units and does not take a unit number.
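
As a rough sketch of what this could look like for the unit in this thread (Unit 2): the `CLI>` prompt and the use of SHOW UNIT 2 to display the status listing from the first post are assumptions on my part and are not confirmed anywhere in this thread, so please check them against the CLI reference guide before running anything.

```text
CLI> SHOW UNIT 2
CLI> ACCEPT UNIT 2
CLI> SHOW UNIT 2
```

The idea is to look at the unit first and confirm that every member disk is listed as OK while the volume is still marked "VOLUME FAILED (Media_exchanged)", then accept the unit, then look again to see whether the failed status has cleared. It is probably wise to do this only once the shelf is fully powered and all 12 disks are visible again.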
Gunther S.
Occasional Advisor

Thank you, thank you, thank you!!!

This worked.

I am glad this subject will now be indexed on Google by the exact error message.

kind regards,
-Gunther
Gunther S.
Occasional Advisor

Closing this thread happily with gratitude to the respondents.
Clarete Riana
Valued Contributor

You could also assign points to the responses if the suggestions fixed your problem.
Gunther S.
Occasional Advisor

Yes, I assigned the points after I closed the thread, of course. Thanks again.
gregersenj
Honored Contributor

Glad you got it fixed.

BR
/jag