HPE EVA Storage

Critical problem : MSA1500 CS

 
RL-SHADY
New Member


MSA1500 CS Controller
The current array controller has failed logical drives:
1. Logical drive 1 (RAID 5 in array A)
2. Logical drive 2 (RAID 5 in array B)
3. Logical drive 3 (RAID 6 (ADG) in array C)
4. Logical drive 1 (RAID 6 (ADG) in array D)
On 1/1/2010, a problem occurred in the main server and the array.
For such cases we keep a spare server ready. While checking the spare server, we discovered that it cannot access either the Online or the Nearline storage.

When we checked the storage in the Array Configuration Utility, the logical drives appeared with a red X.

When we checked the array HDD status in HP System Management, we found that the HDD at (Port3, Drive 10) had failed. We replaced it with a new one and the status changed to OK, but the HDD is still off and no rebuild process has started.

In conclusion, there is data stored in the (Online) array system that is synchronized with the data in the (Nearline) system. The strange part is that when one HDD failed in one array system, both arrays failed and we were not able to access the data.
3 REPLIES

Re: Critical problem : MSA1500 CS

From what it seems, whatever problem occurred in the system caused several drives to disconnect at the same time, exceeding the RAID fault tolerance and causing the failed LUNs. This can happen during a power outage or a similar issue. As long as the units are failed, you won't be able to access them and a rebuild will not take place.
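To illustrate what "exceeding the RAID fault tolerance" means, here is a minimal sketch. It only relies on the general facts that RAID 5 survives one missing drive and RAID 6 (ADG) survives two. The function name, the state names, and the scenario of three drives dropping out at once are made up for illustration and are not what the controller or the ACU actually reports.

```python
# Number of simultaneous drive losses each RAID level can survive.
FAULT_TOLERANCE = {"RAID 5": 1, "RAID 6 (ADG)": 2}


def logical_drive_state(raid_level: str, missing_drives: int) -> str:
    """Return an illustrative state name (not real ACU output)."""
    tolerance = FAULT_TOLERANCE[raid_level]
    if missing_drives == 0:
        return "OK"
    if missing_drives <= tolerance:
        # Still readable; a rebuild can run once the drive is replaced.
        return "DEGRADED"
    # Fault tolerance exceeded: no access, and no rebuild will start.
    return "FAILED"


# Hypothetical scenario: a power event makes 3 drives drop out at once.
for level in FAULT_TOLERANCE:
    print(level, "->", logical_drive_state(level, 3))
```

Under this model, a single failed HDD alone would leave a RAID 5 or RAID 6 volume degraded but still accessible, so failed volumes on several arrays point to more drives having been lost at the same moment.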

You have probably done this already, but the first thing to do is a full power cycle of the system, including the disk enclosures. Then confirm that there are no other warnings in the array; the array might still show the failed volumes and the failed disk that you already identified.

You should confirm there is a valid backup of the data just in case, but if the situation above is what happened and the drives are all back now, with no problem other than the failed LUNs, you should be able to re-enable them and get your data back.

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=415598&prodTypeId=12169&prodSeriesId=415598&objectID=c01204574
RL-SHADY
New Member

Re: Critical problem : MSA1500 CS

Many thanks,

but what are the chances that the data will not be lost if I click REENABLE?

Re: Critical problem : MSA1500 CS

I have seen this happen a few times, and when the problem was a power loss with only one server using the volume, it has worked with no problems almost 100% of the time: you re-enable it, then go to the OS, and the volume is usable with the file system intact. Remember that if this is what happened, the data remains on the disks; re-enabling the LUN lets the controller attempt to read the pointers to the information, so if the data is usable, it will be there. There are multiple reasons why it can fail; the ones I remember off the top of my head:

- Dynamic disks are used in Windows, or another type of software RAID, which could also be corrupted due to an improper shutdown.

- The volume is not presented to the same server that had it when it failed.

- The volume was accessed by different servers in a non-clustered environment.

- Multiple HDD failures in the array.

- Not only a power loss, but also a power surge or a hardware problem caused the volume to fail.

Remember that if the data does not become available and there is no other way to get the data back, do not format or re-initialize the volume. At that point you can still send the drives to a data recovery facility; if you reformat the unit, the data will be lost and a restore from backup will be needed.