Disk Enclosures

MSA1000 disk fail caused Device/Pool deactivation

 
Cameron Todd
Regular Advisor

I have recently installed and built an entry-level SAN using several DL360 G4 servers (with QLogic HBAs) connected to a new MSA1000 (SAN switch 2/8 + two MSA30s).

The servers run NetWare OES (V6.5 SP3), and the firmware on the MSA1000 was updated to the latest version (FabricOS v3.2.0a, MSA V4.48).

One of the U320 146GB drives failed last night, yet despite the MSA selecting a hot spare and starting an array rebuild as would be expected, every Pool and Volume on the NetWare server deactivated with "device failure" messages.

I was under the impression that the point of having a RAID array was that a drive failure would be seamlessly repaired and that functionality would not be impaired (only slowed a little depending on the priority of the Rebuild setting).
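
To illustrate the behaviour I expected, here is a minimal sketch of how RAID 5 parity lets an array keep serving data after a single drive is lost. It is a simplified illustration only and not the MSA1000 controller's actual implementation; the four-drive stripe, the block contents, and the helper name xor_blocks are all assumptions made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive RAID 5 set: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1]. The controller can still answer
# a read for that block by XORing the surviving data blocks with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

In principle the same reconstruction is what a rebuild onto a hot spare performs stripe by stripe, which is why I expected reads to continue (more slowly) rather than fail.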

I do not understand why Pools residing on totally separate arrays (I have defined 4 separate RAID5 arrays across the MSA cabinets) also failed, nor why the server had to be power cycled in order to allow any volumes at all to be seen and mounted by clients.

There is no redundancy built into the SAN infrastructure (no secondary SAN switch or duplexed fibres and HBAs). Even if there were, I assume this device failure would still have occurred: the problem appears to be that the MSA controller failed a low-level NetWare OS disk access request when the drive failed, rather than responding with the requested data while repairing the fault in the background.

Any ideas on what is happening here?

Is this a configuration error or a fault in the MSA?
1 REPLY
Cameron Todd
Regular Advisor

Re: MSA1000 disk fail caused Device/Pool deactivation

No replies.
Moving to Storage Area Network (SAN) list as a better match for the problem.