MSA Storage

MSA1000 issue

 
meekrob
Super Advisor

MSA1000 issue

Hi All,

 

Recently we had a serious problem on one of our MSA1000 arrays: when we tried to balance the load onto its second controller, the operation failed. Please find the extracted logs attached. The strange thing is that the "Controller Status" field in that log is "OK"; however, on this right-hand controller all entries show "arrays status=failed" and "physicaldrives status=failed".

Any idea/suggestion on how to proceed with the troubleshooting steps?

 

Thanks in advance

6 REPLIES 6
John Kufrovich
Honored Contributor

Re: MSA1000 issue

 

Balance the load between controllers? The firmware this array is running is 4.48, which is an Active/Passive firmware. You need to install the A/A firmware, 7.20.

 

 

meekrob
Super Advisor

Re: MSA1000 issue

Hello,

 

Just to clarify: in this case we need neither an A/A configuration nor I/O balancing. A working Active/Passive configuration on the 4.48 firmware suits us, one in which the Passive controller takes over the I/O load when the Active controller fails. That is not what happened here.

As you can see, we simulated a failure of the Active controller (the left one) by disabling and disconnecting it. The right controller did not take over the I/O load, and it seems to have no connection to the disk drives: the logs show the status of all physical disks and arrays as "FAILED".

 

Any suggestion / proposal to continue troubleshooting will be welcomed.

 

Thanks in advance

meekrob
Super Advisor

Re: MSA1000 issue

In addition, just to clarify what was done on the array:

 

We tried to manually fail the Active controller, which is the left one (in slot #2), and then waited for the controller in slot #1 to become Active, but that never happened.

We also saw the messages below on the LCD panel (note, however, that no critical LEDs were lit on the controllers):

 

42 Redundancy Active /Standby controller
01 MSA 1000 StartUp complete
43 Redundancy failed Hardware Failure
510 Initializing Fibre Subsystem
518 Persistent MEM Enabled
22 Initializing SCSI Devices
21 Scanning for SCSI Devices
40 Initializing Redundancy support
20 Initializing SCSI Subsystem
62 Cache Module #1 256 MB
500 Initializing PCI subsystem
00 Array FIRMWARE 4.48.B342
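
For anyone going through a longer LCD history than the one above, here is a minimal Python sketch that splits each line into its numeric code and message text and picks out the entries that mention a failure. The function names and the keyword matching are assumptions made for illustration; they are not part of any HP tool, and the sample contains only a few of the lines quoted in this post.

```python
import re

# One LCD line = numeric code, whitespace, message text.
LINE_RE = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")

def parse_lcd(text):
    """Return (code, message) pairs; codes stay strings so '00' keeps its zeros."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries

def find_failures(entries):
    """Keep entries whose message mentions 'failed' or 'failure' (simple keyword check)."""
    keywords = ("failed", "failure")
    return [(code, msg) for code, msg in entries
            if any(word in msg.lower() for word in keywords)]

# A few of the lines quoted in the post above.
SAMPLE = """\
01 MSA 1000 StartUp complete
43 Redundancy failed Hardware Failure
40 Initializing Redundancy support
00 Array FIRMWARE 4.48.B342
"""

for code, msg in find_failures(parse_lcd(SAMPLE)):
    print(code, msg)  # prints: 43 Redundancy failed Hardware Failure
```

On this sample the only flagged entry is code 43, which is the message that stands out in the list above.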

 

Any suggestion would be very helpful.

 

Thanks in advance

 

 

 

 

 

John Kufrovich
Honored Contributor

Re: MSA1000 issue

 

You need to look at your multipath software. Once the host multipath software detects that a path to the Active controller is down, it disables the entire path to that controller. When you pulled the Active controller, the redundant controller took ownership of the LUNs. The multipath software then sends a "start unit" command to the redundant controller's LUNs, activating host/LUN I/O and completing the failover.
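
The sequence described above can be sketched as a toy Python model. It is only an illustration of the steps as stated in this reply (path detected down, path disabled, "start unit" sent to the standby controller's LUNs); the class names, fields, and the `multipath` flag are all hypothetical and do not correspond to any real driver or firmware interface.

```python
from dataclasses import dataclass, field

@dataclass
class Controller:
    name: str
    reachable: bool = True      # host still has a working path to it
    luns_started: bool = False  # True once its LUNs accept host I/O

@dataclass
class Host:
    active: Controller
    standby: Controller
    multipath: bool = True      # hypothetical flag: is multipath software present?
    log: list = field(default_factory=list)

    def send_io(self):
        """Return the name of the controller that serviced the I/O."""
        if self.active.reachable and self.active.luns_started:
            return self.active.name
        if not self.multipath:
            # Nothing on the host reacts to the dead path, so I/O simply fails.
            raise OSError("path down and no multipath software to fail over")
        # Step 1: path to the Active controller is detected down and disabled.
        self.log.append(f"path to {self.active.name} down, disabling it")
        if not self.standby.reachable:
            raise OSError("no usable path to either controller")
        # Step 2: 'start unit' is sent to the standby controller's LUNs.
        self.standby.luns_started = True
        self.log.append(f"start unit sent to {self.standby.name}")
        # Step 3: the standby becomes the controller used for host I/O.
        self.active, self.standby = self.standby, self.active
        return self.active.name

left = Controller("left", luns_started=True)
right = Controller("right")
host = Host(active=left, standby=right)
left.reachable = False          # simulate pulling the Active controller
print(host.send_io(), host.log)
```

With `multipath=False` the same call raises instead of switching controllers, which is the behaviour this model assumes for a host that has no multipath layer at all.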

 

 

 

 

meekrob
Super Advisor

Re: MSA1000 issue

Hello,

 

After checking, unfortunately no multipathing software is installed on either of the host servers, which are Linux servers, and the existing cluster is not configured with multipath devices.

Any idea / hint on how to proceed with troubleshooting in that case?

 

Thanks in advance

 

meekrob
Super Advisor

Re: MSA1000 issue

Hello,

Could you please advise how to configure multipathing in our situation, given that the Linux cluster was originally configured without it?

One more time, thanks for your help
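
As a first step for the multipathing question above, a small Python check like the following can confirm on each Linux host whether the usual device-mapper multipath commands (`multipath` and `multipathd`) are present at all. This is only a sketch: the helper name is made up for illustration, it looks only at the current `PATH` (which for a non-root user may not include the sbin directories), and it says nothing about how the MSA1000 itself should be configured.

```python
import shutil

# Command names normally shipped by the device-mapper multipath packages.
TOOLS = ("multipath", "multipathd")

def missing_tools(which=shutil.which, tools=TOOLS):
    """Return the tool names that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]

missing = missing_tools()
if missing:
    print("not found on PATH:", ", ".join(missing))
else:
    print("multipath tools are installed")
```

The `which` parameter exists only so the lookup can be replaced in a test; in normal use the defaults are enough.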