
Reboot of MSCS node causes MSA1500 controller failover

 
Allen Baer
Occasional Visitor


I have an MSCS SQL Server cluster consisting of two DL580 G3s connected to an MSA1500CS for shared storage. I can fail over the cluster from within the OS with no problems. But the first time I reboot a node after it has been the active cluster node, the server hangs at the Windows splash screen, and this causes an array controller failover on the MSA. I then need to reset the server to get it to come up. After this first reboot I can reboot that server with no issues, but if I then reboot the other node after it has been the active MSCS node, the issue comes back. Any time I reboot a node that had been the active node, the MSA has an array controller failover.

Has anyone seen anything like this?
4 REPLIES
Pieter 't Hart
Honored Contributor

Re: Reboot of MSCS node causes MSA1500 controller failover

There are many things to check.

With an active/passive array controller configuration, it could be that one of the hosts uses controller-1 and the other uses controller-2.
If you have multiple connections from each host to the controllers, you can try swapping the two connections so that both hosts use controller-1 at boot time.

It could be that the application in use is not cluster-aware, so it tries to access resources that are now active on the other node.

Check your cluster groups and the dependencies of the resources. A misconfiguration here can lead to similar problems.

You need a cluster group containing the cluster resources (quorum disk, cluster name, cluster address) and one or more application groups.

An application should only use resources in its own group, so not(!) the quorum disk, and don't(!) use the cluster name or address to share an application. Instead, create additional names/addresses in the respective application group.

Another cause can be that one of the volumes on the array is not configured as a cluster resource, so it cannot be shared.
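
The group rule above (an application only depends on resources in its own group) can be checked on paper or with a small script. Here is a minimal Python sketch, assuming a hand-written inventory of resources. The group names, resource names, and the function name are hypothetical; nothing here is read from a real cluster.

```python
# Hypothetical inventory: resource name -> (group it belongs to, resources it depends on).
# All names are illustrative only; fill them in from your own cluster configuration.
RESOURCES = {
    "Quorum Disk":      ("Cluster Group", []),
    "Cluster IP":       ("Cluster Group", []),
    "Cluster Name":     ("Cluster Group", ["Cluster IP"]),
    "SQL Disk":         ("SQL Group", []),
    "SQL IP":           ("SQL Group", []),
    "SQL Network Name": ("SQL Group", ["SQL IP"]),
    "SQL Server":       ("SQL Group", ["SQL Disk", "SQL Network Name"]),
}


def cross_group_dependencies(resources):
    """Return (resource, dependency) pairs where the dependency is
    missing from the inventory or lives in a different group."""
    problems = []
    for name, (group, deps) in resources.items():
        for dep in deps:
            if dep not in resources or resources[dep][0] != group:
                problems.append((name, dep))
    return problems


if __name__ == "__main__":
    for name, dep in cross_group_dependencies(RESOURCES):
        print(f"{name} depends on {dep}, which is not in its own group")
```

With the inventory as written, nothing is reported. If an application resource were made to depend on the quorum disk or the cluster name, it would show up in the returned list.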
James Muell
Valued Contributor

Re: Reboot of MSCS node causes MSA1500 controller failover

This could also be due to incorrect host cabling or no multipathing. We always configure node-A with HBA-0 in the lowest slot going to sw1 and HBA-1 in the next slot going to sw2, and always have MSA controller 1 going to sw1. If you take a laptop, connect it to the RJ-45 jack on the right controller, and open HyperTerminal using 19200,1 none 1 none, you can run the command show version -all and review the output.
You should also be using the ACU's SSP (Selective Storage Presentation).
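
The cabling convention above can be written down and sanity-checked before touching any cables. This is a rough Python sketch under stated assumptions: the inventory is typed in by hand, the node-B entries and the controller-2 to sw2 link are my assumption (the post only states node-A and controller 1), and the function name is made up for illustration.

```python
# Hypothetical cabling inventory: (device, port) -> switch it is cabled to.
# node-A and controller-1 follow the post; the other rows are assumed.
CABLING = {
    ("node-A", "HBA-0"): "sw1",
    ("node-A", "HBA-1"): "sw2",
    ("node-B", "HBA-0"): "sw1",
    ("node-B", "HBA-1"): "sw2",
    ("MSA", "controller-1"): "sw1",
    ("MSA", "controller-2"): "sw2",
}


def devices_missing_a_switch(cabling, switches=("sw1", "sw2")):
    """Return {device: sorted list of switches that device has no cable to}."""
    reached = {}
    for (device, _port), switch in cabling.items():
        reached.setdefault(device, set()).add(switch)
    return {
        device: sorted(set(switches) - seen)
        for device, seen in reached.items()
        if set(switches) - seen
    }


if __name__ == "__main__":
    print(devices_missing_a_switch(CABLING))
```

An empty result means every device has at least one cable to each switch. It says nothing about whether multipathing software is installed or working on the hosts.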
James Muell
Valued Contributor

Re: Reboot of MSCS node causes MSA1500 controller failover

That should be HyperTerminal at 19200 baud, 8 data bits, no parity, 1 stop bit, no handshake.
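
If you would rather script the console session than use HyperTerminal, the same settings can be written as a small Python dictionary. This is only a sketch: the variable and function names are made up, and the idea of passing the dictionary to the third-party pyserial package, along with the port name COM1, is my assumption and not something from this thread.

```python
# Serial settings from the post above:
# 19200 baud, 8 data bits, no parity, 1 stop bit, no handshake.
CONSOLE_SETTINGS = {
    "baudrate": 19200,
    "bytesize": 8,
    "parity": "N",      # N = no parity
    "stopbits": 1,
    "xonxoff": False,   # no software flow control
    "rtscts": False,    # no hardware flow control
}


def describe(settings):
    """Format the settings in the usual shorthand: baud rate, then data bits/parity/stop bits."""
    return (
        f"{settings['baudrate']} "
        f"{settings['bytesize']}{settings['parity']}{settings['stopbits']}"
    )


if __name__ == "__main__":
    print(describe(CONSOLE_SETTINGS))
    # Assumption: with pyserial installed, the keys above match the keyword
    # arguments of serial.Serial, so a session could be opened with
    #   serial.Serial("COM1", **CONSOLE_SETTINGS)
    # where "COM1" is a placeholder for your laptop's serial port.
```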
Ganesh chichakar
Occasional Visitor

Re: Reboot of MSCS node causes MSA1500 controller failover

Hi Allen

Please upload the "show tech_support" logs.