Array Setup and Networking

support27
Occasional Contributor

How does failover of controllers work?

Hi there, I have a standard 1Gb network configuration with eth1 and eth2 as management only and eth3-6 as data only.

The management ports and data ports on each controller are split across two separate switches (S1 and S2), as documented in the Nimble networking best practices.

I was wondering what triggers a failover of the controller?

There are a few scenarios I'm considering:

  • If S1 fails, eth1, eth3 and eth5 lose their connections, but we still have connections to eth2, eth4 and eth6. So I guess no failover occurs, because the controller still has connectivity, only in a degraded state? And the same if S2 fails, but with eth2, eth4 and eth6?

If an individual port either on the switch or controller failed then what would happen?

  • In the case of the management network, does failover rely on the availability of the group management IP address, which 'floats' between the 2 management ports, so that failover occurs if that address becomes unavailable?
  • In the case of the data network, if one or more ports are unavailable on a controller then what happens?

I am not in a position to test failover scenarios at present but would like to know the theory behind failover.

jrich52352
Trusted Contributor

Re: How does failover of controllers work?

Hopefully support or an engineer will answer this, but based on testing, a controller only fails over if there is a complete outage on that controller. So if you lose one data/mgmt port, it will stay put.

mgram128
Valued Contributor
Solution

Re: How does failover of controllers work?

Justin is correct. There are other conditions that will trigger a controller failover, but from a network perspective, it happens because a network (management or data) that is defined on any interface is no longer able to communicate with other devices on that network. So, the loss of a single interface on a controller will not trigger a failover unless there are no other available paths on that controller for the related network.
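
The rule described above can be sketched in a few lines of Python. This is only a toy model of the behaviour as stated in this thread, not Nimble's actual logic; the function name `network_failover_needed` and the dictionary layout are made up for illustration.

```python
# Hypothetical model of the rule described in this thread; not vendor code.
def network_failover_needed(interfaces):
    """interfaces maps interface name -> (network name, link_up).

    Returns True if at least one network has no working interface left
    on this controller, which is the condition described for a failover.
    """
    paths = {}
    for _name, (network, link_up) in interfaces.items():
        paths.setdefault(network, []).append(link_up)
    # A network with no path that is still up triggers the failover.
    return any(not any(states) for states in paths.values())


controller_a = {
    "eth1": ("management", True),
    "eth2": ("management", True),
    "eth3": ("data", False),  # single failed data port
    "eth4": ("data", True),
    "eth5": ("data", True),
    "eth6": ("data", True),
}
print(network_failover_needed(controller_a))
```

With only eth3 down, the data network still has three working paths, so the function reports that no failover is needed.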

support27
Occasional Contributor

Re: How does failover of controllers work?

Thanks for the responses and clarification guys!

tcrowley96
New Member

Re: How does failover of controllers work?

I have the same network config as SA vault has.

If I disconnect the cable from eth3 on controller A, should the traffic fail over to eth3 on controller B while eth4-6 stay on controller A?

Thanks

Tim

support27
Occasional Contributor

Re: How does failover of controllers work?

Hi Tim,


I summarized it as follows and tested accordingly:


Switch failure:


  • If switch S1 fails then eth1, eth3 and eth5 lose connectivity, however there are still connections to eth2, eth4 and eth6 from the controller so no failover occurs, as there is still connectivity from the controller to both networks (management and data) but in a degraded state. (And the same if S2 fails but with eth2, eth4 and eth6 losing connections.)

If an individual port either on the switch or controller fails:

  • In the case of the management network, failover relies on the availability of the group management IP address, which 'floats' between the 2 management ports on the controller. If the group management address becomes unavailable then failover occurs, so both management ports would need to be unavailable for the management network to fail over to the other controller.
  • Likewise, in the case of the data network, if one or more ports are unavailable on a controller then no failover occurs unless all data ports are unavailable.


If a controller fails then of course all connections fail over.

Also Mitch's answer above clarifies what happens when interfaces fail on either management or data network:
"the loss of a single interface on a controller will not trigger a failover unless there are no other available paths on that controller for the related network."
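
To make the scenarios above easier to check, here is a small Python sketch that walks through them. The cabling (odd ports on S1, even ports on S2) is inferred from the switch-failure description in this thread, and the names `PORTS`, `surviving_ports` and `expect_failover` are hypothetical. It models the summary above, not the array's real behaviour.

```python
# Assumed cabling, inferred from the thread: odd ports on S1, even ports on S2.
PORTS = {
    "eth1": ("management", "S1"),
    "eth2": ("management", "S2"),
    "eth3": ("data", "S1"),
    "eth4": ("data", "S2"),
    "eth5": ("data", "S1"),
    "eth6": ("data", "S2"),
}


def surviving_ports(failed_switches=(), failed_ports=()):
    """Return the ports on the active controller that still have a path."""
    return {
        port
        for port, (_network, switch) in PORTS.items()
        if switch not in failed_switches and port not in failed_ports
    }


def expect_failover(failed_switches=(), failed_ports=()):
    """True if some network has no surviving port on the active controller."""
    alive = surviving_ports(failed_switches, failed_ports)
    networks = {network for network, _switch in PORTS.values()}
    return any(
        not any(PORTS[port][0] == network for port in alive)
        for network in networks
    )


scenarios = {
    "switch S1 fails": dict(failed_switches=("S1",)),
    "eth3 fails": dict(failed_ports=("eth3",)),
    "both management ports fail": dict(failed_ports=("eth1", "eth2")),
    "all data ports fail": dict(failed_ports=("eth3", "eth4", "eth5", "eth6")),
}
for name, failure in scenarios.items():
    print(f"{name}: failover expected = {expect_failover(**failure)}")
```

Under these assumptions the first two scenarios report no failover (degraded but still connected), and the last two report a failover because one network has lost every port.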

vladv106
Advisor

Re: How does failover of controllers work?

I would like to add some info to the above after we did some failover tests.

A switch misconfiguration (e.g. wrong VLAN) on the data paths will not trigger a failover. We found that if the storage paths are up but cannot be reached, the array will not fail over. We waited for 80 seconds.

Later edit: However, having one data path fail (NIC down) on a controller will trigger a failover even though there is still one data path available. "Attempting failover to active role because controller B has better IP network connectivity."

Software version: 2.3.9.0-296119-opt
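
One possible reading of these two observations is that the controllers compare link state rather than reachability, and the standby takes over when it has more healthy links than the active controller. The Python sketch below models that reading only. It is a guess based on the log message wording, not documented behaviour, and the names `count_healthy` and `standby_should_take_over` are hypothetical.

```python
# Hypothetical model: compares link state only, as one reading of the log message.
def count_healthy(links):
    """links maps interface name -> link_up (bool).

    Only link state is counted, so a port that is up but sits on the
    wrong VLAN still counts as healthy in this model.
    """
    return sum(1 for up in links.values() if up)


def standby_should_take_over(active_links, standby_links):
    """True if the standby controller has strictly more healthy links."""
    return count_healthy(standby_links) > count_healthy(active_links)


# Two data paths per controller, as in the test described above.
nic_down = standby_should_take_over(
    active_links={"eth3": False, "eth4": True},   # one NIC down on A
    standby_links={"eth3": True, "eth4": True},   # B fully connected
)
wrong_vlan = standby_should_take_over(
    active_links={"eth3": True, "eth4": True},    # links up, but unreachable
    standby_links={"eth3": True, "eth4": True},
)
print(nic_down, wrong_vlan)
```

In this model the NIC-down case favours the standby controller, while the wrong-VLAN case does not, which matches both test results reported here. Whether the array really works this way would need confirmation from support.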