
Which one causes Link Loss Failover to kick in?

 
chuckk281

Some failover questions from Chad:

 

******************

 

Which check in the OA causes Link Loss Failover to kick in? Is it when one or both of these messages state Down?

 

Kernel: Network link is up at 1000Mbps - Full Duplex

OA: Network link to gateway is up

 

And a follow-up: if network flooding is detected by an OA and the Active and Standby OA ports are both down, what happens then? Are you forced to reboot one of the OAs so that the other checks whether it is Active and re-enables the port?

 

****************

 

Reply from Monty:

 

******************

 

The OA Link Loss Failover (LLF) feature is based on the link state of each OA as reported by the OA CLI “show oa network” command.

 

The standby OA forces a failover if it has an active link and the active OA has been without an active link for the configurable “Link Loss Interval”. The two OA modules exchange this link state information over the internal serial connection between them.

 

If neither the active nor the standby OA module has a link, no failover is triggered.

 

Link loss failover requires that the OA redundancy status (reported in the enclosure status) be OK, so at that time both OA modules must have the same firmware version and must be able to communicate with each other. This includes synchronization of the active OA configuration.

 

Rebooting an active OA module does not by itself force a link loss failover. A failover occurs only if, after the reboot, the active OA has no link for the configured link loss interval while the standby OA has a link.
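
For readers who prefer to see the conditions in one place, the rules described above can be sketched roughly as follows. This is only an illustration of the decision as described in this reply, not the OA firmware logic. The names `OAState` and `standby_should_force_failover` are hypothetical, and treating the interval comparison as "greater than or equal" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OAState:
    """Hypothetical snapshot of what the standby OA knows (illustrative names)."""
    active_has_link: bool        # link state of the active OA
    standby_has_link: bool       # link state of the standby OA
    redundancy_ok: bool          # same firmware, modules communicating, config synced
    active_link_down_secs: int   # how long the active OA has been without a link


def standby_should_force_failover(
    state: OAState,
    link_loss_interval_secs: int,
    llf_enabled: bool = True,
) -> bool:
    """Return True when the conditions described in the reply are all met."""
    # LLF must be enabled and OA redundancy status must be OK.
    if not llf_enabled or not state.redundancy_ok:
        return False
    # The standby needs a link of its own; if both are down, nothing happens.
    if not state.standby_has_link:
        return False
    # An active OA that still has a link is never failed over by LLF.
    if state.active_has_link:
        return False
    # Assumption: "without a link for the interval" is read as >= the interval.
    return state.active_link_down_secs >= link_loss_interval_secs
```

For example, `standby_should_force_failover(OAState(False, False, True, 300), 60)` returns `False`, which matches the case where neither module has a link.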

 

There are no OA syslog messages for the link loss itself, but if LLF is enabled and the standby determines that it needs to fail over, the standby issues the message "Active Onboard Administrator has lost link connectivity on the external NIC for %d seconds. Forcing take over".
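
If you collect the OA syslog somewhere and want to spot these takeovers, a small filter along these lines might help. It is a minimal sketch that assumes the message text appears exactly as quoted above, with `%d` replaced by a whole number of seconds. The function name `llf_takeover_seconds` and the sample line prefix are made up for illustration.

```python
import re
from typing import Optional

# Pattern built from the message quoted in the reply; %d becomes (\d+).
LLF_TAKEOVER = re.compile(
    r"Active Onboard Administrator has lost link connectivity "
    r"on the external NIC for (\d+) seconds\. Forcing take over"
)


def llf_takeover_seconds(line: str) -> Optional[int]:
    """Return the reported link-loss duration, or None if the line is unrelated."""
    match = LLF_TAKEOVER.search(line)
    return int(match.group(1)) if match else None


# Hypothetical syslog line; the timestamp and host prefix are invented.
sample = (
    "Jan  1 00:00:00 OA-EXAMPLE: Active Onboard Administrator has lost link "
    "connectivity on the external NIC for 60 seconds. Forcing take over"
)
print(llf_takeover_seconds(sample))
```

Lines that do not contain the takeover message simply return `None`, so the function can be run over a whole log file.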

 

Hope that helps,

 

******************

 

Other comments or questions?