BladeSystem Management Software

Onboard Administrator / Integrated Lights-Out (iLO) network ports disabled by network packet flooding?

 
chuckk281


Tristan had an OA question:

 

**************

 

Hi,

 

A customer lost OA and iLO access in a c7000 after ‘Network packet flooding’ messages were logged in the syslog.

The syslog was collected via the serial port. 

Would the OA ultimately disable the management port if the network packet flooding recurs?

 

**********************

 

Monty replied:

 

*********************

 

These OA syslog messages are just warnings – no other action is taken by the OA firmware on any OA version.

 

The OA generates this syslog message when the rate of incoming network packets exceeds several thousand per second.
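
As a rough illustration of a per-second packet-rate check, here is a minimal Python sketch. It is not the OA firmware's logic, which is not described here. The threshold of 5000 and the function name are assumptions made up for the example, since the only figure given is "several thousand per second". The input is a list of packet arrival timestamps in seconds, for example taken from a capture on the switch port.

```python
import math
from collections import Counter

# Hypothetical threshold; the reply only says "several thousand per second".
FLOOD_THRESHOLD_PPS = 5000


def flooded_seconds(packet_timestamps, threshold=FLOOD_THRESHOLD_PPS):
    """Return the whole seconds in which the packet count exceeded the threshold."""
    # Bucket each arrival timestamp into its one-second interval.
    per_second = Counter(math.floor(ts) for ts in packet_timestamps)
    return sorted(sec for sec, count in per_second.items() if count > threshold)


if __name__ == "__main__":
    # 6000 packets inside second 0, then 100 packets inside second 1.
    burst = [i * 0.0001 for i in range(6000)]
    quiet = [1.0 + i * 0.01 for i in range(100)]
    print(flooded_seconds(burst + quiet))
```

Only the seconds that exceed the threshold are reported, which matches the idea of a warning that fires on rate alone and takes no other action.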

 

These messages typically appear when the OA management port is connected to a network (particularly a higher-speed one) that carries a significant percentage of broadcast packets.  One source of such traffic is server backup software.

 

The OA bridges Ethernet traffic in firmware between the external network on the OA management port and the internal enclosure management network, so this traffic can overload the OA CPU.  On the original OA modules for the c7000 and c3000 enclosures, this resulted in watchdog reboots of the active OA, which interrupt all traffic to and from the enclosure until the active OA has completed initialization.

 

We recommend checking the link speed of the connection between the OA management port and the external switch port, and checking the percentage of broadcast packets on that connection.  We have seen the external switch port overrun when the connection to the OA management port is configured for 100Mb and the other ports in that network are configured for 1Gb.
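
One way to estimate the broadcast percentage is to take two snapshots of the switch port's packet counters and compare the deltas. The sketch below assumes the counters have already been read by some means (for example the unicast, multicast and broadcast packet counters that many switches expose per port). The dictionary keys and the function name are hypothetical, chosen only for this example.

```python
def broadcast_percent(before, after):
    """Percentage of broadcast packets between two counter snapshots.

    Each snapshot is a dict with the hypothetical keys
    'unicast', 'multicast' and 'broadcast' (cumulative packet counts).
    """
    keys = ("unicast", "multicast", "broadcast")
    deltas = {k: after[k] - before[k] for k in keys}
    total = sum(deltas.values())
    if total <= 0:
        # No traffic in the interval (or counters were reset): nothing to report.
        return 0.0
    return 100.0 * deltas["broadcast"] / total


if __name__ == "__main__":
    first = {"unicast": 1000, "multicast": 100, "broadcast": 400}
    second = {"unicast": 1600, "multicast": 200, "broadcast": 700}
    print(broadcast_percent(first, second))
```

Working from deltas rather than the raw cumulative counters keeps the result tied to the interval in which the flooding messages were logged.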

 

The newer c7000 OA with KVM has a faster CPU and DDR2 memory and can be configured for 1000BaseT or 100BaseT, which should help in this situation.

 

But your report is that both OA and iLO access were lost.  I see the reports that the blade management processors were unresponsive, but nothing showing that the OA was unresponsive.  If the OA CPU was overloaded by this network traffic, it might even have had a watchdog reset.  Did the customer see such an event in this case?

 

If the network traffic was broadcasts, the OA must bridge those broadcasts to all the iLOs and interconnects in the enclosure, which could explain why the iLOs became unresponsive.

 

*************************

 

Other comments or suggestions?