Bonding Failover Problem

Occasional Advisor

I have DL380 G4s with NC7771 NICs running Red Hat Enterprise Linux ES 3 Update 6. The NICs are bonded in mode 1 (active-backup) and plugged into separate switches. I have tried both the tg3 and bcm5700 NIC drivers. The problem is that when I unplug the active NIC, traffic does not pass over the other NIC for about 60-90 seconds. The dmesg output shows the link failing immediately and the other NIC being activated. Any ideas why this is not working correctly?

modules.conf -
#alias eth0 tg3
alias eth0 bcm5700
#alias eth1 tg3
alias eth1 bcm5700
#alias eth2 bcm5700
alias scsi_hostadapter cciss
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias bond0 bonding
options bond0 mode=1 miimon=100

ifcfg-bond0 -
DEVICE=bond0
BOOTPROTO=none
IPADDR=10.70.80.119
NETMASK=255.255.240.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no

ifcfg-eth0 -
# eth0
DEVICE=eth0
#IPADDR=10.70.80.119
#NETMASK=255.255.240.0
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
ONBOOT=yes
#ETHTOOL_OPTS="speed 100 duplex full autoneg off"
TYPE=Ethernet

ifcfg-eth1 -
DEVICE=eth1
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
ONBOOT=yes
#ETHTOOL_OPTS="speed 100 duplex full autoneg off"
TYPE=Ethernet

dmesg output
bcm5700: eth0 NIC Link is Down
bond0: link status definitely down for interface eth0, disabling it and making interface eth1 the active one.
4 REPLIES
Honored Contributor

Re: Bonding Failover Problem

Maybe it is a switch issue (spanning tree or something similar). It looks like the switch takes too long to learn the new location of the MAC address.

Check the status in /proc/net/bonding/bond0.

Check also the port status for autonegotiation problems.
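
A minimal sketch of the /proc/net/bonding/bond0 check, in case it helps compare the bond's view before and after pulling the cable. The function names are made up for illustration, and the field labels ("Currently Active Slave", "Slave Interface", "MII Status", "Link Failure Count") are assumed to match what your bonding driver version prints; older drivers may word them differently.

```shell
# Print the currently active slave of a bond.
# $1: path to the bonding status file (defaults to bond0's).
active_slave() {
    status_file="${1:-/proc/net/bonding/bond0}"
    # Keep only the value after the "Currently Active Slave: " label.
    sed -n 's/^Currently Active Slave: //p' "$status_file"
}

# Print one line per slave: "<slave> <MII status> <link failure count>".
slave_status() {
    status_file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        /^Slave Interface/                   { slave = $2 }
        # Ignore the bond-level "MII Status" line that precedes the slaves.
        /^MII Status/ && slave != ""         { mii = $2 }
        /^Link Failure Count/ && slave != "" { print slave, mii, $2 }
    ' "$status_file"
}
```

Run `active_slave` and `slave_status` once with both cables connected and again right after unplugging the active NIC. If the bond reports the backup as active straight away but traffic still stalls, that points away from the bonding driver's link detection.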
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Occasional Advisor

Re: Bonding Failover Problem

It's not a switch problem because I have some DL380 G5s and DL360 G3s that fail over correctly. The servers are negotiating correctly.
Honored Contributor

Re: Bonding Failover Problem

Check your kernel configuration for ARP values, for example, arp_filter.

Then I would try the fail_over_mac option.
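
A rough way to list the arp_filter setting for every interface at once, assuming the usual /proc/sys/net/ipv4/conf/<interface>/arp_filter layout. The function name is only illustrative.

```shell
# Print "<interface> <arp_filter value>" for each interface directory.
# $1: sysctl conf directory (defaults to the IPv4 one under /proc).
show_arp_filter() {
    conf_dir="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$conf_dir"/*/arp_filter; do
        # Skip interfaces whose entry is missing or unreadable.
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}
```

Calling `show_arp_filter` with no argument reads the live values. It is worth looking at the `all` entry as well as the bond0, eth0 and eth1 entries.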
Occasional Advisor

Re: Bonding Failover Problem

arp_filter is 0

I can't use fail_over_mac because it was added in version 3.2 of the bonding driver and I'm running 2.6.