NIC Bonding in RH ent 5

 
SOLVED
City_Blue
Super Advisor

NIC Bonding in RH ent 5

hi

we have just bonded the NICs on 8 new servers, all identical builds with the same driver and firmware revisions.

we used the same files on all servers, modifying only the HWADDR and IP:
modprobe.conf
ifcfg-bond0
ifcfg-eth0
ifcfg-eth1
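
the bonding lines in modprobe.conf follow the standard RHEL 5 pattern, something like this (illustrative sketch, our miimon value may differ):

---------------------------------------------
# /etc/modprobe.conf (bonding-related lines only)
alias bond0 bonding
options bond0 mode=1 miimon=100   # mode=1 = active-backup
---------------------------------------------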

the problem is all but 2 of the servers fail over without dropping a ping.

but 2 servers take about 15 seconds to fail over to the second NIC. if pinging we lose about 4 to 6 pings, and an SSH session seems to lose its connection but then springs back to life.

this happens when we fail over from either NIC on each server.

anyone got any ideas what could be happening, why, and how I can fix it?

cheers
6 REPLIES
Steven E. Protter
Exalted Contributor

Re: NIC Bonding in RH ent 5

Shalom,

Need to see modprobe.conf, and the results of the command service network restart.

See /var/log/messages.
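
Something like this would capture the relevant output:

---------------------------------------------
# restart networking and capture what it prints
service network restart

# then check the tail end of the system log for bonding messages
tail -n 50 /var/log/messages
---------------------------------------------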

This could be an external problem with network routing.

Cisco switch and router configuration has been known to interfere with bonding.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Matti_Kurkela
Honored Contributor
Solution

Re: NIC Bonding in RH ent 5

Are the servers with the different behaviour plugged into a different switch than the others? Or is the switch port configuration somehow different?

Which bonding mode are you using? (What are the options for bond0 in modprobe.conf?)

For more information about bonding modes and related options, please read:
http://www.kernel.org/doc/Documentation/networking/bonding.txt
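
To double-check which mode is actually in effect, the bonding driver reports it at runtime:

---------------------------------------------
# live status from the bonding driver: mode, active slave, link states
cat /proc/net/bonding/bond0

# the mode line will read e.g.:
# Bonding Mode: fault-tolerance (active-backup)
---------------------------------------------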

When the bonding system executes a NIC failover, the switch(es) see it as a MAC address suddenly jumping from one port to another. If the switch has security features for preventing MAC address hijacking, they might interfere with the failover.

Normally the bonding mechanism sends out a "gratuitous ARP" to announce that the MAC address has moved to a different switch port. If these are filtered out at some level, the switches would keep sending traffic to the old port until their MAC address table entries expire or are updated. The expiration time varies by manufacturer, but 15 seconds would sound about right.
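
One way to verify this (assuming you can run tcpdump on another host in the same VLAN; the interface name below is just an example) is to watch for the gratuitous ARP while you force a failover:

---------------------------------------------
# on a second host in the same subnet, watch ARP traffic with MAC addresses
tcpdump -n -e -i eth0 arp

# on the bonded server, fail the active slave as in your test
ifdown eth0
---------------------------------------------

If no gratuitous ARP for the bond's address shows up on the wire, something is filtering it.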

MK
City_Blue
Super Advisor

Re: NIC Bonding in RH ent 5

here are the outputs from /var/log/messages; this shows me doing an ifdown and ifup, and also a service network restart

---------------------------------------------
May 28 12:46:01 server_name kernel: bonding: bond0: Removing slave eth0
May 28 12:46:01 server_name kernel: bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:23:7D:E8:7D:6C - is still in use by bond0. Set the HWaddr of eth0 to a differen$
May 28 12:46:01 server_name kernel: bonding: bond0: releasing active interface eth0
May 28 12:46:01 server_name kernel: bonding: bond0: making interface eth1 the new active one.
May 28 12:46:01 server_name kernel: bonding: bond0: Removing slave eth1
May 28 12:46:01 server_name kernel: bonding: bond0: releasing active interface eth1
May 28 12:46:01 server_name avahi-daemon[4616]: Withdrawing address record for xxx.xxx.xxx.xxx on bond0.
May 28 12:46:01 server_name avahi-daemon[4616]: Leaving mDNS multicast group on interface bond0.IPv4 with address xxx.xxx.xxx.xxx.
May 28 12:46:01 server_name avahi-daemon[4616]: iface.c: interface_mdns_mcast_join() called but no local address available.
May 28 12:46:01 server_name avahi-daemon[4616]: Interface bond0.IPv4 no longer relevant for mDNS.
May 28 12:46:01 server_name avahi-daemon[4616]: Withdrawing address record for fe80::223:7dff:fee8:7d6c on bond0.
May 28 12:46:01 server_name avahi-daemon[4616]: Leaving mDNS multicast group on interface bond0.IPv6 with address fe80::223:7dff:fee8:7d6c.
May 28 12:46:01 server_name avahi-daemon[4616]: iface.c: interface_mdns_mcast_join() called but no local address available.
May 28 12:46:01 server_name avahi-daemon[4616]: Interface bond0.IPv6 no longer relevant for mDNS.
May 28 12:46:01 server_name kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
May 28 12:46:01 server_name kernel: bonding: bond0: Adding slave eth0.
May 28 12:46:01 server_name kernel: bnx2: eth0: using MSIX
May 28 12:46:01 server_name kernel: bonding: bond0: enslaving eth0 as a backup interface with a down link.
May 28 12:46:01 server_name kernel: bonding: bond0: Adding slave eth1.
May 28 12:46:01 server_name kernel: bnx2: eth1: using MSIX
May 28 12:46:01 server_name kernel: bonding: bond0: enslaving eth1 as a backup interface with a down link.
May 28 12:46:04 server_name kernel: bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex
May 28 12:46:04 server_name kernel: bonding: bond0: link status definitely up for interface eth0.
May 28 12:46:04 server_name kernel: bonding: bond0: making interface eth0 the new active one.
May 28 12:46:04 server_name kernel: bonding: bond0: first active interface up!
May 28 12:46:04 server_name kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
May 28 12:46:04 server_name kernel: bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex
May 28 12:46:04 server_name kernel: bonding: bond0: link status definitely up for interface eth1.
May 28 12:46:05 server_name avahi-daemon[4616]: New relevant interface bond0.IPv6 for mDNS.
May 28 12:46:05 server_name avahi-daemon[4616]: Joining mDNS multicast group on interface bond0.IPv6 with address fe80::223:7dff:fee8:7d6c.
May 28 12:46:05 server_name avahi-daemon[4616]: Registering new address record for fe80::223:7dff:fee8:7d6c on bond0.
May 28 12:46:05 server_name avahi-daemon[4616]: New relevant interface bond0.IPv4 for mDNS.
May 28 12:46:05 server_name avahi-daemon[4616]: Joining mDNS multicast group on interface bond0.IPv4 with address xxx.xxx.xxx.xxx.
May 28 12:46:05 server_name avahi-daemon[4616]: Registering new address record for xxx.xxx.xxx.xxx on bond0.
May 28 12:46:15 server_name snmpd[3146]: Connection from UDP: [127.0.0.1]:57500
May 28 12:46:15 server_name snmpd[3146]: Received SNMP packet(s) from UDP: [127.0.0.1]:57500
May 28 12:46:30 server_name snmpd[3146]: Connection from UDP: [127.0.0.1]:50212
May 28 12:46:30 server_name snmpd[3146]: Received SNMP packet(s) from UDP: [127.0.0.1]:50212
May 28 12:46:45 server_name snmpd[3146]: Connection from UDP: [127.0.0.1]:59925
May 28 12:46:45 server_name snmpd[3146]: Received SNMP packet(s) from UDP: [127.0.0.1]:59925
May 28 12:46:53 server_name snmpd[3146]: Connection from UDP: [xxx.xxx.xxx.xxx]:1351
May 28 12:46:53 server_name last message repeated 18 times
May 28 12:47:00 server_name snmpd[3146]: Connection from UDP: [127.0.0.1]:41755
May 28 12:47:00 server_name snmpd[3146]: Received SNMP packet(s) from UDP: [127.0.0.1]:41755
---------------------------------------------

this is my modprobe.conf and my eth configs

---------------------------------------------
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:23:7D:E8:4D:BE
ONBOOT=yes
TYPE=Ethernet
MASTER=bond0
SLAVE=yes


# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth0
BOOTPROTO=static
HWADDR=00:23:7D:E8:4D:BC
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
MASTER=bond0
SLAVE=yes

# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=bond0
BOOTPROTO=static
ONBOOT=yes
TYPE=Ethernet
NETMASK=255.255.255.0
IPADDR=xxx.xxx.xxx.xxx
GATEWAY=xxx.xxx.xxx.xxx
USERCTL=no
BONDING_MASTER=yes

-----------------------------------------------

all the switch configs are the same, although the bonds do span 2 switches. I have servers connected to the same switches that work.
City_Blue
Super Advisor

Re: NIC Bonding in RH ent 5

forgot to say we are using active-backup

we have tried round-robin and this works

but again, while we change the settings we lose connection on these 2 servers, but the other servers all stay up
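
for reference, switching modes is just the one options line in modprobe.conf (round-robin shown commented out; miimon value illustrative):

---------------------------------------------
options bond0 mode=1 miimon=100    # active-backup
#options bond0 mode=0 miimon=100   # round-robin, which worked
---------------------------------------------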
Steven E. Protter
Exalted Contributor

Re: NIC Bonding in RH ent 5

Shalom,

My experience with Broadcom is that active-active will not work, but active-backup will.

This still says switch problem to me.

Have the switch port configurations checked, and try putting each system's NICs on a single switch.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
City_Blue
Super Advisor

Re: NIC Bonding in RH ent 5

found that spanning tree portfast was not enabled on the ports

enabled it and it now works
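
for anyone who hits the same thing, on Cisco IOS the fix is along these lines (interface name is just an example):

---------------------------------------------
! on each access port facing a bonded NIC
interface GigabitEthernet0/1
 spanning-tree portfast
---------------------------------------------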

cheers