
Re: Network bonding did not work after a crash

 
Rui Vilao
Regular Advisor

Network bonding did not work after a crash

Hi,

I have two clustered DL585 servers running RHEL 4.0 AS U3 (x86_64).
After an OS crash (which is still under investigation), one of the configured Linux bonds failed.


Both network cards are connected to one switch.

The bonding driver configuration is:

/etc/modprobe.conf

...
alias bond0 bonding
alias bond1 bonding
options bond0 miimon=100 mode=5 max_bonds=2
options bond1 miimon=100 mode=5

The bonding interfaces are configured as follows:

[root@asapnrdb1 network-scripts]# cat ifcfg-bond1
DEVICE=bond1
IPADDR=172.30.92.67
NETMASK=255.255.255.128
NETWORK=172.30.92.0
BROADCAST=172.30.92.127
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
IPV6INIT=no
PEERDNS=yes
GATEWAY=172.30.92.1
TYPE=Ethernet
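
As a sanity check on the addressing above, NETWORK and BROADCAST can be recomputed from IPADDR and NETMASK. This is only a minimal Python sketch: the function name check_ifcfg is made up for illustration, it parses plain KEY=VALUE lines, and it is not part of any Red Hat tooling.

```python
import ipaddress

def check_ifcfg(text):
    """Compare NETWORK/BROADCAST/GATEWAY with the subnet derived from IPADDR and NETMASK."""
    # Keep only KEY=VALUE lines; anything else in the file is ignored.
    cfg = dict(
        (part.strip() for part in line.split("=", 1))
        for line in text.splitlines()
        if "=" in line
    )
    net = ipaddress.ip_interface(cfg["IPADDR"] + "/" + cfg["NETMASK"]).network
    return {
        "network_ok": str(net.network_address) == cfg["NETWORK"],
        "broadcast_ok": str(net.broadcast_address) == cfg["BROADCAST"],
        "gateway_in_subnet": ipaddress.ip_address(cfg["GATEWAY"]) in net,
    }

# Values copied from the ifcfg-bond1 file posted above.
IFCFG_BOND1 = """DEVICE=bond1
IPADDR=172.30.92.67
NETMASK=255.255.255.128
NETWORK=172.30.92.0
BROADCAST=172.30.92.127
GATEWAY=172.30.92.1
"""

print(check_ifcfg(IFCFG_BOND1))
```

For the values posted, all three checks come out true, so the addressing itself does not look like the cause.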

[root@asapnrdb1 network-scripts]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v2.6.1 (October 29, 2004)

Bonding Mode: transmit load balancing
Primary Slave: None
Currently Active Slave: eth8
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:11:0a:5d:0c:84

Slave Interface: eth6
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:11:0a:5d:0a:2c
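
To compare the bond state across reboots, the status text above can be turned into a small structure. A minimal Python sketch, assuming the "Key: value" layout shown in this post; parse_bonding_status is a hypothetical helper name, and other driver versions may print additional fields.

```python
def parse_bonding_status(text):
    """Split /proc/net/bonding/<bond> style text into bond-level fields and per-slave fields."""
    bond, slaves, current = {}, [], None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        elif current is not None:
            current[key] = value
        else:
            bond[key] = value
    return bond, slaves

# Text copied from the output posted above; on a live system it could be
# read with open("/proc/net/bonding/bond1").read() instead.
STATUS = """Bonding Mode: transmit load balancing
Currently Active Slave: eth8
MII Status: up

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:11:0a:5d:0c:84

Slave Interface: eth6
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:11:0a:5d:0a:2c
"""

bond, slaves = parse_bonding_status(STATUS)
for slave in slaves:
    print(slave["name"], slave["MII Status"], slave["Link Failure Count"])
```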

After a hang and an automatic reboot (triggered by ASR after 10 minutes), the log showed:

...
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: making interface eth8 the new active one.
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: enslaving eth8 as an active interface with an up link.
Nov 20 12:43:16 asapnrdb1 kernel: e1000: eth6: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: enslaving eth6 as an active interface with an up link.
Nov 20 12:43:16 asapnrdb1 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: link status definitely down for interface eth8, disabling it
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: making interface eth6 the new active one.
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: link status definitely down for interface eth6, disabling it
Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: now running without any active interface !
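
To spot the same failure quickly on the other node, the kernel messages can be scanned for the two bonding events seen above. A minimal Python sketch, assuming the message wording is exactly as in this log; summarize_bond_log is a hypothetical helper name, and the patterns may need adjusting for other driver versions.

```python
import re

# Patterns based on the message wording in the log excerpt above.
DOWN = re.compile(r"bonding: (\S+): link status definitely down for interface (\S+),")
NO_ACTIVE = re.compile(r"bonding: (\S+): now running without any active interface")

def summarize_bond_log(lines):
    """Return (list of (bond, slave) marked down, list of bonds left with no active slave)."""
    down, dead_bonds = [], []
    for line in lines:
        match = DOWN.search(line)
        if match:
            down.append((match.group(1), match.group(2)))
            continue
        match = NO_ACTIVE.search(line)
        if match:
            dead_bonds.append(match.group(1))
    return down, dead_bonds

# Lines copied from the log excerpt above.
LOG = [
    "Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: link status definitely down for interface eth8, disabling it",
    "Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: making interface eth6 the new active one.",
    "Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: link status definitely down for interface eth6, disabling it",
    "Nov 20 12:43:16 asapnrdb1 kernel: bonding: bond1: now running without any active interface !",
]

print(summarize_bond_log(LOG))
```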

This occurred on both servers, so the cluster did not start up!

After a reboot (shutdown -ry 0), everything went OK.

Any help/suggestion is highly appreciated.

TIA.

PS: How can I check the current speed of the network bond? mii-tool reports 10 Mbps and ethtool does not work...

Kind Regards,

Rui Vilao.
"We should never stop learning"_________ rui.vilao@rocketmail.com
3 REPLIES
Ivan Ferreira
Honored Contributor

Re: Network bonding did not work after a crash

Are you running ethtool against the bond interface or against the eth interfaces? You could modify the /etc/rc.d/init.d/network script so that it applies the network interface settings with ethtool before the bonding is established.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Rui Vilao
Regular Advisor

Re: Network bonding did not work after a crash


This is what I get:

[root@asapnrdb1 ~]# mii-tool bond0
bond0: 10 Mbit, half duplex, link ok
[root@asapnrdb1 ~]# ethtool bond0
Settings for bond0:
No data available
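
Since the bond device returns no link data to ethtool here, one option is to query each slave interface listed in /proc/net/bonding/bond0 instead. A minimal Python sketch for pulling the speed out of ethtool's text output, assuming the slave output contains a "Speed:" line; link_speed is a hypothetical helper name, and the slave sample below is written by hand for illustration, not captured from this system.

```python
import re

def link_speed(ethtool_text):
    """Return the value of the 'Speed:' line, or None when no such line is present."""
    match = re.search(r"^\s*Speed:\s*(\S+)", ethtool_text, re.MULTILINE)
    return match.group(1) if match else None

# Output copied from the ethtool bond0 run above.
BOND0_OUTPUT = "Settings for bond0:\nNo data available\n"

# Hypothetical slave output, hand-written for illustration only.
SLAVE_SAMPLE = "Settings for eth6:\n\tSpeed: 1000Mb/s\n\tDuplex: Full\n"

print(link_speed(BOND0_OUTPUT))
print(link_speed(SLAVE_SAMPLE))
```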
"We should never stop learning"_________ rui.vilao@rocketmail.com
Al Licause
Trusted Contributor

Re: Network bonding did not work after a crash

Since you are using eth6 and eth8 in the bond set, the assumption is that eth0-eth5 and eth7 also exist on the system. It would be helpful to know how they are configured and which drivers they are using.

The problem may lie in the interaction with one or more of the other devices.