
Linux network bonding not working after reboot

 
Matthias Kretschmer
Frequent Advisor


Hello,

I have installed Asianux (a Linux distribution based on Red Hat) on a DL580.

I want to bond eth0 and eth2 to bond0.

I configured it and everything looks good, but after a reboot the network is down.

Actually the bond and the NICs are up, but I cannot connect to the server or ping another box from the server.

If I restart the network (/etc/init.d/network restart) or just eth0 (ifdown eth0, then ifup eth0), the connection comes up via eth0, but eth2 is still not usable (tested by disabling eth0 again).

 

I hope somebody can give me a hint. I suspect it has something to do with the order in which the bonding module is loaded during startup.

Here is my config:

 

ifcfg-bond0:

DEVICE=bond0

BOOTPROTO=none

IPADDR=192.168.7.43

NETMASK=255.255.240.0

ONBOOT=yes

GATEWAY=192.168.7.1

TYPE=Ethernet

BONDING_OPTS="primary=eth0 miimon=100 mode=1"

 

ifcfg-eth0:

DEVICE=eth0

ONBOOT=yes

MASTER=bond0

SLAVE=yes

BOOTPROTO=none

TYPE=Ethernet

 

ifcfg-eth2:

DEVICE=eth2

ONBOOT=yes

MASTER=bond0

SLAVE=yes

BOOTPROTO=none

TYPE=Ethernet

 

/etc/modprobe.conf:

alias bond0 bonding

 

/var/log/messages during startup:

Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: Unable to set primary slave; bond0 is in mode 0
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: Setting MII monitoring interval to 100.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: setting mode to active-backup (1).
Mar  2 11:21:46 NKGRAC1 kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: Adding slave eth0.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: enslaving eth0 as a backup interface with a down link.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: Adding slave eth2.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: enslaving eth2 as a backup interface with a down link.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: Setting eth0 as primary slave.
Mar  2 11:21:46 NKGRAC1 kernel: netxen_nic: eth0 NIC Link is up
Mar  2 11:21:46 NKGRAC1 kernel: netxen_nic: eth2 NIC Link is up
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth0.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: making interface eth0 the new active one.
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: first active interface up!
Mar  2 11:21:46 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth2.
Mar  2 11:21:46 NKGRAC1 kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

 

I am missing the line

network: Bringing up interface bond0:  succeeded

which appears when I restart the network manually.

 

6 REPLIES
Matti_Kurkela
Honored Contributor

Re: Linux network bonding not working after reboot

In your BONDING_OPTS, the primary=eth0 setting is before mode=1.

 

Looking at the error messages, the problem is that mode=1 must be set before the primary= option can be specified. The options are processed in order: the primary= option fails because the bond is still using the default mode=0 at that point. Because an error was detected, ifup does not proceed to setting the IP address. It still processes the rest of the BONDING_OPTS line, though, and sets mode=1.

 

So your subsequent manual "ifup bond0" will be successful, because the mode will already be set to 1 when it tries to set the primary interface again.

 

So try setting the bonding options in this order instead:

BONDING_OPTS="mode=1 primary=eth0 miimon=100"

 This should allow all the bonding options to be accepted on the first ifup run, allowing bond0 to come up automatically at boot time.
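The ordering rule above can be checked before rebooting. The following is only a minimal sketch: check_bonding_opts_order is a hypothetical helper, not part of the Red Hat network scripts, and it only inspects the string. It does not touch the bond.

```shell
#!/bin/sh
# Hypothetical helper: report when primary= appears before mode=
# in a BONDING_OPTS string. It only parses the string.
check_bonding_opts_order() {
    seen_mode=no
    for opt in $1; do
        case "$opt" in
            mode=*)
                seen_mode=yes
                ;;
            primary=*)
                if [ "$seen_mode" = no ]; then
                    echo "primary= appears before mode="
                    return 1
                fi
                ;;
        esac
    done
    echo "order ok"
    return 0
}

# Original ordering from the first post (reports the problem):
check_bonding_opts_order "primary=eth0 miimon=100 mode=1" || true
# Suggested ordering:
check_bonding_opts_order "mode=1 primary=eth0 miimon=100"
```

After a reboot, the state the driver actually ended up in can be read from /proc/net/bonding/bond0, which lists the bonding mode, the primary slave and the MII status of each slave.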

MK
Matthias Kretschmer
Frequent Advisor

Re: Linux network bonding not working after reboot

Hello Matti,

Thank you for your fast reply. I tested your recommendation, but the network is still not coming up after a reboot.

My ifcfg-bond0 now looks like this:

 

DEVICE=bond0
BOOTPROTO=none
IPADDR=192.168.7.41
NETMASK=255.255.240.0
ONBOOT=yes
GATEWAY=192.168.7.1
TYPE=Ethernet
BONDING_OPTS="mode=1 primary=eth0 miimon=100"

 

During boot, my messages file shows:

Mar  4 14:06:09 NKGRAC1 kernel: bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: setting mode to active-backup (1).
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: Unable to set eth0 as primary slave as it is not a slave.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: Setting MII monitoring interval to 100.
Mar  4 14:06:09 NKGRAC1 kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: Adding slave eth0.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: enslaving eth0 as a backup interface with a down link.
Mar  4 14:06:09 NKGRAC1 kernel: netxen_nic: eth0 NIC Link is up
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: Adding slave eth2.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: enslaving eth2 as a backup interface with a down link.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth0.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: making interface eth0 the new active one.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: first active interface up!
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth2.
Mar  4 14:06:09 NKGRAC1 kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: Setting eth0 as primary slave.
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: link status definitely down for interface eth2, disabling it
Mar  4 14:06:09 NKGRAC1 kernel: netxen_nic: eth2 NIC Link is up
Mar  4 14:06:09 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth2.

And if I do an ifdown eth0 followed by an ifup eth0, it logs:

Mar  4 14:14:24 NKGRAC1 kernel: bonding: bond0: Removing slave eth0
Mar  4 14:14:24 NKGRAC1 kernel: bonding: bond0: Warning: the permanent HWaddr of eth0 - 44:1E:A1:4C:29:A4 - is still in use by bond0. Set the HWaddr of eth0 to a different address to avoid conflicts.
Mar  4 14:14:24 NKGRAC1 kernel: bonding: bond0: releasing active interface eth0
Mar  4 14:14:24 NKGRAC1 kernel: bonding: bond0: making interface eth2 the new active one.
Mar  4 14:14:31 NKGRAC1 kernel: bonding: bond0: Adding slave eth0.
Mar  4 14:14:31 NKGRAC1 kernel: bonding: bond0: enslaving eth0 as a backup interface with a down link.
Mar  4 14:14:32 NKGRAC1 kernel: netxen_nic: eth0 NIC Link is up
Mar  4 14:14:32 NKGRAC1 kernel: bonding: bond0: link status definitely up for interface eth0.
Mar  4 14:14:32 NKGRAC1 kernel: bonding: bond0: making interface eth0 the new active one.

 

Any other recommendations?

I am also wondering why it says that I have to set miimon or arp. I set it in the ifcfg-bond0 file.

 

Thank you very much in advance.

 

BR

Matthias

epretorious
Regular Advisor

Re: Linux network bonding not working after reboot

Matthias:

 

Have you tried the debug module parameter?

 

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/deployment_guide/s1-networkscripts-interfaces#s2-networkscripts-interfaces-chan

 

Do not place parameters for the bonding kernel module in the /etc/modprobe.conf file. Instead, specify them as a space-separated list in the BONDING_OPTS="<bonding parameters>" directive in the ifcfg-bond<N> interface file.
The only exception is the debug parameter, which cannot be used on a per-device basis, and which should therefore be specified in /etc/modprobe.conf as follows:

options bonding debug=1

For further instructions and advice on configuring the bonding module, as well as to view the list of bonding parameters, refer to Section 43.5.2, "The Channel Bonding Module".

 HTH,

Eric Pretorious
Matti_Kurkela
Honored Contributor

Re: Linux network bonding not working after reboot

Looks like the ifup does not complete the initialization of the slave interfaces before trying to start the bond0 interface. Perhaps Asianux is missing some ifup patch that RedHat has added?

 

Or... do you have backup copies or any old versions of the ifcfg-* files in /etc/sysconfig/network-scripts? These might confuse the ifup scripts. If you have any extra copies of the ifcfg-* files in that directory, move them somewhere else and see if it has any effect.
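A quick way to spot such leftovers is to look for DEVICE= values that are declared by more than one file. This is a sketch only: find_duplicate_devices is a hypothetical helper, and the default directory is simply the one named above.

```shell
#!/bin/sh
# Hypothetical helper: print every DEVICE= line that occurs in more
# than one ifcfg-* file in the given directory.
# Note: DEVICE=eth0 and DEVICE="eth0" are compared as plain text,
# so differently quoted duplicates are not detected.
find_duplicate_devices() {
    dir=${1:-/etc/sysconfig/network-scripts}
    grep -h '^DEVICE=' "$dir"/ifcfg-* 2>/dev/null | sort | uniq -d
}

find_duplicate_devices
```

Empty output means no device name is declared twice. Any line that is printed points at a device with more than one configuration file, and `grep -l` with that same line shows which files are involved.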

 

Since RedHat ifup/ifdown is essentially a set of scripts, you might add "set -x" to suitable points in the scripts to see what is happening.

 

I think epretorious's suggestion of enabling the debug option of the bonding module is a bit of overkill, since the problem seems to be not in the module itself, but in the scripts that set up the bonding. But it certainly does not hurt to try it.

 

> I am also wondering why it says that I have to set miimon or arp. I set it in the ifcfg-bond0 file.

 

To understand this, you need to know the history of the bonding module.

 

Originally, the only way to specify the bonding modes and other options was to use kernel module options. Some people forgot to specify miimon or arp_ parameters, so the kernel was patched to display a warning in this case.

Then it became possible to change the bonding parameters at runtime via the sysfs filesystem (the files under /sys/class/net/<bond>/bonding/).

 

At that point, RedHat added the BONDING_OPTS variable into its network configuration system, and made it the primary method of setting up bonding. Now the module will be loaded without specifying any options, causing the warning to be output... and immediately after that, the ifup scripts will specify the necessary options using sysfs.

 

So the warning is no longer needed on RedHat distributions, but it is still in the kernel code because there has not been an important enough reason to remove it yet.
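To make the mechanism concrete: the bonding driver exposes per-bond options as files under /sys/class/net/<bond>/bonding/ (sysfs), and each key=value pair in BONDING_OPTS corresponds to one write to such a file. The sketch below only prints the equivalent writes. print_bonding_sysfs_writes is a hypothetical helper, not a copy of the distribution's ifup code.

```shell
#!/bin/sh
# Sketch only: print (do not perform) the sysfs write that corresponds
# to each key=value pair of a BONDING_OPTS string, in order.
print_bonding_sysfs_writes() {
    bond=$1
    opts=$2
    for opt in $opts; do
        key=${opt%%=*}    # text before the first '='
        value=${opt#*=}   # text after the first '='
        echo "echo $value > /sys/class/net/$bond/bonding/$key"
    done
}

print_bonding_sysfs_writes bond0 "mode=1 primary=eth0 miimon=100"
# Prints:
#   echo 1 > /sys/class/net/bond0/bonding/mode
#   echo eth0 > /sys/class/net/bond0/bonding/primary
#   echo 100 > /sys/class/net/bond0/bonding/miimon
```

Because the writes happen one after another, the module has already been loaded (and has already printed its warning) before miimon is set.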

MK
Matthias Kretschmer
Frequent Advisor

Re: Linux network bonding not working after reboot

Thank you both for your hints and explanations!

 

I fixed the problem.

I needed to set the network speed of the NICs and disable autonegotiation like this:

ethtool -s eth0 speed 1000
ethtool -s eth0 autoneg off
ethtool -s eth1 speed 1000
ethtool -s eth1 autoneg off
ethtool -s eth2 speed 1000
ethtool -s eth2 autoneg off
ethtool -s eth3 speed 1000
ethtool -s eth3 autoneg off

After that my bonds were running as expected, even after a reboot.
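For reference, the eight commands can be folded into a loop, since ethtool -s accepts several settings in one invocation. This is a dry-run sketch: print_ethtool_cmds is a hypothetical helper that only prints the commands, and combining speed and autoneg in one call is assumed to leave the NIC in the same end state as the two separate calls above. How the settings were made persistent across reboots is not stated here.

```shell
#!/bin/sh
# Dry run: print one ethtool command per NIC instead of executing it.
# Remove the leading "echo" to actually apply the settings (needs root).
print_ethtool_cmds() {
    for nic in "$@"; do
        echo ethtool -s "$nic" speed 1000 autoneg off
    done
}

print_ethtool_cmds eth0 eth1 eth2 eth3
```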

 

Thank you very much again.

 

BR

Matthias

epretorious
Regular Advisor

Re: Linux network bonding not working after reboot

Matthias:

 

  1. How did you figure this out?
  2. How do you accomplish this (e.g., /etc/rc.d/rc.local)?
Eric Pretorious