Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

 
Divs
Occasional Contributor

Hi there,

I am new to the forum and need some help getting ethernet bonds to work on my blade.
I am using eth0 and eth1. After a 'service network restart', I am able to ssh to the server but am unable to press - the server looks like it has hung.

Logging into remote console via ILO, everything looks fine and I have to 'ifdown eth0' for server to come back to life over the network.

Where am I going wrong???

Thanks in advance for any help,
Divys

root on BUILD cathlbwsp08 # cat ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
NETWORK=10.9.188.0
IPADDR=10.9.188.4
NETMASK=255.255.255.0
USERCTL=no

root on BUILD cathlbwsp08 # cat ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
#IPADDR=10.9.188.4
#NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
MASTER=bond0
SLAVE=yes
USERCTL=no
HWADDR=00:14:c2:3d:c1:26
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"

root on BUILD cathlbwsp08 # cat ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
#IPADDR=10.9.188.4
#NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
MASTER=bond0
SLAVE=yes
USERCTL=no
HWADDR=00:14:c2:40:da:9f
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"

My modules.conf:
alias bond0 bonding


...and an excerpt from dmesg
Ethernet Channel Bonding Driver: v2.6.0 (January 14, 2004)
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
divert: allocating divert_blk for bond0
ip_tables: (C) 2000-2002 Netfilter core team
tg3.c:v3.27RH (May 5, 2005)
divert: allocating divert_blk for eth0
eth0: Tigon3 [partno(N/A) rev 1100 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:c2:3d:c1:26
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
divert: allocating divert_blk for eth1
eth1: Tigon3 [partno(N/A) rev 1100 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:c2:40:da:9f
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
divert: allocating divert_blk for eth2
eth2: Tigon3 [partno(N/A) rev 1100 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:15:60:09:28:0a
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
divert: allocating divert_blk for eth3
eth3: Tigon3 [partno(N/A) rev 1100 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:15:60:09:28:0b
eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
bonding: Warning: failed to get speed/duplex from eth0, speed forced to 100Mbps, duplex forced to Full.
bonding: bond0: enslaving eth0 as an active interface with an up link.
bonding: Warning: failed to get speed/duplex from eth1, speed forced to 100Mbps, duplex forced to Full.
bonding: bond0: enslaving eth1 as an active interface with an up link.
ip_tables: (C) 2000-2002 Netfilter core team
ip_tables: (C) 2000-2002 Netfilter core team
tg3: eth0: Link is up at 1000 Mbps, full duplex.
tg3: eth0: Flow control is off for TX and off for RX.
tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is off for TX and off for RX.
tg3: eth2: Link is up at 1000 Mbps, full duplex.
tg3: eth2: Flow control is off for TX and off for RX.
Ivan Ferreira
Honored Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

You should verify your switch port speed settings. What is the output of netstat -ni?

Also, services sometimes seem to hang when name resolution is not working correctly. Try adding your client computer to /etc/hosts, or set this sshd_config parameter:

UseDNS no
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Divs
Occasional Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P


Hi Ivan,

Thanks for responding.

We don't use DNS and I have entries for the host in my hosts file. Output of 'netstat -in' is:

# netstat -in
Kernel Interface table
Iface   MTU Met  RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR TX-DRP TX-OVR Flg
bond0  1500   0    374      0      0      0    252      0      0      0 BMmRU
eth1   1500   0    374      0      0      0    252      0      0      0 BMsRU
eth2   1500   0  11348      0      0      0     32      0      0      0 BMRU
lo    16436   0    147      0      0      0    147      0      0      0 LRU


What I notice is that bond0, eth0 and eth1 all show the same MAC address, 00:14:c2:3d:c1:26.
eth1's own address is actually 00:14:c2:40:da:9f. Is this correct?

How do I know which bonding driver is being used on my server? I recently installed the HP PSP drivers for RHES v3.

Cheers,
Divya



# ifconfig -a
bond0     Link encap:Ethernet  HWaddr 00:14:C2:3D:C1:26
          inet addr:10.9.188.4  Bcast:10.9.188.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:460 errors:0 dropped:0 overruns:0 frame:0
          TX packets:305 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:42120 (41.1 Kb)  TX bytes:44055 (43.0 Kb)

eth0      Link encap:Ethernet  HWaddr 00:14:C2:3D:C1:26
          BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:25

eth1      Link encap:Ethernet  HWaddr 00:14:C2:3D:C1:26
          inet addr:10.9.188.4  Bcast:10.9.188.255  Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:460 errors:0 dropped:0 overruns:0 frame:0
          TX packets:305 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:42120 (41.1 Kb)  TX bytes:44055 (43.0 Kb)
          Interrupt:72

eth2      Link encap:Ethernet  HWaddr 00:15:60:09:28:0A
          inet addr:10.9.143.3  Bcast:10.9.143.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15201 errors:0 dropped:0 overruns:0 frame:0
          TX packets:52 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1474744 (1.4 Mb)  TX bytes:4848 (4.7 Kb)
          Interrupt:73

eth3      Link encap:Ethernet  HWaddr 00:15:60:09:28:0B
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:74
Steven E. Protter
Exalted Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

Shalom,

Check with network administration to see if your ports are set up properly.

For Red Hat 3 bonding, it is necessary to set up and compile a kernel module. I have no notes here on that. Did you do that?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Ivan Ferreira
Honored Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

You won't see the bond0 interface if the bonding module is not loaded. You can check with lsmod.

The MAC addresses are as expected. The slaves must act as a single interface, so that in case of failover the MAC-to-IP address mapping does not change.
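
For scripting the lsmod check, here is a minimal sketch. It assumes /proc/modules lists the same modules that lsmod shows, with the module name as the first field; the function name bonding_loaded is made up for illustration.

```shell
#!/bin/sh
# bonding_loaded [FILE]: succeed if a module named "bonding" is listed.
# FILE defaults to /proc/modules (assumption: same listing as lsmod).
# The function name is hypothetical, not part of any standard tool.
bonding_loaded() {
    modules_file="${1:-/proc/modules}"
    # The module name is the first whitespace-separated field on each line.
    awk '$1 == "bonding" { found = 1 } END { exit !found }' "$modules_file"
}
```

Usage would be something like: bonding_loaded && echo "bonding module is loaded"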
Divs
Occasional Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

Hi SEP/Ivan,

I confirmed with Red Hat technical support that for ES v3 (2.4.21-37.ELsm) the native bonding driver is there by default. They advised me not to use any 3rd party bonding drivers.

lsmod shows that the module is loaded (bonding 64068 1 )

I am also getting the switch for the Blades checked - thanks SEP.

In the meantime, maybe you could advise me on the PSP (psp-7.40.rhel3.linux.en.tar.gz). Shall I try and compile the HP bonding driver & HP BCM 5700 rpms? Will these fix my problem? Has anyone used the native drivers and gotten ethernet bonds to work on HP blades?

The card in the HP blade is a Broadcom TIGON3 and the native Linux ethernet driver works fine with it (until I set up the ethernet bond).

Thanks for your help..
Divs

mwarner
Occasional Contributor

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

Add this to your /etc/modules.conf file:

options bond0 miimon=100


Both drivers should work. We are moving back to the RH drivers because of the better options that are available, for example load-balancing modes that are not switch-assisted.
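
As a rough sketch of what the modules.conf entries could look like with an explicit mode added, the helper below just prints the two lines so they can be reviewed before being pasted into /etc/modules.conf. The function name bond_modules_conf is made up, and mode=1 (active-backup) is only an assumed default for illustration; pick whatever mode matches your switch setup.

```shell
#!/bin/sh
# bond_modules_conf [BOND] [MODE] [MIIMON]: print a modules.conf stanza.
# Hypothetical helper; the defaults are assumptions for illustration:
#   BOND   = bond0
#   MODE   = 1   (active-backup)
#   MIIMON = 100 (link check interval in ms, as suggested above)
bond_modules_conf() {
    bond="${1:-bond0}"
    mode="${2:-1}"
    miimon="${3:-100}"
    printf 'alias %s bonding\n' "$bond"
    printf 'options %s miimon=%s mode=%s\n' "$bond" "$miimon" "$mode"
}
```

Running bond_modules_conf bond0 1 100 prints an alias line followed by an options line. The bonding module has to be reloaded (or the box rebooted) before new options take effect.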
Alan Anderson_3
New Member

Re: Problems with ethernet bond on RHES v3 running on a HP G3 BL20P

First thing I need to point out: in your modules.conf you didn't show the bonding mode you specified. If you didn't set a specific mode, you get mode 0, which is round-robin (balance-rr). This mode requires a trunking configuration on the switch. To see the mode being used, look in the /proc/net/bonding/bond0 file. The connection hanging sounds like you didn't set up trunking on the switch, but this can be caused by lots of configuration issues. Only bonding modes 1, 5 and 6 do not require a matching switch configuration of some sort.
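
A minimal sketch for pulling the mode out of that file, assuming the driver writes a line starting with "Bonding Mode: " (the exact wording after the colon may vary between driver versions). The function name bond_mode is made up for illustration.

```shell
#!/bin/sh
# bond_mode [FILE]: print the text after "Bonding Mode: " in a bonding
# status file. FILE defaults to /proc/net/bonding/bond0.
# Hypothetical helper; assumes the "Bonding Mode: " line prefix.
bond_mode() {
    status_file="${1:-/proc/net/bonding/bond0}"
    if [ ! -r "$status_file" ]; then
        echo "cannot read $status_file (is the bonding module loaded?)" >&2
        return 1
    fi
    # Strip the label and print only the mode description.
    sed -n 's/^Bonding Mode: //p' "$status_file"
}
```

With no mode option set, the output should describe round-robin, which would confirm the default mode 0 is in use.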

The issue with not being able to disable autonegotiation on the bnx2 device is a feature. The developers are not allowing autonegotiation to be disabled at gigabit speeds. So make sure the switch is also set up for autonegotiation.