
RHEL 2.1 update 5: bonding up but not stable

 
David Gerard
New Member

We are using the HP Ethernet bonding driver in RHEL 2.1. There are two bonds on each box: one to the NetApps, one to the rest of the network. Each bond is set to fail over on link failure. But the kernel keeps reporting a link failure when there isn't one (MII is checked every 100 ms and shows no failure) and switches to the other link, flapping every few hours.

I've kludged around it by setting downdelay to 2000 ms (so it waits two seconds before failing over), but it's still trying to flap.
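To illustrate what the downdelay workaround does, here is a rough Python sketch. It is a simplified model I made up for illustration, not the bonding driver's actual code: the function name simulate_failovers and the sample data are hypothetical. The assumption is that the link is polled once per miimon interval and a slave is only declared down after it has looked dead for the whole downdelay period following the first failed poll.

```python
def simulate_failovers(link_samples, miimon_ms=100, downdelay_ms=2000):
    """Return the times (in ms) at which the slave would be declared down.

    link_samples: one boolean per MII poll (True = link looks up).
    Simplified model only; the real driver has more states (e.g. updelay).
    """
    if downdelay_ms % miimon_ms != 0:
        # The bonding documentation asks for downdelay to be a multiple of miimon.
        raise ValueError("downdelay should be a multiple of miimon")

    polls_to_wait = downdelay_ms // miimon_ms
    failovers = []
    bad_polls = 0          # consecutive polls that saw no link
    declared_down = False  # True once this outage has triggered a failover

    for i, link_up in enumerate(link_samples):
        if link_up:
            # Link is back: forget the outage and re-arm.
            bad_polls = 0
            declared_down = False
            continue
        bad_polls += 1
        # The first bad poll starts the countdown; fail over once it expires.
        if not declared_down and bad_polls > polls_to_wait:
            failovers.append(i * miimon_ms)
            declared_down = True
    return failovers


if __name__ == "__main__":
    # Hypothetical trace: a 300 ms glitch, then a 2.5 s outage.
    samples = [True] * 5 + [False] * 3 + [True] * 5 + [False] * 25 + [True] * 2
    print("downdelay=0:   ", simulate_failovers(samples, 100, 0))
    print("downdelay=2000:", simulate_failovers(samples, 100, 2000))
```

In this model a short glitch causes a failover when downdelay is 0 but is ignored with downdelay at 2000 ms, while a longer outage still fails over, just two seconds later. That only masks whatever is producing the spurious link-down reports; it doesn't explain them.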

It's happening on two (identical) boxes in two locations, each with two separate bonded links, each of whose two links goes to different switches - I'm confident it's not dodgy hardware.

The boxes are DL580 G2s with quad 3 GHz CPUs and 16 GB of memory. The kernel is 2.4.9e49-enterprise. The bonding driver is HP driver 1.0.4q (I haven't tried the kernel's own driver; did 2.4.9 even include it?). The card driver is HP driver e1000-5.4.11a-1. The switches are Ciscos, not sure what model, and they don't see a link failure either.

We have other RHAS2.1 boxes (DL360s) going to the same switches which do not show this behaviour. Those boxes only have one bond each, though - these boxes are the only ones with two bonds.

We tried going to kernel 2.4.9e57, but bonding driver 1.0.4q doesn't say it supports that, and bonding doesn't seem to work. So I can't upgrade the driver (perhaps 1.0.4r will support the current kernel).

Has anyone seen or heard of this behaviour?