NIC Bonding on RHEL 4

SOLVED
Ross Kennedy
Occasional Advisor

NIC Bonding on RHEL 4

Hi
I'm trying to set up Bonding on a DL380G4.
OS is RHEL 4 with the latest patches plus the latest HP Softpaq's (7.30).

While the box is still on the network, I don't think I have bonding set up properly (if at all) as only eth0 is "live". It doesn't fail over to eth1 if I disconnect eth0.

Before attempting bonding, I did check that both NIC's do work!

I've attached a file with my config files and the output from ifconfig. Somewhere I read that ifconfig should show bond0 as well as eth0 and eth1 but on my system ifconfig only shows bond0.

I've lost count of the number of manuals I've read. Also trawled through this forum and everything looks as though bonding is set correctly.

Can anyone suggest how I go about troubleshooting?
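
One place to start, as a rough sketch: when the native bonding driver is loaded, the kernel publishes the bond's state in /proc/net/bonding/bond0. The helper below (the name `bond_summary` and its optional file argument are made up for illustration) prints the bonding mode, the active slave where the mode reports one, and each slave's link status. If the file is missing, the bonding module is probably not loaded at all.

```shell
#!/bin/sh
# Summarise a bonding status file. The default path is where the Linux
# bonding driver publishes its state; the argument lets you point the
# function at a saved copy instead (hypothetical helper, not an HP or RH tool).
bond_summary() {
    status_file="${1:-/proc/net/bonding/bond0}"
    if [ ! -r "$status_file" ]; then
        echo "no bonding status at $status_file (is the bonding module loaded?)"
        return 1
    fi
    # Keep only the lines that answer "which mode, which slaves, link up or down?"
    grep -E '^(Bonding Mode|Currently Active Slave|Slave Interface|MII Status):' "$status_file"
}
```

Running `bond_summary` before and after pulling a cable should show whether the slaves are really enslaved and whether their link status changes.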
19 REPLIES
Patrick Terlisten
Honored Contributor

Re: NIC Bonding on RHEL 4

Hi Ross,

look at this Bugzilla Report:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=159500

Other question: Can you reach the server with ping or ssh? Is the server available on the network? Can you bring eth0 and eth1 up with "ifup eth0" or "ifup eth1"?

Regards,
Patrick
Best regards,
Patrick
Stuart Browne
Honored Contributor

Re: NIC Bonding on RHEL 4

Hrm.. Start by trying the HP supplied 'bcm5700' driver instead of the RH provided 'tg3' driver ( http://h18000.www1.hp.com/support/files/server/us/download/22318.html ).

If that still fails, also look at the 'HP tested bonding driver' ( http://h18000.www1.hp.com/support/files/server/us/download/22271.html ).

Also, what sort of switch do you have the other end of the network cables plugged in to? Have you set up (or does the switch auto-detect) trunking for the two ports?
One long-haired git at your service...
Ross Kennedy
Occasional Advisor

Re: NIC Bonding on RHEL 4

Thanks for the reply.
I can connect to the box quite happily. PING and SSH both work.

Here is what happens when I run ifup.

[root@penguin ~]# ifup eth0
tg3 device eth0 does not seem to be present, delaying initialization.
[root@penguin ~]# ifup eth1
Enslaving eth1 to bond0

See attachment

eth1 makes an appearance in ifconfig (good) and netstat (which isn't what I would expect) so you are on the right track in that eth1 appears to start disabled. I can see in netstat -i that the RX count for eth1 is increasing but the TX count is not changing.

However, the bonding function still isn't working. To me, it looks as though I have 2 NIC's called bond0 and eth1 with the same IP and ethernet addresses but they are not bonded.
Ross Kennedy
Occasional Advisor

Re: NIC Bonding on RHEL 4

Stuart,
Thanks for the reply.

The system is in our lab just now. Both NIC's are cabled into a hub which is taking care of the age old problem of HP/Compaq NIC's failing to auto detect 100M/Full on our switches. ethtool shows both bond0 and eth1 have autodetected to 100M/Full OK.

I've checked that I DO have bcm5700 installed.
[rkennedy@penguin ~]$ rpm -q bcm5700
bcm5700-7.4.12b-1

I confess I've tried changing modules.conf but it doesn't seem to make any difference but I'll definitely go back to alias eth0/1 bcm5700.

I did initially try to install the bonding driver but read somewhere that bonding is natively supported on this server&OS combination.



Steven E. Protter
Exalted Contributor

Re: NIC Bonding on RHEL 4

This is very problematic on Linux, and it gets worse if you use Gigabit cards.

The only GB cards I've gotten to work in the RedHat/CentOS world are Intel. You have to add code to your /etc/init.d/network script to make the bonding work.

Also if you use ethtool after doing the bonding, you will find the speed does not show accurately.

If you want specific configuration files, use this webform to contact me.

http://www.isnamerica.com/contactsep.shtml

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Daniel Hili
Occasional Visitor

Re: NIC Bonding on RHEL 4

Hi Ross,

Bonding is supported natively by RHEL and I have tested with both tg3 and e1000 drivers.

I've recently had a lot of problems configuring two e1000 NICs as bond0 on an RHEL 3 system.

We could successfully load the bonding module using modprobe whilst the operating system was up and running but, for some unknown reason, after every reboot the NICs would start up wrong and we wouldn't see all devices listed properly. Using ifconfig -a we would see NICs listed as devXXXXX rather than the usual eth0, eth1, etc.

Anyway to keep a long story short, we renamed our script /etc/sysconfig/network-scripts/ifcfg-bond0 to ifcfg-zbond0. Adding the "z" effectively changed the starting lineup of the bonding interface to after the slave interfaces. Ever since we made this change bonding has worked a dream but I've still got a call logged with RH to figure out what is going on.
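
A quick way to see why the rename could change the start order. This assumes, as the description above suggests but which has not been checked against the RHEL 3 init script, that the interface config files are processed in sorted name order. The function name `start_order` is made up.

```shell
#!/bin/sh
# Print ifcfg file names in plain lexical order, one per line.
# Assumption: the network init script walks the files in sorted order.
start_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# bond0 sorts ahead of its slaves...
start_order ifcfg-eth0 ifcfg-eth1 ifcfg-bond0
# ...while zbond0 sorts after them.
start_order ifcfg-eth0 ifcfg-eth1 ifcfg-zbond0
```

Under that assumption the bond would be started before eth0 and eth1 with the original name, and after them with the "z" prefix.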

Hope it helps.

Dan
Matt Palmer_2
Respected Contributor

Re: NIC Bonding on RHEL 4

Hi,

have u looked here:

http://docs.hp.com/en/B9903-90046/ch05s04.html

That is Red Hat specific. I have got this working successfully on SuSE SLES8, and I'm not sure there is much difference in the procedure. One problem I had was that the instructions were slightly back to front in places: the bond lines need to be added to /etc/modules.conf before trying to use ifenslave to enlist the specific NICs to bond0.

HTH

regards

Matt Palmer
Matt Palmer_2
Respected Contributor

Re: NIC Bonding on RHEL 4

Hi, the other thing I forgot to mention is that installing the ProLiant Support Pack (PSP) is quite handy, as you can see in a user-friendly format how your cards are working. For example, if you've set the mode= line to '1' then the SIM web front end (http://serverip:2301) should tell you under NIC that bond0 is up using active-backup mode; if you set mode to 3 it will say 'switch assisted load-balancing', and if 2, 'balanced-xor'.

One thing to note is that mode 3 does not work well if both NICS are going into the same network switch
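
For reference while reading the mode numbers above, here is a small lookup sketch of the names the Linux kernel bonding documentation gives each mode= value (the function name `bond_mode_name` is made up). In the kernel's numbering, round-robin is mode 0 and mode 3 is broadcast, so the HP SIM wording may not line up one to one with the numbers quoted.

```shell
#!/bin/sh
# Map a bonding "mode=" number to the name used in the Linux kernel's
# bonding documentation. HP SIM's display labels are not covered here.
bond_mode_name() {
    case "$1" in
        0) echo balance-rr ;;      # round-robin
        1) echo active-backup ;;
        2) echo balance-xor ;;
        3) echo broadcast ;;
        4) echo 802.3ad ;;
        5) echo balance-tlb ;;
        6) echo balance-alb ;;
        *) echo unknown; return 1 ;;
    esac
}
```

For example, `bond_mode_name 1` prints `active-backup`.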

One last thing: did you rebuild the kernel when you did this to get the bonding rpm, or did you use another method?

regards

Matt Palmer
Stuart Browne
Honored Contributor

Re: NIC Bonding on RHEL 4

To counter SEP's issues with gigabit cards, I've had success with the Broadcom series that are in the HP server range.

Just don't have any here to play with at the moment.

One long-haired git at your service...
Ross Kennedy
Occasional Advisor

Re: NIC Bonding on RHEL 4

First off. Thanks for all replies.

Dan
Renaming ifcfg-bond0 to ifcfg-zbond0 didn't help.

Matt
I didn't rebuild the kernel after installing the softpaq. Can I show my ignorance and ask how you rebuild the kernel?

Even though modules.conf aliases eth0/1 to bcm5700, messages.log is still referencing tg3 when I disconnect cables. Maybe my kernel isn't what I hoped it was?

Jul 21 12:23:29 penguin kernel: tg3: bond0: Link is down.
Jul 21 12:23:48 penguin kernel: tg3: bond0: Link is up at 100 Mbps, full duplex.
Jul 21 12:23:48 penguin kernel: tg3: bond0: Flow control is on for TX and on for RX.
Jul 21 12:23:49 penguin snmpd[2157]: Received SNMP packet(s) from 147.114.178.59
Jul 21 12:24:02 penguin kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
Jul 21 12:24:02 penguin kernel: tg3: eth1: Flow control is on for TX and on for RX.
Jul 21 12:24:50 penguin kernel: tg3: bond0: Link is down.
Jul 21 12:25:04 penguin kernel: tg3: bond0: Link is up at 100 Mbps, full duplex.
Jul 21 12:25:04 penguin kernel: tg3: bond0: Flow control is on for TX and on for RX.
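
To make it easier to see which interface dropped and when, the link messages can be condensed to one short line each. This is only a sketch: it assumes the syslog layout shown in the excerpt above (driver name in the sixth field, interface in the seventh), and the function name `link_events` is made up.

```shell
#!/bin/sh
# Reduce tg3-style "Link is up/down" syslog lines to "time interface state".
# Reads the named files, or standard input when no file is given.
link_events() {
    awk '/Link is (up|down)/ {
        iface = $7; sub(/:$/, "", iface)     # "bond0:" -> "bond0"
        state = $10; sub(/\.$/, "", state)   # "down."  -> "down"
        print $3, iface, state
    }' "$@"
}
```

Something like `link_events /var/log/messages` would then list the link transitions in order, using whichever log file the messages actually land in.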


The Insight Manager page for the NICs doesn't make any mention of fault tolerance. It only shows eth1 (no mention of eth0 or bond0). I've attached a copy/paste from the SIM page.


Stuart Browne
Honored Contributor

Re: NIC Bonding on RHEL 4

bring down 'bond0', 'eth0', and 'eth1', 'rmmod tg3', then try to bring it up again.
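
Spelled out as a sketch, the sequence might look like the following. The `DRY_RUN` switch and both function names are made up for illustration; with the default setting the commands are only printed, which is a sensible first step because the real thing takes the box off the network.

```shell
#!/bin/sh
# Reload the NIC driver as suggested above. With DRY_RUN=1 (the default
# here, a made-up safety switch) the commands are only printed.
# Run the real thing from the console, not over ssh.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_bond() {
    for ifc in bond0 eth0 eth1; do
        run ifdown "$ifc"
    done
    run rmmod tg3
    for ifc in bond0 eth0 eth1; do
        run ifup "$ifc"
    done
}
```

Call `reload_bond` to preview the commands, then set `DRY_RUN=0` as root to execute them.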
One long-haired git at your service...
Ross Kennedy
Occasional Advisor

Re: NIC Bonding on RHEL 4

Stuart

Removing tg3 didn't help. It just came back again after restarting. Even though I have installed the bcm5700 package, I'm worried I need to rebuild the kernel but don't know how to!

Ross
Matt Palmer_2
Respected Contributor

Re: NIC Bonding on RHEL 4

Hi,

If you follow the bonding link that I posted previously, it gives a step-by-step method of rebuilding the kernel on a specific platform, RH included. It details using cloneconfig, mrproper, etc. You are only really adding the bonding into the existing kernel.

good luck

Matt
Steven E. Protter
Exalted Contributor

Re: NIC Bonding on RHEL 4

Apologies for not replying when promised.

http://www.hpuxconsulting.com/bond.tar

That is my config.

Hopefully it's helpful.

SEP
Eric van Dijken
Trusted Contributor
Solution

Re: NIC Bonding on RHEL 4

Is /etc/modules.conf still used in RHEL4? Didn't they move it to /etc/modprobe.conf?

Maybe that's why you still load the tg3 module instead of the bcm5700.
Watch, Think and Tinker.
Ross Kennedy
Occasional Advisor

Re: NIC Bonding on RHEL 4

Success! But I'm very confused. Generous points to be awarded for all.
Especially Eric who came up with the answer (although I did stumble on
the answer on Friday - honest!).
All the instructions for bonding say you should be editing /etc/modules.conf
but my breakthrough came when I stumbled upon /etc/modprobe.conf while trying
to figure out why bcm5700 didn't appear in lsmod.
I deleted /etc/modules.conf and added the "alias bond0 bonding" directive
to modprobe.conf and it all seems to work.
Is this a difference between 2.4 and 2.6 kernels?
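
On the 2.4 versus 2.6 question: as far as I know, yes. The 2.4 module tools read /etc/modules.conf, while the module tools used with 2.6 kernels read /etc/modprobe.conf. Below is a minimal sketch of what a native bonding setup could look like, written into a staging directory so it can be reviewed before anything is copied into /etc. The address, netmask, miimon and mode values are placeholders and not the settings from this thread, and the function name is made up.

```shell
#!/bin/sh
# Write example bonding config into a staging directory for review.
# Everything here is illustrative: the address, netmask, miimon and mode
# values are placeholders, not the actual settings from this thread.
stage_bond_config() {
    out="${1:-./bond-staging}"
    mkdir -p "$out/network-scripts"

    # Lines for /etc/modprobe.conf on a 2.6 kernel (not /etc/modules.conf).
    cat > "$out/modprobe.conf.fragment" <<'EOF'
alias bond0 bonding
options bond0 miimon=100 mode=1
EOF

    # The bond carries the IP configuration.
    cat > "$out/network-scripts/ifcfg-bond0" <<'EOF'
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.0.2.10
NETMASK=255.255.255.0
EOF

    # Each slave only names its master.
    for ifc in eth0 eth1; do
        cat > "$out/network-scripts/ifcfg-$ifc" <<EOF
DEVICE=$ifc
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
EOF
    done
}
```

After `stage_bond_config /tmp/bond-staging`, the generated files can be compared with what is in /etc/modprobe.conf and /etc/sysconfig/network-scripts/.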

At this time
bond0 appears as "Switch-assisted Load Balancing (round-robin)" in HPSIM.
The NICS show up as team members in HP SIM.
TCP stays up when I pull either cable and the HP NIC agents generate SNMP traps
as expected
NIC Redundancy Decreased (Rev 1): Major Event
instead of
penguin: NIC Connectivity Lost Trap (Rev 1): Major Event

Things I have learned.

Forget modules.conf and use modprobe.conf.

Re-install HP softpaqs when you patch the kernel (oops). I installed the softpaq
then patched the kernel and bcm5700.ko wasn't carried over to the kernel drivers
directory (/lib/modules/2.6.9-11.ELsmp/kernel/drivers/net/) until I re-installed the softpaq
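
A small check along these lines could catch the problem after the next kernel patch. It is a sketch only: the function name is made up, and it simply lists every kernel tree under /lib/modules that has no copy of the named driver.

```shell
#!/bin/sh
# List kernel trees that lack a given driver module.
# Arguments: module name, and optionally the modules root (default /lib/modules).
kernels_missing_module() {
    mod="$1"
    root="${2:-/lib/modules}"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Search the whole tree rather than assume drivers/net.
        if [ -z "$(find "$dir" -name "$mod.ko" -print 2>/dev/null)" ]; then
            basename "$dir"
        fi
    done
}
```

If `kernels_missing_module bcm5700` prints the running kernel's version, the softpaq needs re-installing for that kernel.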

This didn't help me solve the problem of loading the bcm5700 module as I get
a fatal error from modprobe. What does the error mean?

[root@penguin ~]# modprobe bcm5700
FATAL: Error inserting bcm5700 (/lib/modules/2.6.9-11.ELsmp/kernel/drivers/net/bcm5700.ko): Invalid module format
[root@penguin ~]# ls -l /lib/modules/2.6.9-11.ELsmp/kernel/drivers/net/bcm5700.ko
-rwxr--r-- 1 root root 1293907 Jul 22 12:10 /lib/modules/2.6.9-11.ELsmp/kernel/drivers/net/bcm5700.ko
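
I can't say for certain what went wrong here, but "Invalid module format" from modprobe usually means the .ko was built for a different kernel (or different build options) than the one running, and dmesg normally has a more detailed message. One way to check, sketched below, is to compare the module's vermagic string with the running kernel release. The function name is made up, and it compares only the release part, not the trailing flags such as SMP.

```shell
#!/bin/sh
# Does a module's vermagic string start with the given kernel release?
# vermagic looks like "<release> <flags...>"; only the release is compared,
# so a mismatch in the flags would not be caught by this check.
vermagic_matches() {
    vermagic="$1"
    release="$2"
    [ "${vermagic%% *}" = "$release" ]
}

# On the affected box (not run here):
#   ko=/lib/modules/$(uname -r)/kernel/drivers/net/bcm5700.ko
#   vermagic_matches "$(modinfo -F vermagic "$ko")" "$(uname -r)" \
#       || echo "bcm5700.ko was built for another kernel"
```

Note also that on a 2.6 kernel insmod expects the path to the .ko file rather than a bare module name, whereas modprobe takes the module name.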

I'm successfully running with the tg3 driver and native bonding driver
(the HP bonding driver lists bcm5700 as a pre-req so I gave up on it).

The syntax for insmod does not match the HP install notes.

[root@penguin ~]# insmod bcm5700
insmod: can't read 'bcm5700': No such file or directory

Does any of this make sense to people?
Eric van Dijken
Trusted Contributor

Re: NIC Bonding on RHEL 4

Makes sense to me :)

You got as far as I did before I stopped using bonding on RHEL4. I'll just put it on hold until the next Update comes out (or a PSP > 7.30).

Instead of using insmod, try modprobe (not so sure anymore, but worth a try)

I even got the driver for the BCM card from Broadcom (8.1.55). That was worse: debug info on screen for about 2 minutes, then a system panic.

Watch, Think and Tinker.
Steven E. Protter
Exalted Contributor

Re: NIC Bonding on RHEL 4

The fact you got it working with that hardware at all shows considerable skill and a bit of good luck.

Do make sure it's stable, and is actually providing decent throughput, before you move this project off your radar screen.

Congrats.

SEP