ProLiant Servers (ML,DL,SL)

Re: bonding/teaming fails intermittently - DL385

 
Alex Lazarevich
New Member

bonding/teaming fails intermittently - DL385

DL385 running RHEL4-AS 64bit, using both onboard NICs in a team/bond with the tg3 driver as supplied by RH. Occasionally the bond fails, and then three seconds later it's back up. This happens about 2-3 times per month, and we have to resolve it now:

Oct 5 20:46:23 zeus kernel: NETDEV WATCHDOG: eth0: transmit timed out
Oct 5 20:46:23 zeus kernel: tg3: eth0: transmit timed out, resetting
Oct 5 20:46:24 zeus kernel: tg3: tg3_stop_block timed out, ofs=2000 enable_bit=2
Oct 5 20:46:24 zeus kernel: tg3: eth0: Link is down.
Oct 5 20:46:24 zeus kernel: bonding: bond0: link status definitely down for interface eth0, disabling it
Oct 5 20:46:27 zeus kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Oct 5 20:46:27 zeus kernel: tg3: eth0: Flow control is on for TX and on for RX.
Oct 5 20:46:28 zeus kernel: bonding: bond0: link status definitely up for interface eth0.
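
To see how often and for how long this happens, the bonding messages can be pulled out of the syslog and paired up. The following is only a rough sketch in Python, not anything supplied by RH or HP: the function name `bond_outages` is made up for illustration, the regex is written against the exact message wording shown above, and since syslog timestamps carry no year, the `year` argument is an arbitrary placeholder.

```python
import re
from datetime import datetime

# Matches the bonding driver's "definitely down/up" lines as worded in the log above.
# The \s+ after the month tolerates syslog's double space before single-digit days.
BOND_EVENT = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+bonding:\s+"
    r"(?P<bond>\w+): link status definitely (?P<state>down|up) "
    r"for interface (?P<iface>\w+)"
)


def bond_outages(lines, year=2000):
    """Pair each 'down' event with the next 'up' event for the same slave.

    Returns a list of (bond, iface, down_time, seconds_down) tuples.
    `year` is an assumption: syslog lines do not record one.
    """
    down_since = {}
    outages = []
    for line in lines:
        m = BOND_EVENT.match(line)
        if m is None:
            continue  # tg3 and watchdog lines are ignored here
        stamp = " ".join(m.group("ts").split())  # normalise whitespace
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (m.group("bond"), m.group("iface"))
        if m.group("state") == "down":
            down_since[key] = ts
        elif key in down_since:
            start = down_since.pop(key)
            outages.append((key[0], key[1], start, (ts - start).total_seconds()))
    return outages


if __name__ == "__main__":
    # Path is the usual RHEL location; adjust if syslog is configured differently.
    with open("/var/log/messages") as log:
        for bond, iface, start, secs in bond_outages(log):
            print(f"{bond}/{iface} down at {start:%b %d %H:%M:%S} for {secs:.0f}s")
```

Run against the excerpt above, this pairs the 20:46:24 "down" with the 20:46:28 "up", so from the bond's point of view eth0 is out for 4 seconds, one second longer than the tg3 link itself.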

This hiccup is enough to cause havoc, because this is one of our main fileservers. I've searched a bit and found that some people have trouble with the onboard NICs, especially when the switches are using spanning tree. We use spanning tree and cannot disable it.

We are considering dropping the onboard NICs and switching to dual Intel PCI-X NICs.

Has anyone else seen this problem, and does a move to different NICs solve it?

Alex
3 REPLIES
sandeep_raman
Honored Contributor

Re: bonding/teaming fails intermittently - DL385

sandeep_raman
Honored Contributor

Re: bonding/teaming fails intermittently - DL385

Alex Lazarevich
New Member

Re: bonding/teaming fails intermittently - DL385

yes, i know, Red Hat has bugs. but they haven't done anything about it since last year, and they probably won't. that's why i'm thinking a switch to Intel NICs will solve the problem. i was just hoping someone could confirm that they are using an Intel NIC bond and that it works fine?

thanks,

alex