Re: Server Loses Network for no apparent reason?

Al Licause
Trusted Contributor

tg3 is the driver for Broadcom NICs. It will eventually replace the bcm5700 driver.

It appears that you are using the tg3 driver.

You might want to confirm that with the contents of /etc/modprobe.conf and/or ethtool.
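
If you want to script that check, here is a minimal Python sketch, assuming /etc/modprobe.conf uses the usual "alias <interface> <module>" lines. The function name and the sample text are made up for illustration; paste in your own file contents. The output of "ethtool -i eth0" should report the same driver name.

```python
import re

def driver_for_interface(modprobe_conf_text, interface="eth0"):
    """Return the module aliased to `interface` in modprobe.conf-style text, or None."""
    pattern = re.compile(r"^\s*alias\s+(\S+)\s+(\S+)\s*$")
    for line in modprobe_conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = pattern.match(line)
        if match and match.group(1) == interface:
            return match.group(2)
    return None

# Illustrative sample only, not taken from the poster's server.
sample = """\
alias eth0 tg3
alias eth1 tg3
alias scsi_hostadapter cciss
"""

print(driver_for_interface(sample, "eth0"))
```

On a real box you would read the file with open("/etc/modprobe.conf").read() and pass that text in.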
L_Dieter
Occasional Advisor

Hi Robert,


Maybe a rather stupid question: have you checked for duplicate IPs? The only time I had this kind of problem, it was due to duplicate IPs.
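
One rough way to check, sketched in Python: collect (IP, MAC) pairs from whatever source you have (ARP replies, switch tables) and flag any IP that shows up with more than one MAC. The function name and the addresses below are hypothetical, purely to show the idea.

```python
from collections import defaultdict

def find_duplicate_ips(observations):
    """Map each IP seen with more than one MAC to the sorted list of those MACs."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(mac.lower())  # normalise case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Hypothetical observations: the first IP answers from two different MACs.
observed = [
    ("192.168.10.50", "00:11:22:33:44:55"),
    ("192.168.10.50", "00:AA:BB:CC:DD:EE"),
    ("192.168.10.30", "00:11:22:33:44:66"),
    ("192.168.10.30", "00:11:22:33:44:66"),
]

print(find_duplicate_ips(observed))
```

An empty result would suggest no duplicates among the addresses you sampled; it says nothing about hosts you did not see.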

Best regards,
Dieter
Alexander Samad
Frequent Advisor

Are you using the PSP, and thus the HP driver for the NIC? If so, did you upgrade the driver when you upgraded the kernel?

When you lose connectivity, what does a tcpdump on the interface produce? Do you see any traffic at all? What does ethtool report for link state, etc.?
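
If it helps to capture the ethtool state automatically at the moment of failure, here is a small Python sketch that turns "Key: value" lines into a dict. The sample text is illustrative, not from the server in question, and the helper name is my own.

```python
def parse_ethtool(output):
    """Collect 'Key: value' lines from ethtool-style text into a dict."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a value, such as the "Settings for eth0:" header.
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings

# Illustrative sample only.
sample = """\
Settings for eth0:
    Speed: 1000Mb/s
    Duplex: Full
    Auto-negotiation: on
    Link detected: yes
"""

info = parse_ethtool(sample)
print(info["Link detected"], info["Speed"], info["Duplex"])
```

If the link still reports as detected while tcpdump sees nothing coming in, that would suggest looking at the driver or network stack rather than the cable.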
Stuart Browne
Honored Contributor

I don't suppose you've tried the simple "replace the cable" bit?

Cables die *shrug*.
One long-haired git at your service...
Robert Walker_8
Valued Contributor

Gday,

A couple of answers to your questions. We have just moved its IP address and the problem still persists. We have upgraded it from 100 Mb/s to a bonded pair of Gigabit NICs (admittedly still using the same DL380 NICs provided on the motherboard) with no result, i.e. the server didn't fail over to its backup NIC; the whole network stack effectively froze.

We have run tcpdump, as the call was logged with Red Hat and they too asked whether tcpdump showed anything. The only output is the following:

10:16:22.275318 arp who-has 192.168.10.30 tell myserver.example.com
10:16:25.275570 arp who-has 192.168.10.31 tell myserver.example.com
10:16:26.276324 arp who-has 192.168.10.31 tell myserver.example.com
10:16:27.276075 arp who-has 192.168.10.31 tell myserver.example.com
10:16:30.276330 arp who-has 192.168.10.30 tell myserver.example.com
10:16:31.276085 arp who-has 192.168.10.30 tell myserver.example.com

10.31/10.30 are our DNS/WINS Windows servers, which are also defined in /etc/resolv.conf.
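
For anyone wanting to summarise a longer capture, here is a rough Python sketch that counts ARP requests per target and drops any target that got a reply. The request pattern matches the lines above; the "arp reply ... is-at ..." reply pattern is an assumption about this tcpdump version's output, since no replies appear in the capture.

```python
import re
from collections import Counter

ARP_REQUEST = re.compile(r"^(\S+) arp who-has (\S+) tell (\S+)$")
# Assumed reply format; no reply lines appear in the capture above.
ARP_REPLY = re.compile(r"^(\S+) arp reply (\S+) is-at (\S+)$")

def unanswered_arp(lines):
    """Return {target_ip: request_count} for targets that never replied."""
    requests = Counter()
    answered = set()
    for line in lines:
        line = line.strip()
        match = ARP_REQUEST.match(line)
        if match:
            requests[match.group(2)] += 1
            continue
        match = ARP_REPLY.match(line)
        if match:
            answered.add(match.group(2))
    return {ip: count for ip, count in requests.items() if ip not in answered}

capture = [
    "10:16:22.275318 arp who-has 192.168.10.30 tell myserver.example.com",
    "10:16:25.275570 arp who-has 192.168.10.31 tell myserver.example.com",
    "10:16:26.276324 arp who-has 192.168.10.31 tell myserver.example.com",
    "10:16:27.276075 arp who-has 192.168.10.31 tell myserver.example.com",
    "10:16:30.276330 arp who-has 192.168.10.30 tell myserver.example.com",
    "10:16:31.276085 arp who-has 192.168.10.30 tell myserver.example.com",
]

print(unanswered_arp(capture))
```

In this capture every request goes unanswered, which fits the picture of the server asking for its DNS servers and hearing nothing back.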

Again, a service network restart resolves the problem, as does an ifdown/ifup on the interface.

Robert.
Robert Walker_8
Valued Contributor

Gday,

Problem still occurring; it happened again today. We downgraded the kernel from 2.6.9-42.0.3 to 2.6.9-34.0.2, as the upgrade falls in the time frame when the problem started. It's always the case: two things get done and then a problem appears.

The problem seems to occur mostly during the NFS file transfer, about 30 minutes to 1 hour into the transfer. As mentioned, tcpdump shows no network activity, as if the network were switched off! Teaming/bonding doesn't seem to help, as the interface appears to be up but is not communicating.

Any ideas?

Robert.
Andrew Gilbrt
New Member

Robert,

We are in the throes of a very similar situation. It is occurring on a number of our servers, but not on others. We have tried a variety of fixes, including bonding/unbonding, drivers, and kernel revs. Will try to get a more detailed post to you.

Andrew Gilbrt
New Member
Solution

Robert,

Have you seen this thread?

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=898761

Looks quite promising.
Robert Walker_8
Valued Contributor

Gday,

Going to close this thread as it appears we have a similar problem posted here:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=898761

Thanks to all who have contributed. I will look at updating my drivers (something which has been festering in my mind for a while, though I thought it to be a Red Hat issue). We'll see how that goes.

Messy XMAS & New Year to you all!

Robert.