
eth3: e1000_watchdog: NIC Link is Down + ASR System Reboots

Jair Patete
Occasional Visitor

Hello mates, I recently installed ProLiant Support Pack 7.70 on 3 of our Red Hat Enterprise Linux 3 Update 8 servers running on ProLiant DL380 G5 hardware, and it became a complete MESS!

On one of these servers, after rebooting, the whole network interface configuration was lost, with a message stating that the configured MAC addresses were different from the ones it expected to find. The MAC addresses in the ifcfg-ethX files hadn't been changed at all. HP support installed a new NIC and that solved the problem, although I still had to configure the new MAC addresses by hand because, of course, I rolled back to the old version of the e1000 driver.
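For anyone wanting to check the same mismatch on their own boxes, here is a minimal sketch (not anything from the PSP or HP tooling) that compares the HWADDR line of an ifcfg-ethX file with the MAC reported by the old net-tools `ifconfig -a` output. The function names are made up for illustration, and it assumes you feed it saved copies of the file and the command output. The Python shipped with RHEL 3 is quite old, so it may be easier to run this on another machine.

```python
import re

# Six colon-separated hex octets, e.g. 00:13:21:78:a7:c0
MAC_RE = r"([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})"


def ifcfg_hwaddr(ifcfg_text):
    """Return the HWADDR value from ifcfg-ethX content (lowercased), or None."""
    for line in ifcfg_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        m = re.match(r"HWADDR=[\"']?" + MAC_RE, line, re.IGNORECASE)
        if m:
            return m.group(1).lower()
    return None


def ifconfig_hwaddrs(ifconfig_text):
    """Map interface name -> MAC (lowercased) from 'ifconfig -a' output.

    Assumes the classic net-tools layout:
    'eth0      Link encap:Ethernet  HWaddr 00:13:21:78:A7:C0'
    """
    macs = {}
    for line in ifconfig_text.splitlines():
        m = re.match(r"(\S+)\s+Link encap:Ethernet\s+HWaddr\s+" + MAC_RE, line)
        if m:
            macs[m.group(1)] = m.group(2).lower()
    return macs


def mac_mismatch(iface, ifcfg_text, ifconfig_text):
    """True if configured and live MAC differ, False if equal, None if unknown."""
    configured = ifcfg_hwaddr(ifcfg_text)
    actual = ifconfig_hwaddrs(ifconfig_text).get(iface)
    if configured is None or actual is None:
        return None  # nothing to compare against
    return configured != actual
```

Calling `mac_mismatch("eth0", ifcfg_text, ifconfig_text)` for each interface before and after the driver change would show whether the driver is reporting a different address or the interfaces were simply enumerated in a different order.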

On another server, a day after the update, a RAM chip failure was reported in /var/log/messages but nothing showed up in the IML, even with the brand-new hpasm daemon running. After several ASR reboots, the system never booted again. I tried to fix the filesystem with the e2fsck tool, but nothing came up. I ended up backing up all the filesystems and reinstalling the OS. After the reinstall succeeded, the system couldn't complete its first boot: it froze completely after showing "bringing up eth0 interface"... To my surprise, this second server had a NIC problem as well; again, HP support replaced the NIC and all went fine.

On the third server nothing happened. Even though all 3 servers are the same model with the same firmware, hardware and software, everything went well on this one except for the messages below, which showed up in /var/log/messages.

Let's hear your opinions; I want to know if there is something more to this... I have about 50 servers left to update, but I'm afraid I'm not going to keep going until something comes up about this issue...

BTW, 2 of the 3 servers had ASR enabled; the one that had it disabled is the only one that hasn't had any problem.. yet..

<-----
kernel: e1000: 13:04.0: e1000_probe: (PCI-X:133MHz:64-bit) 00:13:21:78:a7:c0
Aug 6 15:56:52 kernel: e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
---->

<----
This was repeated very often. Since ASR was enabled, this may have caused a system hang... Not sure..
e1000: eth1: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
Aug 6 15:56:52 kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Aug 6 15:56:52 kernel: Tx Queue <0>
Aug 6 15:56:52 kernel: TDH <0>
Aug 6 15:56:52 kernel: TDT <4>
Aug 6 15:56:52 kernel: next_to_use <4>
Aug 6 15:56:52 kernel: next_to_clean <0>
Aug 6 15:56:52 kernel: buffer_info[next_to_clean]
Aug 6 15:56:52 kernel: time_stamp
Aug 6 15:56:52 kernel: next_to_watch <0>
Aug 6 15:56:53 kernel: jiffies
Aug 6 15:56:53 kernel: next_to_watch.status <0>

----->
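To put a number on "repeated very often", a rough sketch like the one below could tally the e1000 link and Tx hang messages per interface from /var/log/messages. The function name and the regular expression are my own guesses based only on the lines pasted above, so they may need adjusting for other driver versions.

```python
import re
from collections import Counter

# Matches the message shapes seen in this thread, for example:
#   e1000: eth1: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
#   e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
EVENT_RE = re.compile(
    r"(eth\d+): (?:e1000_watchdog: (NIC Link is (?:Up|Down))"
    r"|e1000_clean_tx_irq: (Detected Tx Unit Hang))"
)


def count_e1000_events(log_lines):
    """Count link up/down and Tx hang messages per (interface, event)."""
    counts = Counter()
    for line in log_lines:
        m = EVENT_RE.search(line)
        if m:
            event = m.group(2) or m.group(3)
            counts[(m.group(1), event)] += 1
    return counts


if __name__ == "__main__":
    # Path is the standard syslog location on RHEL; adjust if logs are rotated.
    with open("/var/log/messages") as log:
        for (iface, event), n in sorted(count_e1000_events(log).items()):
            print(iface, event, n)
```

Comparing these counts between the servers with ASR enabled and the one with it disabled might help show whether the hangs line up with the reboots.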

Thanks in advance