12-11-2004 12:11 PM
How to troubleshoot a NIC?
Yet at the console everything is fine. Netstat reports active network connections, and in fact any current network connections will successfully resume after I unplug and reconnect the network cable or do ifconfig lan1 down, then ifconfig lan1 up.
For example: I have a telnet connection open... The server in question suddenly becomes unreachable. I walk over to the console and do one of:
* ifconfig lan1 down, then ifconfig lan1 up
* or /sbin/init.d/net stop, then /sbin/init.d/net start
* or just unplug the network cable and plug it back in
And all of a sudden the machine is accessible again... I go back to my desk, where my telnet connection had frozen... I start typing and the telnet connection is responsive again.
I can't tell if there is anything that triggers this problem.
There are no log entries in the syslog or the messages file. Lanadmin doesn't indicate any errors when displaying statistics, either. lanscan shows the NIC as being up while the problem is happening. I also tried a linkloop to another HP-UX server and that was unsuccessful, so there is no layer 2 connectivity. Yet the Cisco switch that this machine is connected to shows no interface errors.
Any suggestions?
The only thing I can come up with is that the NIC may be going bad. If that's the case, how do I find out? Can I run diagnostics on this card with stm?
I did run xstm and I'm attaching the output of the infotool and the status. I am not familiar with the support tools, however, so can someone walk me through the diag?
Also, here's the output of lanscan:
Hardware Station        Crd Hdw   Net-Interface  NM  MAC   HP-DLPI DLPI
Path     Address        In# State NamePPA        ID  Type  Support Mjr#
8/16/6   0x0060B0XXXXXX 0   UP    lan0 snap0     1   ETHER Yes     119
8/20/5/7 0x0060B0XXXXXX 3   UP    lan3 snap3     2   ETHER Yes     119
8/12/1/0 0x001083XXXXXX 1   UP    lan1 snap1     3   ETHER Yes     119
and ioscan -fnk -C lan:
Class  I  H/W Path  Driver  S/W State  H/W Type   Description
========================================================================
lan    1  8/12/1/0  btlan4  CLAIMED    INTERFACE  PCI Ethernet (10110009)
lan    0  8/16/6    lan2    CLAIMED    INTERFACE  Built-in LAN
                    /dev/diag/lan0  /dev/ether0
lan    3  8/20/5/7  btlan0  CLAIMED    INTERFACE  EISA card INP0500
P.S. The other two NICs are disabled and not in use.
Also, I do have HP hardware support... What kind of proof will they need before they will agree to replace the card?
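When the same check has to be repeated each time the problem occurs, it can help to pull the interface state out of the lanscan output programmatically. The following is only a minimal sketch: `parse_lanscan` and `LanInterface` are hypothetical names, the sample text is the lanscan output quoted above, and the column meanings are assumed from the header lines.

```python
from typing import List, NamedTuple


class LanInterface(NamedTuple):
    hw_path: str
    mac: str
    card_instance: int
    hw_state: str
    name: str
    nmid: int


def parse_lanscan(text: str) -> List[LanInterface]:
    """Parse the data rows of lanscan output; header lines are skipped."""
    interfaces = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 10 fields: a hardware path, then a 0x... station address.
        if len(fields) != 10 or "/" not in fields[0] or not fields[1].startswith("0x"):
            continue
        interfaces.append(
            LanInterface(
                hw_path=fields[0],
                mac=fields[1],
                card_instance=int(fields[2]),
                hw_state=fields[3],
                name=fields[4],
                nmid=int(fields[6]),
            )
        )
    return interfaces


SAMPLE = """\
Hardware Station Crd Hdw Net-Interface NM MAC HP-DLPI DLPI
Path Address In# State NamePPA ID Type Support Mjr#
8/16/6 0x0060B0XXXXXX 0 UP lan0 snap0 1 ETHER Yes 119
8/20/5/7 0x0060B0XXXXXX 3 UP lan3 snap3 2 ETHER Yes 119
8/12/1/0 0x001083XXXXXX 1 UP lan1 snap1 3 ETHER Yes 119
"""

interfaces = parse_lanscan(SAMPLE)
lan1 = next(i for i in interfaces if i.name == "lan1")
print(lan1.hw_path, lan1.hw_state, lan1.nmid)
```

As the original post notes, the hardware state can still read UP while the link is unusable, so a check like this only rules out the card being reported down.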
12-11-2004 12:54 PM
Re: How to troubleshoot a NIC?
1) Another computer on your network comes up with the same IP address. The HP box becomes suddenly unreachable and might not show anything on its log.
To diagnose: disconnect the hp box from the lan and ping its ip address. If you get an answer the problem is identified.
2) The NIC card itself: use cstm, mstm, or the X-based xstm. Run the exercise function. If it fails, replace the card.
3) Another NIC card on this box is being brought up on the same network. You say the other cards are inactive, but if there is a script with ifconfig being run or someone running sam, HP-UX will not support two cards with the same network on the same box. This will cause the box to immediately drop off the network.
4) Software problems. Make sure the box is patched to the most recent quarterly release and HWE hardware enablement bundle.
You can run diags on the card with stm. I like xstm. It shows a red icon when a card is bad.
If the system is mission critical, bring down the card, change the cable over and use one of the other two cards. If the problem recurs, see option 1 above.
Hope this helps. This was a very detailed post. Thanks.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
12-11-2004 04:32 PM
Re: How to troubleshoot a NIC?
I have seen this behaviour mostly due to another station with the same IP address/MAC Address.
I suggest you run 'netfmt' and see the logs.
You will find a few log files under /var/adm named nettl.*. Run 'netfmt -f /var/adm/
-Sri
12-11-2004 11:29 PM
Re: How to troubleshoot a NIC?
Also, are you sure that you selected the correct interface in lanadmin? By default it starts on lan0. You have to enter nmid x, where x is one more than the number that comes up, to get to the next one. This looks like a simple case of duplex mismatch, which would cause the switch to error out the port at random intervals, but since you say you are not seeing errors, I wondered whether you were looking at the right interface in lanadmin. Make sure both ends of the link are set to auto/auto or 100Full/100Full. Auto on one end and 100Full on the other will cause the auto side to default to 100Half, which will cause your problem if the switch is one of those that errors out ports with too many errors.
You can also get similar behavior if you are connecting through a gateway and the gateway does not respond to pings, although in that case the timeout is usually about 3 minutes.
A good test if there is another device on the local network that responds correctly is linkloop. See the man entry for details.
Ron
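The duplex mismatch described above can be modeled in a few lines. This is a rough sketch of the usual negotiation behavior, not of any HP-UX or Cisco tool: the function names and the setting strings ("auto", "100FD", "100HD") are made up for illustration, and it assumes both ends run at 100 Mbit/s and that an auto-negotiating port facing a forced port falls back to half duplex.

```python
AUTO = "auto"
FULL = "100FD"
HALF = "100HD"


def effective_modes(side_a: str, side_b: str):
    """Return the duplex mode each end actually runs at.

    Assumption: a forced port does not negotiate, so an auto port facing it
    can detect the speed but not the duplex, and falls back to half duplex.
    """
    if side_a == AUTO and side_b == AUTO:
        # Both ends negotiate and agree on the best common mode.
        return FULL, FULL
    mode_a = HALF if side_a == AUTO else side_a
    mode_b = HALF if side_b == AUTO else side_b
    return mode_a, mode_b


def has_duplex_mismatch(side_a: str, side_b: str) -> bool:
    """True when the two ends end up running different duplex modes."""
    mode_a, mode_b = effective_modes(side_a, side_b)
    return mode_a != mode_b


for pair in [(AUTO, AUTO), (FULL, FULL), (AUTO, FULL)]:
    print(pair, "->", effective_modes(*pair), "mismatch:", has_duplex_mismatch(*pair))
```

Only the auto/100Full combination comes out mismatched, which is why the advice is to configure both ends the same way.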
12-12-2004 09:40 AM
Re: How to troubleshoot a NIC?
I also discovered that I'm missing the latest driver update for the card - patch PHNE_30301. So I'll be installing that soon.
Thanks for the suggestions. I doubt that another machine is taking over the IP address, but I will check that if/when the problem occurs again. I did run the exercise function of the Support Tools Manager (stm), and the card passed. It looks, however, as though the exerciser simply generates some extra traffic to 10 random hosts on the same subnet, thus putting the NIC under a mild load. I don't think this will tell me much in my case. Also, I forgot to mention that I did check the nettl logs and there are no errors logged in there for this NIC. I have also been monitoring the switch that this NIC is connected to, but there is nothing there, either.
Now, two last questions:
* When I get status on this card with the Support Tools Manager (stm), it shows that there is a diagnostic piece installed, and that it is licensed:
Installed tools:
Diagnostic : bt100d (Licensed)
Verifier : lan
Exerciser : lan
Information : dlpi
Expert Tool : None
Firmware Update : None
See the bt100d line above. How do I run diagnostics on the card from inside of stm? (I use either cstm or xstm). Do I need some sort of a password to enable the diagnostics?
* Finally, this card shows up as a PCI card when I do ioscan:
lan 1 8/12/1/0 btlan4 CLAIMED INTERFACE PCI Ethernet (10110009)
See PCI Ethernet above. Indeed, 8/12 is a "GSCtoPCI Bridge". However, when I look up J3515A, everything says that this is an HSC card. So what is the HSC bus? Is it different from PCI? And if so, is the card PCI or HSC?
Thanks much! All the input is and has been greatly appreciated!
12-13-2004 01:36 AM
Re: How to troubleshoot a NIC?
You do not say what version of btlan4 driver you have now, but depending on how old it is, there have been a few fixes for card hangs. I would suggest the newest driver first.
As far as "proof" for the HP engineer goes, I don't think proof is necessary. Open a call and discuss it with the response center and your CE. We want to help you attain the best service from your system possible, so we'll work out the best action plan we can to solve the problem.
12-13-2004 04:15 AM
Re: How to troubleshoot a NIC?
Even if another system had the same IP address, linkloop _should_ remain unaffected, since it does not deal with IP addresses.
I'd go with the suggestion to be up on all the latest patches and see what happens.
The suggestion to check the netfmt/nettl log is very good - you might also check dmesg output.
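The reasoning above, that a failed linkloop points below the IP layer, can be written down as a small decision helper. This is just an illustrative sketch: `likely_fault` is a hypothetical name, and the two booleans stand for the results of a linkloop test and a ping test that you would run yourself.

```python
def likely_fault(linkloop_ok: bool, ping_ok: bool) -> str:
    """Narrow down where to look, based on a linkloop and a ping result."""
    if not linkloop_ok:
        # linkloop works on MAC addresses, so a duplicate IP cannot explain this.
        return "layer 2: check NIC, driver, cable, switch port, duplex"
    if not ping_ok:
        # The link works but IP does not: addressing or routing is suspect.
        return "layer 3: check for a duplicate IP address, routes, gateway"
    return "no fault seen at layer 2 or layer 3"


# The situation in this thread: linkloop to another HP-UX host failed.
print(likely_fault(linkloop_ok=False, ping_ok=False))
```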
12-15-2004 06:37 AM
Re: How to troubleshoot a NIC?
A couple of things I want to point out before closing the thread, in the hope that they may be useful to someone else in the future:
* I don't think anyone mentioned that I should check the level of nettl logging. There are 4 levels, and usually Disasters and Errors are logged by default. It may have been a good idea to temporarily turn on Warning and Info logging for the NIC. See nettladm(1M) and nettlconf(1M).
* Certain pieces of hardware can have a diagnostics program run on them from the Support Tools Manager. This particular card (J3515A) is one of them. Running diagnostics requires a license to be supplied. HP can provide the customer with a 1-day temporary diagnostics license. I was told that diagnostics are not always useful but may be worth a shot.
* Finally, HP tech support for HP-UX 9000 servers was more than happy to replace this card, just based on the suspicion that it's bad. They probably won't do that for all types of hardware, but a NIC is easy to replace, I guess.
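On the first point, the four nettl log classes can be combined into a single numeric mask. The sketch below assumes the class values as I recall them from nettl(1M) (informative 1, warning 2, error 4, disaster 8, with disaster always enabled), so please verify them against the man page on your system. `log_mask` and `classes_in` are hypothetical helper names, not part of any HP tool.

```python
# Assumed class values; check nettl(1M) before relying on them.
LOG_CLASSES = {"informative": 1, "warning": 2, "error": 4, "disaster": 8}


def log_mask(*classes: str) -> int:
    """Combine log class names into one mask; disaster is always included."""
    mask = LOG_CLASSES["disaster"]
    for name in classes:
        mask |= LOG_CLASSES[name]
    return mask


def classes_in(mask: int) -> list:
    """List the class names that are enabled in a mask."""
    return [name for name, bit in LOG_CLASSES.items() if mask & bit]


default_mask = log_mask("error")
verbose_mask = log_mask("informative", "warning", "error")
print(default_mask, classes_in(default_mask))
print(verbose_mask, classes_in(verbose_mask))
```

Under these assumptions the default (errors and disasters) works out to 12 and all four classes to 15.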
12-15-2004 06:42 AM