
Lan card driver issues

 
SOLVED
Marc Dijkstra
Trusted Contributor

Hey all,

Scenario: clustered RP5430s with shared FC storage and a standalone lock disk. Both servers have the base core 10/100 port and two dual-port 10/100 LAN/SCSI combo cards.
The B server was rebuilt after a crash. After the rebuild we noticed lan0 behaving erratically: dropping packets, going to sleep, etc. ioscan -fknC lan on the B server showed the following:

rodb / (root) $ ioscan -fknC lan
Class I H/W Path Driver S/W State H/W Type Description
========================================================================
lan 0 0/0/0/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/ether0
lan 1 0/4/0/0/6/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
lan 2 0/4/0/0/7/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
lan 3 0/7/0/0/6/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
lan 4 0/7/0/0/7/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)

While on the A server it looked like this:

proda / (root) $ ioscan -fknC lan
Class I H/W Path Driver S/W State H/W Type Description
========================================================================
lan 0 0/0/0/0 btlan3 CLAIMED INTERFACE HP PCI 10/100Base-TX Core
/dev/diag/lan0 /dev/ether0 /dev/lan0
lan 1 0/4/0/0/6/0 btlan3 CLAIMED INTERFACE HP A5838A PCI 100BaseTX/SCSI COMBO
/dev/diag/lan1 /dev/ether1 /dev/lan1
lan 2 0/4/0/0/7/0 btlan3 CLAIMED INTERFACE HP A5838A PCI 100BaseTX/SCSI COMBO
/dev/diag/lan2 /dev/ether2 /dev/lan2
lan 3 0/7/0/0/6/0 btlan3 CLAIMED INTERFACE HP A5838A PCI 100BaseTX/SCSI COMBO
/dev/diag/lan3 /dev/ether3 /dev/lan3
lan 4 0/7/0/0/7/0 btlan3 CLAIMED INTERFACE HP A5838A PCI 100BaseTX/SCSI COMBO
/dev/diag/lan4 /dev/ether4 /dev/lan4

I loaded the HWCR and GR patches to no avail, so I rebuilt the server. I loaded HP-UX 11.00 and the GR/HWCR, and now, it looks like this:

prodb / (root) $ ioscan -fknC lan
Class I H/W Path Driver S/W State H/W Type Description
========================================================================
lan 0 0/0/0/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/diag/lan0 /dev/ether0 /dev/lan0
lan 1 0/4/0/0/6/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/diag/lan1 /dev/ether1 /dev/lan1
lan 2 0/4/0/0/7/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/diag/lan2 /dev/ether2 /dev/lan2
lan 3 0/7/0/0/6/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/diag/lan3 /dev/ether3 /dev/lan3
lan 4 0/7/0/0/7/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/diag/lan4 /dev/ether4 /dev/lan4

I can get no information on the A5838A driver from the prodA server. Could the fact that /dev/lan0 was not attached to the first card have caused this very erratic behaviour? The first bit was obviously completely off, but where is the description field kept? lanscan is happy, and an ifconfig lan0 is happy... I am worried that something BIG is missing, and as I am 2000 km from home and leaving tomorrow, I would like some guru hand-holding!

What is missing?
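
For anyone comparing listings like these by hand, a small script can do the bookkeeping. What follows is only a sketch in Python. It assumes the whitespace-separated column layout shown in the listings above, and the names `parse_ioscan_lan` and `missing_device_files`, along with the list of expected device files (copied from the A server's output), are illustrative and not part of any HP-UX tool.

```python
# Sketch: parse "ioscan -fknC lan" text and list device files that are absent.
# The expected file names below are an assumption taken from the A server's listing.
EXPECTED = ("/dev/diag/lan{n}", "/dev/ether{n}", "/dev/lan{n}")

SAMPLE_B = """\
Class I H/W Path Driver S/W State H/W Type Description
========================================================================
lan 0 0/0/0/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
/dev/ether0
lan 1 0/4/0/0/6/0 btlan3 CLAIMED INTERFACE PCI Ethernet (10110019)
"""

def parse_ioscan_lan(text):
    """Return {instance: info} for every 'lan' line, with its device files attached."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "lan" and len(fields) >= 7:
            # Columns: class, instance, H/W path, driver, S/W state, H/W type, description
            current = int(fields[1])
            interfaces[current] = {
                "hw_path": fields[2],
                "driver": fields[3],
                "state": fields[4],
                "description": " ".join(fields[6:]),
                "device_files": [],
            }
        elif fields[0].startswith("/dev/") and current is not None:
            # Device-file lines belong to the most recent interface line.
            interfaces[current]["device_files"].extend(fields)
    return interfaces

def missing_device_files(interfaces):
    """Return {instance: [expected device files not listed]}."""
    report = {}
    for n, info in sorted(interfaces.items()):
        wanted = [template.format(n=n) for template in EXPECTED]
        missing = [path for path in wanted if path not in info["device_files"]]
        if missing:
            report[n] = missing
    return report

if __name__ == "__main__":
    for n, missing in missing_device_files(parse_ioscan_lan(SAMPLE_B)).items():
        print(f"lan{n}: missing {', '.join(missing)}")
```

Running the same two functions over both nodes' output gives a quick side-by-side of descriptions and device files. Whether a missing file actually matters is a separate question the script does not answer.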

Thanks,

MND
"A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila"
7 REPLIES
T G Manikandan
Honored Contributor

Re: Lan card driver issues

Try lanadmin for some hints, and check what sort of errors it reports.

#lanadmin
lan
display


Do you have any messages in the syslog.log file?

Please report back.

James Murtagh
Honored Contributor

Re: Lan card driver issues

Hi Marc,

I wouldn't have thought anything "big" is missing, not on the second build at least. If you can configure and access the cards they should be OK. The PCI description fields were enhanced for 11i, I believe. The only record I have at 11.00 is an enhancement request (JAGab13349) that was done for PHKL_18543.

The only thing I can think of is the firmware and patch levels. Have you used STM to check the firmware levels on the two servers? For instance for the A5838A the required firmware is 41.36. PHSS_27466 provides 42.19 so this may be worth checking. Other than that check for the latest PCI/ioscan/net patches.

Regards,

James.
Marc Dijkstra
Trusted Contributor

Re: Lan card driver issues

James,

I was much happier with the server after the rebuild and repatch. The descriptor field worried me, I must say... thanks!

One thing is still a botherment: I was mucking about with the cards (lots of collisions, but otherwise OK, on both servers) and did a landiag -> lan -> reset on lan0 (PPA 0), and was much surprised to see the telnet connection hang. In the past even a telnetted landiag reset came back. It also shut down the card for 10 seconds, causing MC/ServiceGuard to switch to the backup LAN. After 10 seconds it switched back to lan0, see below.

Nov 18 08:48:19 prodb cmcld: Timed out node proda. It may have failed.
Nov 18 08:48:19 prodb cmcld: Attempting to form a new cluster
Nov 18 08:48:19 prodb cmcld: lan0 failed
Nov 18 08:48:19 prodb cmcld: Subnet 10.3.0.0 switched from lan0 to lan4
Nov 18 08:48:19 prodb cmcld: lan0 switched to lan4
Nov 18 08:48:19 prodb cmcld: Attempting to adjust cluster membership
Nov 18 08:48:20 prodb cmcld: Resumed updating safety time
Nov 18 08:48:21 prodb cmcld: 2 nodes have formed a new cluster, sequence #5
Nov 18 08:48:21 prodb cmcld: The new active cluster membership is: proda(id=1), prodb(id=2)
Nov 18 08:48:29 prodb cmcld: lan0 recovered
Nov 18 08:48:29 prodb cmcld: Subnet 10.3.0.0 switched from lan4 to lan0
Nov 18 08:48:29 prodb cmcld: lan4 switched to lan0

Any ideas why that would happen?
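
The length of the outage can be read straight off the cmcld timestamps. As a rough sketch (Python, with a hypothetical helper name `outage_seconds`, and assuming the "lanN failed" / "lanN recovered" message wording seen in the excerpt above), something like this pairs each failure with the next recovery:

```python
from datetime import datetime

# Three lines copied from the syslog excerpt above.
LOG = """\
Nov 18 08:48:19 prodb cmcld: lan0 failed
Nov 18 08:48:19 prodb cmcld: Subnet 10.3.0.0 switched from lan0 to lan4
Nov 18 08:48:29 prodb cmcld: lan0 recovered
"""

def outage_seconds(log_text, interface):
    """Return the duration in seconds of each failed -> recovered pair for one interface."""
    outages = []
    failed_at = None
    for line in log_text.splitlines():
        parts = line.split(None, 5)  # month, day, time, host, tag, message
        if len(parts) < 6 or parts[4] != "cmcld:":
            continue
        # syslog lines carry no year; 2000 is only a placeholder so the date parses.
        stamp = datetime.strptime("2000 " + " ".join(parts[:3]), "%Y %b %d %H:%M:%S")
        message = parts[5].strip()
        if message == f"{interface} failed":
            failed_at = stamp
        elif message == f"{interface} recovered" and failed_at is not None:
            outages.append((stamp - failed_at).total_seconds())
            failed_at = None
    return outages

if __name__ == "__main__":
    print(outage_seconds(LOG, "lan0"))
```

On the lines quoted here it reports a single outage of 10 seconds for lan0, matching the switch to lan4 and back.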

MND
T G Manikandan
Honored Contributor

Re: Lan card driver issues

I suspect the problem is with the cable, or possibly with that LAN card.
cmquerycl could not even find the card.


Try replacing the cable first

Thanks
Marc Dijkstra
Trusted Contributor

Re: Lan card driver issues

TG,

I have tried all of that. What is important to remember is:

a) Even when /dev/lan0 was not bound to the card, I could still run all the associated commands: lanscan, landiag, ifconfig.
b) I could not, however, run cmcheckconf: no errors, it just hung.

Since the 2nd rebuild, all that seems to have disappeared... and apparently my client did indeed change LAN cables etc.

MND
James Murtagh
Honored Contributor

Re: Lan card driver issues

Hi Marc,

What we know is that you have a two node S/G cluster and there are differences in the servers. All issues seem to be network related.

I would advise :

Run show_patches on both servers for active patches, concentrating on network patching.

If STM is installed, use this to run a report on the in-built lan card on host b. Also, although not ideal, reset the in-built lan card on host a to see if you experience the same problems.

I would also look at patches PHNE_25907 and PHNE_25580 (Built-in PCI 100BASE-T patch & PCI 100BT lan cumulative).

Also check the firmware level and ensure server b's cards are configured as expected, i.e. 100FD with auto-negotiation off.
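
Comparing the two patch lists by eye is error-prone, so here is a minimal sketch of how the comparison could be scripted in Python. It does not assume any particular show_patches output layout; it simply pulls out anything shaped like a patch ID (PHCO/PHKL/PHNE/PHSS followed by a number). The function names and the two sample lists are made up for illustration and do not reflect the real state of either server.

```python
import re

# Assumption: patch IDs look like PHCO_/PHKL_/PHNE_/PHSS_ plus digits.
PATCH_ID = re.compile(r"\bPH(?:CO|KL|NE|SS)_\d+\b")

def patch_ids(text):
    """Extract the set of patch IDs mentioned anywhere in a saved patch listing."""
    return set(PATCH_ID.findall(text))

def compare_patches(text_a, text_b):
    """Return the patch IDs that appear on only one of the two hosts."""
    a, b = patch_ids(text_a), patch_ids(text_b)
    return {"only_on_a": sorted(a - b), "only_on_b": sorted(b - a)}

if __name__ == "__main__":
    # Illustrative input only, not the actual patch state of proda or prodb.
    host_a = "PHNE_25907  some description\nPHNE_25580  some description\nPHKL_18543\n"
    host_b = "PHKL_18543\nPHSS_27466\n"
    print(compare_patches(host_a, host_b))
```

Feeding it the saved output from each node gives the list of patches to chase on whichever side is behind.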

Regards,

James.
rick jones
Honored Contributor
Solution

Re: Lan card driver issues

/dev/lanN files are obsolete, and have been since 10.20. That they appear at all on 11 or 11i is an artifact of the lineage of specific drivers. The transport in 11 and later does not use those old "LLA-style" device files. Instead it opens /dev/dlpi and attaches to the specific PPA (Physical Point of Attachment - IIRC).

Resetting a NIC can indeed take up to 10 or 11 seconds if memory serves - a time during which traffic through that NIC will not flow. I suspect the reset time depends on whether or not autoneg is being used.

Collisions are, in and of themselves, normal - at least for a half-duplex link :) What would be abnormal are "late collisions", which in the past indicated that the network diameter was too large, but today more likely suggest a mismatch between the NIC and the switch: one side hardcoded to, say, full-duplex, the other set to autoneg.
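
To make the mismatch case concrete, here is a toy model in Python. It is a sketch of the usual behaviour, not of any particular NIC or switch: a side left on autoneg that sees no negotiation from its partner can still detect the speed but falls back to half duplex. The function names are invented for the example, and both ends are assumed to be capable of full duplex.

```python
VALID = ("auto", "full", "half")  # "full"/"half" mean autoneg off, duplex forced

def effective_duplex(local, peer):
    """Duplex one side ends up using, given its own setting and its partner's."""
    if local != "auto":
        return local  # forced settings are used as configured
    # Autoneg only succeeds if the partner negotiates too; otherwise fall back to half.
    return "full" if peer == "auto" else "half"

def resolve_link(nic, switch):
    """Return (nic_duplex, switch_duplex, mismatch) for a NIC/switch port pair."""
    for setting in (nic, switch):
        if setting not in VALID:
            raise ValueError(f"unknown setting: {setting!r}")
    nic_duplex = effective_duplex(nic, switch)
    switch_duplex = effective_duplex(switch, nic)
    return nic_duplex, switch_duplex, nic_duplex != switch_duplex

if __name__ == "__main__":
    for nic, switch in [("auto", "auto"), ("full", "auto"), ("full", "full")]:
        print(nic, switch, resolve_link(nic, switch))
```

The ("full", "auto") pairing is the one that produces late collisions: the forced side transmits whenever it likes while the other side is still listening for a quiet wire.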

I myself prefer to use autoneg everywhere; others seem to want to hard-code everything.
there is no rest for the wicked yet the virtuous have no pillows