Operating System - HP-UX

Lancard stops working ... after a week

 
Steven E. Protter
Exalted Contributor

Re: Lancard stops working ... after a week

Shalom,

Another idea.

Perhaps there is another machine coming on with the same ip address elsewhere on the network once a week.

Or worse, some Linux box with a shell script creating a virtual interface.

The way to detect this is to take the system off the network and ping its IP address at the time of the problem. If you get an answer, my hunch is right.
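The check described above could be scripted roughly as follows, to be run from a second host while the suspect machine is unplugged. This is only a sketch: `check_duplicate_ip` and the `PING_CMD` variable are hypothetical names, and the default `ping -c 3` is the Linux form (HP-UX ping takes the count after the host, so a wrapper would be needed there).

```sh
#!/bin/sh
# Hypothetical helper: with the real owner of the address unplugged,
# any reply to a ping means another device is using the same IP.
# PING_CMD is an assumption: it must accept the IP as its last argument.
check_duplicate_ip() {
    ip=$1
    if ${PING_CMD:-ping -c 3} "$ip" >/dev/null 2>&1; then
        echo "REPLY from $ip: another device is answering for this address"
        return 0
    fi
    echo "no reply from $ip: no duplicate detected"
    return 1
}

# Example (address is a placeholder):
#   check_duplicate_ip 192.0.2.10
```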

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Elmar P. Kolkman
Honored Contributor

Re: Lancard stops working ... after a week

Steven,

the idea is nice, but alas... untrue.
The first thing we did was ping the machine from different locations. All backup-related servers have host routes over the backup LAN, which makes the machine reachable from them. But when we ping from the machine to something outside the user LAN IP range, or to something inside it but without a host route, it doesn't work... Even pinging the default gateway fails.

Pinging from another system to the broken one fails too, unless it goes over one of the other networks (backup or storage LAN).
Every problem has at least one solution. Only some solutions are harder to find.
Jay Kidambi
Advisor

Re: Lancard stops working ... after a week

Linkloop works at the link layer. If there is no connectivity at the link level, there is no point debugging the problem up the stack: at IP or TCP.

Keep things simple. Run

# lanadmin -g 2

a few times next time after you notice that the link has stopped working. Specifically, examine the MIB counter ifOutQLen. If this counter is non-zero, and doesn't decrease at all (i.e., keeps increasing, or stays constant), it may indicate a hang at the link level. If you notice this behavior, I suggest that you replace the NIC first, if you have another to spare.
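A rough way to automate "run it a few times and watch the queue" is sketched below. It assumes the statistic appears in the `lanadmin -g` output as a line of the form `Outbound Queue Length = <n>`; the function names, sample count and interval are arbitrary choices for illustration.

```sh
#!/bin/sh
# Pull the outbound queue length out of `lanadmin -g <ppa>` output on stdin.
# Assumption: the line looks like "Outbound Queue Length    = 1023".
outq_len() {
    awk -F= '/Outbound Queue Length/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Sample the queue length several times, one value per line.
# Defaults (PPA 2, 5 samples, 10 s apart) are illustrative only.
watch_outq() {
    ppa=${1:-2}; count=${2:-5}; interval=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        lanadmin -g "$ppa" 2>/dev/null | outq_len
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: watch_outq 2 6 5
```

A value that stays at the same non-zero number, or only grows, across samples is the pattern to look for.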

HTH,
Jay
Elmar P. Kolkman
Honored Contributor

Re: Lancard stops working ... after a week

I ran the lanadmin commands, but interactively. But I didn't know precisely which parameters were related to the problem, so I didn't notice anything strange. If it happens again I will look at the lanadmin -g.

As for swapping the card: we don't have any spares... but a good support contract on the system, so we will ask HP to do it if we (and they) don't see any other solution. (I haven't yet logged a call at HP, though.)

Every problem has at least one solution. Only some solutions are harder to find.
LoC_1
Frequent Advisor

Re: Lancard stops working ... after a week

Hi
If this happens again and you cannot linkloop, there is a problem at the hardware layer.
There are 3 possible causes: the cable, the port on the switch, or the NIC.
You can try to reset the lan card with lanadmin using the following.
lanadmin
lan
ppa
use the instance number for the card.
reset
then quit

That should reset the card. If that fails I would arrange to have the nic card replaced. If it works I would start by changing the port on the switch if another one is available, then I would replace the cable and card in that order.
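For reference, the same menu sequence could probably be fed to lanadmin on standard input instead of being typed. This is an untested sketch: it assumes lanadmin reads its menu commands from stdin when not attached to a terminal, and the function names are made up for illustration. Try it by hand first.

```sh
#!/bin/sh
# Emit the interactive command sequence from the steps above:
# lan menu, select PPA, enter the instance number, reset, quit.
lan_reset_cmds() {
    printf '%s\n' lan ppa "$1" reset quit
}

# Assumption: lanadmin accepts the menu commands on stdin.
reset_lan() {
    lan_reset_cmds "$1" | lanadmin
}

# Example: reset_lan 2
```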

You can also look at the nettl log on the server; it may indicate the cause of the problem.
Elmar P. Kolkman
Honored Contributor

Re: Lancard stops working ... after a week

I've got good and bad news.

We installed the Sept 2005 patch bundle, resulting in a trouble-free week of operation.

But: the problem just re-occurred. That's the bad news.

The good news: now I can look into some of the suggestions more deeply.

First thing to notice:
lanadmin -g 2 gives the 'Inbound Discards' amount... and this increases.
And it gave an outbound queue length of 1023.
Now this gave me an idea: what if the outbound queue for the card is full, so no more packets can be sent?

So I used lanadmin interactively, moved to the lan menu and gave the card a reset. It started working again. So I could solve the issue without a full reboot.

Of course I had to reset the default route, because the default gateway had been unreachable for too long, but that was easy: the standard route delete/route add did the job.
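The route delete/route add step would look roughly like this on HP-UX. The gateway address is a placeholder, the function name and the `RUN` dry-run variable are hypothetical, and the trailing `1` is the hop count that HP-UX `route add` takes for a remote gateway.

```sh
#!/bin/sh
# Re-create the default route after the card reset.
# Set RUN=echo to print the commands instead of executing them.
reset_default_route() {
    gw=$1
    ${RUN:-} route delete default "$gw"
    ${RUN:-} route add default "$gw" 1
}

# Dry run with a placeholder gateway:
#   RUN=echo reset_default_route 192.0.2.1
```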

I will keep you informed on the progress with this problem, because we still have to figure out what's filling the outbound queue.
Every problem has at least one solution. Only some solutions are harder to find.
Elmar P. Kolkman
Honored Contributor

Re: Lancard stops working ... after a week

The problem has kept occurring over the last months, on both RP3410s with the combi cards that we have in our backup environment.

Apparently the problem is at the driver level, because HP just sent us a new patch for the IGE LAN driver to install, with our problem listed among the bugs fixed by this patch.

Now let's see if PHNE_34340 really solves our issue...
Every problem has at least one solution. Only some solutions are harder to find.