Busy NETISR on J6000 running HP-UX 10.20.

 
Roger M. Case
Occasional Contributor

I have a J6000 with 2 CPUs (hence 2 netisr processes) where, at any given level of CPU activity, 90% of that activity is related to the NETISR processes. Glance (gpm) shows that disk utilization is < 1%, memory utilization is < 1%, and network utilization is < 2% (if that; typical is 80 to 90 inbound/outbound packets a second).

nettune shows all default values in place!

lanadmin shows that the inbound/outbound error and collision counters are all 0.

Lan0 is a built-in 100Mb full-duplex interface on a switching hub, so the only traffic the workstation sees is traffic addressed to it.

I did search the forums and found that either my problem is an overworked NIC getting flooded by a ton of network traffic (which is not the case) or that a firmware patch is needed for the NIC, but a search on HP's support site for fixing a busy NETISR on a J6000 did not find anything.

Before the flood of "10.20 is a dinosaur" statements comes in: we are running 10.20 because a legacy application runs on this box.

Thanks in advance for all suggestions - Tim
rick jones
Honored Contributor

Re: Busy NETISR on J6000 running HP-UX 10.20.

Interesting problem - indeed, the general cause of netisr CPU time is packet processing. What I cannot remember is if loopback traffic flows through the netisr or not. Do you happen to have much in the way of loopback traffic?

As for firmware, there is no "firmware" as such for the core NIC on a j6000. However, there have been a number of firmware updates for the jNNNN's over the years, so I suppose you could make sure you have the latest general j6000 firmware patch installed.

Alas, (here comes the dinosaur statement :) there isn't much in the way of supported, in-the-field profiling available for 10.20, or I would say run that (eg prospect) and see where all the CPU time is being spent to get a clue as to the cause.

If you have disc util < 1% and mem util < 1% and net util < 2%, just what is this legacy application doing?!? :)
there is no rest for the wicked yet the virtuous have no pillows
Roger M. Case
Occasional Contributor

Re: Busy NETISR on J6000 running HP-UX 10.20.

Hi Rick - responding to your questions...
Q1: Do you happen to have much in the way of loopback traffic?
Ans: For comparison purposes I listed both lan0 and lo0 from a netstat (uptime is 3.5 days)...

Name   Ipkts     Ierrs  Opkts    Oerrs  Coll
lo0    785402    0      785399   0      0
lan0   10390013  0      1083262  0      0

After doing this I was surprised to see that the number of inbound loopback packets != the number of outbound loopback packets (HOW DO YOU GAIN/LOSE LOOPBACK PACKETS?), and this is unusual! So it would appear that netisr (well, at least maybe in 10.20) does handle loopback packets too.
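
As a rough cross-check of the netstat counters above, the average packet rate over the 3.5 days of uptime can be worked out directly. This is only a sketch: it assumes the counters started at zero at boot and have not wrapped since, and the helper name `average_rate` is made up for illustration.

```python
# Average packets per second from the netstat counters quoted above.
# Assumption: counters started at zero at boot and have not wrapped.

UPTIME_SECONDS = 3.5 * 24 * 60 * 60  # 3.5 days of uptime, as reported

# (Ipkts, Opkts) copied from the netstat output above
counters = {
    "lo0": (785402, 785399),
    "lan0": (10390013, 1083262),
}


def average_rate(packets, seconds=UPTIME_SECONDS):
    """Hypothetical helper: mean packets per second over the interval."""
    return packets / seconds


for name, (ipkts, opkts) in counters.items():
    print(f"{name}: {average_rate(ipkts):.1f} in/s, {average_rate(opkts):.1f} out/s")
```

Under those assumptions lan0 averages roughly 34 inbound packets a second and lo0 under 3, which fits the picture of a lightly loaded network.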


Q2)...just what is this legacy application doing?!?
This box is running a Mentor Graphics electrical application. Typically 2 to 3 users (each running an X server application on a Windows PC) log in to the box to execute the application with DISPLAY="their pc workstation IP address:0.0". These users all have their "HOME" accounts on a different Unix box. So the non-loopback traffic is a mix of X and NFS packets (the bulk being the X protocol). Running "top" typically shows the local X server taking the bulk of the CPU (as opposed to gpm showing it's the netisrs). My first thought was that maybe someone has a book resting on their keyboard and is sending out a flood of packets to the X process, but because the overall inbound packet traffic is very low, I have discounted this theory. I have not checked whether the NFS traffic is all version 3 to version 3 or a mix of version 2 and version 3. But right now I don't feel the problem is NFS related. Thanks for the inquiry - Tim
rick jones
Honored Contributor

Re: Busy NETISR on J6000 running HP-UX 10.20.

The "problem" with netstat statistics (particularly on 10.20) is they are kept in 32-bit counters. So, while there is only a small chance that they have wrapped, they may have wrapped in the 3.5 days.
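
To put a rough number on how likely a wrap is, the arithmetic can be sketched as below. It assumes unsigned 32-bit packet counters and a steady packet rate. The helper names are hypothetical, and the 90 packets a second figure is the upper end of the "80 to 90" quoted earlier in the thread.

```python
# How long an unsigned 32-bit packet counter takes to wrap at a steady rate.
# Assumption: the counter is unsigned and starts at zero.

COUNTER_MODULUS = 2 ** 32  # number of counts before the counter wraps
SECONDS_PER_DAY = 86400


def days_until_wrap(packets_per_second):
    """Hypothetical helper: days for a counter starting at zero to wrap."""
    return COUNTER_MODULUS / packets_per_second / SECONDS_PER_DAY


def rate_to_wrap_within(days):
    """Hypothetical helper: sustained packets/s needed to wrap in `days`."""
    return COUNTER_MODULUS / (days * SECONDS_PER_DAY)


print(f"{days_until_wrap(90):.0f} days to wrap at 90 packets/s")
print(f"{rate_to_wrap_within(3.5):.0f} packets/s needed to wrap in 3.5 days")
```

At 90 packets a second a wrap would take about a year and a half, and wrapping inside 3.5 days would need a sustained rate on the order of 14,000 packets a second.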

I suspect that the sources to beforeafter (ftp://ftp.cup.hp.com/dist/networking/tools/ ) would compile and run on 10.20 (heck, even other Unixes...) and with that you could save two separate "snapshots" to files and do the subtraction.
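
The subtraction itself is simple enough to sketch. The following is not the beforeafter tool, just an illustration of the idea, assuming unsigned 32-bit counters that wrapped at most once between snapshots. The function names and the snapshot values are made up for the example.

```python
# Wrap-aware difference between two snapshots of 32-bit counters.
# Assumption: each counter wrapped at most once between the snapshots.

COUNTER_MODULUS = 2 ** 32


def counter_delta(before, after, modulus=COUNTER_MODULUS):
    """Return after - before, correct across at most one counter wrap."""
    return (after - before) % modulus


def snapshot_delta(before, after):
    """Per-counter deltas for two snapshots stored as name -> value dicts."""
    return {name: counter_delta(before[name], after[name]) for name in before}


# Hypothetical snapshot values, chosen so that Ipkts wraps between them.
first = {"Ipkts": 4294967290, "Opkts": 1000}
second = {"Ipkts": 10, "Opkts": 1500}

print(snapshot_delta(first, second))
```

Taking the difference modulo 2**32 gives the true count of 16 inbound packets in the example even though the raw counter value went down between the two snapshots.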

There can be "loss" in loopback with UDP if the socket buffers overflow. There can even be overflow (I think, it has been a long long time) of netisr queues for loopback traffic. So, lo0 stats will not necessarily match in versus out (I didn't look to see which was larger).

You might also sanity check things by looking at the lanadmin stats for your interface.
there is no rest for the wicked yet the virtuous have no pillows