Operating System - HP-UX

Load Average in the 30's but system fine

 
SOLVED
Coolmar
Esteemed Contributor

Hi,

uptime is reporting a load average in the 30's, but the system is fine. top shows all processes under control and the system is not slow at all. We did have a runaway process a couple of days ago; I killed it, and since then the system has been fine. Is there any way to clear the run queue, or whatever that "load" represents?
Uday_S_Ankolekar
Honored Contributor

Re: Load Average in the 30's but system fine

Use ps with the UNIX95 variable set to list the processes along with their CPU usage:

UNIX95= ps -e -o ruser,vsz,pid,pcpu,args | more

You can then sort it on any field you want.
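
As a rough sketch of the sorting step, the listing can be piped through sort. The helper name sort_by_pcpu is made up for illustration, and it assumes the column order ruser,vsz,pid,pcpu,args from the command above, so that pcpu is field 4:

```shell
# Sort ps output (read from stdin) by the 4th column (pcpu), highest first.
# Assumes the column order ruser,vsz,pid,pcpu,args used in the command above.
# sort_by_pcpu is a hypothetical helper name, not a standard command.
sort_by_pcpu() {
    # Drop the header line, then sort numerically on field 4 in reverse.
    sed 1d | sort -k4,4nr
}

# On a live system (UNIX95 must be set for ps to accept -o on HP-UX):
# UNIX95= ps -e -o ruser,vsz,pid,pcpu,args | sort_by_pcpu | head -10
```

To sort on a different field, change the -k4,4 key to the matching column number.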

-USA..
Good Luck..
Rick Garland
Honored Contributor

Re: Load Average in the 30's but system fine

Is the load continuing to go higher?

Taking a guess: has any pfs mounting been done recently, and did it have trouble?
Are any shell (ksh) processes hanging around with no parent or terminal association?

Many possibilities, need to investigate further.
Jeff Schussele
Honored Contributor
Solution

Re: Load Average in the 30's but system fine

Hi Sally,

It's not the run queue that you have to worry about - that just shows the system is doing a lot of work, apparently efficiently as you say response is fine.
What you *do* have to worry about is the priority queue. If that starts going up then you'll have trouble. It shows processes being bumped off the CPU without completing their jobs. That's when the system will slow down.
So check the priority queue via glance/gpm & you'll see low values, I'm sure.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Coolmar
Esteemed Contributor

Re: Load Average in the 30's but system fine

No pfs mounting or shells gone awry. The system is basically idle....yet the load is at 30. It is not climbing either...it has been at 31-31 for 3 days.
Bill Hassell
Honored Contributor

Re: Load Average in the 30's but system fine

You likely have many looping processes that perform a very small I/O, wait for a short time (perhaps a polling program), and repeat. sar probably reports a high system percentage (sar -u 1 10) compared to the user percentage. The high sys percentage indicates programs are hitting the kernel hundreds to thousands of times per second. Because the majority of the time is spent waiting on the kernel, the load will appear high. If there are a lot of these short-running programs, sar -w 1 10 will show a very high pswch/s (process context switches per second), indicating short-runtime programs.
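
A small sketch for pulling the context-switch figure out of the sar -w report. The helper name avg_pswch is made up for illustration, and it assumes pswch/s is the last column and that sar ends the report with a line starting with "Average"; check this against the output on your own system:

```shell
# Print the pswch/s value from the "Average" line of sar -w output on stdin.
# Assumption: pswch/s is the last column and the summary line starts with "Average".
# avg_pswch is a hypothetical helper name, not a standard command.
avg_pswch() {
    awk '$1 == "Average" { print $NF }'
}

# On a live system:
# sar -w 1 10 | avg_pswch
```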

NOTE: uptime and top report the run queue average as a "load", but it is definitely not what you think. The "load" is the average run queue length over the measurement period. Suppose you have a program that is doing polling and simply queries the kernel 10 times per second. Now start 10 copies of the same program, and the context switch rate becomes 100 per second with virtually no user time.

There is nothing to clear in the runqueue--your system is doing what the programs tell it to do. I would take a ps listing and reboot. If the load returns to normal, find the programs that no longer are running after a reboot.
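
One way to do the before/after comparison, as a sketch only. The helper name missing_after_reboot and the /var/tmp file names are made up for illustration; both listings are assumed to hold one command line per line:

```shell
# Print command lines present in the "before" listing but absent from the "after" one.
# missing_after_reboot is a hypothetical helper; $1 = before file, $2 = after file.
missing_after_reboot() {
    before_sorted="/tmp/ps.before.$$"
    after_sorted="/tmp/ps.after.$$"
    # comm needs sorted input; -u also collapses repeated command lines.
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # -23 keeps only the lines unique to the first file.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Before the reboot:  UNIX95= ps -e -o args > /var/tmp/ps.before
# After the reboot:   UNIX95= ps -e -o args > /var/tmp/ps.after
# Then:               missing_after_reboot /var/tmp/ps.before /var/tmp/ps.after
```

Anything printed was running before the reboot and has not come back, which makes it a candidate for the polling programs described above.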


Bill Hassell, sysadmin