Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Rhonda Thorne
Frequent Advisor

I am IBM certified in performance and tuning and am waiting on my HP advanced performance and tuning certification. However, I have never seen a load average of 99-112 on a 6-way with 1440 processes, and this lasted for more than 4 hours: 800+ processes sleeping and 600+ running, with the nproc kernel parameter at 2000. OMG.... I have placed a software call with HP, but has anyone else had this much CPU activity on a 6-way K580 with 4 GB of memory?

Just curious....
Rhonda
Sharing my knowledge of UNIX flavors
8 REPLIES
Bill Hassell
Honored Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

The processor 'load' as measured by top and uptime is the average size of the run queue, which is not a very good measure for today's computers. Start with 6 processes, each consuming 100% CPU. Load factor will be 6 and overall compute load is 100%.
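To put rough numbers on that, here is a minimal sketch that divides a load average by the number of CPUs. The function name load_per_cpu is made up for illustration, and the result is only as meaningful as the run-queue average it starts from.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPU
# Print the load average divided by the CPU count, to two decimals.
# (Hypothetical helper, not part of any HP-UX tool.)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        if (ncpu <= 0) exit 1        # refuse to divide by zero
        printf "%.2f\n", load / ncpu
    }'
}

load_per_cpu 6 6      # six busy processes on six CPUs: prints 1.00
load_per_cpu 112 6    # the load from the original post:  prints 18.67
```

By this measure the K580 in question had nearly 19 runnable processes queued per processor, which is why the figure stands out even if it says little about how much real work each one is doing.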

But change the type of processes to badly written client/server polling where each process polls a client several times a second. In this scenario, the overall CPU load is low (from a user perspective) but system overhead will be very high (context switching, LAN overhead), especially when 500 copies of the program are run at the same time. LAN traffic will be very high, system overhead will approach 90% and the runqueue (load factor) might rise to 100 or even 200.

Yet the system seems to respond well. In this (not so) artificial case, the processes are stacked in the runqueue and run as fast as the dispatcher can get them started. But they have almost nothing to do (just a couple of LAN packets) so they complete the poll very quickly, sleep for a short time and back they come.

While the programs are collectively I/O and system overhead intensive, they allow interactive processes to still respond quickly and the system will seem reasonably responsive.

But the easy answer is that your load factor is unusually high and you should probably track down the culprit(s). It could be caused by a series of multi-threaded processes that are having problems.
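One way to start looking for the culprit is to count runnable processes per command name. The sketch below is only an illustration: runnable_by_command is a made-up name, and it reads "STATE COMMAND" pairs on standard input because ps field names and state letters vary between systems. On many systems something like `ps -e -o state= -o comm=` produces that input (on HP-UX 11.x the -o option is typically only available with UNIX95 set in the environment), but treat that command line as an assumption to check against your own ps man page.

```shell
#!/bin/sh
# runnable_by_command
# Read "STATE COMMAND" lines on stdin, keep those whose state starts with R
# (running/runnable), and print "COUNT COMMAND" with the busiest command first.
# (Hypothetical helper; state letters are assumed to follow the common R/S/Z style.)
runnable_by_command() {
    awk '$1 ~ /^R/ { n[$2]++ } END { for (c in n) print n[c], c }' |
        sort -k1,1nr -k2,2
}

# Sample input in place of live ps output:
printf '%s\n' 'R oracle' 'S sh' 'R oracle' 'R perl' 'S oracle' | runnable_by_command
# prints:
# 2 oracle
# 1 perl
```

If one command name accounts for most of the run queue, that is the place to start asking questions.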

Generally, I see the biggest CPU hogs as processes that perform massive LAN and CPU tasks at the same time, or processes that drain the filesystem bandwidth. The easiest way to crush a system is to start about 50 copies of:

du -s / > /dev/null

or

find / > /dev/null

Put these processes on a system with 6 processors and everyone will start complaining. That's why I never allow find / on any of my systems (a very common how-to example in beginner Unix classes).
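If a search really is needed, a gentler variant is to start from a specific directory, stay on one filesystem, and run at the lowest CPU priority. This is a sketch rather than a cure: polite_find is a made-up wrapper name, and nice only lowers CPU priority, so the disk I/O that find generates still competes with everything else.

```shell
#!/bin/sh
# polite_find DIR [find-expression...]
# Walk DIR at the lowest CPU priority without crossing into other mounted
# filesystems (-xdev). Any extra arguments are passed through to find.
# (Hypothetical wrapper for illustration.)
polite_find() {
    dir=$1
    shift
    nice -n 19 find "$dir" -xdev "$@"
}

# Example call (not run here): polite_find /var -name core
```

Limiting the starting directory matters more than the nice value, since it bounds how much of the filesystem gets walked in the first place.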


Bill Hassell, sysadmin
Vikas Khator
Honored Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Hi Rhonda,

Yes, I have seen that. A user had written a Perl script that was spawning child processes.

The system was horribly slow and we had a tough time killing the process, as it was spawning children faster than we could kill them.
Keep it simple
Paula J Frazer-Campbell
Honored Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Hi

A load of 154.6 was my record on a 6-way K580.
The machine stayed up and I was able to bring it under control without a panic.

600+ running processes need to be investigated - start with the software guys, they are usually to blame.

"I just thought" is the normal excuse.

;^)

Paula
If you can spell SysAdmin then you is one - anon
Ovidiu D. Raita
Valued Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Just wanted to let you know that I had the same problem with a 6-CPU, 8 GB K580 running a poorly written telecom billing application (I don't want to mention the name) that used to fork hundreds of processes at the same time, all of which died instantly.

With gpm I could see 400-500 dead processes at every refresh. There wasn't too much I could do. I wish I had been a performance guru at the time and could have done some magic ...

Ovidiu
Simple solutions to complex problems
Rhonda Thorne
Frequent Advisor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Thanks for the input. I know what is causing the performance degradation... we are running Oracle and have outgrown our current systems. (New V class 2600 and N class 4000 systems are coming in today.) Woohoo!

I was just wondering if the old K580's 240 MHz CPUs could handle the load until we migrate to the new platforms.

Rhonda
Sharing my knowledge of UNIX flavors
Rhonda Thorne
Frequent Advisor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Good to see ya again Vikas. Happy New year.

Rhonda
Sharing my knowledge of UNIX flavors
David Totsch
Valued Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

I never use CPU Load Average as a direct measurement. Take a look at Global Wait States ("B" in GlancePlus). In the lower right corner you will see "Priority". If you see processes blocked on priority, you can safely say you are overwhelming the CPU(s) or there are several processes hogging the CPU(s).

-dlt-
Paula J Frazer-Campbell
Honored Contributor

Re: Has anyone seen a Load Avg of 99-112 on a 6 way k580?

Hi

Controlling your load can be done in many ways and depends upon your connectivity and of course what your company is doing.

1. X25 connections can be controlled by reducing the number of connections allowed in the file /etc/x25/x25configxxx

2. If in a sales environment, insist that admin work is reduced or done at quiet times so that the server can concentrate on selling.

3. Many users like multiple sessions running - stop or reduce them.

4. Defrag the database - check the forums on how to do this.

5. Stop all non-essential background processes.

6. Stop large reports / searches.

HTH
Paula
If you can spell SysAdmin then you is one - anon