Operating System - HP-UX

brian_31
Super Advisor

Script to kill CPU HOG

Hi Team:

We currently have some runaway processes on our N-class (HP-UX 11.0, 64-bit) that quickly climb to 100% CPU. The SA kills the process manually (kill -15) after 45 minutes, which gives him enough time to capture some glance or tusc output. We need to automate this. Where do we start? Maybe with top: when a process hits 95% CPU, the script should pick it up, do all the collection (I am not sure whether this can be automated in glance, but maybe we could run tusc against the PID), and after 30 minutes kill the PID. I need some help scripting this. I am sure this would help everyone, because an SA once killed a system process by mistake on the production box, and we want this done by a script instead.

Thanks
Brian.
steven Burgess_2
Honored Contributor

Re: Script to kill CPU HOG

Hi

Try this

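#!/usr/bin/sh
# Interactive helper: send SIGTERM to every process matching a pattern.

echo "\nkill process script"
echo "====================="

echo "Please enter your choice of process"
read choice

# Take the PID from the second column of 'ps -ef' instead of cutting
# fixed character positions, and drop the grep that does the searching.
pids=`ps -ef | grep "$choice" | grep -v grep | awk '{ print $2 }'`

if [ -z "$pids" ]; then
    echo "no process matches $choice"
else
    # $pids is left unquoted on purpose so each PID is a separate argument
    kill -15 $pids
fi

echo "\nkill process script ends"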



You can change the kill signal to whatever value you feel is necessary

Regards

Steve
take your time and think things through
James R. Ferguson
Acclaimed Contributor

Re: Script to kill CPU HOG

Hi Brian:

Some words of caution. First, just because a process consumes the majority of a CPU isn't necessarily bad or wrong. A compute-bound process with very few interrupts (I/O, etc.) will exhibit this kind of behavior.

Second, if you do decide to script an "automatic" kill, issue a 'kill -15' (SIGTERM) first; if that doesn't work, try 'kill -1' (SIGHUP); and only as a last resort use 'kill -9'. Remember that a 'kill -9' can't be caught, so the process gets no chance to clean up and may leave shared memory segments and temporary files lying around.
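
A minimal sketch of that escalation, assuming a pause between signals is acceptable. The function name 'escalate_kill' and the GRACE setting are hypothetical, not part of any HP-UX tool:

```shell
#!/usr/bin/sh
# Sketch: escalate SIGTERM (15) -> SIGHUP (1) -> SIGKILL (9).
# GRACE is an assumed pause, in seconds, between signals.
GRACE=${GRACE:-10}

escalate_kill() {
    pid=$1
    for sig in 15 1 9; do
        # kill -0 sends no signal; it only tests that the PID still exists
        kill -0 "$pid" 2>/dev/null || return 0
        kill -"$sig" "$pid" 2>/dev/null
        sleep "$GRACE"
    done
    # Still present after SIGKILL: report failure to the caller
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

Called as 'escalate_kill 12345', it stops as soon as the process is gone, so a process that honors SIGTERM never sees the harsher signals.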

Third, if you are using 'ps' to locate a process by name, use the UNIX95 (XPG4) form and the '-C' option to locate your process by its basename. See the 'ps' man pages for more details. This avoids extraneous matches for processes you don't want!

You should arm the XPG4 form only for the duration of the 'ps' command by doing this:

# UNIX95= MYPID=`ps -C tar`

Note that no semicolon occurs after the equal sign. Rather, there is a blank character and then the variable and command into which the output is placed.

Regards!

...JRF...
Tom Danzig
Honored Contributor
Solution

Re: Script to kill CPU HOG

I started writing a script to do the same thing. What I had is attached. You could easily modify it to kill the offending process. As it stands, it prints a message if a "runaway" is found.



Martin Johnson
Honored Contributor

Re: Script to kill CPU HOG

The problems with automating the "kill" process are:

1 - the possibility of killing a valid process (like java).

2 - when the system becomes busy, the CPU threshold will not be met, hence the offending process will not be killed.

After being burnt for killing the wrong process, I now have MeasureWare send an alert to HP OpenView Operations. I always investigate before killing.

HTH
Marty
James R. Ferguson
Acclaimed Contributor

Re: Script to kill CPU HOG

Hi (again) Brian:

Sorry, I dropped part of the suggested syntax for finding a process's pid by name. I meant to write:

# MYPID=`UNIX95= ps -C tar -o pid|awk 'NR>1 {print $0}'`

This captures the pids of all (for example) 'tar' processes in the MYPID variable. The column heading above the list is skipped (here with 'awk'). If MYPID is empty, no processes match:

# [ -z "$MYPID" ] && echo "nothing matches"
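
The same XPG4 form can also select by CPU use rather than by name. This is only a sketch: it assumes the 'pcpu', 'pid' and 'comm' column names of 'ps -o', and the names 'filter_hogs', 'list_hogs' and THRESHOLD are made up for illustration:

```shell
#!/usr/bin/sh
# Sketch: print "pid pcpu command" for processes at or above a CPU threshold.
THRESHOLD=${THRESHOLD:-95}   # assumed percentage

# Reads "pcpu pid comm" lines on stdin; the first line is the heading.
filter_hogs() {
    awk -v limit="$1" 'NR > 1 && $1 + 0 >= limit + 0 { print $2, $1, $3 }'
}

# UNIX95 is armed only for the duration of the ps command.
list_hogs() {
    UNIX95= ps -e -o pcpu,pid,comm | filter_hogs "$THRESHOLD"
}
```

Running 'list_hogs' prints one line per candidate; an empty result means nothing crossed the threshold at that instant, which is a single sample and not proof of a runaway.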

Regards!

...JRF...


Robert Gamble
Respected Contributor

Re: Script to kill CPU HOG

Brian,

I would only implement an auto-kill script if you have investigated the following:

1. Which process is killed most consistently?
2. Is that process legitimate to the end user?
3. If it is indeed a runaway process, and it happens consistently, I would contact support for the application/process and see if this is a known issue or a new bug.

I would not recommend auto-killing any process that just happens to be the 'top' process.

Good Luck!
brian_31
Super Advisor

Re: Script to kill CPU HOG

Hi:
Thanks guys. I did not quite understand this part from Martin:
- when the system becomes busy, the CPU threshold will not be met, hence the offending process will not be killed.

To answer the other questions:
I know which process is the offender (an application process). But I did not understand how you would pick up the top process with ps -C.

Thanks
Brian.
Martin Johnson
Honored Contributor

Re: Script to kill CPU HOG

What I meant by that statement is best shown by an example:

Threshold set at 95%. On a system with 1 CPU, one looping process will get almost 100% CPU utilization. Two looping processes, competing for the single CPU, will get about 50% CPU each. This is far less than the 95% needed to trigger the kill.
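
The same arithmetic as a small sketch, assuming an idealized single CPU shared evenly among the looping processes (real scheduling is less tidy). 'cpu_share' is a hypothetical helper:

```shell
#!/usr/bin/sh
# Idealized share of one CPU, in percent, for each of N looping processes.
cpu_share() {
    awk -v n="$1" 'BEGIN { printf "%d\n", 100 / n }'
}

THRESHOLD=95
for loopers in 1 2 3; do
    share=`cpu_share "$loopers"`
    if [ "$share" -ge "$THRESHOLD" ]; then
        echo "$loopers looping: ${share}% each, threshold met"
    else
        echo "$loopers looping: ${share}% each, threshold missed"
    fi
done
```

Only the single-looper case reaches the threshold; with two or more, each process stays well under 95% and a per-process check never fires.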

HTH
Marty
brian_31
Super Advisor

Re: Script to kill CPU HOG

Hi There,

Could someone help by explaining Tom Danzig's script (the two awk statements)? Also, how do I modify it to start monitoring the problem PID once it runs away? The script should capture glance output on the problem PID for 30 minutes (system calls, where it is spending its time, etc.) and then kill the process. Is it possible, Gurus?
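
One possible shape for the collect-then-kill part, as a rough sketch only and not a reading of Tom's script. It takes an already identified PID, logs a 'ps' snapshot at intervals as a stand-in for glance or tusc output, and sends SIGTERM when the window expires. The names 'watch_then_kill', WINDOW, INTERVAL and LOGDIR are all assumptions:

```shell
#!/usr/bin/sh
# Sketch: log snapshots of one PID for a fixed window, then send SIGTERM.
WINDOW=${WINDOW:-1800}       # 30 minutes, in seconds
INTERVAL=${INTERVAL:-60}     # assumed seconds between snapshots
LOGDIR=${LOGDIR:-/var/tmp}   # hypothetical log location

watch_then_kill() {
    pid=$1
    log=$LOGDIR/hog.$pid.log
    elapsed=0
    while [ "$elapsed" -lt "$WINDOW" ]; do
        # Stop early if the process has exited on its own
        kill -0 "$pid" 2>/dev/null || return 0
        date >> "$log"
        # Placeholder collector: substitute glance or tusc capture here
        UNIX95= ps -p "$pid" -o pid,ppid,pcpu,vsz,etime,args >> "$log"
        sleep "$INTERVAL"
        elapsed=$((elapsed + INTERVAL))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    echo "sending SIGTERM to $pid" >> "$log"
    kill -15 "$pid"
}
```

Taking the PID as an argument keeps the decision about which process is a runaway separate from the collection and the kill, which matters given the cautions earlier in the thread.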

Thanks
Brian.
brian_31
Super Advisor

Re: Script to kill CPU HOG

Hello All:

Good morning! Can someone help, please? The actual question is in my post above.

Thanks
Brian.