Secure OS Software for Linux

script for monitoring hang process

SOLVED
bong_3
Advisor


repost...

Hi all,

I've just started learning shell scripting on Linux and find it very interesting. Now I'm trying to monitor a process that keeps hanging on my Linux box, and perhaps kill it, by running a script from cron.

see my attachment...

From there, when I run top, it shows that a bash process under user1 is using a lot of CPU, which causes my Linux box to hang.

I wonder whether anyone can suggest how to write a script for this, or share a script if you happen to have one already.

Appreciate anyone's help.

Bong
5 REPLIES
G. Vrijhoeven
Honored Contributor

Re: script for monitoring hang process

Hi Bong,

It is possible to monitor the CPU usage of processes in a script. But what action would you like to take: stopping the process (kill), or e-mailing the root user? How often would you like to monitor, etc.?

Can you give us some more info?

Gideon
G. Vrijhoeven
Honored Contributor

Re: script for monitoring hang process

Hi Again,

You could look into PRM. This way you can manage CPU time. Check:

http://resourcemanagement.unixsolutions.hp.com/WaRM/prm_linux/schedpolicy.html

Gideon
Steven E. Protter
Exalted Contributor

Re: script for monitoring hang process

Bill McNamara has three threads asking for good sysadmin scripts.

http://forums1.itrc.hp.com/service/forums/pageList.do?userId=BR180643&listType=question&forumId=1

There are hundreds of scripts, many of which work just fine on Linux.

That's a good starting point.

SEP
bong_3
Advisor

Re: script for monitoring hang process

Hi,

Yes, Gideon! I want to kill it and perhaps notify a user of the action taken.

When I run this command:
ps aux | grep bash
user1 1285 94.6 0.4 2552 592 tty1 S Jan27 0:00 -bash
user1 29422 0.0 1.4 2828 1796 pts/0 S 11:13 0:00 -bash
user1 29580 0.4 1.3 2808 1776 pts/1 S 11:14 0:00 -bash
root 29627 0.2 1.3 2756 1700 pts/1 S 11:14 0:00 bash
root 29651 0.0 1.3 2756 1700 pts/1 R 11:14 0:00 bash

What condition should I use so that I get only PID 1285, with 94.6% CPU usage, and kill it, while ignoring the rest of the bash processes?

Steven: thanks for the link... I'm now starting to collect those posted scripts. :)

bong
Ian Kidd_1
Trusted Contributor
Solution

Re: script for monitoring hang process

I'm a little concerned about providing this script for two reasons:
(1) I still feel the best course of action is to determine WHY the bash shell is using that much CPU and to fix that issue rather than killing the shell. I don't know what your environment is like, but issuing a kill -9 against a shell would uncleanly kill anything that shell was running. I'd worry about database corruption issues and lost data.
(2) I just wrote the script and it hasn't been thoroughly tested.

You mentioned that you are just learning Linux scripting. Here's an opportunity to practice. TEST this script thoroughly before using it. Understand what each line does. I've commented out the kill portion as a safety precaution. The forums provide a lot of info for free, but make no guarantees, nor are there any warranties. It's "use at your own risk". AGAIN, test this script THOROUGHLY before uncommenting the kill portion and moving it into production.

This script could've been a bit cleaner, but the logic of this script seemed like a good idea at the time. You could set up cron to run this at certain intervals (after you test). I put the THRESHOLD variable at the top - that way, you can easily customize at what CPU threshold you want the kill invoked.
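For the cron part, a crontab entry along these lines should do it. This is only a sketch: the path /usr/local/bin/bash_cpu_watch.sh is a made-up name for wherever you save the script, and the five-minute interval is just an example. Add it with crontab -e as root, since the script needs permission to kill another user's shell.

```
# min  hour  day-of-month  month  day-of-week  command
*/5    *     *             *      *            /usr/local/bin/bash_cpu_watch.sh
```

Remember to make the script executable (chmod +x) first.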


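#!/bin/sh
# Report (and optionally kill) any bash process at or above THRESHOLD percent CPU.
THRESHOLD=95    # note: the 94.6% shell in the ps output above truncates to 94, so lower this to catch it
# -C bash selects by command name (login shells included); "-o name=" prints that column with no header
ps -C bash -o pid= -o user= -o pcpu= | while read PID OWNER CPU
do
    CPU=${CPU%.*}    # drop the fraction; [ -ge ] only compares integers
    if [ "$CPU" -ge "$THRESHOLD" ]
    then
        # the mail is sent even while the kill line below is still commented out
        echo "Killed user $OWNER bash shell PID $PID due to excessive CPU usage" | mail "$OWNER" root
        # kill -9 "$PID"
    fi
done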
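
If you want to try the selection condition on its own before going anywhere near kill, the same test can be written as a single awk filter over ps aux output. This is only a sketch: over_threshold is a name I made up, and it assumes the default ps aux column order (USER, PID, %CPU, ... COMMAND in field 11) and that the shell shows up as exactly "bash" or "-bash".

```shell
#!/bin/sh
# over_threshold LIMIT
# Reads "ps aux"-style lines on stdin and prints "PID USER %CPU" for every
# bash process whose %CPU (field 3) is at or above LIMIT.
# (over_threshold is a hypothetical helper name, not a standard command.)
over_threshold() {
    awk -v limit="$1" '
        # field 11 is the COMMAND column; login shells appear as "-bash"
        ($11 == "bash" || $11 == "-bash") && $3 + 0 >= limit + 0 {
            print $2, $1, $3
        }
    '
}

# Example use against the live process table:
#   ps aux | over_threshold 90
```

It only prints the matches, so it is safe to run while you work out which threshold suits your system.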
If at first you don't succeed, go to the ITRC