Operating System - HP-UX

shell script needs to check for running instances of itself

 
Tom Wolf_3
Valued Contributor


Hello all. I have a ksh script on HP-UX servers that runs out of cron every 15 minutes. Within this script, I need to write some logic that checks whether any other instances of the same script are running on the host. If there are no other instances, the script can continue to execute. However, if another instance is running, the current session should send an email and terminate immediately.

This script is only executed via cron, and on rare occasions it hangs. This causes issues for subsequent runs of the same script, which is why I'm trying to implement this logic. Anyway, this is proving to be a bit more difficult than I thought. I was thinking simple logic like the following would be sufficient, but since the script is executed via cron, there appear to be multiple processes associated with the script name each time it runs.




NUM_OF_SCRIPTS=`/usr/bin/ps -ef | grep <script-name> | grep -v grep | wc -l`

if [ $NUM_OF_SCRIPTS != 1 ]; then
    echo "$EMAIL_BODY" | /usr/bin/mailx -s "Your script is hung on `hostname`" "$EMAIL_ADDRESS"
    exit 1
fi


Anyway, I'd really appreciate hearing from anyone who has had to resolve a similar issue. I can provide more details if needed.

Thanks in advance.

Tom Wolf
7 REPLIES
James R. Ferguson
Acclaimed Contributor
Solution

Re: shell script needs to check for running instances of itself

Hi Tom:

Using 'grep' to look for a process by name is prone to finding things you don't want to find. That said, a better way is this:

# UNIX95= ps -C syslogd -o pid,comm

This sets the XPG4 (or 'UNIX95') behavior only for the duration of the command line since setting it can have side-effects for other commands that you may not want or know.

The command looks for a process or processes named "syslogd" in this example and reports the pid and command name.

Since a heading line is returned which you may not want, you can suppress it thusly:

# UNIX95= ps -C syslogd -o pid= -o comm=

Now you can do:

# NUM_OF_SCRIPTS=$( # UNIX95= ps -C syslogd -o pid= -o comm= | wc -l )

Notice that the use of backticks is deprecated. The '$( ... )' POSIX notation is more readable.

Regards!

...JRF...

Keith Bryson
Honored Contributor

Re: shell script needs to check for running instances of itself

I do something similar, but with a kill to clean up rogue processes - you could adapt this:

MY_PROC=$$
ps -ef | grep scriptname | grep -v grep | awk {'print $2'} | grep -v $MY_PROC |
xargs kill -9

Best regards and good luck - Keith
Arse-cover at all costs
Bill Hassell
Honored Contributor

Re: shell script needs to check for running instances of itself

James wrote:

# NUM_OF_SCRIPTS=$( # UNIX95= ps -C syslogd -o pid= -o comm= | wc -l )

The # in front of UNIX95= is extraneous. It should read:

# NUM_OF_SCRIPTS=$(UNIX95= ps -C syslogd -o pid= -o comm= | wc -l)

I would probably simplify the line to just:

# NUM_OF_SCRIPTS=$(UNIX95= ps -C syslogd -o pid= | wc -l)

since you don't need any additional information for the email message.

As mentioned, use ps itself to find processes by name, never grep, since grep cannot be limited to a specific field of the ps output. UNIX95= turns on additional options (see man ps).

A couple of notes about full paths (/usr/bin/ps):

The reason to use explicit paths is to avoid oddball paths that can exist in a variety of environments. I used to specify /usr/bin for everything until I realized that I can set my script's PATH precisely the way I want -- export PATH=/usr/bin won't affect the parent process or shell. So all my scripts start with export PATH=/usr/bin:/usr/contrib/bin...etc

And to find the basename of your script, use the shell's builtin capability: MYNAME=${0##*/}, which returns your script's name regardless of what path was used to start it (i.e., ./script, /usr/local/bin/script, etc.), which will appear in the ps listing.
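
To illustrate what the ${0##*/} expansion does, here is a minimal sketch. The function name script_basename is made up for this example, and the paths are just the ones mentioned above; the expansion itself is standard in ksh and the POSIX shell.

```shell
#!/bin/sh
# script_basename PATH (hypothetical helper, for illustration only)
# ${1##*/} removes the longest leading match of "*/", that is, everything
# up to and including the last slash. A name without a slash is unchanged.
script_basename() {
    printf '%s\n' "${1##*/}"
}

script_basename ./script                 # prints: script
script_basename /usr/local/bin/script    # prints: script
script_basename script                   # prints: script
```

Inside a real script you would apply the same expansion to $0 directly, as in MYNAME=${0##*/}.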

So your actual script might have this code:

#!/usr/bin/ksh
export PATH=/usr/bin
MYNAME=${0##*/}
NUM_OF_SCRIPTS=$(UNIX95= ps -C $MYNAME -o pid= | wc -l)
if [ $NUM_OF_SCRIPTS -gt 1 ]; then
    echo "$EMAIL_BODY" | mailx -s "$MYNAME script is hung on $(hostname)" "$EMAIL_ADDRESS"
    exit 1
fi
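
One possible reason for the count being higher than expected is that the $( ... ) command substitution itself runs in a forked copy of the script, which can show up under the same process name. This is an assumption; the thread does not confirm that it is what Tom saw under cron. A hedged sketch of a filter that ignores the script's own PID and its direct children is shown below. The function name count_other_instances is hypothetical, and it expects "pid ppid" pairs such as those printed by ps with -o pid= -o ppid=.

```shell
#!/bin/sh
# count_other_instances SELF_PID (hypothetical helper)
# Reads "pid ppid" pairs on standard input and prints how many of them are
# neither SELF_PID itself nor a direct child of SELF_PID.
count_other_instances() {
    awk -v self="$1" '
        NF >= 2 && $1 != self && $2 != self { n++ }
        END { print n + 0 }
    '
}

# Possible use inside the script (HP-UX XPG4 ps, as in the thread).
# The list is captured first so that the filter works on a fixed snapshot:
#   LIST=$(UNIX95= ps -C "$MYNAME" -o pid= -o ppid=)
#   OTHERS=$(printf '%s\n' "$LIST" | count_other_instances $$)
#   if [ "$OTHERS" -gt 0 ]; then
#       ... send the mail and exit 1 ...
#   fi
```

The filter only looks one level down, so deeper descendants that carry the script's name would still be counted.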


Bill Hassell, sysadmin
James R. Ferguson
Acclaimed Contributor

Re: shell script needs to check for running instances of itself

Hi (again) Tom:

Bill, thanks for correcting the extraneous "#".

With regard to a 'kill -9' --- use this only as a last resort.

A 'kill -9' cannot be trapped by a process, and thus the process has no chance to clean up temporary files or remove shared memory segments. This is quite undesirable.

Instead, perform an escalating series of 'kill' actions.

Capture your pid list in a variable as:

# PIDS=$(UNIX95= ps -C myprocess -o pid=)

Then do:

# kill -1 ${PIDS}
# kill -15 ${PIDS}
# kill -9 ${PIDS}

This leaves a 'kill -9' as the last resort only.
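
Run back to back, the three commands above give the process no time to react between signals. A minimal sketch of the escalation with a pause after each signal might look like this. The function name terminate_gracefully and the grace period argument are assumptions made for the example, not something from the thread.

```shell
#!/bin/sh
# terminate_gracefully GRACE_SECONDS PID...   (hypothetical helper)
# Sends signal 1 (HUP), then 15 (TERM), then 9 (KILL) to the given PIDs,
# pausing GRACE_SECONDS after each signal and stopping as soon as none of
# the PIDs still exists. Returns 0 if all are gone, 1 otherwise.
terminate_gracefully() {
    grace=$1
    shift
    for sig in 1 15 9; do
        alive=""
        for pid in "$@"; do
            # kill -0 sends no signal; it only tests whether the PID exists
            # (it also fails if we are not allowed to signal the process)
            if kill -0 "$pid" 2>/dev/null; then
                alive="$alive $pid"
            fi
        done
        [ -z "$alive" ] && return 0
        # $alive is deliberately unquoted so it splits into separate PIDs
        kill -"$sig" $alive 2>/dev/null
        sleep "$grace"
    done
    # Final check after the kill -9
    for pid in "$@"; do
        kill -0 "$pid" 2>/dev/null && return 1
    done
    return 0
}

# Possible use with the PID list captured as shown above:
#   PIDS=$(UNIX95= ps -C myprocess -o pid=)
#   terminate_gracefully 5 $PIDS
```

A process that exits on the first signal never receives the later ones, so the 'kill -9' is reached only when the gentler signals were ignored.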

Regards!

...JRF...

Re: shell script needs to check for running instances of itself

Hello Tom,

Instead of searching the process table for an instance of your script, I suggest you use a lock.

For instance, you might check for the presence of a file (the "lock") to determine whether your script must exit or continue, and remove this lock when your script terminates:

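if [[ -f my_lock ]] ; then
    my_mailing_function
    exit 1
else
    # We own the lock from here on: record our PID and remove the lock on exit.
    # Setting the trap before the check would delete the other instance's lock.
    print $$ > my_lock
    trap 'rm -f my_lock' EXIT
fi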

There are faster and more reliable methods to set locks, but as your script is only run every 15 minutes, this one might be OK.
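
As one example of a more reliable variant (a different technique from the file test above, not something proposed in this thread), a directory can serve as the lock, because mkdir either creates it or fails in a single step. The sketch below is hedged accordingly: the function names acquire_lock and release_lock and the lock path are assumptions.

```shell
#!/bin/sh
# acquire_lock LOCKDIR   (hypothetical helper)
# Returns 0 if this process created the lock directory, 1 if mkdir failed.
# Because the existence test and the creation are one operation, two
# instances cannot both succeed. Note that mkdir can also fail for other
# reasons, such as missing permissions.
acquire_lock() {
    mkdir "$1" 2>/dev/null || return 1
    # Record the owner's PID for later inspection
    echo $$ > "$1/pid"
}

# release_lock LOCKDIR: removes the lock created by acquire_lock.
release_lock() {
    rm -f "$1/pid"
    rmdir "$1" 2>/dev/null
}

# Possible use at the top of the cron script:
#   LOCK=/tmp/mycronjob.lock.d               # example path, an assumption
#   if acquire_lock "$LOCK"; then
#       trap 'release_lock "$LOCK"' EXIT     # set only once we own the lock
#   else
#       my_mailing_function
#       exit 1
#   fi
```

As with the file-based version, a lock left behind by a killed or crashed run has to be detected and cleaned up separately.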

Cheers,

Jean-Philippe

Steve Post
Trusted Contributor

Re: shell script needs to check for running instances of itself

Feel free to tell me what's wrong with this. Apache uses this concept so I tried it too.

Lock=/tmp/mycronjob.lock
if [ -f $Lock ];then
    echo "running mycronjob already."
    echo "stopping this instance of mycronjob."
    exit
else
    echo "${$}" > $Lock
    # <<<>>>  (the job's own commands go here)
    rm $Lock
fi


The lock file holds the process ID of the last or current job. I suppose you could use ps commands to verify that the PID listed in the lock file really is what it is supposed to be.
You could also check the age of that lock file. If it is 2 days old, your job has been stuck (or still running!) for 2 days.
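
A minimal sketch of the PID check follows. It uses kill -0 rather than ps, which is a substitution made for brevity; the function name lock_is_live is hypothetical. It only tells you that some process with that PID exists, not that it is really the cron job, so a ps lookup on the PID would still be needed to confirm the name.

```shell
#!/bin/sh
# lock_is_live LOCKFILE   (hypothetical helper)
# Returns 0 if LOCKFILE holds the PID of an existing process, and 1 if the
# file is missing, empty, not a plain number, or names a PID that is gone.
lock_is_live() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 1 ;;    # empty or not purely digits
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # It also fails for a process owned by another user, and a recycled PID
    # would be reported as live.
    kill -0 "$pid" 2>/dev/null
}

# Possible use with the lock file from the script above:
#   if [ -f "$Lock" ] && ! lock_is_live "$Lock"; then
#       echo "stale lock from a dead job, removing it"
#       rm -f "$Lock"
#   fi
```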

Bill Hassell
Honored Contributor

Re: shell script needs to check for running instances of itself

> MY_PROC=$$
> ps -ef | grep scriptname | grep -v grep | awk {'print $2'} | grep -v $MY_PROC |
> xargs kill -9

Please heed what James said about kill -9 but more important, NEVER use grep to find a process by name. Suppose you want to terminate all sh processes:

ps -ef | grep sh | grep -v grep
root 4 0 0 Jan 14 ? 01:14 unhashdaemon
root 768 1 0 Jan 14 ? 00:00 sh /etc/vx/bin/vxrelocd root
root 19854 1 0 Jan 25 ? 00:00 /opt/ssh/sbin/sshd
root 20666 1 0 Jan 25 ? 00:00 /sbin/sh /usr/dt/bin/dtrc
root 19494 1 0 Jan 25 ? 00:00 sh /etc/vx/bin/vxrelocd root
root 439 437 0 09:57:27 pts/0 00:00 -sh
root 437 19854 0 09:57:21 ? 00:04 sshd: root@pts/0

In this example, your script would terminate the kernel hashdaemon (because it has sh in it) as well as all ssh processes and the ssh daemon, Xwindows code and a couple of VxVM processes.

Because you cannot tell grep where to look, ps piped into grep is a very dangerous combination. On HP-UX, ps has an XPG4 option, -C, that searches the kernel process table for the name. No leading path names, no user IDs, no parameters on the command line, just the process name. Compare the output from these two ps listings:

ps -ef | grep sh | grep -v grep
UNIX95=1 ps -fC sh

The grep version makes many bad mistakes while the -C option is exactly correct. NOTE: UNIX95=1 is a flag to ps to use XPG4 extensions. It is on the command line so it lasts only for the duration of the ps command. Do not set this permanently in your login environment. UNIX95=1 can affect other processes in unexpected ways. For convenience, you can make an alias to take advantage of -C and also -H and -o options:

alias ps="UNIX95=1 /usr/bin/ps"

To see a hierarchical listing (similar to pstree), use ps -eH. To choose exactly which columns to see, use ps -e -o vsz,pid,args

man ps


Bill Hassell, sysadmin