catching process death

 
Ignacio Javier
Regular Advisor

catching process death

Hello gurus:

I would like to ask whether you know of a development technique, or whether the HP-UX 11.23 OS has some way, to catch the uncontrolled death of a process.

What I want is to develop a process that knows immediately when some other process has died.

Thanks in advance
11 REPLIES
SANTOSH S. MHASKAR
Trusted Contributor

Re: catching process death

Hi,

Please elaborate on what exactly you want to do.

Ignacio Javier
Regular Advisor

Re: catching process death



OK, what I want is to develop a process to control others, and to restart them if they die.

I've got one now that does it by sending a signal every 5 seconds, and I want to improve it, either by having the processes communicate with each other or by some other mechanism within the OS, so that the controlling process knows that another process has died at the moment it happens.
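For reference, the polling approach described above can be sketched as follows. This is only an illustration, assuming the periodic signal is the null signal (signal 0), which checks for the existence of a process without delivering anything; the names is_alive and wait_for_death are hypothetical, and Python is used purely for brevity.

```python
import os
import time


def is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but belongs to another user
    return True


def wait_for_death(pid, interval=5.0):
    """Poll until the process disappears.

    Detection can lag by up to `interval` seconds, which is the
    limitation of this approach.
    """
    while is_alive(pid):
        time.sleep(interval)
```

Two caveats apply to any probe of this kind: a child that has terminated but has not yet been reaped by its parent still counts as existing, and a pid can be reused by an unrelated process after the original one is gone.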

Regards
Veeru_1
New Member

Re: catching process death

Hi Ignacio,

I am developing a new tool - sentinel - which does exactly that. It is available for beta users with WDB 5.6 available for free download from the web at http://www.hp.com/go/wdb. The tool is not yet supported by HP since we first want to validate its need.

Please take a look at it. The specific usage that you want to try is

$> /opt/sentinel/bin/sentinel -silent -exit mail

You'd have to set the SENTINEL_MAIL env variable to the appropriate email address (it defaults to root@localhost).

Instead of using the 'mail' action for the 'exit' event, you can also specify some other command using the '-cmd ' action. For example, if you want a script test.sh to be executed automatically when the process dies, do

$> /opt/sentinel/bin/sentinel -silent -exit -cmd test.sh

Sentinel either spawns a process if you specify the command with arguments, or it takes a pid.

Other events of interest to you would be 'errexit' (non-zero exit code) or 'abort' (uncaught signal).

For more events and actions, do

$> /opt/sentinel/bin/sentinel -help

or look at the man page after adding /opt/sentinel/docs/man/man1 to MANPATH env variable.

Please inform me if it works ok for your case by sending an email to Veeru at hp dot com.

Cheers!
Veeru
Ralph Grothe
Honored Contributor

Re: catching process death

I think I read about Sentinel, which Veeru recommends, in some Linux context not so long ago.
Sounds promising.
But there are numerous ways of monitoring processes.
For your situation it is probably overkill, but I, for instance, have many vital processes monitored by my Nagios server.
There is a handy Nagios plug-in called check_procs with quite a few options.
If you define a Nagios event_handler and write a wee wrapper script, you can even have Nagios restart crashed procs unattended through the event handler, while still being notified of it.
That said, one should always determine the cause of a crashed process rather than simply having it restarted automatically.

Madness, thy name is system administration
A. Clay Stephenson
Acclaimed Contributor

Re: catching process death

One word: SIGCHLD.
If it ain't broke, I can fix that.
Ralph Grothe
Honored Contributor

Re: catching process death

That's fine if the monitoring process is the parent process, so that it can actually catch a SIGCHLD and get the child's exit value through waitpid.
But what if the monitor or agent is a completely unrelated process?
Madness, thy name is system administration
Patrick Wallek
Honored Contributor

Re: catching process death

If you have several unrelated processes that you are trying to keep running, why not start them via /etc/inittab and use the respawn option? That way, if they die, they get restarted automatically.
Emil Velez
Honored Contributor

Re: catching process death

Depending on how you have MeasureWare configured, it will capture how many resources a process has used in a given minute. MeasureWare collects data on interesting processes and can log data when a process starts, when it dies, and if it exceeds a threshold. This is configured in the file /var/opt/perf/parm, assuming that you have MeasureWare (OpenView Performance Agent) installed and configured.

You can have a program run the extract command to watch for the data on that process.

Another option would be to configure glance in advisor mode to print out the process list and look for a state of died for that process name.

A. Clay Stephenson
Acclaimed Contributor

Re: catching process death

Since this question was asked in the context of a development technique, it naturally fits the concept of a parent process which fork()'s and exec()'s a child process to do the actual work while the parent just sits there (okay, wait()'s) with a SIGCHLD handler to notify it when the child terminates or receives a signal. I see no reason to reinvent the wheel for something that UNIX has been able to do since it was a wee babe.
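A minimal sketch of that parent/child arrangement, written in Python for brevity (the same system calls exist in C). It is a simplification: the parent blocks in waitpid(), which returns at the moment the child terminates, so no separate SIGCHLD handler is installed; a handler would only be needed if the parent had other work to do in the meantime. The names supervise and describe and the max_starts cap are hypothetical, chosen to keep the example finite.

```python
import os


def describe(status):
    """Turn a raw wait status into a short human-readable string."""
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "exited with code %d" % os.WEXITSTATUS(status)


def supervise(argv, max_starts=3):
    """Run argv as a child and restart it each time it dies.

    Returns one description per run. A real supervisor would loop
    forever and probably rate-limit restarts; max_starts is only here
    so that the sketch terminates.
    """
    history = []
    for _ in range(max_starts):
        pid = os.fork()
        if pid == 0:
            # Child: replace this process image with the worker program.
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # reached only if exec failed
        # Parent: block until this child terminates, then record why.
        _, status = os.waitpid(pid, 0)
        history.append(describe(status))
    return history
```

For example, supervise(["/path/to/worker"], max_starts=5) would start the (hypothetical) worker program up to five times, recording for each run whether it exited normally or was killed by a signal.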
If it ain't broke, I can fix that.