Operating System - HP-UX


 
Amith_2
Frequent Advisor

All child processes and other processes are getting killed.

Hello,

I have a process that forks a child process every time a new request comes in.

Yesterday all the child processes were suddenly killed, along with some other processes running on the server. The parent process kept running.
The log shows:

"fhm_b01.log:26/03|21:31:06: Info|Signal <16-SIGUSR1> caught raising SIGTERM
fhm_b01.log:Termination signal received"

Once I restarted the sessions, everything worked fine, and there have been no errors since.

These processes have been running without any problem for the last 6 to 7 years. This is the first time we have seen such an error.

Could anyone please tell me what could have caused this?
3 REPLIES
Steven E. Protter
Exalted Contributor

Re: All child processes and other processes are getting killed.

Shalom,

Server use patterns change.

Check /var/adm/syslog/syslog.log for more errors. I'm looking for anything that indicates kernel parameters need to be changed:

nproc
nfile
maxuprc (big suspect here: it limits the number of processes per user and has caused this behavior for me in the past)
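
To make the syslog check concrete, here is a minimal sketch in Python that scans the log for kernel table overflow messages. The two message fragments are assumptions about how HP-UX typically words them, and the exact text may differ by release. The names find_table_full and scan_syslog are hypothetical helpers, not part of any HP-UX tool. There is no pattern for maxuprc because I am not sure what, if anything, gets logged when that limit is hit.

```python
import re

# Assumed message fragments; verify the wording against your own syslog.log.
TABLE_FULL_PATTERNS = {
    "nproc": re.compile(r"proc: table is full", re.IGNORECASE),
    "nfile": re.compile(r"file: table is full", re.IGNORECASE),
}


def find_table_full(lines):
    """Return (parameter, line) pairs for lines hinting at an exhausted kernel table."""
    hits = []
    for line in lines:
        for param, pattern in TABLE_FULL_PATTERNS.items():
            if pattern.search(line):
                hits.append((param, line.rstrip("\n")))
    return hits


def scan_syslog(path="/var/adm/syslog/syslog.log"):
    """Scan the syslog file named in this post; the path is the HP-UX default."""
    with open(path, errors="replace") as handle:
        return find_table_full(handle)
```

Calling scan_syslog() on the server returns a list you can compare against the time of the incident (26/03 21:31 in the application log). An empty list does not rule out a kernel limit; it only means these two messages were not found.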

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Amith_2
Frequent Advisor

Re: All child processes and other processes are getting killed.

Thanks Steven,

Could you please tell me what type of errors might be in the syslog.log file? I have not analysed these files much before.

Thanks
Amith
Bill Hassell
Honored Contributor

Re: All child processes and other processes are getting killed.

HP-UX won't arbitrarily kill processes. Look in root's shell history for kill commands. Also check crontab for a new job that is supposed to kill certain processes, especially ANYTHING that uses ps and grep together (very dangerous). Similarly, did someone write a script to kill processes by name? Again, using ps and grep in such a script will almost always do the wrong thing. Note that SIGUSR1 is a very specific signal (i.e., kill -16 on HP-UX), so this is definitely a signal sent by some new code on your system.
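
To illustrate why ps and grep together are dangerous, here is a small sketch in Python that compares substring matching (what grep does on ps output) with an exact match on the command's base name. The process table is made up for the example, and the names naive_pids and exact_pids are hypothetical.

```python
import os

# Hypothetical process table: (pid, command line as ps would show it).
PROCESSES = [
    (101, "/opt/app/bin/fhm"),
    (102, "/opt/app/bin/fhm_b01 -d"),
    (103, "vi /home/amith/fhm.conf"),
    (104, "grep fhm"),
]


def naive_pids(processes, name):
    """Mimic `ps -ef | grep name`: any command line containing the text matches."""
    return [pid for pid, args in processes if name in args]


def exact_pids(processes, name):
    """Match only when the base name of the command itself equals name."""
    pids = []
    for pid, args in processes:
        words = args.split()
        if words and os.path.basename(words[0]) == name:
            pids.append(pid)
    return pids


print(naive_pids(PROCESSES, "fhm"))  # [101, 102, 103, 104]
print(exact_pids(PROCESSES, "fhm"))  # [101]
```

A kill script built on the naive match would signal the editor session, the differently named daemon, and the grep itself, which is the kind of collateral damage described in the original question.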


Bill Hassell, sysadmin