Problem with SIGCHLD (Migration from 10.10 to 11.11)

 
Vervelle
New Member

Hello,
I'm migrating an application from HP-UX 10.10 to HP-UX 11.11 (C and Pro*C).
I have a main process that launches several other processes (with fork) and monitors them, using signal( SIGCHLD, ... ) to be alerted when a child process dies.

I have the following problem: each time a child process dies, the main process seems to hang indefinitely.
The application was working fine on 10.10.
I tried to remove the signal( SIGCHLD, ... ) and the problem disappeared.

Does anyone have any idea how to solve this problem?


Thanks,
Nicolas
Mike Stroyan
Honored Contributor

Re: Problem with SIGCHLD (Migration from 10.10 to 11.11)

What does your signal handler do?
There are very few calls that are safe inside of a signal handler.

Do you call signal() again in the handler to reset the handling of SIGCHLD? If you do that before you actually wait for the child you can get into an infinite loop as you reenter the signal handler as soon as you call signal().

You really should look into using a more modern call such as sigaction() to set up the signal handler. Unless you ask for it with SA_RESETHAND, sigaction() does not reset the handler to the default each time a signal is delivered, so the handler never needs to reinstall itself.
Vervelle
New Member

Re: Problem with SIGCHLD (Migration from 10.10 to 11.11)

I tried to simplify the signal handler to be sure that nothing in it was causing the hang.

I left only:
static void CaptureEndProcess(int va_signal)
{
    int vl_infoPID;
    int vl_pid;

    vl_pid = waitpid(-1, &vl_infoPID, WNOHANG);
    signal(SIGCHLD, CaptureEndProcess);
}
but nothing changed.

I even tried:
static void CaptureEndProcess(int va_signal)
{
    _exit(0);
}

The process is not terminated when a child dies.

Stranger still:
If I try to kill the parent process (kill ) before any of its children dies, the parent process dies.
If I try the same command after a child has died, the parent process doesn't die: I have to use kill -HUP (or kill -9) to kill it. I can also kill it with two commands: kill -26 and then kill .

I don't understand what's going on.
Mark Grant
Honored Contributor

Re: Problem with SIGCHLD (Migration from 10.10 to 11.11)

To me, this sounds like the problem is not in the signal handler as such, but in the setting up of the signal trap itself.

It looks like the parent process is going somewhere other than your signal handler when a SIGCHLD is received. If I remember correctly, you need to pass signal() an integer for the signal and a pointer to the handler function (flags are only passed with sigaction(), in the sa_flags field). Make sure this part is exactly as you think it is. It might have worked by accident on 10.10 if your function pointer isn't right.

One other thing, are you using a value for the signal (18) or SIGCHLD? It might be different than it used to be, "kill -l" (that's a lower case "L", not a 1) will confirm.
Never precede any demonstration with anything more predictive than "watch this"
Vervelle
New Member

Re: Problem with SIGCHLD (Migration from 10.10 to 11.11)

Finally I have found where the problem is.

I am using Oracle 8.1.7.4, and it seems that when I connect to Oracle (EXEC SQL CONNECT), Oracle changes the signal handling.
I now call signal( SIGCHLD, ... ) again after connecting to Oracle, and the problem seems to be solved.

Thanks for the help