

 
Raj Kotaru
Occasional Contributor

OCD termination

Hello,

I am a developer with Orbital Sciences Corp. in Columbia, MD. I have a C program that communicates with a remote DTC port via DDFA by spawning an ocd (outbound connection daemon). The OS version in use is HP-UX B.10.20.

At startup, the ocd that was spawned by a previous version of this program is terminated by sending a "kill -15" signal to the ocd. After this signal is sent, the program re-creates a new ocd for the current run.

The problem is that, soon after the new ocd is re-created, the attempt to open() the pseudonym (i.e., the device file) in O_RDWR mode fails, with errno set to ENOENT (No such file or directory). The problem is intermittent (the failure rate is about 50%). I have introduced a delay of 5 seconds between creating the new ocd and the open() system call, but the problem persists.
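For illustration, here is a minimal sketch of replacing the fixed 5-second delay with a bounded retry around open(). It assumes the ENOENT is transient, meaning the pseudonym is briefly absent while the daemons start and stop; nothing in this thread confirms that. The helper name open_with_retry and its parameters are hypothetical, and only standard open(), sleep() and errno are used.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DDFA): try to open `path` up to
 * `max_tries` times, sleeping `delay_sec` seconds between attempts.
 * Only ENOENT is treated as retryable; any other error is returned
 * to the caller immediately with errno left as open() set it.
 * Returns a file descriptor on success, -1 on failure.
 */
int open_with_retry(const char *path, int flags, int max_tries,
                    unsigned int delay_sec)
{
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(path, flags);
        if (fd >= 0)
            return fd;          /* device file was there */
        if (errno != ENOENT)
            return -1;          /* a different failure: do not retry */
        if (attempt + 1 < max_tries)
            sleep(delay_sec);   /* give the daemon time to create the file */
    }
    errno = ENOENT;             /* still missing after all attempts */
    return -1;
}
```

A call might look like open_with_retry(pseudonym_path, O_RDWR, 10, 1), where pseudonym_path stands for whatever device file the ocd was started with. If the open still fails after every attempt, the cause is probably not a simple timing gap.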

The device file being opened is the same for the ocd terminated, and the ocd just started. Given this, should I also introduce some delay time interval between the kill command, and the command that restarts the new ocd? Also, should the device file be closed (via close()) before the previous ocd is killed?
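One way to avoid guessing at a delay between the kill and the restart is to wait until the old ocd has actually gone away. The sketch below is an illustration under stated assumptions, not documented DDFA behaviour: it assumes the old ocd's pid is known, and that the old ocd is not an unreaped child of the calling process (a zombie child would still appear to exist). The function name terminate_and_wait is hypothetical; kill() with signal 0 is the standard way to test whether a process exists.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGTERM (signal 15) to `pid`, then poll
 * once per second, for at most `timeout_sec` seconds, until the
 * process no longer exists.
 * Returns 0 when the process is gone, -1 on timeout or on an error
 * such as EPERM.
 */
int terminate_and_wait(pid_t pid, int timeout_sec)
{
    int waited;

    if (kill(pid, SIGTERM) == -1)
        return (errno == ESRCH) ? 0 : -1;   /* ESRCH: already gone */

    for (waited = 0; waited < timeout_sec; waited++) {
        /* Signal 0 delivers nothing; it only checks for existence. */
        if (kill(pid, 0) == -1 && errno == ESRCH)
            return 0;
        sleep(1);
    }

    /* One last check after the final sleep. */
    return (kill(pid, 0) == -1 && errno == ESRCH) ? 0 : -1;
}
```

The new ocd would be started only after this returns 0. The reasoning is that the old daemon can then no longer be doing any cleanup on the shared device file while the new one is setting it up.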

Any suggestions or feedback would be greatly appreciated.

Thanks
Raj Kotaru

Principal Software Engineer
Orbital TMS
7160 Riverwood Drive
Columbia, MD

(443)259-7264
kotaru.raj@orbital.com
3 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: OCD termination

I don't know if it's going to help, but I would definitely have the signal handler close all the file descriptors. It would also probably be a good idea to have your signal handler call signal() to ignore further signal 15s. I would also include a 2 or 3 second sleep after all the closes and before the actual exit() call. This will probably allow enough time for any cleanup that the device driver needs to do. One last point: I assume that you are calling exit() and not _exit().

If it ain't broke, I can fix that.
Raj Kotaru
Occasional Contributor

Re: OCD termination

Thanks for your response.

I would like to clarify that my C program is the one sending the "kill -15" signal to the ocd process. According to the DDFA documentation, it is the ocd program that traps this SIGTERM and then closes the device file.
Given that, I am not sure what the behaviour would be if I were to put code in my C program to close the file descriptor for the port and then send the kill signal.

Another clarification: the shutdown and restart of the ocd happen when my C program starts up. Hence, I am not invoking exit() or _exit() at all.

Thanks again.

A. Clay Stephenson
Acclaimed Contributor

Re: OCD termination

Sorry, I interpreted your question to mean that you had coded a replacement for ocd. At this point, I think you should use the debug version of ocd, 'ocdebug', to get more diagnostics.