Operating System - HP-UX

SIGKILL from unidentified source

 
Ed Loehr_1
Occasional Advisor

One of my PostgreSQL postmaster processes just received a SIGKILL from an unidentified source. We find no evidence of the source in command histories or logs, and have no programs/scripts that we know of that would send SIGKILL. That would seem to leave the OS and PostgreSQL itself.

Any ideas about what circumstances, if any, would result in HP-UX sending SIGKILL? Anything else I can investigate?

The machine had ample RAM.

Thanks,
Ed
Sandman!
Honored Contributor
Solution

Re: SIGKILL from unidentified source

Was a core file generated? If so, run the "file" command on it to identify the program that dumped core and the signal it received, i.e.

# file core
Ed Loehr_1
Occasional Advisor

Re: SIGKILL from unidentified source

No core file remaining, but that's good to know.
Sandman!
Honored Contributor

Re: SIGKILL from unidentified source

Did anybody use the fuser command on the filesystem where the PostgreSQL processes are running? "fuser -k" sends a SIGKILL to all processes using a mount point.

How about any changes to the environment, i.e. OS upgrades/patches or application-related patches/upgrades? Was the application running fine until this point, or has this happened before? These are just some troubleshooting questions you can use to isolate the cause.

Try restarting the application in debug mode with logging enabled so that the next occurrence can be traced. Alternatively, send it a SIGKILL manually and see whether the result resembles what you observed, which would point to a user-raised SIGKILL.
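
To see what a SIGKILL death looks like from the parent's side without touching production, here is a minimal sketch that uses a disposable "sleep" as a stand-in. The stand-in process and the variable names are illustrative only. It assumes a Bourne-style shell that reports death by signal N as exit status 128+N; some shells use a different offset.

```shell
#!/bin/sh
# Start a disposable stand-in process (NOT the production postmaster).
sleep 60 &
victim=$!

# Send SIGKILL; the target cannot catch, block, or ignore it.
kill -9 "$victim"

# wait returns the child's status; in most Bourne-style shells a process
# terminated by signal N is reported as 128+N.
wait "$victim"
status=$?

# kill -l translates such an exit status back to a signal name.
signame=$(kill -l "$status")
echo "exit status: $status (signal: $signame)"
```

If the parent of the real process logged a similar status when it noticed the child had gone, that is consistent with an external kill -9 rather than a crash.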

~cheers
Ed Loehr_1
Occasional Advisor

Re: SIGKILL from unidentified source

No fuser -k usage found in the command history logs. The app has been running OK, though we're also dealing with some DB table locking issues that do not appear to be related at this point. Restarting is not an option because this is a production system.
Bill Hassell
Honored Contributor

Re: SIGKILL from unidentified source

You can narrow the problem down to the owner of the postmaster process: only the owner (and root) can send it a SIGKILL. Also check all cron jobs that run as root or as the postmaster owner. HP-UX never sends a SIGKILL. Unlike SIGSEGV or SIGHUP, which originate in the kernel, SIGKILL comes only from user processes (a shell, a user program, etc.). A good test is to ask each user with a root or PostgreSQL login to explain why kill -9 is a bad thing to do. Those who explain that it is the only way to terminate a process will need a new login... (no access to kill -9 for production programs).
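
The cron check can be sketched roughly as below. The function name scan_crontabs and the search pattern are illustrative assumptions, and the crontab directory is passed in by the caller because its location varies between systems. The sketch only inspects the crontab entries themselves; any scripts those entries invoke would need to be searched separately.

```shell
#!/bin/sh
# scan_crontabs DIR
# Print crontab lines under DIR (as file:line:text) that could send a
# SIGKILL: kill/pkill with -9, -KILL, -SIGKILL or -s KILL, and fuser -k.
scan_crontabs() {
    dir=$1
    # /dev/null is added as an extra file so grep always prints file names,
    # even when DIR contains only a single crontab.
    grep -E -n \
        'kill[[:space:]]+-(s[[:space:]]+)?(9|KILL|SIGKILL)|fuser[[:space:]]+-[[:alpha:]]*k' \
        "$dir"/* /dev/null 2>/dev/null
}
```

Run as root, a call such as scan_crontabs /var/spool/cron/crontabs (assuming that is where the crontabs are stored on your system) lists the candidate entries for both root and the postmaster owner.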


Bill Hassell, sysadmin
Dennis Handly
Acclaimed Contributor

Re: SIGKILL from unidentified source

>Bill: HP-UX never sends a SIGKILL.

Are you sure? I've seen cases where mysterious SIGKILLs have occurred. I've always blamed it on the kernel when I couldn't get anyone to confess. I've heard vague rumors that it did SIGKILL when VM was oversubscribed??
Bill Hassell
Honored Contributor

Re: SIGKILL from unidentified source

I think that the kernel will just create an ENOMEM condition for out-of-memory situations (and similarly for ENOSPC, ENFILE, ENOLCK, etc.). Then the normal signal handling will take place in the program. If there were an emergency condition in the kernel that caused actual kernel-generated SIGKILLs, I would assume that dmesg and/or syslog would have some details.
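
A rough way to skim dmesg and syslog for such details is sketched below. The function name scan_kernel_msgs and the keyword list are guesses chosen for illustration, not known HP-UX message texts, so treat any hits only as starting points for reading the surrounding log entries.

```shell
#!/bin/sh
# scan_kernel_msgs
# Read log text on stdin and print, with line numbers, every line that
# mentions memory, swap, or kill/signal activity (case-insensitive).
# The keywords are assumptions; adjust them to what your logs contain.
scan_kernel_msgs() {
    grep -E -i -n 'memory|swap|kill|signal'
}
```

Typical use would be "dmesg | scan_kernel_msgs", or redirecting the syslog file into the function (commonly /var/adm/syslog/syslog.log on HP-UX, but check your syslog configuration).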


Bill Hassell, sysadmin