- Need help finding source of unhandled SIGALRM that...
07-19-2011 08:17 AM
Hello all,
I have been pulling my hair out for a while trying to find the source of a problem that causes one process in a multi-process application to exit without a core dump after a few days under load. Any advice or tips to help pinpoint the source would be greatly appreciated.
OS is HP-UX 11.11 and the application is written in C.
The problematic process is multithreaded. It receives messages on an IPC queue and spawns connection threads through libcurl to send HTTP requests to external servers and handle the asynchronous responses that update the application. The application uses semaphores to synchronize between processes and threads.
What I have found so far is that a SIGALRM seems to be reaching the process without being handled. I have tried installing a dummy signal handler (one that just reinstalls itself for SIGALRM when a SIGALRM is received) to no avail, and I have also tried ignoring the signal right at the start of main, like so:
    new_action.sa_handler = SIG_IGN;
    sigemptyset (&new_action.sa_mask);
    new_action.sa_flags = 0;
    sigaction (SIGALRM, &new_action, NULL);
I know that the library handling the semaphores used by the process overrides the alarm handler with its own SIGALRM handler for semaphore events (to avoid deadlocks), but it restores the old handler after completing like so:
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESETHAND;
    act.sa_sigaction = NULL;
    act.sa_handler = was_alarm;
    sigaction(SIGALRM, &act, NULL);
    alarm(time_left);
If I run the process from GDB and set breakpoints on _exit, when the process exits after a few days all I get is:
warning: Temporarily disabling or deleting shared library breakpoints:
warning: Disabling breakpoint #2
Program terminated with signal SIGALRM, Alarm clock.
The program no longer exists.
Stopped due to shared library event
(gdb) bt
No stack.
There are two things I need to find out:
1- What is sending the alarm? I went through the code and do not understand why an alarm signal would arrive unhandled. If I tell GDB to stop on SIGALRM, I still can't find where an alarm expired by going through the stack and the threads. Is there any way in GDB to find the source of a signal?
2- Why is the SIGALRM not handled by either the handler or the SIG_IGN? Any ideas would be appreciated.
Thanks for your time,
Max
- Tags:
- SIGALRM
07-19-2011 01:02 PM
Solution
You might try using tusc to see what's going on.
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESETHAND;
    act.sa_sigaction = NULL;
    act.sa_handler = was_alarm;
    sigaction(SIGALRM, &act, NULL);
    alarm(time_left);
Why are you calling alarm(2) here? It seems you should check was_alarm for SIG_IGN and not call alarm.
Are you blocking SIGALRM while you are in your library signal handler?
07-25-2011 07:28 AM
Re: Need help finding source of unhandled SIGALRM that terminates process
Thanks Dennis, using tusc did help me narrow it down to a race condition between threads calling alarm().
There was a problem with our signal stacking function, which saves and restores the SIGALRM handler and timer whenever a semaphore lock is needed. This code was not thread safe, since the alarm timer is global to the whole process.
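To illustrate why a per-caller save and restore cannot work with alarm(): there is exactly one alarm timer per process, and each call to alarm() replaces it and returns whatever was left on the previous one. The sketch below shows this with two calls in sequence standing in for two threads; `second_caller_sees` is a made-up helper name and the sequential calls are a simplification of the real interleaving.

```c
#include <assert.h>
#include <unistd.h>

/* Arm the process-wide timer twice in a row, as two threads might,
 * and return what the second caller was told about the first timer. */
unsigned int second_caller_sees(unsigned int first, unsigned int second)
{
    unsigned int displaced;

    assert(first > 0 && second > 0);

    alarm(first);                 /* "thread A" arms its timeout */
    displaced = alarm(second);    /* "thread B" replaces it with its own */
    alarm(0);                     /* cancel, so no SIGALRM is delivered */
    return displaced;
}
```

If thread A later restores what it believes is its saved state, it overwrites thread B's timer and handler, so a SIGALRM can fire while the wrong disposition is installed. Serializing the save/restore sequence with a mutex, or replacing alarm() with per-thread timed waits, are the usual ways out, though which one fits depends on the semaphore library in use.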