06-02-2004 05:18 AM
Unable to kill a non-zombie process. Expert help needed
This is probably the strangest issue I've ever seen in HP-UX.
With top we've detected an apparently hung process using 100% CPU, and one of the server's CPUs shows 100% SYS utilization.
The offending process is a Java process, a Tomcat server. We tried to restart the Tomcat server, but it failed. Then we tried to kill the process, but it survived, so we tried kill -9, and the process was still there. The state of the process is running:
From top:
CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
3 ? 21243 root 152 20 267M 72656K run 46359:00 100.17 100.00 java
From ps -el:
401 R 0 21243 1 0 152 20 501ba600 7199 - ? 46360:44 java
Then we tried to trace the process with tusc but we got an error:
# tusc 21243
tusc: ttrace(TT_PROC_ATTACH, 21243, 0, 0, dad0001, 0): Permission denied
tusc: no process to attach to
Glance doesn't report this high CPU usage. For example, during low-load hours, top still reports 100% CPU usage on one of the processors, while glance reports just 3% total CPU usage.
We are aware that glance metrics differ from top's and are more accurate. But glance also gives us the following per-processor CPU information:
CPU State Util LoadAvg(1/5/15 min) CSwitch Last Pid
2 Enable 0.0 2.0/ 2.0/ 2.0 0 21243
CPU Util User Nice NNice RealTm Sys Intrpt CSwitch Trap Vfault
2 0.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
And this state doesn't change over time, so process 21243 is using CPU 2 all the time to do nothing!
Going one step further, we've used kitrace, through the runki script, to trace what is happening in the kernel. We've observed that the offending process appears every 10 ms in the kernel's hardclock() routine on processor 2. The following is an extract of this trace:
0.001814 cpu=2 seqcnt=2763898760 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b1b0 sym=_sleep state=SYS
0.011816 cpu=2 seqcnt=2763899157 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b138 sym=_sleep state=SYS
0.021811 cpu=2 seqcnt=2763899315 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17d4b4 sym=splsched state=SYS
0.031811 cpu=2 seqcnt=2763899452 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x15a904 sym=$PIC$124 state=SYS
0.041813 cpu=2 seqcnt=2763899572 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b138 sym=_sleep state=SYS
....
Any ideas?
Thanks in advance.
Best regards,
Juan
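The "every 10 ms" observation in the kitrace extract above can be checked offline. The following is only a rough sketch in Python: the regular expression assumes the line layout shown in the extract, and the names `Sample`, `parse_hardclock`, and `summarize` are made up for this illustration, not part of kitrace.

```python
import re
from collections import Counter
from typing import NamedTuple

# Lines copied from the kitrace extract in the post above.
TRACE = """\
0.001814 cpu=2 seqcnt=2763898760 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b1b0 sym=_sleep state=SYS
0.011816 cpu=2 seqcnt=2763899157 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b138 sym=_sleep state=SYS
0.021811 cpu=2 seqcnt=2763899315 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17d4b4 sym=splsched state=SYS
0.031811 cpu=2 seqcnt=2763899452 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x15a904 sym=$PIC$124 state=SYS
0.041813 cpu=2 seqcnt=2763899572 pid=21243 ktid=-1 utid=-1 HARDCLOCK pc=0x17b138 sym=_sleep state=SYS
"""

# Assumed layout: "<time> cpu=N ... pid=N ... HARDCLOCK ... sym=NAME state=STATE"
LINE_RE = re.compile(
    r"^(?P<t>\d+\.\d+)\s+cpu=(?P<cpu>\d+)\s.*?pid=(?P<pid>\d+)\s.*?"
    r"HARDCLOCK\s.*?sym=(?P<sym>\S+)\s+state=(?P<state>\S+)"
)


class Sample(NamedTuple):
    time_s: float
    cpu: int
    pid: int
    symbol: str
    state: str


def parse_hardclock(text):
    """Return one Sample per HARDCLOCK line; other lines are skipped."""
    samples = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            samples.append(Sample(float(m["t"]), int(m["cpu"]),
                                  int(m["pid"]), m["sym"], m["state"]))
    return samples


def summarize(samples):
    """Return (gaps between consecutive samples in ms, symbol counts)."""
    times = [s.time_s for s in samples]
    intervals_ms = [round((b - a) * 1000, 3) for a, b in zip(times, times[1:])]
    symbols = Counter(s.symbol for s in samples)
    return intervals_ms, symbols


if __name__ == "__main__":
    intervals, symbols = summarize(parse_hardclock(TRACE))
    print("intervals (ms):", intervals)
    print("symbols:", dict(symbols))
```

On the five lines quoted, the gaps between samples all come out at about 10 ms, and `_sleep` is the symbol sampled most often, which fits the description of a process caught in the kernel on every clock tick.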
06-02-2004 05:29 AM
Re: Unable to kill a non-zombie process. Expert help needed
I can't tell you what exactly this process is waiting on but I can tell that most times the "solution" is a reboot.
06-02-2004 05:50 AM
Re: Unable to kill a non-zombie process. Expert help needed
In that output note the priority (PRI) value.
In your case it's 152.
Anything from 128 - 153 is not only in the kernel range (128 - 177) - it's in the sleep mode of kernel range & is completely nonsignalable - i.e. unkillable.
It's probably waiting on a resource almost all the time and only comes out of that range for very short periods - when it could be killed.
You might try a kill -1 (hangup) or -24 (stop) in the hope that when it does "wake up" it will act on those. If those don't work, then your best bet is to determine its parent PID (PPID) and try killing that, in the hope that it will reap its child.
But it almost appears to me to be poor programming practice that it eats up so much CPU, but is rarely signallable. If it's always waiting on some I/O resource then that resource's data *ought* to be buffered at least. Or simply it's waiting on something it will never see & it's spending a lot of wasted time in the hopes it will.
My 2 cents,
Jeff
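To make the priority check above concrete, here is a small Python sketch that pulls PRI and PPID out of the `ps -el` line from the first post. It assumes the usual `ps -el` column order (F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME COMD), so check it against the header line on your own system. The priority bands are the ones stated in this reply, not taken from HP documentation, and the function names are made up for the example.

```python
# Assumed column order of `ps -el` output; verify against your header line.
PS_EL_FIELDS = ["F", "S", "UID", "PID", "PPID", "C", "PRI", "NI",
                "ADDR", "SZ", "WCHAN", "TTY", "TIME", "COMD"]


def parse_ps_el(line):
    """Split one `ps -el` data line into a dict keyed by column name."""
    values = line.split(None, len(PS_EL_FIELDS) - 1)
    if len(values) != len(PS_EL_FIELDS):
        raise ValueError("unexpected number of columns: %d" % len(values))
    return dict(zip(PS_EL_FIELDS, values))


def priority_band(pri):
    """Classify PRI using the ranges given in the reply above."""
    if 128 <= pri <= 153:
        return "kernel sleep range, not signalable"
    if 154 <= pri <= 177:
        return "kernel range, above the non-signalable band"
    return "outside the kernel range"


def parent_is_init(proc):
    """True when PPID is 1, i.e. there is no ordinary parent to kill."""
    return int(proc["PPID"]) == 1


if __name__ == "__main__":
    # Line copied from the `ps -el` output in the first post.
    proc = parse_ps_el("401 R 0 21243 1 0 152 20 501ba600 7199 - ? 46360:44 java")
    print("PRI", proc["PRI"], "->", priority_band(int(proc["PRI"])))
    print("parent is init:", parent_is_init(proc))
```

If the column assumption holds, the PPID of 21243 in that output is 1, so the parent is init and the "kill the parent" route would not apply in this particular case.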
06-02-2004 05:58 AM
Re: Unable to kill a non-zombie process. Expert help needed
Have a look at the WCHAN field in the ps output for your Java process. Look for the associated processes and try killing them first, and then maybe try killing the Java process.
Hope that helps.
Regards,
06-02-2004 06:01 AM
Re: Unable to kill a non-zombie process. Expert help needed
But as the gurus have said, anything that is in kernel mode cannot be killed.
You may try killing the parent process.
Anil
06-02-2004 06:06 AM
Re: Unable to kill a non-zombie process. Expert help needed
I'd run mstm on this system and look for trouble.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
06-02-2004 08:21 AM
Re: Unable to kill a non-zombie process. Expert help needed
If possible, try to kill the parent process (PPID) and all Java-related processes.
br Jan
06-02-2004 09:15 AM
Re: Unable to kill a non-zombie process. Expert help needed
What you need here is not kitrace, tusc, or gdb, since the process is not signalable; you need a kernel stack trace. There are utilities that can do this. I suspect this process is in a kernel trap mode and will never be released.
Since you have kitrace I suspect you also have crashinfo. Could you please run that with the -t option and paste the stack trace for this pid? That would be a start.
The best and right thing to do is to take a TOC of the system and send it to HP for analysis. Then you can see why the process is stuck and what kind of resource it is waiting for. This could be a programming issue too, and the crash dump analysis will give the answer.
06-02-2004 10:48 PM
Re: Unable to kill a non-zombie process. Expert help needed
Thanks everyone for your responses.
Following Navin's advice I've attached the stack traces of the offending process's threads.
Now I'm convinced I will have to stop the server to fix this problem. But I'd like to gather as much information as I can about this issue to find its origin. I've also opened a case with HP and sent them all this info; I am waiting for their answer.
Best regards,
Juan
06-03-2004 04:28 AM
Re: Unable to kill a non-zombie process. Expert help needed
But I would involve the Tomcat developers, because it looks like they might not be handling signals and thread exit procedures correctly in the code. I doubt you will find any signals pending on these threads, other than the kills you might have issued manually.
Could you make sure you have the following patches, assuming you are on 11.11?
PHKL_28695 - 11.11 Cumulative VM, Psets, Preemption, PRM, MRG
PHKL_28410 - 11.11 vm preemption point, pdc, vhand performance
PHKL_25212 - 11.11 vm preemption point, mlock/async_io
PHKL_28529 - 11.11 VxFS mmap(2) performance improvement; vhand
06-03-2004 04:48 AM
Re: Unable to kill a non-zombie process. Expert help needed
After that, I would ensure you are running the latest Tomcat and patches.
Rgds...Geoff
06-03-2004 07:09 AM
Re: Unable to kill a non-zombie process. Expert help needed
Regards,
Fred
"Reality is just a point of view." (P. K. D.)
06-09-2004 06:21 AM
Re: Unable to kill a non-zombie process. Expert help needed
We have received the response from HP Support. The problem is related to bug JAGae65088:
Multithreaded STREAMS UP emulated driver hangs on thread exit with the following stack trace:
_switch+0xc4
thread_exit+0x1e8
thread_process_suspend+0x188
issig+0x2a4
syscall+0x8f0
syscallinit+0x554
This stack trace is the same as the one for our zombie thread.
This problem is fixed by patch PHNE_29825.
Thank you very much for your help.
Best regards,
Juan Gonzalez
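For anyone who wants to compare their own kernel stack trace against the JAGae65088 signature quoted above, the comparison could be sketched roughly as follows in Python. It assumes one `function+0xoffset` frame per line, as in the post, and ignores the offsets, which may differ between kernel builds. The offsets in `OTHER_OFFSETS` are hypothetical and exist only to exercise the matcher; the function names are likewise made up for this example.

```python
import re

# Function names from the stack trace quoted in the post, innermost frame first.
JAGAE65088_FRAMES = ["_switch", "thread_exit", "thread_process_suspend",
                     "issig", "syscall", "syscallinit"]

# Assumed frame format: "function+0xOFFSET", one frame per line.
FRAME_RE = re.compile(r"^\s*(?P<func>[A-Za-z_$][\w$]*)\+0x[0-9a-fA-F]+\s*$")


def frame_names(trace_text):
    """Extract function names from a trace, dropping the +0x offsets."""
    names = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            names.append(m["func"])
    return names


def matches_signature(trace_text, signature=JAGAE65088_FRAMES):
    """True when the trace has exactly the signature's functions, in order."""
    return frame_names(trace_text) == list(signature)


# Stack trace as quoted in the post.
QUOTED = """\
_switch+0xc4
thread_exit+0x1e8
thread_process_suspend+0x188
issig+0x2a4
syscall+0x8f0
syscallinit+0x554
"""

# Hypothetical trace: same functions, invented offsets.
OTHER_OFFSETS = """\
_switch+0xd0
thread_exit+0x1f0
thread_process_suspend+0x190
issig+0x2b0
syscall+0x900
syscallinit+0x560
"""

if __name__ == "__main__":
    print(matches_signature(QUOTED))
    print(matches_signature(OTHER_OFFSETS))
```

A match on function names alone is only a hint that the same defect is involved. HP Support's analysis and the patch description remain the authority on whether PHNE_29825 applies.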