
can't kill process

 
westb
Advisor

RHEL 4 u1 on BL25

Every couple of weeks a process gets into a state where it consumes 99% of the CPU and cannot be stopped or killed, even with kill -9. Rebooting seems to be the only solution.

The process is always different.
6 REPLIES
Steven E. Protter
Exalted Contributor

Re: can't kill process

Shalom,

sysctl output might be helpful here

Try killing the process without the -9

If its PPID is 1, then using kill -9 says kill the system. The results can't be good.

Check top for zombie processes.
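If top is awkward to read on a loaded box, the same state information can be pulled from /proc directly. The following is only a sketch: the function names are made up for illustration, and it assumes the usual Linux /proc/<pid>/stat layout (pid, command name in parentheses, state letter, parent PID). A zombie shows state Z and uses no CPU, so a process burning 99% would more likely show R, or D if it is stuck in uninterruptible sleep.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state, ppid) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split around the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    return pid, comm, rest[0], int(rest[1])

def find_by_state(wanted="Z", proc_root="/proc"):
    """List (pid, comm, state, ppid) for processes whose state is in `wanted`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                info = parse_stat_line(fh.read())
        except (OSError, ValueError):
            continue  # process went away or line was unreadable
        if info[2] in wanted:
            found.append(info)
    return found

# Hypothetical usage: show zombies and uninterruptible sleepers.
# for pid, comm, state, ppid in find_by_state("ZD"):
#     print(pid, comm, state, ppid)
```

The ppid column also answers the PPID question above without a separate ps call.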

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
westb
Advisor

Re: can't kill process

kill PID does not work either.
Andrew Bruce
Valued Contributor

Re: can't kill process

Hi westb,

A bit more info would be useful.

Is the process always different, or do you mean the process ID? What type of process is typically involved? Is there any correlation between the processes that do this?

Some examples of processes you've seen get into this state would help.

When the process takes 99% CPU, are there other processes competing for CPU time?

Regards,

Andy Bruce
I Love it when a plan comes together!
westb
Advisor

Re: can't kill process

To date the process name has always been different.

smbd, vsftpd and su have been the culprits.

In the case of su, I was the person who issued the command. When I pressed Ctrl-D to exit root, the PuTTY session hung and su got stuck in this weird state. At the same time, a "System error" message was logged in the cron log.

Yes, there are other jobs competing for CPU time.
Bill Thorsteinson
Honored Contributor

Re: can't kill process

Even if you can't kill it you should be
able to drop its priority using renice.

Take a look at the process with lsof and
see what files it has open. Does it have
any files or network connections open that
it shouldn't?
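If lsof is not installed or hangs on the stuck process, the file descriptor links under /proc/<pid>/fd carry much of the same information. A rough sketch, assuming a Linux /proc and enough privilege to read the target's fd directory; open_files is a name invented here, not an existing tool:

```python
import os

def open_files(pid, proc_root="/proc"):
    """Map each open file descriptor number of `pid` to what it points at."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink: a path for regular files, or a
            # tag such as "socket:[12345]" for network connections.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except (OSError, ValueError):
            continue  # descriptor closed while we were looking
    return result

# Hypothetical usage for the stuck PID mentioned in this thread:
# for fd, target in sorted(open_files(1575).items()):
#     print(fd, target)
```

Socket entries only show an inode number; matching them to addresses still needs lsof or netstat.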

Connect with a tracing tool and see what
services it is calling.

You may want to reinstall any libraries
that the service has open. At least verify
their checksums (preferably more than one
checksum per file).
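To illustrate the "more than one checksum per file" idea, here is a small sketch that computes two digests in a single pass over a file. The function name and the choice of SHA-1 plus SHA-256 are assumptions made for the example; the known-good values to compare against would have to come from a trusted source such as the install media or an identical healthy machine.

```python
import hashlib

def file_digests(path, algorithms=("sha1", "sha256"), chunk_size=65536):
    """Return {algorithm: hex digest} for the file at `path`."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # Read in chunks so large libraries are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

# Hypothetical usage: compare a suspect library against a known-good copy.
# suspect = file_digests("/lib/libc.so.6")
# trusted = file_digests("/mnt/good-copy/lib/libc.so.6")
# print("match" if suspect == trusted else "MISMATCH")
```

On an RPM-based system, rpm -V on the owning package performs a similar check against the package database.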
westb
Advisor

Re: can't kill process

Renicing the PID did not have any effect.

strace wouldn't do a thing on the PID.

I'll try lsof the next time, forgot about that one... thanks.

For what it's worth, I was given this possible explanation from elsewhere...
"Process 1575 is spinning in user code and has signals blocked. A process in Linux SHOULD NOT be able to get into this problematic state, but apparently the operating system is not yet 100% foolproof concerning spinning processes. Unfortunately, the only way to exit process 1575 will be with a reboot at your convenience".
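One way to check the "signals blocked" part of that explanation is to read the State and SigBlk fields from /proc/<pid>/status. The sketch below assumes the standard Linux layout, where SigBlk is a hexadecimal mask and bit n-1 stands for signal number n; the helper names are invented for illustration. SIGKILL (9) is not supposed to be blockable, so seeing 9 in the blocked list would be unexpected.

```python
def parse_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def signals_in_mask(hex_mask):
    """Return the signal numbers whose bits are set in a hex signal mask."""
    mask = int(hex_mask, 16)
    # Bit (n - 1) set means signal number n is in the mask.
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]

def describe(pid, proc_root="/proc"):
    """Summarise the state and the blocked/ignored signals of `pid`."""
    with open("%s/%d/status" % (proc_root, pid)) as fh:
        fields = parse_status(fh.read())
    return {
        "state": fields.get("State"),
        "blocked": signals_in_mask(fields.get("SigBlk", "0")),
        "ignored": signals_in_mask(fields.get("SigIgn", "0")),
    }

# Hypothetical usage for the PID quoted above:
# print(describe(1575))
```

A state of "R (running)" would fit a process spinning in user code, while "D (disk sleep)" would point at something stuck inside the kernel instead.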