
How To identify A hang Process in Linux

 
DhsDheeraj
Occasional Advisor

How To identify A hang Process in Linux

I am looking for a script to identify a hung process in Linux.
I am not sure whether a hang is determined by %CPU usage or by some other parameter.
13 REPLIES
Steven E. Protter
Exalted Contributor

Re: How To identify A hang Process in Linux

Shalom,

A combination of top and/or a process analysis script may be needed.

When I suspect a problem like this, I take a look at top and then run ps commands against the suspected process.

Automated identification is not possible in many cases. The eyes and mind of a systems administrator are required.

Take a look at this script for ps command options.

http://www.hpux.ws/?p=8

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
DhsDheeraj
Occasional Advisor

Re: How To identify A hang Process in Linux

Thanks for your comments, but I am not able to get the script.
Actually, I am writing a monitor program (in Java) that should report the status of a process if it hangs, specifically a web browser such as Mozilla.
Can you suggest how I should proceed?
Torsten.
Acclaimed Contributor

Re: How To identify A hang Process in Linux

I think you will easily know if a web browser hangs or not - it won't respond anymore ...

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

DhsDheeraj
Occasional Advisor

Re: How To identify A hang Process in Linux

Yes, a user can see that, but how can I check it from Java code? I need to make the check in code and then generate a summary.
So far I have thought of using high CPU usage and a process status of Running as the criteria for deciding that a process has hung, but I am not sure whether this approach is correct.
Please correct me if I am wrong.
Steven Schweda
Honored Contributor

Re: How To identify A hang Process in Linux

First, you're in an HP-UX forum, and there
is a Linux forum:

http://forums.itrc.hp.com/service/forums/familyhome.do?familyId=118

Define "hang process".

A program can get into a state where it uses
no CPU because it's waiting for something to
happen. It's really hung only if that thing
never happens. How can you decide if that
thing _will_ happen?

A program can use all (or nearly all) the CPU
for a long time without producing a result,
because it has much work to do. A user may
believe that it's hung, even though it's
simply working hard.

What condition are you trying to detect?
DhsDheeraj
Occasional Advisor

Re: How To identify A hang Process in Linux

Thanks, Steven, for the link to the Linux forum and for refining the problem definition.
I am looking for the first condition. But how can we decide whether that thing will ever happen?
Can you please give me your opinion, bearing in mind that I am looking for a way to detect a hung Mozilla web browser?
Steven Schweda
Honored Contributor

Re: How To identify A hang Process in Linux

My opinion (which may be worth little) is
that you would need to know exactly why (or,
at least, approximately why) the program is
hung, that is, what it's waiting for. And
whether there's any chance of it ever getting
the thing it's waiting for. (My personal
psychic powers are weak, and I don't know how
to write a psychic computer program.)

In the case of a Web browser like Mozilla,
the program can send a request (HTTP) to a
server, and wait for a response. It can
_appear_ to hang because of a problem in the
server, but the server could recover at any
time, and all would be well again (unless
Mozilla times out and moves on before the
server responds). It could also have some
problem (internal, or a bad JavaScript or
Java program) which could cause it to wait
for some event which will never happen. I
don't know any practical way you can (from
the outside) distinguish between a
slow-to-respond server and a hung client.

What, exactly, is the problem which you are
trying to solve?
DhsDheeraj
Occasional Advisor

Re: How To identify A hang Process in Linux

Steven, actually I am doing an assignment in which I need to monitor the state of a Mozilla instance that connects to a predetermined server (for a web-based application).
If the browser hangs (for whatever reason; I do not know the likely causes), I need to detect it from my Java code and report it (for the time being I just need to log the data).
Can you suggest an approach?
MSECO_1
Trusted Contributor

Re: How To identify A hang Process in Linux

I am not sure for Linux
This is for HP-UX:
ps -ef |grep defunct
Keep learning!!!
DhsDheeraj
Occasional Advisor

Re: How To identify A hang Process in Linux

I think defunct represents a zombie process, but the requirement is to monitor all processes.
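
For reference, on Linux the same information that ps prints as "defunct" is available as the one-letter state field in /proc/<pid>/stat (Z for zombie, D for uninterruptible sleep, S for sleeping, R for running). The following is only a minimal Java sketch of reading that field; the class name ProcState and its methods are made up for illustration, and a state of S or D by itself does not prove that a process is hung.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library.
class ProcState {
    /** Extracts the one-letter state from the contents of /proc/<pid>/stat. */
    static char parseState(String statLine) {
        // Field 2 (the command name) is wrapped in parentheses and may itself
        // contain spaces or ')', so locate the LAST ')' instead of splitting.
        int close = statLine.lastIndexOf(')');
        if (close < 0 || close + 2 >= statLine.length()) {
            throw new IllegalArgumentException("unexpected stat format");
        }
        return statLine.charAt(close + 2);
    }

    /** True if the stat line describes a zombie (defunct) process. */
    static boolean isZombie(String statLine) {
        return parseState(statLine) == 'Z';
    }

    /** Reads the current state of a live process (Linux only). */
    static char readState(int pid) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get("/proc", String.valueOf(pid), "stat"));
        return parseState(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        int pid = Integer.parseInt(args[0]);
        System.out.println("state of " + pid + ": " + readState(pid));
    }
}
```

This only answers "is the process a zombie or blocked right now?"; it says nothing about whether a sleeping process will ever wake up.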
dirk dierickx
Honored Contributor

Re: How To identify A hang Process in Linux

You can also look at what the process is doing with the 'strace' command. Either no output appears for a long time, or you see the same calls repeating, which could mean it is stuck in a loop.
Srimalik
Valued Contributor

Re: How To identify A hang Process in Linux

Hi, Dheeraj

The problem seems vague. Why would somebody want to monitor a client (Mozilla)? One should worry about a server hanging, not a client.

Still, as you say, you want to know whether a process has done nothing for a specified amount of time (you need to fix a limit on that). Note also that a process consuming no CPU is not necessarily hung. What you can do is:

have a look at /proc/<pid>/stat

This file contains a lot of numbers, including the accounting information. You can read the CPU time consumed in a loop and compare each new value with the previous one; if the value stays the same for long enough (you decide how long), you can treat the process as hung.
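
A rough Java sketch of that polling loop follows. It assumes the Linux stat layout, in which utime and stime are fields 14 and 15 (in clock ticks); the class name CpuStallCheck, the 10-second interval and the limit of 30 unchanged samples are arbitrary choices for illustration. Keep in mind the caveat above: an idle browser waiting for the user will also show no change in CPU time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library.
class CpuStallCheck {
    private long lastTicks = -1;   // -1 means no sample has been seen yet
    private int unchanged = 0;     // consecutive samples with identical CPU time

    /** Returns utime + stime (fields 14 and 15) from a /proc/<pid>/stat line. */
    static long cpuTicks(String statLine) {
        // Skip past the command name, which may contain spaces or ')'.
        int close = statLine.lastIndexOf(')');
        String[] f = statLine.substring(close + 2).trim().split("\\s+");
        // f[0] is field 3 (state), so field n is f[n - 3].
        return Long.parseLong(f[11]) + Long.parseLong(f[12]);
    }

    /** Records one sample; returns how many samples in a row showed no change. */
    int record(long ticks) {
        if (ticks == lastTicks) {
            unchanged++;
        } else {
            lastTicks = ticks;
            unchanged = 0;
        }
        return unchanged;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String pid = args[0];
        int limit = 30;                 // assumed: 30 unchanged samples
        long intervalMillis = 10_000;   // assumed: one sample every 10 seconds
        CpuStallCheck check = new CpuStallCheck();
        while (true) {
            byte[] raw = Files.readAllBytes(Paths.get("/proc", pid, "stat"));
            long ticks = cpuTicks(new String(raw, StandardCharsets.UTF_8));
            if (check.record(ticks) >= limit) {
                System.out.println("pid " + pid + ": CPU time unchanged for "
                        + limit + " samples, possibly hung or just idle");
            }
            Thread.sleep(intervalMillis);
        }
    }
}
```

The loop ends with an IOException once the process exits and its /proc entry disappears.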

-Sri
abandon all hope, ye who enter here..
Matti_Kurkela
Honored Contributor

Re: How To identify A hang Process in Linux

The problem seems to be almost equivalent to the Halting Problem of computability theory.

See Wikipedia's article about the Halting Problem:
http://en.wikipedia.org/wiki/Halting_problem

Any modern computer is essentially a Turing machine. Alan Turing proved in 1936 that there is no general solution to the Halting Problem. In other words, your task is *impossible* unless you can get some extra information or are willing to accept some false detections.

For example, the problem becomes much easier if you have a way of determining whether the program is making progress or not. For server programs that take in requests and process them, this is usually easy: determine the maximum acceptable time to process a request, and if the program is spending much more time with one request it's probably hung.

Or you could simply monitor the length of the queue of unprocessed requests: if the queue grows too long, there is a problem.
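
The two server-side checks above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class RequestWatchdog, its method names and the limits passed to it are invented here, and the server would have to call started() and finished() itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class RequestWatchdog {
    private final long maxMillis;   // maximum acceptable time for one request
    private final int maxPending;   // maximum acceptable number of unfinished requests
    private final Map<String, Long> pending = new LinkedHashMap<>();

    RequestWatchdog(long maxMillis, int maxPending) {
        this.maxMillis = maxMillis;
        this.maxPending = maxPending;
    }

    /** Call when a request arrives. */
    void started(String id, long nowMillis) {
        pending.put(id, nowMillis);
    }

    /** Call when a request has been fully processed. */
    void finished(String id) {
        pending.remove(id);
    }

    /** Requests that have been pending longer than the acceptable time. */
    List<String> overdue(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : pending.entrySet()) {
            if (nowMillis - e.getValue() > maxMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** True if too many requests are waiting or being processed. */
    boolean backlogTooLong() {
        return pending.size() > maxPending;
    }
}
```

A monitor would call overdue(System.currentTimeMillis()) and backlogTooLong() periodically and log whatever they report; both checks can still produce false detections if the limits are chosen badly.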

For interactive programs like a web browser, the "queue" is in the user's mind, so your program cannot determine the queue length. You'll have to add some assumptions, for example:
- when the browser process has a large %CPU value, it is probably drawing a new page to the display
- nobody will want to wait 10 minutes for a webpage to be displayed
- so if the browser has a large %CPU value for 10 minutes continuously, it's probably hung
But someday you might need to read a super-heavy web page which takes 11 minutes to display on your old computer...
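
As a sketch only, the "large %CPU for 10 minutes continuously" assumption can be written in Java like this. The class name SustainedCpuCheck and the 90% threshold in the usage note are assumptions made for illustration; how the %CPU samples are obtained (for example from successive readings of the process's CPU time) is left to the caller.

```java
// Hypothetical helper, not part of any library.
class SustainedCpuCheck {
    private final double thresholdPercent;  // what counts as "large %CPU"
    private final long limitMillis;         // how long it must stay that high
    private long highSince = -1;            // -1 means CPU is not currently high

    SustainedCpuCheck(double thresholdPercent, long limitMillis) {
        this.thresholdPercent = thresholdPercent;
        this.limitMillis = limitMillis;
    }

    /**
     * Records one %CPU sample taken at nowMillis. Returns true once the value
     * has stayed at or above the threshold continuously for the whole limit.
     */
    boolean record(double cpuPercent, long nowMillis) {
        if (cpuPercent < thresholdPercent) {
            highSince = -1;          // the streak is broken, start over
            return false;
        }
        if (highSince < 0) {
            highSince = nowMillis;   // first high sample of a new streak
        }
        return nowMillis - highSince >= limitMillis;
    }
}
```

For the example above one would construct it as new SustainedCpuCheck(90.0, 10 * 60 * 1000L) and feed it one sample per polling interval. The 11-minute web page would still be reported as hung, which is exactly the kind of false detection that has to be accepted.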


Assumptions like this are always specific to the type of program (web browser) and are not correct for any other type of program. For example, if a database engine is processing a lot of requests, it may have a high %CPU for hours at a time. But if your system is getting a lot of work done, that's exactly what is supposed to happen.

MK