Re: one node hung reason

Shivkumar · ‎06-16-2006

Hi,

We have a 2-node Serviceguard cluster running Oracle 9i RAC. Our sysadmin team found one node in a hung state
and rebooted the server manually. I thought the hung server would have gone through a TOC and rebooted itself.

Can someone explain what could cause this kind of hang, and why the node did not take a crash dump and reboot?

Thanks,
Shiv

Torsten. · ‎06-16-2006

Not every "hang" is the same.

The system may look hung to you while it is actually still running - that is one scenario.

The system really is hung and will crash - even in a non-SG environment.

Or your package control script is not working as expected.

And some more...

You need to analyze the logs to get the reason.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.
__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!

Deoncia Grayson_1 · ‎06-16-2006

The node could have panicked for any number of reasons, as Torsten stated; you have to investigate the logs to find out exactly why. Look at /etc/shutdownlog and your syslogs, and also check /var/adm/crash to see whether a crash dump was saved.
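
A rough sketch of those checks as a POSIX shell function, in case it helps. The function name check_panic_evidence is made up for illustration (it is not an HP-UX command), and the optional argument is only there so the function can be pointed at a copied directory tree instead of the live root.

```shell
#!/bin/sh
# Sketch only: report which post-reboot evidence exists.
# check_panic_evidence is a hypothetical helper name, not an HP-UX command.
# Optional argument: an alternate root directory (e.g. a copied tree); default is /.
check_panic_evidence() {
    root="${1:-}"

    # /etc/shutdownlog typically records shutdowns, reboots and panics.
    if [ -f "$root/etc/shutdownlog" ]; then
        echo "shutdownlog: present (last 5 entries follow)"
        tail -n 5 "$root/etc/shutdownlog"
    else
        echo "shutdownlog: missing"
    fi

    # A non-empty /var/adm/crash suggests a dump was saved at boot.
    if [ -n "$(ls -A "$root/var/adm/crash" 2>/dev/null)" ]; then
        echo "crash directory: has entries"
        ls -l "$root/var/adm/crash"
    else
        echo "crash directory: empty or missing"
    fi
}
```

Run it as root on the node with `check_panic_evidence` and no argument. An empty crash directory does not prove there was no panic; it only means no dump was saved there.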

Deonia

If no one ever took risks, Michelangelo would have painted the Sistine floor. -Neil Simon

Bill Hassell · ‎06-16-2006

There is no reason to assume that a hang condition will cause a TOC/panic. You have to determine the state of the system. When you say 'hung', is that being measured over a network connection? The system may be running fine while a network router has failed, so the link between your device (like a PC) and the server is broken. When the system hangs, you must go to the console so you can bypass the networking. If you can get logged in, you'll have to determine whether local networking is still working.

If the console login fails, there may be a hardware failure, and your cluster failover is not set up correctly.

Bill Hassell, sysadmin

inventsekar_1 · ‎06-17-2006

I have a question, Shiv:
how did your sysadmin team find that one node was in a hung state? Is there a command for that?
I am curious to know.

And what is meant by a "crash" in Unix?
Sometimes my Windows system crashes, and when we install Windows again we lose only the data on the C drive. What happens in a Unix crash? When we install the Unix OS again, what data can we get back or recover?

Be Tomorrow, Today.

Shivkumar · ‎06-17-2006

Sekar,
Someone told us, incorrectly, that it was in a hung state. In fact the system panicked and rebooted itself, so it seems to be expected behaviour.

Thanks,
Shiv

Bill Hassell · ‎06-17-2006

Actually, a system panic is not a 'normal' state. Your systems need up-to-date patches. It is not unusual for HP-UX systems to run for years and never have a panic and reboot, but these systems should be fully patched every 4 to 6 months.

Bill Hassell, sysadmin

Michael Steele_2 · ‎06-17-2006

A panic and reboot is not the same as a 'hung' system. See Mr. Hassell's description.

You should investigate your log files:

/var/adm/tombstones/ts98, ts99 (* HPMC's?*)
OLDsyslog.log (* Read from bottom up *)

SERVICE GUARD
cmreadlog /var/opt/cmon/cmomd.g
cmreadlog /var/opt/sgmgr/929917sgmgr.log
cmscancl -n node -o outputfile

SAVECRASH
/var/adm/crash/*

GSP > sl > e
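
For the "read from bottom up" step, here is a small sketch that prints the tail of a log newest line first, using only tail and awk. The name read_bottom_up and the default of 50 lines are my own assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: print the last N lines of a file in reverse order (newest first).
# read_bottom_up is a hypothetical helper name; N defaults to 50.
read_bottom_up() {
    file="$1"
    count="${2:-50}"
    # Keep the last N lines, buffer them in awk, then emit them last to first.
    tail -n "$count" "$file" |
        awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}
```

For example, `read_bottom_up /var/adm/syslog/OLDsyslog.log 100`, assuming the old syslog lives in the usual /var/adm/syslog directory on your system.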

Support Fatherhood - Stop Family Law
