Operating System - HP-UX

Re: Server Rebooted Automatically

 
Michael Steele_2
Honored Contributor

Re: Server Rebooted Automatically

Hi

Looks like you have run into an already known problem that has a patch fix.

http://www13.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c01878548-2

http://www13.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c00905839-9

The patches are embedded in the kmine docs.
Support Fatherhood - Stop Family Law
Fergus Brophy
Advisor

Re: Server Rebooted Automatically

Hello Michael,

I don't have access to those links. Would you be able to explain what was in them?

Thanks very much.
Fergus Brophy
Advisor

Re: Server Rebooted Automatically

Hello Viveki,
Are you sure about that? I have looked at other systems and I can see the same errors in the nettl.log00 file. Comparing the timestamps of the cable faulty or disconnect errors with the uptime on the server, it seems that this error occurs every time the servers are shut down.

Thanks.
Michael Steele_2
Honored Contributor

Re: Server Rebooted Automatically

PHSS_40145: 11.31 Serviceguard A.11.19.00
ABORT PANIC: If cmcld receives unexpected data, cmcld may hang, resulting in a node TOC. The following message will be logged in the flight recorder log:
SEC:01: Event - Unknown message version


See Attached
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor
Solution

Re: Server Rebooted Automatically

Well, this is a personal question for you. Did you shut down the server before halting the node?

IMPROPER SHUTDOWN ____

Another reason for a Serviceguard TOC may be performing a shutdown or
the reboot command before taking the node out of the Serviceguard cluster.


A symptom of this is often recorded in the /etc/shutdownlog:

21:22 Tue Sep 06 2005. Reboot after panic: SafetyTimer expired, INIT,
IIP:0xe000000000643680 IFA:0xe0000001f8fd8056
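A quick way to look for this symptom is to scan the shutdown log for such entries. The sketch below is only an illustration: the function name and the regular expression are my own, and the line layout is assumed from the single sample entry quoted above, so real logs may need a looser pattern.

```python
import re

# Layout assumed from the one sample entry above:
# "HH:MM Day Mon DD YYYY. Reboot after panic: <reason>"
PANIC_RE = re.compile(
    r"^(?P<when>\d{1,2}:\d{2}\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d{4})\.\s+"
    r"Reboot after panic:\s+(?P<reason>.*)$"
)

def find_safety_timer_panics(log_text):
    """Return the timestamps of 'SafetyTimer expired' panic entries."""
    hits = []
    for line in log_text.splitlines():
        match = PANIC_RE.match(line.strip())
        if match and "SafetyTimer expired" in match.group("reason"):
            hits.append(match.group("when"))
    return hits
```

To use it, read the contents of /etc/shutdownlog into a string and pass that string to find_safety_timer_panics().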

The shutdown command initiates the "/sbin/init.d/cmcluster stop"
script, which performs a "cmhaltnode". Normally, cmhaltnode signals all
packages to shut down, terminates all Serviceguard processes and terminates
the kernel safety timer, which is used to detect a kernel hang.

If a package fails to halt properly, however, cmhaltnode will not terminate
cmcld and the safety-timer process is left running. Consequently, the
shutdown command will eventually perform a 'reboot', which will kill cmcld,
leaving the safety timer counting down. If the timer reaches zero before the
O/S shuts down, a TOC occurs.
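The sequence above can be written down as a toy decision model. This is only a way to restate the reasoning, not Serviceguard code: the function name, the parameters and the durations are all hypothetical.

```python
def shutdown_outcome(package_halts_cleanly, safety_timer_s, os_shutdown_s):
    """Toy model of the shutdown sequence described above.

    package_halts_cleanly: True if every package halted, so cmhaltnode
        could terminate cmcld and the safety timer.
    safety_timer_s: hypothetical seconds left on the safety timer once
        'reboot' has killed cmcld.
    os_shutdown_s: hypothetical seconds the O/S still needs to go down.
    """
    if package_halts_cleanly:
        # The timer was terminated, so nothing is left counting down.
        return "clean shutdown"
    if safety_timer_s < os_shutdown_s:
        # The timer reaches zero before the O/S finishes shutting down.
        return "TOC"
    return "shutdown completes before the timer expires"
```

For example, shutdown_outcome(False, 30, 120) returns "TOC", while shutdown_outcome(True, 30, 120) returns "clean shutdown".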

Because "fuser -ku" is not designed to find and kill all processes
keeping files open, the most common cause of package halt failure is
the control script's inability to umount a file system. (See the
package control log.)

The recommended shutdown procedure is to perform cmhaltnode manually prior to
performing the shutdown command.
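That ordering can be enforced with a small wrapper. The following is a minimal sketch, assuming cmhaltnode exits non-zero when the halt fails; the function name and the injectable run parameter are my own, and both commands are shown without options, which are left to the caller.

```python
import subprocess

def halt_then_shutdown(halt_cmd=("cmhaltnode",),
                       shutdown_cmd=("shutdown",),
                       run=subprocess.run):
    """Run the node halt first; run shutdown only if the halt succeeded.

    Assumption: the halt command returns a non-zero exit status on failure.
    """
    halt = run(list(halt_cmd))
    if halt.returncode != 0:
        # Halt failed: stop here and check the package control logs
        # instead of shutting down with the safety timer still running.
        return False
    run(list(shutdown_cmd))
    return True
```

The run parameter exists only so the ordering can be exercised without touching a real cluster; on a real node the defaults would call the actual commands.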
Support Fatherhood - Stop Family Law
Fergus Brophy
Advisor

Re: Server Rebooted Automatically

Thanks Michael,
I think I have the sequence of events clear now. Putting together all the responses, including yours, it looks as if Node 1 lost connectivity with the cluster and the package tried to halt but failed to come down cleanly, due to a user in the mounted file system. Hence the safety timer was not stopped and the node rebooted as a result. All that is left is to figure out why the server lost its connection to the heartbeat LAN.
Thanks.
Michael Steele_2
Honored Contributor

Re: Server Rebooted Automatically

Hi

Well, I have to disagree with this comment "... due to a user in the mounted file system..."

In that case, the error in syslog would be that the VG is unable to deactivate. That is a fairly common occurrence that shows up here in the forum pretty regularly, unlike your problem, which is hard to find search hits on.
Support Fatherhood - Stop Family Law
purushottamaher
Frequent Advisor

Re: Server Rebooted Automatically

Hi,

 

I also had the same issue and found the solution here:

 

http://expertisenpuru.com/reboot-after-panic-server-rebooted-automatically-in-hp-ux/