Operating System - HP-UX

Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

 
moonchild
Regular Advisor

Two-node MC/SG cluster: a V2600 as the primary node and a V2500 as the secondary, with an XP256.

Since Nov. 2006 the MC/SG environment has been unstable: the backup host (V2500) took over the service from the production host (V2600) three times, at "9:03 Dec 26 2006", "8:31 Dec 27 2006" and "14:23 Feb 2 2007".

We adjusted some of the MC/SG configuration on Mar 19 2007. Since then the V2600 host has had a new problem: it reboots occasionally and the service is then taken over by the V2500. This year the reboots took place at "20:32 Apr 17", "15:39 Aug 16" and "15:17 Sep 24".

The OLDsyslog shows:
Aug 16 13:41:35 bocv26 cmcld: Processing exit status for service cmlvmd
Aug 16 13:41:35 bocv26 cmcld: Service cmlvmd terminated due to a signal(9).
Aug 16 13:41:35 bocv26 cmcld: Halting bocv26 to preserve data integrity
Aug 16 13:41:35 bocv26 cmcld: Reason: LVM daemon failed

thanks in advance
6 REPLIES
moonchild
Regular Advisor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

attached is the second part of the file
moonchild
Regular Advisor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

first part of file
Sameer_Nirmal
Honored Contributor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

It looks like the MC/SG node is getting TOC'd because the cmlvmd daemon is being killed.

Which MC/SG version is being used?
What are the patch levels (OS and MC/SG)?
skt_skt
Honored Contributor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

Please post the MC/SG version and patch level. Include the LVM patch level too.

#cmversion will work for newer versions.

Analyse the flight recorder dumps under /var/adm/cmcluster/frdump.cmcld.x using /usr/contrib/bin/cmfmtfr. That may provide more information.

What were the recent MC/SG configuration changes you mentioned? Any clue in the syslog from around the time of the reboot (look at OLDsyslog)?

#ps -ef|grep cmclconf
#cat /etc/shutdownlog

Are you sure neither node is in a resource crunch (e.g. no more MEM/SWAP left)? Cross-check the resource utilization at the time the TOC was triggered.
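
To make the syslog check above a bit more concrete, here is a rough sketch that pulls only the cmcld halt-related lines out of a syslog file. The function name cmcld_events is made up for illustration, and the default path assumes the usual HP-UX location of the pre-reboot log (/var/adm/syslog/OLDsyslog.log); pass a different file if yours lives elsewhere.

```shell
#!/bin/sh
# Sketch only: show cmcld lines that explain why a node was halted.
# cmcld_events is a hypothetical helper name, not a ServiceGuard command.
cmcld_events() {
    # Assumed default: the standard HP-UX pre-reboot syslog copy.
    logfile="${1:-/var/adm/syslog/OLDsyslog.log}"
    # Keep only cmcld messages, then only the termination/halt/reason lines.
    grep 'cmcld:' "$logfile" | grep -E 'terminated due to|Halting|Reason:'
}
```

Run as `cmcld_events` for the default file, or `cmcld_events /path/to/saved/syslog` for a copy. Comparing the timestamps it prints against sar or vmstat data from the same minutes, if any was collected, would help confirm or rule out the memory/swap theory.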
TY 007
Honored Contributor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

Hello Moonchild,

HP-UX OS Version = 11.00

Install the following latest LVM patch (if not already installed):
PHKL_35742 s700_800 11.00 LVM Cumulative Patch

ServiceGuard A.11.14:
PHSS_32656 s700_800 11.X MC/ServiceGuard and SG-OPS Edition A.11.14

ServiceGuard A.11.13:
PHSS_30742 s700_800 11.X MC/ServiceGuard and SG-OPS Edition A.11.13

ServiceGuard A.11.12:
PHSS_26270 s700_800 11.00 MC/SG & SG-OPS Edition A.11.12

ServiceGuard A.11.09:
PHSS_27158 s700_800 11.X MC/ServiceGuard and SG-OPS Edition A.11.09

Thanks
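
A quick way to see which of the patches listed above are already on the box is sketched below. It assumes you first save the installed-software listing to a file (for example with `swlist -l product > /tmp/swlist.out`); the helper name check_patches and the file name are only illustrative, and the match is a plain substring search on the patch ID.

```shell
#!/bin/sh
# Sketch only: report whether each patch ID appears in a saved swlist listing.
# check_patches is a hypothetical helper; $1 is the saved listing, the rest are patch IDs.
check_patches() {
    listing="$1"
    shift
    for patch in "$@"; do
        if grep -q "$patch" "$listing"; then
            echo "$patch installed"
        else
            echo "$patch MISSING"
        fi
    done
}
```

Example: `check_patches /tmp/swlist.out PHKL_35742 PHSS_26270`. Pick the PHSS patch that matches the ServiceGuard revision actually running on the node.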
moonchild
Regular Advisor

Re: Reboot after panic: SafetyTimer expired, isr.ior = 0'3400d4.0'f7fdeca8

MC SG rev A.11.12
OS 11.00

attached is the inventory.xml file