rx4640 MP Error Message

 
manamirhastam
Frequent Advisor

The rx4640 server restarted suddenly, with no shutdown entry in the OS logs.
The following entries are from the MP event log:

Log Entry 266: 06 Aug 2007 13:07:47
Alert Level 2: Informational
Keyword: Type-02 1f6f06 2060038
OS Boot complete.
Logged by: OS Software Agent;
Sensor: OS Boot
Data1: boot completed-boot device not specified
0x2146B71D230213C0 FF0F066F001F0300

Log Entry 265: 06 Aug 2007 13:03:10
Alert Level 1: Major Forward Progress
Keyword: BOOT_START
CPU starting boot
Logged by: System Firmware 0
Data: Major change in system state
0x5480006300E013A0 0000000000000000


Log Entry 264: 06 Aug 2007 13:03:10
Alert Level 2: Informational
Keyword: Type-02 1d0a00 1903104
CPU starting boot
Logged by: Redundant w/ an E0 code;
Sensor: System Boot Initiated
Data1: transition to Running
0xC146B71C0E021390 FFFF000A001D0300


Log Entry 263: 06 Aug 2007 13:03:06
Alert Level 2: Informational
Keyword: Type-02 127002 1208322
Soft Reset
Logged by: Baseboard Management Controller;
Sensor: System Event
0x2046B71C0A021380 FFFF027000120300

Log Entry 262: 06 Aug 2007 13:02:20
Alert Level 3: Warning
Keyword: HP-UX_CRASHDUMP_STARTED
OS crashdump started (D700)
Logged by: HP-UX Kernel 1
Data: Legacy 20-bit PA HEX chassis code
0x7F80033701E01360 00000000000AD700


Log Entry 261: 06 Aug 2007 13:02:18
Alert Level 5: Critical
Keyword: HP-UX_HEX_FAULT_CODE
OS legacy PA hex fault code (Bxxx)
Logged by: HP-UX Kernel 1
Data: Legacy 20-bit PA HEX chassis code
0xBF80033801E01340 000000000002B100

Log Entry 260: 06 Aug 2007 13:02:17
Alert Level 7: Fatal
Keyword: Type-02 206f01 2125569
OS run-time critical shutdown
Logged by: OS Software Agent;
Sensor: OS Critical Stop
Data1: Run-time Stop
0x2146B71BD9021330 FF0F016F00200300


Log Entry 259: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: INIT_INITIATED
INIT initiated
Logged by: System Firmware 3
Data: Major change in system state - LED Command only
0xF480007903E01310 0000000000000003

Log Entry 258: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: Type-02 136f03 1273603
INIT Initiated
Logged by: Redundant w/ an E0 code;
Sensor: Critical Interrupt
Data1: Software NMI
Data2: OEM Code1: 0x3F OEM Code2: 0x00
0xC146B71BD7021300 003FA36F00130300


Log Entry 257: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: INIT_INITIATED
INIT initiated
Logged by: System Firmware 2
Data: Major change in system state - Major Change to log in Activity
0xF480007902E012E0 0000000000000002

Log Entry 256: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: Type-02 136f03 1273603
INIT Initiated
Logged by: Redundant w/ an E0 code;
Sensor: Critical Interrupt
Data1: Software NMI
Data2: OEM Code1: 0x3F OEM Code2: 0x00
0xC146B71BD70212D0 003FA36F00130300


Log Entry 255: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: INIT_INITIATED
INIT initiated
Logged by: System Firmware 0
Data: Major change in system state
0xF480007900E012B0 0000000000000000

Log Entry 254: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: Type-02 136f03 1273603
INIT Initiated
Logged by: Redundant w/ an E0 code;
Sensor: Critical Interrupt
Data1: Software NMI
Data2: OEM Code1: 0x3F OEM Code2: 0x00
0xC146B71BD70212A0 003FA36F00130300

Any idea?
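
For anyone reading a long MP log like the one above, here is a minimal sketch (not an HP tool) that condenses a saved text copy of the log into one line per entry at or above a chosen alert level, oldest first. The function name summarize_mp_log is hypothetical, and it assumes the log was saved in exactly the layout shown in this post; the sample input is copied from entries 259-262.

```shell
#!/bin/sh
# Sketch: condense an MP event log dump into one line per serious entry.
# Assumption: the log is plain text in the layout shown in this thread.

summarize_mp_log() {
    # $1 = minimum alert level to report (for example 5 = Critical and above)
    awk -v min="$1" '
        /^Log Entry [0-9]+:/ {
            entry = $3; sub(/:$/, "", entry)
            stamp = $4 " " $5 " " $6 " " $7
        }
        /^Alert Level [0-9]+:/ {
            level = $3; sub(/:$/, "", level)
        }
        /^Keyword:/ {
            keyword = $0; sub(/^Keyword: */, "", keyword)
            if (level + 0 >= min + 0)
                printf "%s|%s|level %s|%s\n", entry, stamp, level, keyword
        }
    ' | sort -t'|' -k1,1n   # the MP lists newest first; sort by entry number
}

summarize_mp_log 5 <<'EOF'
Log Entry 262: 06 Aug 2007 13:02:20
Alert Level 3: Warning
Keyword: HP-UX_CRASHDUMP_STARTED
OS crashdump started (D700)

Log Entry 261: 06 Aug 2007 13:02:18
Alert Level 5: Critical
Keyword: HP-UX_HEX_FAULT_CODE
OS legacy PA hex fault code (Bxxx)

Log Entry 260: 06 Aug 2007 13:02:17
Alert Level 7: Fatal
Keyword: Type-02 206f01 2125569
OS run-time critical shutdown

Log Entry 259: 06 Aug 2007 13:02:15
Alert Level 7: Fatal
Keyword: INIT_INITIATED
INIT initiated
EOF
```

With a minimum level of 5, the Warning entry 262 is dropped and entries 259, 260 and 261 are listed in that order, which makes the sequence INIT, critical stop, fault code easier to see.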
Sameer_Nirmal
Honored Contributor
Solution

Re: rx4640 MP Error Message

According to the MP forward progress logs (FPL), there was an HP-UX kernel panic. You need to get the crashdump analyzed. Log a software call with HP Support.

Your profile shows
"I have assigned points to 0 of 22 responses to my questions. "

Please take some time to assign points to the responses to your questions.
This will encourage faster and more numerous responses to your future posts.
tkc
Esteemed Contributor

Re: rx4640 MP Error Message

This event can be triggered by the "tc" command from the MP, or by the button labeled "TOC" or "Transfer of Control" on the management card or bezel of the system. Software can also generate an INIT for other reasons. You will need to analyse the crashdump to find the root cause. If your system is a node in a Serviceguard cluster, the TOC could also have been caused by Serviceguard.
manamirhastam
Frequent Advisor

Re: rx4640 MP Error Message

Hi Sameer,
You are completely right about the points.
Regarding the crash dump, I need a clear routine for analysing it.
I found many ways to do it, and the methods differ.
For example, there are several ways to invoke the q4 command.
Which tool do you recommend for analysing the crash dump: q4?
And is there a standard routine for using q4?
Thanks.
Robin T. Slotten
Trusted Contributor

Re: rx4640 MP Error Message

Use q4 to process the crash dump.

http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c01021636-6

http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c01037168-1

or search for "q4 dump" in the KB.

The clue will be in the what.out file. You may end up shipping the dump to HP to decipher all of it.

Here is a script I use to preprocess the dump. NOTE: I found the q4lib directory one level deeper than the documents showed.

--------snip--------
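CRASHDIR=/var/adm/crash/crash.6   # adjust to the crash directory being analysed
Q4_PERL_PATH=/opt/perl/bin/perl
export Q4_PERL_PATH
Q4LIB=/usr/contrib/Q4/lib

set -x

cd "$CRASHDIR" || exit 1
pwd

/usr/contrib/Q4/bin/q4prep -p

# Unpack the q4 Perl library once, in place, if it is not there yet.
# Assumes the archive unpacks to q4lib/ under $Q4LIB.
if [ ! -f "$Q4LIB/q4lib/sample.q4rc.pl" ]; then
    if [ -f "$Q4LIB/Q4Lib.tar.Z" ]; then
        (cd "$Q4LIB" && uncompress Q4Lib.tar.Z && tar -xf Q4Lib.tar)
    fi
fi

# Create a per-user q4 startup file from the sample if none exists.
if [ ! -f "$HOME/.q4rc.pl" ]; then
    cp "$Q4LIB/q4lib/sample.q4rc.pl" "$HOME/.q4rc.pl"
fi

# Uncompress the kernel image if it was saved compressed, then run q4pxdb on it.
if [ -f vmunix.gz ]; then
    /usr/contrib/bin/gunzip vmunix.gz
fi
/usr/contrib/Q4/bin/q4pxdb vmunix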

------snip done -----
Now run q4 from the crash directory and feed it the following commands:

#/usr/contrib/Q4/bin/q4 -p . #NOTE the "."

q4> trace event 0 > trace.out
q4> include analyze.pl
q4> run Analyze AU >> ana.out
q4> include whathappened.pl
q4> run WhatHappened -HANG > what.out
q4> exit


Rob...
IF you do it more than twice, write a script.
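
In the spirit of the signature above, the interactive q4 session in this reply could be replayed from a script. This is only a sketch: it assumes q4 reads its commands from standard input when input is piped to it, which this thread does not confirm, so try it on a copy of the dump first. The function name write_q4_commands is hypothetical; the commands themselves are the ones listed above.

```shell
#!/bin/sh
# Sketch only: replay the q4 commands from this reply instead of typing them.
# Assumption (unverified here): q4 accepts its commands on standard input.

CRASHDIR=${CRASHDIR:-/var/adm/crash/crash.6}
Q4=/usr/contrib/Q4/bin/q4

write_q4_commands() {
    # Emit the same command sequence as the interactive session.
    cat <<'EOF'
trace event 0 > trace.out
include analyze.pl
run Analyze AU >> ana.out
include whathappened.pl
run WhatHappened -HANG > what.out
exit
EOF
}

if [ -x "$Q4" ] && [ -d "$CRASHDIR" ]; then
    cd "$CRASHDIR" || exit 1
    write_q4_commands | "$Q4" -p .
else
    # Not on a system with q4 installed: just show what would be sent.
    echo "q4 or crash directory not found; commands that would be sent:"
    write_q4_commands
fi
```

Keeping the command list in one function means the preprocessing script and the analysis step can later be merged into a single file without retyping anything.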
manamirhastam
Frequent Advisor

Re: rx4640 MP Error Message

I used q4 to analyse crash dump and I found the following line in what.out:

"Bad News: Cannot use the Kernel Stack when interrupted on the ICS."

Any idea ?
Sameer_Nirmal
Honored Contributor

Re: rx4640 MP Error Message

Looking at the stack trace, you may need to install this patch:
http://www1.itrc.hp.com/service/patch/patchDetail.do?patchid=PHNE_35182&sel={hpux:11.23,}&BC=main|search|

I have come across a few rx8640 server panics with a similar stack trace before; installing the latest cumulative ARPA Transport patch resolved them. I think applying this patch would resolve the panic issue.

Besides using Q4 as mentioned above, you can also use CRASHLITE (/usr/contrib/ktools/bin/crashlite), a newer dump-reading tool that may already be installed with your OE. It is available with the 11.23 Sept '04 release or later.
manamirhastam
Frequent Advisor

Re: rx4640 MP Error Message

Hi Sameer,

Do you propose applying PHNE_35182 as a single patch, or installing the whole Standard Patch Bundle (Jun '07)?

Thanks
Sameer_Nirmal
Honored Contributor

Re: rx4640 MP Error Message

Oops! I meant to post the link for the latest recommended patch, PHNE_35766, which supersedes PHNE_35182. PHNE_35182 carries a warning, as you can see in the link I provided earlier. Install PHNE_35766 instead.
http://www1.itrc.hp.com/service/patch/patchDetail.do?patchid=PHNE_35766&sel={hpux:11.23,}&BC=main|search|

I would install only this patch for now to get rid of the panic issue. As for the Jun '07 patch bundle, I would first assess the patches in it and their recommended ratings. We usually apply a newly released patch bundle 3-6 months after its release, once the patch assessment is done, unless something comes up that requires installing an individual patch or patches sooner.
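
Before installing a single patch like this, it can help to check whether it is already on the system. The following is a rough sketch, assuming that installed patches show up by ID in the output of swlist -l product on HP-UX 11.23; the helper name patch_listed is hypothetical, and the check is skipped where swlist is not available.

```shell
#!/bin/sh
# Sketch: check whether a given patch ID is already listed by swlist.
# Assumption: installed patches appear by ID in "swlist -l product" output.

PATCH=PHNE_35766

patch_listed() {
    # Succeeds if the patch ID ($1) appears as a whole field on stdin.
    awk -v p="$1" '
        { for (i = 1; i <= NF; i++) if ($i == p) found = 1 }
        END { exit !found }
    '
}

if command -v swlist >/dev/null 2>&1; then
    if swlist -l product 2>/dev/null | patch_listed "$PATCH"; then
        echo "$PATCH is already installed"
    else
        echo "$PATCH is not installed"
    fi
else
    echo "swlist not found; this check only makes sense on HP-UX"
fi
```

Matching on a whole field rather than a substring avoids a false positive from a longer identifier that merely contains the patch ID.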