HPE 9000 and HPE e3000 Servers

K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

 
Singar
Advisor


Hello gurus,

Today one of our K-Class 370 servers rebooted by itself. I checked /etc/shutdownlog for entries, and the following message had been added before the reboot:
"Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0".

Following is the filtered output of /var/tombstones/ts99 file.

HPMC Chassis Codes = 0xcbf0 0x20b3 0x5408 0x5508 0xcbfb
HPMC Chassis Codes = 0xcbf0 0x510b 0x5408 0x5508 0xcbfb
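The filtered lines above can be pulled out of a tombstone file with standard tools. This is a minimal sketch, assuming the "HPMC Chassis Codes = ..." line format shown here; the function name hpmc_codes is made up for illustration, and the sketch only lists the codes, it does not decode them.

```shell
#!/bin/sh
# Print every HPMC chassis code found in a tombstone file, one per line.
# Assumes lines of the form "HPMC Chassis Codes = 0x.... 0x.... ...".
hpmc_codes() {
    # $1: path to a tombstone file, e.g. /var/tombstones/ts99
    grep 'HPMC Chassis Codes' "$1" |
        sed 's/.*= *//' |
        tr ' ' '\n' |
        grep '^0x'
}

# Example use: count how often each code occurs
#   hpmc_codes /var/tombstones/ts99 | sort | uniq -c
```

Counting the codes makes it easy to see which ones differ between HPMC entries, as 0x20b3 and 0x510b do in the two lines above.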

Can anyone tell me what is wrong with my system? What does 0x20b3 mean?
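The isr.ior field in the shutdownlog message packs two hex words. On PA-RISC, ISR and IOR should be the Interruption Space Register and the Interruption Offset Register. A small sketch for splitting the field follows; the function name split_isr_ior is hypothetical, and the code only separates the text, it does not diagnose anything.

```shell
#!/bin/sh
# Split "isr.ior = 0'XXXXXXXX.0'YYYYYYYY" into its two hex halves.
# The 0' prefix is taken literally from the log text quoted in the post.
split_isr_ior() {
    # $1: a line of text containing the isr.ior field
    printf '%s\n' "$1" |
        sed -n "s/.*isr\.ior = 0'\([0-9a-fA-F]*\)\.0'\([0-9a-fA-F]*\).*/ISR=0x\1 IOR=0x\2/p"
}

split_isr_ior "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"
# prints: ISR=0x1004000b IOR=0xb86e12a0
```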

 

 

P.S. This thread has been moved from General to HP 9000. - HP Forum Moderator

13 REPLIES
Adisuria Wangsadinata_1
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

Please post the complete ts99 file as an attachment.

Thanks & Cheers,
AW
now working, next not working ... that's unix
Singar
Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

Attaching ts99 file...

thx
-singaravelu
Biswajit Tripathy
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

If the system panicked, check whether a
crash dump was created under /var/adm/crash at the
time of the crash.

- Biswajit
:-)
Isralyn Manalac_1
Regular Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

I'd suggest ringing HP ESC to have the ts file fingerprinted. The ts file indicates a possible system board failure. Do you have files under /var/adm/crash? Can you also please post the OLDsyslog.log?

Regards,

Ira
Singar
Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hello all,

OLDsyslog.log is of no use. It contains no message that relates to the panic.
I followed the procedure below to get some details on the panic using the q4 debugger:

1) # cd /var/adm/crash/crash.0
2) # uncompress /usr/contrib/lib/Q4Lib.tar.Z
3) # tar -xf /usr/contrib/lib/Q4Lib.tar
4) # cp q4lib/sample.q4rc.pl ~/.q4rc.pl
5) # /usr/contrib/bin/gunzip vmunix.gz
6) # /usr/contrib/bin/q4pxdb vmunix
7) # q4 -p .

I got the following output when "q4 -p ." was run:
===========================================
@(#) q4 $Revision: 11.X B.11.23d Thu May 6 18:05:11 PST 2003$ 0
q4: (warning) Here are the savecore error messages -
q4: (warning) savecrash: Insufficient space to save full core dump,
q4: (warning) savecrash: Could not completely process dump image area 1 on /dev/dsk/c0t4d0
Reading kernel symbols ...
Reading data types ...
Error: requested page does not exist on target machine.
q4: (warning) failing page number = 0xbdb status - -2
q4: (error) can not read symbol pdirhash_type from core file
quit
===========================================

The expected behaviour was that q4 should have taken me to the q4 prompt. Looking at the message, I concluded that the dump area was smaller than required, so I added one more dump device using "crashconf". Below is the output of crashconf:

===========================================
Crash dump configuration has been changed since boot.

CLASS     PAGES       INCLUDED IN DUMP  DESCRIPTION
--------  ----------  ----------------  -------------------------------------
UNUSED         91611  no, by default    unused pages
USERPG         40278  no, by default    user process pages
BCACHE        219784  no, by default    buffer cache pages
KCODE           2567  no, by default    kernel code pages
USTACK          1630  yes, by default   user process stacks
FSDATA           684  yes, by default   file system metadata
KDDATA        135937  yes, by default   kernel dynamic data
KSDATA         31797  yes, by default   kernel static data

Total pages on system:            524288
Total pages included in dump:     170048

DEVICE        OFFSET(kB)   SIZE (kB)  LOGICAL VOL.  NAME
------------  ----------  ----------  ------------  -------------------------
31:0x004000       314208     4194304  64:0x000002   /dev/vg00/lvol2
31:0x005000       105312     1048576  64:0x010002   /dev/vg01/lvol2   <<< This is the newly added dump device
                          ----------
                             5242880
==========================================
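A rough check of the numbers above, assuming 4 kB pages (the usual PA-RISC HP-UX page size; adjust PAGE_KB if yours differs). The selected classes come to about 664 MB, well under the 4 GB already configured on /dev/vg00/lvol2. If that assumption holds, the savecrash "Insufficient space" warning may refer to free space in the file system holding /var/adm/crash rather than to the dump device, but treat that as a guess to verify. The function name dump_kb_needed is made up for illustration.

```shell
#!/bin/sh
# Compare the size of the selected dump pages with the configured dump space.
# Assumption: 4 kB per page.
PAGE_KB=4

dump_kb_needed() {
    # $1: "Total pages included in dump" as reported by crashconf
    echo $(( $1 * PAGE_KB ))
}

needed=$(dump_kb_needed 170048)
primary=4194304    # kB on /dev/vg00/lvol2, from the crashconf output
echo "needed: ${needed} kB, primary dump device: ${primary} kB"
if [ "$needed" -le "$primary" ]; then
    echo "primary dump device alone is large enough"
fi
```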

I tried q4 again, and the same error occurred.

Has anyone ever run q4 to get more details of a panic?

Or could it be that I haven't configured crashconf to store the panic details?

thx
-velu
Singar
Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Finally I got q4 to run and produced the following two files, as described in http://67.176.164.101/ll/hpq4.html#full_doc

- trace.out
- ana.out

When I grepped them for HPMC, I got:

# grep HPMC ana.out trace.out
ana.out:Crash Event 1 (HPMC, struct crash_event_table_struct at 0xa89030):
ana.out:crash event was an HPMC

What is next? We don't have a support contract with HP.

Does anyone have the K-Class Service Manual?
Fabio Ettore
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi Singar,

If you don't have a support contract with HP, that is the real problem.
I am pretty sure the panic was caused by a bad CPU. I had a very similar problem a few days ago (with an isr.ior entry in /etc/shutdownlog and an HPMC), and replacing a CPU solved it.

Best regards,
Fabio
WISH? IMPROVEMENT!
melvyn burnard
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

A hardware problem caused your system to panic. You will need to get this analysed to find out what caused it.
I suggest you log a support call with your hardware maintenance supplier, or you could log a call with HP.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Deepak.R
Frequent Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

It's a hardware issue: an HPMC (High Priority Machine Check) occurred in your system.

A detailed analysis of ts99, or of the PIM collected from the BCH menu, has to be done with special decoders to find out which part of the system caused the error.

It could be a mainboard or I/O expansion board issue. You need to log a call with HP; they may replace it on a one-time payable basis, since you do not have a contract.

You will not find HPMC analysis information in service manuals, and this is not a user-performable action.


regards
deepak
Singar
Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

I see there are 2 processors in the system.
Can I disable the defective processor by any means?

TIA,
velu
Fabio Ettore
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

yes, you can do it:

1. Reboot the system.

2. Interrupt the boot process within 10 seconds by hitting any key.

3. This will bring up the Main Menu prompt from the BCH.

4. Enter the Service command:

Main menu: Enter command> SER

5. Enter the CPUconfig command:

Service Menu: Enter command> CPUconfig [] [ON|OFF]

Where the first argument is the processor number, [0|1].

6. Enter the RESET command to reboot the system:

Service Menu: RESET



Have you found which CPU is bad? And first of all, are you sure that is the problem?
I described what happened to me, but an analysis of the core dump is needed to find the cause of the problem.

Best regards,
Fabio
WISH? IMPROVEMENT!
Singar
Advisor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi Fabio,

Thank you for the CPU disable procedure.
Since I don't have support with HP, I have to determine the root cause myself and ask top management to budget for the fix and get it sanctioned.

I am assuming that the problem is due to one of the processors. Please see the URL below for why I suspect a CPU:

http://unix.derkeiler.com/Mailing-Lists/HP-UX-Admin/2004-04/0075.html

Thank you all for the support !
-velu
Fabio Ettore
Honored Contributor

Re: K 370 - "Reboot after panic: , isr.ior = 0'1004000b.0'b86e12a0"

Hi,

OK, I also suppose the problem is a bad CPU, going by my own experience a few days ago. But I would advise you not to wait any longer, because the definitive answer about the cause should come from HP analyzing the dump.

Best regards and good luck,
Fabio
WISH? IMPROVEMENT!