
System crash and reboot on HP-UX 11.11 but don't know what happened - help?

 
gunners
Frequent Advisor


Hi Everyone,

Just looking for a dig out. I'm trying to find out why our system (HP-UX 11.11) crashed last night.

I checked /var/adm/syslog and the last entry in it is an ftp, but there is nothing else to suggest there was a problem.

 

The system rebooted itself OK. I am wondering whether there are any other logs, or any other way I can find out what caused it?

Thanks in advance

7 REPLIES
Robert_Jewell
Honored Contributor

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

You don't say what system model, but check out the following:
- /etc/shutdownlog
Look for the last entry. The entry may give more of a clue as to what happened.

- /var/tombstones/ts99
Look for the timestamp in the logs. A valid entry may indicate a hardware issue.

- /var/adm/crash/
Look at the latest directory if there is one (crash.0, crash.1, etc). If it was created at the time of the crash, review the INDEX file within. It likely contains the same info as the shutdownlog, but is another place to check.

After this, it depends on what you find.
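
To make the /var/adm/crash check above a bit more concrete, here is a minimal Python sketch. It only assumes what is described above: a /var/adm/crash directory holding subdirectories named crash.0, crash.1, and so on, each possibly containing an INDEX file. The function name newest_crash_dir is made up for this example, and Python is not part of a stock 11.11 install, so treat it as an illustration of the logic rather than a ready-made tool.

```python
import os
import time


def newest_crash_dir(root="/var/adm/crash"):
    """Return (path, mtime) of the most recently modified crash.* directory.

    Returns None if root does not exist or holds no crash.* directories.
    """
    if not os.path.isdir(root):
        return None
    candidates = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if name.startswith("crash.") and os.path.isdir(path):
            candidates.append((os.path.getmtime(path), path))
    if not candidates:
        return None
    mtime, path = max(candidates)
    return path, mtime


if __name__ == "__main__":
    found = newest_crash_dir()
    if found is None:
        print("no crash.* directory found")
    else:
        path, mtime = found
        # Compare this timestamp with the time of the reboot.
        print(path, time.ctime(mtime))
        print("INDEX present:", os.path.isfile(os.path.join(path, "INDEX")))
```

If the timestamp printed does not line up with the time of the reboot, the directory is probably left over from an earlier crash.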

-Bob


gunners
Frequent Advisor

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

Bob, they are great pointers, many thanks. I will check them out tomorrow.

Thanks again

gunners
Frequent Advisor

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

While I will probably log a call with HP, I thought I'd bounce this off you, having checked the logs. Does this look like a memory dump or something? The system hadn't been rebooted since Feb (maybe it just seized up).

 

/etc/shutdownlog:

00:13  Tue Aug 30 2011.  Reboot after panic: Data page fault

/var/tombstones/ts99:

--------------  Memory Error Log Information  --------------
   No errors logged for this bus

------------  I/O Module Error Log Information  ------------
  No IO subsystem errors recorded

-----------------  Processor 3 HPMC Information - PDC Version: 45.11  ------

   * * * No valid timestamp * * *
       No HPMC chassis codes logged

General Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19  0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27  0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31  0000000000000000  0000000000000000  0000000000000000  0000000000000000

Control Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19  0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27  0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31  0000000000000000  0000000000000000  0000000000000000  0000000000000000

Space Registers 0 - 7
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000

IIA Space (back entry)       = 0x0000000000000000
IIA Offset (back entry)      = 0x0000000000000000
Check Type                   = 0x00000000
Cpu State                    = 0x00000000
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x00000000
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x0000000000000000
System Requestor Address     = 0x0000000000000000

Floating Point Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15  0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19  0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23  0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27  0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31  0000000000000000  0000000000000000  0000000000000000  0000000000000000

PIM Revision                 = 0x0000000000000000
CPU ID                       = 0x0000000000000000
CPU Revision                 = 0x0000000000000000
Cpu Serial Number            = 0x0000000000000000
Check Summary                = 0x0000000000000000
SAL Timestamp                = 0x0000000000000000
System Firmware Rev.         = 0x0000000000000000
PDC Relocation Address       = 0x0000000000000000
Available Memory             = 0x0000000000000000
CPU Diagnose Register 2      = 0x0000000000000000
MIB_STAT                     = 0x0000000000000000
MIB_LOG1                     = 0x0000000000000000
MIB_LOG2                     = 0x0000000000000000
MIB_ECC_DATA                 = 0x0000000000000000
ICache Info                  = 0x0000000000000000
DCache Info                  = 0x0000000000000000
Sharedcache Info1            = 0x0000000000000000
Sharedcache Info2            = 0x0000000000000000
MIB_RSLOG1                   = 0x0000000000000000
MIB_RSLOG2                   = 0x0000000000000000
MIB_RQLOG                    = 0x0000000000000000
MIB_REQLOGa                  = 0x0000000000000000
MIB_REQLOGb                  = 0x0000000000000000
Reserved                     = 0x0000000000000000
Cache Repair Detail          = 0x0000000000000000

PIM Detail Text:

--------------  Memory Error Log Information  --------------
   No errors logged for this bus

------------  I/O Module Error Log Information  ------------
  No IO subsystem errors recorded

Processor [4]: Not Present.
Processor [5]: Not Present.
Processor [6]: Not Present.
Processor [7]: Not Present.

Return Warnings:

Return Revisions:

FRU INFORMATION

        Module              Revision
        ------              --------
        PA 8900 CPU Module  3.2
        PA 8900 CPU Module  3.2
        PA 8900 CPU Module  3.2
        PA 8900 CPU Module  3.2

Board Info!
  Format Version  : 0x1                   Language Code : 0x0
  Mfg Date        :                       Mfg Name      : JABIL
  Product Name    : augustus baseboard
  Serial Number   : 52JAPE4550007610
  Part Number     : A6961-60201
  Fru File Tp/Len : 0x1  Fru File : ^P
  Revision        : A  Eng Date Code : 4545
  Artwork Rev     : A5  Fru Info

 

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

>does this look like a memory dump or something?

 >/etc/shutdownlog: 00:13  Tue Aug 30 2011.  Reboot after panic: Data page fault

 

Yes.  As it said, you have a panic.  How up to date are you on patches?
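
If the shutdownlog is long, the panic entries can be pulled out with a few lines of Python. This is only a sketch: the pattern is modelled on the single entry quoted above ("HH:MM  Day Mon DD YYYY.  Reboot after panic: reason"), and it assumes other panic entries follow the same layout. The names PANIC_RE and find_panics are made up for this example.

```python
import re

# Pattern modelled on the one entry quoted in this thread; other layouts
# are not handled.
PANIC_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2})\s+"
    r"(?P<date>\w{3} \w{3} +\d{1,2} \d{4})\.\s+"
    r"Reboot after panic:\s*(?P<reason>.*)$"
)


def find_panics(lines):
    """Return a list of (date, time, reason) tuples for panic entries."""
    panics = []
    for line in lines:
        match = PANIC_RE.match(line.strip())
        if match:
            panics.append(
                (match.group("date"), match.group("time"),
                 match.group("reason").strip())
            )
    return panics


if __name__ == "__main__":
    with open("/etc/shutdownlog") as log:
        for date, clock, reason in find_panics(log):
            print(date, clock, reason)
```

Any line that says "Reboot after panic" means the kernel went down by itself rather than through a normal shutdown or reboot command.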

Robert_Jewell
Honored Contributor

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

Basically, a Data Page Fault means something hosed up a memory address. It could be a CPU or it could be a driver, but you can't tell without a kernel dump analysis. However, as Dennis alludes to, if your patch bundles are old (more than a year or two), consider updating.

 

-Bob


Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

>Could be a CPU or could be a driver, but can't tell without a kernel dump analysis.

 

It is more likely a software bug than hardware, so you need it analyzed.

Michael_Pelleti
Occasional Advisor

Re: System crash and reboot on HP-UX 11.11 but don't know what happened - help?

The support center has a tool called "crashinfo" that pulls out salient information from a crash dump. The "Q4" utility is also available in /usr/contrib/bin/q4, and has a "WhatHappened" command that can do something similar without opening a support call:

 

http://www.unixguide.net/hp/hpuxcrashdump.shtml

 

The /var/adm/crash/crash.* directory will have an "INDEX" file in it. This will show a line near the beginning tagged "panic", with an isr.ior value. If that's equal to "0'0.0'0", for example, then you're looking at a kernel-space null-pointer dereference data page fault. The isr.ior is the interruption space register and interruption offset register.
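
As a rough illustration of that check, here is a small Python sketch. Several things in it are assumptions rather than documented facts: that the panic line contains the text "isr.ior = <isr>.<ior>", that the values are hexadecimal, and that an apostrophe separates the upper and lower 32 bits of a 64-bit value. The function names are made up for this example, and the sample string in the usage note is constructed from the description above, not copied from a real INDEX file.

```python
import re

# Assumed layout: "isr.ior = <isr>.<ior>", hex digits, optional apostrophe
# between the upper and lower 32-bit halves.
ISR_IOR_RE = re.compile(r"isr\.ior\s*=\s*([0-9a-fA-F']+)\.([0-9a-fA-F']+)")


def parse_word(text):
    """Convert a value such as 0'92dd2b0 (assumed high'low hex) to an int."""
    if "'" in text:
        high, low = text.split("'", 1)
        return (int(high, 16) << 32) | int(low, 16)
    return int(text, 16)


def parse_isr_ior(line):
    """Return (isr, ior) as integers, or None if the line has no isr.ior."""
    match = ISR_IOR_RE.search(line)
    if match is None:
        return None
    return parse_word(match.group(1)), parse_word(match.group(2))


def looks_like_null_dereference(line):
    """True when both isr and ior are zero, the example case given above."""
    return parse_isr_ior(line) == (0, 0)
```

Calling looks_like_null_dereference("panic  Data page fault, isr.ior = 0'0.0'0") returns True under these assumptions. A non-zero ior only tells you the faulting offset, so it still needs a proper dump analysis.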