Re: server crash on 10.20 system

 
MikeL_4
Super Advisor

server crash on 10.20 system

We had a server crash last night at around 21:00. The system rebooted and came back up, and so far there have been no recurrences.
This server has 12 CPUs and it came back up with all 12. I suspected, and possibly still suspect, a CPU error as the cause, but nothing in syslog or messages indicates any type of error. I've attached the tombstone ts99 file, which shows CPU errors, but they were recorded after the fact.
I have a core dump in /var/adm/crash but don't know how to read or check it myself.
If anyone could give some guidance, I would appreciate it...
9 REPLIES
Pete Randall
Outstanding Contributor

Re: server crash on 10.20 system

Mike,

Check /etc/shutdownlog and /var/adm/syslog/OLDsyslog.log (if you haven't already) for any hints as to the cause.


Pete

Patrick Wallek
Honored Contributor

Re: server crash on 10.20 system

Here's the doc you want:
DOC ID OZBEKBRC00000611
http://www2.itrc.hp.com/service/cki/search.do?category=c0&mode=id&searchString=OZBEKBRC00000611&search.x=0&search.y=0&searchCrit=exactphrase&docType=Security&docType=Patch&docType=EngineerNotes&docType=BugReports&docType=Hardware&docType=ReferenceMaterials&docType=ThirdParty

If you have this machine under hardware support, do what the doc says, then send everything to HP and let them tell you what's wrong.
Helen French
Honored Contributor

Re: server crash on 10.20 system

Check the shutdownlog file in /etc and the old syslog file (OLDsyslog.log). They should have some useful information for you. I would also do a crash analysis and definitely check the tombstone files for any errors. Check the time stamps too when you read those files.
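
As a rough illustration of checking time stamps, here is a minimal Python sketch that keeps only the syslog lines stamped within a window around the crash. It assumes the usual "Mon DD HH:MM:SS" prefix on each line and that a copy of the log has been moved to a machine with Python on it; the function names and the 30-minute default window are made up for this example.

```python
from datetime import datetime, timedelta

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def syslog_time(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if there is none.

    Syslog lines carry no year, so the caller supplies one (an assumption).
    """
    parts = line.split(None, 3)
    if len(parts) < 3 or parts[0] not in MONTHS:
        return None
    try:
        hour, minute, second = (int(x) for x in parts[2].split(":"))
        return datetime(year, MONTHS[parts[0]], int(parts[1]),
                        hour, minute, second)
    except ValueError:
        return None


def lines_near(lines, when, minutes=30):
    """Return the lines whose stamp is within `minutes` of `when`."""
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        stamp = syslog_time(line, when.year)
        if stamp is not None and abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits


# Possible usage against a copy of the old syslog file:
#   with open("OLDsyslog.log") as fh:
#       for hit in lines_near(fh, datetime(2003, 12, 15, 21, 0)):
#           print(hit)
```

Lines without a recognizable stamp are skipped rather than guessed at, so an empty result only means nothing was logged in that window in this format.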

Check the suspected hardware with the STM tools to pinpoint the issue.
Life is a promise, fulfill it!
John Carr_2
Honored Contributor

Re: server crash on 10.20 system

Hi

As Shiju says, try the Online Diagnostic tools. There are three interfaces (cstm, mstm and xstm), all giving the same information. With a 12-processor server you should have no trouble using the nice xstm one :-) John.
MikeL_4
Super Advisor

Re: server crash on 10.20 system

/etc/shutdownlog contains:
21:44 Mon Dec 15 2003. Reboot after panic: , isr.ior = 0'10040107.c0000000'f75cb0c8
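
For anyone who wants to pull the time and the isr.ior value out of an entry like the one above, here is a small Python sketch. It only handles the entry layout shown in this post, and it assumes that a line not starting with a time is a wrapped continuation of the previous entry, joined without a space. The function name and the dictionary keys are invented for the example.

```python
import re
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# "HH:MM Day Mon DD YYYY. message", as in the entry quoted in this post.
ENTRY_START = re.compile(
    r"^(\d{1,2}):(\d{2})\s+\w{3}\s+(" + "|".join(MONTHS) +
    r")\s+(\d{1,2})\s+(\d{4})\.\s*(.*)$")
ISR_IOR = re.compile(r"isr\.ior\s*=\s*([0-9A-Fa-f'.]+)")


def parse_shutdownlog(text):
    """Return one dict per entry: 'when', 'message', 'isr_ior' (or None)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            hour, minute, mon, day, year, message = match.groups()
            entries.append({
                "when": datetime(int(year), MONTHS[mon], int(day),
                                 int(hour), int(minute)),
                "message": message,
            })
        elif entries:
            # Assumption: a wrapped tail of the previous entry.
            entries[-1]["message"] += line
    for entry in entries:
        hit = ISR_IOR.search(entry["message"])
        entry["isr_ior"] = hit.group(1) if hit else None
    return entries


# Possible usage:
#   with open("/etc/shutdownlog") as fh:
#       for entry in parse_shutdownlog(fh.read()):
#           print(entry["when"], entry["isr_ior"])
```

Entries written in a different layout are simply not matched, so treat a missing entry as "not parsed" rather than "not logged".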

I checked OLDsyslog.log and it has nothing in it indicating the cause of the problem either.

I also searched for Doc ID OZBEKBRC00000611, but the search said not found ???

Helen French
Honored Contributor
Solution

Re: server crash on 10.20 system

This might be a patch issue. Are you up to date with patches? If not, apply all the latest hardware, critical and application patches. Apply this patch if it's not already on the system: PHKL_24389 (s800 10.20 VxFS fsadm, bmap, performance cumulative patch). It includes a fix for panic reboots that report an isr.ior error.

Also, to find the specified document, select "search by Doc ID" and enter your document ID here:
http://www2.itrc.hp.com/service/cki/enterService.do?admit=-1335382922+1071586846362+28353475
Life is a promise, fulfill it!
Helen French
Honored Contributor

Re: server crash on 10.20 system

To add, HP-UX 10.20 is no longer supported. You may need to consider moving to 11.X so that you get updated patches and fixes.
Life is a promise, fulfill it!
Richard Steven
Advisor

Re: server crash on 10.20 system


It is worth checking your power:

http://aa11.cjb.net/hpux_admin/2001/02/0059.html

:-)
If you don't ask you don't get!
Kent Ostby
Honored Contributor

Re: server crash on 10.20 system

Mike --

isr.ior =

indicates one of three things:

1) There was an HPMC. Check the file /var/tombstones/ts99 and see if INSIDE the file there is a valid timestamp. If so, then log a HW call with HP support.

2) If there is no valid timestamp in the ts99 file AND you run ServiceGuard, then you might have had a SG TOC. HP Software support can help you troubleshoot this and will need the full dump, output from cmscancl, and the syslog files from the time of the crash on all nodes.

3) The other time you will see this is if the system hung and you or another user forced a TOC or TC of the box.
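
The three cases above can be written down as a small decision helper. This is only a sketch of the triage order described in this reply, in Python; the function name and the three yes/no inputs are made up for illustration, and you have to answer them yourself from the ts99 file and what you know about the box.

```python
def next_steps(ts99_has_valid_timestamp, runs_serviceguard, toc_was_forced):
    """Map the three cases from this reply to suggested next steps.

    All three arguments are plain booleans supplied by the administrator;
    nothing here inspects the system.
    """
    steps = []
    if ts99_has_valid_timestamp:
        # Case 1: a valid timestamp inside ts99 points to an HPMC.
        steps.append("HPMC: log a HW call with HP support")
    elif runs_serviceguard:
        # Case 2: no valid timestamp AND ServiceGuard in use.
        steps.append("Possible SG TOC: send HP Software support the full "
                     "dump, cmscancl output and syslog files from all nodes")
    if toc_was_forced:
        # Case 3: the system hung and someone forced a TOC.
        steps.append("Forced TOC: look into why the system hung")
    return steps


# Example: valid timestamp in ts99, no forced TOC.
for step in next_steps(True, False, False):
    print(step)
```

An empty list just means none of the three cases matched the answers given.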

Hope this helps,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"