Re: Unexpected Shutdown

 
MAD_2
Super Advisor

Unexpected Shutdown

We have an RP-5470 running HP-UX 11.0, and it shut down suddenly and unexpectedly. Unfortunately, EMS was turned off at the time, so the only detail I have is from the shutdownlog:
++++++++++++++++++++++++++++++++++++++++++++
14:28 Tue Jan 27 2004. Reboot after panic: , isr.ior = 0'1034090c.80000000'7f3c3370
++++++++++++++++++++++++++++++++++++++++++++

However, I am not sure exactly what this means.

There was someone working behind the servers and I suspect it could have been an accidental power cable pull, but I can't prove that was the case at this moment.

Is there another way I can tell what happened? The server just suddenly went down!

Thanks for any ideas.
Contrary to popular belief, Unix is user friendly. It's just very particular about who it makes friends with
11 REPLIES 11
Helen French
Honored Contributor

Re: Unexpected Shutdown

Did you check the OLDsyslog.log file? Look there for more details. Also, check root's mail for any error notifications.
Life is a promise, fulfill it!
Steven E. Protter
Exalted Contributor

Re: Unexpected Shutdown

Most likely you have a crash dump.

It's in /var/adm/crash, as a subdirectory.

If you find that, use the doc I'm attaching to run q4 analysis. Send the results to HP and they will tell you what to do next.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Michael Tully
Honored Contributor

Re: Unexpected Shutdown

Besides looking in the OLDsyslog, you might also get some hints from /var/tombstones/ts99. Also, did the system create a crash dump?
Anyone for a Mutiny ?
doug mielke
Respected Contributor

Re: Unexpected Shutdown

You can use q4 to look at the crash dump, if your system is configured to save it.

An easy place to look is the last entries in OLDsyslog. dmesg might have a hint as well.

Also, look for files in each filesystem's lost+found directory. These files need to be replaced, but they may also give a hint of what was happening at the time of the panic.
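
For anyone who copies the log off the box to read it elsewhere, here is a minimal Python sketch of the "last entries" check. It only keeps the final lines of a text file. The function names are made up for illustration, and the path in the usage note, /var/adm/syslog/OLDsyslog.log, is an assumption about where the file lives on this system.

```python
from collections import deque


def last_lines(lines, count=20):
    """Return the final `count` items from any iterable of lines."""
    # A bounded deque discards older lines as newer ones arrive,
    # so the whole log never has to be held in memory.
    return list(deque(lines, maxlen=count))


def tail_file(path, count=20):
    """Return the final `count` lines of a text file, without newlines."""
    # errors="replace" keeps odd bytes in an old log from aborting the read.
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in last_lines(handle, count)]
```

Something like tail_file("/var/adm/syslog/OLDsyslog.log", 40) would then return the last 40 lines written before the reboot, assuming the file is at that path.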

kamal_9
Super Advisor

Re: Unexpected Shutdown

Hi
It seems to be a crash dump. Check the /var/adm/crash directories for the crash dump file. Use strings to view those files, and you can try the q4 analysis that SEP has attached.
Sridhar Bhaskarla
Honored Contributor

Re: Unexpected Shutdown

It's hard to find the reason for a panic from the panic string in /etc/shutdownlog alone.

Look for core files under /var/adm/crash as mentioned and run Q4 on them. If you go through WhatHappened.out, you may get some information. Run 'grep HPMC' on the ana.out file. If it says "Crash event was an HPMC", then it's most likely that you had a hardware failure.

Most of the time, such a crash is due to a bad CPU. If it is a CPU failure, HP can easily find that out from the /var/tombstones/ts99 file. If it is something else, they can analyze the Q4 files.
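
The 'grep HPMC' check can also be sketched in Python if the analysis output has been copied somewhere grep is not handy. This is only an illustration: the marker string is the one quoted above, the function names are hypothetical, and nothing here assumes anything about the rest of the ana.out layout.

```python
HPMC_MARKER = "Crash event was an HPMC"  # string quoted in the post above


def hpmc_lines(text):
    """Return every line mentioning HPMC, roughly what 'grep HPMC' prints."""
    return [line for line in text.splitlines() if "HPMC" in line]


def crash_event_was_hpmc(text):
    """True if any line of the analysis text carries the HPMC marker."""
    return any(HPMC_MARKER in line for line in text.splitlines())
```

Read the file contents into a string first (for example with open(...).read()) and pass that string in. A True result points toward hardware, as described above, but it is a hint for the HP call rather than a diagnosis.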

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Anoop P_2
Regular Advisor

Re: Unexpected Shutdown

Hi!

If it was an accidental cable pull, there can't be an entry in the shutdownlog, because the system would not have had time to write one before the power failed!

Try to find out whether the shutdownlog entry has exactly the same time and date as this shutdown. It may be that the system crashed, but at an earlier time!

Anoop
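
A rough Python sketch of that timestamp comparison, assuming shutdownlog entries look like the one posted at the top of the thread (time, weekday, month, day, year, then a period and the message). The function names and the 30-minute tolerance are arbitrary choices for illustration, not anything HP-UX defines.

```python
import re
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Assumed layout, e.g. "14:28 Tue Jan 27 2004. Reboot after panic: ..."
ENTRY = re.compile(
    r"^(\d{1,2}):(\d{2})\s+\w{3}\s+(\w{3})\s+(\d{1,2})\s+(\d{4})\.\s*(.*)$")


def parse_entry(line):
    """Return (timestamp, message) for one entry, or None if it does not fit."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    hour, minute, month, day, year, message = match.groups()
    if month not in MONTHS:
        return None
    when = datetime(int(year), MONTHS[month], int(day), int(hour), int(minute))
    return when, message


def matches_outage(line, observed, tolerance_minutes=30):
    """True if the entry's timestamp is within the tolerance of `observed`."""
    parsed = parse_entry(line)
    if parsed is None:
        return False
    when, _message = parsed
    return abs(when - observed) <= timedelta(minutes=tolerance_minutes)
```

Pass in the entry line and a datetime for when the server was seen to go down. If the two do not line up, the panic entry may belong to an earlier crash.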
Bart Paulusse
Respected Contributor

Re: Unexpected Shutdown

We had a similar situation a few years ago. Someone was working in the computer room and suddenly the development system went down. It turned out this "someone" needed a wall socket for his radio... and yanked out the first plug in sight.

I bet the someone working behind your server swears he didn't touch anything...

In our case, proving it was a bit easier, since the radio was playing loudly with the power cord of the server lying next to it.

John Stevens_2
Occasional Contributor

Re: Unexpected Shutdown

Just a thought, but if a cable was pulled as you suggest, there will likely be an entry in the GSP error log.


J.
Kent Ostby
Honored Contributor

Re: Unexpected Shutdown

MAD --

The panic string "isr.ior" indicates that the system had either an HPMC or ServiceGuard TOC.

There is an HP document which is used for pre-processing dumps before HP looks at them that discusses how to run a tool called Q4 and check for both the HPMC and the ServiceGuard TOC.

That document is:
http://www2.itrc.hp.com/service/cki/search.do?category=c0&mode=id&searchString=OZBEKBRC00000611&searchCrit=allwords&docType=EngineerNotes&search.x=28&search.y=11

The short way to distinguish between these types of failures is to answer these questions:

1) Are you running MC/ServiceGuard? If not, then you had an HPMC, which could be caused by the missing cable or by something else. On most boxes, you will have a file /var/tombstones/ts99, which should be looked at by HP.

2) If you are running MC/ServiceGuard, then check the timestamp INSIDE the file /var/tombstones/ts99. If it is valid (i.e. from the same time as your shutdown), then you likely had an HPMC (High Priority Machine Check) and you should have HP troubleshoot the ts99 file.

3) If there is no valid timestamp in the ts99 file and you run MC/ServiceGuard, then you likely had an MC/ServiceGuard TOC, which means that the machines lost contact with each other for a period of time greater than the NODE TIMEOUT value, and you need to contact HP to have them look at your ServiceGuard setup. Most likely they will need the dump files in /var/adm/crash, the output from cmgetconf, and the syslog files from both nodes in the cluster from the time of the crash (which on the crashed box will likely be the OLDsyslog.log).

4) If none of the above apply, then follow the steps in the document I listed above and open a SW call with HP to have them look at the results.
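
The four questions above boil down to a small decision table. Here is a hedged Python sketch of it: the function name, the argument names, and the use of None for "ts99 missing or not checked yet" are my own assumptions, and the result is only the likely cause as described in the steps, not a verdict.

```python
def classify_crash(runs_serviceguard, ts99_timestamp_matches):
    """Return (likely_cause, next_step) following steps 1 to 4 above.

    runs_serviceguard: True if the node runs MC/ServiceGuard.
    ts99_timestamp_matches: True if the timestamp inside ts99 is from the
        time of the shutdown, False if it is not, None if unknown.
    """
    if not runs_serviceguard:
        # Step 1: no cluster software, so a TOC is ruled out.
        return "HPMC", "have HP look at /var/tombstones/ts99"
    if ts99_timestamp_matches is True:
        # Step 2: cluster node, but ts99 is from the time of the crash.
        return "HPMC", "have HP troubleshoot the ts99 file"
    if ts99_timestamp_matches is False:
        # Step 3: cluster node with no matching ts99 timestamp.
        return "ServiceGuard TOC", "send HP the dump, cmgetconf output and syslogs"
    # Step 4: nothing conclusive yet.
    return "undetermined", "run Q4 per the document and open a SW call"
```

For a standalone box like the one in the original question, classify_crash(False, None) lands on the first branch.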

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
MAD_2
Super Advisor

Re: Unexpected Shutdown

Great!!! Lots of responses. Sorry but yesterday I got a little side-tracked with other things to fix after the sudden shutdown and I did not get back to read replies until now.

I will look into the GSP logs first. I do have a crash directory and will attempt later on to run q4 as per the document attached by Steven. I also have a ts99, but I have no clue what to look for in it that would give me a hint of what happened. Of course, if a cable was pulled, it was connected again by the time I got to the system room, so there is no way I can prove that is what actually happened. It's only a suspicion: I can't think of anything else, the guy just happened to be working in the general area where there was a good chance he could have pulled it, and there were no other warnings. However, no matter the reason, I would still like to find out, because it could have been some type of hardware failure, and that is something we need to be aware of.
Contrary to popular belief, Unix is user friendly. It's just very particular about who it makes friends with