
Re: crash

 
Indrajit Bhagat
Regular Advisor

crash

Kindly help me with the following:

(1) What are the different types of crashes?
(2) A server is crashing and you have console access. How will you find the cause of the crash?
Patrick Wallek
Honored Contributor

Re: crash

1) System panic - usually software. High Priority Machine Check (HPMC) - usually hardware. Transfer of Control (TOC) - can be triggered manually from the GSP/MP or initiated by ServiceGuard.

2) The first place I check is /etc/shutdownlog. That should tell you what caused the system to go down. If it was a panic or HPMC, then check /var/adm/crash for a crash dump. If you have a crash dump and a software support contract with HP, then open a call with HP and they will give you instructions on how to process the crash dump and how to send it to them. They will then analyze it and tell you what caused the problem initially.
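If you want to script that first check, here is a rough Python sketch. It only assumes that the log lines mention the words panic, HPMC or TOC somewhere; the exact wording in /etc/shutdownlog can differ between releases, and the function names (classify_line, crash_entries, list_dump_dirs) are made up for this example. Treat it as a starting point, not as an official HP tool.

```python
import os
import re

# Keywords come from the crash types named above; the real log wording
# may differ, so this keyword matching is an assumption.
CRASH_PATTERNS = {
    "panic": re.compile(r"\bpanic\b", re.IGNORECASE),
    "HPMC": re.compile(r"\bHPMC\b", re.IGNORECASE),
    "TOC": re.compile(r"\bTOC\b", re.IGNORECASE),
}


def classify_line(line):
    """Return the crash type mentioned in a log line, or None."""
    for name, pattern in CRASH_PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def crash_entries(lines):
    """Return (crash_type, line) pairs for lines that mention a crash type."""
    found = []
    for line in lines:
        kind = classify_line(line)
        if kind is not None:
            found.append((kind, line.rstrip("\n")))
    return found


def list_dump_dirs(crash_dir="/var/adm/crash"):
    """Return sorted subdirectory names under crash_dir, or [] if it is missing."""
    if not os.path.isdir(crash_dir):
        return []
    return sorted(
        name for name in os.listdir(crash_dir)
        if os.path.isdir(os.path.join(crash_dir, name))
    )


if __name__ == "__main__":
    log_path = "/etc/shutdownlog"
    if os.path.exists(log_path):
        with open(log_path, errors="replace") as fh:
            for kind, text in crash_entries(fh):
                print(kind, "|", text)
    print("dump directories:", list_dump_dirs())
```

Run it as root on the affected box (or against a copy of the log). Anything it prints is only a pointer to where to look; the dump itself still needs to go to HP for proper analysis.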

Matti_Kurkela
Honored Contributor

Re: crash

Patrick's list was a good start, but it did not mention environment-related crashes. These are rather hard to diagnose with a remote console, because the system may be totally unresponsive.

A visit to the local console may often reveal BIG clues in these cases: for example, the smoke, the floodwater or maybe just the fact that it is unusually hot, dark & quiet in the server room (=total loss of power).

Then there are the situations where the console works but you cannot get the system to boot again, and so you cannot access anything on the system disk.

In these situations, if the system is equipped with a GSP, MP or iLO, there are some more places to check.

The console history (CL command on a GSP/MP): it may contain the "last words" of the crashing system. If some failure has rendered the system incapable of writing to the system disks, this may be the only remaining source of OS-level error information.

The hardware error log (SL command on a GSP/MP): with a serious hardware error, there may be several messages in this log, so remember to look at all the messages, not just the latest one. These messages are often very cryptic, but if you can capture the hardware error log (using a laptop connected to the console port, for example) and send the log to HP, the HP engineers may be able to find the root cause even before arriving on-site.

When a system crashes while running pre-boot self-tests, you may have to look into the diagnostic codes. The older servers (those equipped with a GSP or even older ones like the K-class) will display them automatically while the system is booting, but on newer servers you may have to use the Virtual Front Panel function of the GSP/MP to see them.

www.openpa.net has the list of diagnostic codes for K-class servers.

MK
Patrick Wallek
Honored Contributor

Re: crash

Very good points, Matti.

I had a machine go down a few weeks ago and had to go into GSP/MP and check the logs. It wound up being the power supply in an A500 (only one in that machine), so none of the other tools worked. The error logs from 'SL' at the GSP were the key.
A. Clay Stephenson
Acclaimed Contributor

Re: crash

The types of crashes are 1) perfectly elastic collisions (think molecules of a gas); 2) perfectly inelastic collisions (think of a ball of dough hitting a wall); and 3) everything in between --- OR --- was your question not related to mechanics?

In any event, normally the best tool for crash analysis (assuming this is a UNIX question) is an examination of the dump file using a tool like q4 or crashinfo.
If it ain't broke, I can fix that.