
Server crash - problem determination process

 
Steven Hargus_3
Advisor

Came in this morning and found that one of our N4000s had rebooted itself. There is nothing in /var/adm/crash, but there is a ts99 file in /var/tombstones.

I remember reading a note from the Technical Knowledgebase about the process for determining the cause for a server crash (whether it's hardware, kernel, software, etc., how to run Q4...) but now I can't find it. Has anyone seen that document?

Thanks!
Deoncia Grayson_1
Honored Contributor

Re: Server crash - problem determination process

Maybe this is the link you are looking for:

http://www2.itrc.hp.com/service/iv/docDisplay.do?docId=prodITRC/DE_SW_UX_swrec_EN_01_E/Dump.pdf
If no one ever took risks, Michelangelo would have painted the Sistine floor. -Neil Simon
Raj D.
Honored Contributor

Re: Server crash - problem determination process

Hi Steven,

Here are some things you need to know:

1. You can run Q4 only if a crash dump was saved; without a dump there is nothing for it to analyse.

2. Check for the crash files in /var/adm/crash/crash.n/
(where n is the sequence number of the crash on your system: 0, 1, 2, etc.).

3. Check whether crash dumps are enabled on the system:

# cat /etc/rc.config.d/crashconf

CRASHCONF_ENABLED should be set to 1.

4. There is a script called collection.sh, together with a binary file called crashinfo, that HP sends you when you log a call so the crash dump can be analysed.

Copy the crashinfo file to /var/adm/crash and run the script, then send the resulting file to HP. They will tell you the reason for the crash.

5. Also check:
# last reboot
# cat /etc/shutdownlog
# tail -n 50 /var/adm/syslog/OLDsyslog.log (for any errors)
# dmesg
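
If you have several servers to check, step 3 can be scripted. This is only a minimal sketch: it assumes the file contains a plain shell assignment such as CRASHCONF_ENABLED=1, and the function name crashconf_enabled is made up for illustration.

```shell
# crashconf_enabled [FILE]
# Prints "enabled", "disabled" or "unreadable" for the given config file.
# The default path is the one mentioned in step 3.
crashconf_enabled() {
    conf=${1:-/etc/rc.config.d/crashconf}
    if [ ! -r "$conf" ]; then
        echo "unreadable"
        return 1
    fi
    # Assumption: the setting is an uncommented line of the form
    # CRASHCONF_ENABLED=1 (optionally quoted); the last one wins.
    val=$(sed -n 's/^[[:space:]]*CRASHCONF_ENABLED=["]\{0,1\}\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    if [ "$val" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Run it as `crashconf_enabled` with no argument to check the default file, or pass a path to a copy of the file.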



Hope this helps,

Cheers,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Raj D.
Honored Contributor

Re: Server crash - problem determination process

Hi Steven,

Here is the Q4 crash dump analysis document (in MS Word format) you are looking for:


Cheers,
Raj.

" If u think u can , If u think u cannot , - You are always Right . "
rveri
Super Advisor

Re: Server crash - problem determination process

Steven,

A final thought: check whether the /var/tombstones/ts99 file has a valid timestamp.

Also check that all the CPUs are visible and show OK.

Cheers,
Raj.
Steven Hargus_3
Advisor

Re: Server crash - problem determination process

Actually, what I am looking for is a document I saw some time ago on the knowledgebase that gave a high-level process for determining why a server rebooted itself. Something like:

1. If /var/tombstones/ts99 exists, then
   a. If you see an HPMC line, then an HPMC was generated; most likely this is a hardware issue.
   b. If there is no HPMC line, then ....
2. Also check /etc/shutdownlog. If you see "reboot", then the server was rebooted from a software command. If you see "panic", then it was a kernel panic.
3. If /var/adm/crash exists, then you can run Q4 to determine the problem...

etc.
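
The outline above could be sketched as a small shell function along these lines. This is not the knowledgebase procedure itself, only a rough first pass built on some assumptions: that the string "HPMC" appears literally in ts99, that the newest entry in /etc/shutdownlog is its last non-blank line, and that saved dumps live in crash.n subdirectories. The function name triage_reboot is made up, and the "no HPMC line" branch is left open, as it is in the outline.

```shell
# triage_reboot [TS99] [SHUTDOWNLOG] [CRASHDIR]
# Prints one finding per check; defaults are the paths named in this thread.
triage_reboot() {
    ts=${1:-/var/tombstones/ts99}
    log=${2:-/etc/shutdownlog}
    crashdir=${3:-/var/adm/crash}

    # 1. Tombstone: an HPMC line points towards hardware.
    if [ -f "$ts" ]; then
        if grep -q 'HPMC' "$ts"; then
            echo "ts99: HPMC found, suspect hardware"
        else
            echo "ts99: no HPMC line"
        fi
    else
        echo "ts99: not present"
    fi

    # 2. Shutdown log: look at the last non-blank line only.
    last=""
    if [ -r "$log" ]; then
        last=$(awk 'NF { l = $0 } END { print l }' "$log")
    fi
    # "panic" is tested first because a panic entry may also
    # contain the word "reboot".
    case $last in
        *[Pp]anic*)  echo "shutdownlog: kernel panic" ;;
        *[Rr]eboot*) echo "shutdownlog: reboot by command" ;;
        *)           echo "shutdownlog: inconclusive" ;;
    esac

    # 3. Crash dump: any crash.n directory can be handed to Q4.
    found=no
    for d in "$crashdir"/crash.*; do
        if [ -d "$d" ]; then
            found=yes
        fi
    done
    if [ "$found" = yes ]; then
        echo "crash: dump present, run Q4"
    else
        echo "crash: no dump directory found"
    fi
}
```

Calling `triage_reboot` with no arguments checks the default locations. An HPMC entry in ts99 may be left over from an earlier event, so compare its timestamp with the time of the reboot before drawing conclusions.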