
why HP-UX server got rebooted

 
student_of_unix
Occasional Advisor

why HP-UX server got rebooted

Hi guys, we regularly get server reboot tickets that are auto-generated (by Patrol).

Can someone tell me the systematic troubleshooting steps to follow when we come across this issue? I mean investigating why the server got rebooted. I guess the first step would be checking uptime and then the /var/adm/syslog/syslog.log file.

Thanks and regards,
AK

PS: I would really appreciate it if you could present it point by point, or as a checklist, so that I can use it easily.
Patrick Wallek
Honored Contributor

Re: why HP-UX server got rebooted

I'm going to give you some things to check, but it really is up to you, "student", to determine their order and applicability to your system.

Things to check:

/etc/shutdownlog
/var/adm/syslog/OLDsyslog.log
The old /etc/rc.log (I think rc.log.old, but I can't remember correctly at the moment).

A 'who -r' will show when the system came back up.

If there was an abnormal crash, you should have a crash dump in /var/adm/crash. You can search for threads on how to use q4 to analyze this. HP is also helpful in analyzing crash dumps.
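
The files listed above can be scanned in one pass rather than opened one by one. The following is only a minimal sketch in Python: the file paths come from this reply, but the keyword list (`panic`, `reboot`, `shutdown`) and the function names are assumptions chosen for illustration, not an official list of strings HP-UX writes.

```python
# Files named in the reply above. The keywords are illustrative assumptions.
LOG_FILES = ["/etc/shutdownlog", "/var/adm/syslog/OLDsyslog.log", "/etc/rc.log.old"]
KEYWORDS = ("panic", "reboot", "shutdown")


def find_matches(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword (case-insensitive)."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_files(paths=LOG_FILES):
    """Scan each readable file and return {path: matches}. Unreadable files are skipped."""
    results = {}
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                results[path] = find_matches(handle)
        except OSError:
            continue
    return results


if __name__ == "__main__":
    for path, hits in scan_files().items():
        for number, text in hits:
            print(f"{path}:{number}: {text}")
```

Run it as root so that all three files are readable. Adjust the keyword tuple to whatever strings turn out to be useful on your systems.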
student_of_unix
Occasional Advisor

Re: why HP-UX server got rebooted

Thank you so much for the reply. Is there any specific keyword that I can grep for in these files, such as "Panic"? And what are the different scenarios in which a server gets rebooted?

And if it was a manual reboot, can we find out who rebooted it? We have PowerBroker installed on our servers.
skt_skt
Honored Contributor
Solution

Re: why HP-UX server got rebooted

/etc/shutdownlog records the reason for the shutdown or reboot, or who issued the shutdown/reboot command.
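
To look at only the most recent entries in /etc/shutdownlog, something like the Python sketch below may help. It assumes one entry per line and that a crash entry contains the word "panic". The labels `panic` and `manual-or-other` and the function names are made up for this example, and the exact line layout of the file is not assumed.

```python
def classify_entry(line):
    """Label a shutdownlog line by whether it mentions a panic (case-insensitive)."""
    return "panic" if "panic" in line.lower() else "manual-or-other"


def last_entries(lines, count=5):
    """Return the last `count` non-blank lines, each paired with its label."""
    entries = [line.strip() for line in lines if line.strip()]
    return [(classify_entry(entry), entry) for entry in entries[-count:]]


def read_shutdownlog(path="/etc/shutdownlog", count=5):
    """Read the file and classify its most recent entries. Returns [] if unreadable."""
    try:
        with open(path, errors="replace") as handle:
            return last_entries(handle, count)
    except OSError:
        return []
```

Calling `read_shutdownlog()` on the affected server returns the last five entries. Any entry labelled `manual-or-other` still has to be read by eye to see which login is recorded.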
Mridul Shrivastava
Honored Contributor

Re: why HP-UX server got rebooted

If it is a hardware panic, then there will be an entry in shutdownlog like "reboot after panic".

If this is the case, then please check for the ts99 file under /var/tombstones.

If it is a software panic, then you need to check the crash dump under /var/adm/crash with the latest timestamp and then analyse it using q4.
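
Both lookups above (the ts99 tombstone and the crash dump with the latest timestamp) can be sketched in Python as follows. The directory paths are the ones from this reply. The function names are hypothetical, and "latest" is taken to mean the most recent modification time, which is an assumption.

```python
import os


def has_tombstone(tombstone_dir="/var/tombstones", name="ts99"):
    """Return True if the named tombstone file exists in tombstone_dir."""
    return os.path.isfile(os.path.join(tombstone_dir, name))


def newest_crash_dir(crash_root="/var/adm/crash"):
    """Return the most recently modified subdirectory of crash_root, or None if there is none."""
    try:
        with os.scandir(crash_root) as entries:
            candidates = [entry.path for entry in entries if entry.is_dir()]
    except OSError:
        return None
    return max(candidates, key=os.path.getmtime, default=None)


if __name__ == "__main__":
    print("ts99 present:", has_tombstone())
    print("newest crash directory:", newest_crash_dir())
```

The directory it reports is the one to analyse with q4.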
Time has a wonderful way of weeding out the trivial
MHudec
Frequent Advisor

Re: why HP-UX server got rebooted

Check NTP synchronization and the date via ntpq (nothing makes a customer happier than SAP/Oracle systems running a few months ahead in the future just because someone forgot to check the date).
Check for missing hardware in NO_HW state via ioscan (Fibre Channel devices might be affected, PV links might be missing, etc.).
Check for LVM issues (unavailable physical volumes etc.) via vgdisplay -v.
Check for missing filesystems (not mounted etc.) via fstab (possibly compare it to the LVM information from vgdisplay), and also check the swap situation via swapinfo.
Check rc.log as well (/etc/rc.log) and compare it to the older one (/etc/rc.log.old).
Check for the reason for the reboot via shutdownlog (usually a login is stated there, or the word "panic").
On a PA-RISC system, check for tombstones (/var/tombstones/): look for ts99 and check for HPMC codes there (hardware issue).
On an Itanium system, check for machine check aborts (/var/tombstones/): look for files starting with mca (hardware issue too).
For a software crash, check the crash directory (/var/adm/crash) for the latest crash and check the INDEX file in that crash directory (like crash.1/INDEX).
If there is a remote console attached to the system, the console, system and activity logs (SL, SEL etc.) might reveal useful information too (power issues etc.).
Check syslog as well, both files: the new one, syslog.log, and the older one, OLDsyslog.log.
If Serviceguard is running there, check for Serviceguard issues via cmviewcl (and also the package control log files in /etc/cmcluster).
This is all that I can pull out of my mind this very morning :), but everything needed should be here.
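
The command-based checks in the list above can be collected into one small Python script, so that a ticket always gets the same output attached. This is only a sketch. The commands are the ones named in this post, but the `-p` flag for ntpq (to avoid its interactive prompt) and the `-fn` flags for ioscan are additions of this example, and the names `CHECKS`, `run_check` and `run_all` are hypothetical.

```python
import shutil
import subprocess

# (description, command) pairs. Flags other than "vgdisplay -v" are assumptions.
CHECKS = [
    ("Time synchronization", ["ntpq", "-p"]),
    ("Hardware state (look for NO_HW)", ["ioscan", "-fn"]),
    ("LVM volume groups", ["vgdisplay", "-v"]),
    ("Swap usage", ["swapinfo"]),
    ("Serviceguard cluster status", ["cmviewcl"]),
]


def run_check(command, timeout=60):
    """Run one command and return its output, or a note if it cannot be run."""
    if shutil.which(command[0]) is None:
        return f"skipped: {command[0]} not found in PATH"
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as error:
        return f"failed: {error}"
    return done.stdout + done.stderr


def run_all(checks=CHECKS):
    """Run every check in order and return a list of (description, output) pairs."""
    return [(title, run_check(command)) for title, command in checks]
```

Calling `run_all()` as root and printing each pair gives a single report. A check whose command is not installed (for example cmviewcl on a system without Serviceguard) is reported as skipped rather than stopping the run.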

Martin
Steven E. Protter
Exalted Contributor

Re: why HP-UX server got rebooted

Shalom,

The system itself can decide to reboot itself.

If a CPU fails, an HPMC (High Priority Machine Check) will be issued.

This will be evident in the console logs and the system will probably perform poorly after such a situation.

Basically, you need to go through these posts and look around until you find the cause. Systems with a GSP will have some record of the event, even if it was caused by cleaning people pulling the plug.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Andrew Merritt_2
Honored Contributor

Re: why HP-UX server got rebooted

Another place to look is the EMS event log, in /var/opt/resmon/log/event.log. Any serious events should also have been flagged in syslog, but the full information will be here. (Also check that you have an up-to-date version of the OnlineDiags.)

Andrew
student_of_unix
Occasional Advisor

Re: why HP-UX server got rebooted

Thank you, everyone.