How to isolate cause of system crash

frederick hannah
Super Advisor

I have an rp3440-4 running HP-UX 11.11 that rebooted itself while Cisco switch work was being done in the same computer room. I have checked syslog and dmesg for clues, but I can't find why the server rebooted. It wasn't a power issue, as the switch work was being done at the opposite end of the room and only the rp3440 was impacted. Any ideas on where I can look to find the cause of the outage?
6 REPLIES
Pete Randall
Outstanding Contributor

Re: How to isolate cause of system crash

Since dmesg is going to be current rather than historical, that won't do much good. Anything under /var/adm/crash? You say you checked syslog. Did you also check the old syslog, /var/adm/syslog/OLDsyslog.log?


Pete
frederick hannah
Super Advisor

Re: How to isolate cause of system crash

Yes, I checked syslog and OLDsyslog. Nothing at all. This is the second time that work on that switch has caused this server to reboot. But only this one server.
Steven E. Protter
Exalted Contributor

Re: How to isolate cause of system crash

Shalom,

If the system is part of a Serviceguard cluster, switch work could interrupt the heartbeat and trigger a TOC (transfer of control) and crash.

To get good help, you are going to need to provide a lot more detail on your environment.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
frederick hannah
Super Advisor

Re: How to isolate cause of system crash

The server is in a package-less cluster. Serviceguard is only there for LAN failover.
However, the cluster was down after the reboot, so there is a chance MCSG may have had a hand in it. When I restarted the cluster, the response was "cluster not started", but the cluster did restart without incident. Strangely enough, there were no MCSG-related messages referencing cluster activity in syslog.
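
For anyone who wants to double-check for Serviceguard activity around the reboot, here is a rough sketch that counts matching lines in each log. It assumes the cluster daemon logs to syslog under the name cmcld; the function name sg_messages is made up for this example.

```shell
#!/bin/sh
# Sketch only. Assumption: Serviceguard's cluster daemon tags its syslog
# entries with "cmcld". sg_messages is a hypothetical helper name.

# Print "<file>: <count>" for every readable log file given as an argument.
sg_messages() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # grep -c prints the number of matching lines (0 if none)
            printf '%s: %s\n' "$f" "$(grep -c cmcld "$f")"
        fi
    done
    return 0
}
```

Running it as `sg_messages /var/adm/syslog/OLDsyslog.log /var/adm/syslog/syslog.log` covers both sides of the reboot. A count of 0 in both files would match what was seen here.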


Raj D.
Honored Contributor

Re: How to isolate cause of system crash

Frederick,

- Check the GSP chassis logs.
- Check /var/adm/crash/crash.x # where x is the crash number.
- Check /var/tombstones/ for any file that matches grep cpu # if one does, a CPU caused the crash.
- Check /etc/shutdownlog.
- If it was due to a power trip, there will not be any log on the server, but the GSP log will have a power failure entry.
- Compare the timestamps at the end of /var/adm/syslog/OLDsyslog.log and at the start of /var/adm/syslog/syslog.log. That tells you what time it rebooted and whether it is related to the maintenance.
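
The file-based checks in the list above can be sketched as a couple of shell functions. This is only a sketch: the paths are the ones from the list, while the function names and the optional root prefix (for working on a copied directory tree) are made up here. The GSP logs are read from the GSP console, so they are not covered.

```shell
#!/bin/sh
# Sketch only. reboot_window and crash_evidence are hypothetical names;
# the default paths are the ones mentioned in the list above.

# Show the last line logged before the reboot and the first line after it.
# The gap between the two timestamps brackets the time of the reboot.
reboot_window() {
    old_log=${1:-/var/adm/syslog/OLDsyslog.log}
    new_log=${2:-/var/adm/syslog/syslog.log}
    if [ -r "$old_log" ]; then
        echo "last before reboot: $(tail -n 1 "$old_log")"
    fi
    if [ -r "$new_log" ]; then
        echo "first after reboot: $(head -n 1 "$new_log")"
    fi
    return 0
}

# List whatever evidence exists on disk. The optional first argument is a
# directory prefix (an assumption for this sketch); empty means the live system.
crash_evidence() {
    root=${1:-}
    # Crash dump directories, if any were saved
    ls -d "$root"/var/adm/crash/crash.* 2>/dev/null
    # Most recent shutdown/reboot records
    if [ -r "$root/etc/shutdownlog" ]; then
        tail -n 5 "$root/etc/shutdownlog"
    fi
    # Names of tombstone files that mention "cpu" (case-insensitive)
    grep -il cpu "$root"/var/tombstones/* 2>/dev/null
    return 0
}
```

With no arguments, both functions read the live paths. If neither prints anything useful, that points back to the GSP logs as the remaining place to look.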


Cheers,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
S.N.S
Valued Contributor

Re: How to isolate cause of system crash

Hi,

The MP on IA-64 or the GSP on PA-RISC would almost certainly have some related entry...
That would be a start toward getting to the root cause...

Let us know your progress, findings.

HTH
SNS
"Genius is 1% inspiration, 99% Perspiration" - Edison