Eloísa Martínez
New Member

Alert 13 system hang

Hello,
I had a situation with one HP 5470 server. Suddenly the OS hung and we had to shut down the server by pressing the power button.

This is the message from GSP.

Log Entry # 0 :
SYSTEM NAME: consola2
DATE: 02/09/2004 TIME: 08:01:12
ALERT LEVEL: 13 = System hang detected via timer popping

SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 4 = timeout

CALLER ACTIVITY: F = display_activity() update STATUS: 0
CALLER SUBACTIVITY: 00 = implementation dependent
REPORTING ENTITY TYPE: E = HP-UX REPORTING ENTITY ID: 00

0x78E000D41100F000 00000003 00000000 type 15 = Activity Level/Timeout
0x58E008D41100F000 00006801 0908010C type 11 = Timestamp 02/09/2004 08:01:12
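
The two data words on the type 11 line appear to carry the timestamp one byte per field. The sketch below is inferred only from this single log entry, not from HP documentation: it assumes the bytes are year minus 1900, zero-based month, day, hour, minute and second. The function name `decode_gsp_timestamp` is made up for illustration.

```python
from datetime import datetime

def decode_gsp_timestamp(word_hi: str, word_lo: str) -> datetime:
    """Decode the two 32-bit data words of a type 11 (Timestamp) entry.

    Assumed layout, inferred from the one entry in this thread:
      word_hi = 00 00 YY MM   (YY = year - 1900, MM = zero-based month)
      word_lo = DD hh mm ss
    """
    raw = bytes.fromhex(word_hi + word_lo)
    # The first two bytes are zero in this entry; their meaning is unknown here.
    _, _, year, month, day, hour, minute, second = raw
    return datetime(1900 + year, month + 1, day, hour, minute, second)

stamp = decode_gsp_timestamp("00006801", "0908010C")
print(stamp.strftime("%m/%d/%Y %H:%M:%S"))  # prints 02/09/2004 08:01:12
```

The result matches the decoded date and time the GSP printed next to the raw words, which is the only check available from this thread.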

There is no new entry in shutdownlog; and the syslog only registered the startup.

This is the rbootd.log.
Mon Feb 9 02:31:07 2004 : STARTUP
Mon Feb 9 02:31:07 2004 : ppa=0 Ether

Mon Feb 9 02:31:07 2004 : lan0: type: Ether NMID = 0
Mon Feb 9 02:31:07 2004 : ppa=1 Ether

Mon Feb 9 02:31:07 2004 : lan1: type: Ether NMID = 1
Mon Feb 9 02:31:07 2004 : ppa=2 Ether

Mon Feb 9 02:31:07 2004 : lan2: type: Ether NMID = 2
Mon Feb 9 02:31:07 2004 : ppa=3 Ether

Mon Feb 9 02:31:07 2004 : lan3: type: Ether NMID = 3
Mon Feb 9 02:31:07 2004 : ppa=4 Ether

Mon Feb 9 02:31:07 2004 : lan4: type: Ether NMID = 4
Mon Feb 9 02:31:07 2004 : matched lan0 : ppa 0
Mon Feb 9 02:31:07 2004 : matched lan1 : ppa 1
Mon Feb 9 02:31:07 2004 : matched lan2 : ppa 2
Mon Feb 9 02:31:07 2004 : matched lan3 : ppa 3
Mon Feb 9 02:31:07 2004 : matched lan4 : ppa 4
Mon Feb 9 02:31:07 2004 : got 5 lan device(s)
Mon Feb 9 02:31:07 2004 : INITIALIZATION COMPLETE

Thanks in advance.
6 REPLIES
Sridhar Bhaskarla
Honored Contributor

Re: Alert 13 system hang

Hi,

You should have TOC'ed the box and taken a dump of the system so that HP could analyze it and identify the problem. I suspect it is most likely a bad CPU. It would be a bit hard to pin down, but HP may be able to help you by going through the GSP logs.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Steven E. Protter
Exalted Contributor

Re: Alert 13 system hang

Timer popping usually does not hang the system.

1) Have HP look at the GSP card; it may be bad.
2) Check for a crash dump in /var/adm/crash. If there is a subdirectory, run a q4 analysis so HP can tell you what patch you are missing. Attaching a cookbook.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
curt larson_1
Honored Contributor

Re: Alert 13 system hang

"and the syslog only registered the startup"

The system startup script for syslogd moves syslog.log to OLDsyslog.log before starting the syslogd daemon.

There might not be anything there, but if anything was written to syslog.log before the system went down, it is now in OLDsyslog.log.
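
To look at what was logged just before the hang, the tail of OLDsyslog.log is the place to start. A minimal sketch, assuming the usual HP-UX location /var/adm/syslog/OLDsyslog.log (the path and the helper name `last_lines` are assumptions, not taken from this thread):

```python
from collections import deque
from pathlib import Path

# Assumed default location; adjust if your syslog lives elsewhere.
OLD_SYSLOG = Path("/var/adm/syslog/OLDsyslog.log")

def last_lines(path: Path, count: int = 20) -> list[str]:
    """Return the final `count` lines of a log file, oldest first."""
    with path.open(errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]

if __name__ == "__main__":
    if OLD_SYSLOG.exists():
        for line in last_lines(OLD_SYSLOG):
            print(line)
    else:
        print(f"{OLD_SYSLOG} not found")
```

Anything logged shortly before 08:01:12 on 02/09/2004 would be the interesting part.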
Omololu Shobayo
Frequent Advisor

Re: Alert 13 system hang

Check the error logs from the GSP and inform HP of the errors. You should have something written to the /var/tombstone/ts99 file if there is a hardware issue.

Contacting HP would be your best bet for analysis of both your ts99 and the errors on the GSP.
Ron Thompson
Advisor

Re: Alert 13 system hang

When I have had the GSP error "System hang detected via timer popping" it has always turned out to be a bad platform monitor board that needed to be replaced.
Jeff Schussele
Honored Contributor

Re: Alert 13 system hang

Hi,

A common cause of a timer pop in a multi-CPU system is a CPU going bad & hanging. Another CPU is then left waiting for a resource that it's never going to get, because the hung CPU isn't ever going to give it up.

So check the ts99 tombstone & the INDEX file in the crash dump for clues, like a CPU with no timestamp or such.
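
A quick way to see whether either of those exists is sketched below. It uses the /var/tombstone/ts99 and /var/adm/crash paths mentioned in this thread, and assumes each dump sits in its own subdirectory containing an INDEX file. The function name `find_crash_evidence` is hypothetical.

```python
from pathlib import Path

CRASH_ROOT = Path("/var/adm/crash")      # path quoted earlier in the thread
TOMBSTONE = Path("/var/tombstone/ts99")  # path quoted earlier in the thread

def find_crash_evidence(crash_root: Path, tombstone: Path) -> dict:
    """Collect the files worth reading and handing to HP support."""
    # Assumption: one subdirectory per dump, each with an INDEX file.
    index_files = sorted(crash_root.glob("*/INDEX")) if crash_root.is_dir() else []
    return {
        "tombstone": tombstone if tombstone.is_file() else None,
        "index_files": index_files,
    }

if __name__ == "__main__":
    evidence = find_crash_evidence(CRASH_ROOT, TOMBSTONE)
    print("tombstone:", evidence["tombstone"] or "not found")
    for index_file in evidence["index_files"]:
        print("crash dump INDEX:", index_file)
    if not evidence["index_files"]:
        print("no crash dump INDEX files found")
```

If no INDEX file turns up, no dump was saved, which is what you would expect after a power-button shutdown without a TOC.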

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!