
Re: server suddenly reboot without problem signal

 
SOLVED
Sam_88
Occasional Advisor

server suddenly reboot without problem signal

I have a problem with my server (HP A500): it reboots suddenly, and I am new to HP-UX, so please help. I have attached the syslog.log for reference.
Paul Mayor
18 REPLIES
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

You need to look at the OLDsyslog.log, the /etc/shutdownlog, and crash dumps in /var/adm/crash, and any diagnostic logs (assuming you're running diagnostics).
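The checks above can be gathered in one pass. This is only a rough sketch in POSIX sh: the function name and the optional path prefix argument are made up for illustration (the prefix is there so it can be tried against a copy of the files), and it only covers the files named above, not the diagnostic logs.

```shell
# Sketch only: show the tail of each log named above, if it exists.
# $1 is a hypothetical path prefix for testing; leave it empty on the real system.
show_reboot_evidence() {
    root="${1:-}"
    for f in /var/adm/syslog/OLDsyslog.log /etc/shutdownlog; do
        if [ -r "$root$f" ]; then
            echo "== $f (last 20 lines) =="
            tail -n 20 "$root$f"
        else
            echo "== $f: not readable =="
        fi
    done
    echo "== /var/adm/crash =="
    if [ -d "$root/var/adm/crash" ]; then
        # A crash dump, if one was saved, shows up as an entry here.
        ls -l "$root/var/adm/crash"
    else
        echo "no crash directory"
    fi
}
```

Run as root, `show_reboot_evidence` with no argument reads the live files.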


Pete


Pete
Jean-Luc Oudart
Honored Contributor

Re: server suddenly reboot without problem signal

Did you check the other logs to see whether that was a "proper" reboot or an unwanted one?

Check this link:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x335fcc0a76c1ae4da5364278123d54c8,00.html

Rgds,
Jean-Luc
fiat lux
Steven E. Protter
Exalted Contributor
Solution

Re: server suddenly reboot without problem signal

Check the NIC card, cable and switch; it appears that your LAN connection was interrupted. This could result in the issues you've noted in this thread.

From the outside, on a Unix/Linux box: ping hostname

If the ping times gradually climb, you need a NIC card replacement.

dmesg

Check for hardware faults.

ioscan: look for NO_HW or UNCLAIMED devices.
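A quick way to do that scan, as a sketch: ioscan reports a software state per device, and NO_HW and UNCLAIMED are the two states worth flagging here. The function name below is made up; it just filters whatever ioscan listing is piped into it.

```shell
# Sketch only: read an ioscan listing on stdin and keep the lines whose
# S/W state is NO_HW or UNCLAIMED. "CLAIMED" alone does not match.
suspect_devices() {
    grep -E 'NO_HW|UNCLAIMED'
}
```

On the HP-UX box this would be used as `ioscan -fn | suspect_devices`. No output means nothing was flagged.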

Check /var/adm/crash for crash dumps.

If there is one, follow the attached guide to analyze the dump. You may need a patch.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Helen French
Honored Contributor

Re: server suddenly reboot without problem signal

What version of MC/SG (MC ServiceGuard cluster software) are you using? Applying these patches might solve your issue:

For HP-UX 11.X, MC/SG version A.11.09 - PHSS_27158:

http://www2.itrc.hp.com/service/cki/patchDocDisplay.do?patchId=PHSS_27158

For HP-UX 11.X, MC/SG version A.11.14 - PHSS_29122:

http://www2.itrc.hp.com/service/cki/patchDocDisplay.do?patchId=PHSS_29122
Life is a promise, fulfill it!
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

Since the last entry I see in your syslog is someone su'ing to root, I have to wonder if this is somehow connected. Do you know who did the su? Do you know what they did after they su'd? What does /etc/shutdownlog say?


Pete


Pete
E. Wong
Frequent Advisor

Re: server suddenly reboot without problem signal

Please check for possible cron jobs that may trigger a reboot.

Run

# crontab -l

and check each of the scheduled jobs.
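If the list is long, a filter can narrow it down. This is a sketch with a made-up function name, and it only catches jobs that call reboot or shutdown directly; a job that runs a script which then reboots the box would still need to be read by hand.

```shell
# Sketch only: read crontab entries on stdin, drop comment lines,
# and print the entries that mention reboot or shutdown.
reboot_jobs() {
    grep -v '^[[:space:]]*#' | grep -E 'reboot|shutdown'
}
```

Usage would be `crontab -l | reboot_jobs`. Note that crontab -l only lists the current user's jobs, so it is worth repeating for root and any application accounts; on HP-UX the per-user crontab files should also be visible to root under /var/spool/cron/crontabs.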


compute, therefore you are
Sam_88
Occasional Advisor

Re: server suddenly reboot without problem signal

Thanks for all your info.

Here is the OLDsyslog.log file. Please help me see whether the system crash is due to a hardware problem or something else.

Paul Mayor
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

I don't think we're seeing all of the OLDsyslog.

Could you just do a "tail -100 /var/adm/syslog/OLDsyslog.log" and attach that output?


Pete


Pete
Sam_88
Occasional Advisor

Re: server suddenly reboot without problem signal

Thanks for your reply


Here is the modified OLDsyslog.log

Thanks
Paul Mayor
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

I still don't see anything. Did you see anything suspicious in the shutdown log? Anything under /var/adm/crash? With what we've been able to see so far, there's no way to tell what caused the reboot.


Pete


Pete
Kent Ostby
Honored Contributor

Re: server suddenly reboot without problem signal

Sam -- The key file is going to be /etc/shutdownlog.

Please post the output from:

tail /etc/shutdownlog

If it shows something like:

reboot after panic: trap type 1 HPMC
or
reboot after panic: isr.ior = (some number)

then you're probably looking at a HW problem and we need to look at /var/tombstones/ts99 to see if there is current data INSIDE the file.

Please post shutdownlog information as that will tell us if you have a likely HW or SW problem.
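To pull just the panic entries out of a long shutdownlog, something like the following sketch could be used. The function name and the two labels are made up, and the labels only apply the rule of thumb above (HPMC or isr.ior suggests hardware); they are a hint, not a diagnosis.

```shell
# Sketch only: read shutdownlog text on stdin and print each panic entry
# with a rough label based on the rule of thumb above.
classify_panics() {
    grep 'after panic' | while IFS= read -r line; do
        case "$line" in
            *HPMC*|*isr.ior*) echo "likely-HW: $line" ;;
            *)                echo "other:     $line" ;;
        esac
    done
}
```

Usage would be `classify_panics < /etc/shutdownlog`. Normal "Reboot:" and "Halt:" entries are skipped, so an empty result means no panic was logged.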
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Sam_88
Occasional Advisor

Re: server suddenly reboot without problem signal

Hi all,

Here is the /etc/shutdownlog from the problem A500 server.

06:18 Tue May 15, 2001. Halt: (by shkodb02!root)
07:15 Tue May 15, 2001. Reboot: (by shkodb02!root)
02:33 Thu May 17, 2001. Halt: (by shkodb02!root)
23:00 Thu May 17, 2001. Reboot: (by shkodb02!root)
02:09 Sat May 19, 2001. Reboot: (by shkodb02!root)
12:05 Wed May 30, 2001. Reboot: (by shkodb02!root)
16:24 Wed May 30, 2001. Reboot: (by shkodb02!root)
17:09 Wed May 30, 2001. Reboot: (by shkodb02!root)
18:14 Wed May 30, 2001. Reboot: (by shkodb02!root)
16:17 Fri Jun 1, 2001. Reboot: (by shkodb02!root)
10:35 Tue Jun 5, 2001. Reboot:
10:54 Thu Jun 14, 2001. Halt: (by shkodb02!root)
10:54 Fri Jun 15, 2001. Reboot: (by shkodb02!root)
17:24 Tue Jul 24, 2001. Halt:
15:52 Sun May 11, 2003. Halt: (by shkodb02!root)
10:26 Tue May 13, 2003. Halt: (by shkodb02!root)
db800
15:27 Wed May 14, 2003. Reboot: (by shkodb02!root)ior = 0'505a400.80000000'6f
01:37 Fri May 16 2003. Reboot after panic: , isr.ior = 0'd503800.80000000'74
67800
10:30 Fri May 16, 2003. Reboot: (by shkodb02!root)
11:16 Fri May 16, 2003. Reboot: (by shkodb02!root)
11:29 Fri May 16, 2003. Reboot: (by shkodb02!root)
15:29 Sun May 18, 2003. Halt: (by shkodb02!admin1)
11:49 Thu Aug 21, 2003. Reboot:
11:59 Thu Aug 21, 2003. Halt: (by shkodb02!root)
12:11 Thu Aug 21, 2003. Reboot:
13:50 Thu Aug 21, 2003. Reboot: (by shkodb02!root)
02:20 Fri Aug 22 2003. Reboot after panic: , isr.ior = 0'fb22400.80000000'60
b5800
10:41 Fri Aug 22 2003. Reboot after panic: , isr.ior = 0'a220400.80000000'88
d9800
10:48 Fri Aug 22, 2003. Reboot: (by shkodb02!root)
23:55 Sun Aug 24, 2003. Halt: (by shkodb02!root)
02:46 Mon Aug 25, 2003. Halt: (by shkodb02!admin5)
03:13 Mon Aug 25, 2003. Reboot:
12:54 Thu Aug 28 2003. Reboot after panic: Spinlock deadlock!
10:34 Sun Aug 31 2003. Reboot after panic: , isr.ior = 0'63d1c00.80000000'50


Also, the ts99.dat file is attached.

Thanks very much

Sam
Paul Mayor
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

I get a 404 error when I try to open the ts99 file, but obviously your system panicked and rebooted. An analysis of the ts99 file might help to narrow down the cause.


Pete


Pete
Sam_88
Occasional Advisor

Re: server suddenly reboot without problem signal

Hi Pete,

I have attached the file again. Please help me see whether anything is wrong.

Thanks


Sam
Paul Mayor
Pete Randall
Outstanding Contributor

Re: server suddenly reboot without problem signal

Sam,

All I can tell with my limited knowledge is that it looks like you had an HPMC (High Priority Machine Check). That will definitely result in a panic and reboot, but why you had the HPMC still needs to be determined. Hopefully someone will be able to diagnose something from your ts99 file.


Pete


Pete
Kent Ostby
Honored Contributor

Re: server suddenly reboot without problem signal

Sam --

Definitely looking at a HW problem and probably a CPU.

You should probably open a HW call so you can get the correct diagnosis and get the situation fixed right away.

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Sam_88
Occasional Advisor

Re: server suddenly reboot without problem signal

Hi Pete,

Thanks for your advice on my problem.
Based on the log files, is the problem on processor 1 in my case? I am going crazy over this situation, as it reboots without any warning signals and it is so confusing. Anyway, thanks for your quick response.

Sam
Paul Mayor
Kent Ostby
Honored Contributor

Re: server suddenly reboot without problem signal

Again, if you have access to HW support, you're probably better off going to them to get a definitive answer.

If I had to guess at something, I'd look at CPU 0, since it did not report in at the time of the reboot (i.e. the "No HPMC Chassis Codes logged" message).

Normally, all CPUs should report in to the ts99 file if there is some sort of crash.
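Before reading too much into which CPU is missing, it may be worth confirming that the ts99 file really holds data from the latest crash, as mentioned earlier in the thread. A sketch of that check follows; the function name and the default 7-day window are assumptions, and it looks only at the file's size and modification time, not at its contents.

```shell
# Sketch only: succeed if the file exists, is non-empty, and was
# modified within the last N days (N defaults to 7, an arbitrary choice).
has_current_data() {
    file="$1"
    days="${2:-7}"
    [ -n "$(find "$file" -type f -size +0 -mtime "-$days" 2>/dev/null)" ]
}
```

For example: `has_current_data /var/tombstones/ts99 3 && echo "ts99 was written in the last 3 days"`.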

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"