Operating System - HP-UX
unexpected system shutdown

 
Kurt Beyers.
Honored Contributor

Hi,

An HP-UX server (rp 5430) running HP-UX i had an unexpected system shutdown. This is an extract from the OLDsyslog.log file:

Mar 22 17:37:48 servername rpcbind: terminate: rpcbind terminating on signal. Restart with "rpcbind -w"
Mar 22 17:37:48 servername diagmond[1211]: Exit due to receipt of unexpected signal (3)
Mar 22 17:37:49 servername /usr/lbin/ups_mond[1497]: /usr/lbin/ups_mond: reboot -halt invoked due to UPS error cited in previous syslog message
Mar 22 17:37:49 servername /usr/lbin/ups_mond[1497]: /usr/lbin/ups_mond: UPS /dev/tty0p2 could not execute command S120
Mar 22 17:37:53 servername vmunix:
Mar 22 17:37:53 servername vmunix: sync'ing disks (1 buffer to flush): 1
Mar 22 17:37:53 servername vmunix: 0 buffers not flushed
Mar 22 17:37:53 servername vmunix: 0 buffers still dirty

Does anybody know what caused this problem? The UPS seems to have received some kind of error.

Kurt
James Murtagh
Honored Contributor

Re: unexpected system shutdown

Hi Kurt,

It is hard to be sure from this extract, but it looks like the server was already on its way down when you got the UPS message. ups_mond has obviously detected something. If you could attach the OLDsyslog.log, I might be able to tell you more.

Also check the last entry in /etc/shutdownlog; it will tell you who invoked the reboot. I would also check the ts99 file in /var/tombstones for a valid timestamp. Attach it too if you are unsure how to read it.

One last thing: I can't believe you got those last few entries in the syslog. HP-UX is not running at that point, so I don't understand how syslog can log them!?

Regards,

James.
Kurt Beyers.
Honored Contributor

Re: unexpected system shutdown

James,

There are no relevant messages prior to those I've copy/pasted in the post.

What do you mean by checking whether there is a valid timestamp in /var/tombstones/ts99?
James Murtagh
Honored Contributor

Re: unexpected system shutdown

Hi Kurt,

I was just checking in case you had hardware problems. In the ts99 file there should be a line (near the top) either saying "No valid timestamp" or a line with the date and a row of chassis codes in hex.
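For what it's worth, that check can be sketched in a few lines, assuming Python happens to be available on the box (not a given on HP-UX). The helper name `has_no_valid_timestamp_marker` and the 20-line window are made up for illustration, and the exact spelling of the marker in a real ts99 file is an assumption based on the wording above. It does not try to decode the chassis codes.

```python
from pathlib import Path

# Marker text as quoted above; the exact spelling in a real ts99 file is assumed.
NO_TIMESTAMP_MARKER = "No valid timestamp"


def has_no_valid_timestamp_marker(path="/var/tombstones/ts99", head_lines=20):
    """Return True if the marker appears within the first `head_lines` lines.

    Hypothetical helper: it only searches for the marker string
    (case-insensitively) and says nothing about the chassis codes.
    """
    with Path(path).open(errors="replace") as fh:
        for line_number, line in enumerate(fh):
            if line_number >= head_lines:
                break
            if NO_TIMESTAMP_MARKER.lower() in line.lower():
                return True
    return False
```

A result of False only means the marker was not found near the top; the file still needs a read by eye.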

Does the shutdownlog not have an entry for the reboot? Was a crash dump produced?

Regards,

James.
Kurt Beyers.
Honored Contributor

Re: unexpected system shutdown

No entries in the shutdownlog.
James Murtagh
Honored Contributor

Re: unexpected system shutdown

Hi Kurt,

When a server reboots for any reason you only have a few places to look for evidence if the default configuration is in place:

1) /etc/shutdownlog --> reboot/panic message is logged here
2) /var/tombstones --> The ts99 will indicate if there were hardware problems
3) /var/adm/crash --> crash dumps are written here by default
4) /var/adm/syslog/OLDsyslog.log --> this gives an indication of what was happening beforehand, most useful in S/G clusters
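A quick way to walk those four locations after a reboot could look like the sketch below. It assumes Python is installed on the server, and the function name `summarize_evidence` is invented for illustration. It only reports whether each path exists and when it was last modified, so you can compare that against the time of the outage.

```python
import os
import time

# The four default locations listed above.
EVIDENCE_PATHS = [
    "/etc/shutdownlog",
    "/var/tombstones/ts99",
    "/var/adm/crash",
    "/var/adm/syslog/OLDsyslog.log",
]


def summarize_evidence(paths=EVIDENCE_PATHS):
    """Return (path, status) pairs; status is 'missing' or the last-modified time."""
    report = []
    for path in paths:
        if not os.path.exists(path):
            report.append((path, "missing"))
            continue
        mtime = os.path.getmtime(path)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        report.append((path, "last modified " + stamp))
    return report


if __name__ == "__main__":
    for path, status in summarize_evidence():
        print(f"{path}: {status}")
```

A modification time well before the outage on all four is itself a hint that nothing was written on the way down.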

You mentioned no message in the shutdownlog. The few times I have seen this, there was a complete loss of power to the server on each occasion. This would certainly be a reason for the UPS to kick in. However, my first point was based on the fact that rpcbind and diagmond sit so close together in the shutdown sequence that seeing both of them receive kill signals in the same second suggests a shutdown was already in progress. The next process to go down after these is syslogd.
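To make the "same second" observation easier to spot in a longer log, here is a rough sketch that groups syslog tags by timestamp. It assumes Python is available and that every line follows the classic layout seen in the extract; the regular expression and the name `tags_by_second` are illustrative only. The sample lines are copied from the first post.

```python
import re
from collections import defaultdict

# Classic syslog layout as seen in the extract: "Mon DD HH:MM:SS host tag[pid]: message"
SYSLOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?:\s*(?P<msg>.*)$"
)

# Lines copied verbatim from the OLDsyslog.log extract in the first post.
SAMPLE = [
    'Mar 22 17:37:48 servername rpcbind: terminate: rpcbind terminating on signal. Restart with "rpcbind -w"',
    "Mar 22 17:37:48 servername diagmond[1211]: Exit due to receipt of unexpected signal (3)",
    "Mar 22 17:37:49 servername /usr/lbin/ups_mond[1497]: /usr/lbin/ups_mond: UPS /dev/tty0p2 could not execute command S120",
]


def tags_by_second(lines):
    """Map each timestamp (one-second resolution) to the distinct tags that logged in it."""
    grouped = defaultdict(list)
    for line in lines:
        match = SYSLOG_LINE.match(line)
        if match and match.group("tag") not in grouped[match.group("stamp")]:
            grouped[match.group("stamp")].append(match.group("tag"))
    return dict(grouped)


if __name__ == "__main__":
    for stamp, tags in tags_by_second(SAMPLE).items():
        print(stamp, ", ".join(tags))
```

Run on the sample, rpcbind and diagmond land under 17:37:48 and ups_mond under 17:37:49. Lines that do not match the layout are skipped silently.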

If there is no dump and no (valid) ts99 entry then you have little evidence of what happened.

Regards,

James.
T G Manikandan
Honored Contributor

Re: unexpected system shutdown

There was a problem with the UPS, and it triggered a system shutdown as per the configuration in /etc/ups_conf.

Also check this link

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0xc67084534efbd5118ff40090279cd0f9,00.html
T G Manikandan
Honored Contributor

Re: unexpected system shutdown

Check Appendices A and C of this PDF.

http://www1.itrc.hp.com/service/cki/cache/200000059144896.pdf


"reboot -halt invoked due to UPS error cited in previous syslog message "

What are the previous messages regarding ups_mond?
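If the log is long, the ups_mond entries can be pulled out with something like the sketch below. It assumes Python is on the system and that the tag is the fifth whitespace-separated field, as in the extract posted above; the name `lines_from_daemon` is made up for illustration.

```python
def lines_from_daemon(lines, daemon="ups_mond"):
    """Yield syslog lines whose tag field (5th whitespace-separated field) names the daemon."""
    for line in lines:
        fields = line.split(None, 5)
        if len(fields) >= 5 and daemon in fields[4]:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Path taken from the thread; adjust if syslog is kept elsewhere.
    with open("/var/adm/syslog/OLDsyslog.log", errors="replace") as log:
        for entry in lines_from_daemon(log):
            print(entry)
```

Entries come out in file order, so whatever ups_mond logged before the "reboot -halt invoked" line should appear just above it.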


Thanks

Donald Kok
Respected Contributor

Re: unexpected system shutdown

Hi Kurt,

I had this a year ago. The power went down for ten minutes. It happened during working hours, so I could observe the behaviour. The system was still running on the UPS. The UPS manager made the system go down even though the power was already back.

I did not find a way to tune this. Now we have a different UPS system, which handles this a bit better.

HTH
Donald
My systems are 100% Murphy Compliant. Guaranteed!!!
Kurt Beyers.
Honored Contributor

Re: unexpected system shutdown

Apparently the shutdown occurs when a Unify database is being shut down on the server. The shutdb command seems to shut down the server as well.

No core dumps in /var/adm/crash and the file /var/tombstones/ts99 looks normal as well.


There are no previous ups_mond messages in the OLDsyslog.log file.

We'll look further in the Unify database first.

regards,

Kurt
Steven E. Protter
Exalted Contributor

Re: unexpected system shutdown

Looks to me like the UPS thought it lost power and the system tried to shut down.

The GSP should pick this up at the console; press Ctrl-b there.

SEP