Operating System - Tru64 Unix

Ninad_1
Honored Contributor

Server reboots without any logs

Hello

I have an ES40 running 4.0F, connected to an RA8000 via a SAN switch, with Oracle 8i.
Yesterday the server dropped to the P00>>> prompt all of a sudden at 10:30 in the morning. I rebooted the server and checked for any logs indicating the failure.
I checked the logs in the /var/adm/syslog.dated directories, /var/adm/messages, uerf -R and mail to root, and found nothing. There were no crash dumps in /var/adm/crash either. There is no trace of the failure.
At around 13:30 the server dropped to the P00>>> prompt again. I suspected a problem with the UPS supply, so I moved the server to another UPS to see whether that would solve the problem. This morning at 11:00 the server dropped to the P00>>> prompt once more. Each time I checked all the logs mentioned above, but found no clue about the failure.
Can you help me out of this situation? What else do I need to check? What could be causing this sort of behaviour?
Please help me.

Thanks, and eagerly waiting for your replies.

Ninad
Johan Brusche
Honored Contributor

Re: Server reboots without any logs


Ninad,

When it's at the P00>>> prompt, you can give several SRM commands to get info about the state of the ES40 before rebooting, e.g. try:

>>>show config
>>>show mem
>>>show power
>>>show error
>>>show fru
>>>show dev
>>>cat el
>>>info 3
>>>info 5
>>>info 6
>>>info 7
>>>show pci_parity (should be 'ON')

finally type
>>>crash
to obtain a memory dump.

All of the above is best done with the console set to SERIAL and a PC connected to it through its COM port to log the output.
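
As a rough illustration of the logging step, here is a minimal Python sketch that timestamps each line arriving from the serial console and appends it to a file. It assumes a Unix-like logging machine where the serial port appears as a device file and has already been configured (speed, parity) with a tool such as stty. The function names and the example paths are hypothetical; any terminal emulator with a capture-to-file option does the same job.

```python
from datetime import datetime


def timestamp_lines(lines, now=datetime.now):
    """Yield each console line prefixed with the time it was received."""
    for raw in lines:
        text = raw.rstrip("\r\n")  # drop the line ending sent by the console
        yield f"{now().isoformat(timespec='seconds')} {text}"


def log_console(device_path, log_path):
    """Copy console output from a serial device file into a timestamped log.

    Assumption: the port is already configured, so it can simply be read
    line by line like an ordinary text file.
    """
    with open(device_path, "r", errors="replace") as port, \
            open(log_path, "a") as log:
        for entry in timestamp_lines(port):
            log.write(entry + "\n")
            log.flush()  # keep the log current in case power is lost again
```

With hypothetical paths, a call would look like log_console("/dev/ttyS0", "es40-console.log"). The timestamps make it easier to line up console messages with the times at which the reboots were noticed.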

Another thing you did not mention: also check for HW errors (after reboot) in binary.errlog with the help of the "ca" (Compaq Analyze) utility.

Johan.

_JB_
Ninad_1
Honored Contributor

Re: Server reboots without any logs

Hello Johan

I think I forgot to tell you a major observation. The system does not go directly from running to the P00>>> prompt. I was sitting right next to the system when I suddenly heard a sound just like the system being powered on, and when I looked, the system had actually rebooted, as if it had been power cycled. But all the other equipment connected to the same UPS, such as one of the power supplies of the RA8000 and the SAN switch, kept functioning properly. Even so, I moved the ES40's power feed to another UPS, and it rebooted once after that as well. Sorry I did not mention this; in fact I was a bit confused by this serious problem, and it failed to register.


When the system came to the P00>>> prompt after the power reboot, so to speak, and before booting the system, I checked
more el, show power and show mem
and no errors were reported.
God forbid it happens again, but if it does I will run the rest of the commands too. Now that I have given you this major observation, do you think these commands will help?

What could be the problem now? Please help.

Thanks
Ninad

>>Another thing you did not mention....also check for HW errors (after reboot) in binary.errlog with the help of the "ca" (compaq analyze) utility.

I have used uerf, which also reads binary.errlog, and it does not report any hardware problem.
Mohamed  K Ahmed
Trusted Contributor

Re: Server reboots without any logs

Ninad,

As mentioned above, check the sh power command at the P00>>> prompt. Also check whether there are any problems with the power supply in the back of the ES40.

Did you add anything to the system before it started doing this, or had it been running fine until this behaviour suddenly came up?
Were there any additions hooked up to the same UPS outlet?
It sounds like a power issue or a UPS issue. Is there any record of fluctuations coming out of the UPS? We had a similar problem, and found that the UPS was not giving enough power to the systems connected to it; occasionally one of them got a power spike or low voltage and quit or rebooted. Maybe something is overloading the UPS line, or the UPS is not stabilizing power correctly.
We had to replace a module in the UPS, and later the whole UPS, to get back to normal operation.

HTH

Mohamed
Ninad_1
Honored Contributor

Re: Server reboots without any logs

Hi Again

Johan, you were right; I had not run commands like show error, show fru, etc.
This morning the server rebooted again, so at the P00>>> prompt I ran the commands you gave. Earlier I could not find any errors in binary.errlog, perhaps because at some point in the file uerf reported an error reading syserr; I later found out that this is a shortcoming of uerf and that dia or ca should be used instead. I used dia to check binary.errlog, but I still cannot make out any errors.
I am attaching the output of
P00>>> show error
and the file binary.errlog.
I would be grateful if you could throw some light on the errors shown.
We have been running this same configuration since 25/3/2004 (on that date we added 1 GB of memory from another system).


Thanks and regards

Ninad
Ninad_1
Honored Contributor

Re: Server reboots without any logs

The binary.errlog file attachment

Ninad