Operating System - Tru64 Unix

 
hass_2k
Occasional Advisor

My server down any HELP

Hi,
Please, an urgent reply is needed. My Tru64 UNIX server, an Alpha ES40, is down and I don't know where the problem is. Here is the story:
My server was working normally until a fan failed. I replaced the fan and started the server. It worked fine, but after a couple of hours it rebooted by itself, and when I tried to boot it again it would not boot. I put the old fan back and started the server again. It ran, but after a couple of hours it rebooted again and would not start. Could you please help? It's urgent.
Thank you,
Hassan
Ivan Ferreira
Honored Contributor

Re: My server down any HELP

I had similar problems once, but with an ES47. If the problem was or is the fan, then high temperatures may prevent your server from booting. What error do you get when booting?

In our case, when the temperature increased, a memory module started to have problems.

When the server crashes, run the following at the SRM console:

show power

This checks the power and environmental status.

You should check your binary.errlog with dia, and /var/adm/crash/crash-data if it exists.

As usual, my recommendation is to install WEBES; the Crash Analysis Tool will help a lot.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
hass_2k
Occasional Advisor

Re: My server down any HELP

Thanks for your reply, but the problem is that it boots and works normally for a couple of hours, then reboots by itself, and after that I can't boot it again. That's the problem.
Thank you,
Hass
Victor Semaska_3
Esteemed Contributor

Re: My server down any HELP

Hassan,

Have you checked /var/adm/messages for any errors? If there is nothing there, check /var/adm/syslog.dated/current/daemon.log and kern.log. When we had a server overheat, there were messages logged there.
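
A minimal sketch of that check, assuming a POSIX shell. The scan_logs helper name and the keyword list are illustrative guesses, not something the system provides, so adjust the pattern to whatever your logs actually contain.

```shell
#!/bin/sh
# Hypothetical helper: print lines from the given log files that hint at
# overheating, fan trouble, or a panic. The keyword list is an assumption.
scan_logs() {
    for f in "$@"; do
        # Skip files that are missing or unreadable on this system.
        [ -r "$f" ] || continue
        # Passing /dev/null as a second file makes grep prefix each match
        # with the file name. grep exits non-zero on "no match"; ignore it.
        grep -i -E 'temperature|overheat|fan|panic' "$f" /dev/null || :
    done
    return 0
}

# The three log files named in this post.
scan_logs /var/adm/messages \
    /var/adm/syslog.dated/current/daemon.log \
    /var/adm/syslog.dated/current/kern.log
```

Any line it prints is only a starting point; read the surrounding entries in the same file for the timestamps just before the reboot.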

It sounds like the fan is failing after a couple of hours and the system shuts itself down.

Vic
There are 10 kinds of people, one that understands binary and one that doesn't.
hass_2k
Occasional Advisor

Re: My server down any HELP

Well, I think I found an error that may help you all solve my problem:
Panic_savecore.tru
sub:evmalert(700) system panic: pinc(cpu0):domain panic promoted


I hope you can identify the error.
Harmanjit_1
Frequent Advisor

Re: My server down any HELP

Hi,

If you want to keep your server running, you can rename the S51envmon file in /sbin/rc3.d. This stops environmental monitoring, so the system will keep running even if the temperature rises. This is a temporary solution: you still have to get the fan replaced, but until then your server can keep working.
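
A rough sketch of the rename, assuming a POSIX shell and root access. The disable_rc_script helper and the "disabled." prefix are made up for illustration; any name that no longer starts with "S" should keep the script from being run at startup.

```shell
#!/bin/sh
# Hypothetical helper: disable an rc start script by renaming it so its
# name no longer begins with "S". The "disabled." prefix is arbitrary.
disable_rc_script() {
    rc_dir=$1
    name=$2
    # -h also catches a symbolic link whose target is missing.
    if [ -e "$rc_dir/$name" ] || [ -h "$rc_dir/$name" ]; then
        mv "$rc_dir/$name" "$rc_dir/disabled.$name"
    else
        echo "$rc_dir/$name not found; nothing changed" >&2
        return 1
    fi
}

# Example (run as root on the affected server):
#   disable_rc_script /sbin/rc3.d S51envmon
# To turn monitoring back on once the fan is fixed:
#   mv /sbin/rc3.d/disabled.S51envmon /sbin/rc3.d/S51envmon
```

The rename only affects the next startup of run level 3; it does not stop a monitoring daemon that is already running.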

Thanks
Harman
Johan Brusche
Honored Contributor

Re: My server down any HELP


Two entries back, the author posted:

evmalert(700) system panic: pinc(cpu0):domain panic promoted

This indicates an AdvFS domain panic that was promoted to a system panic, probably because the domain involved was root_domain.

Bad news. Hopefully you have a recent vdump of the root.

Rgds,

___ Johan.

_JB_