Operating System - Tru64 Unix

Re: Tru64unix:ES40 server restarting automatically

 
Learn_1
Regular Advisor

Tru64unix:ES40 server restarting automatically

Hi,
I am having a problem with an AlphaServer ES40 running Tru64 UNIX. A few days ago the server went down unplanned because of a dirty power failure and a UPS failure. Since that date the server restarts by itself about once a day, at no fixed time. Last night I shut it down properly, along with the controllers. I would like to know which log files can help me diagnose the cause and which monitoring tools can help pinpoint the problem. The server also runs an Oracle database. One more thing: the following error is reported on the desktop.
"Console log for ranjha.mobilink.net.pk
STDC Assertion failed:"childp"
file=actionTt.c
line=394 "

Your detailed suggestions will help me resolve this issue.
11 REPLIES
Ralf Puchner
Honored Contributor

Re: Tru64unix:ES40 server restarting automatically

First check whether a crash dump exists in /var/adm/crash. If so, post it here.

If not, check whether binary.errlog (dia -R) contains machine check errors. If it does, open a call with the support center for hardware analysis.

Next, check messages and daemon.log for information; maybe the envmond timers and settings are too low.
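
The first check above can be scripted. The following is only a minimal sketch in Python, assuming a Python interpreter is available on the host (or that the directory is inspected from another machine). The function name list_crash_files is made up for illustration, and nothing is assumed about how the dump files are named: it simply lists whatever regular files are in the directory, newest first.

```python
import os


def list_crash_files(crash_dir="/var/adm/crash"):
    """Return (name, size_bytes, mtime) tuples for regular files, newest first.

    An empty list means the directory is missing or holds no files,
    i.e. there is no crash dump to post.
    """
    if not os.path.isdir(crash_dir):
        return []
    entries = []
    for name in os.listdir(crash_dir):
        path = os.path.join(crash_dir, name)
        if os.path.isfile(path):  # skip subdirectories
            info = os.stat(path)
            entries.append((name, info.st_size, info.st_mtime))
    # Newest file first, so the dump from the latest restart is on top.
    entries.sort(key=lambda entry: entry[2], reverse=True)
    return entries


if __name__ == "__main__":
    for name, size, mtime in list_crash_files():
        print(name, size, mtime)
```

If the list is empty, move on to the binary.errlog check with dia.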
Help() { FirstReadManual(urgently); Go_to_it;; }
Dave Bechtold
Respected Contributor

Re: Tru64unix:ES40 server restarting automatically

Hello,

I agree with Ralf about checking all the various logs. The Environmental Monitoring daemon (envmond) should report power supply status to the logs: daemon.log, kern.log, or messages. It will often broadcast critical failures to all terminals connected at the time of the failure. You can use "man envmond" and envconfig to configure the envmond daemon parameters.

But if envmond is not running, you will not get a warning, etc.
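
To pull the relevant entries out of the text logs, something like the following could be used. This is a rough sketch, not a supported tool: the function names and the default keyword list are my own assumptions, and the location of daemon.log, kern.log, and messages on your system has to be passed in by you.

```python
def filter_log_lines(lines, keywords=("envmond", "power supply", "machine check")):
    """Return the lines that mention any keyword, compared case-insensitively."""
    wanted = [keyword.lower() for keyword in keywords]
    hits = []
    for line in lines:
        text = line.rstrip("\n")
        if any(keyword in text.lower() for keyword in wanted):
            hits.append(text)
    return hits


def scan_log_file(path, keywords=("envmond", "power supply", "machine check")):
    """Apply filter_log_lines to one text log; path is supplied by the caller."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, errors="replace") as handle:
        return filter_log_lines(handle, keywords)
```

If no envmond lines show up at all around the time of a restart, that may simply mean the daemon was not running, so check that first.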

Review binary.errlog using the appropriate tool; for the ES40 that is DECevent (dia).

The ES40 can have multiple power supplies. It is possible that one of them is marginal now after taking a power hit. You could try reducing the number of power supplies to isolate the suspect, or have them checked out by Field Service.

As for the "STDC Assertion failed ..." message, it is most likely being produced by a "C" program using the G-LIB (GNU Freeware "C" Library) and/or the gcc compiler. Find out which application or user is using the file 'actionTt.c' and work it from that angle.

Hope that helps,
Dave Bechtold
Learn_1
Regular Advisor

Re: Tru64unix:ES40 server restarting automatically

Hi,
Attached is the crash dump file for further suggestion.
Learn_1
Regular Advisor

Re: Tru64unix:ES40 server restarting automatically

Attached is the log for further consideration.
This error log refers to cpu0.
Ralf Puchner
Honored Contributor

Re: Tru64unix:ES40 server restarting automatically

The log indicates a machine check error, which simply means a fault in the CPU, cache, or memory.
Log a call with the HP support center and send them binary.errlog for further investigation.
Help() { FirstReadManual(urgently); Go_to_it;; }
Learn_1
Regular Advisor

Re: Tru64unix:ES40 server restarting automatically

Hi Ralf,
Thanks for your guidance. I will be logging a support call with HP soon. For your opinion, I am also forwarding you the binary.errlog. The log reports errors in the power supply and memory. I have checked the power supply status, which is OK, and the memory test showed no errors either. Please check this binary.errlog and suggest any further action that could help me pinpoint the faulty component, so that I can bring this server back online in a stable condition as soon as possible.
Once again, thanks.
Ralf Puchner
Honored Contributor

Re: Tru64unix:ES40 server restarting automatically

The memory test does not detect every kind of problem (and it does not test the cache memory of the CPU).

A machine check requires special tools (memcheck decoder) to analyze the root cause of the problem. These tools are only available for hardware specialists within the support center.

Help() { FirstReadManual(urgently); Go_to_it;; }
Learn_1
Regular Advisor

Re: Tru64unix:ES40 server restarting automatically

Hi Ralf,
I logged a call with HP and was advised to replace all the DIMM modules, as they diagnosed these modules as faulty.
I am arranging the modules. In the meantime, we had an identical system in the cabinet that was not being used. To keep work going, we moved the system disks into the spare server and started it up along with the storage. The system worked fine for a week, and just today it restarted again. All the hardware is new now, even the memory modules. What do you think the problem could be?
Ralf Puchner
Honored Contributor

Re: Tru64unix:ES40 server restarting automatically

Why not send the binary.errlog and crash-data (if available) to the support center again? It seems to be the same root cause again....
Help() { FirstReadManual(urgently); Go_to_it;; }