Operating System - Tru64 Unix

Tru64 5.1b system reboots

 
SOLVED
Adam Strobel
Frequent Advisor

Tru64 5.1b system reboots

Hi--

I have a system that reboots about every other day. When I run show power, it reports that everything is OK. The only errors I get are attached.

Any help would be great!

Attached is the /var/adm/messages log.

Thanks,

Adam
8 REPLIES
Ivan Ferreira
Honored Contributor
Solution

Re: Tru64 5.1b system reboots

Machine check errors often occur if you have:

A problem with the power supply. You said that show power reports everything OK, so this should not be your case.

A memory problem. Replace the memory modules.

A CPU problem. Replace the CPU.


But it would be better if you posted the most recently created /var/adm/crash/crash-data.X file.
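
As a minimal sketch of picking out that file, assuming Python is available on a machine that can read the crash directory: the helper name newest_crash_data is hypothetical, and modification time is used as a stand-in for "last file created".

```python
import glob
import os


def newest_crash_data(crash_dir="/var/adm/crash"):
    """Return the most recently modified crash-data.* file, or None.

    Modification time is an assumption used to approximate
    "the last file created"; the function name is illustrative only.
    """
    candidates = glob.glob(os.path.join(crash_dir, "crash-data.*"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    # Prints the path of the newest crash-data file, or None if there is none.
    print(newest_crash_data())
```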

You should also check the kern.log file in the /var/adm/syslog.dated directories.

You should also install DECevent, available on the installation CDs (Associated Products), so you can use the dia command to read the binary error log (hardware errors).

It is also recommended that you always have the latest patch kit installed.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b system reboots

Thanks Ivan-

Attached is the crash file.

I also looked in the /var/adm/syslog.dated directory, but I am not sure how to view the files there.


--Adam
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b system reboots

I also ran dia -R | more, but there are no recent messages. The most recent message was from Jun 2nd.


And yes, every time the system reboots I run "show power" and everything reports good.

--Adam
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b system reboots

Sorry to keep adding new items, but I just remembered that I left the console cable connected to the controller, so I checked the console and it did report this message.

%CER--HSG_YAMATO_2 > --02-AUG-2005 00:47:12-- Shelf 1 has a bad power supply or fan
HSG_YAMATO_2 >

%EVL--HSG_YAMATO_2 > --02-AUG-2005 00:47:17-- Instance Code: 03F70401
Template: 65.(41)
Occurred on 02-AUG-2005 at 00:47:12
Power On Time: 1. Years, 162. Days, 0. Hours, 8. Minutes, 54. Seconds
Controller Model: HSG80
Serial Number: ZG90305160 Hardware Version: E08(30)
Software Version: V87F-1(BB)
Port: 1. Target: 0.
Instance Code: 03F70401
HSG_YAMATO_2 >
%CER--HSG_YAMATO_2 > --02-AUG-2005 00:48:12-- Shelf 1 fixed
HSG_YAMATO_2 >
Mark Poeschl_2
Honored Contributor

Re: Tru64 5.1b system reboots

Does the time of the storage console error correspond to a Unix machine check? If not, I'd say they're two unrelated issues. Also, the ACS running in that HSG80 is well below the current patch level. 8.7-7 is the latest patch for 8.7, and 8.8 has been released. Moving to 8.8 would require an upgrade purchase.
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b system reboots

It seems that your storage has a problem. Maybe a fan failed, so overheating stopped the storage. You should check the fan and power supply LEDs on the storage; you may need to replace them.

Also, you should check the HSG guide to learn how to use the FMU utility. Basically, you need to type:

run fmu
show time
show last_failure most_recent
exit

or

run fmu
show time
show last_failure all
exit

This shows whether the errors occur when the system reboots. Compare the storage time with the system time.
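
The comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10-minute window, and the clock_offset parameter (the difference between the controller clock and the system clock) are hypothetical, and the reboot time in the usage part is a placeholder, not a value from this system. Only the controller timestamp format is taken from the console output posted above.

```python
from datetime import datetime, timedelta

# Month abbreviations as they appear in controller timestamps (e.g. AUG).
MONTHS = {name: i + 1 for i, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"])}


def parse_hsg_time(text):
    """Parse a controller timestamp such as '02-AUG-2005 00:47:12'."""
    date_part, time_part = text.strip().split()
    day, mon, year = date_part.split("-")
    hour, minute, second = time_part.split(":")
    return datetime(int(year), MONTHS[mon.upper()], int(day),
                    int(hour), int(minute), int(second))


def events_near_reboots(event_times, reboot_times,
                        clock_offset=timedelta(0),
                        window=timedelta(minutes=10)):
    """Return (event, reboot) pairs that fall within `window` of each other.

    clock_offset is added to each controller time before comparing, to
    allow for the two clocks not being in sync (an assumption here).
    """
    matches = []
    for event in event_times:
        adjusted = event + clock_offset
        for reboot in reboot_times:
            if abs(adjusted - reboot) <= window:
                matches.append((event, reboot))
    return matches


if __name__ == "__main__":
    events = [parse_hsg_time("02-AUG-2005 00:47:12")]
    # Placeholder reboot time for illustration only; substitute the real
    # reboot times taken from the system logs.
    reboots = [datetime(2005, 8, 2, 0, 50, 0)]
    print(events_near_reboots(events, reboots))
```

If no pairs come back even with a generous window, the storage errors and the reboots are probably unrelated, as suggested earlier in the thread.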
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b system reboots

Thanks, everyone. I'm going to replace the power supply and fan and see how it goes.

I will let you know,

Adam
Davor_7
Regular Advisor

Re: Tru64 5.1b system reboots

Hi Strobel,

I ran into the same problem on an HSG80, but after replacing the fans and power supply, the problem did not go away.

What was the result on your side?