Operating System - Tru64 Unix

Re: messages - CPU error

 
Karthik S S
Honored Contributor

Thank you all ...

I will ask the user to call up HP.

Thanks,
Karthik S S
For a list of all the ways technology has failed to improve the quality of life, please press three. - Alice Kahn
Mobeen_1
Esteemed Contributor

Karthik,
I have seen this happen many times. As many of our colleagues have suggested, you could try:

1. Power down your machine and reseat the CPUs, or even swap them, and see how things go. This will also confirm whether the error is actually on the CPU: since the CPU position changes when you swap, the error, if any, should then be reported at a different CPU location.

2. Use DECevent to look for any errors.
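
The swap test in step 1 boils down to a simple comparison, which might be sketched in Python as below. This is only an illustration: the function name, the slot labels and the sample counts are all made up, and the per-slot error counts would have to be collected from the error log after the swap.

```python
def diagnose_swap(errors_after, suspect_slot, swapped_with):
    """Interpret per-slot error counts seen AFTER swapping two CPU modules.

    errors_after: dict mapping a slot label (e.g. "CPU0") to an error count.
    suspect_slot: slot that reported the errors before the swap.
    swapped_with: slot the suspect module now sits in.
    All names here are hypothetical, for illustration only.
    """
    stayed = errors_after.get(suspect_slot, 0) > 0
    moved = errors_after.get(swapped_with, 0) > 0
    if moved and not stayed:
        return "errors followed the module: suspect the CPU itself"
    if stayed and not moved:
        return "errors stayed with the slot: suspect the slot or the environment"
    if not stayed and not moved:
        return "no errors after the swap: possibly a seating problem"
    return "errors in both slots: inconclusive"


# Made-up example: the errors were on CPU0 before, and show up on CPU1 after.
print(diagnose_swap({"CPU1": 3}, "CPU0", "CPU1"))
```

A clean log right after the swap proves little on its own, so the check is worth repeating once the machine has run under normal load for a while.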

Many times these errors are caused by environmental factors, most often CPU fan failures. But where a CPU fan has failed, you will for sure also get a message about the fan failure.

If I were you, I would log a call with HP and have them change the CPUs in question. It really depends on how critical these servers are at your site. In my case, I cannot afford downtime, so if there is any doubt, HP and I simply decide to replace them without taking any chances.

I have also seen many posts in this forum with the same issue. It would be interesting to see what resolutions others have found without having to replace the CPUs.

regards
Mobeen
Karthik S S
Honored Contributor

Mobeen,

Thanks, I will try that before calling HP...

By the way, how do I use DECevent? Is it a command?

Thanks,
Karthik S S
Mobeen_1
Esteemed Contributor

Hello Karthik,
Please review the link below and you will be able to use those commands.

http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V40G_HTML/AQTLSBTE/DOCU_008.HTM

Let me know if you still have issues trying to figure it out.

regards
Mobeen
Michael Schulte zur Sur
Honored Contributor

Karthik,

The easiest way to use DECevent is:
dia -R | more
The -R makes it go backward in time, so the most recent events come first.

Michael
Karthik S S
Honored Contributor

Thanks Michael,

Some of the output from DECevent:

------

Machine Check Reason x0086 Alpha Chip Detected ECC Err, From B-Cache
Ext Interface Status Reg xFFFFFFF085FFFFFF
DATA SOURCE IS BCACHE
CORRECTABLE ECC ERROR
D-ref fill

Machine Check Reason x0086 Alpha Chip Detected ECC Err, From B-Cache

Ext Interface Status Reg xFFFFFFF085FFFFFF
DATA SOURCE IS BCACHE
CORRECTABLE ECC ERROR
D-ref fill
EV5 Chip Rev 5
Ext Interface Address Reg xFFFFFF0029831EAF
Fill Syndrome Reg x000000000000B500
Interrupt Summary Reg x0000000100000000
Correctable ECC Errors (IPL31)
AST Requests 3-0: x0000000000000000

WHOAMI x00000000 CPU0 Detected This Error

---------------------------

Looks like a CPU cache / memory error.
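
For a long log, it may help to count how often each reason code occurs and which CPU reported it. The Python below is a rough sketch of that idea. The two regular expressions are modelled only on the line shapes visible in the excerpt above, the function name is made up, and other DECevent entry types may well look different.

```python
import re
from collections import Counter

# Patterns based only on the two line shapes seen in the excerpt (assumption).
REASON_RE = re.compile(r"Machine Check Reason\s+(x[0-9A-Fa-f]+)")
WHOAMI_RE = re.compile(r"WHOAMI\s+x[0-9A-Fa-f]+\s+(CPU\d+)\s+Detected This Error")


def tally_machine_checks(text):
    """Return (Counter of reason codes, Counter of reporting CPUs)."""
    reasons, cpus = Counter(), Counter()
    for line in text.splitlines():
        match = REASON_RE.search(line)
        if match:
            reasons[match.group(1)] += 1
            continue
        match = WHOAMI_RE.search(line)
        if match:
            cpus[match.group(1)] += 1
    return reasons, cpus


# A few lines copied from the output above, used as sample input.
sample = """\
Machine Check Reason x0086 Alpha Chip Detected ECC Err, From B-Cache
Ext Interface Status Reg xFFFFFFF085FFFFFF
CORRECTABLE ECC ERROR
WHOAMI x00000000 CPU0 Detected This Error
"""
reasons, cpus = tally_machine_checks(sample)
print(reasons)
print(cpus)
```

To run it on a real log, the output of dia -R could be redirected to a file and the file contents passed to the function. If one CPU dominates the tally, that is the one to reseat or swap first.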

Thanks,
Karthik S S
Mobeen_1
Esteemed Contributor

Karthik,
From the DECevent output it looks like this is a correctable cache error. I would suggest that you power down the machine and take this opportunity to remove the CPUs and put them back into their sockets :-)

I think the cache will be cleared totally when the machine is powered down.

Regards
Mobeen
Karthik S S
Honored Contributor

Thank you Mobeen. I have asked my colleague to do that. The machine is not located in my office :-)


-Karthik S S
Mobeen_1
Esteemed Contributor

Karthik,
That's great. I am sure that should help.

I would appreciate it if you could post back whether things were OK after doing that. It looks like many people are having similar issues on the Alphas :-) and your post will most certainly help them all.

regards
Mobeen

Karthik S S
Honored Contributor

I will do that Mobeen .. but there may be a delay as I am not involved in this issue directly ..

-Karthik S S