Alpha ES40 rebooted with "Machine check processor fatal abort" error

Azim_3
Occasional Advisor

Hi all,

I have a Compaq AlphaServer ES40 running Tru64 UNIX V5.1B in a cluster, with firmware revision 6.3-2.
The server rebooted with the error shown below.
Jul 1 22:26:04 atd1 vmunix: Machine Check Processor Fatal Abort
Jul 1 22:26:04 atd1 vmunix: Machine check code = 0x1000000a0
Jul 1 22:26:04 atd1 vmunix: Ibox Status = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Dcache Status = 0000000000000008
Jul 1 22:26:04 atd1 vmunix: Cbox Address = 0000000101dbcd40
Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 1 = 00000000000000d3
Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 0 = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cbox Status = 000000000000000b
Jul 1 22:26:04 atd1 vmunix: EV6 captured status of Bcache mode = 000000000000000d
Jul 1 22:26:04 atd1 vmunix: EV6 Exception Address = 0000000122af84d0
Jul 1 22:26:04 atd1 vmunix: EV6 Interrupt Enablement and Current Processor mode = 0000007ee0000008
Jul 1 22:26:04 atd1 vmunix: EV6 Interrupt Summary Register = 0000000080000000
Jul 1 22:26:04 atd1 vmunix: EV6 TBmiss or Fault status = 0000000000000290
Jul 1 22:26:04 atd1 vmunix: EV6 PAL Base Address = 0000000000018000
Jul 1 22:26:04 atd1 vmunix: EV6 Ibox control = fffffe0006304396
Jul 1 22:26:04 atd1 vmunix: EV6 Ibox Process_context = 0000080000000004
Jul 1 22:26:04 atd1 vmunix: O/S Summary flag = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cchip Base Address (phys) = 00000f01a0000000
Jul 1 22:26:04 atd1 vmunix: Cchip Device Raw Interrupt Request = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: DRIR Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Device Interrupt Mask = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cchip Miscellaneous Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Misc Register Decode:
Jul 1 22:26:04 atd1 vmunix: Cchip Revision: 00
Jul 1 22:26:04 atd1 vmunix: ID of CPU performing read: 00
Jul 1 22:26:04 atd1 vmunix: Pchip 0 Base Address (phys) = 00000f0180000000
Jul 1 22:26:04 atd1 vmunix: Pchip 0 Error Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Pchip Error Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Xaction Start Address = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: PCI Command: Interrupt Acknowledge
Jul 1 22:26:04 atd1 vmunix: Pchip 1 Base Address (phys) = 00000f0380000000
Jul 1 22:26:04 atd1 vmunix: Pchip 1 Error Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Pchip Error Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Xaction Start Address = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: PCI Command: Interrupt Acknowledge
Jul 1 22:26:04 atd1 vmunix: CPU 1 is prevented from being rebooted.
Jul 1 22:26:04 atd1 vmunix: The system must be reset or power cycled to clear this state.
Jul 1 22:26:04 atd1 vmunix: panic (cpu 1): Processor Machine Check
Jul 1 22:26:04 atd1 vmunix: syncing disks...
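
When comparing a dump like the one above against other crashes, it can help to pull the register lines into name/value pairs and see which ones are non-zero. The following is only a minimal Python sketch: the function names `parse_registers` and `nonzero_registers` are hypothetical, the sample is a few lines copied from the log above, and the script makes no attempt to decode what any register bit means.

```python
import re

# Matches syslog lines of the form
#   "Jul 1 22:26:04 atd1 vmunix: Cbox Status = 000000000000000b"
# Lines without "name = hex value" (for example "PCI Command: ...") are skipped.
LINE_RE = re.compile(
    r"vmunix:\s*(?P<name>.+?)\s*=\s*(?P<value>(?:0x)?[0-9a-fA-F]+)\s*$"
)

# A few lines copied from the log in this post, used only as sample input.
SAMPLE = """\
Jul 1 22:26:04 atd1 vmunix: Machine Check Processor Fatal Abort
Jul 1 22:26:04 atd1 vmunix: Machine check code = 0x1000000a0
Jul 1 22:26:04 atd1 vmunix: Ibox Status = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Dcache Status = 0000000000000008
Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 1 = 00000000000000d3
Jul 1 22:26:04 atd1 vmunix: Cbox Status = 000000000000000b
Jul 1 22:26:04 atd1 vmunix: PCI Command: Interrupt Acknowledge
"""


def parse_registers(log_text):
    """Return {register name: integer value} for every 'name = hex' line.

    If the same name appears more than once, the last occurrence wins.
    """
    registers = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            registers[match.group("name")] = int(match.group("value"), 16)
    return registers


def nonzero_registers(registers):
    """Keep only the registers whose value is not zero."""
    return {name: value for name, value in registers.items() if value != 0}


if __name__ == "__main__":
    for name, value in nonzero_registers(parse_registers(SAMPLE)).items():
        print(f"{name}: {value:#x}")
```

Interpreting the non-zero values still requires the processor and platform documentation or HP support; the sketch only makes them easier to spot.
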

After browsing similar threads, the problem is most likely with the CPU or RAM.
Can anyone pinpoint the actual problem?
Does the hardware mentioned above need to be replaced?
I am attaching the crash-data file and binary.errorlog.zip.

Waiting for a prompt reply.

rgds
azim
2 REPLIES
Ivan Ferreira
Honored Contributor

Re: Alpha ES40 rebooted with "Machine check processor fatal abort" error

You should check:

1-

Your firmware revision is the latest.
You have the latest patch kit installed.

2-

Check for power supply or environmental errors (>>> show power)
Replace the memory
Swap or replace the CPU

3-

Oracle was the process running at the time of the crash, so check the Oracle log files.
Why do it the hard way, when you can do it the easy way?
Michael Schulte zur Sur
Honored Contributor

Re: Alpha ES40 rebooted with "Machine check processor fatal abort" error

Azim,

This is clearly a case for opening a support call with HP.

greetings,

Michael