
Alpha ES40 rebooted with "Machine check processor fatal abort" error

Azim_3
Occasional Advisor

Hi to ALL,

I have a Compaq AlphaServer ES40 running Tru64 UNIX V5.1B in a cluster, with firmware revision 6.3-2.
The server rebooted with the error below.
Jul 1 22:26:04 atd1 vmunix: Machine Check Processor Fatal Abort
Jul 1 22:26:04 atd1 vmunix: Machine check code = 0x1000000a0
Jul 1 22:26:04 atd1 vmunix: Ibox Status = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Dcache Status = 0000000000000008
Jul 1 22:26:04 atd1 vmunix: Cbox Address = 0000000101dbcd40
Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 1 = 00000000000000d3
Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 0 = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cbox Status = 000000000000000b
Jul 1 22:26:04 atd1 vmunix: EV6 captured status of Bcache mode = 000000000000000d
Jul 1 22:26:04 atd1 vmunix: EV6 Exception Address = 0000000122af84d0
Jul 1 22:26:04 atd1 vmunix: EV6 Interrupt Enablement and Current Processor mode = 0000007ee0000008
Jul 1 22:26:04 atd1 vmunix: EV6 Interrupt Summary Register = 0000000080000000
Jul 1 22:26:04 atd1 vmunix: EV6 TBmiss or Fault status = 0000000000000290
Jul 1 22:26:04 atd1 vmunix: EV6 PAL Base Address = 0000000000018000
Jul 1 22:26:04 atd1 vmunix: EV6 Ibox control = fffffe0006304396
Jul 1 22:26:04 atd1 vmunix: EV6 Ibox Process_context = 0000080000000004
Jul 1 22:26:04 atd1 vmunix: O/S Summary flag = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cchip Base Address (phys) = 00000f01a0000000
Jul 1 22:26:04 atd1 vmunix: Cchip Device Raw Interrupt Request = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: DRIR Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Device Interrupt Mask = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Cchip Miscellaneous Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Misc Register Decode:
Jul 1 22:26:04 atd1 vmunix: Cchip Revision: 00
Jul 1 22:26:04 atd1 vmunix: ID of CPU performing read: 00
Jul 1 22:26:04 atd1 vmunix: Pchip 0 Base Address (phys) = 00000f0180000000
Jul 1 22:26:04 atd1 vmunix: Pchip 0 Error Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Pchip Error Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Xaction Start Address = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: PCI Command: Interrupt Acknowledge
Jul 1 22:26:04 atd1 vmunix: Pchip 1 Base Address (phys) = 00000f0380000000
Jul 1 22:26:04 atd1 vmunix: Pchip 1 Error Register = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: Pchip Error Register Decode:
Jul 1 22:26:04 atd1 vmunix: PCI Xaction Start Address = 0000000000000000
Jul 1 22:26:04 atd1 vmunix: PCI Command: Interrupt Acknowledge
Jul 1 22:26:04 atd1 vmunix: CPU 1 is prevented from being rebooted.
Jul 1 22:26:04 atd1 vmunix: The system must be reset or power cycled to clear this state.
Jul 1 22:26:04 atd1 vmunix: panic (cpu 1): Processor Machine Check
Jul 1 22:26:04 atd1 vmunix: syncing disks...
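
A dump like the one above is easier to compare against other crashes once the "name = value" lines are pulled into a table. The following is only a minimal sketch in Python, assuming the syslog line layout shown above; the function names parse_machine_check and nonzero_fields are hypothetical helpers, not part of any Tru64 or HP tool. It does not decode the register bits. It only lists which registers were non-zero, which still needs to be interpreted by HP support or with the proper error-log analysis tools.

```python
import re

# Matches lines of the form seen in the dump above, e.g.
#   Jul 1 22:26:04 atd1 vmunix: Cbox Status = 000000000000000b
# Lines without "name = hexvalue" (decode headers, panic text) are skipped.
FIELD_RE = re.compile(
    r"vmunix:\s*(?P<name>.+?)\s*=\s*(?P<value>(?:0x)?[0-9a-fA-F]+)\s*$"
)

def parse_machine_check(lines):
    """Return {register name: integer value} for every 'name = hex' line.

    Assumption: every value is hexadecimal, with or without a 0x prefix.
    If a name repeats (e.g. per-Pchip decode lines), the first one is kept.
    """
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            fields.setdefault(match.group("name"), int(match.group("value"), 16))
    return fields

def nonzero_fields(fields):
    """Keep only the registers whose value is not zero."""
    return {name: value for name, value in fields.items() if value != 0}

if __name__ == "__main__":
    sample = [
        "Jul 1 22:26:04 atd1 vmunix: Machine check code = 0x1000000a0",
        "Jul 1 22:26:04 atd1 vmunix: Ibox Status = 0000000000000000",
        "Jul 1 22:26:04 atd1 vmunix: Fill Syndrome 1 = 00000000000000d3",
        "Jul 1 22:26:04 atd1 vmunix: panic (cpu 1): Processor Machine Check",
    ]
    for name, value in nonzero_fields(parse_machine_check(sample)).items():
        print(f"{name}: {value:#x}")
```

To run it against a saved copy of the messages file, pass the open file object to parse_machine_check in place of the sample list.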

After browsing similar threads, the problem seems most likely to be with the CPU or RAM.
Can anyone pinpoint the actual problem?
Does the hardware mentioned above need to be replaced?
I am attaching the crash-data file and binary.errorlog.zip.

Waiting for a prompt reply.

rgds
azim
2 REPLIES
Ivan Ferreira
Honored Contributor

Re: Alpha ES40 rebooted with "Machine check processor fatal abort" error

You should check:

1-

Your firmware revision is the latest.
You have the latest patch kit installed.

2-

Check for power supply or environmental errors (>>> show power)
Replace the memory
Swap or replace the CPU

3-

Oracle was the process running at the time of the crash; check the Oracle log files.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Michael Schulte zur Sur
Honored Contributor

Re: Alpha ES40 rebooted with "Machine check processor fatal abort" error

Azim,

this is clearly a case to open a call with HP.

greetings,

Michael