AlphaServer 1000A has CIA machine check ECC errors and fails to reboot

 
Hello Alpha fellows,

Original symptoms: the server still answered pings, but telnet connections
were no longer possible.

After connecting a laptop to the serial console port, I found the following
messages repeating in the console log:

CIA machine check: vector=0x630 pc=0xfffffc0000855700 code=0x86
machine check type: correctable ECC error (retryable)

CIA machine check: vector=0x630 pc=0xfffffc00008556f4 code=0x86
machine check type: correctable ECC error (retryable)

... etc
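
For anyone who wants to summarise a captured console log like the one above, here is a minimal Python sketch. It is not part of the original troubleshooting; the function name `tally_machine_checks` and the `sample` text are illustrative assumptions. It only counts how often each vector/code pair appears and lists the distinct pc values, without interpreting them.

```python
import re
from collections import Counter

# Matches lines such as:
# CIA machine check: vector=0x630 pc=0xfffffc0000855700 code=0x86
MCHECK_RE = re.compile(
    r"CIA machine check: vector=(0x[0-9a-fA-F]+) "
    r"pc=(0x[0-9a-fA-F]+) code=(0x[0-9a-fA-F]+)"
)


def tally_machine_checks(log_text):
    """Count machine checks per (vector, code) pair and collect distinct PCs."""
    counts = Counter()
    pcs = set()
    for match in MCHECK_RE.finditer(log_text):
        vector, pc, code = match.groups()
        counts[(vector, code)] += 1
        pcs.add(pc)
    return counts, sorted(pcs)


# Sample text copied from the console log quoted above.
sample = """\
CIA machine check: vector=0x630 pc=0xfffffc0000855700 code=0x86
machine check type: correctable ECC error (retryable)

CIA machine check: vector=0x630 pc=0xfffffc00008556f4 code=0x86
machine check type: correctable ECC error (retryable)
"""

counts, pcs = tally_machine_checks(sample)
for (vector, code), n in counts.items():
    print(f"vector={vector} code={code}: {n} occurrence(s)")
print("distinct pc values:", ", ".join(pcs))
```

If every entry shares the same vector and code, as in this excerpt, the log is reporting the same kind of error over and over, not several unrelated faults.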


I pressed the reset button to try to reboot...

. start booting
...
probing PCI-to-EISA bridge, bus 1
probing PCI-to-PCI bridge, bus 2
bus 2, slot 0 -- pka -- QLogic ISP1020
bus 0, slot 11 -- ewa -- DECchip 21140-AA
ed.ec.eb..

But then I got an endless series of error messages such as:

Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF00010500CF
FILL_SYN: 0000000000000094 ISR: 0000000100000000

Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF0001041D8F
FILL_SYN: 0000000000000094 ISR: 0000000100000000

... etc

The boot never completed, because the above error kept repeating...
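
To see at a glance which register values stay constant and which change between dumps, a small Python sketch like the following may help. It is an illustration only: the function name `tally_registers` and the `sample` text are assumptions, and the script does not decode the registers or map addresses to memory banks.

```python
import re
from collections import Counter

# Matches "NAME: VALUE" pairs such as "EI_ADDR: FFFFFF00010500CF",
# where VALUE is a 16-digit hexadecimal number.
REG_RE = re.compile(r"\b([A-Z_]+):\s*([0-9A-Fa-f]{16})\b")


def tally_registers(log_text):
    """Return {register name: Counter of the values observed for it}."""
    seen = {}
    for name, value in REG_RE.findall(log_text):
        seen.setdefault(name, Counter())[value.upper()] += 1
    return seen


# Sample text copied from the console output quoted above.
sample = """\
Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF00010500CF
FILL_SYN: 0000000000000094 ISR: 0000000100000000

Processor correctable error through vector 00000063.

EI_STAT: FFFFFFF0C5FFFFFF EI_ADDR: FFFFFF0001041D8F
FILL_SYN: 0000000000000094 ISR: 0000000100000000
"""

registers = tally_registers(sample)
for name, values in registers.items():
    print(name, dict(values))
```

In the two dumps quoted here, EI_STAT, FILL_SYN and ISR are identical while EI_ADDR differs. What those values mean has to come from the processor and system documentation.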


What advice could you give me?

I suppose I have a memory problem?

How can I find out which RAM chips are failing?

Or should I replace the CPU board?

The system is an AlphaServer 1000A running Red Hat Linux 7.2.

Thanks, Geert

Re: AlphaServer 1000A has CIA machine check ECC errors and fails to reboot

After a couple of days of searching and checking every hardware component I could, I found that one SIMM module in bank 0 was indeed dead. I removed one bank of memory and the system is alive again.

It is strange that the error messages are so cryptic.

In the end I learned a lot about Alpha hardware: repair, setup, and interpreting diagnostics.