Integrity Servers

rx2600 with cache problem, won't boot

 
Unixnerd
Occasional Contributor

Let me start by saying that I have about 20 years' experience with PA-RISC workstations, but this is the first Itanium server I've touched.

I'm getting the following errors:

Alert Level 3: Warning
Keyword: MEM_CACHE_LINE_0_WR_RD_FAILED
Memory ECC normal write/read test failed

Alert Level 3: Warning
Keyword: BOOT_CPU_LATE_TEST_FAIL
CPU failed late selftest

Alert Level 3: Warning
Keyword: BOOT_CPU_CONFIG_FAIL
CPU struct init failed
Logged by: System Firmware 1

When I then try to run a CO command from the monitor to start a console on the main processors, it fails (no shock there!) and hangs until I press Ctrl-B to return to the monitor CPU.

I've tried physically swapping the two 1.06GHz CPUs over, but it makes no difference. Is this referring to an external cache, I wonder?

The machine beeps twice on power up, but I don't think that's related.

Could this actually be a problem with the main (12GB) RAM? I've reseated all of it.

Thanks in advance :-)
Who needs a life when you've got Unix?
3 REPLIES
Sameer_Nirmal
Honored Contributor
Solution

Re: rx2600 with cache problem, won't boot

The problem appears to be with one of the DIMMs of the main memory.

Here is the OS event detail, which describes the same event that SFW/MP report in their own format.

Event 299

* Severity: MAJOR_WARNING
* Event Summary: Memory ECC normal write/read test failed
* Event Class: System
* Problem Description:
After FW's first access to main memory, FW detected that the CEC logged an error after reading back what was just written.
* Cause / Action:
C: The DIMM that maps to cache line 0 is in a chipspare condition
A: Contact HP support
C: The DIMM that maps to address 0 is not seated properly
A: Check all of the DIMMs in the system and make sure that they are inserted fully into the slot with the retention mechanism in place
C: System may be running at the wrong frequency.
A: Verify the system bus frequency and the memory bus frequency.
* Automated Recovery: None
* Event Generation Threshold: 1 occurrence

The DIMM problem caused the CPU late selftest to fail, which in turn caused the CPU to be "deconfigured".
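
The Cause / Action field in the event above pairs each possible cause (C:) with a suggested action (A:). If you collect these events as raw text, the pairs may arrive as a single run. Here is a minimal Python sketch for splitting them apart. The function name and the regular expression are my own illustration, not part of any HP tool, and the pattern assumes that "C:" and "A:" never appear inside the descriptions themselves.

```python
import re
from typing import List, Tuple

# Assumed format: "C: <cause> A: <action>" repeated, separated by whitespace.
PAIR = re.compile(r"C:\s*(.*?)\s*A:\s*(.*?)(?=\s+C:|\s*$)", re.S)


def split_cause_action(text: str) -> List[Tuple[str, str]]:
    """Return (cause, action) tuples found in a Cause / Action string."""
    return [(cause.strip(), action.strip()) for cause, action in PAIR.findall(text)]


# Cause / Action text from Event 299 as quoted in this thread.
EVENT_299 = (
    "C: The DIMM that maps to cache line 0 is in a chipspare condition "
    "A: Contact HP support "
    "C: The DIMM that maps to address 0 is not seated properly "
    "A: Check all of the DIMMs in the system and make sure that they are "
    "inserted fully into the slot with the retention mechanism in place "
    "C: System may be running at the wrong frequency. "
    "A: Verify the system bus frequency and the memory bus frequency."
)

if __name__ == "__main__":
    for cause, action in split_cause_action(EVENT_299):
        print("Cause: ", cause)
        print("Action:", action)
        print()
```

Note that all three causes listed for this event point at memory or bus configuration, not at the CPU cache.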
Sandy Chen
Honored Contributor

Re: rx2600 with cache problem, won't boot

Hi,

Has the 12GB of memory just been added? Perhaps you could try a minimum memory configuration, just one quad, to confirm that the memory is the culprit.

Regards,
Sandy
I never think of the future. It comes soon enough.
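
The one-quad-at-a-time approach suggested above can be sketched as a simple bookkeeping loop. This is only an illustration: the DIMM labels are hypothetical, and `boots_ok` stands in for the manual step of installing just that quad (in whatever slots the server's loading rules require), powering on, and noting whether the memory test passes.

```python
from typing import Callable, List, Sequence


def find_bad_quads(
    dimms: Sequence[str],
    boots_ok: Callable[[Sequence[str]], bool],
    quad_size: int = 4,
) -> List[List[str]]:
    """Group DIMMs into quads and return every quad that fails to boot alone."""
    if len(dimms) % quad_size != 0:
        raise ValueError("DIMM count must be a multiple of the quad size")
    # Split the DIMM list into consecutive groups of quad_size.
    quads = [list(dimms[i:i + quad_size]) for i in range(0, len(dimms), quad_size)]
    # Test each quad on its own; keep the ones that fail.
    return [quad for quad in quads if not boots_ok(quad)]


if __name__ == "__main__":
    # Hypothetical labels for twelve DIMMs; "D5" plays the faulty one.
    labels = [f"D{n}" for n in range(12)]
    suspects = find_bad_quads(labels, lambda quad: "D5" not in quad)
    print(suspects)
```

Once a failing quad is identified, swapping its DIMMs one at a time into a known-good quad narrows it down to the single bad module.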
Unixnerd
Occasional Contributor

Re: rx2600 with cache problem, won't boot

Thanks guys, it was a bad DIMM! When the error message said CACHE I just assumed it was the CPU cache.

Can't thank you enough.
Who needs a life when you've got Unix?