
EDAC i5000 MC0: NON-FATAL ERRORS Found!!!

Endre Ligeti
Visitor

Hello!

I administer a ProLiant ML350 G5 server running Debian Lenny AMD64 (Linux filerpri 2.6.26-2-amd64 #1 SMP Thu Nov 5 02:23:12 UTC 2009 x86_64 GNU/Linux). A few days ago, some strange messages started appearing on the console and in the syslog.

Examples:

Nov 8 06:25:02 filerpri kernel: [160849.062055] EDAC i5000 MC0: NON-FATAL ERRORS Found!!! 1st NON-FATAL Err Reg= 0x2000
Nov 8 06:25:02 filerpri kernel: [160849.062108] EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=6421 CAS=112, CE Err=0x2000)

Nov 8 07:07:22 filerpri kernel: [163491.047139] EDAC i5000 MC0: NON-FATAL ERRORS Found!!! 1st NON-FATAL Err Reg= 0x10000
Nov 8 07:07:22 filerpri kernel: [163491.047187] EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=6144 CAS=0, CE Err=0x10000)

The console is totally unusable because of them: a new message appears every 1-3 seconds or so, each with different RAS/CAS values. The "Reg" value is mostly 0x2000, but sometimes it is 0x10000.
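
To check whether the errors always point at the same location, the "CE" lines can be tallied by memory controller, row, channel, bank and error register. This is only a rough sketch: it assumes every message has exactly the format of the excerpts above, and the names `CE_LINE`, `tally_ce` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Matches the "EDAC MCn: CE row ..." lines as they appear in the excerpts above.
# Assumption: all messages follow this exact layout.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row (?P<row>\d+), channel (?P<channel>\d+),"
    r".*?DRAM-Bank=(?P<bank>\d+).*?CE Err=(?P<err>0x[0-9a-fA-F]+)"
)


def tally_ce(lines):
    """Count corrected-error lines per (mc, row, channel, bank, err register)."""
    counts = Counter()
    for line in lines:
        m = CE_LINE.search(line)
        if m:
            key = (
                int(m["mc"]),
                int(m["row"]),
                int(m["channel"]),
                int(m["bank"]),
                m["err"],
            )
            counts[key] += 1
    return counts


# The two "CE" lines quoted in this post, used as sample input.
SAMPLE = [
    'Nov 8 06:25:02 filerpri kernel: [160849.062108] EDAC MC0: CE row 0, '
    'channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=6421 CAS=112, '
    'CE Err=0x2000)',
    'Nov 8 07:07:22 filerpri kernel: [163491.047187] EDAC MC0: CE row 0, '
    'channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=6144 CAS=0, '
    'CE Err=0x10000)',
]

for (mc, row, channel, bank, err), n in sorted(tally_ce(SAMPLE).items()):
    print(f"MC{mc} row {row} channel {channel} bank {bank} err {err}: {n}")
```

To run it on the real log, pass an open file instead of `SAMPLE`, for example `tally_ce(open(path))` where `path` is wherever syslog writes kernel messages on the machine.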

I did some research on these messages but found no definite solution. Is it a hardware (memory) failure or just an annoying kernel bug? How can I fix the cause, or at least get rid of the messages?

Thanks for any help!

Sincerely,
Endre