ProLiant Servers (ML,DL,SL)

What is the threshold for declaring memory failed?

Our ProLiants (we have mostly DL380 G3s), unlike some other servers we have, have a nice Boolean idea of when memory is bad: a message is logged to the IML, a fault light goes on next to the DIMM, etc. Beautiful.

Does anyone know what exactly is the condition that triggers this decision? That is: how bad is bad?
2 REPLIES
Prashant (I am Back)
Honored Contributor

Re: What is the threshold for declaring memory failed?

Hi,

(I)
There is no threshold information as such for this. But if the server is reporting an uncorrectable memory error message, HP will replace the DIMM.

(II)
Check the management web interface at http://servername:2301
There you can check the threshold limits.

(III)
If it is a corrected memory error message, then the latest management drivers need to be loaded on the server.

Regards,
Prashant S.
Nothing is impossible

Re: What is the threshold for declaring memory failed?

> Check the management web interface at
> http://servername:2301

I don't use web-based management, but my understanding is that all values available in the web interface are available via SNMP. Perhaps the OID you are referring to is cpqHeCorrMemErrorCntThresh, in the CPQHLTH MIB.
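
For anyone who wants to read that value without the web interface, here is a rough Python sketch that shells out to net-snmp's `snmpget`. Several things in it are assumptions rather than facts from this thread: the MIB module name `CPQHLTH-MIB` (the HP MIB file must be installed for the symbolic name to resolve), the `.0` instance suffix (which assumes the object is a scalar), SNMP v2c, and the `public` community string. The function names are made up for illustration.

```python
import subprocess

# Assumption: symbolic name resolves only if the HP health MIB is installed,
# and the object is a scalar (hence the ".0" instance suffix).
OID = "CPQHLTH-MIB::cpqHeCorrMemErrorCntThresh.0"


def build_snmpget_command(host: str, community: str = "public") -> list[str]:
    """Build a net-snmp `snmpget` command line for the threshold OID."""
    return [
        "snmpget",
        "-v", "2c",       # SNMP version (assumed; the agent may need "1")
        "-c", community,  # read community string (hypothetical default)
        "-Oqv",           # print only the value, no OID or type prefix
        host,
        OID,
    ]


def read_threshold(host: str, community: str = "public") -> int:
    """Run snmpget against the host and parse the integer it prints."""
    result = subprocess.run(
        build_snmpget_command(host, community),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if snmpget fails
    )
    return int(result.stdout.strip())


# Show the command that would be run; no network access happens here.
print(" ".join(build_snmpget_command("servername")))
```

Calling `read_threshold("servername")` would then return the configured threshold, provided the SNMP agent on the server answers and the assumptions above hold.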

This is set to 5. Meaning, I suppose, that if a given DIMM sees more than 5 errors, it is considered failed, *regardless of how much time elapses between the errors*?

And when is the counter reset? On every soft boot? On every power-cycle?
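
To make the interpretation in question concrete, here is a minimal Python sketch of it. It is only a model of the question, not HP's actual logic: the class and method names are hypothetical, and the "more than 5, with no time window" rule is exactly the guess that needs confirming.

```python
from dataclasses import dataclass

THRESHOLD = 5  # the cpqHeCorrMemErrorCntThresh value mentioned above


@dataclass
class DimmErrorCounter:
    """Hypothetical model: one correctable-error counter per DIMM."""

    threshold: int = THRESHOLD
    count: int = 0

    def record_error(self) -> bool:
        """Count one corrected error; return True once the DIMM would be flagged."""
        self.count += 1
        # Guessed rule: only the running total matters, not the time between errors.
        return self.count > self.threshold

    def reset(self) -> None:
        """Clear the counter. What triggers this (soft boot? power cycle?) is unknown."""
        self.count = 0


dimm = DimmErrorCounter()
flags = [dimm.record_error() for _ in range(6)]
print(flags)  # [False, False, False, False, False, True]
```

Under this reading, six errors spread over a year would flag the DIMM just as six errors in one minute would, unless something calls `reset()` in between.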