
hpasmcli, Where does it get its info from?

GSmith1011
Occasional Visitor


Hi there,

Can someone answer this for me please...

On a DL585 server running Red Hat, hpasmcli shows a DIMM as "degraded". If I move the DIMM to another slot, it still shows as degraded, so it looks like there must be a file somewhere containing the DIMM's serial number or something similar. If I then put this DIMM in another server, all is fine. So where does hpasmcli get its info from, please?
2 REPLIES
Matti_Kurkela
Honored Contributor

Re: hpasmcli, Where does it get its info from?

It's not a "file" as such.

The DL585 uses ECC (error-correcting) memory. The ECC feature on the DIMMs is complemented by memory error counters on the system board. When hpasmcli shows the DIMM as "degraded", it means a non-negligible number of memory errors have been counted on that DIMM since system startup. As the system has not crashed, all of the errors have been corrected in real time by the ECC, but it still indicates the memory is not working quite up to specification.
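
To keep an eye on which modules are flagged, the status can be pulled out of the text that `hpasmcli -s "show dimm"` prints. Below is a minimal Python sketch. It assumes the report is a series of "Key: value" blocks separated by blank lines, each with a "Status:" field. The sample text and the function names are illustrative only, not captured from a real DL585, and the exact field names may differ between hp-health versions.

```python
# Illustrative sample only: the layout is an assumption about what
# `hpasmcli -s "show dimm"` prints, not output captured from a real server.
SAMPLE_REPORT = """\
DIMM Configuration
------------------
Cartridge #:   1
Module #:      1
Present:       Yes
Size:          2048 MB
Status:        Ok

Cartridge #:   1
Module #:      2
Present:       Yes
Size:          2048 MB
Status:        Degraded
"""


def parse_dimm_report(text):
    """Split a 'show dimm'-style report into one dict per module."""
    modules = []
    current = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current module block.
            if current:
                modules.append(current)
                current = {}
            continue
        if ":" not in line:
            # Headings and separator rows carry no key/value pair.
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        modules.append(current)
    # Keep only blocks that actually describe a module.
    return [m for m in modules if "Status" in m]


def flagged_modules(modules):
    """Return the modules whose status is anything other than 'Ok'."""
    return [m for m in modules if m["Status"].lower() != "ok"]


for module in flagged_modules(parse_dimm_report(SAMPLE_REPORT)):
    print(module["Cartridge #"], module["Module #"], module["Status"])
```

On a live system the same functions could be fed the captured output of the command (for example via `subprocess.run`) instead of the sample string.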

When you move the DIMM to another server, is that server using the same memory bus speed as the DL585? It might be that the DIMM works fine when run at a lower speed.

The DIMM also has a Serial Presence Detect (SPD) chip - a tiny EEPROM containing the DIMM's serial number, speed rating and other information. The lm_sensors package on Linux has a script, "decode-dimms.pl" (newer distributions ship it in the i2c-tools package as "decode-dimms"), which you can use on some systems to take a peek at the SPD EEPROMs if you're curious.
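
To illustrate what "serial number in the SPD" means in practice, here is a rough Python sketch that picks the identity fields out of a raw SPD dump. It assumes the DDR/DDR2 SPD layout (part number in bytes 73-90, serial number in bytes 95-98); DDR3 and later use different offsets. The function name and the synthetic dump are made up for the example, and how you obtain the raw bytes depends on your kernel and I2C driver.

```python
# Offsets below assume the DDR/DDR2 SPD layout; other memory generations differ.
PART_NUMBER = slice(73, 91)    # 18 ASCII characters, space or NUL padded
SERIAL_NUMBER = slice(95, 99)  # 4 bytes, vendor-defined encoding


def decode_spd_identity(spd):
    """Extract part number and serial number from raw SPD bytes."""
    if len(spd) < 99:
        raise ValueError("need at least 99 bytes of SPD data")
    part = bytes(spd[PART_NUMBER]).decode("ascii", errors="replace")
    serial = bytes(spd[SERIAL_NUMBER]).hex().upper()
    return {
        "part_number": part.rstrip("\x00 "),
        "serial_number": serial,
    }


# Synthetic 128-byte dump with hypothetical values, for demonstration only.
dump = bytearray(128)
dump[PART_NUMBER] = b"HYPOTHETICAL-PN".ljust(18)
dump[SERIAL_NUMBER] = bytes([0x12, 0x34, 0xAB, 0xCD])

print(decode_spd_identity(dump))
```

A serial number read this way stays with the module, so it would look the same no matter which slot or server the DIMM is plugged into.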

It is certainly *possible* that the server tracks the serial numbers of DIMMs that have been detected as "degraded", so that the fault indication is displayed consistently even though the DIMM actually produces errors only very rarely. This tracking would probably use the hardware error log, the Integrated Management Log (IML), as anything as complex as writing a file to a disk might endanger data integrity if the system has uncorrectable memory errors.

(If the instructions that handle the writing were themselves corrupted by the memory errors, you would get disk corruption on top of the failing memory issue - and you definitely would not be happy.)

A complete and accurate answer would be available only from someone who knows the internals of the DL585 system board, i.e. an HP engineer.

MK
GSmith1011
Occasional Visitor

Re: hpasmcli, Where does it get its info from?

Hi MK,

Thank you for your reply. That has given me a path to look into to see if I can resolve this issue.

Thank you again

Garry