Operating System - HP-UX

How do I get detailed cstm info for memory errors?

 
Tony Williams
Regular Advisor

In the CLI STM I can see memory errors:
cstm
map
sel dev 2
il

I want to see detailed info on the errors. Specifically, I want the date and time stamps, which are not available in the infolog. What command do I run to get the detailed log data?
Torsten.
Acclaimed Contributor

Re: How do I get detailed cstm info for memory errors?

Don't know what your device 2 is, but let's start with

echo "selclass qualifier memory;info;wait;infolog" | /usr/sbin/cstm

Post the output, then we will proceed with more details.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Tim Nelson
Honored Contributor

Re: How do I get detailed cstm info for memory errors?

A graphical version of STM is available as xstm: right-click the memory icon, then select info.

In some cases additional information is available if you install a license password. The password can be obtained from the ITRC site.
Torsten.
Acclaimed Contributor

Re: How do I get detailed cstm info for memory errors?

Do you have any problems with the memory?

You may have, if you get results other than this:

...
Memory Error Log Summary

The memory error log is empty.

Page Deallocation Table (PDT)

The Page Deallocation Table is empty.

PDT Entries Used: 0
PDT Entries Free: 100
PDT Total Size: 100
...


Please come back with your results.

Hope this helps!
Regards
Torsten.

Tony Williams
Regular Advisor

Re: How do I get detailed cstm info for memory errors?

I have attached two files. The first, "cstm_mem_infolog.txt", contains the output from:
echo "selclass qualifier memory;info;wait;infolog" | /usr/sbin/cstm

The second is "mem_detail.txt", and it is the output in this file that I want to reproduce:

==========================================================================
DIMM Location: DIMM 1B
Error Type: Single-bit error
Page Status: Pending - Page is reserved by OS and is not obtainable
First Detected: Wed Jul 2 21:37:46 2008
Last Detected: Wed Jul 2 21:37:46 2008
Error Count: 2
Error Addr: 0x3ff947f00
==========================================================================
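
If a report like this has been saved to a text file, the date and time stamps can be pulled out with a few lines of Python. This is only a sketch: it assumes each record uses the "Field: value" layout shown above and that timestamps look like "Wed Jul 2 21:37:46 2008". The name parse_record is a hypothetical helper, not part of STM.

```python
from datetime import datetime

# Timestamp layout assumed from the sample record, e.g. "Wed Jul 2 21:37:46 2008".
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_record(text):
    """Turn one 'Field: value' record into a dict, parsing the two timestamps."""
    record = {}
    for line in text.splitlines():
        # Skip the "=====" separators and any line without a field name.
        if ":" not in line or line.startswith("="):
            continue
        key, value = line.split(":", 1)  # split once; timestamps contain colons
        record[key.strip()] = value.strip()
    for field in ("First Detected", "Last Detected"):
        if field in record:
            record[field] = datetime.strptime(record[field], TIME_FORMAT)
    return record

# The record quoted in the post above.
sample = """\
DIMM Location: DIMM 1B
Error Type: Single-bit error
Page Status: Pending - Page is reserved by OS and is not obtainable
First Detected: Wed Jul 2 21:37:46 2008
Last Detected: Wed Jul 2 21:37:46 2008
Error Count: 2
Error Addr: 0x3ff947f00
"""

rec = parse_record(sample)
print(rec["DIMM Location"], rec["First Detected"].isoformat(), rec["Error Count"])
```

Every value except the two timestamps is left as a string, so the error count and address would still need converting before any arithmetic.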
Tony Williams
Regular Advisor

Re: How do I get detailed cstm info for memory errors?

The 2nd file
Torsten.
Acclaimed Contributor

Re: How do I get detailed cstm info for memory errors?

There are a lot of entries, but you will very often find this line:

Page Status: Pending - Page is reserved by OS and is not obtainable

The diagnostics want to reserve this memory area for testing but cannot, because the area is in use. The system will keep logging the same error again and again.

Try rebooting the server; this lets the diagnostics reserve the suspect areas. If you still get errors, consider replacing the DIMM.

Hope this helps!
Regards
Torsten.

Tony Williams
Regular Advisor

Re: How do I get detailed cstm info for memory errors?

The command option I was looking for is the "vd" (view detail) option in logtool. The vd option gives the date, time, and location of the memory errors.

Logtool Utility>vd
Formatting of the memory error log is in progress.

-- Converting a (651368) byte raw log file to text. --
Preparing the Logtool Utility: View Memory Report File ...

-- Logtool Utility: View Memory Report --

System Start Time: Mon Jun 30 09:19:00 2008

Last Error Detected Time: Thu Jul 17 12:34:52 2008

Logging Time Interval: 600 Seconds

==========================================================================
DIMM Location: Ext 1 DIMM 1C
Error Type: Single-bit error
Page Status: Active - Page is active and can be used
First Detected: Thu Jul 17 12:34:52 2008
Last Detected: Thu Jul 17 12:34:52 2008
Error Count: 1
Error Addr: 0x498adb380
==========================================================================
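
With a long report it can help to total the errors per DIMM. Here is a rough Python sketch, assuming the vd output was saved as text and every record has a "DIMM Location:" line followed later by an "Error Count:" line, as in the sample above. The name errors_per_dimm is hypothetical, and the input is just the two records quoted in this thread.

```python
import re
from collections import Counter

# Matches one record's DIMM location and its error count.
# Assumes "DIMM Location:" comes before "Error Count:" within each record.
RECORD = re.compile(
    r"DIMM Location:[ \t]*([^\n]+?)[ \t]*\n.*?Error Count:[ \t]*(\d+)",
    re.DOTALL,
)

def errors_per_dimm(report):
    """Sum the Error Count values for each DIMM location in the report text."""
    totals = Counter()
    for dimm, count in RECORD.findall(report):
        totals[dimm] += int(count)
    return totals

report = """\
==========================================================================
DIMM Location: Ext 1 DIMM 1C
Error Type: Single-bit error
Page Status: Active - Page is active and can be used
First Detected: Thu Jul 17 12:34:52 2008
Last Detected: Thu Jul 17 12:34:52 2008
Error Count: 1
Error Addr: 0x498adb380
==========================================================================
DIMM Location: DIMM 1B
Error Type: Single-bit error
Page Status: Pending - Page is reserved by OS and is not obtainable
First Detected: Wed Jul 2 21:37:46 2008
Last Detected: Wed Jul 2 21:37:46 2008
Error Count: 2
Error Addr: 0x3ff947f00
==========================================================================
"""

# List DIMMs from most to fewest logged errors.
for dimm, total in errors_per_dimm(report).most_common():
    print(dimm, total)
```

A record missing its "Error Count:" line would be merged with the next one by this pattern, so the totals are only as reliable as the layout assumption.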
Torsten.
Acclaimed Contributor

Re: How do I get detailed cstm info for memory errors?

OK, you are already in logtool.
That is what I would have proceeded with ...

A single single-bit error is nothing to worry about; this just happens.

Hope this helps!
Regards
Torsten.

Tony Williams
Regular Advisor

Re: How do I get detailed cstm info for memory errors?

Sorry, I abbreviated the log file. There are actually close to 3000 errors, mostly on one DIMM, across close to 3000 different addresses, but there are no PDT entries. What does that suggest?

An EMS event has also not been registered. Do you know what the threshold is for memory errors to generate an EMS event?

System start: Mon Jun 30 09:19:00 2008.
Last error detected: Thu Jul 17 12:04:52 2008.
Logging interval: 600 seconds.
2943 address(es) with errors logged in memory error log.

The Logtool Utility provides full details about the memory error log.

Page Deallocation Table (PDT)

The Page Deallocation Table is empty.

PDT Entries Used: 0
PDT Entries Free: 100
PDT Total Size: 100
Torsten.
Acclaimed Contributor

Re: How do I get detailed cstm info for memory errors?

A block goes into the PDT only after the diagnostics have tested a suspect memory area and found it bad. That's why I suggested the reboot: the diagnostics cannot test "in use" areas of memory.

Hope this helps!
Regards
Torsten.

Tony Williams
Regular Advisor

Re: How do I get detailed cstm info for memory errors?

Thanks everyone for all of your help.