
Single bit error (SBE) event

 
mark.M
Frequent Advisor


The recommended action is to contact your HP support representative to check the memory boards. Really?
If the single-bit error was corrected, why is that necessary?
How should I handle this?




>------------ Event Monitoring Service Event Notification ------------<

Notification Time: Tue Aug 9 18:07:44 2005

server sent Event Monitor notification information:

/system/events/memory/49 is >= 4.
Its current value is CRITICAL(5).



Event data from monitor:

Event Time..........: Tue Aug 9 18:07:43 2005
Severity............: CRITICAL
Monitor.............: dm_memory
Event #.............: 4500
System..............: lion01

Summary:
Memory Event Type : Single bit error (SBE) event. A correctable single
bit error has been detected and logged.


Description of Error:

The memory component:

Cell/Node: 0
MC/EXT: 0
DIMM: 0b

is experiencing an excessive number of single bit errors.

Probable Cause / Recommended Action:

Although the single bit errors are being corrected, it is strongly advisable
to evaluate whether any memory replacement is warranted at this time. This
condition indicates a potential problem.

Additional Event Data:
System IP Address...: xx.xx.xx.xx
Event Id............: 0x42f8726000000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_memory.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 130
Received within...: 7 day(s)
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800
EMS Version.....................: A.03.20
STM Version.....................: A.24.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/dm_memory.htm#4500

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v



Component Data:
Message in ll_msg (set: 50 msg: 10) did not exist in catalog.
Catalog type is MONITOR_INFO
Catalog version is A.01.00
Module name is dm_memory
Message set number is 50
Message number is 10
Message size is 72
Message parameter 1:
Message parameter size = 12
Message parameter is a literal
Literal text is 49
Message in ll_msg (set: 50 msg: 11) did not exist in catalog.
Catalog type is MONITOR_INFO
Catalog version is A.01.00
Module name is dm_memory
Message set number is 50
Message number is 11
Message size is 72
Message parameter 1:
Message parameter size = 12
Message parameter is a literal
Literal text is 20


>---------- End Event Monitoring Service Event Notification ----------<
6 REPLIES
lawrenzo
Trusted Contributor

Re: Single bit error (SBE) event

This indicates a possible memory issue with the hardware.

check cstm:

echo "selclass qualifier memory;info;wait;infolog" | cstm

Also check syslog and dmesg for further errors. If you have support, contact HP and replace the faulty memory.
hello
Mel Burslan
Honored Contributor

Re: Single bit error (SBE) event

This is a no-worries kind of error unless the output of the cstm command from the previous post keeps changing and the error counts keep increasing. If the error count goes up by one once in a blue moon, it really is nothing to worry about. Otherwise, log a support call and include the cstm output so that they can locate the failing module.
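
One way to act on this advice is to save the cstm output periodically and check whether it has changed since the last run. A minimal sketch, assuming the cstm command from the previous post; the snapshot directory, the file names and the function names are made up for illustration. The raw output may contain timestamps, so "changed" only means it is worth reading the diff.

```sh
#!/bin/sh
# Sketch only: keep two snapshots of the cstm memory info and compare them.
# SNAP_DIR, current.txt and previous.txt are hypothetical names.
SNAP_DIR=${SNAP_DIR:-/var/tmp/sbe_snapshots}

# Command quoted from the earlier reply. Override CSTM_CMD to try the
# script on a machine without cstm.
DEFAULT_CMD='echo "selclass qualifier memory;info;wait;infolog" | cstm'
CSTM_CMD=${CSTM_CMD:-$DEFAULT_CMD}

take_snapshot() {
    mkdir -p "$SNAP_DIR" || return 1
    # Keep the previous snapshot so the two can be compared.
    if [ -f "$SNAP_DIR/current.txt" ]; then
        mv "$SNAP_DIR/current.txt" "$SNAP_DIR/previous.txt"
    fi
    sh -c "$CSTM_CMD" > "$SNAP_DIR/current.txt" 2>&1
}

compare_snapshots() {
    if [ ! -f "$SNAP_DIR/previous.txt" ]; then
        echo "first snapshot"
    elif cmp -s "$SNAP_DIR/previous.txt" "$SNAP_DIR/current.txt"; then
        echo "unchanged"
    else
        echo "changed"
    fi
}
```

Call take_snapshot from cron (say, once a day) and then compare_snapshots. If it prints "changed", diff previous.txt against current.txt to see which counts moved.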
________________________________
UNIX because I majored in cryptology...
Joseph Loo
Honored Contributor

Re: Single bit error (SBE) event

hi,

Not to worry, unless the number of errors reported for the memory runs into the thousands. Refer to "Memory Error Log Summary" in stm for info.

For more info:

http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000080081538

Sometimes a reboot of the system clears this error, as it may have been caused by an application in the system during the memory allocation process. But that does not happen too often.

regards.
what you do not see does not mean you should not believe
Sreedhar Nathani
Valued Contributor

Re: Single bit error (SBE) event

Hi,

SBEs (single-bit errors) are normally corrected by the hardware.

This means that there was a problem when accessing a particular memory address; the OS marks that address as bad and uses a different one.

When an SBE is encountered on a memory module, the OS releases the affected address and records it in the PDT (Page Deallocation Table) so that the address won't be used in the future. If for some reason the system cannot deallocate the address and another process uses it again, you will get the EMS message.

When you reboot the server, the address that the OS could not release is recorded in the PDT, so the OS won't use it in the future.

If you are seeing these messages for the same memory module and the same memory address, rebooting the system will resolve the problem.

If you are seeing this message for the same module but different addresses, you may need to plan to replace the DIMM.

You can find out whether the SBEs are occurring at the same address with these commands:
#
cstm>ru lo
logtool>VD ** view the entire memory log
logtool>VDA **View errors pertaining to deallocated memory errors
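
The same-address versus different-address rule can be sketched as a small filter. It assumes you have already copied the DIMM and address of each logged SBE out of the logtool output by hand, one per line as "<dimm> <address>". That input format and the function name are hypothetical, not something logtool produces.

```sh
#!/bin/sh
# Sketch only. stdin: one logged SBE per line, "<dimm> <address>"
# (a hand-made, hypothetical format).
classify_sbe() {
    awk '
        NF >= 2 {
            key = $1 SUBSEP $2
            # Count each (dimm, address) pair once per DIMM.
            if (!(key in seen)) { seen[key] = 1; distinct[$1]++ }
            total[$1]++
        }
        END {
            for (d in total) {
                if (distinct[d] > 1)
                    printf "%s: multiple addresses, plan DIMM replacement\n", d
                else if (total[d] > 1)
                    printf "%s: same address repeated, try a reboot\n", d
                else
                    printf "%s: single event, keep watching\n", d
            }
        }
    ' | sort
}
```

For example, `classify_sbe < sbe_list.txt`, where sbe_list.txt is whatever file you collected the pairs in.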

Note: A DBE (double-bit error) occurs when two bits in the same memory word are bad. It cannot be corrected and will cause the system to HPMC.

Hope this helps
Rick Garland
Honored Contributor

Re: Single bit error (SBE) event

Single-bit errors are correctable memory errors.

Keep track of these. If you get many in a short time, place a service call.
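
The notification itself already says how many events arrived in what window ("Number of events" and "Received within"), so a rough events-per-day figure can be pulled out of a saved copy. A sketch, assuming the field labels look as they do in the notification quoted in the question; the function name is made up, and what rate counts as "many" is left to your judgment.

```sh
#!/bin/sh
# Sketch only. stdin: the text of an EMS notification.
# Prints events per day, or "unknown" if the fields are missing.
sbe_rate() {
    awk -F': *' '
        /Number of events/ { events = $2 + 0 }
        /Received within/  { days = $2 + 0 }   # "7 day(s)" evaluates to 7
        END {
            if (days > 0) printf "%.1f\n", events / days
            else print "unknown"
        }
    '
}
```

For the notification in the question (130 events within 7 days), `sbe_rate < notification.txt` prints 18.6.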

DCE
Honored Contributor

Re: Single bit error (SBE) event

Just went through this. If the error is a one-time event, it is harmless. If you start getting multiple errors on the same module, it is time to replace it. My original error was 2.5 years ago, then no activity until recently (one more message). The system is still running fine.