EVM ALERT [700]???

ericfjchen
Regular Advisor

EVM ALERT [700]???

I received the following mail from EVM. What's going on with this box?

======================= Binary Error Log event =======================
EVM event name: sys.unix.binlog.hw.correctable_rpt_switch

Binary error log events are posted through the binlogd daemon, and
stored in the binary error log file, /var/adm/binary.errlog. This
event is posted when a high number of correctable errors have been
reported over a short period of time, and the system has
temporarily suppressed or re-enabled reporting. Reporting is
automatically re-enabled five minutes after suppression has
started.

Action: If this event occurs repeatedly contact your service
provider.

======================================================================

Formatted Message:
Correctable error reporting state changed

Event Data Items:
Event Name : sys.unix.binlog.hw.correctable_rpt_switch
Priority : 700
PID : 354
PPID : 1
Event Id : 76
Timestamp : 26-Dec-2004 06:03:59
Host IP address : 10.86.XX.XX
Host Name : node07
User Name : root
Format : Correctable error reporting state changed
Reference : cat:evmexp.cat:300

Variable Items:
subid_class (INT32) = 120
subid_type (INT32) = 0
binlog_event (OPAQUE) = [OPAQUE VALUE: 488 bytes]

============================ Translation =============================
binlogshow: Unable to connect to a Compaq Analyze translation server.
binlogshow: The following Compaq Analyze servers are configured:
localhost
======================================================================

2 REPLIES
Johan Brusche
Honored Contributor

Re: EVM ALERT [700]???


Some ECC memory locations in your system have single-bit errors, but the ECC logic was able to correct the data. Every read from one of these bad memory locations is logged in binary.errlog or crdlog.

To avoid wasting CPU cycles and disk space on this logging, binlogd temporarily suspends logging of this kind of event.

The wsea utility from WEBES can be used to determine which memory SIMM the problem is located in.
The command syntax is:
/usr/sbin/wsea x analyze

Call your local service provider with the information provided by wsea, so that they can send an engineer with the correct parts. (With that many errors, the component is bound to degrade into a non-correctable state.)
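
If you want a rough way to judge whether the event "occurs repeatedly", as the mail text puts it, the mails themselves can be parsed. The following is only a minimal sketch in Python, not part of EVM or WEBES: it reads the "Key : Value" lines from the Event Data Items block shown in the mail above and checks whether several events fall close together. The function names, the 24-hour window and the threshold of 3 are assumptions chosen for illustration, not values taken from any HP documentation.

```python
from datetime import datetime, timedelta

# Excerpt of the "Event Data Items" block from the mail quoted in this thread.
SAMPLE = """\
Event Name : sys.unix.binlog.hw.correctable_rpt_switch
Priority : 700
Timestamp : 26-Dec-2004 06:03:59
Host Name : node07
"""

def parse_event_items(text):
    """Return a dict of the 'Key : Value' pairs found in an EVM mail body."""
    items = {}
    for line in text.splitlines():
        # Split on the first " : " only, so the timestamp keeps its colons.
        key, sep, value = line.partition(" : ")
        if sep:
            items[key.strip()] = value.strip()
    return items

def event_time(items):
    """Parse the Timestamp field, using the format shown in the mail."""
    return datetime.strptime(items["Timestamp"], "%d-%b-%Y %H:%M:%S")

def recurs(timestamps, window=timedelta(hours=24), threshold=3):
    """True if at least `threshold` events fall within one `window`.

    Both defaults are arbitrary illustration values (assumptions).
    """
    ts = sorted(timestamps)
    for i in range(len(ts) - threshold + 1):
        if ts[i + threshold - 1] - ts[i] <= window:
            return True
    return False

if __name__ == "__main__":
    items = parse_event_items(SAMPLE)
    print(items["Host Name"], event_time(items))
```

Feed `recurs()` the timestamps collected from several such mails. A single event, such as one logged at boot, will not trip it, while a cluster of events on the same day will.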

__ Johan.


_JB_
Johan Brusche
Honored Contributor

Re: EVM ALERT [700]???


Note: If this event is registered during system startup, then you do not have to worry about it. ECC error reporting is delayed on purpose during that time.

__ Johan.

_JB_