08-22-2005 12:36 AM
EMS Event Notification (memory)
I have the following notifications in the syslog file:
Aug 18 01:00:12 CENTRAL1 EMS [2520]: ------ EMS Event Notification ------ Value: "MAJORWARNING (3)" for Resource: "/system/events/memory/0_5" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 165150722 -r /system/events/memory/0_5 -n 165150721 -a
Aug 21 03:41:44 CENTRAL1 EMS [2520]: ------ EMS Event Notification ------ Value: "SERIOUS (4)" for Resource: "/system/events/memory/0_5" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 165150722 -r /system/events/memory/0_5 -n 165150723 -a
Aug 22 08:26:17 CENTRAL1 EMS [2520]: ------ EMS Event Notification ------ Value: "MAJORWARNING (3)" for Resource: "/system/events/memory/0_5" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 165150722 -r /system/events/memory/0_5 -n 165150724 -a
When I run the command /opt/resmon/bin/resdata -R 1651 ... it gives:
ARCHIVED MONITOR DATA:
Event Time..........: Sun Aug 21 03:41:44 2005
Severity............: SERIOUS
Monitor.............: dm_memory
Event #.............: 4400
System..............: CENTRAL1
Summary:
Memory Event Type : Single bit error (SBE) event. A correctable single
bit error has been detected and logged.
Description of Error:
The memory component:
Cab/Cell or Node: 0/1
MC/EXT: 0
DIMM: 0C
is experiencing a high rate of correctable single bit errors on a
single component.
Probable Cause / Recommended Action:
Although the single bit errors are being corrected, it is advisable to
closely monitor the situation. If an excessive rate of single bit errors
occur, an event with higher severity will be generated.
Additional Event Data:
System IP Address...: 172.20.0.101
Event Id............: 0x4307e9e800000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_memory.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 100
Received within...: 7 day(s)
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800/S16K-A
EMS Version.....................: A.04.00
STM Version.....................: A.41.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/dm_memory.htm#4400
v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v
Component Data:
Physical Device Path....: 0/5
Tag 2...................: 20
My questions are:
- Should I replace the memory module?
- Why is the EMS notification sometimes "MAJORWARNING (3)" and sometimes "SERIOUS (4)"?
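Each notification in the syslog excerpt above ends with the resdata command that retrieves its details. As a minimal sketch, assuming every notification sits on a single syslog line as shown, the commands can be pulled out for review like this. The function name `ems_detail_commands` is made up for illustration; it only prints the commands and does not run them.

```shell
#!/bin/sh
# Print the resdata command embedded in each EMS notification line of a
# syslog file. Assumes one notification per line, ending in "-a" as in
# the excerpt above. Lines without a resdata command are skipped.
ems_detail_commands() {
    sed -n 's|.*\(/opt/resmon/bin/resdata .*-a\).*|\1|p' "$1"
}
```

For example, `ems_detail_commands /var/adm/syslog/syslog.log` would list one resdata command per logged notification.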
08-22-2005 12:41 AM
Re: EMS Event Notification (memory)
Not sure why you get different warning levels.
08-22-2005 12:59 AM
Re: EMS Event Notification (memory)
You should replace at least one of the modules listed here, because the notification is repeating very frequently. The severity is SERIOUS or MAJORWARNING depending on the number of errors reported within a given time. The EMS event whose details are shown here is a MAJORWARNING, and it occurred 100 times in 7 days. For the events reported as SERIOUS, you would find a higher count.
It is advisable to replace the part listed here.
HTH,
Devender
08-22-2005 02:46 AM
Re: EMS Event Notification (memory)
Dave
08-22-2005 02:56 AM
Re: EMS Event Notification (memory)
You're getting errors almost every day...I think you should plan memory replacement, or risk an unplanned server crash...
Enjoy :)
Pedro
08-22-2005 02:58 AM
Re: EMS Event Notification (memory)
echo 'selclass qualifier memory;info;wait;infolog' | cstm >/tmp/meminfo.`date +%m%d%H%M`
Run this from cron hourly, or at whatever interval you choose. The report will contain a section like the following (yours may vary depending on the number of memory modules and their sizes, but the layout will be the same):
Memory Error Log Summary

Ext/DIMM      Error Address       Error Type  Page       Error Count
------------  ------------------  ----------  ---------  -----------
EXT0/2b       0x00000000632a9681  Single-Bit  0x00632a9  1
EXT0/0a       0x000000007088c5c1  Single-Bit  0x007088c  12
EXT0/0a       0x00000000708cc5c1  Single-Bit  0x00708cc  2
EXT0/0b       0x000000002ccfd801  Single-Bit  0x002ccfd  1
EXT0/3a       0x0000000004bcfd01  Single-Bit  0x0004bcf  1
EXT0/2b       0x000000006767e281  Single-Bit  0x006767e  2
EXT0/0b       0x00000000510bc001  Single-Bit  0x00510bc  1
EXT0/1b       0x000000003758db81  Single-Bit  0x003758d  1
EXT0/1b       0x000000004198d101  Single-Bit  0x004198d  5
EXT0/1b       0x00000000419c1101  Single-Bit  0x00419c1  1
EXT0/3b       0x00000001853b9041  Single-Bit  0x01853b9  1
EXT0/3b       0x0000000009b9a5c1  Single-Bit  0x0009b9a  7
EXT0/2b       0x000000004367e1c1  Single-Bit  0x004367e  1
System start: Fri Nov 14 16:38:45 2003.
Last error check: Mon Aug 22 07:31:37 2005.
The important information for you is in the last column of each line, which is the error count for that module. If the count for one or more modules keeps increasing, I would place a hardware service call to get them replaced. Otherwise, one or two counts every few days is not something to lose any sleep over.
UNIX because I majored in cryptology...
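Watching for a count that keeps increasing means comparing two saved reports. The following is a rough sketch of that comparison, not part of the original advice. It assumes the summary lines keep the five-column layout shown above (Ext/DIMM, address, type, page, count), and the function name `dimm_growth` is made up for illustration.

```shell
#!/bin/sh
# Compare two saved cstm memory reports (older first, newer second) and
# print every Ext/DIMM whose total single-bit error count has grown.
# Assumes summary lines look like:
#   EXT0/0a 0x000000007088c5c1 Single-Bit 0x007088c 12
dimm_growth() {
    awk -v old="$1" '
        # Only lines with two hex fields and a numeric count are data rows.
        $2 ~ /^0x/ && $4 ~ /^0x/ && $5 ~ /^[0-9]+$/ {
            if (FILENAME == old) before[$1] += $5
            else                 after[$1]  += $5
        }
        END {
            # A DIMM missing from the older report counts as 0 there.
            for (d in after)
                if (after[d] > before[d] + 0)
                    printf "%s: %d -> %d\n", d, before[d], after[d]
        }
    ' "$1" "$2" | sort
}
```

With reports saved by the cstm command above, a call might look like `dimm_growth /tmp/meminfo.08220700 /tmp/meminfo.08220800`. The two file names are hypothetical examples of the `date +%m%d%H%M` suffix. A DIMM with several failing addresses is summed into one total, so the output has one line per module.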
08-22-2005 04:17 AM
Re: EMS Event Notification (memory)
Please keep an eye on this error to see whether it recurs regularly or occurred only once.
# grep "memory/0_5" /var/adm/syslog/syslog.log
If it is occurring every day, raise it with HP to get the memory module replaced before it becomes serious.
Cheers ,
RajD.
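To see how often a resource is reported and at which severity, the grep above can be extended into a small tally. This is only a sketch, assuming each EMS notification sits on one syslog line in the format quoted at the top of the thread. The function name `ems_severity_counts` is made up for illustration.

```shell
#!/bin/sh
# Count EMS notifications per severity for one resource in a syslog file.
# Usage: ems_severity_counts RESOURCE SYSLOG_FILE
# Assumes each notification line contains:
#   Value: "SEVERITY (n)" for Resource: "RESOURCE"
ems_severity_counts() {
    awk -v res="$1" '
        index($0, "Resource: \"" res "\"") {
            # The severity name follows the 8 characters of: Value: "
            if (match($0, /Value: "[A-Z]+/)) {
                sev = substr($0, RSTART + 8, RLENGTH - 8)
                count[sev]++
            }
        }
        END { for (s in count) print s, count[s] }
    ' "$2" | sort
}
```

For the resource in this thread, the call would be `ems_severity_counts /system/events/memory/0_5 /var/adm/syslog/syslog.log`, which prints one line per severity with its count.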