Syslog Message.
01-21-2003 10:44 AM
I have attached the syslog message (11.0, N-class) and also the corresponding action I took. Please let me know what I should do next.
Syslog Message:
Jan 20 12:49:05 EMS [1827]: ------ EMS Event Notification ------
Value: "MAJORWARNING (3)" for Resource: "/system/events/memory/192"
(Threshold: >= " 3") Execute the following command to obtain
event details: /opt/resmon/bin/resdata -R 119734274 -r
/system/events/memory/192 -n 119734277 -a
-----------------------------------------------
/opt/resmon/bin/resdata -R 119734274 -r /system/events/memory/192
-n 119734277 -a OUTPUT BELOW.....
CURRENT MONITOR DATA:
Event Time..........: Mon Jan 20 12:49:05 2003
Severity............: MAJORWARNING
Monitor.............: dm_memory
Event #.............: 4300
System..............:
Summary:
Memory Event Type : Single bit error (SBE) event. A correctable single
bit error has been detected and logged.
Description of Error:
The memory component:
Cab/Cell or Node: 0
MC/EXT: 1
DIMM: 0b
is experiencing correctable single bit errors (SBE) on a single
component.
Probable Cause / Recommended Action:
Although the single bit errors are being corrected, it may be advisable to
monitor the situation. If an excessive rate of single bit errors occur, an
event with higher severity will be generated.
Additional Event Data:
System IP Address...:
Event Id............: 0x3e2c369100000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_memory.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 70
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800
EMS Version.....................: A.03.20
STM Version.....................: A.30.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/dm_memory.htm#4300
v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v
Component Data:
Physical Device Path....: 192
Tag 2...................: 20
Thanks in advance,
David.
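For anyone who wants to pull the key fields out of a report like the one above, here is a minimal Python sketch. It assumes the report keeps the "Key....: value" layout shown in this post; the sample text is an abridged copy of that output, and the names `parse_event` and `summarize` are made up for illustration, not part of EMS or STM.

```python
import re

# Abridged copy of the resdata output quoted in the post above.
SAMPLE_EVENT = """\
Event Time..........: Mon Jan 20 12:49:05 2003
Severity............: MAJORWARNING
Monitor.............: dm_memory
Event #.............: 4300
The memory component:
Cab/Cell or Node: 0
MC/EXT: 1
DIMM: 0b
Number of events..: 70
"""

# Matches "Key....: value" and "Key: value"; the dots are only padding.
# Lines with nothing after the colon (section headings) are skipped.
FIELD_RE = re.compile(
    r"^(?P<key>[^:.\n][^:\n]*?)\.*:[ \t]*(?P<value>\S.*)$", re.MULTILINE
)


def parse_event(text):
    """Return a dict of the 'key: value' fields found in an event report."""
    fields = {}
    for match in FIELD_RE.finditer(text):
        fields[match.group("key").strip()] = match.group("value").strip()
    return fields


def summarize(fields):
    """Build a one-line summary naming the DIMM and the event count."""
    return "{} event {} on EXT {} / DIMM {} ({} events)".format(
        fields["Severity"],
        fields["Event #"],
        fields["MC/EXT"],
        fields["DIMM"],
        fields["Number of events"],
    )


if __name__ == "__main__":
    print(summarize(parse_event(SAMPLE_EVENT)))
```

Keeping the fields in a plain dict makes it easy to compare the DIMM location and event count across several reports over time.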
01-21-2003 10:48 AM
Re: Syslog Message.
Do keep an eye on it.
01-21-2003 10:49 AM
Re: Syslog Message.
You may also want to run stm and run the information tool against your memory to check your PDT. If you have a couple of entries in there, it may be a good idea to place a call about a possible problem with your memory board.
01-21-2003 10:52 AM
Re: Syslog Message.
As Rick has mentioned, a lone single-bit error, or several over a long period, is nothing to become alarmed about. That's what ECC memory is designed to handle.
What I would recommend is to go into stm, check the PDT (Page Deallocation Table), and verify that you don't have a bunch of pages deallocated. That would indicate that one or more DIMMs are going bad.
Rgds,
Jeff
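To illustrate how an error-correcting code can repair a single flipped bit, here is a toy Python sketch using the textbook Hamming(7,4) code. This is only an illustration of the principle: the thread does not say which code this server's memory controller uses, and real ECC memory works on much wider words. The function names are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based error position, or 0 if clean)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three syndrome bits spell out the position of the flipped bit.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return c, position


def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    damaged = list(stored)
    damaged[4] ^= 1  # simulate a single bit error in position 5
    repaired, where = hamming74_correct(damaged)
    print("error at position", where, "data", hamming74_decode(repaired))
```

A code like this fixes one bad bit per word silently, which is why a correctable SBE only gets logged; it cannot repair two flipped bits in the same word, which is why a rising error rate on one DIMM is worth watching.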
01-21-2003 10:52 AM
Re: Syslog Message.
Since the error was self-corrected, there is nothing you can do now but keep monitoring the situation, per the recommended action quoted in your post:
Probable Cause / Recommended Action:
Although the single bit errors are being corrected, it may be advisable to
monitor the situation. If an excessive rate of single bit errors occur, an
event with higher severity will be generated.
Hai
01-21-2003 11:25 AM
Re: Syslog Message.
We started receiving this error on our V-2600 server. It began much the same way, about one event every month or so; then, without warning, we lost a stick of memory and the server deallocated it from use. We had to schedule downtime to have the CE come in and replace the stick.
My advice is to go ahead and have the stick replaced, as long as you have a maintenance agreement on the server. Don't wait for it to turn into a bigger problem.
Gl
Frank G.
01-21-2003 11:39 AM
Re: Syslog Message.
Eugeny
01-21-2003 11:50 AM
Re: Syslog Message.
Thanks very much for all the responses.
Thanks
David.
01-21-2003 11:54 AM
Re: Syslog Message.
Basically, what you have to do is find the memory and highlight it, then click on Information, then click on Run. This will generate a log; at the bottom of the log you will find the PDT entries. It will tell you how many are Free, how many are Used, and how many are Available.
01-21-2003 11:56 AM
Re: Syslog Message.
01-21-2003 12:18 PM
Re: Syslog Message.
I have attached my stm report. Can someone advise accordingly?
Thanks
David.
01-21-2003 12:25 PM
Solution
Well... 3 pages (12 MB) deallocated is not that bad. On the other hand, they're all on the same DIMM. You're not in any imminent danger, but I think you should schedule downtime to have that particular DIMM replaced under warranty within the next couple of weeks, just to be safe, and to get that 12 MB of RAM back.
Rgds,
Jeff
01-23-2003 11:45 AM
Re: Syslog Message.
Thanks for all the answers. I just have a couple more questions to clarify and learn from:
1. How do we know from the reports (attached above) that it occurs on the same DIMM? (Jeff, please help.)
2. CRC: what is this?
3. Is STM the only way to find this error, or is there an alternative method?
Please advise.
Thanks
David.
01-23-2003 12:06 PM
Re: Syslog Message.
Ok, I'll tackle your questions.
1) From the stm report you'll note:
A) On the error log summary, ALL errors occurred on DIMM EXT1 (memory carrier 1) / 0b (slot b). All memory comes in DIMM pairs; in this case 0/a and 0/b are a pair. So you can see that ALL errors are on the 0b DIMM.
B) On the PDT, you can match the addresses (0x03dbf38, b38 & fb8) back to the summary to verify that these are all on EXT1/0b, even though the PDT references both 0a and 0b.
2) CRC - Cyclic Redundancy Check. This is the error-checking scheme employed. Basically, it is a checksum used to determine whether the value has changed or not.
3) Although STM is the best way, there are others. The system log (/var/adm/syslog/syslog.log) and the GSP error log are other places where these errors are recorded.
HTH,
Jeff
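As a small illustration of the checksum idea in point 2, here is a Python sketch that uses the CRC-32 routine from the standard `zlib` module to notice a single flipped bit. CRC-32 is only a stand-in here; the thread does not say which polynomial or scheme the hardware uses, and the helper names `flip_bit` and `crc_changed` are made up for this example.

```python
import zlib


def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted (bit 0 = LSB of byte 0)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def crc_changed(original, received):
    """True if the CRC-32 of the received bytes differs from the original."""
    return zlib.crc32(original) != zlib.crc32(received)


if __name__ == "__main__":
    payload = b"memory word"
    corrupted = flip_bit(payload, 3)
    print("intact copy flagged:   ", crc_changed(payload, payload))
    print("corrupted copy flagged:", crc_changed(payload, corrupted))
```

Note that a CRC on its own only detects that something changed; it does not say which bit, so it cannot repair the data the way an error-correcting code can.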