04-05-2010 05:50 AM
ECC Correctable, Sensor Number 4
With a DL160 G5, we've experienced a crash which seems to be due to a bad memory module.
Through IPMI we're receiving (correctable) memory error events like this:
SEL Record ID : 0050
Record Type : 02
Timestamp : 04/03/2010 09:22:18
Generator ID : 0002
EvM Revision : 04
Sensor Type : Memory
Sensor Number : 04
Event Type : Sensor-specific Discrete
Event Direction : Assertion Event
Event Data : 000cff
Description : Correctable ECC
About 4 to 5 of these events are generated each day.
It's unclear which DIMM this sensor relates to. Sensor number 4 doesn't appear in the sensor listing, and the sensor listed for "Memory ECC" is sensor number 2.
It would seem that sensor number 4 refers to a subsensor of the memory sensors, but how can we find out which DIMM slot corresponds to sensor number 4?
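One way to look closer at such a record is to decode the three Event Data bytes by hand. The sketch below assumes the layout the IPMI 2.0 specification describes for sensor-specific discrete events: Event Data 1 holds two 2-bit flags describing bytes 2 and 3 plus a 4-bit event offset, and for the Memory sensor type byte 3 may carry a module identifier when its flag says so. The function name `decode_memory_event_data` and the dictionary keys are illustrative only, and whether the DL160 G5 BMC follows this layout strictly is an assumption.

```python
# Event offsets for IPMI sensor type 0Ch (Memory), per the IPMI 2.0
# sensor type table. Only the first few offsets are listed here.
MEMORY_OFFSETS = {
    0x00: "Correctable ECC",
    0x01: "Uncorrectable ECC",
    0x02: "Parity",
    0x03: "Memory Scrub Failed",
    0x04: "Memory Device Disabled",
    0x05: "Correctable ECC logging limit reached",
}

# Meaning of Event Data 1 bits [7:6], which describe Event Data 2.
BYTE2_CONTENT = {
    0b00: "unspecified",
    0b01: "previous state and/or severity",
    0b10: "OEM code",
    0b11: "sensor-specific extension code",
}

# Meaning of Event Data 1 bits [5:4], which describe Event Data 3.
BYTE3_CONTENT = {
    0b00: "unspecified",
    0b01: "reserved",
    0b10: "OEM code",
    0b11: "sensor-specific extension code",
}


def decode_memory_event_data(hex_string):
    """Decode a 3-byte Event Data field such as '000cff' (illustrative helper)."""
    data = bytes.fromhex(hex_string)
    if len(data) != 3:
        raise ValueError("expected exactly 3 event data bytes")
    ed1, ed2, ed3 = data
    byte2_flag = (ed1 >> 6) & 0b11
    byte3_flag = (ed1 >> 4) & 0b11
    offset = ed1 & 0x0F
    return {
        "event": MEMORY_OFFSETS.get(offset, "unknown offset 0x%02x" % offset),
        "byte2_content": BYTE2_CONTENT[byte2_flag],
        "byte3_content": BYTE3_CONTENT[byte3_flag],
        "event_data_2": ed2,
        "event_data_3": ed3,
        # Only treat byte 3 as a module ID when the flags declare it as a
        # sensor-specific extension; otherwise its meaning is not defined here.
        "module_id": ed3 if byte3_flag == 0b11 else None,
    }
```

For the record above, `decode_memory_event_data("000cff")` reports offset 0 (Correctable ECC) with both extra bytes flagged as unspecified. Under this reading the event does not formally identify a DIMM, although the 0x0c and 0xff bytes may still carry vendor-specific meaning.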
3 REPLIES
04-05-2010 04:35 PM
Re: ECC Correctable, Sensor Number 4
What's the server firmware version?
Check this out:
HP ProLiant DL160 G5 Server Series - "Lower Critical-going low Assertion" and "Lower Critical-going low Deassertion" Events Appears Within the IPMI Log and Shuts Down a Number of Servers
Issue
A population of HP ProLiant DL160 G5 servers may shut down seemingly randomly. Within the IPMI log, variants of the following message may appear:
Generic 02/01/2009 17:51:40 FAN4 ROTOR1 Lower Critical-going low Assertion
Generic 02/01/2009 17:51:41 FAN4 ROTOR1 Lower Critical-going low Deassertion
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=3663526&prodTypeId=12169&objectID=c01884069
The fan in question may vary.
Solution
The issue is resolved in a Single Point release server firmware 2009.4.13, and later regular releases.
04-06-2010 01:45 AM
Re: ECC Correctable, Sensor Number 4
The server firmware/BIOS is:
version: O12 (04/09/2008)
We'll upgrade this to the latest version.
If the server is said to "shut down" due to this, does this mean the server shuts down somewhat gracefully or simply crashes? I ask because when this issue occurs with us, the ProLiant is completely stuck: no video output and no response to the keyboard. The only way to restart the server (besides pulling the power plug) is to hold the power button for at least 5 seconds.
Still, with 4 to 5 of these ECC messages being generated per day, an upgrade of the firmware/BIOS will quickly indicate or at least suggest whether the firmware is related to this issue.
Thanks
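The rate of roughly 4 to 5 events per day can be compared before and after the upgrade by counting records per date. This is a minimal sketch, assuming the SEL has been dumped to text in the `Field : value` layout shown in the first post, with each record starting at a `SEL Record ID` line. The helper names `parse_sel_records` and `correctable_ecc_per_day` are hypothetical and not part of any IPMI tool.

```python
from collections import Counter


def parse_sel_records(text):
    """Split a text SEL dump into a list of {field: value} dictionaries."""
    records = []
    current = None
    for line in text.splitlines():
        # Split at the first colon only, so timestamps like 09:22:18 survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field : value" line
        key, value = key.strip(), value.strip()
        if key == "SEL Record ID":
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def correctable_ecc_per_day(records):
    """Count correctable ECC memory events per date string (MM/DD/YYYY)."""
    counts = Counter()
    for rec in records:
        if (rec.get("Sensor Type") == "Memory"
                and rec.get("Description") == "Correctable ECC"):
            date = rec.get("Timestamp", "").split(" ")[0]
            counts[date] += 1
    return counts
```

Running `correctable_ecc_per_day(parse_sel_records(dump_text))` on dumps taken before and after the upgrade gives a per-day count that makes a drop to zero easy to spot.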
04-12-2010 02:36 AM
Re: ECC Correctable, Sensor Number 4
The BIOS and firmware have been upgraded to BIOS version: O12 (07/27/2009) and firmware version 3.11.
Since this upgrade there have been no more memory or other errors reported in the system event log.
I hope this was the cause of the crash. Thanks for the advice.
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.