Server Management - Systems Insight Manager

cpqHeThermalDegradedAction

 
luca1
Frequent Advisor

cpqHeThermalDegradedAction

Hi all,
I have installed HP SIM 6.2 in my environment. I have two HP servers monitored by SIM: a ProLiant ML570 G4 and a ProLiant DL360 G7.
When there was a problem with the cooling system in the server room, I received an SNMP error from the ML570 G4:

 

Event Name: (SNMP) Thermal Status Degraded (6041)
URL: xxxxxxxxxx:2381/
Event originator: my ML570 G4 server
Event Severity: Critical
Event received: 02-Aug-2012, 18:05:40
 
Event description: The temperature status has been set to degraded in the specified chassis and location.  The server's temperature is outside of the normal operating range.  The server will be shutdown if the
cpqHeThermalDegradedAction variable is set to shutdown(3).


First question: why did I receive the SNMP temperature trap only from the ProLiant ML570, and nothing from the ProLiant DL360?

 

Second question: can I set a specific threshold in SIM for all servers? For example, I want to receive an SNMP temperature message when the temperature is above 37°C.

 

thanks

 

3 REPLIES
shocko
Honored Contributor

Re: cpqHeThermalDegradedAction

You need to check the event log on the DL to see if an event was actually logged. Secondly, to my knowledge you cannot set a thermal warning threshold in SIM. SIM simply reacts to the criticality of this type of event.

luca1
Frequent Advisor

Re: cpqHeThermalDegradedAction

Ok,

So I take it that it is not possible to set a specific threshold.

Thanks for your post.

 

 

Anonymous
Not applicable

Re: cpqHeThermalDegradedAction

Hi Luca,

 

Re: Question 1:

There is a reasonable chance that the thermal thresholds of the two servers differ, by design. I would suggest doing an snmpwalk of the thermal MIB objects, which start at: 1.3.6.1.4.1.232.6.2.6.1.0

 

The temperature sensor readings start at 1.3.6.1.4.1.232.6.2.6.8.1.4.0.1, where the trailing part of the OID identifies the sensor:

4.0.1 = ioBoard

4.0.2 = ambient

4.0.3 = CPU 1 Core 1

4.0.4 = CPU 1 Core 2
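
To compare the two servers, one option is to walk the temperature table on each with net-snmp (for example `snmpwalk -v2c -c <community> -On <host> 1.3.6.1.4.1.232.6.2.6.8.1`) and line up the values per sensor. The Python sketch below parses that numeric output. Note the assumptions: column 4 is the reading as described above, but column 5 being the per-sensor threshold is my reading of HP's health MIB and is not stated in this thread, so please verify it on your own systems. The function names and the sample values are made up for illustration only.

```python
import re

# Numeric prefix of the temperature table entry quoted in the post.
TEMP_ENTRY = "1.3.6.1.4.1.232.6.2.6.8.1"
READING_COL = "4"    # from the post: sensor readings live under ...8.1.4
THRESHOLD_COL = "5"  # ASSUMPTION: threshold column, not stated in the thread

# Matches lines such as ".1.3.6.1.4.1.232.6.2.6.8.1.4.0.2 = INTEGER: 24"
LINE_RE = re.compile(
    r"^\.?" + re.escape(TEMP_ENTRY)
    + r"\.(\d+)\.(\d+)\.(\d+)\s*=\s*INTEGER:\s*(-?\d+)"
)


def parse_walk(text):
    """Return {(chassis, sensor): {"reading": int, "threshold": int}}."""
    sensors = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a plain integer row
        column, chassis, sensor, value = match.groups()
        key = (int(chassis), int(sensor))
        if column == READING_COL:
            sensors.setdefault(key, {})["reading"] = int(value)
        elif column == THRESHOLD_COL:
            sensors.setdefault(key, {})["threshold"] = int(value)
    return sensors


def threshold_differences(a, b):
    """List sensors present on both servers whose thresholds differ."""
    diffs = {}
    for key in sorted(set(a) & set(b)):
        ta = a[key].get("threshold")
        tb = b[key].get("threshold")
        if ta is not None and tb is not None and ta != tb:
            diffs[key] = (ta, tb)
    return diffs


# Made-up sample output, for illustration only (not real readings).
SAMPLE_ML570 = """\
.1.3.6.1.4.1.232.6.2.6.8.1.4.0.1 = INTEGER: 30
.1.3.6.1.4.1.232.6.2.6.8.1.4.0.2 = INTEGER: 24
.1.3.6.1.4.1.232.6.2.6.8.1.5.0.1 = INTEGER: 60
.1.3.6.1.4.1.232.6.2.6.8.1.5.0.2 = INTEGER: 38
"""
SAMPLE_DL360 = """\
.1.3.6.1.4.1.232.6.2.6.8.1.4.0.1 = INTEGER: 28
.1.3.6.1.4.1.232.6.2.6.8.1.4.0.2 = INTEGER: 23
.1.3.6.1.4.1.232.6.2.6.8.1.5.0.1 = INTEGER: 70
.1.3.6.1.4.1.232.6.2.6.8.1.5.0.2 = INTEGER: 42
"""

if __name__ == "__main__":
    ml570 = parse_walk(SAMPLE_ML570)
    dl360 = parse_walk(SAMPLE_DL360)
    for key, (t1, t2) in threshold_differences(ml570, dl360).items():
        print(f"sensor {key}: ML570 threshold {t1}, DL360 threshold {t2}")
```

If the thresholds really do differ between the two models, that alone could explain why only one server sent a trap at a given room temperature.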

 

Information Source

 

Re: Question 2

When I put this question to HP some time ago, I was advised that HP hard-codes the thresholds so that they cannot be modified, since changing them would void the warranty. As a result, HP SIM relies on SNMP traps for thermal alerts.
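
Since the thresholds themselves cannot be changed, a possible workaround for the 37°C case is to poll the reading from outside SIM and compare it yourself. This is only a rough sketch, not an HP-supported method: it assumes net-snmp's `snmpget` is installed, that SNMP v2c is enabled on the servers, and that the ambient sensor really sits at index 0.2 as listed above (that may differ per model). The function names, host names and community string are placeholders of mine.

```python
import subprocess

# "ambient" reading according to the OID list given earlier in this reply.
AMBIENT_OID = "1.3.6.1.4.1.232.6.2.6.8.1.4.0.2"
LIMIT_C = 37  # the limit asked about in the original question


def read_celsius(host, community, oid=AMBIENT_OID, run=subprocess.run):
    """Fetch one integer with snmpget; -Oqv prints only the value."""
    result = run(
        ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid],
        capture_output=True,
        text=True,
        check=True,  # raise if snmpget exits with an error
    )
    return int(result.stdout.strip())


def over_limit(readings, limit=LIMIT_C):
    """Return the hosts whose reading is above the limit, sorted by name."""
    return sorted(host for host, temp in readings.items() if temp > limit)


if __name__ == "__main__":
    # Placeholder host names and community string: replace with your own.
    hosts = ["ml570-g4.example", "dl360-g7.example"]
    readings = {h: read_celsius(h, "public") for h in hosts}
    for host in over_limit(readings):
        print(f"{host}: ambient temperature above {LIMIT_C} C")
```

Run from a scheduler every few minutes, something like this could send a mail or log entry well before the hard-coded threshold triggers a trap.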