ProLiant Servers (ML,DL,SL)

Karolis_S
Advisor

Processor heated above trip temperature. Throttling enabled.

I have an issue with 20 ProLiant DL380p Gen8 servers that are part of an HP Vertica cluster. All servers are running in power performance mode.
OS: CentOS release 6.6 (Final), kernel 2.6.32-504.8.1.el6.x86_64

There are a lot of messages like these in /var/log/mcelog and dmesg:

CPU27: Package temperature above threshold, cpu clock throttled (total events = 86472)
CPU15: Package temperature above threshold, cpu clock throttled (total events = 86471)
CPU14: Package temperature above threshold, cpu clock throttled (total events = 86472)
CPU28: Package temperature above threshold, cpu clock throttled (total events = 86472)
CPU24: Package temperature above threshold, cpu clock throttled (total events = 86472)

CPU 25 THERMAL EVENT TSC d944f35ad5d65e
TIME 1447757250 Tue Nov 17 11:47:30 2015
Processor heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
mcelog: Too many trigger children running already
STATUS 880003c3 MCGSTATUS 0
MCGCAP 1000819 APICID 23 SOCKETID 1
CPUID Vendor Intel Family 6 Model 62
Hardware event. This is not a software error.
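
To see which CPUs are reporting throttling and how often, the kernel lines above can be tallied per CPU. This is only a minimal sketch, assuming Python 3.6+ (not the CentOS 6 default, so it may be easier to run it against a saved copy of the dmesg output on another machine). The function name latest_throttle_counts is made up for illustration; the regular expression is based on the "Package temperature above threshold" lines shown above and also accepts the "Core" variant of the same message.

```python
import re
from typing import Dict, Iterable

# Matches kernel lines such as:
# CPU27: Package temperature above threshold, cpu clock throttled (total events = 86472)
THROTTLE_RE = re.compile(
    r"CPU(\d+): (?:Package|Core) temperature above threshold, "
    r"cpu clock throttled \(total events = (\d+)\)"
)


def latest_throttle_counts(lines: Iterable[str]) -> Dict[int, int]:
    """Return the highest 'total events' value seen for each CPU number.

    Hypothetical helper for illustration; lines that do not match are ignored.
    """
    counts: Dict[int, int] = {}
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            cpu = int(match.group(1))
            total = int(match.group(2))
            counts[cpu] = max(counts.get(cpu, 0), total)
    return counts


# Two of the lines quoted in this post, used as sample input.
sample = [
    "CPU27: Package temperature above threshold, cpu clock throttled (total events = 86472)",
    "CPU15: Package temperature above threshold, cpu clock throttled (total events = 86471)",
]
print(latest_throttle_counts(sample))  # {27: 86472, 15: 86471}
```

If the counts are nearly identical across all CPUs of one socket, as they are in the sample, that would be consistent with a package-level event rather than a single hot core.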

The Integrated Management Log shows no errors. The temperature in the server room is normal.

Server Firmware Version Info:
HP ProLiant System ROM 08/02/2014 
HP Smart Array P420i Controller 6.00 Embedded
iLO 2.10 Jan 15 2015 
Intelligent Provisioning 1.61.45 
Power Management Controller Firmware 3.3 
Power Management Controller Firmware Bootloader 2.7 
SAS Programmable Logic Device Version 0x0C 
Server Platform Services (SPS) Firmware 2.1.7.E7.4 
System Programmable Logic Device Version 0x32 

It looks like an mcelog bug, according to https://bugzilla.redhat.com/show_bug.cgi?id=924570

Is anyone else facing the same issue? Can you recommend how to resolve it?

Jimmy Vance
HPE Pro

Re: Processor heated above trip temperature. Throttling enabled.

That BZ looks to be geared more towards laptops.

What kind of load are you running on these systems? If it is very CPU- or memory-intensive, the CPU can hit its internal temperature trigger before the system can react and ramp up the fans, which results in the MCE entry. Try setting the fans to 'Increased Cooling' and see if the errors stop. Increased Cooling will usually resolve the issue. If you still see the error, try 'Maximum Cooling'.
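
One way to check whether the errors stop after changing the fan setting is to compare the kernel's throttle counters before and after a period of load. The following is a rough sketch, not an HPE tool. It assumes Python 3 and assumes the kernel exposes per-CPU counters at /sys/devices/system/cpu/cpuN/thermal_throttle/package_throttle_count, which many x86 Linux kernels do but which may be absent on a given build. The function names are made up for illustration.

```python
import glob
import os
from typing import Dict

# Assumed sysfs location of the per-CPU package throttle counters.
# If the files are missing on your kernel, read_throttle_counts returns {}.
SYSFS_GLOB = (
    "/sys/devices/system/cpu/cpu[0-9]*/thermal_throttle/package_throttle_count"
)


def read_throttle_counts(pattern: str = SYSFS_GLOB) -> Dict[str, int]:
    """Read the current throttle counter for every CPU matching the pattern."""
    counts: Dict[str, int] = {}
    for path in glob.glob(pattern):
        cpu = path.split(os.sep)[-3]  # directory name, e.g. "cpu27"
        try:
            with open(path) as handle:
                counts[cpu] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or malformed entries
    return counts


def throttle_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return only the CPUs whose counter increased, with the size of the increase."""
    return {
        cpu: after[cpu] - before[cpu]
        for cpu in after
        if cpu in before and after[cpu] > before[cpu]
    }
```

A possible usage: call read_throttle_counts() once, let the usual workload run for a while, call it again, and pass both results to throttle_delta(). An empty result would suggest that no new throttle events were counted during that interval.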

 
