ProLiant Servers (ML,DL,SL)

Re: High CPU Temp on Proliant G7 Servers

 
Stephan G
Regular Advisor

High CPU Temp on Proliant G7 Servers

Hi,

 

I think there is a bug in either the iLO firmware 1.50 or the BIOS.

 

Since Friday, four of our G7 servers have been reporting high CPU temperatures, while the other servers are operating fine.

 

You can fix it by powering off the server and disconnecting it from power completely. That's why I think it's a bug in the iLO firmware. After resetting the iLO or upgrading the firmware (which also resets the iLO), the CPU temperature is back to 40 degrees.

 

I shut down one server for a whole night, and after restarting it this morning it still reported a temperature that is too high: 82 degrees for the CPU. The other temperatures are just fine.

 

Is anyone else experiencing these problems?

 

Greets

Stephan

60 REPLIES
Jan Soska
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

Hello,

Is this behavior the same on all G7 servers (any type)?

We have some G7 servers with the latest iLO firmware (1.55, from March 2013) and we do not see such issues.

 

So my recommendation is to install the newer iLO firmware (1.55). Here is the link for Windows Server 2008 R2:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=4154847&prodTypeId=18964&prodSeriesId=4154735&swLang=8&taskId=135&swEnvOID=4064

 

and come back with results...

 

Jan

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

I have already upgraded all servers.

 

They are DL380 G7 or DL360 G7.

 

I just want to make sure it is the iLO and not another fault.

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

It doesn't help. After 12 hours the servers are stopping and starting again.

 

An HP support case has been opened.

Oscar A. Perez
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

Does it stop if you revert iLO3 firmware back to version 1.28?

See the advisory below. It is for the DL980 G7, but I've seen reports that it is happening on other G7 servers.
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c03720197

How often can you reproduce it? I could provide you with a beta 1.57 that fixes the DL980 issue for quick testing.




__________________________________________________
If you feel this was helpful please click the KUDOS! thumb below!
Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

Great. That's exactly what is happening.

 

At the moment the servers are stable. I can't reproduce it on demand; it just happens.

 

It would be great if you could send me a link to this beta firmware so I can try it on one server.

 

Greets

Stephan

Oscar A. Perez
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

You can download it from here:

 

ftp://ilo4me:G!v3t2me@ftp.usa.hp.com

 




Oscar A. Perez
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

Hi,

 

Have you had the chance to test the iLO3 beta 1.57? 

 

Thanks

Oscar

 




Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

Hello,

 

Not yet. It has not happened again.

 

If it does, I will apply the firmware.

 

Greets

Stephan

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

I applied the new non-beta 1.57.

 

It happened again on two servers. HP is on it. Does no one else have these problems?

 

It's not that cold in the room where our servers are, but it happens only on G7 servers.

All others (G4 to G6) are operating normally.

 

Greets

Stephan

Sbrown
Valued Contributor

Re: High CPU Temp on Proliant G7 Servers

Disable the thermal shutdown. If the servers are more than 2 years old and have been run in warm conditions, it is time to reapply thermal paste (e.g. Arctic Silver)!

Also, some CPUs have failing internal sensors. Your choice is a catastrophic shutdown (thermal shutdown) or buying a new CPU. You know what damage that does to your OS filesystem; would you rather have that, or monitor your temperatures another way?

The amount of damage a thermal shutdown does with its abrupt cold power-down is very bad. With a server out of warranty, I'd take my odds and let it overheat and die gracefully. You can query the Sea of Sensors and use environmental monitoring to do a better job than iLO, but that is only my opinion!

If you like watching RAID-5/6 rebuilds, keep it on! The sensors on the motherboard and in the CPU do fail; that is a fact. I'd RMA if I were under warranty, but I am not. Once out of warranty, disable thermal shutdown. It is not as if there is a single fan blowing on the CPUs like in a white-box desktop!
LFnet
Occasional Visitor

Re: High CPU Temp on Proliant G7 Servers

For the record, we're seeing this same problem on one DL380 G7, iLO Firmware 1.50.

 

The read temperature has been oscillating between 40°C and 82°C for about 20 minutes now.

 

All other systems from the same batch are fine, and this one has been working flawlessly since 2011-08.

 


Edit: Resetting the iLO seems to have fixed the problem for now.

Torsten.
Acclaimed Contributor

Re: High CPU Temp on Proliant G7 Servers

>> The read temperature has been oscillating between 40°C and 82°C for about 20 minutes now.


I have seen the same, but it looks like iLO firmware 1.57 solves this.

 

 

7266    Repaired    Environment    7/3/2013 10:37    7/3/2013 10:37    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 82C)
7265    Repaired    OS    7/3/2013 10:37    7/3/2013 10:36    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7264    Repaired    Environment    7/3/2013 10:37    7/3/2013 10:36    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 40C)
7263    Repaired    OS    7/3/2013 10:36    7/3/2013 10:36    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7262    Repaired    Environment    7/3/2013 10:36    7/3/2013 10:36    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 82C)
7261    Repaired    OS    7/3/2013 10:35    7/3/2013 10:35    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7260    Repaired    Environment    7/3/2013 10:35    7/3/2013 10:35    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 40C)
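For anyone who wants to check an exported IML for the same pattern, here is a minimal Python sketch. It assumes the rows are whitespace-separated as pasted above, that the two timestamps are month/day/year, and that the second timestamp is the time the event was first logged; those column meanings are my assumption, not something stated in the log. The 20 °C and 120 second thresholds are arbitrary illustration values, and the function names are made up for this example. The idea is only that a CPU reading swinging from 40C to 82C and back within a minute is physically unlikely and points to false reporting rather than real overheating.

```python
import re
from datetime import datetime

# Rows copied from the IML excerpt in this post.
IML_ROWS = """\
7266    Repaired    Environment    7/3/2013 10:37    7/3/2013 10:37    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 82C)
7265    Repaired    OS    7/3/2013 10:37    7/3/2013 10:36    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7264    Repaired    Environment    7/3/2013 10:37    7/3/2013 10:36    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 40C)
7263    Repaired    OS    7/3/2013 10:36    7/3/2013 10:36    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7262    Repaired    Environment    7/3/2013 10:36    7/3/2013 10:36    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 82C)
7261    Repaired    OS    7/3/2013 10:35    7/3/2013 10:35    1    Automatic Operating System Shutdown Initiated Due to Overheat Condition
7260    Repaired    Environment    7/3/2013 10:35    7/3/2013 10:35    1    System Overheating (Temperature Sensor 3, Location CPU, Temperature 40C)
"""

# Assumed layout: id, status, class, timestamp, timestamp, count, description.
ROW_RE = re.compile(
    r"^(\d+)\s+\S+\s+\S+\s+(\d+/\d+/\d+ \d+:\d+)\s+(\d+/\d+/\d+ \d+:\d+)\s+\d+\s+(.*)$"
)
TEMP_RE = re.compile(r"Temperature Sensor (\d+), Location (\w+), Temperature (\d+)C")


def parse_readings(text):
    """Return (event_id, timestamp, temperature_c) tuples, sorted by event id."""
    readings = []
    for line in text.splitlines():
        row = ROW_RE.match(line.strip())
        if not row:
            continue
        temp = TEMP_RE.search(row.group(4))
        if not temp:
            continue  # e.g. the OS shutdown rows carry no temperature
        # Assumption: month/day/year, second timestamp column.
        when = datetime.strptime(row.group(3), "%m/%d/%Y %H:%M")
        readings.append((int(row.group(1)), when, int(temp.group(3))))
    readings.sort(key=lambda r: r[0])
    return readings


def implausible_jumps(readings, max_delta_c=20, window_s=120):
    """Flag consecutive readings that differ a lot within a short time window."""
    jumps = []
    for (id_a, t_a, c_a), (id_b, t_b, c_b) in zip(readings, readings[1:]):
        close_in_time = abs((t_b - t_a).total_seconds()) <= window_s
        if close_in_time and abs(c_b - c_a) >= max_delta_c:
            jumps.append((id_a, id_b, c_a, c_b))
    return jumps


if __name__ == "__main__":
    for jump in implausible_jumps(parse_readings(IML_ROWS)):
        print("suspicious jump between events %d and %d: %dC -> %dC" % jump)
```

On the seven rows above, every pair of consecutive temperature entries is flagged, which matches the 40C/82C oscillation described earlier in the thread.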


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

It says so, but it does not. The fix only applies to the DL980 G7.

 

My call has already been escalated to 2nd-level support. Let's see what happens next ;)

Torsten.
Acclaimed Contributor

Re: High CPU Temp on Proliant G7 Servers

>> It says so but it does not.

 

Based on your test or on release notes?


Hope this helps!
Regards
Torsten.

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

Based on my tests.

I installed it on several G7 servers, but some of them are still "overheating" (they don't restart and don't really overheat).

 

I also installed the latest drivers for each component.

Torsten.
Acclaimed Contributor

Re: High CPU Temp on Proliant G7 Servers

I have seen this on DL360 G7 and DL580 G7 with version 1.55.

What happens if you go back to 1.28?

Hope this helps!
Regards
Torsten.

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

Not tried that yet.

I'm waiting for more input from HP.

 

Also, the iLO gets reset when applying a "new" firmware. After the reset, the fault does not reappear for some days.

Torsten.
Acclaimed Contributor

Re: High CPU Temp on Proliant G7 Servers

If there are several servers with this issue, I would consider downgrading one of them to see what happens.

Keep us posted.

Hope this helps!
Regards
Torsten.

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

I installed 1.28 on a DL380 G7.

I will get back to you at the end of this week.

 

 

Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

On the server with 1.28 there is no log entry concerning overheating. So it looks like this helped.

 

But it's not a real solution ;). Maybe second-level support will come up with a new version.

Oscar A. Perez
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

Hi Stephan,

 

Could you please provide a screenshot of the IML entries showing CPU overheating, plus details of which processors you have? Do the servers actually shut down, or is this just false reporting?




StephanGee
Occasional Advisor

Re: High CPU Temp on Proliant G7 Servers

One server is:

Product name: ProLiant DL380 G7
Serial number: -
Processor package 1: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
Processor package 2: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz

The other one:

Product name: ProLiant DL360 G7
Serial number: -
Processor package 1: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
Processor package 2: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz

 

 

So the CPU is the same.

 

Normally they abort before they shut down. Only one server actually shut down.

Oscar A. Perez
Honored Contributor

Re: High CPU Temp on Proliant G7 Servers

Hi Stephan,

 

How long does it take to reproduce the issue? Yesterday, I got a DL380 G7 with two Intel X5650 processors, and it's been running fine with iLO3 1.57.

 

Anyway, I found a bug in the code that I believe could explain the false over temp. The problem is that since I can't reproduce the issue, I can't prove that what I found is the actual root cause.  Could you please test an iLO3 1.58 beta real quick?

 

ftp://ilo4me:G!v3t2me@ftp.usa.hp.com

 




Stephan G
Regular Advisor

Re: High CPU Temp on Proliant G7 Servers

Well, that's the problem: it can take one day or one week. Some of my G7 servers never had this problem again after resetting the iLO.

 

So there is not much testing possible here either. As Friday is our office day, I would like to perform the update in the evening. The downgrade to 1.28 on one of these servers froze it ;)

 

Just want to be sure.

 

Edit:

Another server had the same problem last night too.

I will try it now with this one.

But it's another processor:

Product name: ProLiant DL360 G7
Serial number: -
Processor package 1: Intel(R) Xeon(R) CPU E5640 @ 2.67GHz
Operating system environment: Microsoft® Windows Server® 2008 Standard x64 Version, Service Pack 2 (Build 6002)

 

 

Greets

Stephan