
DL380 GEN9 fans 1,2,3 are not ramping up properly

 
yurtesen
Regular Visitor


On a machine with dual E5-2620 v4 CPUs, I ran a test that fully loads both CPUs.

CPU1 did not rise above 40C, while CPU2 reached 60C.

I expected fans 1,2,3 to ramp up more, but they did not ramp up as much as fans 4,5,6.

These are the fan speeds when machine is idle:

9%  11%  14%  17%  17%  17% 

 

These are the fan speeds when both processors are loaded and CPU2 is at 60C, although CPU1 still shows 40C. Sure, fans 1,2,3 are running a few percent faster, but not quite enough. Notice that 4,5,6 are at 23%, but they are blowing at CPU1, so I am not sure why they would get much faster when CPU2 is the one getting hot.

12%  15%  18%  23%  23%  23% 



The interesting thing is that when I start the test, fans 1,2,3 momentarily ramp up properly. This is a few seconds after the test starts, before they settle at lower speeds.

27%  34%  27%  20%  20%  20% 
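
To put numbers on the three readings above, here is a small Python sketch that compares each reading against idle and averages the change per fan bank. It only does arithmetic on the percentages I posted; it does not talk to iLO, and the name `bank_change` is made up for this example.

```python
# Fan speeds in percent, copied from the readings above, fans 1..6.
idle = [9, 11, 14, 17, 17, 17]
load = [12, 15, 18, 23, 23, 23]    # settled values under full load
burst = [27, 34, 27, 20, 20, 20]   # a few seconds after the test starts


def mean(values):
    return sum(values) / len(values)


def bank_change(before, after):
    """Average change in percentage points for fans 1-3 and for fans 4-6."""
    deltas = [a - b for a, b in zip(after, before)]
    return mean(deltas[:3]), mean(deltas[3:])


for label, reading in (("settled load", load), ("first seconds", burst)):
    bank_123, bank_456 = bank_change(idle, reading)
    print(f"{label}: fans 1-3 {bank_123:+.1f} points, fans 4-6 {bank_456:+.1f} points")
```

With these readings, fans 4,5,6 gain more than fans 1,2,3 once the load has settled, while in the first seconds it is the other way around.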

 

So what is going on here?

Thanks!

Cali
Honored Contributor


Hi,

The fan speed is not driven only by CPU temperature.

The iLO is monitoring 100s of temperature sensors in the server, including power supplies, memory modules, and PCI cards.

The Fan Speed is driven by all of these Sensors.

60 °C for the CPU is not a Degraded or Critical Temp, even 95 °C may be OK.

17/20/23% is not a big difference. Do not worry, nobody cares about this.

The iLO chip monitors all sensors and takes care that nothing gets too hot.

Cali

 


======================
That was not planned in this way.
yurtesen
Regular Visitor


@Cali The E5-2620 v4 has a maximum temperature of 74C. The temperature measured by the sensor will be lower than the actual processor temperature, so it is logical to assume that the processor is already throttling due to temperature. I could possibly test the performance difference between CPU1 and CPU2.

Also, there are 29 temperatures reported on this machine, so I am not sure why you say iLO monitors 100s of sensors. (Is it documented somewhere?)

Yes, the fan speed is not only due to CPU temperature, but I am specifically increasing the temperatures of both CPUs, and the one which is much cooler gets the faster fan speed with no other temperature spiking around that area.

So I do not understand why you wrote such an exaggerated and negative post about this quite real issue.

The fans blowing at CPU1 are ramping up more than the fans blowing at CPU2. There is a huge difference in temperature between CPU1 and CPU2. The fact that it is ramping up the wrong fans should be enough to raise eyebrows. Yes, people care about this sort of stuff.

Cali
Honored Contributor


The number of sensors depends on your hardware setup.

Every memory module, every possible disk, and every PCI card has a temperature sensor (24 DIMMs + 24 disks = 48 for these alone).

Did you try to set the thermal configuration:

https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=sf000046358en_us

Also check for an iLO 4 firmware update.

I'm not from HPE!


yurtesen
Regular Visitor
Solution


Well, I left the machine on overnight and today it works better and as expected. Right now it does ramp up 1,2,3 more than 4,5,6. Also, when CPU2 was 56C, CPU1 was 43C, which is reasonable considering the system ramped up fans 1,2,3 while 4,5,6 seem to be running at minimal speed, which is quite OK considering CPU1 was quite cool.

12% 16% 12% 11% 11% 9%

So I did a cold restart with full power off, re-tested right after boot, and got the same fan speed values as in my first post. Fans 4,5,6 settled at 23% again.

Based on this, I would say that the evidence shows that the issue has nothing to do with the temperature sensors or temperature in general.

After the cold restart, it took about 5-10 minutes for 4,5,6 to go to 17% and another 10 minutes to reach 16%, then they went down to 14%. So I do not know what it is doing, but it looks like it takes hours for the fans to stabilize. Eventually the idle values go down to:
9% 11% 11% 11% 11% 9% and after that things seem to work more predictably.
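
If anyone wants to check how long this settling takes on their own machine, one rough way is to log a fan's speed at regular intervals and look for the first point where it stops moving. The Python sketch below works on a plain list of readings. The helper `first_stable_index` and its `window` and `tolerance` parameters are made up for illustration, and the sample list uses the fan 4 values I mentioned, with the trailing repeats of 11 added as an assumption to stand in for "eventually idle".

```python
def first_stable_index(readings, window=3, tolerance=1):
    """Index of the first run of `window` readings that stay within
    `tolerance` percentage points of each other, or None if there is none."""
    for start in range(len(readings) - window + 1):
        chunk = readings[start:start + window]
        if max(chunk) - min(chunk) <= tolerance:
            return start
    return None


# Fan 4 after the cold restart, in the order observed. The samples are not
# evenly spaced in time, and the repeated 11s at the end are assumed.
fan4 = [23, 17, 16, 14, 11, 11, 11]
print(first_stable_index(fan4))
```

A larger `window` makes the check stricter, which matters here because the fans sat at intermediate values for several minutes before dropping again.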

Anyway, I won't spend more time on this, it is somewhat strange but it is what it is. Anyhow thanks for the responses.