System Administration

CPU Temperature above threshold

 
Jeremy Simmons
Occasional Advisor

CPU Temperature above threshold

Brand new ML150 G6 (2.13 GHz Quad Core)

CentOS 5.3 keeps complaining intermittently that CPU 2 and CPU 3 are hot, but the hardware monitoring in the BIOS and the HPASM utility each show only one temperature for the physical CPU (and it is well within threshold).

So where is Linux getting 4 CPU temps and why is it wrong? How can I make the warnings go away?
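A quick way to see which logical CPUs are raising the warning, and how often, is to count the messages in the system log. This is only a sketch: it assumes the kernel lines carry a "CPU<n>:" tag right before the phrase "Temperature above threshold" and that they end up in /var/log/messages; the function name is made up for illustration.

```shell
# Count "Temperature above threshold" warnings per logical CPU.
# Assumption: each warning line looks like "... CPU<n>: Temperature above threshold ...".
count_thermal_warnings() {
    # $1: log file to scan (defaults to /var/log/messages)
    log=${1:-/var/log/messages}
    grep 'Temperature above threshold' "$log" \
        | sed -n 's/.*\(CPU[0-9][0-9]*\): Temperature above threshold.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Running count_thermal_warnings with no argument scans the default log and prints one line per CPU tag followed by its warning count.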
12 REPLIES
Lukas Th. Hey
Occasional Visitor

Re: CPU Temperature above threshold

Hi there,

do you have "acpitool" installed and what does this tool say?

The BIOS is not important to a system once the Linux kernel has been loaded into main memory. So let's use the values the operating system delivers ;-)

I haven't worked with HPASM so far, but I guess it would be interesting to compare the values from HPASM with those from acpitool.

My 0.02 USD
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

I haven't found a way to get ACPI data. acpitool shows no data and lm_sensors can't find any sensors.
Goran Koruga
Honored Contributor

Re: CPU Temperature above threshold

Hello.

You said it's a brand new system - all new CPUs have built-in sensors. Did you run sensors-detect? That's pretty helpful whenever one has to use let's-save-money-on-useless-things brand-name systems without additional sensors.

Goran
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

Of course. But lm_sensors doesn't seem to support the chipset and I haven't found a driver to make it work yet. All the research keeps coming back to just use IPMI/HPASM since that is what HP provides and supports.

What I do find curious, and maybe this is the problem, is the temperatures being reported by IPMI.

CPU temp - 73 C
Intake (ambient) temp - 27 C
DIMM temps - 32-34 C
PCI temps - 31-40 C

These all seem a bit high. The room is cooled to about 73 F (roughly 23 C), so the intake temp seems rather high, and all the postings I've found regarding CPU temps suggest that 50 C is normal and 70 C would be almost meltdown.

But the HP software says 73 C is within "normal operating range" and is normal all the way up to something like 128 C.

If these temps are normal for this system (or the sensor module reports higher than actual temps) but Linux expects lower numbers, that would be an issue.
Goran Koruga
Honored Contributor

Re: CPU Temperature above threshold

That was the point of my previous mail - the built-in CPU temperature sensors don't require chipset drivers. They are handled by CPU-specific drivers such as k8temp, k10temp or coretemp.

Goran
Goran Koruga
Honored Contributor

Re: CPU Temperature above threshold

Sorry it's too early for me obviously - post not mail.
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

And how do you view those sensors if sensors-detect can't find any?
Goran Koruga
Honored Contributor

Re: CPU Temperature above threshold

Did you verify if your kernel has those drivers? Something like this:

ls -l /lib/modules/$(uname -r)/kernel/drivers/hwmon

Look for coretemp, k8temp or k10temp, depending on your CPU type. Note that k10temp has been added to the kernel only recently.
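The same check can be scripted so it reports only the CPU sensor drivers it finds. This is a rough sketch, not a tested tool: the function name is made up, it only looks for module files named coretemp, k8temp or k10temp (with any ".ko" suffix), and a driver compiled directly into the kernel would not show up.

```shell
# Print which CPU temperature drivers exist as modules in a hwmon directory.
find_cpu_temp_drivers() {
    # $1: hwmon module directory (defaults to the running kernel's)
    dir=${1:-/lib/modules/$(uname -r)/kernel/drivers/hwmon}
    for name in coretemp k8temp k10temp; do
        # Match coretemp.ko as well as compressed variants such as coretemp.ko.xz
        for f in "$dir/$name".ko*; do
            if [ -e "$f" ]; then
                echo "$name"
                break
            fi
        done
    done
}
```

Called without arguments it inspects the running kernel. An empty result means none of the three modules is shipped with that kernel.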

Goran
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

Only k8temp is there but that's only for AMD processors, isn't it?
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

Well, you pointed me in the right direction.

The CentOS 5.3 kernel removed the coretemp module, and lm_sensors was likewise changed so that it no longer detects it.

I found a patched lm_sensors with detection enabled and installed the coretemp module.

This website was very helpful: http://www.pendre.co.uk/archive-linux.php

Now I can see what the core temps are being reported as - they claim to be around 90 C, which seems a bit warm.
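For anyone who wants the raw numbers without lm_sensors: once the driver is loaded, hwmon drivers expose their readings as temp*_input files in sysfs, in millidegrees Celsius. Here is a small sketch that prints them in whole degrees. The function name is made up, and the directory has to be supplied by hand because its location differs between kernels.

```shell
# Print every temp*_input reading found in a sysfs sensor directory.
print_core_temps() {
    # $1: directory holding the temp*_input files (location varies by kernel)
    dir=$1
    for f in "$dir"/temp*_input; do
        [ -r "$f" ] || continue
        milli=$(cat "$f")
        # hwmon reports millidegrees Celsius; integer division drops the fraction
        echo "$(basename "$f"): $((milli / 1000)) C"
    done
}
```

A call might look like print_core_temps /sys/class/hwmon/hwmon0, but treat that path as an example only and check where the driver put its files on your system.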

I have a support case open to determine what the real issue is.
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

I just wanted to post the resolution here.

It took a few days of working with HP Support, sending them logs and such, and they finally sent a tech out today. As soon as the tech opened the cover he saw the problem. Unfortunately, had I looked just a little more closely I would have seen it too, but since the server is in a remote office I haven't had much time with it physically.

It's a dual-processor board but only one processor is installed. When HP built the server, they put the heat sink on the empty socket.

Good job HP.

But it's nice when it's a simple fix, and even nicer to know that Intel's CPU thermal controls and Linux's built-in monitoring actually work.

Thanks for your help.
Jeremy Simmons
Occasional Advisor

Re: CPU Temperature above threshold

See previous post.