Server Overheating - Sensor 31 - HD Controller

 
mjthurgood
Advisor

I have an HP ProLiant DL360p Gen8 server that powered down unexpectedly.  iLO shows a critical temperature condition at sensor 31-HD Controller.  Controller health, however, is good.  I'm going to open up the case now and see if I get burned.  Has anyone else seen this condition?  Is it a faulty sensor, perhaps?  Can I get a link to the latest free firmware update utility for this server?

Picture.jpg

Suman_1978
HPE Pro

Re: Server Overheating - Sensor 31 - HD Controller

Hi,

Have you opened the chassis and checked for any heating issue?
In the iLO, please check the IML log.

Here is the link for drivers and firmware; some are free and some are not.

Thank You!
I work with HPE but opinions expressed here are mine.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
mjthurgood
Advisor

Re: Server Overheating - Sensor 31 - HD Controller

Thank you for your response.  I've opened case #5374403328 with HPE tech support to further investigate.  Before opening the case, I tried disconnecting the Super CAP battery from the Smart Array P420i controller.  During POST, messages were shown stating that this was an unsupported configuration; nevertheless, the server booted into ESXi.  There was no change to the critical temperature, however.  I shut down and reconnected the Super CAP battery, and while working with HPE tech support, applied the latest firmware (8.32) for the Smart Array P420i.  Again, no change to the critical temperature.  I left the server powered off and unplugged overnight, but the critical temperature error still returns before ESXi fully loads.

I uploaded AHS logs to the case and the support engineer found the critical temperature errors, along with mention of an issue with fan 7 and one DIMM reporting errors.  Fan 7 was replaced in October 2022.  Both fan 7 and all DIMMs show healthy in iLO.

The server with trouble is sandwiched between two other 1U servers, with no space in between.  The top server has sensor 31 at 85 degrees C; the bottom one at 59 degrees C.  The top server has RAID 5 and RAID 1 arrays.  The bottom one has only RAID 1, as does the server with trouble.  Here is a view of IML.

I'm going to pull the server out of the equipment rack and plug it in by itself in a different room with the cover removed and see if the elevated temperature is indeed coming from the Smart Array P420i.  If so, I guess we should try a replacement?  Or perhaps it is a motherboard issue . . .

[Image: view of IML (Picture.jpg)]

 
mjthurgood
Advisor

Re: Server Overheating - Sensor 31 - HD Controller

I have the server pulled out of the equipment rack and resting on a table, with the top cover off.  It has booted successfully into ESXi, but continues to report Sensor 31-HD Controller at 124 degrees C.  That location is quite cool.  In fact, there are no locations inside the server that are too hot to touch.  Diagnosis?  Bad sensor?  Does that mean a motherboard replacement is required?
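For anyone who would rather watch this sensor from a script than from the iLO web page, here is a minimal Python sketch. It assumes the iLO firmware exposes the standard Redfish Thermal resource (usually found under a path like /redfish/v1/Chassis/1/Thermal/) and that the JSON has already been fetched and parsed into a dict. The field names `Temperatures`, `Name`, `ReadingCelsius` and `UpperThresholdCritical` come from the Redfish Thermal schema; the function name and the sample payload are hypothetical. Only the 124 degrees C reading is taken from this thread; the other readings and all thresholds are made up for illustration.

```python
def sensors_at_or_above_critical(thermal):
    """Return (name, reading, critical) for each sensor at or above its critical threshold."""
    flagged = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        critical = sensor.get("UpperThresholdCritical")
        # Skip absent sensors and sensors that report no usable threshold
        # (missing, None, or 0).
        if not reading or not critical:
            continue
        if reading >= critical:
            flagged.append((sensor.get("Name", "?"), reading, critical))
    return flagged


# Illustrative payload only: the 124 C reading is from this thread,
# every other value (including all thresholds) is invented.
sample = {
    "Temperatures": [
        {"Name": "01-Inlet Ambient", "ReadingCelsius": 22, "UpperThresholdCritical": 42},
        {"Name": "31-HD Controller", "ReadingCelsius": 124, "UpperThresholdCritical": 100},
        {"Name": "Not installed", "ReadingCelsius": 0, "UpperThresholdCritical": 0},
    ]
}

for name, reading, critical in sensors_at_or_above_critical(sample):
    print(f"{name}: {reading} C (critical at {critical} C)")
```

A check like this only reports what the sensor says. It cannot tell a genuinely hot part from a bad reading, so a flagged sensor still needs a physical inspection like the one described above.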

mjthurgood
Advisor

Re: Server Overheating - Sensor 31 - HD Controller

Updated iLO firmware to 2.82.  No change in critical temperature issue.

mjthurgood
Advisor

Re: Server Overheating - Sensor 31 - HD Controller

HP support has concluded: replace cache module, then system board, then HD controller, in that order.

mjthurgood
Advisor
Solution

Re: Server Overheating - Sensor 31 - HD Controller

Mystery solved!  I found the items in the first picture loose inside the case.  I didn’t notice at first that the heat sink was loose, as it was sitting right where it should be, but I noticed some movement when I was disconnecting network cables from the back.  That heat sink sits on top of the integrated SAS controller in the second picture.  The plastic posts holding the heat sink in place have snapped in two.  I found one of the springs that fit around the plastic posts inside the case, but haven’t located the second one.  You can see the posts that have snapped off in the second picture.

[Image 1: Loose Parts]
[Image 2: Where the Parts Should Be]

 

Sunitha_Mod
Moderator

Re: Server Overheating - Sensor 31 - HD Controller

Hello @mjthurgood,

That's excellent! 

We are extremely glad to know the problem has been resolved, and we appreciate you keeping us updated.



Thanks,
Sunitha G
I'm an HPE employee.
DanielMarinov
New Member

Re: Server Overheating - Sensor 31 - HD Controller

I have the same issue. Did you replace the heat sink pins, and if so, where did you get the replacements?

mjthurgood
Advisor

Re: Server Overheating - Sensor 31 - HD Controller

We found a server on eBay with matching processors and storage controllers, for around $265, and transferred all other server components into it.  See, for example:  HP ProLiant DL360p Gen8 Server BOOTS 2x Xeon E5-2620 2.0GHz 96GB RAM NO OS/HDD | eBay