ProLiant Servers (ML,DL,SL)

Mike Lauer
New Member

ML350 CPU Thermal Issue

A few times per month our ProLiant server automatically restarts because the CPU thermal sensor detects overheating. It happens at random times during the day. We replaced the thermal sensor and cleaned out the chassis. The server is located in a room that is about 72 degrees. All fans are working fine and there is plenty of room around the server for airflow.

Here is the event log:

12:42:00 Server Agents Information Events 1136
"System Information Agent: Health: A Temperature Sensor Condition has been set to ok. The system's temperature has returned to the normal operating range.
Chassis: '0'; Location: '6'
(Location values: 1=other, 2=unknown, 3=system, 4=system board, 5=I/O board, 6=CPU, 7=memory, 8=storage, 9=removable media, 10=power supply, 11=ambient, 12=chassis, 13=bridge card)


12:42:00 PM cpqasm2 Information None 4110
Environment Abnormality Auto Shutdown (EAAS) cancelled.


12:42:00 PM cpqasm2 Information None 4103
The system temperature (thermal sensor #2) has cooled down to below the threshold.


12:38:59 PM Server Agents Warning Events 1135
"System Information Agent: Health: A Temperature Sensor Condition has been set to degraded.
The system may or may not shutdown depending on the state of the thermal degraded action value '3'.
Chassis: '0'; Location: '6'
(Thermal degraded action values: 1=other, 2=continue, 3=shutdown)
(Location values: 1=other, 2=unknown, 3=system, 4=system board, 5=I/O board, 6=CPU, 7=memory, 8=storage, 9=removable media, 10=power supply, 11=ambient, 12=chassis, 13=bridge card)


12:38:59 PM USER32 Information None 1074
"The process winlogon.exe has initiated the restart of computer SERVER on behalf of user NT AUTHORITY\SYSTEM for the following reason: Legacy API shutdown
Reason Code: 0x80070000
Shutdown Type: restart
Comment: HP ProLiant System Shutdown Service: System is too hot or has lost cooling."


12:38:59 PM cpqasm2 Error None 4111
Environment Abnormality Auto Shutdown (EAAS) initiated due to thermal reasons, either resulting from the system overheating, or from the loss of cooling.
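
For anyone reading these events later, here is a rough sketch of how the numeric codes in them decode and how long the thermal condition lasted. It is only an illustration, not an HP tool: the two lookup tables are copied from the legends printed in events 1135 and 1136 above, and the names `decode_event` and `seconds_between` are hypothetical helpers made up for this example.

```python
import re
from datetime import datetime

# Legends copied from the text of events 1135/1136 above.
LOCATIONS = {
    1: "other", 2: "unknown", 3: "system", 4: "system board",
    5: "I/O board", 6: "CPU", 7: "memory", 8: "storage",
    9: "removable media", 10: "power supply", 11: "ambient",
    12: "chassis", 13: "bridge card",
}
DEGRADED_ACTIONS = {1: "other", 2: "continue", 3: "shutdown"}


def decode_event(text):
    """Translate the numeric location/action codes in an event message.

    Hypothetical helper: it relies only on the wording shown in the log above.
    """
    result = {}
    loc = re.search(r"Location: '(\d+)'", text)
    if loc:
        result["location"] = LOCATIONS.get(int(loc.group(1)), "unrecognized")
    act = re.search(r"thermal degraded action value '(\d+)'", text)
    if act:
        result["action"] = DEGRADED_ACTIONS.get(int(act.group(1)), "unrecognized")
    return result


def seconds_between(start, end, fmt="%I:%M:%S %p"):
    """Return the number of seconds between two same-day log timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


event_1135 = (
    "The system may or may not shutdown depending on the state of the "
    "thermal degraded action value '3'.\n"
    "Chassis: '0'; Location: '6'"
)
decoded = decode_event(event_1135)
cooldown_seconds = seconds_between("12:38:59 PM", "12:42:00 PM")

print(decoded)           # {'location': 'CPU', 'action': 'shutdown'}
print(cooldown_seconds)  # 181
```

Read this way, the log says the CPU sensor (location 6) went degraded with the action set to shutdown (value 3), and the sensor reported normal again about three minutes later.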




Any help or recommendations about how to resolve this problem would be greatly appreciated.
11 REPLIES
Ryan Goff
Valued Contributor

Re: ML350 CPU Thermal Issue

Which generation ML350 is this, and how many processors are installed?
Mike Lauer
New Member

Re: ML350 CPU Thermal Issue

Generation 4 with a single processor.
Ryan Goff
Valued Contributor

Re: ML350 CPU Thermal Issue

Have you replaced the heatsink before? If not, that will fix this issue.
Mike Lauer
New Member

Re: ML350 CPU Thermal Issue

I have not replaced the heatsink. I'll give that a try.
KMullins
Frequent Advisor

Re: ML350 CPU Thermal Issue

As Ryan said, replace the heatsink. Call HP and explain the problem, as this is a pretty common problem with the G4's heatsink.
KMullins
Frequent Advisor

Re: ML350 CPU Thermal Issue

Almost forgot: check the fan on the heatsink.
The fan should be on the media bay side of the processor, not the memory side.

A fan mounted on the wrong side can cause the same problem as well.
David Paris_1
Frequent Advisor

Re: ML350 CPU Thermal Issue

Exactly. The ML350 G4 has problems with overheating. The ML350 G3 has an air duct for the processor and memory; I believe the G4 also has one, but it is optional. We have hundreds of ML350 G4s and we had that problem. The other problem that model has is NMI errors: we tried everything, but the problem was only solved by changing the system with the latest firmware.

By

Re: ML350 CPU Thermal Issue

First check whether your processor heatsink fan is working properly, and clean it as well as possible. Check that it is properly mounted on the processor. If the problem is still there, then it's time to replace the heatsink with a new one.
Suraj Wellala
Occasional Advisor

Re: ML350 CPU Thermal Issue

I am also experiencing the same problem with my ML350 G4p server. As far as I can remember, this happened after I upgraded the memory from 2GB to 6GB.
I have attached the system log file for reference.
I checked the CPU fan, and all fans are working properly.

Thank you very much.