Re: Thermal Shutdown

 
Kevin Everts_1
Regular Advisor

Thermal Shutdown

The air conditioning unit in my small server room failed last night and the temperature in the room got up to 105 degrees Fahrenheit. I have 10 servers but only 1 shut itself down. The server that shut down was a DL360 G4. The servers that didn't shut down are DL360 G3, DL380 G4, DL380 G2, ML310 G3, DL585 G1, ML370 G2. I checked SMH and it is set up to shut down on thermal failure. Any idea why the others wouldn't have shut down?
6 REPLIES
David Claypool
Honored Contributor

Re: Thermal Shutdown

Check the logs. Do the other 9 systems have any thermal degraded conditions logged? They may not have crossed the threshold and were perfectly happy to keep running.
Kevin Everts_1
Regular Advisor

Re: Thermal Shutdown

I just checked the logs and nothing is logged on the other servers.
Rob Leadbeater
Honored Contributor

Re: Thermal Shutdown

Hi,

I'm guessing that the DL360 G4 has a lower temperature threshold.

We had a similar issue last year in our computer room with a mixture of DL380 G3s and G4s, DL360 G3s and G4s, etc.

The first machine to shut itself down was a DL360 G4, even though others in the same rack managed to stay up longer. The others did eventually shut down though.

Cheers,

Rob
Kevin Everts_1
Regular Advisor

Re: Thermal Shutdown

Do you know what the ambient temperature was in the room when the other servers shut down?
Rob Leadbeater
Honored Contributor

Re: Thermal Shutdown

Hi,

The ambient temperature in the room hit around 49°C / 120°F - it might have been higher, that's as far as the thermometer could go!

I've just had a look at the logs of some of the servers and there are different thresholds.

DL360 G4 Shutdown at 42°C
DL580 G2 Shutdown at 52°C
DL380 G4 Shutdown at 50°C
DL360 G3 Shutdown at 56°C

As you can see, the DL360 G4 went down a lot quicker than the others.

Cheers,

Rob
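
For anyone comparing against a room reading in Fahrenheit, here is a minimal Python sketch that sorts the four thresholds quoted above and converts them. The threshold values come straight from the post; the names `SHUTDOWN_THRESHOLDS_C`, `c_to_f`, `f_to_c` and `shutdown_order` are made up for illustration. The values are internal sensor thresholds, not room temperatures, so the sorted order is only a rough guide to which model is likely to drop first.

```python
# Shutdown thresholds in degrees Celsius, as quoted from the server logs above.
# These are internal sensor thresholds, NOT ambient room temperatures.
SHUTDOWN_THRESHOLDS_C = {
    "DL360 G4": 42,
    "DL580 G2": 52,
    "DL380 G4": 50,
    "DL360 G3": 56,
}


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def f_to_c(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


def shutdown_order(thresholds):
    """Return (model, threshold) pairs sorted from lowest to highest threshold."""
    return sorted(thresholds.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for model, limit_c in shutdown_order(SHUTDOWN_THRESHOLDS_C):
        print(f"{model}: {limit_c}°C / {c_to_f(limit_c):.1f}°F")
    # The 105°F room reading from the original question, in Celsius.
    print(f"105°F room reading = {f_to_c(105):.1f}°C")
```

The DL360 G4 sorts first at 42°C (107.6°F), which fits with it being the only machine to go down in a room that reached 105°F (about 40.6°C).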
David Claypool
Honored Contributor

Re: Thermal Shutdown

Just be aware that you can't infer ambient temperature from internal sensors or correlate sensor readings between models. The thresholds are set based on several factors measured in the lab and are a function of:

a. the proximity to the heat source (e.g. CPU)
b. the amount of airflow past the sensor, and
c. the specifications of the component being protected
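
In script terms, the point above means a sensor reading should only ever be checked against the threshold for its own model. A rough sketch, assuming hypothetical readings and a simple "at or above the threshold" rule (the real firmware logic is not described in this thread); the two thresholds are the ones quoted earlier, and `THRESHOLDS_C` and `should_shut_down` are illustrative names only:

```python
# Thresholds in degrees Celsius quoted earlier in the thread; they are
# model-specific and apply to an internal sensor, not to the room.
THRESHOLDS_C = {"DL360 G4": 42, "DL380 G4": 50}


def should_shut_down(model, sensor_c, thresholds=THRESHOLDS_C):
    """Compare a reading against that model's own threshold only.

    Assumption: shutdown triggers when the reading is at or above the
    threshold. The actual firmware behaviour may differ.
    """
    return sensor_c >= thresholds[model]


# The same hypothetical 45°C internal reading means different things per model.
print(should_shut_down("DL360 G4", 45))
print(should_shut_down("DL380 G4", 45))
```

The same number trips one model and not the other, which is why readings can't be correlated across models or read back as an ambient temperature.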