Operating System - HP-UX

Heat too high, what problems?

 
Tracey
Trusted Contributor

On Monday night I received a call from the facilities engineer telling me that the air conditioner in the computer room was not working and that the temperature was 92 degrees. When I came in to shut everything down, one N4000 and one K460 were displaying a temperature-out-of-range message on the console. I brought all machines down, but once we had manually cooled things down (really BIG fan) I brought each machine up long enough to see if there were any major problems - there didn't seem to be.

Since it was Christmas Eve, the repair person was having trouble getting a new compressor, so all machines stayed down until late the next day (Christmas Day). I brought them up last night, and everything seems fine.

What problems/concerns should I be looking out for on these machines? They include several of the following: K460, N4000, D210, L2000
7 REPLIES
Steven Sim Kok Leong
Honored Contributor
Solution

Re: Heat too high, what problems?

Hi,

For most systems, there is a built-in temperature threshold. Once the temperature threshold is exceeded, the system will perform an automatic shutdown.

At least that was what happened to my 2 x L1000 servers when the air conditioner failed and the backup did not kick in.

Hope this helps. Regards.

Steven Sim Kok Leong
Brainbench MVP for Unix Admin
http://www.brainbench.com
John Bolene
Honored Contributor

Re: Heat too high, what problems?

Surprising that they did not shut down since they were at a higher temp.

The main thing is that when processors get hot, they can start to exhibit compute errors. The internal disk drives may have their life shortened.

Sounds like you shut them down before any permanent damage was done.
It is always a good day when you are launching rockets! http://tripolioklahoma.org, Mostly Missiles http://mostlymissiles.com
Roger Baptiste
Honored Contributor

Re: Heat too high, what problems?

Firstly, it is surprising that the systems didn't shut down or "hang up" at such a high temperature (92!). You probably got there and took action at the right time.

I ran into similar problems two years back on Christmas Day with a couple of K580s, but found no damage of any kind whatsoever.

What problems to look for? Now that the system is up and running and the air conditioning is hopefully up, I wouldn't see anything to check for. Make sure the disks are all working fine (run ioscan on them). If you have EMCs or other high-end disk arrays, get the vendor engineer to look at their logs.
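For anyone who wants to automate the ioscan check, here is a rough sketch only. It assumes the usual `ioscan -fn` column layout (Class, I, H/W Path, Driver, S/W State, ...) and an assumed list of states worth a second look; verify both against ioscan(1M) on your release. The function name and the sample text are made up for illustration and are not real output from any machine in this thread.

```python
# S/W states that suggest a device needs a closer look.
# ASSUMPTION: this list and the column layout should be checked against ioscan(1M).
SUSPECT_STATES = {"NO_HW", "UNCLAIMED", "ERROR", "DIFF_HW"}


def suspect_devices(ioscan_text):
    """Return (hardware_path, state) for each device line showing a suspect state."""
    flagged = []
    for line in ioscan_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, the ===== separator and /dev/... file lines.
        if len(fields) < 3 or fields[0] == "Class" or fields[0].startswith(("=", "/")):
            continue
        # Search every field for a suspect state, because the Driver column
        # can be empty on some lines and shift the positions.
        state = next((f for f in fields if f in SUSPECT_STATES), None)
        if state is not None:
            flagged.append((fields[2], state))  # fields[2] is the H/W Path column
    return flagged


# Made-up sample text, shaped like `ioscan -fnC disk` output (illustration only).
SAMPLE = """\
Class     I  H/W Path    Driver  S/W State   H/W Type  Description
===================================================================
disk      0  8/4.5.0     sdisk   CLAIMED     DEVICE    EXAMPLE DISK
             /dev/dsk/c0t5d0   /dev/rdsk/c0t5d0
disk      1  8/4.6.0     sdisk   NO_HW       DEVICE    EXAMPLE DISK
"""

for path, state in suspect_devices(SAMPLE):
    print(path, state)
```

Saving the ioscan output to a file right after the outage and running the same check again a few weeks later gives a simple before/after comparison.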

HTH
raj
Take it easy.
John Payne_2
Honored Contributor

Re: Heat too high, what problems?

I would expect that you will see failures in the future. In the next couple of weeks (unless you are really lucky) you may see processors go bad, disks go bad, etc.

A day is kind of a long time to be down, if you have older machines with old disks. I would just keep an eye on things for a while.
Spoon!!!!
Tracey
Trusted Contributor

Re: Heat too high, what problems?

Right now all ioscans/disks/peripherals seem to be working properly 18 hours later. I fully expect that the heat problem will have shortened the lifespan of some of the disks - I just hope they don't all decide to pick the same time to go down!
Thierry Poels_1
Honored Contributor

Re: Heat too high, what problems?

Hi,

The N- and L-class shut down when they are overheated; not sure about the "older" K- and D-class.

You're probably all right, as they didn't shut down automatically.
The only tricky part was probably getting the disks back up and running after they had completely cooled down. Powering disks off and on after they have been running continuously for months is always tricky.

good luck,
Thierry.
All unix flavours are exactly the same . . . . . . . . . . for end users anyway.
A. Clay Stephenson
Acclaimed Contributor

Re: Heat too high, what problems?

Hi Tracey:

While this is not meant to be a definitive answer, I do have some data. About 3 1/2 years ago, I deliberately shut down the HVAC and allowed a D380 and a K580 to overheat. Being under 24x7 maintenance and having redundant systems tends to make one fearless. I mainly wanted to ascertain that the machines would indeed shut themselves down - they did. I also wanted to determine whether the failure rates of the associated hardware were any higher than those of hardware that did not undergo the thermal excursion. I have always kept a log of hardware replacements, and those boxes did not exhibit a significantly higher component failure rate than any of my other boxes. It appears that they shut themselves down before serious damage occurred. Note that the external drives did not shut down at this time but ran until the UPS did a thermal shutdown. I observed no statistically significant increase in the failure rate of those drives either.
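For anyone who keeps a similar replacement log, one possible way to check for a "statistically significant" difference is a one-sided Fisher exact test on the failure counts. This is only a sketch: the post does not say which method was used, and the function name and arguments are hypothetical.

```python
from math import comb


def fisher_one_sided(fail_a, total_a, fail_b, total_b):
    """One-sided Fisher exact test.

    Returns the probability that group A (e.g. the overheated boxes) would show
    at least fail_a failures if both groups really shared one failure rate.
    A small value (say below 0.05) hints that group A fails more often.
    """
    total_fail = fail_a + fail_b
    n = total_a + total_b
    denom = comb(n, total_fail)           # ways to spread the failures over all units
    upper = min(total_a, total_fail)      # group A cannot have more failures than this
    tail = sum(
        comb(total_a, k) * comb(total_b, total_fail - k)
        for k in range(fail_a, upper + 1)
    )
    return tail / denom


# Usage with counts from your own log, for example:
#   p = fisher_one_sided(failed_hot, units_hot, failed_other, units_other)
```

With only a handful of machines the test has little power, so a large p-value means "no evidence of a difference" rather than proof that the heat did no harm.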

Regards, Clay
If it ain't broke, I can fix that.