Operating System - OpenVMS

Shutdown temperature

Elamparithi.P
Occasional Contributor



What is the temperature limit at which a system shutdown is initiated on AlphaServer ES40 and ES45 systems and on the Integrity rx2640 (Itanium)?
6 REPLIES
John Gillings
Honored Contributor

Re: Shutdown temperature

This is known as a "piece of string" question.

I don't think there's any default mechanism in OpenVMS which initiates a controlled system shutdown depending on temperature. Different systems have different sets of temperature sensors on various system components which can be sampled from software (ranging from "none" to "lots").

There may be software available which monitors the system and can trigger actions at set temperature points for various system components (which might include invoking SYSHUTDOWN.COM). I'd imagine any system doing something like that would be configurable.

I'm sure there are temperature limits for different components at which the hardware will turn itself off, but the exact details most likely depend on the exact hardware configuration.
A crucible of informative mistakes
Hoff
Honored Contributor

Re: Shutdown temperature

You'll want to read the technical and service manuals for the various boxes of interest for details of the management interfaces and available sensors. You can acquire and read and research those documents as well as we can, after all.

There are threshold warnings and emergency (hard) shutdown temperatures established on many of the AlphaServer boxes. Some are simple over-temp sensors (trip them and the box shuts down), and some have associated thermal read-outs (which also trip out, but you can read the sensor).

Details on how to read the current temperatures (for those boxes that have the read-outs) are available via the service processor: the RMC or RCM, depending on the Alpha box.

Off-hand, I don't know if the rx2640 has these, but it would not surprise me to find that.

There are also out-board options available; mechanisms which can sense the environment and can then perform a controlled or an uncontrolled (hard) shutdown.

So is this a homework question or a promotional quiz, a customer query or survey, simple curiosity, a fried server room or what? That detail can help steer the discussion in the most appropriate direction. If you've got a fried server room, for instance, you'll likely want environmental monitoring and notification, not host-level monitoring; catch the cooling problem before the boxes overheat.

Richard W Hunt
Valued Contributor

Re: Shutdown temperature

This gets tricky to answer because it depends on two things: The accuracy of the thermal sensor and a setpoint for that sensor. Neither is guaranteed to be excruciatingly precise.

I've never had a thermal shutdown on my ES40s, mostly as a matter of luck. The last machine I managed on which a thermal shutdown occurred was an AS4100 in a room where one of the A/C compressors developed a pesky little leak. When the internal sensor for the AS4100 got over 97 degrees F, I knew it was just a matter of time.

With OpenVMS 7.3-2, you have the ability to do F$GETSYI calls for your various environmental vectors. Use something like this in your DCL environment:

$ XV = F$GETSYI("vector-name")

to get the appropriate sensor vector, which you can then decompose using the F$EXTRACT() lexical function. Vector names are FAN_VECTOR, POWER_VECTOR, THERMAL_VECTOR, and TEMPERATURE_VECTOR. Each is a string of 32 hex digits presented as 16 pairs. Extract the digit pairs. WARNING: They number backwards in this string. The RIGHTMOST digits are sensor 0 and they increase to the left. You need the system unit's engineering drawings to know where the sensors are located.

For the vectors, if the digit pair is "FF" then there is no sensor and you should do no testing - or not care about the reading. For all except the TEMPERATURE_VECTOR, a value of 0 is BAD and 1 is OK. The THERMAL_VECTOR is only for those systems that don't have sensors for the TEMPERATURE_VECTOR. If you have a usable temp vector, the thermal vector returns all FF (as far as I can tell). In the temp vector, the read-outs are each a hexadecimal, 2-digit temperature as degrees C. So if the vector has a 14, that is decimal 20 (degrees C), or 68 (degrees F).
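As a rough illustration of the decoding rules above, here is a minimal sketch in Python (used only to show the logic; on the VMS side the same steps would be done with F$GETSYI and F$EXTRACT). The function names and the sample string are made up for the example and are not real sensor output.

```python
def decode_temperature_vector(vector):
    """Return {sensor_number: degrees_C} for a vector string of hex digit pairs.

    Assumes the layout described above: two hex digits per sensor, sensor 0 in
    the RIGHTMOST pair, numbers increasing to the left, "FF" meaning no sensor.
    """
    readings = {}
    for sensor in range(len(vector) // 2):
        end = len(vector) - 2 * sensor      # pairs are counted from the right
        pair = vector[end - 2:end]
        if pair.upper() == "FF":            # no sensor in this slot
            continue
        readings[sensor] = int(pair, 16)    # hex pair -> degrees C
    return readings


def c_to_f(deg_c):
    """Convert degrees C to degrees F."""
    return deg_c * 9 / 5 + 32


# Hypothetical 32-digit vector: sensors 0, 1 and 2 present, the rest absent.
sample = "FF" * 13 + "1A" + "16" + "14"
for sensor, deg_c in sorted(decode_temperature_vector(sample).items()):
    print(f"sensor {sensor}: {deg_c} C ({c_to_f(deg_c):.0f} F)")
```

In this made-up sample the rightmost pair, 14, decodes to 20 degrees C (68 degrees F) for sensor 0, matching the arithmetic above.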

After I lost the AS4100 a couple of times, I started monitoring the vectors with a little script that stored the values, compared current values to last scan's values, and sent me e-mail. It only notified me if a TEMPERATURE_VECTOR sensor changed upwards AND was over an arbitrary limit I chose for warning purposes. Got to the point that I knew before the on-deck operator did that the A/C was out again.
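The comparison described above might look roughly like the following sketch. It is in Python purely to show the logic, not the original script; the function name, the dictionaries of readings, and the 30 degree limit are all assumptions for illustration. Storing the last scan and sending the e-mail are left out.

```python
def sensors_to_report(previous, current, limit_c):
    """Return (sensor, before, now) for sensors that rose AND exceed limit_c.

    previous and current map sensor number -> degrees C from two scans.
    A sensor with no earlier reading is skipped, since there is nothing
    to compare it against.
    """
    alerts = []
    for sensor, now in sorted(current.items()):
        before = previous.get(sensor)
        if before is not None and now > before and now > limit_c:
            alerts.append((sensor, before, now))
    return alerts


# Hypothetical readings from two consecutive scans, with an arbitrary limit.
last_scan = {0: 28, 1: 31, 2: 35}
this_scan = {0: 29, 1: 33, 2: 34}
for sensor, before, now in sensors_to_report(last_scan, this_scan, limit_c=30):
    print(f"sensor {sensor} rose from {before} C to {now} C")
```

Requiring both conditions keeps the noise down: a sensor that is warm but steady or falling stays quiet, and so does one that is rising but still under the limit.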

When I lost a fan one day, I also knew it, because when a sensor changes from 1 to 0, something broke. The POWER_VECTOR is the status of the power supplies on the back of the machine. If you lose one of those, it is also a good thing to know.
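Going by the convention given earlier for the FAN_VECTOR and POWER_VECTOR (0 is bad, 1 is OK, FF is no sensor), spotting a newly failed fan or power supply between two scans could be sketched like this. Again this is Python for illustration only; the function names and sample vectors are hypothetical, and treating any value other than 1 as "bad" is an assumption.

```python
def decode_status_vector(vector):
    """Return {sensor_number: True if OK, False if bad}, skipping FF slots.

    Sensor 0 is the RIGHTMOST digit pair. A pair equal to 1 counts as OK;
    anything else (other than FF) is treated as bad, which is an assumption.
    """
    status = {}
    for sensor in range(len(vector) // 2):
        end = len(vector) - 2 * sensor
        pair = vector[end - 2:end].upper()
        if pair == "FF":
            continue
        status[sensor] = int(pair, 16) == 1
    return status


def newly_failed(previous_vector, current_vector):
    """Return sensors that were OK in the previous scan and are bad now."""
    before = decode_status_vector(previous_vector)
    after = decode_status_vector(current_vector)
    return [s for s in sorted(after) if before.get(s) and not after[s]]


# Hypothetical vectors: three sensors present, sensor 1 fails between scans.
last_scan = "FF" * 13 + "01" + "01" + "01"
this_scan = "FF" * 13 + "01" + "00" + "01"
print(newly_failed(last_scan, this_scan))
```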
Sr. Systems Janitor
Hoff
Honored Contributor

Re: Shutdown temperature

THERMAL_VECTOR and POWER_VECTOR are comparatively limited implementations; a subset of the AlphaServer series boxes have the underlying mechanisms.

See the specific I2C support, the associated WBEM pieces, and (for some boxes) the DECW$PRIVATEER logical name, and the SYS$GET_ENV_SENSORS service.

And AFAIK, the Itanium boxes are different here.

Folks here might well make a case for providing a single architected solution to this area (and when the data is available), but that's fodder for another discussion and for future processors. AFAIK, the hardware-level details, sensors and service buses, error log packets and other such have typically been considered private to HP and are entirely subject to omission or change or removal without notice.

If you want a reliable means to acquire this thermal data, install your own out-board environmental monitoring and keep the room from frying, whether via the AC, via APC NetBotz, or otherwise. There are cheap USB thermal monitors around, too; one of the HP folks got the Lascar Electronics EL-USB-2 widget working with OpenVMS, and other platforms have other options.

AlphaServer boxes aren't good clocks, and (here) they're (also) not good thermostats.

cnb
Honored Contributor

Re: Shutdown temperature

Hi,

Some systems had their values changed in firmware releases, so your settings may not match what is posted here.

On the Alphas, you can see what they are set to by using the Remote Management Console (RMC) 'env' command, entered at the RMC prompt from the console or the remote port (if enabled):

My ES40 returns these default values:
P00>>> [[rmc
RMC>env

System Hardware Monitor

Temperature (warnings at 45.00C, power-off at 50.00C)

NOTE: Use this procedure with caution if your system is up as it can CRASH the system if used incorrectly. See the Service/User Manuals for complete RMC usage information.

Sorry, can't help you with the rx2640 temp values.

HTH,


Hoff
Honored Contributor

Re: Shutdown temperature