topic Re: Cluster Transitions in Operating System - OpenVMS

Cluster Transitions

Kevin Raven (UK) — Fri, 26 Feb 2010 11:03:43 GMT

Always happy to learn even at my age ....

Four node cluster.
One of the nodes crashes and exits cluster with a CPU bcache hardware fault.
I would expect the other 3 nodes to have hung for the 4 second interval we have RECNXINTERVAL set at. Then business as normal.
However, when we look at our application latency stats, we don't see any hangs during the time of the crash. We measure latency in milliseconds and a 4 second hang would stick out like a sore one ....
Not complaining ....we survived a server crash and saw no loss of service ....
What am I missing?

p.s. This is the first server crash we have had in 4 years. We run 24 hours ...Sunday evening start to Friday night finish.
Latency measured in milliseconds.
Above 3 seconds is considered an outage and investigated.
Rock on OpenVMS and ES45 Alpha servers ....

Re: Cluster Transitions

Robert Gezelter — Fri, 26 Feb 2010 11:48:43 GMT

Raven,

What are your applications doing?

Precisely what "response time" are you measuring?

- Bob Gezelter, http://www.rlgsc.com

Re: Cluster Transitions

Volker Halle — Fri, 26 Feb 2010 13:17:34 GMT

The explanation is simple:

If there was a real OpenVMS crash (probably a MACHINECHK), then - during the crash processing - the node has sent a 'last gasp' message over all cluster channels, so the other nodes IMMEDIATELY removed that node from the cluster, without the need of waiting for RECNXINTERVAL to expire.

RECNXINTERVAL only comes into play if no 'last gasp' message has been sent, i.e. if you just HALT the machine or just power it off, or if the network connections all fail at once.

OpenVMS clusters - the best clusters there are !

Volker.

Re: Cluster Transitions

Kevin Raven (UK) — Fri, 26 Feb 2010 13:22:37 GMT

Yes it was a Machine check type crash ...that explains it ....

Yes OpenVMS Clusters simply the best.

Re: Cluster Transitions

Hoff — Fri, 26 Feb 2010 13:28:01 GMT

Chances are, the cluster didn't need to count any votes from a quorum disk here and (assuming one vote each) the cluster (still) had quorum when the node dropped out; for cases such as these, the transitions are fast.

In particular, a crashing cluster member node usually sends out the so-called last-gasp message. Receipt of this message allows the remaining nodes to completely bypass the RECNXINTERVAL mechanisms; they "know" what happened to the host.

Cases where no last gasp is sent (console halt, network disconnection, certain hardware failures, etc.) will require a longer transition.
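
To make the distinction concrete, here is a rough Python sketch of the reasoning in this thread. It is a toy model, not anything OpenVMS runs: the `Failure` names and the `expected_stall_s` helper are made up for illustration, and it models only the RECNXINTERVAL wait, not the (normally short) state transition that follows. The 4 second interval and the 3 second outage threshold are the figures quoted earlier in the thread.

```python
from enum import Enum

RECNXINTERVAL_S = 4.0     # value the original poster says is configured
OUTAGE_THRESHOLD_S = 3.0  # "above 3 seconds is considered an outage"


class Failure(Enum):
    """Hypothetical labels for the failure cases named in the thread."""
    CRASH_WITH_LAST_GASP = "bugcheck/machine check crash"
    CONSOLE_HALT = "console halt"
    POWER_OFF = "power off"
    NETWORK_LOSS = "all network connections lost"


# Assumption taken from the thread: only a real crash gets to send
# the last-gasp message; the other cases go silent.
SENDS_LAST_GASP = {Failure.CRASH_WITH_LAST_GASP}


def expected_stall_s(failure, recnxinterval_s=RECNXINTERVAL_S):
    """Time the survivors wait before removing the failed node.

    Returns 0.0 when a last gasp lets them bypass RECNXINTERVAL,
    otherwise the full reconnection interval.
    """
    if failure in SENDS_LAST_GASP:
        return 0.0
    return recnxinterval_s


if __name__ == "__main__":
    for failure in Failure:
        stall = expected_stall_s(failure)
        verdict = "outage" if stall > OUTAGE_THRESHOLD_S else "ok"
        print(f"{failure.value}: wait of about {stall:.1f}s ({verdict})")
```

Under these assumptions, the machine check crash costs no RECNXINTERVAL wait at all, while a halt or power-off with the same 4 second setting would cross the 3 second outage threshold.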

As for the voting configuration, if these boxes have a shared RAID-capable external controller, then the typical uptime configuration would be one vote to each host, and three votes to a shared quorum disk with controller RAID. You'd see fewer or no votes configured to the quorum disk when faster transitions are required, presuming the hosts are stable.
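
The vote arithmetic behind both configurations can be sketched in a few lines of Python. This is an illustration only: the helper names are invented, the vote counts are the ones assumed in this post (one vote per host, optionally three on the quorum disk), and it ignores any dynamic quorum adjustment the cluster or the operator may perform. The quorum value is computed as (EXPECTED_VOTES + 2) / 2, rounded down.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES, rounded down."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True if the surviving votes are enough to keep the cluster running."""
    return current_votes >= quorum(expected_votes)


# Configuration A (assumed): four hosts, one vote each, no quorum disk.
EXPECTED_A = 4
# Configuration B (assumed): four hosts, one vote each, plus a
# quorum disk holding three votes.
QDISK_VOTES = 3
EXPECTED_B = 4 + QDISK_VOTES

if __name__ == "__main__":
    # A: losing one host leaves 3 votes against a quorum of 3.
    print("A, one host lost:", has_quorum(3, EXPECTED_A))
    # A: losing two hosts leaves 2 votes, below quorum.
    print("A, two hosts lost:", has_quorum(2, EXPECTED_A))
    # B: a single host plus the quorum disk holds 4 votes, quorum is 4.
    print("B, one host + qdisk:", has_quorum(1 + QDISK_VOTES, EXPECTED_B))
```

In configuration A the cluster rides through the loss of one host with no quorum disk involved, which is the fast case. Configuration B trades transition speed for uptime: any single host together with the quorum disk can keep going.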

Inferring much from your statements around operations and latency and uptime, I'd also expect to be prototyping replacement servers here. Most any of the rx2660 series or blades would be the replacement path for this configuration. These boxes are nice, but they're also big and slow and getting older all the time.

Some reading:
http://labs.hoffmanlabs.com/node/153
http://labs.hoffmanlabs.com/node/569