Operating System - OpenVMS

Carleen Nutter
Advisor

Changing votes in a cluster without a cluster reboot

I had a cluster of 4 Alpha VMS servers, each with 1 vote, and a quorum disk with 3 votes, so that the cluster could stay up with as few as 1 server. (Quorum = 4: either all 4 nodes, or at least 1 node plus the quorum disk.)
One old node has failed and won't be replaced. Now I need the quorum disk votes to maintain quorum. This is a 7x24 system and I can't easily reboot the cluster to adjust votes, expected votes, etc. I can reboot one of the nodes without an application shutdown. My thought was to give that node 2 votes instead of 1. Then I'd have quorum with the 3 nodes alone (1 vote + 1 vote + 2 votes = 4).
Because of as yet undiagnosed problems, the quorum watcher keeps losing its connection to the quorum disk (NAS array) and the cluster hangs for a few seconds.
Giving one node 2 votes doesn't fix the problem of losing connection to the quorum disk, but I'm hoping to mitigate the cluster hangs until the connection problem can be diagnosed and resolved.

I'm just looking for a sanity check on the proposed change so that it doesn't result in a split cluster.
Expected votes still = 7
Quorum disk votes still = 3
2 nodes with votes still = 1
1 node with votes = 2 (modparams changed and node rebooted)
Quorum still 4 (3 nodes up; or at least 1 node with the quorum disk votes).
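
To spell out the arithmetic behind the list above, here is a minimal Python sketch. It assumes the usual OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down; the function and variable names are illustrative only and are not part of any VMS tool.

```python
def quorum(expected_votes: int) -> int:
    """Quorum as (EXPECTED_VOTES + 2) // 2 (assumed standard OpenVMS rule)."""
    return (expected_votes + 2) // 2

EXPECTED_VOTES = 7       # unchanged, from the list above
QDSK_VOTES = 3           # quorum disk votes, unchanged
NODE_VOTES = [1, 1, 2]   # two nodes at 1 vote, one node raised to 2

q = quorum(EXPECTED_VOTES)

# Case 1: all three nodes up, quorum disk unreachable.
three_nodes_no_disk = sum(NODE_VOTES)

# Case 2: only the weakest single node up, quorum disk reachable.
weakest_node_plus_disk = min(NODE_VOTES) + QDSK_VOTES

print("quorum:", q)
print("3 nodes, no quorum disk:", three_nodes_no_disk, three_nodes_no_disk >= q)
print("1 node + quorum disk:", weakest_node_plus_disk, weakest_node_plus_disk >= q)
```

Both cases land exactly on quorum, so there is no margin: losing any one node while the quorum disk is unreachable would drop the cluster below quorum.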

When I can reboot the cluster, I'll adjust everything to be more relevant to the configuration.
Hoff
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Adding an extra vote here will allow the existing hosts to ride over a problem with the quorum disk, and would avoid the I/O polling inherently required when the votes for the quorum disk need be counted toward quorum.

The delay for a cluster transition would be 3*QDSKINTERVAL when the votes from the quorum disk are required.

Given this environment could well involve a network problem, acquiring host-based votes without depending on the quorum disk may not be a sufficient salve for the instabilities. Clusters tend to be unstable when the network is unstable, and partitions (and hangs) can arise.
P Muralidhar Kini
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Hi Carleen,

In the new setup you would have 3 nodes and a quorum disk.
A quorum disk is generally recommended for a 2-node cluster.

In the new setup, you could give each of the 3 nodes 1 vote and do away with
the quorum disk.
In this case:
Each node would have Votes = 1
Quorum = 2
Hence, if any two of the nodes are up, the cluster would be up.

Even in the new setup, do you want a configuration in which the cluster
stays up even if only one node is up?

Regards,
Murali
Let There Be Rock - AC/DC
P Muralidhar Kini
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Hi Carleen,

The following link talks about OpenVMS cluster concepts.
http://h71000.www7.hp.com/doc/731final/4477/4477pro_002.html

This talks about Votes/Quorum/Quorum disk/Quorum disk watcher
and also rules for Specifying Quorum.

Regards,
Murali
Let There Be Rock - AC/DC
Hoff
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Murali, quorum disks are common in cluster configurations with rather more than two hosts.

In particular, they are used in cluster configurations that are intended to survive the loss or downing of comparatively large numbers of hosts: a three-node cluster that can survive with two of the nodes down, for instance, which parallels this case.

With hardware RAID underneath the quorum disk and with comparatively short polling timers, these configurations do work nicely for the specific requirements.

With a short timer value, a transition will take three seconds (polling at 1-second intervals), six (at 2), or nine (at 3) if and when the votes from the quorum disk are needed.
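
As a rough illustration of those figures, the sketch below computes the delay as a multiple of the polling interval. It assumes the 3*QDSKINTERVAL estimate given earlier in this thread; the function name and the `polls` parameter are made up for the example.

```python
def qdsk_transition_delay(qdskinterval_s: int, polls: int = 3) -> int:
    """Estimated delay in seconds, assuming polls * QDSKINTERVAL."""
    return polls * qdskinterval_s

# QDSKINTERVAL values of 1, 2 and 3 seconds, as in the text above.
for interval in (1, 2, 3):
    print(f"QDSKINTERVAL={interval}s: about {qdsk_transition_delay(interval)}s")
```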
P Muralidhar Kini
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Hi Hoff,

>> A three node cluster that can survive with two of the nodes down, for
>> instance, and as parallels this case.
Yes, I get your point. You have given a good example of why a quorum disk
would be required in general (for any set of nodes in a cluster).
And this is exactly the reason why Carleen is using the quorum disk in the
cluster setup.

Regards,
Murali
Let There Be Rock - AC/DC
John McL
Trusted Contributor

Re: Changing votes in a cluster without a cluster reboot

Why would you want a quorum disk in a cluster with an even number of nodes if each node has the same number of votes? Isn't the intention of the quorum disk to prevent a cluster being divided into 2 or more parts, each of which "thinks" it's a cluster in its own right?

With 4 Alphaservers, set expected votes to 2 and the failure of one machine would mean that at least 2 others need to be present to form a cluster. If you have a greater number of machines all with equal votes, then just divide the number of nodes by 2 to get your expected votes.

If you have an odd number of machines and the loss of 1 leaves you with the potential to have two "disjoint clusters" with an equal number of nodes, just push expected votes up by 1. (Or round up after dividing your total nodes by 2.)

The only time that I see quorum disks as useful is when you have just a two-node cluster and want it to keep functioning as a cluster even when just one node is operating.

The above relies on having an equal number of votes from all machines, which might not be true if you have a mix of hardware/capabilities. Just be aware that the aim is to avoid a disjoint cluster and then do your calculations about the node combinations that you would be prepared to accept (e.g. loss of big machine means that two smaller machines must be present to take the load.)
Hoff
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

>Why would you want a quorum disk in a cluster with an even number of nodes if each node has the same number of votes?

Usually because you want the cluster to continue running with fewer hosts present than would otherwise be possible with the traditional configuration.

> Isn't the intention of the quorum disk to prevent a cluster being divided into 2 or more parts, each of which "thinks" it's a cluster in its own right?

That's the quorum scheme you're thinking of, which is what guards against cluster partitioning. The quorum disk is one component of that scheme.

The quorum disk is a cheap and surprisingly effective surrogate for a voting node.

If you're curious about this stuff, run the math for the quorum values and the configurations for the degraded-cluster cases; calculate the quorum for the various configurations.
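
One way to run that math is to enumerate every combination of voters and see which ones reach quorum. The Python sketch below does that under stated assumptions: quorum is taken as (EXPECTED_VOTES + 2) / 2 rounded down, the node labels A, B and C are placeholders, and the model ignores any run-time adjustments made by the connection manager.

```python
from itertools import combinations

QDSK = "QDSK"  # label for the quorum disk (illustrative name)

def quorum(expected_votes: int) -> int:
    # Assumed rule: (EXPECTED_VOTES + 2) / 2, rounded down.
    return (expected_votes + 2) // 2

def surviving_sets(votes: dict, expected_votes: int) -> list:
    """List every set of voters that reaches quorum.

    Sets containing only the quorum disk are skipped, since the disk
    contributes votes only when at least one node is up to see it.
    """
    q = quorum(expected_votes)
    names = sorted(votes)
    result = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if all(name == QDSK for name in combo):
                continue
            if sum(votes[name] for name in combo) >= q:
                result.append(combo)
    return result

# After the node failure, before any change (values from this thread).
current = {"A": 1, "B": 1, "C": 1, QDSK: 3}
# Proposed: one node raised to 2 votes.
proposed = {"A": 1, "B": 1, "C": 2, QDSK: 3}

for label, cfg in (("current", current), ("proposed", proposed)):
    sets = surviving_sets(cfg, expected_votes=7)
    print(label, "nodes-only survives:", ("A", "B", "C") in sets)
    for s in sets:
        print("  ", "+".join(s), "=", sum(cfg[n] for n in s))
```

In this model the only difference between the two configurations is the nodes-only case: with the extra vote, the three hosts reach quorum without the quorum disk.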
P Muralidhar Kini
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Hi John,

>> Isn't the intention of the quorum disk to prevent a cluster being divided into 2
>> or more parts, each of which "thinks" it's a cluster in its own right?
This is what I had in mind too: a quorum disk being useful to prevent cluster
partition, or in the case of a 2-node cluster configuration. But if you look at the
response from Hoff and also at Carleen's requirement, the overall
requirement is that even if only one node in the cluster is up, the whole cluster
should be up.

Carleen's proposed changes:
Quorum disk : 3
Node A : 1
Node B : 1
Node C : 2

Quorum = 4

In this case, even if two nodes go down, the votes of the 3rd node combined
with the quorum disk would meet the cluster quorum and hence the cluster
would be up.

Hence, the use of the quorum disk is to ensure that even if multiple nodes go down
(worst case: only one node is up), the cluster would still be up.
This would not be possible if there were no quorum disk.

Regards,
Murali
Let There Be Rock - AC/DC
Robert Gezelter
Honored Contributor

Re: Changing votes in a cluster without a cluster reboot

Carleen, John and Murali,

I am in agreement with Hoff.

From my perspective, there are two simultaneous questions:

- How does one prevent the cluster from partitioning into two parallel clusters?

- What is the minimum set of nodes needed to ensure the continuation of cluster operations at a minimal level?

The conventional answer of "A quorum disk iff (and often ONLY IF) there are precisely two nodes in the cluster" is adequate, but not complete. It is correct in that the two-node cluster requires a quorum disk to allow cluster operations to continue unimpeded when only a single node is operational.

However, the two-node case is not the entire story. A quorum disk is also required if cluster operations must continue unimpeded when the population of cluster members reduces to either one or two nodes due to failures.

The latter case requires careful review of the votes to ensure that partitioning cannot occur.
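
Such a review can be sketched as a brute-force check that no split of the voters leaves both sides at or above quorum. The Python below is a simplified model, not a description of how the connection manager works: it assumes quorum is (EXPECTED_VOTES + 2) / 2 rounded down, that the quorum disk can be claimed by only one side, and that each side needs at least one node. The names and the too-low EXPECTED_VOTES value are hypothetical.

```python
from itertools import combinations

def quorum(expected_votes: int) -> int:
    # Assumed rule: (EXPECTED_VOTES + 2) / 2, rounded down.
    return (expected_votes + 2) // 2

def can_partition(votes: dict, expected_votes: int, disk: str = "QDSK") -> bool:
    """True if some split gives both sides enough votes for quorum."""
    q = quorum(expected_votes)
    names = list(votes)
    total = sum(votes.values())
    for size in range(1, len(names)):
        for group in combinations(names, size):
            nodes_inside = [n for n in group if n != disk]
            nodes_outside = [n for n in names if n not in group and n != disk]
            # A side holding only the quorum disk is not a cluster.
            if not nodes_inside or not nodes_outside:
                continue
            side_a = sum(votes[n] for n in group)
            side_b = total - side_a
            if side_a >= q and side_b >= q:
                return True
    return False

votes = {"A": 1, "B": 1, "C": 2, "QDSK": 3}
print(can_partition(votes, expected_votes=7))  # quorum 4 out of 7 total votes
print(can_partition(votes, expected_votes=4))  # hypothetical too-low setting
```

The point of the second call is that partitioning becomes possible in this model once EXPECTED_VOTES is set well below the real vote total, because quorum then falls to half of the votes or less.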

- Bob Gezelter, http://www.rlgsc.com