Quorum mismatch between cluster members

 
GSmith
Occasional Advisor

Quorum mismatch between cluster members

Hi,
I have a situation I can't explain and would welcome your expert advice. I have a 4-node cluster running V7.3-2, with each node on its own system disk. There is no quorum disk. One of the members is reporting quorum as 3 and expected votes as 4. The other three are showing quorum as 2 and expected votes as 3.

View of Cluster from system ID 41162 node: GHA1 16-JUN-2010 10:40:26
+-----------------------------------------------------------+
| SYSTEMS | MEMBERS |
|-----------------------+-----------------------------------|
| NODE | SOFTWARE | VOTES | EXPECT | QUORUM | STATUS |
|--------+--------------+-------+--------+--------+---------|
| GHA1 | VMS V7.3-2 | 1 | 4 | 3 | MEMBER |
| GHA2 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
| GHA3 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
| GHA4 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
+-----------------------------------------------------------+

View of Cluster from system ID 41164 node: GHA2 16-JUN-2010 10:41:04
+-----------------------------------------------------------+
| SYSTEMS | MEMBERS |
|-----------------------+-----------------------------------|
| NODE | SOFTWARE | VOTES | EXPECT | QUORUM | STATUS |
|--------+--------------+-------+--------+--------+---------|
| GHA2 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
| GHA3 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
| GHA1 | VMS V7.3-2 | 1 | 4 | 3 | MEMBER |
| GHA4 | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |
+-----------------------------------------------------------+

Display of parameters from all of the nodes is identical and looks correct.
GHA1 $ mc sysgen show expect
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
EXPECTED_VOTES 4 1 1 127 Votes
GHA1 $ mc sysgen show votes
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
VOTES 1 1 0 127 Votes


I used the SET CLUSTER command to put them all back in sync, but even though the command appears to have been accepted, the SHOW CLUSTER display doesn't change.

GHA1 $ set cluster /expected
%SET-I-EXPTD_VOTES, new value of expected votes is 4, yields quorum of 3

GHA2 $ set cluster /expected=4
%SET-I-EXPTD_VOTES, new value of expected votes is 4, yields quorum of 3
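
The quorum values in the displays and messages above are consistent with the documented OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2, truncated to an integer. A minimal sketch of that arithmetic, in Python purely for illustration (the function name is not part of any VMS interface):

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from expected votes: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


# The two EXPECTED_VOTES values that appear in the SHOW CLUSTER displays.
for expected in (3, 4):
    print(f"EXPECTED_VOTES={expected} -> quorum={quorum(expected)}")
```

This reproduces the pairs in the SHOW CLUSTER output: expected votes of 4 give a quorum of 3, and expected votes of 3 give a quorum of 2.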

I am taking one of the nodes down this weekend and don't want any surprises.
21 REPLIES
Steven Schweda
Honored Contributor

Re: Quorum mismatch between cluster members

> [...] to put them all back in sync [...]

In sync with what? If the members have votes
of 4, 3, 3, 3, then how can expected votes be
only 4 (where one single guy (4) or any two
others (3+3) would have the expected votes)?

The whole idea of these votes and a quorum is
to prevent multiple subsets happily forming
multiple clusters. I don't see how a value
of 4 expected votes can do that here.

> [...] don't want any surprises.

What would be unsurprising? How do you want
this stuff to work when not everyone is up?
Richard Brodie_1
Honored Contributor

Re: Quorum mismatch between cluster members

The important values are those in the cluster overview class: CL_EXP, CL_QUORUM, CL_VOTES. Those are the ones that relate to what set cluster/expected does, and the actual quorum.

The ones shown against the individual nodes should just reflect their own SYSGEN parameters.
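
A rough sketch of how the cluster-wide quorum differs from the per-node figures, assuming the rule as described in the OpenVMS Cluster Systems documentation: the connection manager takes the largest of the current cluster quorum, the quorum derived from the highest EXPECTED_VOTES of any member, and the quorum derived from the total votes present. The function names and the data layout are illustrative only, and the per-node values are read off the SHOW CLUSTER displays in the question.

```python
def quorum(votes: int) -> int:
    """(votes + 2) / 2, truncated."""
    return (votes + 2) // 2


def cluster_quorum(members: dict, current_quorum: int = 0) -> int:
    """members maps node name -> (VOTES, EXPECTED_VOTES) for each member."""
    largest_expected = max(expected for _, expected in members.values())
    total_votes = sum(votes for votes, _ in members.values())
    # Assumption: quorum is only ever adjusted upward automatically.
    return max(current_quorum, quorum(largest_expected), quorum(total_votes))


members = {
    "GHA1": (1, 4),
    "GHA2": (1, 3),
    "GHA3": (1, 3),
    "GHA4": (1, 3),
}
print(cluster_quorum(members))
```

Under these assumptions the running cluster already has a quorum of 3, which would explain why SET CLUSTER /EXPECTED=4 reports "yields quorum of 3" and the per-node columns do not move.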
Jan van den Ende
Honored Contributor

Re: Quorum mismatch between cluster members

GSmith,

>>>
GHA1 $ mc sysgen show expect
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
EXPECTED_VOTES 4 1 1 127 Votes
<<<

Well, this at least agrees with
>>>
| GHA1 | VMS V7.3-2 | 1 | 4 | 3 | MEMBER |
<<<

And if you do
$ mc sysgen show expect
at the other 3 nodes, I expect to see
| GHAn | VMS V7.3-2 | 1 | 3 | 2 | MEMBER |

i.e.,
the other 3 nodes have EXPECTED_VOTES = 3
... and that implies the potential for a partitioned cluster!!!!

For your own good: please check, and correct ASAP!
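
To illustrate the partitioning risk, here is a small sketch that lists the two-way splits of four one-vote nodes in which both sides would reach quorum. It assumes each side computes quorum only from a boot-time EXPECTED_VOTES value (for example after a full cluster restart with the halves unable to see each other); the helper names are made up for this example.

```python
from itertools import combinations


def quorum(expected_votes: int) -> int:
    return (expected_votes + 2) // 2


def partitionable_splits(nodes: dict, expected_votes: int) -> list:
    """Return the two-way splits in which both sides hold at least quorum votes."""
    q = quorum(expected_votes)
    names = sorted(nodes)
    total = sum(nodes.values())
    splits = []
    for size in range(1, len(names)):
        for side_a in combinations(names, size):
            if names[0] not in side_a:  # count each split only once
                continue
            votes_a = sum(nodes[n] for n in side_a)
            votes_b = total - votes_a
            if votes_a >= q and votes_b >= q:
                side_b = tuple(n for n in names if n not in side_a)
                splits.append((side_a, side_b))
    return splits


nodes = {"GHA1": 1, "GHA2": 1, "GHA3": 1, "GHA4": 1}
print(partitionable_splits(nodes, expected_votes=3))
print(partitionable_splits(nodes, expected_votes=4))
```

With EXPECTED_VOTES = 3 every 2+2 split leaves both halves with quorum; with EXPECTED_VOTES = 4 no split does.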

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian Miller.
Honored Contributor

Re: Quorum mismatch between cluster members

Steven - each node has one vote, but the SHOW CLUSTER display shows expected votes of 3 on GHA2,3,4 and expected votes of 4 on GHA1.

____________________
Purely Personal Opinion
Steven Schweda
Honored Contributor

Re: Quorum mismatch between cluster members

> Steven - each node has one vote, but [...]

Ah. I was misled by comparing with my own
(strange) cluster, which has one voting node,
and a bunch with no votes. Thanks for the
info.

Never mind.
Hoff
Honored Contributor

Re: Quorum mismatch between cluster members

This looks to be the usual bogus cluster configuration, and confusion over the bootstrap settings (which provide the initial values, prior to successfully connecting to other boxes), the adjusted/current/live/running values (which Mr Brodie points to) and the meaning of the SET CLUSTER /EXPECTED command (which is typically used when downsizing a cluster under manual control).

Here? Just fix the votes and expected_votes settings in MODPARAMS.DAT (if you're not maintaining AUTOGEN, that is itself a problem), and move on. Do grok the quorum scheme, but don't try to fool the quorum scheme. (Upon managing to successfully fool the quorum scheme, Bad Things tend to quickly happen to the disk data.)

If you have a shared interconnect, you'll likely want to get a quorum disk configured, particularly if you have hardware RAID on that shared interconnect. Presuming shared hardware RAID and one-host survival, that'd typically have one vote for each host, three votes for the quorum disk, and expected_votes of (duh) seven. Presuming shared hardware RAID and three-host survival, one vote per host and one vote for the quorum disk, and expected_votes of (duh) five.
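
The arithmetic behind those two quorum-disk layouts can be checked with a short sketch. It assumes the four hosts from this thread and the usual quorum formula of (EXPECTED_VOTES + 2) / 2, truncated; the function names are illustrative only.

```python
def quorum(expected_votes: int) -> int:
    return (expected_votes + 2) // 2


def has_quorum(surviving_votes: int, expected_votes: int) -> bool:
    """True if the surviving votes meet or exceed quorum."""
    return surviving_votes >= quorum(expected_votes)


# (label, surviving votes, expected votes)
scenarios = [
    ("qdisk=3: one host plus quorum disk", 1 + 3, 7),
    ("qdisk=3: three hosts, no quorum disk", 3, 7),
    ("qdisk=1: three hosts, no quorum disk", 3, 5),
    ("qdisk=1: two hosts, no quorum disk", 2, 5),
]
for label, surviving, expected in scenarios:
    state = "keeps quorum" if has_quorum(surviving, expected) else "hangs"
    print(f"{label}: {surviving} of {quorum(expected)} needed -> {state}")
```

With a three-vote quorum disk, a single host that can still see the disk holds 4 of 7 votes and continues. With a one-vote quorum disk, three votes out of five are enough.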

The VMS system management User Interface (UI) here has some serious shortcomings; it leaves folks rather confused about how all this works and (though this area has most definitely seen improvements in this regard) the diagnostics and automatic guidance have been lacking.

http://labs.hoffmanlabs.com/node/153
http://labs.hoffmanlabs.com/node/569
http://labs.hoffmanlabs.com/node/105
John Gillings
Honored Contributor

Re: Quorum mismatch between cluster members

This is exactly the reason why those SYSGEN parameters which should be common across the cluster (like EXPECTED_VOTES, anything to do with quorum disks, cluster number, password, etc.) should be physically common, using the AGEN$INCLUDE_PARAMS mechanism.

Isolate common parameters into a CLUSTER_MODPARAMS.DAT file on your cluster common disk and include it into each node specific MODPARAMS.DAT. This should eliminate inconsistencies like the one reported here.

(Of course, the cluster software SHOULD detect and warn about such inconsistencies, but HP has resisted all efforts to fix the issues surrounding cluster misconfigurations, instead expecting each system manager to personally experience all the common pitfalls and reinvent the wheel fixing them.)
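
As a rough sketch of the kind of consistency check meant here, the following compares parameters that should be identical on every node and reports any that differ. It is not an existing VMS tool: the function name is hypothetical, and the sample values are what the SHOW CLUSTER display in the question suggests, which the thread does not confirm from SYSGEN on every node.

```python
def find_mismatches(params_by_node: dict, common_names: list) -> dict:
    """Return {parameter: {node: value}} for each common parameter whose values differ."""
    mismatches = {}
    for name in common_names:
        values = {node: params.get(name) for node, params in params_by_node.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches


# Assumed per-node settings, inferred from the SHOW CLUSTER display.
params_by_node = {
    "GHA1": {"VOTES": 1, "EXPECTED_VOTES": 4},
    "GHA2": {"VOTES": 1, "EXPECTED_VOTES": 3},
    "GHA3": {"VOTES": 1, "EXPECTED_VOTES": 3},
    "GHA4": {"VOTES": 1, "EXPECTED_VOTES": 3},
}
for name, values in find_mismatches(params_by_node, ["EXPECTED_VOTES"]).items():
    print(f"WARNING: {name} differs across nodes: {values}")
```

Keeping the shared values in one included file removes the need for a check like this, since there is then only one copy to get wrong.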
A crucible of informative mistakes
John McL
Trusted Contributor

Re: Quorum mismatch between cluster members

I agree with Hoff and John G.

I suspect that your problem came about when adding node GHA1 to the cluster. That machine has appropriate settings for a 4 node cluster and the other 3 have appropriate settings for a 3 node cluster.
P Muralidhar Kini
Honored Contributor

Re: Quorum mismatch between cluster members

Hi GSmith,

>> I am taking one of the nodes down this weekend and don't want any
>> surprises.
You should consider yourself extremely lucky if you are not in for surprises
well before that!

Each node has a VOTES value of 1. This is OK, but the expected votes setting is
inconsistent. Three of the nodes have expected votes of 3 while one of them has
expected votes of 4. This can lead to a partitioned cluster.

As already indicated by Richard, you need to include the values of CL_EXP,
CL_QUORUM and CL_VOTES in the "$SHOW CLUSTER" output to get a clear
picture of the cluster setup.

You need to review the VOTES and EXPECTED_VOTES settings of the various nodes in the
cluster. The links provided by Hoff should help you in this regard.

Regards,
Murali
Let There Be Rock - AC/DC