Operating System - OpenVMS

Re: Quorum mismatch between cluster members

 
SOLVED
Robert Gezelter
Honored Contributor

Re: Quorum mismatch between cluster members

GSmith,

I also concur with Hoff and JohnG, particularly John's comments about AGEN$INCLUDE.

The safest way to ensure consistency is to include a single common file by reference. Then, all changes are a matter of changing the common file and running the build (e.g., AUTOGEN).

This is why programmers use include files, and system management is no different. Having two independent copies of something is a recipe for "evolution" (unintended differential changes).
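As a minimal sketch (the shared file name here is illustrative, not a convention from this thread): each node's node-specific MODPARAMS.DAT pulls in the one shared file, and AUTOGEN applies the result.

! In SYS$SPECIFIC:[SYSEXE]MODPARAMS.DAT on every node:
AGEN$INCLUDE_PARAMS SYS$COMMON:[SYSEXE]CLUSTER_COMMON.DAT

! In SYS$COMMON:[SYSEXE]CLUSTER_COMMON.DAT (the single shared copy):
VOTES = 1
EXPECTED_VOTES = 4

$ ! Then regenerate and write the parameters on each node:
$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS NOFEEDBACK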

- Bob Gezelter, http://www.rlgsc.com
GSmith
Occasional Advisor

Re: Quorum mismatch between cluster members

I appreciate all of the advice, but you seem to have missed the reason I posted this question.

>>>
Display of parameters from all of the nodes is identical and looks correct.
<<<

(SYSGEN's columns, with their header lines trimmed here, are: Current, Default, Min., Max., Unit.)

SYSMAN> do mc sysgen show votes
%SYSMAN-I-OUTPUT, command execution on node GHA3
VOTES                    1        1     0   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA2
VOTES                    1        1     0   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA4
VOTES                    1        1     0   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA1
VOTES                    1        1     0   127  Votes
SYSMAN> do mc sysgen show expected
%SYSMAN-I-OUTPUT, command execution on node GHA3
EXPECTED_VOTES           4        1     1   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA2
EXPECTED_VOTES           4        1     1   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA4
EXPECTED_VOTES           4        1     1   127  Votes
%SYSMAN-I-OUTPUT, command execution on node GHA1
EXPECTED_VOTES           4        1     1   127  Votes

Richard, thanks for pointing out the CL_ fields on the SHOW CLUSTER display.

CL_EXP = 1
CL_QUORUM = 3
CL_VOTES = 4
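
(For anyone following along, a sketch of how to bring those cluster-wide fields up inside the SHOW CLUSTER utility; the CL_ fields belong to the CLUSTER class.)

$ SHOW CLUSTER/CONTINUOUS
Command> ADD CL_EXPECTED_VOTES, CL_QUORUM, CL_VOTES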
GSmith
Occasional Advisor

Re: Quorum mismatch between cluster members

Correction to my previous post: CL_EXP = 4, not 1.
Richard Brodie_1
Honored Contributor

Re: Quorum mismatch between cluster members

According to 'help field expected_votes':

"It is possible for this field to display a number smaller than the EXPECTED_VOTES parameter setting if the REMOVE_NODE option was used to shut down a cluster member or the SET CLUSTER/EXPECTED_VOTES DCL command was used since this node was last rebooted."

Well, I never knew that.
GSmith
Occasional Advisor

Re: Quorum mismatch between cluster members

-GHA4 experienced a bugcheck in April.
-GHA1 had a controlled reboot in May.
-SHOW CLUSTER may show confusing values under certain circumstances, as noted by Richard. I'll be sure to use the CL_ fields in the future.
Hoff
Honored Contributor

Re: Quorum mismatch between cluster members

> I'll be sure to use the CL_ fields in the future.

You've found a cluster configuration error.

That's bad.

Go fix the values for your next reboot.

Your configuration can start processing with two votes and thus two nodes, and it should only start processing with three present.

Two two-node subclusters with shared storage would be Very Bad.

That sole node showing four expected votes is likely what prevents partitioning now, but if somebody manages to get all four nodes set to three expected votes, or if somebody has been using SET CLUSTER/EXPECTED_VOTES to lower the required values, then all bets are off.
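
To spell out the arithmetic (this is the standard cluster quorum calculation; the division is integer division):

quorum = (EXPECTED_VOTES + 2) / 2

EXPECTED_VOTES = 4:  quorum = (4 + 2) / 2 = 3
    A two-node, two-vote fragment stays below quorum and hangs. Good.
EXPECTED_VOTES = 3:  quorum = (3 + 2) / 2 = 2
    A two-node, two-vote fragment reaches quorum -- on BOTH sides. Partition.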
GSmith
Occasional Advisor

Re: Quorum mismatch between cluster members

Hoff,

Which four-vote node? I haven't the slightest idea what you are trying to tell me to fix.
Hoff
Honored Contributor

Re: Quorum mismatch between cluster members

I was looking at:

NODE   SOFTWARE     VOTES  EXPECTED_VOTES  QUORUM  STATUS
GHA1   VMS V7.3-2     1          4            3    MEMBER

which lists four votes as expected.

If two of your four nodes should boot without connecting to the other hosts (and thus without getting their calculated quorum values corrected), and if each of your nodes has EXPECTED_VOTES set to 3, then you can have two partitions within your cluster, and your shared data is corrupt.

Right now, that GHA1 box (which looks to have a correct setting for expected_votes) is the reason your cluster can't get into this partitioning case. That one setting prevents partitioning.

Quorum is the mechanism by which massive corruptions to your disks and other shared data are avoided. Failure to have a correct configuration can lead to data corruptions; your disks can end up completely corrupted long before you get to the login prompt.

I've seen a few of these partitioned configurations arise over the years, for instance when somebody rebooted off the wrong system root after a console battery failed. It's a mess.

Again, you do need to grok this stuff, not just look at the running CL_ settings. If the system parameters aren't set right, the corrections might not happen in cases where connectivity isn't fully available (for any of various reasons), and your data goes bye-bye.

Do you need to correct this Right Now? No. You have connectivity, and so long as SET CLUSTER/EXPECTED_VOTES isn't (mis)used, your quorum has been corrected to 3. Just get this fixed for your next reboot.

And consider getting that quorum disk going here, too, as that means you can survive the loss of more than one box in the cluster without a cluster quorum hang.
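
A sketch of what the quorum disk looks like in MODPARAMS.DAT (the device name is illustrative; use a disk every voting node can access directly), followed by an AUTOGEN pass as usual:

! MODPARAMS.DAT additions for a quorum disk (device name illustrative):
DISK_QUORUM = "$1$DGA100"   ! a disk directly visible to all voting nodes
QDSKVOTES = 1               ! votes contributed by the quorum disk
EXPECTED_VOTES = 5          ! 4 node votes + 1 quorum disk vote

With five expected votes, quorum is (5 + 2) / 2 = 3, so two surviving nodes plus the disk can keep running.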

Please read what I linked to earlier.
Jon Pinkley
Honored Contributor
Solution

Re: Quorum mismatch between cluster members

GSmith,

The output of "SYSMAN> do mc sysgen show" that you provided in your post from Jun 17, 2010 12:51:30 GMT looks convincing to me.

I disagree with Hoff; I don't see any evidence that the SYSGEN cluster parameters are misconfigured. The SHOW CLUSTER display does seem to be misleading, especially since you used SET CLUSTER/EXPECTED_VOTES=4. This reporting may have been fixed since 7.3-2; on 8.3 I can't reproduce the node-specific displayed values changing with SET CLUSTER/EXPECTED_VOTES (either down or back up) or SET CLUSTER/QUORUM. See more about this below.

Since neither VOTES nor EXPECTED_VOTES is a dynamic parameter, and since SYSGEN shows the ACTIVE parameters by default, we can conclude that all nodes were booted with the correct value of EXPECTED_VOTES: all four of your nodes reported EXPECTED_VOTES 4. As long as you didn't change the CURRENT parameters, you should be fine when you reboot.

The most likely cause of this condition (in my opinion) is that when GHA1 had its controlled reboot in May, the REMOVE_NODE shutdown option was used, and the remaining nodes adjusted quorum down. When GHA1 rebooted, it used EXPECTED_VOTES = 4 from its CURRENT parameter file, and its SHOW CLUSTER display is consistent with the values in effect at the time it joined the cluster.

The CL_EXP value will always ratchet up to be at least as high as the current total number of votes in the cluster; the only ways it ever goes down are a recompute-quorum event or the formation of a new cluster. What EXPECTED_VOTES protects against is the "cluster formed" situation: you don't want a network error that keeps the nodes from seeing each other to allow multiple subsets to form a cluster independently. If they do, and they have direct access to shared storage (shared SCSI, FC), that shared storage will get corrupted.
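For reference, the sequence I'm describing looks like this (a sketch; prompts abbreviated). The REMOVE_NODE answer to SHUTDOWN.COM is what tells the surviving nodes to adjust quorum down, and SET CLUSTER/EXPECTED_VOTES with no value asks the connection manager to recompute from the current membership:

$ @SYS$SYSTEM:SHUTDOWN
   ...
Shutdown options (comma-separated list): REMOVE_NODE

$ ! On a surviving node, to recompute quorum after a voter leaves:
$ SET CLUSTER/EXPECTED_VOTES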

You are running 7.3-2. I just tried SET CLUSTER/EXPECTED_VOTES on a test cluster running 8.3 (Alpha); there, EXPECTED_VOTES was 3 and the current votes were 2 (one for the single node and one for the quorum disk). On 8.3 the SHOW CLUSTER display reflects the changed values in the CLUSTER section, but not in the individual node sections. Perhaps there was a bug in 7.3-2 that has been fixed in 8.3? See the attachment for the results of my testing. Note that the cluster's transition time is updated when the quorum changes, but the node's is not.

Before you reboot the node this weekend, verify that EXPECTED_VOTES is set to 4 in the CURRENT parameter file on the node you will reboot. I have no reason to believe it has been changed, but you don't want a surprise.
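
A quick way to do that check (SYSGEN's SHOW displays the ACTIVE values unless you select the CURRENT set first, which is why this needs one SYSGEN session per node rather than SYSMAN> DO one command at a time):

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SHOW EXPECTED_VOTES
SYSGEN> SHOW VOTES
SYSGEN> EXIT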

Good luck,

Jon
it depends
Jon Pinkley
Honored Contributor

Re: Quorum mismatch between cluster members

Here is the attachment with the testing I did while trying to reproduce the value changing in the node-specific portion of the display.
it depends