Operating System - OpenVMS

Re: 2 node cluster + quorum node - network breaks, what happens?

 
MarkOfAus
Valued Contributor

2 node cluster + quorum node - network breaks, what happens?

Greetings all,

There are 2 nodes (A & B), effectively primary and secondary, in isolated locations. A third node (C), the quorum node, is located in another location altogether.

Suppose the connection between A and B drops, but the connection to node C remains for both node A and B, what happens?

How do you prevent partitioning in this instance?
MarkOfAus
Valued Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

Actually, I should add that the connection is gigabit ethernet.

Thanks,
Mark.
Martin Hughes
Regular Advisor

Re: 2 node cluster + quorum node - network breaks, what happens?

Either NODEA or NODEB will crash with a CLUEXIT. Assuming that they have the same number of votes then I think the one with the higher SCSSYSTEMID stays up.
For the fashion of Minas Tirith was such that it was built on seven levels, each delved into a hill, and about each was set a wall, and in each wall was a gate. (J.R.R. Tolkien). Quote stolen from VAX/VMS IDSM 5.2
John Gillings
Honored Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

Martin is correct, one of the nodes A or B will be forced out and will CLUEXIT. However, I don't think you can reliably determine in advance which one will be selected for all possible cases. All you can say for certain is that node C (which still has complete connectivity) will survive. Your software will need to be able to handle either node being lost.

You cannot, and should not, decide in advance that a particular node will be chosen as the survivor of a comms failure. See Keith Parris's "creeping doom" scenario to understand why.
A crucible of informative mistakes
MarkOfAus
Valued Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

More questions then.

As these two nodes are shadowed (A & B), and C is only a quorum, how does VMS determine that A or B should shut down if the network/link goes down between A & B?

I should have added, my apologies, that users can connect to either A or B (perhaps obvious, perhaps not), and that their connections are NOT affected by the network/link failure.

What expected_votes and vote values determine this?

Is it possible, with a failure like this, that data could be lost in the period it takes to transition to CLUEXIT? That is, with people connected to A and people connected to B. Or is this question too vague? :-)

Thanks, Mark.
John Gillings
Honored Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

Mark,

>VMS determine that A or B should shut down
>if the network/link goes down between A & B?

Cluster rules require direct paths from each node to each other node. If a path is lost, A and B will each realise that they can't see the other. Both nodes can still see node C, and since each of the proposed clusters A+C and B+C has the same number of votes, there has to be a tie break, and the victim must CLUEXIT. Them's the rules.
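To make the "direct paths from each node to each other node" rule concrete, here is a minimal Python sketch that lists the largest sets of nodes in which every pair still has a working link. It is only an illustration of the connectivity rule described above, not a model of how the connection manager actually works; the function name and the link list are made up for the example.

```python
from itertools import combinations

def fully_connected_subsets(nodes, links):
    """Return the largest node subsets in which every pair has a direct link.

    Hypothetical helper for illustration only.
    """
    link_set = {frozenset(link) for link in links}
    for size in range(len(nodes), 0, -1):
        found = [
            set(combo)
            for combo in combinations(sorted(nodes), size)
            if all(frozenset(pair) in link_set for pair in combinations(combo, 2))
        ]
        if found:
            return found
    return []

nodes = {"A", "B", "C"}
links = [("A", "C"), ("B", "C")]  # the A-B link is down, both links to C are up

candidates = fully_connected_subsets(nodes, links)
print(candidates)  # the two proposed clusters: A+C and B+C
```

With all three links up the only candidate is the full cluster A+B+C; with A-B down there are two equally sized candidates, which is why a tie break is needed.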

>What expected_votes and vote values determine this?

The only sensible voting scheme for this topology is each node 1 vote. EXPECTED_VOTES is therefore 3.

You could do A=2, B=1, C=1, and therefore EXPECTED_VOTES=4, then if the A-B link breaks, B would always get kicked out because the A+C cluster has more votes. HOWEVER, with that distribution, if you lose A, the B+C cluster is not viable, AND you're exposed to the risk of a creeping doom failure at site A.
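The arithmetic behind these two schemes can be checked with a short Python sketch. It uses the OpenVMS quorum formula, quorum = (EXPECTED_VOTES + 2) / 2 with the fraction dropped, and assumes EXPECTED_VOTES is set to the sum of all votes, as in the examples above. The function names are hypothetical, and the sketch says nothing about which side wins a tie.

```python
def quorum(expected_votes):
    """Quorum per the OpenVMS formula: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

def has_quorum(votes, members):
    """True if the given members together hold at least quorum votes.

    Assumes EXPECTED_VOTES equals the sum of all configured votes.
    """
    expected_votes = sum(votes.values())
    return sum(votes[node] for node in members) >= quorum(expected_votes)

balanced = {"A": 1, "B": 1, "C": 1}    # EXPECTED_VOTES = 3, quorum = 2
unbalanced = {"A": 2, "B": 1, "C": 1}  # EXPECTED_VOTES = 4, quorum = 3

for name, votes in (("1/1/1", balanced), ("2/1/1", unbalanced)):
    for members in ({"A", "C"}, {"B", "C"}, {"A", "B"}):
        print(name, sorted(members), has_quorum(votes, members))
```

With 1/1/1, any two nodes hold 2 votes and meet the quorum of 2. With 2/1/1 the quorum is 3, so A+C and A+B are viable but B+C (2 votes) is not.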

>data could be lost

Yes, incomplete transactions in flight on the "losing" node may be lost, BUT the cluster state transition logic will ensure you won't have any data corruption.
A crucible of informative mistakes
labadie_1
Honored Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

You should have a look at Keith Parris's page, at
http://www.geocities.com/keithparris/

particularly
Understanding Vaxcluster state transitions
Disk partitioning on Openvms
VmsCluster State Transitions in action
Using OpenvmsClusters for disaster tolerance

There is a lot of excellent material, I am sure I have forgotten some other documents.

Have a good reading.
labadie_1
Honored Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

At
http://www2.openvms.org/kparris/

see "Openvms Connection Manager and the quorum scheme"
which has an example of broken Cluster.
Jan van den Ende
Honored Contributor
Solution

Re: 2 node cluster + quorum node - network breaks, what happens?

John wrote
>>>
The only sensible voting scheme for this topology is each node 1 vote. EXPECTED_VOTES is therefore 3.
<<<

Note that if you DO want an unbalanced config in which any 2 nodes can continue, then
3 + 2 + 2 with EXPECTED_VOTES=7
_IS_ a valid option.
I do not know, however, if this would influence the decision of WHICH node should CLUEXIT.
So, if the choice of the remaining cluster has either 5 or 4 votes, is the 5-vote config favored?
I am quite curious myself!
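As a quick check of the 3 + 2 + 2 figures, the following Python sketch applies the OpenVMS quorum formula, (EXPECTED_VOTES + 2) / 2 truncated, to every pair of nodes. It only confirms that any two nodes retain quorum; it does not answer whether the 5-vote pairing is favored over the 4-vote one. The variable names are illustrative.

```python
from itertools import combinations

def quorum(expected_votes):
    """Quorum per the OpenVMS formula: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

votes = {"A": 3, "B": 2, "C": 2}
expected_votes = sum(votes.values())  # 7, matching EXPECTED_VOTES=7 above
needed = quorum(expected_votes)       # (7 + 2) // 2 = 4

# Total votes held by each possible two-node cluster.
pair_votes = {
    pair: votes[pair[0]] + votes[pair[1]]
    for pair in combinations(sorted(votes), 2)
}

for pair, total in pair_votes.items():
    print(pair, total, "quorum" if total >= needed else "no quorum")
```

Every pair reaches the required 4 votes (A+B and A+C hold 5, B+C holds 4), while no single node does, since the largest single vote count is 3.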

fwiw

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
MarkOfAus
Valued Contributor

Re: 2 node cluster + quorum node - network breaks, what happens?

labadie,
>At http://www2.openvms.org/kparris/
>
>see "Openvms Connection Manager and the quorum scheme"
>which has an example of broken Cluster.

Yes, I read this, and as he states, "systems in the minority voluntarily suspend processing while systems in the majority can continue to process transactions".

The issue is that both A+C and B+C are voting equals.