Operating System - OpenVMS

Re: 2 node cluster + quorum node - network breaks, what happens?

 
MarkOfAus
Valued Contributor

John,

Thanks for your continuing help.

Mark,

>Since each of the proposed clusters A+C and B+C have the same number of votes, there has to be a tie break, and the victim must CLUEXIT. Thems the rules.

Ok, I understand that now. I was getting myself worried that there would be some partitioning into 2 clusters. Ack!

>>What expected_votes and vote values determine this?

>The only sensible voting scheme for this topology is each node 1 vote. EXPECTED_VOTES is therefore 3.

So, equal votes for A+C or B+C and then let the "system" decide which one bites the dust? Ok.

>You could do A=2, B=1, C=1, and therefore EXPECTED_VOTES=4, then if the A-B link breaks, B would always get kicked out because the A+C cluster has more votes. HOWEVER, with that distribution, if you lose A, the B+C cluster is not viable, AND you're exposed to the risk of a creeping doom failure at site A.

Forgive me, why is it not viable? Does it not have 2 votes (half of 4 expected)? Or are you saying because forcing the system to make B+C the cluster could cause issues if A is still alive and chatting to C?

Karl Rohwedder
Honored Contributor

>Forgive me, why is it not viable? Does it not have 2 votes (half of 4 expected)? Or are you saying because forcing the system to make B+C the cluster could cause issues if A is still alive and chatting to C?

Quorum is Expectedvotes/2+1, so 4 -> 3.
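
To make the arithmetic concrete, here is a minimal sketch in Python (not DCL, and not anything OpenVMS runs itself). It assumes quorum is computed as (EXPECTED_VOTES + 2) / 2 with the fraction dropped, which gives the same result as EXPECTED_VOTES/2 + 1 with integer division, and it assumes EXPECTED_VOTES is set to the sum of all votes, as in the schemes quoted above. The function and variable names are made up for illustration.

```python
def quorum(expected_votes: int) -> int:
    """Quorum as (EXPECTED_VOTES + 2) // 2, i.e. EXPECTED_VOTES // 2 + 1."""
    return (expected_votes + 2) // 2


def has_quorum(votes: dict, members: list) -> bool:
    """True if the listed members together hold at least quorum.

    Assumption: EXPECTED_VOTES equals the sum of all configured votes.
    """
    expected_votes = sum(votes.values())
    present = sum(votes[name] for name in members)
    return present >= quorum(expected_votes)


# The two schemes discussed so far (hypothetical node names A, B, C).
balanced = {"A": 1, "B": 1, "C": 1}   # EXPECTED_VOTES = 3, quorum = 2
weighted = {"A": 2, "B": 1, "C": 1}   # EXPECTED_VOTES = 4, quorum = 3

print(quorum(3), quorum(4))
print(has_quorum(balanced, ["B", "C"]))  # 2 votes against a quorum of 2
print(has_quorum(weighted, ["B", "C"]))  # 2 votes against a quorum of 3
print(has_quorum(weighted, ["A", "C"]))  # 3 votes against a quorum of 3
```

So with A=2, B=1, C=1, the B+C pair holds exactly half of the expected votes, and half is not enough.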

regards Kalle
MarkOfAus
Valued Contributor

labadie,
> You should have a look at Keith Parris page, at http://www.geocities.com/keithparris/

Thank you, it is a very good site for information. Some of it, alas, is a bit over my head.

Merci.
MarkOfAus
Valued Contributor

Karl,


>Quorum is Expectedvotes/2+1, so 4 -> 3.

Oops, forgive me. I forgot.

MarkOfAus
Valued Contributor

Jan,

>Note if you DO want an unbalanced config in which any 2 nodes can continue, then 3 + 2 + 2 with EXPECTED_VOTES=7 _IS_ a valid option.

Actually this is probably preferable, if indeed it can affect the decision of which node has to CLUEXIT. I would prefer A+C to stay up over B+C, but again, does this not lead to the "creeping doom" scenario?
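
For what it is worth, the 3 + 2 + 2 case can be worked through with a small Python sketch. This is only arithmetic on the rules stated in this thread, not the connection manager's actual algorithm: it lists every group of nodes in which each member still has a direct link to every other member, and keeps the groups that hold quorum. The node names, the `links` set and the helper names are hypothetical, and quorum is assumed to be (EXPECTED_VOTES + 2) / 2 with the fraction dropped.

```python
from itertools import combinations


def quorum(expected_votes: int) -> int:
    """Quorum as (EXPECTED_VOTES + 2) // 2."""
    return (expected_votes + 2) // 2


def candidate_clusters(votes: dict, links: set) -> list:
    """Return (group, votes) for every fully connected group holding quorum.

    Assumption: EXPECTED_VOTES equals the sum of all configured votes.
    """
    needed = quorum(sum(votes.values()))
    nodes = sorted(votes)
    found = []
    for size in range(len(nodes), 0, -1):
        for group in combinations(nodes, size):
            # Every pair in the group needs its own working direct link.
            connected = all(frozenset(pair) in links
                            for pair in combinations(group, 2))
            total = sum(votes[name] for name in group)
            if connected and total >= needed:
                found.append((group, total))
    return found


# The A-B link is broken; A-C and B-C still work.
links = {frozenset(("A", "C")), frozenset(("B", "C"))}

print(candidate_clusters({"A": 1, "B": 1, "C": 1}, links))
print(candidate_clusters({"A": 3, "B": 2, "C": 2}, links))
```

With 1 + 1 + 1 both A+C and B+C come out with 2 votes, which is the tie described earlier. With 3 + 2 + 2 both pairs still hold quorum (4), but A+C has 5 votes against 4 for B+C, and any pair of nodes can still carry on if the third is lost.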

Jan van den Ende
Honored Contributor

Mark,

actually, no; unless you set the AutoStart Action to BOOT.
If A-B connectivity breaks, then either A or B CLUEXITs.
If on the remaining cluster (say, A + C) something ugly comes to pass, then the cluster loses another node, and goes into a quorum hang.
Only to be released by human intervention (which HAD BETTER be sufficiently instructed).

Should AUTOSTART be set to BOOT, then (say) B will try to reboot. As long as A and C see one another, while B sees C and not A, B will not be allowed to join.
In the creeping doom situation, maybe C will eventually lose connectivity to A, in which case B + C is valid. And NOW you lose all that was done on A...
Then again, if the entire A site is destroyed, so is the data at A.
_IF_ in this scenario the data is on 3-member shadow sets, however, first the members at B are disconnected. Then A goes, as do the A shadow members. So, B joins. But the ACTIVE shadow set is still at C (albeit single member) and the DATA survives.
-- Incidentally, that is why we like to be able to have two members at each site (+ one to be split off for Backup purposes), i.e., allow at least 7 members in a set... as requested again during Bootcamp.

hth.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian P
Occasional Advisor

Forgive my ignorance, but surely you have protected yourself from this scenario by making A and B see each other via site C? And the same goes for the other two pairs of systems. No triangular setup like this should be able to be broken by the loss of a single link.

Ian.
Ian Miller.
Honored Contributor

IanP - no, each member of a cluster has to have a direct connection to each other member.
No routing.
____________________
Purely Personal Opinion
Hoff
Honored Contributor

It's true that SCS cannot currently be routed. But Ian's scheme will work just fine if you have a bridged LAN configuration, and each network link traverses a different path. And assuming latency doesn't nail you.

One of the options some sites use is (for instance) FC over IP, and (in parallel) SCS over bridged LANs.

Setting Votes and Expected_Votes is in the OpenVMS FAQ (available via <>), and a write-up on the topic is also available at <>. If you do not understand the parameters and the quorum scheme based on what you've read here already and what's in the FAQ, you really need to take the time to understand it. Or to ask questions on what you don't understand.

The Quorum scheme is THE basis for clustering. Creative or incorrect settings for the values here can seriously scrozzle your data.

Stephen Hoffman
HoffmanLabs LLC
Andy Bustamante
Honored Contributor

To amplify on Ian's remarks, a direct connect can be switched or bridged as long as latency doesn't become an issue.

You can also add an extra pair of network adaptors and use a crossover cable between nodes A and B to maintain cluster communications. Assuming you have both systems configured for matching speed/duplex or auto-negotiate, your cluster traffic will use the new path without further configuration. This would keep the primary and secondary systems up in the event the WAN connection drops.

"There are nine and sixty ways of constructing . . . and every single one of them is right."


Andy Bustamante
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net