Operating System - OpenVMS

Re: Quorum scheme comments required

 
Thomas Ritter
Respected Contributor

Quorum scheme comments required

A three node cluster has a quorum scheme of
Each node provides 1 vote.
Quorum disk provides 3 votes.
The quorum disk is presented via an EVA.
The motivation for this configuration is to be able to boot a single node.

I would have
Each node provides 1 vote. No quorum disk.
Need two nodes for a cluster. If a single node is required, then use a conversational boot to change expected votes to 1.
What do you think?
John Gillings
Honored Contributor

Re: Quorum scheme comments required

Hi Thomas,

Your quorum disk only needs NODES-1 votes to be able to boot a single node, so in your case 2 votes for the quorum disk.

EXPECTED_VOTES is therefore 5, quorum is 3, so a single node plus quorum disk can form a cluster. Similarly the 3 nodes can form a cluster without quorum disk. (I'd have to look at the numbers, but I think with an odd number of nodes, it works out the same with N or N-1, but with an even number of nodes you must use N-1 otherwise you hang if you lose the quorum disk).
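
For reference, the arithmetic behind those numbers can be sketched in a few lines of Python. This assumes the documented OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down. The function names are illustrative only and not part of any VMS tool.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from expected votes: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True if the votes currently present meet or exceed quorum."""
    return votes_present >= quorum(expected_votes)


# Three nodes with 1 vote each plus a 2-vote quorum disk.
NODE_VOTES = 1
QDISK_VOTES = 2
EXPECTED_VOTES = 3 * NODE_VOTES + QDISK_VOTES  # 5

print("quorum:", quorum(EXPECTED_VOTES))
print("one node + quorum disk:", has_quorum(NODE_VOTES + QDISK_VOTES, EXPECTED_VOTES))
print("three nodes, no quorum disk:", has_quorum(3 * NODE_VOTES, EXPECTED_VOTES))
print("one node alone:", has_quorum(NODE_VOTES, EXPECTED_VOTES))
```

Changing QDISK_VOTES lets you try other vote assignments before committing them to system parameters.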

On a SAN allocate the smallest possible device, make it your quorum disk. Don't put any data on it. Does it even need to be mounted on each node? If not, then don't.

The other option is a quorum node. I believe it should be possible to do it with a small Alpha (or even a VAX). In theory it doesn't even need to complete STARTUP.COM, it only needs to get as far as asserting votes. No disk sharing, no network, no logins, just join the cluster and contribute votes. Technically you don't even need license PAKs (not sure about the legal implications though). Unfortunately I've never been able to convince anyone to try it out, or get OpenVMS Engineering to develop a special "quorum only" package (probably only a day or two's work in DCL to develop! Not much more than finding a convenient place in STARTUP.COM to STOP).

> conversational boot change expected vote to 1.

NO! Very bad plan. That is how partitioned clusters and irreparable data corruption happen. If you care about your data, design and strictly enforce a quorum scheme that does exactly what you need without unnecessary exposure to risks.

(and to those who claim to have been playing with EXPECTED_VOTES "for years" without any problems, good luck to you! I know of people who've been doing similar things with chainsaws and circular saws sans blade guards, without accident. I also know a few with missing digits and/or limbs! Those that remove blade guards will eventually get cut!)
A crucible of informative mistakes
Thomas Ritter
Respected Contributor

Re: Quorum scheme comments required

Thanks, John. Are you saying that in a three node cluster you would use a quorum disk or a quorum node? Why not just three nodes?
Jon Pinkley
Honored Contributor

Re: Quorum scheme comments required

Each has advantages/disadvantages.

However, for the 3 node + QDisk I would use 1 vote for each node and 2 votes for the quorum disk. Expected votes 5, quorum 3. For booting a single node it makes no difference, as the way you present it is expected votes 6, quorum 4, and 1+3 still meets the minimum quorum.

Also note that with an EVA, you can create a 1GB disk for the QDisk and can use vraid1 to protect against failure of the quorum disk. You can't use HBVS on a quorum disk (because it couldn't be trusted as a tie breaker). VMS will refuse to use a DSA device as a quorum disk, so you can't ignore the rule.

Advantages of this: the cluster will operate with any single node. It will boot and will continue to work when two nodes leave.

Disadvantage: longer cluster transition times.

For one vote each with no quorum disk: expected votes 3, quorum 2.

Advantages: quicker transition times. Simpler without quorum disk.

Disadvantages: Can't boot a single node without a conversational boot. Once a second node joins the cluster, the cluster will hang when the number of nodes drops below 2. Availability Manager can fix this, but it requires manual intervention.
And there is the disadvantage that John brings up, although I am unaware of a way to avoid setting expected votes to 1 in some cases, for example when upgrading a shared system disk. You MUST be sure that the other nodes are DOWN (as in powered off) when you remove the blade guards to be safe.

Jon
it depends
Michael Moroney
Frequent Advisor

Re: Quorum scheme comments required

It's generally better to avoid quorum disks, if possible. Cluster transitions with quorum disks are slower.

If you get rid of the quorum disk, set VOTES=1 and EXPECTED_VOTES=3 on all nodes. You'll need 2 nodes to be up, or, like you said, you'll have to boot with EXPECTED_VOTES=1 if you really want to have the "cluster" up with one node.

If you keep the quorum disk:

Giving it 3 votes makes it a single point of failure. No disk, no cluster, at least without rebooting with a lower EXPECTED_VOTES.

If you give it 2 votes and each node has VOTES=1 and EXPECTED_VOTES=5, you can have the cluster available if you have either 1 node plus the quorum disk, or if you have all 3 nodes available without (or with) the quorum disk.
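
To make the two quorum disk cases concrete, here is a small Python sketch that lists which combinations of running nodes and quorum disk keep the cluster going. It assumes 1 vote per node and the usual rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down. The function names are made up for illustration.

```python
from itertools import product


def quorum(expected_votes: int) -> int:
    """Quorum derived from expected votes: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def surviving_states(nodes: int, qdisk_votes: int) -> list[tuple[int, bool]]:
    """Return (nodes_up, qdisk_available) pairs that still meet quorum.

    Assumes every node has VOTES=1 and EXPECTED_VOTES = nodes + qdisk_votes.
    """
    needed = quorum(nodes + qdisk_votes)
    states = []
    for nodes_up, disk in product(range(1, nodes + 1), (False, True)):
        votes = nodes_up + (qdisk_votes if disk else 0)
        if votes >= needed:
            states.append((nodes_up, disk))
    return states


for qdisk_votes in (2, 3):
    print(f"quorum disk votes = {qdisk_votes}, quorum = {quorum(3 + qdisk_votes)}")
    for nodes_up, disk in surviving_states(3, qdisk_votes):
        print(f"  {nodes_up} node(s) up, quorum disk {'present' if disk else 'absent'}")
```

With 3 votes on the disk, every surviving combination includes the quorum disk, which is the single point of failure described above.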
Jon Pinkley
Honored Contributor

Re: Quorum scheme comments required

RE:"Are you saying that in a three node cluster you would use a quorum disk or a quorum node ? Why not just three nodes ?"

If you have a quorum node with its own system disk, you could give it 2 votes and never shut it down. Then you have the expected votes 5, quorum 3 cluster. So you can have the quorum node plus any one additional node, or you can have all three "production" nodes and no quorum node (for when you have to reboot the quorum node).

Jon
it depends
John Gillings
Honored Contributor

Re: Quorum scheme comments required

>Are you saying that in a three node
>cluster you would use a quorum disk or a
>quorum node ?

Thomas,

If you want to be able to boot with a single node you need a Qdisk or Qnode. The node solution is potentially better, but technically that makes it a 4 node cluster, and you have to avoid the temptation to do ANYTHING on the quorum node. As soon as you have any activity on the quorum node, it becomes a critical component.

I think HP will also expect you to pay for licenses, even if you don't do any work on it. The votes will always be 1 for each "working" node and N-1 for quorum, in your case, N=3, so expected votes 5.

Ideally I'd like OpenVMS engineering to create a quorum device, just a little dongle to put on your network. Say a box with 4 ethernet connections and a micro OS. Give it a node name, cluster number & password, number of votes and expected votes. In theory I suppose it could maintain quorum for multiple clusters.

In the meantime, the cheapest and most practical option is a quorum disk. As Jon suggests, a tiny virtual unit with some kind of controller based redundancy. Again, avoid the temptation to store anything on it, and don't bother to back it up.
A crucible of informative mistakes
Michael Moroney
Frequent Advisor

Re: Quorum scheme comments required

You can increase the cluster transition speed when using a quorum disk by decreasing the value of QDSKINTERVAL at the expense of more I/Os to the disk. Default is 3 seconds.


Re: "Ideally I'd like OpenVMS engineering to create a quorum device, just a little dongle to put on your network. Say a box with 4 ethernet connections and a micro OS. Give it a node name, cluster number & password, number of votes and expected votes. In theory I suppose it could maintain quorum for multiple clusters."

I was actually thinking the same thing: What would it take to write a minimal program that would run on some specialized hardware that would speak the minimum SCS to contribute a vote or votes to a lan cluster. However, running it on anything other than some specialized dedicated bit of hardware opens you to a point of failure. My first thought was a little program on a Linux box.
John Gillings
Honored Contributor

Re: Quorum scheme comments required

Michael,

>What would it take to write a minimal
>program that would run on some specialized
>hardware that would speak the minimum SCS
>to contribute a vote or votes to a lan
>cluster.

Simplest (and cheapest?). Take a DS10 (2 network ports built in), add extra network adapters if required (there are 4 PCI slots), but other than that you should be able to get away with a minimal configuration. Install OpenVMS on a single internal drive. Configure as a cluster node with quorum votes, cluster group, cluster password. Turn off disk sharing, don't mount any disks, set interactive logins to 0. Don't install DECwindows or TCPIP or DECnet.

Edit SYCONFIG.COM (earliest user modifiable procedure?) add the line:

$ SET PROCESS/SUSPEND/ID=0

(or maybe just STOP/ID=0?)

By that time the node has joined the cluster, but we haven't configured any hardware. We just freeze or abort the startup procedure at that point and let the cluster interrupt services handle everything from there on. Note that it should NOT have sufficient votes to boot by itself (otherwise it would be a single point of failure).

Now just plug in the various network cables and boot your cluster. You'll need at least one other node to form a cluster.

I've never had the time or resources to test this, but I believe this box will behave as a quorum node, without the need for any licenses. Since you're effectively not running the operating system, I think HP would be hard pressed to argue that you need to pay for any licenses.

We once had a MicroVAX II hacked in a similar way to behave as a remote tape drive station. It had a number of TK drives, served to the cluster, but the node didn't boot far enough to look like a full cluster node.
A crucible of informative mistakes
Willem Grooters
Honored Contributor

Re: Quorum scheme comments required

DS10 - just for quorum? This might be feasible for a site with bigger (amounts of) iron, but I would love to have one ACTIVE in my environment ;). Environmentally, it's a bad idea - using a disk uses far less energy.
Willem Grooters
OpenVMS Developer & System Manager
Hakan Zanderau ( Anders
Trusted Contributor

Re: Quorum scheme comments required

John,

It's an interesting discussion.
I had a conversation with a colleague of yours (Anders Johansson) about the need for a VMSCLUSTER license.

I logged the boot sequence as part of the preparation for an OpenVMS SYSMAN class and found that the cluster server is started WAY before any licenses have been loaded.
When is the VMSCLUSTER license checked ?
We ( actually HE ) thought at login.
But no one is EVER going to login to this node ;-)

bootorder:

SYS$STARTUP:VMS$INITIAL-050_VMS.COM
SYS$STARTUP:VMS$INITIAL-050_LIB.COM
SYS$STARTUP:VMS$CONFIG-050_CSP.COM
SYS$STARTUP:VMS$DEVICE_STARTUP.COM
.
.
.

A place to put STOP/ID=0 could be at the end of VMS$CONFIG-050_CSP.COM

This one is funny, I will try this today ;-)

regards,

Hakan Zanderau
HA-solutions
Don't make it worse by guessing.........
Jan van den Ende
Honored Contributor

Re: Quorum scheme comments required

Thomas, (and the others).

Until now I have not yet seen this accentuated:

>>>
The motivation for this configuration is to able to boot a single node.
<<<

Is it really the idea to BOOT one single node, or is the desire to be able to continue operation with a single node up?

And if the latter, do you mean LOSE two nodes simultaneously, or a planned step down towards a single node?

Without a Quorum disk:

- booting single node.
As usual when starting a cluster, that can be done by booting into SYSBOOT and temporarily overriding the loaded params. (It is also possible to simultaneously boot enough nodes for quorum, in which case they will leave the "Waiting for quorum" hang together, but that is not the issue here.)

- step down to single node
Just shut down all but one node, specifying shutdown option "Remove_node".

What remains is to continue running when simultaneously --losing-- HALF or MORE of the running nodes.
That case WILL be covered by a quorum disk with enough votes to have a single node running, BUT WHAT are the chances of LOSING TWO nodes simultaneously (assuming a 3 node cluster; more nodes make it even LESS probable), COMPARED TO THE CHANCE OF LOSING THE QUORUM DISK?

Michael Moroney already made the point: QUORUM DISK should be avoided if at all possible. In my opinion QD is a very nice poor-man's trick to enable 2-node clusters, but for configs of 3 nodes upwards, they are an avoidable pain.
Configurations may well be possible that lead to different conclusions, but (until now...) it has always held up upon deeper analysis.

fwiw.

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.
Wim Van den Wyngaert
Honored Contributor

Re: Quorum scheme comments required

On the subject of the quorum station:
1) remember that it has its own SYSUAF
2) if you enable TCP/IP or DECnet you may need security monitoring (audit)
3) ours has DECnet running and a monitoring procedure (yes, in DCL). How else do you find errors (e.g. disk errors) if you don't look at it?
4) don't forget to patch it when needed (I don't, but I'm not in a normal environment)

But I would like to have John's little device instead.

Fwiw

Wim
Wim
Robert Gezelter
Honored Contributor

Re: Quorum scheme comments required

Thomas,

I will ignore the question of QUORUM, except to say that I am in general agreement with the use of a QUORUM disk to preserve the possibility of single node bootstrap in a production environment.

As far as modifying the bootstrap environment, I often eschew using a conversational bootstrap in emergencies in favor of pre-configured alternative boot roots. This allows testing of the exact scenario in advance, and prevents typographical errors from causing catastrophe when the inevitable happens during the mid-watch.

- Bob Gezelter, http://www.rlgsc.com
Michael Moroney
Frequent Advisor

Re: Quorum scheme comments required

Re: "Simplest (and cheapest?). Take a DS10 (2 network ports built in), add extra network adapters if required (there are 4 PCI slots), but other than that you should be able to get away with a minimal configuration."

I was thinking of (legally) avoiding the need for a license, since whether you'd need a cluster license to do this or not is somewhat of a legal gray area. Of course, a "widget" could easily require some sort of license if HP wants to do so, as they own the VMScluster code.

I was also thinking of something treated as a network device like a router box rather than a computer, and perhaps be small enough to power with a wallwart, if not smaller.

If anyone is actually thinking of using a DS10 this way, I'd be more than happy to make an even swap for a DS10L. Only takes 1U in a rack, and slightly more environmentally friendly. :-)
Ian Miller.
Honored Contributor

Re: Quorum scheme comments required

As part of the DTSS service offering HP have software on a PC which can supply a vote.
____________________
Purely Personal Opinion
Hoff
Honored Contributor

Re: Quorum scheme comments required

Ian likely intended to write: "As part of the DTCS service offering HP have software on a PC which can supply a vote."

DTSS is something completely different from DTCS. Timekeeping versus disaster-tolerant cluster services.

Quorum disks with hardware-based mirroring and with a fairly-frequent polling setting are often a reasonable solution, and one I/O per unit isn't usually a big issue. That holds given that there's no "polling server" around (and no associated support for this hypothetical "polling server" inside the current cluster connection manager), given that there's no SCS and DLM stack for an x86 box, and for cases where there is no third-wheel VAX or Alpha or Itanium box (with a base and a cluster license) that can be pressed into service as a third vote.

Some quorum-related reading material:

http://64.223.189.234/node/569
http://64.223.189.234/node/153
Thomas Ritter
Respected Contributor

Re: Quorum scheme comments required

Thanks for the good replies. Sometimes I feel I need to seek others' ideas about areas I understand reasonably well. At the back of my mind was the question: should one bother with a quorum disk in a three node cluster at all? Given that at this site the quorum disk is presented by an EVA, it will probably never fail, and if it did it would probably be the least of our problems. So I guess having a quorum disk to allow single node activities is a simple way to go. But at my other site the four node cluster has no quorum disk, and the "remove_node" option gives us the flexibility we need.
Jon Pinkley
Honored Contributor

Re: Quorum scheme comments required

In 4 node, equal votes, no QDisk, expected votes should be set to 4 on all nodes. Quorum will be 3.

So if you use remove node and shut down 3 of the four nodes, the expected votes and quorum on the remaining node (let's call it NODEA) will be 1.

Now what happens when you boot another node (let's call it NODEB) into the cluster? It has expected votes = 4. I haven't tried this, so I don't know what will happen, but I would expect at least the newly added node to hang, and possibly the addition of the new node will also cause a recalculation of the expected votes for the node that was in the cluster by itself.

The question is, does NODEB hang waiting for more members, and if it does, does it also cause NODEA to lose quorum? If so, that isn't great for availability.

Yes, you could boot NODEB conversationally and set expected votes to 2, but that requires special handling. And what happens if you have 2 nodes up and you then shut one down with automatic reboot and the REMOVE_NODE shutdown option? When it reboots, does it hang?

With a quorum disk/node these things just work, without having to do anything special.
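
The vote arithmetic behind the NODEA/NODEB question can be sketched in Python as follows. It only applies the rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down. Whether the connection manager really takes on NODEB's higher EXPECTED_VOTES at that point is exactly the untested part, so treat that step as an assumption.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from expected votes: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


# Quorum for the expected-votes values mentioned in this post.
for expected in (4, 2, 1):
    print(f"expected votes {expected} -> quorum {quorum(expected)}")

# NODEA and NODEB each hold 1 vote. If the cluster were to take on NODEB's
# EXPECTED_VOTES of 4 (an assumption, not verified behaviour), two votes
# would fall short of the resulting quorum.
votes_present = 2
shortfall = quorum(4) - votes_present
print(f"votes present {votes_present}, quorum {quorum(4)}, short by {shortfall}")
```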

Jon
it depends
Robert Gezelter
Honored Contributor

Re: Quorum scheme comments required

Thomas,

I agree with Jon as to the point about manual intervention. When I speak with clients, I try to always point out that it is not the "controlled" situations that are interesting, it is the "uncontrolled" situations which must be dealt with.

Manual intervention has a very high error rate, particularly during a seeming crisis.

- Bob Gezelter, http://www.rlgsc.com