Operating System - OpenVMS

Re: Quorum scheme comments required

 
Hakan Zanderau
Trusted Contributor

Re: Quorum scheme comments required

John,

It's an interesting discussion.
I had a conversation with a colleague of yours (Anders Johansson) about the need for a VMSCLUSTER license.

I logged the boot sequence as part of the preparation for an OpenVMS SYSMAN class and found that the cluster server is started WAY before any license has been loaded.
When is the VMSCLUSTER license checked?
We (actually HE) thought at login.
But no one is EVER going to log in to this node ;-)

bootorder:

SYS$STARTUP:VMS$INITIAL-050_VMS.COM
SYS$STARTUP:VMS$INITIAL-050_LIB.COM
SYS$STARTUP:VMS$CONFIG-050_CSP.COM
SYS$STARTUP:VMS$DEVICE_STARTUP.COM
.
.
.

A place to put STOP/ID=0 could be at the end of VMS$CONFIG-050_CSP.COM

This one is funny, I will try this today ;-)

regards,

Hakan Zanderau
HA-solutions
Don't make it worse by guessing.........
Jan van den Ende
Honored Contributor

Re: Quorum scheme comments required

Thomas, (and the others).

Until now I have not yet seen this accentuated:

>>>
The motivation for this configuration is to able to boot a single node.
<<<

Is it really the idea to BOOT one single node, or is the desire to be able to continue operation with a single node up?

And if the latter, do you mean to LOSE two nodes simultaneously, or a planned step down towards a single node?

Without a Quorum disk:

- booting single node.
As usual when starting a cluster, that can be done by booting into SYSBOOT and temporarily overriding the loaded params. (It is also possible to boot enough nodes for quorum simultaneously, in which case they will leave the "Waiting for quorum" hang together, but that is not the issue here.)

- step down to single node
Just shut down all but one node, specifying the shutdown option "Remove_node".

What remains is to continue running when simultaneously losing HALF or MORE of the running nodes.
That case WILL be covered by a quorum disk with enough votes to keep a single node running, BUT WHAT are the chances of losing TWO nodes simultaneously (assuming a 3-node cluster; more nodes make it even LESS probable), COMPARED TO THE CHANCE OF LOSING THE QUORUM DISK?

Michael Moroney already made the point: a QUORUM DISK should be avoided if at all possible. In my opinion a QD is a very nice poor-man's trick to enable 2-node clusters, but for configs of 3 nodes upwards, it is an avoidable pain.
Configurations may well be possible that lead to different conclusions, but (until now...) it has always held up upon deeper analysis.
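The vote arithmetic behind the quorum-disk case above can be sketched as follows. This is only an illustration: it uses the standard OpenVMS rule quorum = (EXPECTED_VOTES + 2) / 2, truncated, and it assumes a layout that is not stated in the thread (3 nodes with 1 vote each, a quorum disk with 2 votes). The function names are hypothetical, and the model ignores details such as whether a node can actually reach the quorum disk.

```python
def quorum(expected_votes):
    """OpenVMS quorum rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def has_quorum(nodes_up, qdisk_up, total_nodes=3, qdisk_votes=2):
    """Simplified check: do the votes still present reach quorum?

    Assumption for illustration: every node holds exactly 1 vote and
    EXPECTED_VOTES is the sum of all node votes plus the quorum disk votes.
    """
    expected = total_nodes + qdisk_votes
    present = nodes_up + (qdisk_votes if qdisk_up else 0)
    return present >= quorum(expected)


if __name__ == "__main__":
    # With the quorum disk (EXPECTED_VOTES = 5, quorum = 3):
    print("one node + quorum disk :", has_quorum(1, True))
    print("three nodes, disk lost :", has_quorum(3, False))
    print("two nodes, disk lost   :", has_quorum(2, False))
    # Without a quorum disk (EXPECTED_VOTES = 3, quorum = 2):
    print("two nodes, no QD config:", has_quorum(2, False, qdisk_votes=0))
```

Under these assumptions a single node plus the quorum disk keeps running, but once the quorum disk itself is lost, all three nodes are needed; the same three nodes without a quorum disk would tolerate the loss of one node.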

fwiw.

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.
Wim Van den Wyngaert
Honored Contributor

Re: Quorum scheme comments required

On the subject of the quorum station :
1) remember that it has its own SYSUAF
2) if you enable TCP/IP or DECnet you may need security monitoring (audit)
3) ours has DECnet running and a monitoring procedure (yes, in DCL). How else do you find errors (e.g. disk errors) if you don't look at it?
4) don't forget to patch it when needed (I don't, but I'm not in a normal environment)

But I would like to have John's little device instead.

Fwiw

Wim
Wim
Robert Gezelter
Honored Contributor

Re: Quorum scheme comments required

Thomas,

I will ignore the question of QUORUM, except to say that I am in general agreement with the use of a QUORUM disk to preserve the possibility of single node bootstrap in a production environment.

As far as modifying the bootstrap environment, I often eschew using a conversational bootstrap in emergencies in favor of pre-configured alternative boot roots. This allows testing of the exact scenario in advance, and prevents typographical errors from causing catastrophe when the inevitable happens during the mid-watch.

- Bob Gezelter, http://www.rlgsc.com
Michael Moroney
Frequent Advisor

Re: Quorum scheme comments required

Re: "Simplest (and cheapest?). Take a DS10 (2 network ports built in), add extra network adapters if required (there are 4 PCI slots), but other than that you should be able to get away with a minimal configuration."

I was thinking of (legally) avoiding the need for a license, since whether you'd need a cluster license to do this or not is somewhat of a legal gray area. Of course, a "widget" could easily require some sort of license if HP wants to do so, as they own the VMScluster code.

I was also thinking of something treated as a network device, like a router box rather than a computer, and perhaps small enough to power with a wallwart, if not smaller.

If anyone is actually thinking of using a DS10 this way, I'd be more than happy to make an even swap for a DS10L. Only takes 1U in a rack, and slightly more environmentally friendly. :-)
Ian Miller.
Honored Contributor

Re: Quorum scheme comments required

As part of the DTSS service offering HP have software on a PC which can supply a vote.
____________________
Purely Personal Opinion
Hoff
Honored Contributor

Re: Quorum scheme comments required

Ian likely intended to write: "As part of the DTCS service offering HP have software on a PC which can supply a vote."

DTSS is something completely different from DTCS. Timekeeping versus disaster-tolerant cluster services.

Quorum disks with hardware-based mirroring and with a fairly-frequent polling setting are often a reasonable solution, and one I/O per unit isn't usually a big issue. This given that there's no "polling server" around (and no associated support for this hypothetical "polling server" inside the current cluster connection manager) and given there's no SCS and DLM stack for an x86 box, and for cases where there is no third-wheel VAX or Alpha or Itanium box (with a base and a cluster license) that can be pressed into service as a third vote.

Some quorum-related reading material:

http://64.223.189.234/node/569
http://64.223.189.234/node/153
Thomas Ritter
Respected Contributor

Re: Quorum scheme comments required

Thanks for the good replies. Sometimes I feel I need to seek others' ideas about areas I understand reasonably well. At the back of my mind was the question: should one bother with a quorum disk in a three-node cluster at all? Given that at this site the quorum disk is presented by an EVA, it will probably never fail, and if it did it would probably be the least of our problems. So I guess having a quorum disk to allow single-node activities is a simple way to go. But at my other site the four-node cluster has no quorum disk, and the "remove_node" option gives us the flexibility we need.
Jon Pinkley
Honored Contributor

Re: Quorum scheme comments required

In 4 node, equal votes, no QDisk, expected votes should be set to 4 on all nodes. Quorum will be 3.

So if you use remove node and shut down 3 of the four nodes, the expected votes and quorum on the remaining node (let's call it NODEA) will be 1.
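The numbers above follow from the OpenVMS rule quorum = (EXPECTED_VOTES + 2) / 2, truncated. A minimal sketch of the step-down, assuming 1 vote per node and assuming that each REMOVE_NODE shutdown lowers the cluster's expected votes by the departing node's vote (the function names are hypothetical, for illustration only):

```python
def quorum(expected_votes):
    """OpenVMS quorum rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def step_down(nodes, votes_per_node=1):
    """Return (nodes_left, expected_votes, quorum) for each stage of a
    planned step-down, modelling every shutdown as using REMOVE_NODE."""
    stages = []
    for nodes_left in range(nodes, 0, -1):
        expected = nodes_left * votes_per_node
        stages.append((nodes_left, expected, quorum(expected)))
    return stages


if __name__ == "__main__":
    for nodes_left, expected, q in step_down(4):
        print(f"nodes={nodes_left} expected_votes={expected} quorum={q}")
```

This only models the planned step-down. It says nothing about what happens when a node that still has EXPECTED_VOTES = 4 in its parameters boots back into the one-node cluster.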

Now what happens when you boot another node (let's call it NODEB) into the cluster? It has expected votes = 4. I haven't tried this, so I don't know what will happen, but I would expect at least the newly added node to hang, and possibly the addition of the new node will also cause a recalculation of the expected votes for the node that was in the cluster by itself.

The question is, does NODEB hang waiting for more members, and if it does, does it also cause NODEA to lose quorum? If so, that isn't great for availability.

Yes, you could boot NODEB conversationally and set expected votes to 2, but that requires special handling. And if you have 2 nodes up and you then shut one down with automatic reboot and the REMOVE_NODE shutdown option, what happens when it reboots? Does it hang?

With a quorum disk/node these things just work, without having to do anything special.

Jon
it depends
Robert Gezelter
Honored Contributor

Re: Quorum scheme comments required

Thomas,

I agree with Jon as to the point about manual intervention. When I speak with clients, I try to always point out that it is not the "controlled" situations that are interesting, it is the "uncontrolled" situations which must be dealt with.

Manual intervention has a very high error rate, particularly during a seeming crisis.

- Bob Gezelter, http://www.rlgsc.com