<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Quorum Disk Failure in Operating System - OpenVMS</title>
    <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184440#M61896</link>
    <description>Discussion thread: what happens in a two-node OpenVMS cluster when the quorum disk fails, temporarily disappears, or is restored, and how to choose VOTES/QDSKVOTES to avoid a quorum hang.</description>
    <pubDate>Sat, 07 Feb 2004 08:36:42 GMT</pubDate>
    <dc:creator>Jan van den Ende</dc:creator>
    <dc:date>2004-02-07T08:36:42Z</dc:date>
    <item>
      <title>Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184435#M61891</link>
      <description>We have a two node cluster with a quorum disk and are familiar with what happens when a node is shut down or fails - the other node goes into a cluster reconfigure along with issuing Operator msgs.&lt;BR /&gt;&lt;BR /&gt;What happens if the Quorum disk fails?&lt;BR /&gt;&lt;BR /&gt;Is the scenario the same on the hosts - cluster reconfigure with msgs referencing the loss of the quorum disk or does something different happen?&lt;BR /&gt;&lt;BR /&gt;What happens when the failed Quorum disk is made available again?&lt;BR /&gt;&lt;BR /&gt;Do the hosts automatically recognize the appearance of the quorum disk and use it (with appropriate msgs)?&lt;BR /&gt;&lt;BR /&gt;In other words, how does the loss/restore of a quorum disk compare to loss/restore of a host node?&lt;BR /&gt;&lt;BR /&gt;One more question (for extra points!).  What happens when the quorum disk "temporarily" disappears?  We have our quorum disk on an HP SW SAN.  If we make a Zoning or Presentation change that affects the quorum disk, besides VMS going through a Mount Verify, what does the host recovery look like?&lt;BR /&gt;&lt;BR /&gt;Thanks much</description>
      <pubDate>Thu, 05 Feb 2004 13:16:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184435#M61891</guid>
      <dc:creator>Jack Trachtman</dc:creator>
      <dc:date>2004-02-05T13:16:46Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184436#M61892</link>
      <description>The quorum disk watcher sees all! :-)&lt;BR /&gt;(in QDISKINTERVAL seconds)&lt;BR /&gt;You will see messages about the quorum disk being unavailable, and as long as there are enough votes the cluster will continue. When the qdsk returns, it will be recognised as the quorum disk by the presence of QUORUM.DAT and its votes (QDSKVOTES) counted.&lt;BR /&gt;&lt;BR /&gt;I assume both systems are directly connected to the quorum disk.</description>
      <pubDate>Thu, 05 Feb 2004 13:25:16 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184436#M61892</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2004-02-05T13:25:16Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184437#M61893</link>
      <description>Well, except for its obvious passiveness, the quorum disk should be considered as 'just' another node, although, due to the mentioned passiveness, much slower to respond.&lt;BR /&gt;It is indirect. The first node to reconnect checks the quorum disk for recent 'stamps' by the other node(s). They are not there, so it simply leaves its own imprint. The second one finds the stamp (and it better be of a known node). It also leaves its trace, and the next time #1 comes along it can conclude that the quorum disk is a valid member again.&lt;BR /&gt;&lt;BR /&gt;Even IF (unwanted situation) the departure of the quorum disk leads to a loss of quorum (the cluster 'hangs'), this mechanism is above the hang, and if the return of the qdsk suffices to regain quorum, that WILL be recognised, and the hang will be over.&lt;BR /&gt;&lt;BR /&gt;jan</description>
      <pubDate>Fri, 06 Feb 2004 14:15:05 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184437#M61893</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2004-02-06T14:15:05Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184438#M61894</link>
      <description>In a 2 node cluster, one can avoid a hang if the quorum disk fails by setting Votes=2 on each node, and qdskvotes=1 so expected_votes=5. This way, if either node goes down, the requisite minimum of 3 votes is held and if the quorum disk goes bye-bye, 4 votes are held.&lt;BR /&gt;Only in the event of a double failure does the surviving node hang.&lt;BR /&gt;Maybe that will help someone. &lt;BR /&gt;john</description>
      <pubDate>Fri, 06 Feb 2004 16:33:27 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184438#M61894</guid>
      <dc:creator>John Eerenberg</dc:creator>
      <dc:date>2004-02-06T16:33:27Z</dc:date>
    </item>
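    <!-- Editor's note (added in editing, not part of the thread): a minimal
         Python sketch of the vote arithmetic John describes above, assuming
         the standard OpenVMS rule that quorum = (EXPECTED_VOTES + 2) // 2
         with integer division:

```python
# Sketch of the 2-2-1 voting scheme described above. Editor's assumption:
# OpenVMS derives the quorum value as (EXPECTED_VOTES + 2) // 2.

def quorum(expected_votes: int) -> int:
    return (expected_votes + 2) // 2

def survives(votes_present: int, expected_votes: int) -> bool:
    # The cluster keeps running while the votes it can still count
    # meet or exceed the quorum value.
    return votes_present >= quorum(expected_votes)

# Two nodes with VOTES=2 each, plus QDSKVOTES=1: EXPECTED_VOTES = 5, quorum = 3.
expected = 2 + 2 + 1
assert quorum(expected) == 3
assert survives(5 - 2, expected)          # one node down: 3 votes remain
assert survives(5 - 1, expected)          # quorum disk down: 4 votes remain
assert not survives(5 - 2 - 1, expected)  # double failure: 2 < 3, node hangs
```

         With 1-1-1 (quorum 2), any single failure also survives, which is
         the point made in the next reply; the schemes only diverge under a
         double failure. -->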
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184439#M61895</link>
      <description>John,&lt;BR /&gt;&lt;BR /&gt;works fine, but not even needed.&lt;BR /&gt;@ nodes each 1 vote + qdsk 1 vote = 3 votes expected. Any single voter gone leaves 2 votes +&amp;gt; quorum maintained.&lt;BR /&gt;&lt;BR /&gt;The above-mentioned temporary hang (and resume) would occur at one node out (e.g., maintenance) and THEN having your SAN disconnecting and reconnecting the qdsk.&lt;BR /&gt;&lt;BR /&gt;hth&lt;BR /&gt;&lt;BR /&gt;Jan</description>
      <pubDate>Sat, 07 Feb 2004 05:14:28 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184439#M61895</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2004-02-07T05:14:28Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184440#M61896</link>
      <description>Sorry, two irritating typos:&lt;BR /&gt;&lt;BR /&gt;"@ nodes"  should read  "2 nodes" &lt;BR /&gt;" +&amp;gt; "        "    "     " =&amp;gt; "&lt;BR /&gt;&lt;BR /&gt;... I sometimes (have to) work on systems with different keyboard layouts.  It should have been forbidden, but then, who should be allowed to declare "THE" correct layout?</description>
      <pubDate>Sat, 07 Feb 2004 08:36:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184440#M61896</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2004-02-07T08:36:42Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184441#M61897</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I agree with Jan. Why set votes=2 for cluster members? It works fine for me with votes=1.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Lokesh</description>
      <pubDate>Mon, 09 Feb 2004 06:29:29 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184441#M61897</guid>
      <dc:creator>Lokesh_2</dc:creator>
      <dc:date>2004-02-09T06:29:29Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184442#M61898</link>
      <description>If you give 2 votes to each cluster member and 1 to the quorum disk, you have the advantage that&lt;BR /&gt;IF&lt;BR /&gt;1) node 1 is stopped with REMOVE_NODE&lt;BR /&gt;2) the quorum disk gets lost after 1) has completed&lt;BR /&gt;THEN&lt;BR /&gt;your cluster is still alive (because a minority of 1 vote left the cluster with total votes equal to 3, so 2 votes left).&lt;BR /&gt;&lt;BR /&gt;If you have 1-1-1, the cluster would hang until the disk is replaced or the second node is rebooted.</description>
      <pubDate>Mon, 09 Feb 2004 07:01:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184442#M61898</guid>
      <dc:creator>Wim Van den Wyngaert</dc:creator>
      <dc:date>2004-02-09T07:01:15Z</dc:date>
    </item>
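    <!-- Editor's note (added in editing, not part of the thread): a minimal
         Python sketch of Wim's scenario above, assuming a clean SHUTDOWN with
         REMOVE_NODE subtracts the departing node's votes before quorum is
         recalculated, and that quorum = (EXPECTED_VOTES + 2) // 2:

```python
# Editor's sketch: one node leaves cleanly via REMOVE_NODE (quorum is
# recalculated for the survivors), then the quorum disk is lost.

def quorum(expected_votes: int) -> int:
    # Assumed OpenVMS rule: quorum = (EXPECTED_VOTES + 2) // 2.
    return (expected_votes + 2) // 2

def hangs_after(all_votes, removed_votes, qdskvotes):
    # Votes still counted after REMOVE_NODE, then after losing the qdisk.
    remaining = sum(all_votes) - removed_votes
    return (remaining - qdskvotes) < quorum(remaining)

assert not hangs_after([2, 2, 1], 2, 1)  # 2-2-1: 2 votes vs quorum 2, alive
assert hangs_after([1, 1, 1], 1, 1)      # 1-1-1: 1 vote vs quorum 2, hang
```
    -->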
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184443#M61899</link>
      <description>Touché, Wim!&lt;BR /&gt;&lt;BR /&gt;Indeed, that IS the reason to use 2-2-1.&lt;BR /&gt;&lt;BR /&gt;Then the issue of the SAN disconnecting/reconnecting disappears nearly completely: only if one node left WITHOUT adjustment (either by crash or by the operator forgetting "remove_node"), and THEN the SAN connection disappears BEFORE a SET CLUSTER/EXPECTED, only THEN will the hang still occur. Should be very rare.&lt;BR /&gt;Jan</description>
      <pubDate>Mon, 09 Feb 2004 08:10:45 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184443#M61899</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2004-02-09T08:10:45Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184444#M61900</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;Thanks for explaining the advantage of 2-2-1. I will note it down.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Lokesh</description>
      <pubDate>Mon, 09 Feb 2004 09:25:06 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184444#M61900</guid>
      <dc:creator>Lokesh_2</dc:creator>
      <dc:date>2004-02-09T09:25:06Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184445#M61901</link>
      <description>Just some icing on the cake . . .&lt;BR /&gt;A parallel to the REMOVE_NODE option is when one has a system crash.&lt;BR /&gt;I set up the 2-2-1 votes so that if Node A crashes, then I can go to Node B's console and adjust quorum:&lt;BR /&gt;^P&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt; dep sirr c&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt; cont&lt;BR /&gt;IPC&amp;gt; q&lt;BR /&gt;IPC&amp;gt; ^Z&lt;BR /&gt;&lt;BR /&gt;Now I can lose the quorum disk too, and the surviving node can stay up as a standalone system.&lt;BR /&gt;&lt;BR /&gt;BE very CAREFUL when using the above console commands. If they are entered incorrectly or you make a typo, the surviving node can crash as well. Test it before needing it.&lt;BR /&gt;&lt;BR /&gt;john</description>
      <pubDate>Mon, 09 Feb 2004 13:54:07 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184445#M61901</guid>
      <dc:creator>John Eerenberg</dc:creator>
      <dc:date>2004-02-09T13:54:07Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184446#M61902</link>
      <description>John,&lt;BR /&gt;&lt;BR /&gt;just THAT would be your emergency escape if you get to a HANG, e.g. 1-1-1 and losing 2.&lt;BR /&gt;In the 2-2-1 scheme, if one node crashes, you DON'T have to IPC the other node. If your cluster is NOT YET hung, but you fear the crashed node might be down for whatever you define as 'a prolonged time', then you get to exactly the same result by SET CLUSTER/EXPECTED from any sufficiently privileged (CMKRNL, SYSNAM, &amp;amp; SYSLCK) process.&lt;BR /&gt;Alternatively, AMDS or Availability Manager, if installed and configured correctly, will do the same for you, and in a much more controlled way.&lt;BR /&gt;&lt;BR /&gt;Then again, I ALWAYS carry a note with the IPC sequence in my wallet. In 20 years I needed it twice, and then carrying a small note is invaluable!&lt;BR /&gt;&lt;BR /&gt;Jan</description>
      <pubDate>Mon, 09 Feb 2004 14:38:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184446#M61902</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2004-02-09T14:38:14Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184447#M61903</link>
      <description>Agreed!&lt;BR /&gt;I guess I still have an old-school mentality, since I still remember having to hit bit 28 to boot TOPS10 on a KI10 (are there any KI10's left on the planet??? or TOPS10 for that matter?)&lt;BR /&gt;&lt;BR /&gt;&amp;gt; Then again, I ALWAYS carry a note with the IPC sequence in my wallet. In 20 years I needed it twice, and then carrying a small note is invaluable!&lt;BR /&gt;&lt;BR /&gt;Once was all I really needed, so you have me beat. :-)&lt;BR /&gt;Kept it on paper as well until I recently bought an iPaq 1945; something about paper was more reassuring though . . .&lt;BR /&gt;</description>
      <pubDate>Tue, 10 Feb 2004 10:25:32 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184447#M61903</guid>
      <dc:creator>John Eerenberg</dc:creator>
      <dc:date>2004-02-10T10:25:32Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184448#M61904</link>
      <description>Two questions:&lt;BR /&gt;&lt;BR /&gt;1) This is probably obvious, but what&lt;BR /&gt;does IPC stand for in the previous msgs&lt;BR /&gt;context?&lt;BR /&gt;&lt;BR /&gt;2) In John's msg, he has "&amp;gt;&amp;gt;&amp;gt; dep sirr c",&lt;BR /&gt;which is a deposit cmd, but what is&lt;BR /&gt;"sirr" and why the value "c"?&lt;BR /&gt;&lt;BR /&gt;Thanks</description>
      <pubDate>Tue, 10 Feb 2004 11:23:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184448#M61904</guid>
      <dc:creator>Jack Trachtman</dc:creator>
      <dc:date>2004-02-10T11:23:00Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184449#M61905</link>
      <description>SIRR = software interrupt request register.&lt;BR /&gt;C = the IPL at which to request the interrupt, i.e. hex C = decimal 12.&lt;BR /&gt;The handler for software interrupts at IPL 12 is the recalculate-quorum routine, which has a prompt of IPC (interrupt priority C).&lt;BR /&gt;&lt;BR /&gt;The use of this routine is not recommended nowadays. Use AMDS or Availability Manager. If you have not got AMDS/AvailMgr set up, do so now! They are invaluable for many reasons.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 10 Feb 2004 12:42:01 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184449#M61905</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2004-02-10T12:42:01Z</dc:date>
    </item>
    <item>
      <title>Re: Quorum Disk Failure</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184450#M61906</link>
      <description>All,&lt;BR /&gt;&lt;BR /&gt;Remember when you fix the quorum by hand, you have the gate open for a split cluster.&lt;BR /&gt;&lt;BR /&gt;E.g. an interbuilding cluster with a quorum station in 1 building. The quorum station + 1 node in the same building go down. The remaining node in the other building is alive but blocked because of missing quorum. You adjust the quorum and the single node continues. The interbuilding link is down and the nodes in the other building resume activity (e.g. after power failure). You now have 2 clusters.&lt;BR /&gt;&lt;BR /&gt;That's why my procedures refuse to boot if I don't see all disks in both buildings.</description>
      <pubDate>Wed, 11 Feb 2004 02:32:12 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/quorum-disk-failure/m-p/3184450#M61906</guid>
      <dc:creator>Wim Van den Wyngaert</dc:creator>
      <dc:date>2004-02-11T02:32:12Z</dc:date>
    </item>
  </channel>
</rss>

