Operating System - OpenVMS

Re: Why mount the quorum disk ?

 
Robert_Boyd
Respected Contributor

Re: Why mount the quorum disk ?

What was your value for RECNXINTERVAL set to on each of the nodes in the cluster at the beginning of this event?

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Keith Parris
Trusted Contributor

Re: Why mount the quorum disk ?

Mounting the quorum disk gives you the advantage of mount verification, and better path failover processing.

It is also necessary for the quorum disk to be mounted in order for the Cluster_Server process to be able to initially create the QUORUM.DAT file when the quorum disk is first used. But it sounds like you're long past that point in time.
Robert Brooks_1
Honored Contributor

Re: Why mount the quorum disk ?

Keith wrote . . .

Mounting the quorum disk gives you the advantage of mount verification, and better path failover processing.

---

As the multipath subsystem REQUIRES mount verification in order to work at all, a device mounted /NOMOUNT_VERIFICATION (or a quorum disk not mounted) will not fail over at all, so it's not really a case of "better path failover"; it's a case of whether or not there is any path failover at all.


-- Rob
Cass Witkowski
Trusted Contributor

Re: Why mount the quorum disk ?

I believe that the quorum disk will only be asked to cast a vote if it is needed. So when you boot up one node, the quorum disk is needed, but when the second node joins the cluster, the quorum disk is no longer needed and so it no longer votes.

Cass
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

RECNXINTERVAL is set to 100 seconds.

So there is no failover (not even via MSCP ???) when the quorum disk is not mounted. I consider this a shortcoming of the system (not to say a bug).
Instead of implementing it properly, they display a console message.

So the documentation should say:
Mounting is not required unless you need failover.

That leaves only my 3 questions, plus the question of why the quorum disk was not given its votes.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

Did some more testing. The cluster must initially have booted because 2 voting members were present, not because of the quorum disk. Sorry for that misinformation.

My further findings.

1. The file quorum.dat must exist for the disk to be counted in the voting scheme (that's why it didn't count in my original case: the disk had never been mounted).
2. The file quorum.dat is created as soon as the disk is mounted. So it is not created during the boot of the first cluster member to give the quorum disk its vote (the disk is not yet mounted, and it seems the file cannot be created without the disk being mounted).
3. The disk must be mounted to avoid console messages and to ensure failover (no proof, because I have no test environment for it).

My 3 questions remain.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

Correction on 3.

The message "please mount quorum disk" is given ONLY when quorum.dat doesn't exist, i.e., when the system is trying to create the file.

Wim
Keith Parris
Trusted Contributor

Re: Why mount the quorum disk ?

Cass wrote:
> I believe that the Quorum disk will only be asked to cast a vote if it is needed. So when you boot up one node the quorum disk is needed but when the second node joins the cluster the quorum disk is no longer needed and so it no longer votes. <

A quorum disk will contribute votes whenever it can be validated, whether the votes are needed or not. Wim explained that the reason he didn't see the quorum disk votes initially was that prior to his mounting the quorum disk, a QUORUM.DAT file did not yet exist on the disk, so it couldn't vote.
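
To make the vote counting concrete, here is a minimal Python sketch of the behavior described above. It is a simplified model, not the actual connection manager logic: the function and class names are hypothetical, and the only rules it encodes are that quorum is derived from EXPECTED_VOTES as (EXPECTED_VOTES + 2) / 2, truncated, and that the quorum disk adds its QDSKVOTES only when QUORUM.DAT exists and the disk has been validated, regardless of whether those votes are needed.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """A cluster node and its VOTES setting (hypothetical model type)."""
    name: str
    votes: int


def quorum_for(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def current_votes(members, qdskvotes, quorum_dat_exists, disk_validated):
    """Sum the votes of the present members plus the quorum disk's votes.

    The disk contributes whenever it can be validated, needed or not.
    Without a QUORUM.DAT file it cannot be validated, so it adds nothing.
    """
    node_votes = sum(m.votes for m in members)
    disk_votes = qdskvotes if (quorum_dat_exists and disk_validated) else 0
    return node_votes + disk_votes


if __name__ == "__main__":
    nodes = [Member("NODEA", 1), Member("NODEB", 1)]
    needed = quorum_for(expected_votes=3)  # 2 nodes + 1 quorum disk vote

    # Before the first mount: no QUORUM.DAT yet, so the disk cannot vote.
    before = current_votes(nodes, qdskvotes=1,
                           quorum_dat_exists=False, disk_validated=False)
    # After the file exists and the disk is validated, it always votes.
    after = current_votes(nodes, qdskvotes=1,
                          quorum_dat_exists=True, disk_validated=True)
    print(f"quorum={needed} votes_before={before} votes_after={after}")
```

In this model the two nodes alone already meet quorum, which matches Wim's finding that the cluster initially booted on its 2 voting members while the disk contributed nothing.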

Wim wrote:
> 1) how long can a mount verification take ? And in this case, the task was dual : the remaining voting member must take control of all disks and it must serve them to the non-voting members. 120 seconds seems a long time for me. <

A mount verification can take up to MVTIMEOUT seconds. :-)

After the node goes away, with RECNXINTERVAL set to 100, you're going to wait 100 seconds before you give up and have a state transition to throw the node out of the cluster.

That plus the 120 seconds for SHADOW_MBR_TMO might explain the 220 seconds you saw.
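
As a worked version of that arithmetic, the sketch below simply adds the two waits. It assumes, as the explanation above suggests but does not prove, that the RECNXINTERVAL wait and the SHADOW_MBR_TMO wait run back to back; the function name and dictionary keys are hypothetical.

```python
def removal_timeline(recnxinterval_s: int, shadow_mbr_tmo_s: int) -> dict:
    """Return event offsets in seconds, measured from when the node goes away.

    Assumption: the shadow member timeout starts only after the state
    transition that removes the node, so the two waits are summed.
    """
    node_removed_at = recnxinterval_s
    shadow_member_removed_at = node_removed_at + shadow_mbr_tmo_s
    return {
        "node_removed_from_cluster": node_removed_at,
        "shadow_member_removed_about": shadow_member_removed_at,
    }


if __name__ == "__main__":
    # Values reported in this thread: RECNXINTERVAL=100, SHADOW_MBR_TMO=120.
    print(removal_timeline(100, 120))
```

With the values from this thread, the second offset comes out at 220 seconds, the delay Wim observed.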

> 2) how is it possible that the disk is removed 2 times ? <

Could that be an OPCOM message from another node?

> 3) the 2nd disk removal is almost 220 seconds after the mount verification was started. Why is it permitted to do overtime (SHA...TMO is 120)? <

SHADOW_MBR_TMO specifies the minimum acceptable amount of time we wait before a member is removed; it is allowed to take longer, but the disk won't be thrown out any sooner than this.

John wrote:
> A system only becomes a qf_watcher when the disk_quorum device is mounted on that system. See http://h71000.www7.hp.com/doc/731FINAL/4477/4477pro_002.html#integ_avail 2.3.9 points. <

A system can become a quorum disk watcher without the disk being mounted. The document you cite correctly advises "To permit recovery from failure conditions, the quorum disk must be mounted by all disk watchers" but technically, to become a quorum disk watcher only requires a proper setting for the SYSGEN parameter DISK_QUORUM and successful direct (not MSCP-served) access to the disk (see Roy Davis' VAXcluster Principles, p. 7-15).
Cass Witkowski
Trusted Contributor

Re: Why mount the quorum disk ?

Thanks for the clarification Keith.
Jan van den Ende
Honored Contributor

Re: Why mount the quorum disk ?

Thanks, Keith.

To me, this is all the more reason to avoid quorum disks whenever a valid quorum scheme can be constructed without one (i.e., ANY configuration with more than 2 nodes is better off without).

Just my EUR 0.02

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.