Operating System - OpenVMS

Re: Why mount the quorum disk ?

 
Wim Van den Wyngaert
Honored Contributor

Why mount the quorum disk ?

I have a 2-node cluster with a quorum disk.

If I boot the first node, it forms the cluster as if the quorum disk were a voting member, even though I didn't mount the quorum disk. But once the cluster is up and running, the current vote count is 2 instead of 3 (the votes field of show cluster in ana/sys).

It makes sense that the cluster formed as soon as the first node saw the quorum disk, because the quorum disk is only mounted much later in the boot. But why isn't its vote counted in the voting scheme?
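
For reference, the vote arithmetic behind this question can be sketched as below. It assumes the commonly documented OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down, and it assumes one vote per voting node, one vote for the quorum disk, and EXPECTED_VOTES = 3. The thread does not state these settings, so the numbers and the function names are illustrative only.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) // 2."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the quorum value."""
    return current_votes >= quorum(expected_votes)


# Assumed configuration (not stated in the thread):
# 2 voting nodes with 1 vote each, quorum disk with 1 vote.
EXPECTED_VOTES = 3
NODE_VOTES = 1
QDISK_VOTES = 1

scenarios = {
    "both nodes, quorum disk counted": 2 * NODE_VOTES + QDISK_VOTES,
    "both nodes, quorum disk not counted": 2 * NODE_VOTES,
    "one node, quorum disk counted": NODE_VOTES + QDISK_VOTES,
    "one node, quorum disk not counted": NODE_VOTES,
}

for name, votes in scenarios.items():
    state = "keeps quorum" if has_quorum(votes, EXPECTED_VOTES) else "loses quorum"
    print(f"{name}: votes={votes}, quorum={quorum(EXPECTED_VOTES)} -> {state}")
```

Under these assumptions the cluster keeps running with 2 votes, but losing one voting node while the disk's vote is not counted would leave 1 vote, which is below quorum. That is why the missing vote matters even though the cluster formed.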

Wim
19 REPLIES
faris_3
Valued Contributor

Re: Why mount the quorum disk ?

Hi,

Even when not mounted, the quorum disk should contribute its vote; see
http://h71000.www7.hp.com/wizard/wiz_8592.html

So there must be another problem. Any OPCOM messages? What is the value of QF_VOTE in sh cluster/cont?


Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

When I mounted the disk, the votes increased. So that link is not correct (at least for nodes running 7.2).

In operator.log I find "Please mount the quorum disk", that's all.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

QF_VOTE is now YES. Its value before the change is unknown.

In the status I saw qf_active but not qf_watcher.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

However, QF_VOTE was missing from the status summary in SDA before the mount, so it was NO.

Wim
faris_3
Valued Contributor

Re: Why mount the quorum disk ?

I think access to the quorum file has always used physical I/O, so the device need not be mounted initially to give its vote.

But in older versions of VMS, dismounting the quorum disk clears the valid bit for the unit, which makes a readpblk fail (VOLINV).

Did you dismount the quorum disk?
Or perhaps some event or error cleared the valid bit of the device.

(An SDA show device before the disk was mounted could have given a hint.)
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

The disk had not been mounted for a long time. So no: the disk was never mounted during the life of the cluster. But I can't replay the scenario because the cluster is now running production.

Small note: the cluster has 2 AS1000s running 7.2, with votes and access to the disks. The same cluster also has 2 AS4100s running 6.2-1H3; these have no votes and no qdisk setup.

Wim
John Abbott_2
Esteemed Contributor

Re: Why mount the quorum disk ?

A system only becomes a qf_watcher when the disk_quorum device is mounted on that system. See http://h71000.www7.hp.com/doc/731FINAL/4477/4477pro_002.html#integ_avail (the points under section 2.3.9).
Don't do what Donny Dont does
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

I have a similar cluster for testing.

I removed the mount of the quorum disk.
The cluster started without any problem; QF_VOTE is YES and votes is 3, so that is correct.

So the mount is not needed to boot, nor to have the disk's vote counted.

In my original case I rebooted all 4 nodes at once. I guess something went wrong then.
But it WAS solved by doing the mount (I can prove it with console output).

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Why mount the quorum disk ?

On the same cluster, I shut down 1 of the 2 voting systems with the REMOVE_NODE option.

MVTIMEOUT 3600
SHADOW_MBR_TMO 120

The disks are connected to a dual controller that is in turn connected to the 2 voting systems. For the non-voting members, the disks are MSCP served. None of the systems is fully patched.

At 19:12:45 the cluster transition started
At 19:12:45 cluster transition completed
At 19:12:48 the first mount verification message was displayed
At 19:12:49 the message "xxx has been removed from VMScluster" was displayed
At 19:12:50 the first mount verification complete message was displayed
At 19:13:04 the cpu of xxx halted
At 19:15:35 a disk was thrown out of the shadow set
At 19:16:29 the just-removed disk was reported offline, and 1 second later it was AGAIN removed from the shadow set
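
The elapsed times in this timeline can be checked with a small sketch like the one below. The timestamps and the SHADOW_MBR_TMO value are copied from this post; the 19:16:30 entry is derived from "1 second later". Measuring from the first mount verification message is an assumption made for the arithmetic only, and the sketch makes no claim about when OpenVMS actually starts the SHADOW_MBR_TMO timer.

```python
from datetime import datetime

SHADOW_MBR_TMO = 120  # seconds, as posted above

# Timestamps copied from the timeline; the labels are shorthand.
events = [
    ("19:12:48", "first mount verification message"),
    ("19:12:50", "first mount verification complete"),
    ("19:13:04", "cpu of xxx halted"),
    ("19:15:35", "disk removed from shadow set"),
    ("19:16:29", "removed disk reported offline"),
    ("19:16:30", "disk removed from shadow set again"),  # "1 second later"
]


def seconds_between(start: str, end: str) -> int:
    """Whole seconds from start to end, both HH:MM:SS on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Assumption: count from the first mount verification message.
start = events[0][0]
offsets = [(seconds_between(start, stamp), label) for stamp, label in events]

for elapsed, label in offsets:
    marker = " (past SHADOW_MBR_TMO)" if elapsed > SHADOW_MBR_TMO else ""
    print(f"+{elapsed:>3}s  {label}{marker}")
```

Counted this way, the first removal comes 167 seconds after the first mount verification message and the second removal 222 seconds after it, so both fall beyond the 120-second setting.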

My questions:
1) How long can a mount verification take? In this case the task was twofold: the remaining voting member had to take control of all disks, and it had to serve them to the non-voting members. Still, 120 seconds seems a long time to me.

2) How is it possible that the disk is removed twice?

3) The 2nd disk removal is about 220 seconds after the mount verification started. Why is it allowed to overrun the timeout (SHA...TMO is 120)?

Wim