Re: New clustered shadowset

 
Aaron Sakovich
Super Advisor

Woot! I knew asking here would elicit tons of good tips!

Jack, et al. -- yes, the disks are MSCP served to the cluster members; there is no shared storage.

John, you talk about using /Policy=Require_Members. The downside that I see to that is that if our cluster boots and 2 members of the 3 are available, quorum is reached, but we still would not see the volume mounted until the 3rd system booted, right? That's not our goal -- we're happy with 2 members of the cluster. It's a development & QA cluster, so there's a good chance that we'll have at least one system down or rebooting for any variety of reasons. And because of that, I'm loath to do anything that's going to require massive amounts of hand-waving and magical incantations to get the system up and running after just one host reboots.

Martin and Jan -- about dismounting DSA vs. DKB during shutdown. I'm trying to figure out if the best bet would be to first dismount the local drive DKB200 so that it's removed cluster-wide and then dismount the DSA shadowset on the local host, OR should I just dismount the DSA volume on the local host? If I dismount the drive, it doesn't remove it from the permanent definition of the shadowset, it just tells the other hosts that this drive is unavailable until it's mounted again, right? With the common system files on the shadowset, will I even be able to dismount the DSA volume on the local host during SyShutdwn? Is there anything special that I should do (or even could do) to facilitate the clean removal of the local drive with these shared files?

And, yes, the common drive is to be used for the standard complement of shared files: SysUAF, RightsDB, TCPIP$Hosts, Qman, LPD spool directory, etc.
Ian Miller.
Honored Contributor

If you dismount the local drive, the other members of the shadow set will update the definition of the shadow set (in the SCB), so that the local drive will require a full copy when it is added back into the shadow set.

Just dismount the DSA. There will always be files open (UAF, queue files, etc.), though, so I think it unlikely that you can avoid the resulting shadowset merge.
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

This is the mount procedure we use on all our systems. You use it as

@xx single_disk dsa1 disk1 $45$dka401,$45$dka301

Note that some parts may require our context.

Wim
Aaron Sakovich
Super Advisor

Ah, I found the HBMM docu on one of my 7.3-2 systems -- it was included in a subsequent UPDATE ECO. I'm going to transfer it to my PDA so I can read it at the coffee shop... ;^)

Thanks to all for your input -- I'm going over the info you've provided to see if I can come up with any more questions.
Robert_Boyd
Respected Contributor

Ian,

It depends on how you do the DISMOUNT of the local drive -- from the HELP DISMOUNT /POLICY text:


/POLICY=[NO]MINICOPY[=(OPTIONAL)] (Alpha/I64 only)

Controls the setup and use of the shadowing minicopy function.

Requires LOG_IO (logical I/O) privilege to create bitmaps.

The exact meaning of the MINICOPY keyword depends on the context
of the DISMOUNT command, as follows:

1. If this is a dismount of a single member from a multi-member
shadow set, a write bitmap is created to track all writes
to the shadow set. This write bitmap may be used at a later
time to return the removed member to the shadow set with a
minicopy.

If the write bitmap cannot be initiated and the keyword
OPTIONAL is not specified, the dismount will fail and the
member will not be removed.

If you omit the /POLICY qualifier or if you specify
/POLICY=NOMINICOPY, no bitmap will be created.
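As a rough illustration of case 1 in the help text, the strict form (without OPTIONAL) could be checked like this. It is only a sketch: the device name $10$DKB200: is borrowed from Aaron's posts, and the message text is made up for the example.

```dcl
$ ! Sketch only: remove one member and insist on a write bitmap.
$ ! $10$DKB200: is the member name used earlier in this thread.
$ SET NOON                              ! keep going so we can test $STATUS
$ DISMOUNT/POLICY=MINICOPY $10$DKB200:
$ IF .NOT. $STATUS THEN WRITE SYS$OUTPUT "No bitmap started; member was not removed"
```

With /POLICY=MINICOPY=OPTIONAL the member would be removed even if the bitmap could not be started, so a later full copy would then be needed.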

Cheers,

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Aaron Sakovich
Super Advisor

Hi, me again...

So, going over the stuff about HBMM, the one thing that has me concerned is this one statement in the OVMS FAQ:

"In a regular Merge operation, Shadowing uses an algorithm similar to the Full Copy algorithm (except that it can choose either of the members' contents as the source data, since both are considered equally valid)"

Ref:
http://hoffmanlabs.org/vmsfaq/vmsfaq_011.html#mgmt63_mm

Okay, so in my 3 host scenario, where I hope to have at least 2 hosts up at any time (quorum = 2), do I even need to use HBMM? Will MiniCopy handle both the expected departure of a host via shutdown, and the unexpected crash?

Or do I want to set up HBMM to handle the case when 2 of my hosts die, then at least one comes back? Will enabling HBMM in any way negatively affect the benefits of MiniCopy?

Oh, and I'm still wrestling with doing a "Dismount DSA10:" or "Dismount $10$DKB200:" in my shutdown. Or both. At the present time, I think the correct scenario would be to dismount the shadow set, then if the local drive is still mounted on any of the other hosts, dismount it (remembering to include the MiniCopy policy request). Is that legit, or is there a better/easier way to do it?
Jan van den Ende
Honored Contributor

Aaron,

"Merge" kicks in when two or three members of a shadowset stay online, but one node that has the set mounted leaves unexpectedly ("crashes").
Now, if any writes to that VOLUME are in progress (remember I emphasised the distinction between drive and volume?), there is NO way for the other nodes to know how far each of the parallel writes to each of the drives has progressed.
In other words, the members are NOT guaranteed to be identical, and that MUST be corrected. A "merge" is needed. As there was no way of knowing WHERE the writes-in-progress were, the WHOLE drive had to be checked. A VERY lengthy process! With HBMM, Engineering has found ways of keeping track of where any write is to occur and when it has completely finished, in such a way that the surviving cluster members have access to this info. Now, ONLY those disk blocks that might be affected need re-synching. A matter of seconds (or fractions thereof).

COPY (and MINICOPY) occurs when a DRIVE is (re-)added to the volume (following a MOUNT). Of course, MINIcopy is only possible if the drives once WERE identical (i.e., they formed a shadowset, and one drive was DISMOUNTed), AND all changes since the separation have been tracked.

Completely different scenarios, and rather different objects, with different algorithms. Of course, the overall objective is the same in both: all drives reliably contain the same info in the same location, i.e., the Volume is consistent.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Aaron Sakovich
Super Advisor

Jan,

Super -- that made things clear.

I've also just found this thread in Italy:

http://www.eleuteron.it/mess_160141_3319014.html

Where the same thing is said, down towards the bottom. There's also an unanswered question in that thread about whether you should only use /Policy=MiniCopy in the dismount command. If I understand it correctly, it would only slow down your shutdown if you had to create the bitmap from scratch on all the nodes when you dismount the volume. It would make more sense to create it during the mount, but still verify that it's there during the dismount. Is that a valid statement?

I'm getting close -- thanks for everyone's help!
Robert_Boyd
Respected Contributor

Aaron,

You definitely want to enable minicopy by using /POLICY=MINICOPY on the mount.

If you want to speed up shutdown a bit, you can dismount the local member of a shadowset first with /Policy=Minicopy, then dismount the DSA device. Removing the member first will save the other cluster members from having to go through shadow member timeouts after the node leaves.
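A minimal sketch of that ordering as a SYSHUTDWN.COM fragment might look like the following. The device names DSA10: and $10$DKB200: are the ones Aaron mentioned; the F$GETDVI guards and the use of =OPTIONAL are assumptions added for the example, not something tested on this cluster.

```dcl
$ ! Sketch only: member first, then the virtual unit.
$ SET NOON                              ! do not abort shutdown on an error
$ ! Remove the local member, asking for a write bitmap if one can be started
$ IF F$GETDVI("$10$DKB200:","MNT") THEN DISMOUNT/POLICY=MINICOPY=OPTIONAL $10$DKB200:
$ ! Then dismount the shadow set on this node
$ IF F$GETDVI("DSA10:","MNT") THEN DISMOUNT DSA10:
```

This only illustrates the order of the two commands; it does not settle which node ends up holding the write bitmap.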

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Martin Hughes
Regular Advisor

> Oh, and I'm still wrestling with doing a "Dismount DSA10:" or "Dismount $10$DKB200:" in my shutdown. Or both. At the present time, I think the correct scenario would be to dismount the shadow set, then if the local drive is still mounted on any of the other hosts, dismount it (remembering to include the MiniCopy policy request). Is that legit, or is there a better/easier way to do it?
<

If your aim here is to achieve a MINICOPY then you need to dismount the member (DKBnnn) not the volume (DSAnnn).

Unless I am mistaken, if you dismount the DKB member using /POLICY=MINICOPY in SYSHUTDWN, this will create the master write bitmap on the system that is shutting down. This master write bitmap will be deleted when the node shuts down, and therefore a full copy will be required to add the member back into the shadowset.

What I would do is dismount the DKB member using /POLICY=MINICOPY from one of the other cluster members, so that the master write bitmap can be created and maintained there. And I would do it manually rather than trying to automate this process.
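Done by hand, that sequence could look roughly like this. Treat it as a sketch under assumptions: the volume label COMMON is hypothetical, the device names are the ones quoted above, and the commands are typed on a cluster member that stays up.

```dcl
$ ! On a member that is NOT shutting down: remove the drive, keep a bitmap here
$ DISMOUNT/POLICY=MINICOPY $10$DKB200:
$ SHOW DEVICE DSA10:                    ! confirm the remaining membership
$ !
$ ! ... the other node shuts down and later reboots ...
$ !
$ ! Return the drive to the set; COMMON is a made-up volume label
$ MOUNT/SYSTEM DSA10: /SHADOW=($10$DKB200:) COMMON /POLICY=MINICOPY
```

If the bitmap has been lost in the meantime, the returning member would need a full copy instead of a minicopy.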
For the fashion of Minas Tirith was such that it was built on seven levels, each delved into a hill, and about each was set a wall, and in each wall was a gate. (J.R.R. Tolkien). Quote stolen from VAX/VMS IDSM 5.2