Automatic mounting of HBVS disks in systartup_vms.com

 
SOLVED
MarkOfAus
Valued Contributor

Automatic mounting of HBVS disks in systartup_vms.com

Hello all,

What is the consensus, if any, on automatically mounting DSAnnn: devices in systartup_vms.com? I am specifically talking about data disks, that is, disks that hold no information important to the system and are used purely to store user data.

While playing around with LD the other day, I managed to make a freshly initialised logical disk the master and a disk holding the data the copy target. Not good. (This was while creating a NEW shadow set.)

I have read extensively on the issue of synchronising the disks, but there are so many scenarios that can occur that I wonder whether an automatic mount of data disks is a "foolproof" way to get the system running after a reboot.

A scenario:

One server, the primary server, nodeA, has a HBVS volume called DSA1:, made up of $3$dka100 and $4$dka100. 3 is the allocation class for nodeA, 4 is that of nodeB.

Suppose the device $3$dka100 drops out of the volume. I am supposing some mount verification happens anyway, and then writing of data resumes to DSA1:, which now has only $4$dka100 as a member, residing on the other node, nodeB.

If nodeB then goes down and $3$dka100 comes back online, is the data then corrupted?


I realise this is all covered in chapter 6 of "Volume Shadowing for OpenVMS", but I have learnt the hard way that manuals are often vague, partially correct until you read several other manuals or just plain wrong.

So, in summary, can I trust implicitly that the shadow set generation number will protect my data from being wiped by an older member starting to copy to a newer member?


I hope I am not too vague in my description.

Regards,
Mark
19 REPLIES
MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Sorry, the following should come after the second-last paragraph:

And how do I prevent the scenario of $3$dka100 becoming the only member of DSA1: after leaving the volume, and new data then being written on top of its older data (the newer data is on $4$dka100)?
Robert Brooks_1
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

"So, in summary, can I trust implicitly that the shadow set generation number will protect my data from being wiped by an older member starting to copy to a newer member?"

--

In general, yes, the generation number is the way that the shadowing driver determines in which direction a copy goes.

In your setup, it appears as though you've got two (or more) nodes with the storage local to a single node, and is MSCP-served to the other nodes. If that's not correct, then I think we'll need a bit more detail w.r.t. your configuration.

Still, MSCP or not, the shadowing driver will not clobber data. I'm not going to state, however, that it absolutely, positively cannot happen. Bugs do happen, and the shadowing driver is likely the biggest and most complex of all the VMS device drivers.

End-user error, however, can lead to "bad things" happening. I cannot guess what went wrong with your LD-created device experiment.

-- Rob
Volker Halle
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

if you have the members of a shadowset connected to 2 different systems - not connected to a shared IO bus (I call this a poor-man's shadowing configuration) - and you're relying on MSCP serving the members to the other node, it may be a good idea to:

- use MOUNT/NOCOPY, which prevents the shadowset from being mounted if shadowing decides that a shadow copy would be required

or

- explicitly check that both members exist AND that their hosts are available, using F$GETDVI("disk","HOST_AVAIL"), before issuing the MOUNT command.
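
As a rough sketch of the second option (not a tested procedure): the device names come from the scenario earlier in the thread, the label DATA1 is a made-up placeholder, and the EXISTS check is an addition here so that HOST_AVAIL is never requested for a device the node cannot see yet.

```dcl
$! Sketch only. Device names are from the example in this thread;
$! the volume label DATA1 is a hypothetical placeholder.
$ members = "$3$DKA100:,$4$DKA100:"
$ ok = 1
$ i = 0
$check_loop:
$ dev = f$element(i, ",", members)
$ if dev .eqs. "," then goto check_done      ! no more list elements
$ if .not. f$getdvi(dev, "EXISTS")
$ then
$     ok = 0                                  ! device not configured on this node
$ else
$     if .not. f$getdvi(dev, "HOST_AVAIL") then ok = 0
$ endif
$ i = i + 1
$ goto check_loop
$check_done:
$ if ok
$ then
$     mount/system dsa1: /shadow=('members') DATA1
$ else
$     write sys$output "Not all members of DSA1: are reachable, automatic mount skipped"
$ endif
```

If either test fails, nothing is mounted and the decision is left to whoever is at the console.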

While experimenting with LD, did you INIT the new volume with another label? Or did you do:

$ MOUNT/SHAD=(LDA1:,LDA2:) DSA1: label
$ DISM DSA1:
$ INIT LDA1: label
$ MOUNT/SHAD=(LDA1:,LDA2:) DSA1: label

It has never happened to me that a shadow copy went in the wrong direction. When mounting shadow members manually, I always use /CONFIRM and look at the output from MOUNT before I answer "Y" ...

Volker.
MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Volker,

This "poor man's shadowing configuration" is exactly what we've got here.

So, using mount/nocopy would, as expected, not allow the copy to proceed. I thought of this. I could then use analyze/system to determine the master, and then what?

What if it thinks it's the master when in fact the other disk is? Or can this not happen due to the generation number?

(PS. Is there another way of seeing which is the master, i.e. a lexical function?)


With the LD, I did, I think (I did so many to test various things), init/erase lda2: data99
Then I did the mount dsa99: /shad=(lda1:,lda2) data99

lda1: was already existing with data on it.

Then I watched the contents of lda1: being nulled out.

I too use /confirm when manually mounting shadow sets. That is good advice for all.

Regards
Mark
MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Robert,

Two nodes, local storage, each contributing one disk to make up a two-member shadow set.

There are 3 shadow sets. Interconnect is NI.

The data is being backed up, so if there is a corruption, we can rebuild, but that is not my concern. Bugs also can't be helped. What I was trying to discern is the likelihood of corrupting data if one node begins a copy when the other is more up to date.
Volker Halle
Honored Contributor
Solution

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

MOUNT/NOCOPY will not mount the shadowset DSAn: at all if a copy would be required. This may not be what you want during startup, but it does protect against automatic shadow copies starting during boot. Determining 'who is the shadow master' of a shadowset that is not mounted can only be done with tools like DISKBLOCK (from the OpenVMS freeware disk), which can read and display/format the SCB (Storage Control Block) of a disk (when mounted foreign). You would then have to evaluate manually the same data that MOUNT evaluates during MOUNT, and MOUNT can only do that if it can access the SCBs of both potential members - that IS the critical point!

So you should rewrite your startup mount procedures so that an automatic mount only occurs if both members are visible and their hosts are available (i.e. cluster members and MSCP servers). Then let shadowing decide who the master is...


"With the LD, I did, I think (I did so many to test various things), init/erase lda2: data99
Then I did the mount dsa99: /shad=(lda1:,lda2) data99
lda1: was already existing with data on it."


This is what I had expected. By INIT LDA2:, you write a new shadow generation date on the disk. And shadowing will honor that date and - of course - copy from the NEW to the OLD disk! If you had used another label for INIT LDA2:, this should not have happened.

Volker.
Robert Gezelter
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

To paraphrase Rob slightly, it is not clear precisely what happened during your experiment. To clarify the conversation, I would propose that you add the following intermediate step to your experiment and repeat it.

After initializing the dummy volume, and BEFORE mounting either volume, dump the Shadowing control blocks on the volume in hexadecimal using DUMP (after, of course, MOUNTing each volume /FOREIGN /NOWRITE).

Using the DUMP of the shadowing structures, follow through the process described in the documentation.

Often, behavior that seems counterintuitive becomes far clearer, or at least comprehensible, once the on-disk structures have been examined, which simple experiments that do not yield the anticipated result cannot show on their own.

- Bob Gezelter, http://www.rlgsc.com
Volker Halle
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

it may be wise (especially in your configuration) to INIT new disks that are to be brought into existing shadowsets with a label like TEST, and NOT with the same label as the existing shadowset. Then nothing bad can happen, even if your system crashes at the wrong time...

Volker.
Volker Halle
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

another thing to keep in mind: with this 'poor man's shadowing configuration', you are actually running something like a multi-site cluster. Strict rules need to be defined and followed in such a configuration to prevent data loss after (multiple) site failures. Your case is the easier one, as the cluster and IO interconnects are the same, i.e. the LAN. Things get even more interesting when using Fibre Channel SANs.

If you're running a current version of OpenVMS (like V7.3-2 or higher), you can also take advantage of the mini-copy feature to prevent full copies after a shutdown of one node, which would otherwise break the existing shadowsets once the outage exceeds SHADOW_MBR_TMO seconds. This can be done like this:

When shutting down node B, use SYSMAN to issue the following command on node A:

SYSMAN> SET ENV/NODE=nodeA
SYSMAN> DO DISM/POLICY=MINICOPY $4$DKA100
SYSMAN> EXIT

This will create a WBM (Write Bitmap) for $4$DKA100: on NodeA and keep track of further writes to the DSA1: shadowset from Node A. Once you boot NodeB again and remount the DSA1: shadowset, $4$DKA100: can be brought back into the shadowset without a FULL copy. This saves a lot of time and resources.
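
For the return path, something along these lines may serve as a starting point. It is a sketch, not a verified procedure: the label DATA1 is a placeholder, and it assumes the bitmap created by the dismount above still exists when nodeB is back. Check HELP MOUNT/POLICY on your version for what happens when no bitmap is available.

```dcl
$! Sketch only. DATA1 is a hypothetical volume label.
$! On nodeA, after the DISMOUNT/POLICY=MINICOPY: optionally list the write bitmaps.
$ SHOW DEVICE/BITMAP DSA1:
$! Once nodeB is up again and $4$DKA100: is being served,
$! add the member back and ask for a minicopy instead of a full copy.
$ MOUNT/SYSTEM DSA1: /SHADOW=($4$DKA100:) /POLICY=MINICOPY DATA1
```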

Volker.
John Gillings
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

Yes, you can trust the generation number (GN) to protect your data, BUT (and this is a BIG BIG BUT!) MOUNT needs to be able to see ALL the generation numbers of ALL potential virtual units to know which one is the latest.

For example, if we have VU1 with GN=X and VU2 with GN>X

If we MOUNT VU1 and form a new shadowset using VU1 alone, MOUNT has no knowledge of VU2. The generation number of the newly formed shadow set will be updated, and will be higher than that of VU2. If VU2 is now added to the shadowset, it will be overwritten, thus losing data.

A close-to foolproof mechanism is to always specify ALL members in a MOUNT command and include /POLICY=REQUIRE_MEMBERS. If any VUs aren't available, the MOUNT will fail. This should eliminate the scenario above.
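
A minimal sketch of such a MOUNT in a startup procedure, assuming the two members from Mark's scenario and a placeholder label DATA1; the status handling is just one way to keep a failed mount from stopping the rest of the procedure.

```dcl
$! Sketch only. Device names are from the scenario in this thread;
$! DATA1 is a hypothetical volume label.
$ set noon                      ! do not let a failed MOUNT abort the procedure
$ mount/system dsa1: /shadow=($3$dka100:,$4$dka100:) /policy=require_members DATA1
$ mount_status = $status
$ set on
$ if .not. mount_status
$ then
$     write sys$output "DSA1: not mounted; check the members and mount by hand with /CONFIRM"
$ endif
```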

The only problem is, if you're recovering from a site failure, you know there will be some members missing, and therefore the mounts will fail. You'll therefore require some kind of manual override to recover the systems with reduced shadowsets. One possibility is to use a user defined SYSGEN parameter to override the policy, but a safer option for disaster recovery is to skip all attempts to mount automatically and do the recovery by hand.
A crucible of informative mistakes
Wim Van den Wyngaert
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

We do it as John Gillings said. We use USERD2 with values 1 and 2 (default 0) to regulate special actions:
1 : not all disks are required but only continue if at least 1 disk is present
2 : try mount with available disks and continue if it fails

Note that we wait up to 60 seconds for the devices to exist and become available (slow config process). After this we abort the boot (unless USERD2 says to continue).
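
A rough sketch of how that could look, for illustration only: it is not Wim's actual procedure. The device names and the label DATA1 are placeholders, the 5-second polling interval is an assumption, and instead of aborting the boot it only prints a message.

```dcl
$! Sketch only. USERD2 semantics follow the description above;
$! device names and the label DATA1 are hypothetical.
$ override = f$getsyi("USERD2")      ! 0 = normal, 1 or 2 = special action
$ waited = 0
$wait_loop:
$ have_a = f$getdvi("$3$DKA100:", "EXISTS")
$ have_b = f$getdvi("$4$DKA100:", "EXISTS")
$ if have_a .and. have_b then goto mount_full
$ if waited .ge. 60 then goto members_missing
$ wait 00:00:05
$ waited = waited + 5
$ goto wait_loop
$mount_full:
$ mount/system dsa1: /shadow=($3$dka100:,$4$dka100:) /policy=require_members DATA1
$ goto finished
$members_missing:
$ if override .eq. 0
$ then
$     write sys$output "Shadow set members missing and USERD2 is 0: DSA1: left unmounted"
$ else
$     write sys$output "USERD2 is ''override': reduced mount of DSA1: is a site decision"
$ endif
$finished:
$ exit
```

What the non-zero branches actually do (mount a reduced set, or carry on without the disk) is exactly the part each site has to decide for itself.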

Wim
MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Jon,

" A close-to foolproof mechanism is to always specify ALL members in a MOUNT command and include /POLICY=REQUIRE_MEMBERS. If any VUs aren't available, the MOUNT will fail. This should eliminate the scenario above.
"

Can you then clarify something for me? Is /policy=require_members only applicable at mount time? That is, if we need to bring down a node or take a disk out of the shadow set, will this be allowed? If that is the case, then /policy=require_members is the answer.

MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

John,
Apologies.


" A close-to foolproof mechanism is to always specify ALL members in a MOUNT command and include /POLICY=REQUIRE_MEMBERS. If any VUs aren't available, the MOUNT will fail. This should eliminate the scenario above.
"

Can you then clarify something for me? Is /policy=require_members only applicable at mount time? That is, if we need to bring down a node or take a disk out of the shadow set, will this be allowed? If that is the case, then /policy=require_members is the answer.
Volker Halle
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

while checking the keywords for MOUNT/POLICY, you may also consider the /POLICY=VERIFY_LABEL keyword.

This keyword will only allow disks to become a copy target, if they have a label of 'SCRATCH_DISK'.

In your LD: example, you could have at least prevented overwriting LDA1: with /POL=VERIFY_LABEL

Volker.
John Gillings
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

/POLICY=REQUIRE_MEMBERS means all volumes in the MOUNT command must be accessible for the mount to complete. Once the shadowset is created, members may be removed or added as per normal. Once the shadowset is established, as long as you don't dismount the last member, the generation number should protect you (on the other hand, it's a good policy to never reduce a shadowset below 2 members).

The purpose of REQUIRE_MEMBERS is to satisfy exactly your concern - to make sure the correct member is chosen as the master when a shadowset is first formed.

As Volker has suggested, if he's right about the explanation for your "accident", that could have been prevented with /POLICY=VERIFY_LABEL.

See HELP MOUNT/POLICY. When used correctly, you can use it to construct MOUNT commands that will guarantee data integrity of the resulting shadowset, or fail.

Note - for maximum safety, you should write procedures to define all the shadowset actions you want to permit, and strictly enforce operational rules that ONLY those procedures may be used to affect shadowsets. No matter what you do, your shadowsets are always vulnerable to (privileged!) fat finger commands.

Look after your shadowsets and everything else will follow.
A crucible of informative mistakes
MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

John,

"The purpose of REQUIRE_MEMBERS is to satisfy exactly your concern - to make sure the correct member is chosen as the master when a shadowset is first formed."

That's great. I will use it as the standard way of mounting disks automatically. On the occasions when it doesn't have the other disk accessible, I will use the USERD1 parameter to modify the behaviour of the automounting command file. Brilliant!


"As Volker has suggested, if he's right about the explanation for your "accident", that could have been prevented with /POLICY=VERIFY_LABEL."

Yes, I see it now. The label is what undid me, so in future I know either not to use the same label, or to use /policy=verify_label together with the "scratch_disk" label.


"Note - for maximum safety, you should write procedures to define all the shadowset actions you want to permit, and strictly enforce operational rules that ONLY those procedures may be used to affect shadowsets. No matter what you do, your shadowsets are always vulnerable to (privileged!) fat finger commands."

This is precisely what I am trying to pursue. As I am not the only one with the ability to modify/reboot/wreak havoc on the system but I am responsible for the procedures around operating the servers, I needed to create such procedures that avoid "fat finger commands." (so eloquently said).

Thank you for your sage advice.


MarkOfAus
Valued Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Use /policy=require_members to prevent a possible corruption when creating shadow sets.

Use sysgen parameters to dictate the actions in non-normal situations OR manually mount the shadow sets with /confirm.

When creating shadow sets, avoid using the same label as the other device, to prevent the wrong shadow copy master being chosen.

Thank you to all for your invaluable assistance.

Regards
Mark
Jon Pinkley
Honored Contributor

Re: Automatic mounting of HBVS disks in systartup_vms.com

Mark,

I realize this thread is closed, but I've done some experimenting, and the only way I can get a copy to go in the wrong direction is if I mount the newly initialized member into a single-member shadowset and then dismount the DSA virtual unit. This sets the generation number to the current 64-bit system time, which then looks more current than the other shadow member.

Initializing a disk sets the generation to zero, so unless something is done to change that, the newly initialized disk will not be used as the copy source. This is true whether or not init/shadow is used.

Here's a summary:

Preexisting conditions:

DSA99: mounted with members $4$lda1,$4$lda2 in steady state condition.

$dismount [/cluster] dsa99: ! at this point both $4$lda1 and $4$lda2 are identical.

$ init $4$lda1: itrcshad ! generation = 0
$ mount dsa99: /shadow=($4$lda1:,$4$lda2) /policy=require_members itrcshad ! this will overwrite $4$lda1, since $4$lda2 now has higher gen

If instead of the above, we did

$ init $4$lda1: itrcshad ! generation = 0
$ mount dsa99: /shadow=($4$lda1:) itrcshad/nocopy ! /nocopy ensures that this is the only member in the shadowset
$ dism dsa99: ! after this generation set to current time, i.e. more current than $4$lda2:
$ mount dsa99: /shadow=($4$lda1:,$4$lda2) /policy=require_members itrcshad ! this will overwrite $4$lda2, since $4$lda1 now has higher gen

When mounting shadowsets interactively, always use /confirm, as it will tell you what will be the copy target, and ask for your permission to proceed. If you don't expect a copy to occur, specify /nocopy to ensure that it does not.

To reset the shadow generation number to 0: $ mount/override=(shadow)

The advice about /policy=require_members is good, and the advice to initialize with a different label is good.

I have attached a log of my experiments showing the nitty gritty details, including the use of the freeware tool DISKBLOCK.

Have fun,

Jon
it depends