Operating System - OpenVMS

Re: Booting from System Disk Shadow Set

 
VMS Support
Frequent Advisor

Booting from System Disk Shadow Set

Question ?

The shadow set consists of two members:
DSA0 = DGA100 + DGA200
There are two nodes in the cluster: SERVER1 and SERVER2.
Both have their boot path set to DGA100: root 0 (SERVER1) and root 1 (SERVER2).
At some point DGA100 leaves the shadow set (via DISMOUNT/CLUSTER).
It remains out of the shadow set for a period of time (10 hours).
SERVER2 is shut down and needs to reboot. It will boot from DGA100, which is still not a member of DSA0 since the dismount some 10 hours earlier.

1) Will SERVER2 boot?
2) Will it boot and then bugcheck when shadow set DSA0 forms? (That is what I think it will do.)

The only way to boot the server is to change the boot path to DGA200, or to add DGA100 back to the shadow set (DSA0) and wait for the copy to complete.

One for the panel.
Jan van den Ende
Honored Contributor
Solution

Re: Booting from System Disk Shadow Set

Hi,

The simplest (and most transparent) solution:

set BOOTDEF_DEV to DGA100,DGA200,

although, AFAIK, if you boot from DGA200 (and did not do any tricky things with your system disk definitions in ALPHAVMSSYS.PAR, and this incarnation of DSA0 is still the same one that DGA100 was a member of, meaning SERVER1 has stayed up), the following will happen (only the steps relevant to the explanation are listed; all others are omitted):
a. DGA100 is -ACCESSED- (but NOT "MOUNTED"), and among other things ALPHAVMSSYS.PAR is read.
b. DSA0 is found to be specified as the system disk
c. DSA0 is found to be mounted, with member(s) DGA200
d. DGA100 is de-accessed
e. DSA0 is mounted, that being with DGA200 as current member(s)
(potentially:) if DGA100 is mounted to DSA0, it will be recognised as needing a SHADOW_COPY.
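
The steps above can be written out as a toy model. This is only a sketch of the sequence as described in this reply, not of how OpenVMS implements it; the function name `resolve_boot`, its parameters, and the returned fields are invented for illustration.

```python
def resolve_boot(boot_device, system_disk, mounted_members):
    """Toy model of steps a-e above (all names are hypothetical).

    boot_device     -- physical disk the node boots from, e.g. "DGA100"
    system_disk     -- system disk named in the parameter file, e.g. "DSA0"
    mounted_members -- members of system_disk as currently mounted by the
                       surviving node(s); empty if nobody has it mounted
    """
    members = list(mounted_members)
    if not members:
        # No surviving node holds the shadow set: steps a-e assume one does,
        # so this model deliberately refuses to guess.
        raise ValueError("shadow set not mounted anywhere; steps a-e do not apply")
    # Steps a-d: the boot device is only read, then released.
    # Step e: the booting node joins the set with its current membership.
    return {
        "system_disk": system_disk,
        "members": members,
        # A boot device outside the current membership would need a
        # shadow copy if it were later added back to the set.
        "boot_device_needs_copy": boot_device not in members,
    }


result = resolve_boot("DGA100", "DSA0", ["DGA200"])
print(result["members"], result["boot_device_needs_copy"])
```

The `ValueError` branch marks the case this explanation does not cover: no node has stayed up, so there is no mounted DSA0 to compare against.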

So, essentially, it boils down to the fact that Engineering (way back when) also thought up your kind of problem and, refusing to allow this potential corruption, solved it for us.

Cheers.

Have one on me.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
John Gillings
Honored Contributor

Re: Booting from System Disk Shadow Set

You appear to have fibre channel disks (DGA), in which case there are probably multiple paths to the drives (4?). The recommendation from engineering is to define BOOTDEF_DEV to contain all 4 paths to your notional primary disk (DGA100), and that all members of the cluster define the same list (though you may wish to change the order to distribute load across the paths).
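
As a small illustration of "same list, different order", here is a sketch that rotates one shared path list per node so that each node tries a different path first. The path names are placeholders rather than real console device names, and `boot_path_order` is a made-up helper; the actual BOOTDEF_DEV values would still be typed in at each console.

```python
def boot_path_order(paths, node_index):
    """Rotate the shared path list so each node starts on a different path."""
    if not paths:
        return []
    shift = node_index % len(paths)
    return paths[shift:] + paths[:shift]


# Placeholder names for the four paths to the primary disk (assumption).
PATHS = ["path_a", "path_b", "path_c", "path_d"]

for index, node in enumerate(["SERVER1", "SERVER2"]):
    # Every node lists all paths; only the starting point differs.
    print(node, ",".join(boot_path_order(PATHS, index)))
```

Each node still lists every path, so losing any single path does not prevent a boot; only the first choice differs.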

Loss of DGA100 would be considered a serious failure, and you should NOT attempt to recover from it automatically. OpenVMS *MIGHT* get it right, but I don't think it's possible to guarantee it will always "do what you mean".

The big trick in recovery is to always know what it is you're recovering from. Shadow sets are the trickiest to get right, and there are many cases where you simply cannot trust the OS to do it automatically (even if it's OpenVMS!). You really need a person to make sure important data is not destroyed.

For non-system disks, I recommend you always use /POLICY=REQUIRE_MEMBERS (V7.3 and higher) when mounting shadow sets. This means the shadow set won't mount unless every member named on the MOUNT is present. That ensures there isn't a more recent member "out there" that you missed, and avoids unnecessary shadow copies and merges. If you want to get sophisticated, you can use one of the user SYSGEN parameters to indicate to startup that you're recovering from a failure, to allow shadow sets to mount with members missing.
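
The rule behind that recommendation can be sketched as a toy decision function. This only models the policy as described here, not the MOUNT implementation; `may_mount` and its `recovering` argument are hypothetical, with `recovering` standing in for the "user SYSGEN parameter" idea, and the device names are examples.

```python
def may_mount(named_members, present_members, recovering=False):
    """Return (allowed, missing) for a shadow set mount request.

    Models "refuse unless every named member is present", with an explicit
    operator-set override for recovering from a known failure.
    """
    named = set(named_members)
    available = named & set(present_members)
    missing = sorted(named - available)
    if not missing:
        return True, missing
    # Members are missing: refuse unless a recovery has been flagged,
    # and even then at least one named member must be present.
    return (recovering and bool(available)), missing


print(may_mount(["DGA101", "DGA201"], ["DGA101", "DGA201"]))
print(may_mount(["DGA101", "DGA201"], ["DGA201"]))
print(may_mount(["DGA101", "DGA201"], ["DGA201"], recovering=True))
```

The point of the default is that a missing member stops the mount and forces a person to decide, rather than letting startup silently form the set from whatever it finds.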

You can't use REQUIRE_MEMBERS for a system disk, so what you do depends on how important your data is. One option is to banish all modifiable data from your system disk - move all your cluster files to another disk. Then it probably doesn't matter if the shadow set is mounted backwards as nothing should have changed! This is usually a good idea anyway, especially if your cluster has more than one system disk.

Another possibility is to set your default boot flags to 1 - "conversational boot". That means your systems will never boot by themselves, they will always stop at SYSBOOT>. This isn't as silly as it sounds. If your OpenVMS systems are rebooting, something very unexpected has happened, and you will probably want to check it out before letting them boot. With remote console access you don't necessarily have to have someone physically present to authorize the reboot.

If DGA100 really has failed, or has old data on it, then your site RPM (Recovery Procedures Manual - you've all got one of those... haven't you?) would say to pop the failed drive from the shelf and manually change the boot paths to the surviving shadow member.

The RPM is THE most important component on a high availability site. It needs to be on PAPER and stored in a metal box with a candle and a box of matches (seriously!). It should list all anticipated failures and a precise set of steps detailing recovery. You really don't want to be making this stuff up on the fly, especially at 3am with elevated adrenaline.

(Note - Although I agree with what Jan has suggested, it only takes into account the case where SERVER1 is still up. What if there's a total power failure and DGA100 is the foundation member of the shadow set?)
A crucible of informative mistakes
VMS Support
Frequent Advisor

Re: Booting from System Disk Shadow Set

Thanks for the help.

I now have adequate information to be going with for the moment. Will test the boot theory in reply 1 this weekend.

Thanks