Jeremy Begg
Trusted Contributor

Turning a standalone node into a two-node VMScluster with MSA1000

Hi,

For the past couple of years I have been the system manager for a site running OpenVMS V8.2 on a single AlphaServer DS25. They recently became concerned about the business-critical nature of the OpenVMS application and have decided to implement a two-node VMScluster.

For reasons I won't go into here, they have two identical AlphaServer DS25s. Each machine has a SmartArray 5300A RAID controller and there is a total of 12 physical drives. Currently only one of these machines is in use (the other one has been shut down almost since the day they arrived).

The site has received the necessary hardware to form a two-node VMScluster with shared MSA1000 storage and I will be going there next weekend to set it up.

I'm intending to run the cluster with each DS25 having its own system disk, i.e. booting from its SA5300A controller. The application software and critical shared system files (SYSUAF, NETPROXY, etc) will go onto the MSA1000.

My only problem is that I'm a little unsure on a couple of the cluster configuration details. (I have configured and managed clusters before but always from scratch.)

One thing I'm not sure about is how to set the ALLOCLASS parameter. In the "Guidelines for OpenVMS Cluster Configurations" it says, "A Fibre Channel storage disk device name is formed by the operating system from the constant $1$DGA and a device identifier, nnnn. Note that Fibre Channel disk device names use an allocation class value of 1 whereas Fibre Channel tape device names use a value of 2".

In other words, FC devices ignore the system's ALLOCLASS value?

Currently the running DS25 has an ALLOCLASS of "1". Given that each system will have at least one "local" SCSI disk (the system disk) would I be better off changing the ALLOCLASS to a unique value on each DS25 (e.g. 2 and 3)?
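For reference, here's roughly what I imagine the per-node setting would look like; the value is just an example and I'd pick whatever doesn't collide with the FC tape class:

In SYS$SPECIFIC:[SYSEXE]MODPARAMS.DAT on each node:

    ALLOCLASS = 2        ! example only -- unique per node

then

    $ @SYS$UPDATE:AUTOGEN GETDATA REBOOT NOFEEDBACK

to regenerate the parameters and reboot.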

When it comes to preparing the system disk for the second DS25, I see two possibilities (assuming I'm not going to install VMS from scratch):

1. Use option 4 in CLUSTER_CONFIG.COM to clone the existing system disk to a scratch disk, then use option 5 to create a new system root (e.g. [SYS1]) on the cloned disk. Once this has been done the cloned disk will be moved to the second DS25.

2. Alternatively, could I just restore an image backup of the original DS25 to the second DS25 and then change the node name (in MODPARAMS.DAT, DECnet, etc)?
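In case it helps, option 2 would go roughly like this (device names are placeholders, and it's safest done from a system booted off the installation CD rather than against the live system disk):

    $ MOUNT/FOREIGN DKB100:                        ! scratch disk that becomes node 2's system disk
    $ BACKUP/IMAGE/IGNORE=INTERLOCK DKA0: DKB100:

followed by fixing SCSNODE, SCSSYSTEMID and ALLOCLASS in MODPARAMS.DAT on the copy, renaming the DECnet node, and running AUTOGEN before the second DS25 ever sees the shared storage.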

Thanks,
Jeremy Begg
Jon Pinkley
Honored Contributor
Solution

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Jeremy,

Yes, FC devices ignore ALLOCLASS.

What is the compelling reason for multiple system disks? If that is what you are familiar with, that may be a sufficient reason. However, unless you cannot schedule downtime for system upgrades, etc., I see very little reason to go to the extra complexity, duplicated upgrades, etc. that come with multiple system disks.

In a two-node cluster you will need a quorum disk on the MSA1000, and a place for the shared cluster files. It is much less complex to have a single system disk, a common SYS$COMMON, etc. That's my opinion.
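For a two-node cluster with a quorum disk, the voting setup usually ends up looking something like this in each node's MODPARAMS.DAT (the quorum disk device name is just an example):

    VOTES = 1                  ! each node contributes one vote
    EXPECTED_VOTES = 3         ! two nodes plus the quorum disk
    DISK_QUORUM = "$1$DGA1"    ! example quorum disk on the MSA1000
    QDSKVOTES = 1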

You can still have your page/swap files on local devices (and even the system dump files, although it can be nice to have those on a disk that can be seen by the other system; so unless you are really tight on MSA space I would recommend putting the dump files on the MSA as well).

There are good reasons for multiple system disks, but do consider the reasons before heading down that road, especially if you haven't had experience with multiple system disks already.

Jon
it depends
Jeremy Begg
Trusted Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Hi Jon,

Thanks for confirming the FC ALLOCLASS issue.

I am familiar with both common-system-disk and multiple-system-disk clusters. I agree a common system disk would be simpler to manage but going with one system disk per node lets me perform rolling O/S updates and allows a node to be booted without the MSA1000 being on-line. (OK those might not be strong reasons, but they work for me for now.)

Thanks,
Jeremy Begg
Martin Vorlaender
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Jeremy,

>>>
In other words, FC devices ignore the system's ALLOCLASS value?
<<<

Yes.

>>>
Currently the running DS25 has an ALLOCLASS of "1". Given that each system will have at least one "local" SCSI disk (the system disk) would I be better off changing the ALLOCLASS to a unique value on each DS25 (e.g. 2 and 3)?
<<<

Why not choose something that won't collide with an FC tape, e.g. 3 and 4?

>>>
When it comes to preparing the system disk for the second DS25, I see two possibilities (assuming I'm not going install VMS from scratch):
<<<

I'd go with option 1. Much cleaner, as the SCSNODE gets used in lots of places (see http://labs.hoffmanlabs.com/node/589 ).

HTH,
Martin
Jeremy Begg
Trusted Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Hi Martin,

Good point about the host ALLOCLASS; I think I'll go with 10 and 20.

Thanks for the suggestion on disk preparation.

Regards,
Jeremy Begg
Jan van den Ende
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Jeremy,

>>>
They recently became concerned about the business-critical nature
<<<

to me, that should imply (at the very least) the deployment of (host-based!) Volume Shadowing.

combine with

>>>
lets me perform rolling O/S updates
<<<

... and you have practically described a strong case FOR a single, common system disk!
- split off a member, mount it privately, change the volume label, set VAXCLUSTER=0 & STARTUP_P1 = MIN (roughly as sketched below).
Boot one system from this disk, and upgrade.
Reset VAXCLUSTER & STARTUP_P1 and reboot
(now a mixed-version, dual-sysdisk cluster)
Shutdown other node, reboot from new disk, and the rolling upgrade is done.
At any point in time, if necessary, roll-back is as simple as booting from (a member of) the original sysdisk. Keep this intact until you are satisfied with the upgrade.
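Very roughly, with made-up member names and label:

    $ DISMOUNT $1$DGA12:                                   ! split one member out of the sysdisk shadow set
    $ MOUNT/OVERRIDE=(IDENTIFICATION,SHADOW_MEMBERSHIP) $1$DGA12:
    $ SET VOLUME/LABEL=ALPSYSUPG $1$DGA12:                 ! give the copy its own label
    $ DISMOUNT $1$DGA12:

then a conversational boot of one node from the copy:

    >>> BOOT -FLAGS 0,1 DGA12
    SYSBOOT> SET VAXCLUSTER 0
    SYSBOOT> SET STARTUP_P1 "MIN"
    SYSBOOT> CONTINUE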

(At current disk prices, I strongly suggest 3-member shadow sets. Among other things, this lets your production stay shadowed while upgrading. It also makes single-point-in-time backups much easier.)

Btw, I have always felt comfortable with NOT using SYS0 as a system root in a cluster. It prevents accidental "wrong root" booting, especially by people who are not really familiar with the site (such as maintenance engineers, but I have also been surprised by "outside" software installers). Don't say it will not happen to your site; upper management tends to make decisions they will not discuss with you in advance. Better safe than sorry!
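On an Alpha the root is just the first field of the console boot flags, so pointing a node at (say) SYS1 is a one-time console setting:

    >>> SET BOOT_OSFLAGS 1,0        ! "root,flags" -- boot root SYS1 instead of SYS0
    >>> BOOT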

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
The Brit
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Jeremy,
I would seriously consider Jon's comments regarding a shared system disk; however, if this is not for you, here is a method which is pretty safe and works for me.

1. Copy the current system disk to a spare internal disk.

2. Remove the disk and install in your second DS25.

3. Disconnect the second DS25 completely from the network and storage.

4. Boot the 2nd DS25 from root SYS0, standalone.

5. Make all necessary changes: node name, parameters, network, etc.

6. When you are comfortable, reconnect to the network and storage, and run CLUSTER_CONFIG on the original node to add the new node to the cluster (sketched below).

Since you are using separate system disks, you can retain SYS0 as the boot root.
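Step 6 is menu-driven; roughly (from memory, so treat the prompts as indicative):

    $ @SYS$MANAGER:CLUSTER_CONFIG.COM
    $!  choose "ADD a node to the cluster" and answer the prompts
    $!  (node name, SCSSYSTEMID, whether it boots from its own system disk, votes/quorum)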

Dave
Hoff
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Consider a common system disk with shadowing. And note that a two-node cluster requires specific configuration steps to maintain uptime. Details:

http://labs.hoffmanlabs.com/node/349
http://labs.hoffmanlabs.com/node/153
http://labs.hoffmanlabs.com/node/569

The implementation of the node name storage and related handling within OpenVMS and layered products is bad, simply put. Having gone through a node name change recently, it's a toss-up whether a name change or the wholesale reinstallation of OpenVMS is easier. As part of the name change, I ended up reinstalling some layered product components; there was just no way to untangle it. (And if you combine a system disk device name change - which can be the case here - you can end up fixing layered product startup procedures all over the place, too.)
Jan van den Ende
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Re Hoff:

>>>
(And if you combine a system disk device name change - which can be the case here - you can end up fixing layered product startup procedures all over the place, too.)
<<<

AAUUWWW!!!

When/wherever THAT is the case, the responsible System Manager should take a solid course of Logical Name training (and probably repeat that same course one month later!!)

Apart from Oracle V3.x on VMS 3.x (which WAS installed by an SM that NEVER got to understand LNMs, up to the point where ALL user home DIRs were subdirs of SYS$SYSDEVICE:[SYS0]), I have NEVER EVER in over 25 years encountered THAT issue!
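Put differently: application startups should only ever see logical names, defined once in SYS$MANAGER:SYLOGICALS.COM, something along these lines (names made up for illustration):

    $ DEFINE/SYSTEM/EXECUTIVE_MODE APP_DISK $1$DGA101:
    $ DEFINE/SYSTEM/EXECUTIVE_MODE/TRANSLATION_ATTRIBUTES=CONCEALED -
            APP_ROOT $1$DGA101:[APPLICATION.]

Change the device, change these two lines, and nothing else ever notices.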

fwiw

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Robert Gezelter
Honored Contributor

Re: Turning a standalone node into a two-node VMScluster with MSA1000

Jeremy,

I would concur with many of the comments about shared system disks.

Some other thoughts on minimizing downtime:

- Consider whether creating a cluster of one is a good first step

- Consider a similar comment with regard to host-based volume shadowing to migrate data to the MSA. In essence, switch access to the current disks over to the DSA (shadow set) devices; then, when the MSA is available, add MSA volumes to the shadow set (see the sketch after this list).

- As a guard against accidents with a shared system disk, it may be sound to create a cloned copy of that system disk (e.g., port and starboard).
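A rough sketch of that migration step, with made-up device names and label (and assuming SHADOWING=2 is set):

    $ MOUNT/SYSTEM DSA101:/SHADOW=($1$DKA100:) DATADISK   ! existing disk becomes a one-member shadow set
    $ MOUNT/SYSTEM DSA101:/SHADOW=($1$DGA101:) DATADISK   ! later: add the MSA volume and let the copy run
    $ DISMOUNT $1$DKA100:                                  ! optionally drop the original member when done

All application access goes via DSA101: from the first MOUNT onward, so the later membership changes are transparent.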

I have incorporated the SYS0 exclusion mentioned earlier by Jan. I do not like using the same root number for different nodes. There are more than enough root numbers to not need this, and it prevents accidents.

- Bob Gezelter, http://www.rlgsc.com