Operating System - OpenVMS

Booting VMS from a different root

Linda Chumbley
Occasional Visitor

Booting VMS from a different root

I want to implement a "poor man's" highly available configuration for an application that is not clusterable.

I have a 2-node VMS cluster. The application runs on node A. If node A has a serious h/w failure, I want to sacrifice the application running on node B, shut down node B and boot node B into node A's root.

Other than locally installed page & swap files, what other things should I look out for? Nothing else is local.

Thanks for dusting off the VMS brain cells!
12 REPLIES
Ian Miller.
Honored Contributor

Re: Booting VMS from a different root

It's possible - a couple of thoughts, however.

Are these similar systems (same CPU, memory, etc.)? I'm just wondering about VMS system parameter differences between the two systems, or different licence classes.

Is there any hardware connected to only one of the systems, e.g. a local disk or tape?
____________________
Purely Personal Opinion
Linda Chumbley
Occasional Visitor

Re: Booting VMS from a different root

Ooops...my bad! I should have said the 2 nodes are almost identical AS2100s. Both servers have locally attached disks, which is where page & swap reside.

Linda
Richard W Hunt
Valued Contributor

Re: Booting VMS from a different root

Question is, what licenses are bound to the CPU ID? That isn't rooted - it comes from a hardware register. Not saying you DO have any - but you CAN have an ID-bound license. So you need to review your licenses before doing this.

The other items that come to mind that identify the CPU are rooted in the parameter file (MODPARAMS.DAT), like the old DECnet node ID, the SCS node name (SCSNODE), etc.

Sr. Systems Janitor
Volker Halle
Honored Contributor

Re: Booting VMS from a different root

Linda,

if the systems are nearly identical (hardware-wise, especially look for LAN adapter names), this should work.

The local page/swap disks will have different disk labels, so you might need to adapt your mount commands.

It might even be possible to fully automate this 'failover' (e.g. MOUNT DKAx: node1_PAGSWP and, if that fails, MOUNT DKAx: node2_PAGSWP; that way you know on which AS2100 you are booting). Or even better: obtain the MAC address from the LAN interface and use it as a key to determine whether the system is booting on node A or node B.
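
A minimal sketch of the label-probing idea, for illustration only. It assumes the local disk is DKA0: on both machines and that the two volumes are labelled NODE1_PAGSWP and NODE2_PAGSWP; the device name, the labels, the PAGE_SWAP logical name and the BOOT_HW symbol are all placeholders, not anything confirmed in this thread.

```dcl
$! Sketch only: DKA0:, both labels and BOOT_HW are placeholders.
$ SET NOON                              ! keep going if a MOUNT fails
$ BOOT_HW = "UNKNOWN"                   ! hypothetical symbol: which box is this?
$ MOUNT/SYSTEM/NOASSIST DKA0: NODE1_PAGSWP PAGE_SWAP
$ IF $STATUS
$ THEN
$     BOOT_HW = "A"                     ! first label matched: node A hardware
$ ELSE
$     MOUNT/SYSTEM/NOASSIST DKA0: NODE2_PAGSWP PAGE_SWAP
$     IF $STATUS THEN BOOT_HW = "B"     ! second label matched: node B hardware
$ ENDIF
$ WRITE SYS$OUTPUT "Booting on hardware of node ''BOOT_HW'"
```

/NOASSIST matters here: without it a failed mount could wait for operator intervention instead of returning an error status that the procedure can test.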

Volker.
Doug Phillips
Trusted Contributor

Re: Booting VMS from a different root

Linda,

I'm curious. What is it about the application that prevents you from just running it on the other node?

Have you considered writing a .com file to load licenses and set logicals and do whatever else needs to be done to start the application without having to reboot node B?

If you *must* boot to run it, you might find that you only need to change some logicals in node b startup to make the application runnable.

I'd look for a way to switch the application node without a reboot so when the problem node is repaired you could boot it back up without having to shut down the other.

With VMS, this should be possible.

Doug P
Martin P.J. Zinser
Honored Contributor

Re: Booting VMS from a different root

Hello Linda,

what exactly does the "not clusterable" encompass here? While many applications are not "cluster aware", i.e. are not distributed in a cluster with automatic load distribution and failover etc., very few are actually not clusterable in the sense that they cannot run on an arbitrary node in a cluster (provided the required SW is installed, I/O systems are available, etc.). Since this seems to be the case in your situation, can you identify the bits and pieces that actually do prevent you from just bringing up the app on node B without taking over A's identity? Maybe the group has some creative ideas on how to overcome the situation without the need for various reboots.

Greetings, Martin
Bojan Nemec
Honored Contributor

Re: Booting VMS from a different root

Hi,

I agree with Doug and Martin: if it is possible, make node B able to run node A's applications.

If that is not possible, I think booting node B as node A is possible without many problems. About the page and swap files I have an idea, but right now I have no hardware to test it on, so can anyone verify my idea?

The idea is:
If you have two identical systems, you have two local disks with the same name (say DKA0:) on which the page and swap files reside. In the common SYPAGSWPFILES.COM you can install them with this procedure:

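$! Mount the local disk; "label" is its volume label, PAGE_SWAP the logical name
$ MOUNT/SYSTEM/NOASSIST DKA0: label PAGE_SWAP
$! Define SYSGEN as a foreign command so it accepts one-line commands
$ SYSGEN := $SYSGEN
$ SYSGEN INSTALL PAGE_SWAP:[SYSTEM]PAGEFILE.SYS/PAGEFILE
$ SYSGEN INSTALL PAGE_SWAP:[SYSTEM]SWAPFILE.SYS/SWAPFILE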


These commands will mount the local disk on each cluster node. Note that the mount is /SYSTEM and the disk name has no allocation class in it!
Of course the disks must have the same label, and nothing else can be put on them, because they can't be accessed from the other node. This will also work when you boot node B as node A: the startup will be node A's startup, but the swap and page files will reside on node B's local disk.

On this local disk you can put some data from which you can identify which physical node is running.

Bojan
Volker Halle
Honored Contributor

Re: Booting VMS from a different root

Bojan,

you can't use the same mount commands for both nodes in the common SYPAGSWPFILES.COM, as you can't mount 2 disks (say local DKA0: on both node A and node B) with the same label system-wide in a cluster.
You could work around this problem by first mounting the disk temporarily with /OVERRIDE=ID, obtaining the label with F$GETDVI("DKAx","VOLNAM") and then issuing the correct MOUNT/SYSTEM/NOASSIST command.
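
A rough sketch of that workaround, under the assumption that the local disk is DKA0: on both machines (a placeholder, as is the PAGE_SWAP logical name borrowed from Bojan's example). It has not been tested on real hardware.

```dcl
$! Sketch only: DKA0: and PAGE_SWAP are placeholders.
$ SET NOON                                        ! do not abort on a failed command
$ MOUNT/OVERRIDE=IDENTIFICATION/NOASSIST DKA0:    ! temporary private mount, no label needed
$ LABEL = F$EDIT(F$GETDVI("DKA0:","VOLNAM"),"TRIM")   ! read the real volume label
$ DISMOUNT/NOUNLOAD DKA0:                         ! release the private mount
$ MOUNT/SYSTEM/NOASSIST DKA0: 'LABEL' PAGE_SWAP   ! remount system-wide with that label
$ IF .NOT. $STATUS THEN WRITE SYS$OUTPUT "Mount of local page/swap disk failed"
```

Because each machine's disk keeps its own distinct label, the system-wide mounts no longer collide, while the startup procedure itself stays identical in both roots.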

And yes, I agree: finding out whether something really prevents the application from just being started on node B, and trying to solve that, would be much better.

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: Booting VMS from a different root

Linda,

Since your system disk will not be shared, make sure that all possible system files are placed on a common disk.

E.g. put @(file on the common disk) in the SYS$STARTUP files. Otherwise you will soon have 2 systems that are no longer identical.

Also : how are you going to boot the 2nd one without coming the 1st one ?

Special attention to :
1) the queue database (or you will lose all scheduled jobs)
2) system log, audit & accounting (or you will not be able to consult them)
3) sysuaf & co (big mess if you don't)
4) ncl files & decnet db (rarely changes but ...)
5) ucx/tcp databases (hosts, routing, etc)

(all this is documented somewhere in the manuals)

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Booting VMS from a different root

I'm a bad reader. You said nothing else is local. Ignore my remarks.

Wim
Linda Chumbley
Occasional Visitor

Re: Booting VMS from a different root

WOW! Thanks for all the replies! I didn't think there was that much VMS knowledge still out there...or that anyone would admit to! ;-)

The non-clusterable application is a radiology app. that runs on a DSM (Mumps) database. Yes, I know that mumps is clusterable, but for whatever reason the vendor, IDX, did not write it to make use of DSM's clustering capability, so I'm kinda stuck.

Licensing won't be a problem...besides, when I boot into the other node's root, that node would "become" that node, including inheriting the nodename, IP & DECnet address, etc.

I addressed the page/swap file issue by installing additional page/swap files on a non-local disk and included some code in SYPAGSWPFILES.COM so the system won't hang trying to mount a local disk that doesn't exist.
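
For readers wondering what such a guard might look like, here is a minimal sketch. It is not the actual code from this site: DKA0:, the LOCALPAGE label, the PAGE_SWAP logical name and the SHARED_DISK:[PAGEFILES] path are all hypothetical.

```dcl
$! Sketch only: device, label and file names are hypothetical.
$ SET NOON                                  ! never let a failure stop the boot
$ SYSGEN := $SYSGEN                         ! foreign command for one-line SYSGEN use
$ IF F$GETDVI("DKA0:","EXISTS")             ! skip the local disk if it is not there
$ THEN
$     MOUNT/SYSTEM/NOASSIST DKA0: LOCALPAGE PAGE_SWAP
$     IF $STATUS
$     THEN
$         SYSGEN INSTALL PAGE_SWAP:[SYSTEM]PAGEFILE.SYS/PAGEFILE
$         SYSGEN INSTALL PAGE_SWAP:[SYSTEM]SWAPFILE.SYS/SWAPFILE
$     ENDIF
$ ENDIF
$! Files on the shared (non-local) disk are installed in every case.
$ SYSGEN INSTALL SHARED_DISK:[PAGEFILES]PAGEFILE.SYS/PAGEFILE
$ SYSGEN INSTALL SHARED_DISK:[PAGEFILES]SWAPFILE.SYS/SWAPFILE
```

The EXISTS check plus /NOASSIST is what keeps the startup from waiting on a disk that is missing or carries an unexpected label.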

SYSUAF, queue files, and all other files I can think of are on shared disks. This used to be a 4-node cluster...it's slowly being retired.

This will all be tested, hopefully soon...before I really need it.

Linda
Wim Van den Wyngaert
Honored Contributor

Re: Booting VMS from a different root

Linda,

We also have Mumps. You can indeed run it on both nodes simultaneously, but the application must be able to handle it.

You can also run it on 1 node of the cluster and switch it to the other in case of a disaster. But even then, your application might have problems if it is not prepared for that.

So your solution is the best one (without modifying the application).

Btw: did you know that you need after-image journaling (and before-image journaling) in Mumps if you don't want to lose data during a crash? We discovered that the hard way.

Wim