Operating System - OpenVMS

Dynamic Choice of VMS System to Boot As...

 
Rick Dyson
Valued Contributor

Dynamic Choice of VMS System to Boot As...

Strange subject line, but I could not come up with anything better.

I am interested in whether it is possible to build a common cluster system disk (on a SAN) shared between a pair of AlphaServers, with NODEA and NODEB set up as the two node names.

If NODEA has a hardware failure, I want to be able to reboot NODEB and have it come up as if it were NODEA.

I am building a new pair of ES40s to replace the previous nodes and have the chance to use a proper common system disk, so I only have to manage one setup, etc. But I don't know how to tell it, dynamically at the boot prompt, that NODEB should be treated as "SYS0" instead of "SYS1"...

NOTE: The original NODEA would not be allowed to return unless it is manually made to become NODEB.

Believe me, I know this is not the way OpenVMS disaster-tolerant clustering was designed to work! This models a failover method in use today, where each node has its own discrete internal boot disk(s); in the event of a hardware failure, the system disks of NODEA are physically swapped with the disks of NODEB and the box is rebooted. The application only runs on NODEA.

It's sad, but NODEB is really just a hot spare for NODEA. I use it for maint, backups, admin, (SETI), etc. all the time while the application runs on NODEA. It has a lot of spare CPU cycles.

Rick
Uwe Zessin
Honored Contributor
Solution

Re: Dynamic Choice of VMS System to Boot As...

Normally, on NODEB, you do:

>>> set boot_osflags 1,0
>>> boot

For a temporary boot as NODEA:

>>> boot -flags 0,0

unless I have overlooked something, of course...
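
You can confirm the stored value at the console before booting (a quick sketch; the output format varies a bit by SRM firmware version):

>>> show boot_osflags
boot_osflags            1,0

The first field is the system root (SYS1 here), the second is the boot flags. A "boot -flags 0,0" overrides the stored value for that one boot only; the next plain ">>> boot" reverts to the stored root.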
.
Rick Dyson
Valued Contributor

Re: Dynamic Choice of VMS System to Boot As...

Duh!

Ooooh! This is looking up. I will be able to test this soon when the boxes get here. I am looking forward to it!

Now I just have to learn how to boot from a SAN disk. :) I have some notes on WWIDMgr that I see talk about it; I need to read them now. :) I have had a SAN here for years, but the boxes always booted from their internal SCSI bus. I wanted to go with SAN boot if I could, but I had drawn a blank on how to keep the ability to fake the same system booting no matter what the hardware. I was afraid I was going to have to use up four slots on my SAN to have two shadow pairs for separate system disks...

Thanks Uwe!

I wish I could buy you a cold one!!! But, do have a couple for me! Some day I will get to attend a meeting with some of you guys!

Rick
Andy Bustamante
Honored Contributor

Re: Dynamic Choice of VMS System to Boot As...

Which root to boot from is controlled by the console variable boot_osflags. For example, to boot from root 1:

>>> SET BOOT_OSFLAGS 1,0

and to boot from root 0:

>>> SET BOOT_OSFLAGS 0,0

If you do this, be very careful to define procedures that avoid starting both systems from the same root at the same time. Switching nodes is a very simple way to drop in backup hardware, especially with the same model of Alpha.

From within OpenVMS, you can use F$GETENV to see which root the system booted from:
$ this_root = f$getenv("boot_osflags")
$ show sym this_root
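
Building on that, a small sketch (hypothetical code, not from this thread; the message text is illustrative) that could go in SYS$MANAGER:SYSTARTUP_VMS.COM to record which root each boot used, so your logs show when a box has swapped roles:

$! Hypothetical boot-root logging for SYSTARTUP_VMS.COM.
$! F$GETENV reads the console variable; F$ELEMENT picks the root field.
$ this_root = f$element(0, ",", f$getenv("boot_osflags"))
$ this_node = f$edit(f$getsyi("SCSNODE"), "TRIM")
$ write sys$output "''this_node' booted from root SYS''this_root' on ''f$time()'"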

How does the application communicate with your users? You can use TCPIP aliasing to have a service address that migrates from node to node.

DECnet and LAT also have options for presenting a cluster address. This allows both nodes to be running and speeds the application transition from node to node.
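
For DECnet Phase IV, the cluster alias is defined through NCP, roughly like this (APPSVC is a hypothetical alias name; check the values against your own network plan):

$ MCR NCP
NCP> DEFINE EXECUTOR ALIAS NODE APPSVC
NCP> DEFINE EXECUTOR ALIAS INCOMING ENABLED

Users then connect to APPSVC and reach whichever node is currently serving it; the TCP/IP side is normally set up through the TCPIP$CONFIG.COM configuration procedure.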

Having both nodes configured to support your application makes remote management easier and reduces application transition time.


Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Uwe Zessin
Honored Contributor

Re: Dynamic Choice of VMS System to Boot As...

Well, dealing with WWIDMGR is not that complicated.

You can start all over from scratch:
>>> wwidmgr -clear all

Look for the devices:
>>> wwidmgr -show wwid
>>> wwidmgr -show wwid -full
...

Configure the device paths:
>>> wwidmgr -quickset -udid 599

where 599 is the unit identifier. The trick is to INITIALIZE the system at this stage, so that you can later boot from it:
>>> initialize
...

You should now see the configured paths:
>>> show device
...
dga599.1001.0.2.1 $1$DGA599 HP HSV100 3014
dgb599.1002.0.2.3 $1$DGA599 HP HSV100 3014
...

>>> set bootdef_dev dga599.1001.0.2.1,dgb599.1002.0.2.3
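
You can then confirm what was stored (a sketch; exact output varies by firmware):

>>> show bootdef_dev
bootdef_dev             dga599.1001.0.2.1,dgb599.1002.0.2.3

With both paths listed, a plain ">>> boot" tries the first path and falls back to the second if that fabric or controller port is unreachable.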
.
Rick Dyson
Valued Contributor

Re: Dynamic Choice of VMS System to Boot As...

Andy: Thanks, too.

This old app is a VT-based one. It does not really support cluster rollover; it has things tied to node names in the queue manager and the startup scripts. It is old and end-of-life from the vendor, etc.

The trick described here looks like it will work just great for me. Sort of remote-control disk swapping from my kitchen table at the console. :)

Uwe: Thanks for the quick wwidmgr stuff! I just wish I HAD an HSV like the one you use in your example! Those look sweet. I've still got two pairs of HSG80s and SAN switch pairs (still good kit, though!).
Uwe Zessin
Honored Contributor

Re: Dynamic Choice of VMS System to Boot As...

It doesn't matter, Rick, because the commands are the same. You will just see a different SCSI inquiry string (well, still no HSG, but you get the idea):

>>> show device dg
dga101.1001.0.9.2 $1$DGA101 COMPAQ MSA1000 VOLUME 4.32
dga102.1001.0.9.2 $1$DGA102 COMPAQ MSA1000 VOLUME 4.32
dgb101.1002.0.10.2 $1$DGA101 COMPAQ MSA1000 VOLUME 4.32
dgb102.1002.0.10.2 $1$DGA102 COMPAQ MSA1000 VOLUME 4.32
>>> set bootdef_dev dga101.1001.0.9.2,dgb101.1002.0.10.2
.
Rick Dyson
Valued Contributor

Re: Dynamic Choice of VMS System to Boot As...

Understood!

I just meant it would be nice to have one of the newer HSV arrays instead of the older, slower HSG boxes. :) Not that there is anything wrong with them! They beat the HSZ20 and RA230+ I have elsewhere and at home...
Jan van den Ende
Honored Contributor

Re: Dynamic Choice of VMS System to Boot As...

Rick,

It really _IS_ that simple, especially if your node hardware is identical.
The node identity is NOT the hardware, but the software it is booted from.
(And yes, the hardware and the software must be totally interchangeable to do it without adjustments.)

But if the systems:
- are the same hardware type
- have the same number & type of CPUs
- have the same amount of memory
- have the same type of network cards
- have NO direct-attached terminals or other peripherals
then it is just a matter of swapping the boot roots, and the systems ARE swapped!
Including (DECnet & IP) network addresses, and EVERYthing (except the serial numbers in SHOW CPU).

Been there, done that.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Rick Dyson
Valued Contributor

Re: Dynamic Choice of VMS System to Boot As...

Sweet!

I do this now with the independent disks and my feet (true sneakernet!). But being able to connect to the console from home (when does unexpected downtime ever happen 9-5, M-F?!?), change the boot command, and get it going will give up to an hour faster response compared to driving in and so forth.

Thanks All!
Robert Gezelter
Honored Contributor

Re: Dynamic Choice of VMS System to Boot As...

Rick,

I have done this on several occasions at different sites. It is one of the underappreciated possibilities of OpenVMS clustering, and a very powerful tool.

Cluster nodes are more than boxes, they are virtualizations of roles in an overall configuration.

In many installations, this is presumed to be exactly equal to "which box is which node". Say (boot root on the common system disk in parentheses):

Production Primary Node    ALPHA   ES40 (0)
Production Secondary Node  BETA    ES45 (1)
Development Node           CHARLE  DS25 (2)

And that works for many sites. However, it is more powerful to assign cluster names by function, not by hardware box. Using the same notation as above:

Production Primary Node    ALPHA   ES40 (0)
                                   ES45 (1)
                                   DS25 (2)
Production Secondary Node  BETA    ES40 (10)
                                   ES45 (11)
                                   DS25 (12)
Development Node           CHARLE  ES40 (20)
                                   ES45 (21)
                                   DS25 (22)
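
Each role-on-box combination above is just another boot root on the common system disk. Extra roots beyond the ones created at installation are normally added with the cluster configuration procedure (a sketch; the root number comes from the example above):

$ @SYS$MANAGER:CLUSTER_CONFIG_LAN.COM   ! ADD option creates, e.g., root [SYS10]

(Use CLUSTER_CONFIG.COM instead on CI/DSSI interconnects.)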

I alluded to such a configuration in my article in Volume 3 of the OpenVMS Technical Journal (a reprint of which can be found at my www site at http://www.rlgsc.com/publications/vmstechjournal/inheritance.html or on the HP OpenVMS www site).

The strongest advantage of this approach is that it allows you to treat your hardware as a pool of systems which run, in effect, virtual nodes. If a situation requires you to alter which physical box fills each role in your configuration, you are one pre-configured node reboot away from being finished.

Each of the boot roots can have its own set of parameters and settings, so manual configuration changes in an emergency are eliminated.
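
For instance (paths assume the root numbers from the example above), each root keeps its own parameter file and node-specific startup data in its SYS$SPECIFIC tree on the common disk:

$ directory SYS$SYSDEVICE:[SYS0.SYSEXE]ALPHAVMSSYS.PAR
$ directory SYS$SYSDEVICE:[SYS10.SYSEXE]ALPHAVMSSYS.PAR

Booting a box from root 10 therefore brings it up with BETA's SCSNODE, network addresses, and tuning, with no hand edits in the middle of an emergency.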

It is, indeed, a spectacularly powerful capability.

- Bob Gezelter, http://www.rlgsc.com
