Operating System - OpenVMS

CLUSTER_CONFIG.COM and removing nodes from cluster

 
SOLVED


We run a 12-node Alpha VMS cluster. Each machine has its own system disk. We wish to remove two of the machines from the cluster permanently. I followed the method given in the OpenVMS Cluster Systems manual, section 8.3: I did an orderly shutdown of the first machine and powered it off, then ran CLUSTER_CONFIG.COM on one of the active nodes and selected the REMOVE option. The prompt asked for the SCS node name, no problem. Then it asked "What is the device name for 's system root?" The default value is the active system's system disk. I entered $202$DKA0:, which is the system disk of the machine that was shut down. The procedure complained that "$202$DKA0: is not mounted." What should I have entered?

This is the first time I've ever had to remove a machine from a cluster, and the documentation is a little lacking in examples.

Gareth
Jan van den Ende
Honored Contributor
Solution

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Gareth,

Simplest option, IF THAT NODE IS STILL AVAILABLE!!!: boot it back into the cluster and run CLUSTER_CONFIG again.

If that is no longer possible, say so, and we will take it from there.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Andy Bustamante
Honored Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

The REMOVE option in CLUSTER_CONFIG.COM deletes the node-specific root and purges the network node information.

The important bit here is to ensure you maintain quorum. Revise EXPECTED_VOTES using SYSGEN if appropriate (were these voting nodes?). Use $ SET CLUSTER/EXPECTED_VOTES or Availability Manager to revise the current value in your running cluster. A controlled shutdown with the REMOVE_NODE option will also update the running expected votes.
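To make that concrete, a DCL sketch (the value 10 is purely hypothetical; substitute whatever your recalculated total is):

$ MCR SYSGEN
SYSGEN> SHOW EXPECTED_VOTES          ! check the current value
SYSGEN> EXIT
$ SET CLUSTER/EXPECTED_VOTES=10      ! adjust the running cluster

SET CLUSTER/EXPECTED_VOTES only changes the running value; the permanent value in MODPARAMS.DAT still needs updating separately.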

Since this node has its own disk, and assuming you don't have shared SCSI, removing the root isn't an issue.

Removing the node information from your DECnet database and hosts table may be nice for cleanup. On the other hand, you could argue that leaving those entries (and the root) makes it easier to bring a node back in for testing or upgrades.

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Jan van den Ende
Honored Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Gareth,

if you CANNOT bring back that node, just heed Andy's advice.
And remember that you can NOT reuse that node's NAME, NOR its SCSSYSTEMID (and hence nor its DECnet address), until EVERY cluster node that knew it has been rebooted.

And do not forget to adjust MODPARAMS.DAT on _EVERY_ remaining node to the newly calculated EXPECTED_VOTES.
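For example (the value 10 is illustrative only), the line in SYS$SYSTEM:MODPARAMS.DAT on each remaining node would read:

EXPECTED_VOTES = 10

followed by an AUTOGEN pass to write the new value into the parameter file:

$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS NOFEEDBACK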

Success!

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
John Gillings
Honored Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Gareth,

As you say, each node has its own system disk, presumably local. Once you've shut the node down, the local system disk is no longer available to other nodes (even assuming it was shared cluster wide in the first place).

The reason CLUSTER_CONFIG wants to know the device and root is so it can delete them. The actions of REMOVE are:

"
o It deletes the node's root directory tree.

o It removes the node's network information from the network database.

o If the node has entries in SYS$DEVICES.DAT, any port allocation class for shared SCSI bus access on the node must be re-assigned.
"

I'd guess that in your case none of these are necessary. The roots are gone, so effectively deleted, dead network data base entries don't matter much, and with 12 nodes I seriously doubt you're using shared SCSI buses.

The important part of removing a node permanently is:

"
If the node being removed is a voting member, EXPECTED_VOTES in each remaining cluster member's MODPARAMS.DAT must be adjusted. The cluster must then be rebooted.

For instructions, see the "OpenVMS Cluster Systems" manual.
"

which you have to do manually anyway!

Unfortunately, OpenVMS has never properly supported multiple system disk clusters (which is very strange because that's really the strength of the system vs competing platforms). Every system manager has to figure it out for themselves, (and inevitably get bits wrong). Utilities like CLUSTER_CONFIG make many assumptions, and don't cover cases like yours.

There's no technical reason OpenVMS engineering couldn't create a decent suite of utilities to manage multiple system disk clusters, but I guess it's just another case of "accountants win", despite the efforts of some of us to get the functionality hole filled.

All you really need do is make sure voting is correctly reconfigured for the final cluster state. It's probably also worth checking your site specific command procedures for references to the missing nodes.

(If you find any, try to make your procedures independent of node names; see lexical functions like F$GETSYI and F$CSID.)
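As a sketch, a loop like this lists the current cluster members without hardcoding any node names, using F$CSID to walk the membership and F$GETSYI to translate each cluster system ID:

$ ctx = ""
$ loop:
$     id = F$CSID(ctx)
$     IF id .EQS. "" THEN GOTO done
$     WRITE SYS$OUTPUT F$GETSYI("NODENAME",,id)
$     GOTO loop
$ done: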
A crucible of informative mistakes

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Thanks for the responses. I forgot to mention in my original message that neither of the two machines being removed is a voting member of the cluster.

I think I can bring the system back up. I'm doing this remotely, but there should be someone in the office.

So the general consensus appears to be:
1) bring the machine back up
2) rerun CLUSTER_CONFIG
3) enter $202$DKA0: as the system disk name
4) after CLUSTER_CONFIG completes, use
SYSMAN to shut the machine down.
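Step 4 would look something like this from another cluster member (NODE1 here is just a placeholder for the machine being removed):

$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/NODE=NODE1
SYSMAN> SHUTDOWN NODE/MINUTES_TO_SHUTDOWN=0/REASON="Leaving cluster"
SYSMAN> EXIT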

Gareth
John Gillings
Honored Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Gareth,

>1) bring the machine back up

I disagree. Why bother? What's it going to achieve?

If the nodes don't vote, there's really nothing to configure, other than to remove the nodes from the network data bases. You can do that yourself, or just leave the dead entries.

Yes CLUSTER_CONFIG may be able to see the disk and root, but it can't delete it while the system is up, so that part will fail.

Eventually you may want to do a cluster reboot (rolling?) to eliminate the node from the memories of other nodes, but again, why not let that happen by natural attrition? It's mostly cosmetic.
A crucible of informative mistakes
Andy Bustamante
Honored Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

As John says above, your work is done here. Removing the nodes from the networking database is purely cosmetic and can be done quickly, if desired.

Since these nodes have no votes, there is no potential quorum issue.

Leave early and have one at home.

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
RBrown_1
Trusted Contributor

Re: CLUSTER_CONFIG.COM and removing nodes from cluster


I know nothing about this, but I wonder about the wisdom of bringing the machine back into the cluster so that you can @CLUSTER_CONFIG to remove it from the cluster again.

I think John said that CLUSTER_CONFIG will remove the node's system root. That would be [SYS0] on the soon-to-be standalone node's system disk (unless CLUSTER_CONFIG is smart enough to not delete the last root on a disk). Will having no SYS0 on that disk make it harder to configure the box as standalone?

For now, I'll go with John with this and say "don't bother".

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

OK, the consensus seems to be to just leave it as is. I will accept the advice of those more knowledgeable than me.

Thanks to all,
Gareth

Re: CLUSTER_CONFIG.COM and removing nodes from cluster

Question answered.