Operating System - OpenVMS

Cluster password sync error

 
SOLVED
Mallkeet
Occasional Contributor

Cluster password sync error

Alpha GS1280 clusters running OpenVMS 8.3. I found the following entries in errlog.sys (which is huge) on one of the nodes:


Host name HTV015

System Model hp AlphaServer GS1280 7/1150

Entry Type 98. Asynchronous Device Attention


---- Device Profile ----
Unit HTV015$PEA0
Product Name NI-SCA Port

---- NISCA Port Data ----
Error Type and SubType x0600 Channel Error, Invalid Cluster Password Received
Status x0000000000000000
Datalink Device Name EWA2:
Remote Node Name HTV014
Remote Address x00003416000400AA
Local Address x00003415000400AA

HTV015 is in a cluster with HTV003, and HTV005 is in a cluster with HTV014: two independent clusters. The cluster interconnect NICs are supposed to be in separate VPNs on a private switch. The log entry points to the HTV005 and HTV014 nodes, which are a separate cluster. Could the cluster interconnects of all 4 nodes be in the same VPN, or is this a cluster configuration issue? Will this cause cluster or node instability?

Thanks in advance

Robert Gezelter
Honored Contributor
Solution

Re: Cluster password sync error

Mallkeet,

Welcome to the HP ITRC OpenVMS Forum.

Putting different OpenVMS clusters on different LANs is not uncommon. However, one should not generally rely upon LAN connectivity as a way to separate different clusters.

OpenVMS clusters are distinguished by different cluster group numbers each with a shared password (this prevents an accident when two groups stumble upon the same cluster group number, as may very well have happened in this case).

The cluster group number and password are stored in SYS$COMMON:[SYSEXE]CLUSTER_AUTHORIZE.DAT, the contents of which are manipulated with SYSMAN using the CONFIGURATION SET CLUSTER_AUTHORIZATION command (and displayed with the corresponding SHOW command).

The first step I recommend is doing a SYSMAN CONFIGURATION SHOW CLUSTER_AUTHORIZATION on all of the nodes involved, and reporting the results back to this thread. If the two clusters are using the same group number, this should be corrected (probably on the less production-critical cluster). It is early here in New York, so in an abundance of caution, I will recommend that all but one of the nodes (presuming a shared system disk) in the cluster whose group number is being altered be shut down.

In SYSMAN, doing a SET ENVIRONMENT/CLUSTER first will cause SYSMAN to perform the operation on all currently running members in turn. Please heed the note in the appropriate SYSMAN HELP text that a full cluster reboot is required following this change. If any nodes (or system disks) are not active, SET ENVIRONMENT/CLUSTER will not be effective; it will be necessary to propagate the new file contents (or redo the SYSMAN CONFIGURATION SET CLUSTER_AUTHORIZATION commands individually).

- Bob Gezelter, http://www.rlgsc.com

Mallkeet
Occasional Contributor

Re: Cluster password sync error

Robert

As you mentioned, the nodes in the two clusters had the same group number. I changed the group number on one of the clusters and rebooted it to make the change effective. That stopped the messages flooding errlog.sys.

A couple of quick queries:

1. Can having the same group number across different clusters potentially cause a cluster reboot?

2. Is setting the priority on cluster channels on the NICs between nodes (in a VPN on a private switch) a recommended way to divert cluster traffic to a particular set of NICs?

Thanks
Robert Gezelter
Honored Contributor

Re: Cluster password sync error

Mallkeet,

A node may not join an OpenVMS cluster unless its group/password match that of the other node(s) already in the cluster.

I am not sure that I understand your second question.

- Bob Gezelter, http://www.rlgsc.com
John Gillings
Honored Contributor

Re: Cluster password sync error

Mallkeet,

>1. Having the same group number across
>different clusters, can it potentially be a
>cause for a cluster reboot

Having the same group number defined on members of different clusters is a VERY BAD IDEA! That's regardless of the network topology.

The whole point of cluster group numbers is to distinguish different clusters. Therefore, if you have more than one cluster you should have more than one group number. There are plenty of numbers to choose from, and there is no sensible argument to have duplicates.

I recommend setting your group number according to the network address of the founder node. That way you can't have duplicates.
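As an illustration only: if "network address" is read as the DECnet Phase IV address (an assumption, the post does not say which address is meant), it can be recovered from the Local Address and Remote Address fields in the error log entry at the top of this thread. The sketch below assumes the log prints the 6-byte LAN address as a single hex value with the bytes reversed, and that the address follows the DECnet Phase IV convention (prefix AA-00-04-00, then area*1024 + node stored low byte first). The function name is hypothetical, and how the decoded address should map onto a group number is left open here.

```python
def decode_decnet_address(logged_hex):
    """Return (area, node) for an address as printed in the error log.

    Assumption: the log shows the LAN address byte-reversed, so
    x00003416000400AA corresponds to AA-00-04-00-16-34 on the wire.
    """
    value = int(logged_hex.lower().lstrip("x"), 16)
    # Little-endian conversion restores the on-the-wire byte order.
    mac = value.to_bytes(8, "little")[:6]
    if mac[:4] != bytes([0xAA, 0x00, 0x04, 0x00]):
        raise ValueError("not a DECnet Phase IV style LAN address")
    # The last two bytes hold the 16-bit DECnet address, low byte first.
    address = mac[4] | (mac[5] << 8)
    area = address >> 10      # top 6 bits
    node = address & 0x3FF    # low 10 bits
    return area, node


# Addresses copied from the error log entry quoted in the first post.
for name, logged in [("HTV014", "x00003416000400AA"),
                     ("HTV015", "x00003415000400AA")]:
    area, node = decode_decnet_address(logged)
    print(f"{name} -> {area}.{node}")
```

Under those assumptions the two nodes decode to 13.22 (HTV014) and 13.21 (HTV015), which is consistent with both clusters sitting in the same DECnet area.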

You can use SCACP to direct cluster traffic to prefer some paths over others. However, it is NOT a substitute, fix or workaround for group number misconfiguration.
A crucible of informative mistakes
Mallkeet
Occasional Contributor

Re: Cluster password sync error

John

Thanks for the feedback. The group number issue has been resolved.