Not able to join the cluster

Bhushi · ‎09-08-2010

2 node cluster

This is the status

nwdse_bm:bb056x cmviewcl
cmviewcl : Cannot talk to all the nodes.
Cluster does not appear to be up

CLUSTER      STATUS
nwdse        unknown

NODE         STATUS       STATE
nwdse_am     unknown      unknown
nwdse_bm     down         unknown

UNOWNED_PACKAGES

PACKAGE      STATUS       STATE        AUTO_RUN     NODE
dse          unknown      unknown

nwdse_bm:root cmruncl -v
Local node is not currently configured in a cluster
cmruncl : Unable to determine the nodes on the current cluster
cmruncl : Either no cluster configuration file exists, or the file is corrupted

nwdse_bm:root cmrunnode -v nwdse_bm
cmrunnode : Unable to communicate with a running cluster or with all nodes in the cluster.
cmrunnode : In order to use cmrunnode, the cluster must already be running on a subset of reachable nodes or else all cluster nodes must be reachable.
cmrunnode : Issuing cmrunnode again may succeed.

I am now restoring the /etc/cmcluster directory from the last good backup on both servers, BWDSE-AM and BWDSE-BM.

Matti_Kurkela · ‎09-08-2010

> nwdse_bm:bb056x cmviewcl
> cmviewcl : Cannot talk to all the nodes.
> Cluster does not appear to be up

You seem to have lost all heartbeat communication lines between the nodes, or nwdse_am really is down. You should check (and if necessary, fix) this communication problem first.

> nwdse_bm:root cmruncl -v
> Local node is not currently configured in a cluster

Has someone run "cmdeleteconf" on this node, or something?

In this situation, you cannot use basic cmrunnode to start nwdse_bm, because nwdse_bm cannot communicate with nwdse_am. Serviceguard has no way to verify that nwdse_am is not currently running the cluster services: if it is, nwdse_bm would cause disk corruption and data loss if it tried to start the packages now.

There is a special syntax you must use if your entire cluster is down and you can only start a single node (e.g. because the other node has burned to ashes). In this situation, *you* are responsible for verifying that the other nodes really are down, because Serviceguard cannot do it automatically. If nwdse_am is running and you start nwdse_bm in single-node mode, you will deliberately cause a split-brain situation, which would be Very Bad for the cluster.

See the Serviceguard documentation (the "Managing Serviceguard" book) for details.

MK

Stephen Doud · ‎09-10-2010

Check syslog.log for any additional messages generated by cmviewcl, such as "permission denied" or some other security message.

Assuming the cluster configuration file (/etc/cmcluster/cmclconfig) defines a valid cluster of this and other nodes, the problem may be caused by any of the following:
- missing hacl lines in /etc/services or /etc/inetd.conf
- nsswitch.conf does not point to files (/etc/hosts) before DNS
- /etc/hosts does not contain all fixed IPs configured on each node
- the IPs are not aliased to the simple hostname of the sponsoring node
- one or more IPs are aliased to multiple domains
- other issues (requires further investigation)
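
The first few items in the list above can be checked quickly from the shell. What follows is only a rough sketch, not an official Serviceguard tool: the function names (count_hacl, files_before_dns, ips_for_host) are made up for illustration, and it assumes the usual file layouts (hacl entries starting at the beginning of the line, a single "hosts:" line in nsswitch.conf, whitespace-separated names in /etc/hosts).

```shell
#!/bin/sh
# Hypothetical helpers for the checks listed above; names are illustrative only.

# Count uncommented lines starting with "hacl" in a services- or
# inetd.conf-style file. $1 = path to the file.
count_hacl() {
    grep -c '^hacl' "$1"
}

# Inspect the "hosts:" line of an nsswitch.conf-style file. $1 = path.
# Prints "ok" if "files" is listed and comes before "dns" (or "dns" is absent),
# "check" if the order looks wrong, "missing" if there is no hosts: line.
files_before_dns() {
    awk '$1 == "hosts:" {
        f = 0; d = 0
        for (i = 2; i <= NF; i++) {
            if ($i == "files" && !f) f = i
            if ($i == "dns" && !d) d = i
        }
        if (f && (!d || f < d)) print "ok"; else print "check"
        found = 1
        exit
    }
    END { if (!found) print "missing" }' "$1"
}

# Print every IP in a hosts-style file that carries the given simple
# hostname as one of its names. $1 = path, $2 = simple hostname.
ips_for_host() {
    awk -v n="$2" '{
        sub(/#.*/, "")              # drop comments before matching
        for (i = 2; i <= NF; i++) {
            if ($i == n) { print $1; break }
        }
    }' "$1"
}
```

After sourcing the functions, something like `count_hacl /etc/services`, `count_hacl /etc/inetd.conf`, `files_before_dns /etc/nsswitch.conf` and `ips_for_host /etc/hosts nwdse_am` would be run on each node, and the IPs printed by the last command compared by hand against the fixed IPs actually configured on that node. A count of 0 from count_hacl, or anything other than "ok" from files_before_dns, points at one of the causes listed above.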

Bhushi · ‎09-14-2010

Below is the only way I was able to get the package up and running...

nwdse_bm:root cmruncl -v -n nwdse_bm

WARNING:
Performing this task overrides the data integrity protection normally
provided by ServiceGuard. You must be certain that no package applications
or resources are running on the other nodes in the cluster:
nwdse_am

To ensure this, these nodes should be rebooted (i.e. /usr/sbin/shutdown -r)
before proceeding.

Are you sure you want to continue (y/[n])? y

Successfully started $SGLBIN/cmcld on nwdse_bm.
cmruncl : Waiting for cluster to form........
cmruncl : Cluster successfully formed.
cmruncl : Check the syslog files on all nodes in the cluster
cmruncl : to verify that no warnings occurred during startup.

nwdse_bm:root cmviewcl

CLUSTER      STATUS
nwdse        up

NODE         STATUS       STATE
nwdse_am     down         unknown
nwdse_bm     up           running

PACKAGE      STATUS       STATE        AUTO_RUN     NODE
dse          up           running      enabled      nwdse_bm
