
MC/ServiceGuard Cluster

 
SOLVED
ViS_2
Frequent Advisor

MC/ServiceGuard Cluster

Hi!
There is a two-node cluster with one lock disk.
When this cluster is running and one node fails, the remaining node reforms the cluster.
The question is:
what will happen if one node fails to boot after rebooting the entire cluster? Will the remaining node be able to form a new cluster in this case?

Thanks.
9 REPLIES
Pedro Cosmen
Valued Contributor

Re: MC/ServiceGuard Cluster

Hello Victor,

No, you need all the nodes up to bring up the cluster; if one node fails to boot, the entire cluster cannot be formed. So don't shut down the cluster until you can boot all the nodes!

Regards.
John Poff
Honored Contributor

Re: MC/ServiceGuard Cluster

Hi,

Pedro is right. You need both nodes to bring up the cluster automatically, but you can force the cluster to start manually on just one node by using the cmruncl command:

cmruncl -n <node_name>

JP
Prashant Zanwar_4
Respected Contributor

Re: MC/ServiceGuard Cluster

If you boot the entire cluster and only one node comes up, quorum will not be met, and the node that booted will obviously not form a cluster on its own. For a two-node cluster it can be acceptable to run without the lock disk if only one of the nodes is supposed to be running the services and the two nodes are otherwise independent of each other; in that situation only the TOC behaviour matters.
Hope it helps
Thanks
Prashant
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."
ViS_2
Frequent Advisor

Re: MC/ServiceGuard Cluster

To Prashant:
Have I understood your answer correctly?
In the case of a three-node cluster, if one node fails to boot, can the two remaining nodes form a new cluster after a total reboot?

To all:
Is there some means to force exactly 50% of the alive nodes to form a cluster, provided that they have got the lock?
Prashant Zanwar_4
Respected Contributor

Re: MC/ServiceGuard Cluster

Yes that is what I meant.

The Cluster Lock (either a Cluster Lock Disk or a Quorum Server) is used to
arbitrate a cluster breakdown. Cluster vitality is determined by all nodes passing heartbeat packets to the cluster manager over the heartbeat (HB) LAN. If an HB network failure occurs which isolates some nodes from the rest, cluster communication (and status synchronization) is lost. Serviceguard then tries to form a reduced cluster of known functioning nodes to adopt packages from the nodes whose status is no longer known. To determine the members of the new cluster, Serviceguard uses the Quorum Rule:
• Greater than 50% of nodes can communicate:
these nodes reform a new cluster.
• Exactly 50% of nodes can communicate:
the Cluster Lock Disk or Quorum Server is consulted to arbitrate the new cluster.
• Less than 50% of nodes can communicate:
these nodes are forced to TOC to preserve data integrity.
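The three cases above can be illustrated with a small arithmetic sketch (this is purely illustrative, not Serviceguard code; the node counts are made-up examples):

```shell
#!/bin/sh
# Illustrative sketch of the Quorum Rule: compare the number of nodes
# that can still communicate against the total cluster size.
total=4   # hypothetical cluster size
alive=2   # hypothetical surviving partition

if [ $((alive * 2)) -gt "$total" ]; then
    # more than 50% of nodes present: they reform the cluster
    echo "reform"
elif [ $((alive * 2)) -eq "$total" ]; then
    # exactly 50%: the cluster lock disk / quorum server breaks the tie
    echo "arbitrate"
else
    # less than 50%: this partition is forced to TOC
    echo "halt"
fi
```

With total=4 and alive=2 the partition is an exact 50% split, so the lock is needed; with alive=3 the majority would reform without touching the lock.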

Hope it helps
Thanks
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."
melvyn burnard
Honored Contributor

Re: MC/ServiceGuard Cluster

>Have I understood your answer correctly?
In the case of a three-node cluster, if one node fails to boot, can the two remaining nodes form a new cluster after a total reboot?


No, this is not the case.
On initial cluster startup, after all nodes have been rebooted, 100% of the nodes MUST be available to form the cluster.
If one is not available, then the remaining nodes will wait for the AUTO_START_TIMEOUT period (10 minutes is the default) and then cease trying to join a cluster.
As already pointed out, you would then need to intervene manually: issue cmruncl -n <node_name> to get one node running as a single-node cluster, and then issue cmrunnode on the other nodes.
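Putting the manual recovery steps above together, the sequence looks roughly like this (a sketch only; node1/node2 are hypothetical node names, and cmruncl/cmviewcl/cmrunnode are the Serviceguard commands mentioned in this thread):

```
cmruncl -n node1     # force a single-node cluster on the node that did boot
cmviewcl             # verify the cluster is up and running on node1
# ... later, once the failed node has been repaired and booted ...
cmrunnode node2      # join the repaired node to the running cluster
```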
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Prashant Zanwar_4
Respected Contributor
Solution

Re: MC/ServiceGuard Cluster

A small but good document on this. I agree with melvyn. All nodes must be present if the cluster is to reform on its own; otherwise cmruncl must be issued manually.
Hope this helps
Thanks
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."
Sridhar Bhaskarla
Honored Contributor

Re: MC/ServiceGuard Cluster

Hi Victor,

If you don't have all the nodes at the time of cluster startup, you can still bring up the cluster using the -f option of cmruncl:

cmruncl -f -n <node_name>

will bring up the cluster with only one node.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Prashant Zanwar_4
Respected Contributor

Re: MC/ServiceGuard Cluster

I am just sorry for my first reply:

Here is an extract from the above document. It is quite helpful. Also, in a two-node cluster the cluster lock is a must.

The cluster lock is used as a tie-breaker only for situations in which a
running cluster fails and, as MC/ServiceGuard attempts to form a new
cluster, the cluster is split into two sub-clusters of equal size. Each
sub-cluster will attempt to acquire the cluster lock. The sub-cluster
which gets the cluster lock will form the new cluster, preventing the
possibility of two sub-clusters running at the same time. If the two
sub-clusters are of unequal size, the sub-cluster with greater than 50% of
the nodes will form the new cluster, and the cluster lock is not used.
If you have a two-node cluster, you are required to configure the cluster
lock. If communications are lost between these two nodes, the node that
obtains the cluster lock will take over the cluster and the other node will
undergo a forced halt. Without a cluster lock, a failure of either node in
the cluster will result in a forced immediate system halt of the other
node, and therefore the cluster will halt.
If the cluster lock fails or is unavailable during an attempt to acquire it,
the cluster will halt. You can avoid this problem by configuring the
cluster's hardware so that the cluster lock is not lost due to an event that
causes a failure in another cluster component.

Hope it clarifies the need of cluster lock.

Thanks
Prash
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."