
ServiceGuard - Redhat

 
Nabil_11
Frequent Advisor

Dear All,
I have the following scenario.
The cluster is formed from 7 nodes.
4 nodes are at the main site, connected to the EVA:
node1 ( main site )
node2 ( main site )
node3 ( main site )
node4 ( main site )
The other 3 nodes are at the DR site:
node5 ( DR Site )
node6 ( DR Site )
node7 ( DR Site )
All nodes are on the same subnet.
The first 3 nodes at the main site are active.
The fourth node at the main site is passive.
The three nodes at the DR site are passive.
If one node fails, it should fail over to the fourth node at the main site.
If another node fails, it should fail over to the fifth node at the DR site.

IF THE WHOLE SITE FAILS
all packages should fail over to the DR site,
and so should the EVA (without manual intervention),
assuming I have an EVA with a CA license.

I also have another two clusters.
The second cluster consists of 5 nodes:
3 at the main site, 2 at the DR site.
The third cluster consists of four nodes:
2 at the main site, 2 at the DR site.

My questions are:
1- Is it a must to have a Quorum Server?
1.1 - If it is a must to have a Q.S., can I
use a redundant Q.S.?

2- What is the benefit if I use Cluster
Extension for EVA (CLX EVA)?
Is it also a must?
3- Is it a must to have an equal number of nodes
available at the main and DR sites?

I would appreciate a clear technical answer.

Regards

Nabil

Matti_Kurkela
Honored Contributor
Solution

Re: ServiceGuard - Redhat

1.) If any fault can split your cluster into two parts of equal size, a Quorum Server should be used as a tie-breaker.

An example for your scenario:
You're going to do some hardware maintenance on node3. To do that, you switch the package(s) off node3 to node4, then shut down node3. While node3 is shut down, a backhoe cuts the cables between your main site and your DR site.

As node3 has been shut down, your main site and your DR site are now of equal size. In this situation, some sort of a tie-breaker is needed, or the entire cluster (both halves) will execute a hard reboot to prevent a possible split-brain situation and to preserve data integrity.
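
To make the counting in this example concrete, here is a minimal sketch in Python. It is only an illustration of the arithmetic, not anything ServiceGuard itself runs; the names CLUSTER, split_sizes and needs_tie_breaker are made up for this post, and it assumes the only fault is a cut of the inter-site link.

```python
# Hypothetical model of the 7-node cluster from the question: node -> site.
CLUSTER = {
    "node1": "main", "node2": "main", "node3": "main", "node4": "main",
    "node5": "dr", "node6": "dr", "node7": "dr",
}


def split_sizes(cluster, down=()):
    """Count the running nodes at each site once the inter-site link is cut."""
    sizes = {}
    for node, site in cluster.items():
        if node not in down:
            sizes[site] = sizes.get(site, 0) + 1
    return sizes


def needs_tie_breaker(sizes):
    """True when the cluster has split into two halves of equal size."""
    counts = list(sizes.values())
    return len(counts) == 2 and counts[0] == counts[1]


print(split_sizes(CLUSTER))                  # {'main': 4, 'dr': 3}
print(split_sizes(CLUSTER, down={"node3"}))  # {'main': 3, 'dr': 3}
```

With all seven nodes up, the main site keeps a 4-to-3 majority. With node3 down for maintenance, the split is 3-to-3 and needs_tie_breaker returns True.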

1.1) Read the instruction manuals of the Quorum Server at the http://docs.hp.com site.
A Quorum Server can be installed into a ServiceGuard package, but it MUST NOT be used to provide quorum services to the same cluster it is running on.

2.) The extension allows ServiceGuard to send commands to the EVA to trigger the storage failover automatically. You could do this with your own scripts if you want, but the extension is supported by HP.

3.) It depends on your applications and other requirements. In your first cluster, you have one passive node at the main site: it allows you to perform maintenance on one node at a time without failing over to the DR site. You must decide: do you need (or want) this capability at the DR site too?

If your 5-node cluster has active workloads at all 3 nodes at the main site, you might get some performance reduction when running all the workloads simultaneously using only 2 nodes at the DR site. You must decide if this is OK or not. (Maybe you've tested it and found that the performance is satisfactory for DR purposes? Or maybe the machines at the DR site are newer and can handle the extra load?)

MK
Stephen Doud
Honored Contributor

Re: ServiceGuard - Redhat

When a site failure occurs, the remaining nodes enter the cluster reformation protocol.

The protocol has 2 rules:
1. When exactly 50% of the nodes of the previously running cluster are still active, they must consult the quorum server (or lock LUN) to gain permission to form a reduced cluster.

2. When more than or fewer than 50% of the original cluster nodes remain active...
a. if the site failure results in a minority of active nodes, they MUST TOC (reboot)
b. if the site failure results in a majority of active nodes, they automatically reform a reduced cluster.

Notice that the quorum server is ONLY consulted in a 50% loss of nodes.

Therefore, a quorum server is not consulted if a majority of nodes are lost simultaneously due to a site failure (and a minority of active nodes remain).

Hence, for safety's sake, put an equal number of nodes at each site. This would force the DR site to consult the quorum server.
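
The rules above can be sketched as a small Python function. This is a simplified model for illustration only, not the actual ServiceGuard implementation; the function name reformation_outcome and its return strings are invented here.

```python
def reformation_outcome(surviving, previous):
    """Apply the reformation rules to one group of surviving nodes.

    surviving: nodes in this group that can still reach each other
    previous:  nodes in the cluster as it was running before the failure
    """
    if not 0 < surviving <= previous:
        raise ValueError("surviving must be between 1 and previous")
    if 2 * surviving > previous:
        return "reform"                  # majority: reform a reduced cluster
    if 2 * surviving == previous:
        return "consult quorum server"   # exactly 50%: tie-breaker decides
    return "reboot (TOC)"                # minority: nodes must reboot


# DR-site survivors after a total main-site failure, per cluster in the question
for dr_nodes, total in [(3, 7), (2, 5), (2, 4)]:
    print(total, "node cluster:", reformation_outcome(dr_nodes, total))
```

For the three clusters described in the question, only the four-node cluster (2 + 2) leaves the DR site at exactly 50%, so only there is the quorum server consulted. In the 7-node and 5-node layouts the DR site is a minority after a main-site failure.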
Nabil_11
Frequent Advisor

Re: ServiceGuard - Redhat

Hi,

So now, assuming a failure at the main site will
cause the whole cluster to fail,
that's clear: it means it is a must to have an equal number of nodes at each site.

What about installing a second Quorum Server?
Is that possible?

Regards

Nabil