StoreVirtual Storage

Re: how to maintain quorum in a single site cluster with 4 nodes?

 
KFM_1
Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

I know that HP recommends the FOM as a best practice. If you have an even number of managers, then the FOM is needed to get an odd number. I have also read the SAN installation guide; the problem is that there is no detailed explanation of exactly how this works. It says that it is recommended to have 3 or 5. Even if I had a FOM and 4 managers running, at the moment one of the nodes goes offline I would be left with 3 managers + FOM = an even number.

 


I too am curious about how the management group determines quorum, and I don't mean by a simple odd/even number!
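
For what it's worth, the arithmetic can be sketched in a few lines of Python. This assumes quorum means a strict majority of the managers configured in the management group (FOM included), counted against the configured total rather than against whatever is still running. The function names are made up for illustration and are not part of any HP tool.

```python
def quorum_needed(configured_managers: int) -> int:
    """Smallest number of managers that forms a strict majority."""
    return configured_managers // 2 + 1


def has_quorum(configured_managers: int, running_managers: int) -> bool:
    """True if the running managers are still a majority of the configured ones."""
    return running_managers >= quorum_needed(configured_managers)


# Assumed setup: 4 storage nodes each running a manager, plus 1 FOM.
CONFIGURED = 5

for running in range(CONFIGURED, 0, -1):
    state = "quorum" if has_quorum(CONFIGURED, running) else "NO quorum"
    print(f"{running} of {CONFIGURED} managers running: {state}")
```

Under that assumption, losing one node leaves 4 of 5 managers running while the threshold stays at 3, so it does not matter that 4 is an even number.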

 

If the FOM is a best practice (not that I've read that explicitly), then why don't they just say to use it in all scenarios rather than just the two that are mentioned in the user guide? Given the calculation of quorum, I can't think of a scenario where you wouldn't use a FOM. That is my main gripe with the documentation.

 

What would happen if I built a logical “Multi-Site cluster” that is physically in the same location, with 4 managers + a FOM outside of the “site”? Would there be any difference?

 

 

I'm guessing no difference. I have built something similar (yet opposite) by stretching a single-site cluster across two physical sites. I had eight nodes, four at each site. Of the four at each site, two were running managers, and I had a FOM at a third logical site, so I had five managers for quorum.
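
A rough sketch of why the FOM matters in that layout, assuming quorum is a strict majority of all configured managers. The site names and the helper function are hypothetical; the manager counts (two per site plus one FOM) are the ones described above.

```python
def survives_loss(managers_per_site: dict, lost_site: str) -> bool:
    """True if the managers outside the lost site still form a majority."""
    total = sum(managers_per_site.values())
    remaining = total - managers_per_site[lost_site]
    return remaining >= total // 2 + 1


# Two managers at each physical site, FOM at a third (logical) site.
with_fom = {"site_a": 2, "site_b": 2, "fom_site": 1}
# Same layout without the FOM, for comparison.
without_fom = {"site_a": 2, "site_b": 2}

for label, layout in (("with FOM", with_fom), ("without FOM", without_fom)):
    for site in layout:
        ok = survives_loss(layout, site)
        print(f"{label}: lose {site} -> {'quorum kept' if ok else 'quorum lost'}")
```

With the FOM, any single site can drop out and 3 of 5 managers remain. Without it, losing either site leaves 2 of 4, which is not a majority.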

 

The only difference I've found with multi-site clusters is the requirement for different subnets for each site, thus two or more VIPs.  That in itself shouldn't affect quorum calculations.

 

Although I'm always happy to be proved wrong! :)

 

lex_11
Occasional Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Which Network RAID level did you configure for the volumes? Have you ever tested the failover function in your configuration?
Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Well, there are some cases where you don't really need a FOM:

Single server room, 4 nodes or more... In that case 3 managers are enough, since a split-brain situation is not really possible, as I assume all nodes are connected to the same switches... Keeping a majority only becomes an issue at the moment you restart nodes when updating firmware...

 

The only thing a multi-site cluster will do is spread the blocks of data across the nodes so that every block of data is located in every site. This is to avoid all volumes going down when 1 site goes down...

You can also achieve this by creating a single-site cluster and ordering the nodes at cluster level so that the odd nodes are in one site and the even nodes are in the second site...
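
The odd/even idea can be illustrated with a deliberately simplified model. It assumes that a two-copy volume (Network RAID-10) keeps each block on two nodes that are adjacent in the cluster's node order, wrapping around at the end. That placement rule, the node names, and the convention that the first letter of a name is the site are all assumptions made for the sketch, not a description of the actual LeftHand OS layout algorithm.

```python
def mirror_pairs(order):
    """Pairs of adjacent nodes (with wrap-around) that hold the two copies of a block."""
    n = len(order)
    return [(order[i], order[(i + 1) % n]) for i in range(n)]


def site_of(node):
    """Hypothetical naming convention: 'A1' is node 1 in site A."""
    return node[0]


def every_block_in_both_sites(order):
    """True if each mirror pair spans two different sites."""
    return all(site_of(a) != site_of(b) for a, b in mirror_pairs(order))


alternating = ["A1", "B1", "A2", "B2"]  # odd positions in site A, even in site B
grouped = ["A1", "A2", "B1", "B2"]      # all of site A first, then site B

print("alternating:", every_block_in_both_sites(alternating))
print("grouped:    ", every_block_in_both_sites(grouped))
```

In this model the alternating order gives True and the grouped order gives False, because the pair (A1, A2) keeps both copies of some blocks in site A.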

A picture explains it better and can be found in the training material (I am a P4000 certified instructor)... But there is a logic behind it...

 

Note that from version 9.0 on, multiple subnets are no longer necessary for a multi-site cluster... I set them up in a single subnet all the time, and it works great...

 

Kr,

Bart

lex_11
Occasional Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

 



@Bart_Heungens wrote:
Well, there are some cases where you don't really need a FOM:

Single server room, 4 nodes or more... In that case 3 managers are enough, since a split-brain situation is not really possible, as I assume all nodes are connected to the same switches... Keeping a majority only becomes an issue at the moment you restart nodes when updating firmware...


With 3 managers I would have quorum in the cluster. But I still don't get how quorum would be maintained after one of the 3 running managers goes offline. During the time that one manager is offline I would have a split brain in the cluster... I'm thinking about the worst-case scenario and assuming that another manager could go offline shortly after the first one.

 

I guess I will have to try the solution with 4 managers + FOM, though I'm still unclear about how quorum is maintained...

 

 

 

Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi,

 

You mention it yourself: with 3 managers you can have quorum with 2 surviving managers, but at that moment nothing else should go wrong... That is why HP (and I as well) always go for 5 managers if you have the possibility... All my customers with 4 nodes have the FOM installed, even if all nodes are in 1 datacenter...
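
The difference between 3 and 5 managers comes down to simple arithmetic, sketched below on the assumption that quorum is a strict majority of the configured managers. The function name is hypothetical.

```python
def failures_tolerated(configured_managers: int) -> int:
    """How many managers can be lost while a strict majority is still running."""
    quorum = configured_managers // 2 + 1
    return configured_managers - quorum


for managers in (3, 4, 5):
    print(f"{managers} managers: quorum is {managers // 2 + 1}, "
          f"tolerates {failures_tolerated(managers)} failure(s)")
```

Three managers tolerate one failure, four managers still tolerate only one, and five tolerate two, which is why adding the FOM to a 4-node setup buys a second failure.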

 

But I always discuss this with the customer and go through all possible scenarios with them... It's up to them to decide...

 

 

Kr,

Bart

lex_11
Occasional Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi  Bart,

 

The other option I'm also thinking about is to go for 2 clusters in a single site, with all managers running + a FOM. Though I would have half the performance of a single cluster, I would get the benefit of storage redundancy and could tolerate the failure of 2 nodes (1 from each cluster).

 

One thing I'm not quite sure about is how many FOMs I need in this scenario. Is one FOM enough for 2 clusters, or do I need 2?

 

Cheers!

 

 

 

 

KFM_1
Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?


@lex_11 wrote:
Which Network RAID level did you configure for the volumes? Have you ever tested the failover function in your configuration?

Sorry for the late reply!

 

I used Network RAID-10+2, so 4 copies of the data, 2 in each site. Yes, we did test failover functionality before going live - this was a customer requirement. Data integrity and access passed with flying colours. In this case, we did have a FOM in a different server room that had network connectivity to both physical sites.
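
To illustrate how 4 copies can end up as 2 per site, here is a simplified model. It assumes the four copies of a block sit on four consecutive nodes in the cluster order (wrapping around) and that the eight nodes alternate between the two sites. Both the placement rule and the node names (first letter = site) are assumptions for the sketch; the post above does not state how the nodes were ordered.

```python
from collections import Counter


def copies_per_site(order, first, copies=4):
    """Count copies per site for a block whose first copy is on order[first]."""
    n = len(order)
    holders = [order[(first + k) % n] for k in range(copies)]
    # Hypothetical convention: the first letter of the node name is its site.
    return Counter(node[0] for node in holders)


# Eight nodes, four per site, alternating in cluster order (assumed).
nodes = ["A1", "B1", "A2", "B2", "A3", "B3", "A4", "B4"]

for first in range(len(nodes)):
    print(f"block starting on {nodes[first]}: {dict(copies_per_site(nodes, first))}")
```

In this model every block has two copies in site A and two in site B, so a full site failure still leaves two copies of everything.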

Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi,

 

You add a FOM per management group and not per cluster...

So for that you don't need to create 2 clusters... A good reason to create multiple clusters is to split up types of disks, for instance SATA, SAS and SSD... Another reason can be remote copy groups to remote sites with asynchronous replication...
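
The relationship can be modelled roughly as follows: managers and the FOM belong to the management group, and clusters are just groupings of nodes inside it. The class and field names are invented for illustration, and the example numbers (two clusters with two manager-running nodes each) are an assumption based on the 4-node setup discussed in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    node_managers: int  # storage nodes in this cluster that run a manager


@dataclass
class ManagementGroup:
    clusters: list = field(default_factory=list)
    has_fom: bool = False  # at most one FOM, owned by the group, not by a cluster

    def manager_count(self) -> int:
        """All managers in the group: node managers from every cluster plus the FOM."""
        return sum(c.node_managers for c in self.clusters) + (1 if self.has_fom else 0)

    def quorum_needed(self) -> int:
        """Strict majority of all managers in the group (assumed quorum rule)."""
        return self.manager_count() // 2 + 1


group = ManagementGroup(
    clusters=[Cluster("cluster-1", 2), Cluster("cluster-2", 2)],
    has_fom=True,
)
print("managers:", group.manager_count(), "quorum:", group.quorum_needed())
```

Splitting the four nodes into two clusters does not change the count: the group still has 5 managers and needs 3 for quorum, with a single FOM serving both clusters.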

 

 

Kr,

Bart

David_Tocker
Regular Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

My understanding is that a FOM is there to serve the purpose of maintaining quorum in the case of lost nodes, or serving the purpose of a 'tie-breaker' in the case of having an even number of nodes.

 

So if you had a two-room scenario with two nodes in each room, you would want a FOM on a separate network that is accessible from any node in the case of a failure. This means that realistically you want L3 switches talking to a separate switch on a separate network with the FOM attached. The normal rules of TCP/IP networks apply, so routes to the separate network need to be maintained on each room's switch/router for reliable operation.

 

This can be as simple as (room1 (192.168.1.x) ---- FOM (192.168.2.x) ---- (room2 192.168.3.x))

This way you are covered against a room failure and a switch failure, but you still cannot lose the FOM if (only) one of the rooms is down. In the case of total failure of all rooms, you want the FOM to be the last to fail; at least that way it can perform the duty of a tie-breaker. In the case of the FOM going off before both rooms, I assume that it will still be able to perform this task when coming back online, but I am not 100% sure on that.
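
The failure combinations for this two-room layout can be enumerated with a short sketch. It assumes each of the two nodes per room runs a manager (so 4 managers plus the FOM) and that quorum is a strict majority of those 5. The names in the dictionary are hypothetical labels.

```python
from itertools import combinations

# Managers per failure domain: two nodes per room, plus the FOM on its own network.
MANAGERS = {"room1": 2, "room2": 2, "fom": 1}
TOTAL = sum(MANAGERS.values())
QUORUM = TOTAL // 2 + 1  # strict majority of all configured managers


def quorum_after(failed) -> bool:
    """True if the managers outside the failed domains still reach quorum."""
    alive = TOTAL - sum(MANAGERS[name] for name in failed)
    return alive >= QUORUM


for count in (1, 2):
    for failed in combinations(MANAGERS, count):
        state = "quorum kept" if quorum_after(failed) else "quorum lost"
        print(f"down: {' + '.join(failed):<14} -> {state}")
```

Any single failure (one room or the FOM) keeps quorum, but a room plus the FOM leaves only 2 of 5 managers, which matches the point that the FOM must not be lost while a room is down.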

 

NEVER run the FOM from one of the nodes - if the FOM cannot come up independently of the cluster then you could be in trouble.

Regards.

David Tocker