StoreVirtual Storage


 
lex_11
Occasional Advisor

how to maintain quorum in a single site cluster with 4 nodes?

Hi, 

 

I want to build a single-site cluster of 4 nodes with enough storage redundancy to tolerate the loss of 2 nodes without any impact on data access or data consistency. For this purpose, all volumes in the cluster will be fully provisioned and protected with Network RAID-10. There will be 3 managers running in the cluster.

With this setup, I would have quorum in the cluster and access to my data would be guaranteed.

 

P4500 Cluster:

Node1 Manager

Node2 Manager

Node3 Manager

Node4 –

 

So far so good.

What bothers me is that if I lose Node1 with the manager running on it, I will have no quorum and would be unprotected if Node3 also goes offline. I would have a split brain in the cluster and no quorum during that time unless I start the manager on Node4. Is it possible to trigger the start of the stopped manager on Node4 automatically as soon as one of the running managers stops?

 

Thanks all!!

18 REPLIES
Bryan McMullan
Trusted Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

You can either install a FOM to help with quorum (and run managers on all nodes so you have the suggested 5 managers in the cluster), or you could get another node and run managers on all five nodes.

 

As you seem to be set on four nodes, I think running managers on all 4 nodes and adding a FOM is the way to go. Even though you're not running multi-site, I think it should work fine.

oikjn
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

I haven't tried, but I would imagine you could get creative with the CLI and batch commands to get what you are looking for.

 

Alternatively, do you have a problem with adding a FOM to the mix? Then you run managers on all four nodes plus a FOM to keep quorum, and you can survive a two-node failure. Hopefully you are using Network RAID-10+1, because keeping quorum through a second node failure doesn't really matter if you are only using Network RAID-10: the loss of the second node will stop LUN availability anyway.
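To make the quorum-versus-data-availability point concrete, here is a rough Python sketch. It assumes (this is not stated anywhere in the thread) that each stripe of a volume is stored on consecutive nodes in cluster order, with 2 copies for Network RAID-10 and 3 copies for Network RAID-10+1. The function names and the layout model are purely illustrative and are not part of any HP tool.

```python
from itertools import combinations

def replica_sets(node_count, copies):
    """Hypothetical layout: each stripe lives on `copies` consecutive nodes, wrapping around."""
    return [{(start + k) % node_count for k in range(copies)}
            for start in range(node_count)]

def volume_available(node_count, copies, failed):
    """The volume stays online only if every stripe keeps at least one live copy."""
    failed = set(failed)
    return all(not group <= failed for group in replica_sets(node_count, copies))

def survives_any_failure(node_count, copies, failures):
    """True if the volume survives every possible combination of `failures` lost nodes."""
    return all(volume_available(node_count, copies, lost)
               for lost in combinations(range(node_count), failures))

# 4 nodes: 2 copies cannot guarantee survival of an arbitrary 2-node loss, 3 copies can.
print(survives_any_failure(4, 2, 2))
print(survives_any_failure(4, 3, 2))
```

Under this assumed layout, two copies only ride out a double failure when the two lost nodes happen not to hold both copies of the same stripe, so a guaranteed two-node tolerance needs the third copy regardless of how many managers are running.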

KFM_1
Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

You know what, I'm not entirely convinced that in your situation the loss of one node running a manager will put you in a split-brain situation.

 

I've attached a screenshot from the HP StorageWorks P4000 SAN Solution User Guide, on page 149, table 35 Managers and quorum.

 

Managers.png

 

You will see that when you run three managers, and one fails, it says "If one manager fails, 2 remain, so there is still a quorum."  Therefore this means we are NOT in a split-brain scenario.  What I assume from this statement is that the determination of whether or not a cluster is in a split-brain scenario is calculated using the pre-failure number of managers, that is, three.  Thus two out of three managers is still a majority (though not recommended as it's not fault tolerant).
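The arithmetic behind that reading can be sketched in a few lines of Python. This assumes quorum means a strict majority of the managers configured in the management group (the pre-failure count), which is my interpretation of the table rather than something the guide spells out. The function names are made up for illustration.

```python
def quorum_size(configured_managers):
    """Smallest number of running managers that is a strict majority of those configured."""
    return configured_managers // 2 + 1

def tolerated_failures(configured_managers):
    """How many managers can stop before the majority is lost."""
    return configured_managers - quorum_size(configured_managers)

# Print one row per manager count: configured, needed for quorum, failures tolerated.
for n in range(1, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```

On this model, 3 managers tolerate one failure, 4 managers also tolerate only one, and 5 tolerate two, which would explain why an even count buys nothing extra.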

 

This to me is misleading as the paragraph directly above the table says "An even number of managers can get into a state where no majority exists—one-half of the managers do not agree with the other one-half. This state, known as a 'split-brain,' may cause the management group to become unavailable."

 

To me, the guide does not go into enough detail with regards to managers, quorums and special managers.  The guide mentions that the FOM should be used for specific scenarios yet I've heard from authoritative sources within HP that it should be used pretty much in every scenario.  IMO HP use it as a silver bullet for all possible split-brain/manager scenarios - "oh just deploy a FOM and all will be good".  If that's the case then I cannot imagine a scenario where you wouldn't want a FOM!

 

Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi,

 

HP is quite clear about the usage of the FOM...

If you have 2 nodes in 1 server room, or 2 times 2 (so 4 nodes) in a cluster spread across 2 server rooms, you should have a FOM to avoid split-brain situations...

Split brain also happens with only 2 nodes in 1 server room when they lose communication with each other... Which node should stay active...

 

Best practices say that you need 3 or 5 managers to avoid downtime...

 

If you have 4 nodes and you start 3 managers, you are in a good situation, since 1 can go down and you keep quorum... But it is not ideal, and that is also what the BPA says inside the CMC... You should go for 5...

 

 

Kr,

Bart

--------------------------------------------------------------------------------
If my post was useful, click on my KUDOS! "White Star" !
KFM_1
Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi Bart,

So HP are essentially saying to use a FOM in all scenarios to make up one of the managers ;)
Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi,

 

That is the one and only reason for the FOM being there: being a manager... And it is doing it quite well...

 

It helped me already in several cases...

 

 

Kr,

Bart

KFM_1
Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi Bart,

I don't doubt it's doing a good job being a manager! I was wondering why HP don't just say to use it in all scenarios....single site (single rack/multi-rack/multi-room/etc), multi-site and in situations where there are only two nodes in a cluster.
Bart_Heungens
Honored Contributor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi,

 

That is what HP says: with only 2 nodes it is always better to have a FOM...

 

 

Kr,

Bart

lex_11
Occasional Advisor

Re: how to maintain quorum in a single site cluster with 4 nodes?

Hi, thanks for all the replies!

 

I know that HP recommends the FOM as a best practice. If you have an even number of managers, then the FOM is needed to get an odd number. I have also read the SAN installation guide; the problem is that there is no detailed explanation of how exactly this works. It says that it is recommended to have 3 or 5. Even if I had a FOM and 4 managers running, at the moment one of the nodes goes offline I would be left with 3 managers + FOM = an even number.

So what is the guarantee that there would still be a quorum if one of the other nodes goes offline as well?
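One way to reason about this question is to walk through the failures one at a time. The Python sketch below assumes quorum is always measured against the 5 managers configured in the management group (4 nodes + FOM), as the user-guide table quoted earlier in the thread suggests, and not against however many happen to be left running. The helper names and the failure order are hypothetical.

```python
def has_quorum(configured, running):
    """Quorum holds while more than half of the configured managers are still running."""
    return running > configured / 2

def fail_in_order(managers, failures):
    """Stop managers one by one and record (stopped, still running, quorum held?)."""
    configured = len(managers)
    alive = list(managers)
    history = []
    for name in failures:
        alive.remove(name)
        history.append((name, len(alive), has_quorum(configured, len(alive))))
    return history

managers = ["Node1", "Node2", "Node3", "Node4", "FOM"]
for step in fail_in_order(managers, ["Node1", "Node3", "Node2"]):
    print(step)
```

If that assumption holds, being left with 4 running managers after the first failure is not a problem: 4 of 5 and then 3 of 5 are both majorities, and quorum is only lost when a third manager stops.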

In the end I guess I could test this before going online, but I'm still curious about how the quorum is maintained.

 

What would happen if I built a logical "Multi-Site cluster" that is physically in the same location, with 4 managers + a FOM outside of the "site"? Would there be any difference?

 

Cheers