Operating System - HP-UX

lock disk vs. quorum server

 


Hi,

just wondering, in a 2 node cluster, what are the advantages of a lock disk over a quorum server and vice versa?
I've been reading up on both and don't see a clear advantage of one above the other.
Below is a brief description of the hardware that I can use for this cluster.

Please focus only on this comparison, as I can see clearly how a quorum server can be helpful if you have multiple clusters.
Also, this is only regarding a 2-node cluster (campus cluster) over 2 sites, with all disks on EMC Symmetrix and the systems being 2 Superdome nPars.

Thanks for all in advance.
Emiel
10 REPLIES
Stephen Doud
Honored Contributor

Re: lock disk vs. quorum server

In discussing the cluster arbitration methods, it is important to understand
that arbitration to form a new cluster only comes into play when exactly half
of the nodes in the cluster cannot reach the other half over the heartbeat
network(s). This is sometimes called the 50% rule. Arbitration is not
involved when a heartbeat outage occurs between 25%/75% or 33%/66% of the
servers. In such a case, the minority-side of nodes will reboot themselves.

When heartbeat is lost to one or more servers, the cluster must reform in
order to ensure that potentially failed packages are adopted by legitimately
operating servers. The arbitration process ensures that one and only one
cluster will assume responsibility for orphaned packages. The nodes not
achieving cluster reformation will reboot themselves to preserve data
integrity.
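
As a rough illustration of the 50% rule described above, here is a minimal Python sketch. The function name and the outcome labels are made up for this example; it only restates the rule and is not how ServiceGuard itself is implemented.

```python
def partition_outcome(cluster_size: int, group_size: int) -> str:
    """Outcome for a group of nodes that can still hear each other's heartbeat.

    cluster_size: number of nodes configured in the cluster
    group_size:   number of nodes in this group after the heartbeat loss
    """
    if not 0 < group_size <= cluster_size:
        raise ValueError("group_size must be between 1 and cluster_size")
    if 2 * group_size > cluster_size:
        return "reform"      # majority: re-forms the cluster, no arbitration needed
    if 2 * group_size == cluster_size:
        return "arbitrate"   # exactly half: must win the lock disk or quorum server
    return "reboot"          # minority: nodes reboot themselves


# A 2-node cluster that loses heartbeat always splits 1/1, so it always arbitrates.
print(partition_outcome(2, 1))   # arbitrate
print(partition_outcome(4, 3))   # reform
print(partition_outcome(4, 1))   # reboot
```

This is why the choice of arbitration method matters so much in a 2-node cluster: every heartbeat loss between the two nodes is a 50/50 split.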

Regarding configuring a 4-node cluster with cluster arbitration: ServiceGuard
supports any of these methods, of course, but here are the pros/cons of the
various cluster arbitration methods:

Quorum Server
-------------
PROS: Can support up to 50 2-node clusters
Failover time shortened because quorum server access is faster than
lock disk access

CONS: Requires a server outside of the cluster be loaded with the quorum
server (QS) software (maintenance required).

Cluster node communication with the quorum server is subject to network
failure

QS's .rhosts must be updated to allow access to cluster nodes (possible
security issue).

Single cluster lock disk
------------------------
PROS: Simple to configure - allow cmquerycl to select one

CONS: Must have a shared VG between all 4 nodes in the cluster

The single cluster lock disk is a single-point of failure - but it is
only a vulnerability if it is unavailable when the 50% rule comes into
play.

Lock disk access (disk-I/O) is slower than quorum server(network-based)

Dual cluster lock disk
----------------------
PROS: Recommended in scenario where 50% of the nodes are in a different
location than the other 50% of the nodes - providing for site failure.

CONS: Could cause split brain clusters if only the heartbeat network fails
and not one of the 2 sites. (DATA CORRUPTION POSSIBLE)

No arbitration
--------------
PROS: easy to configure (?)

CONS: All nodes reboot when a 50/50 split occurs in the heartbeat net.
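
To make the trade-offs in the list above more concrete, here is a toy Python model of a 50/50 split. It is a deliberately simplified sketch, not ServiceGuard's real lock protocol: the function names, the arbiter labels ("lock1", "lock2", "qs") and the rule "go to the first arbiter you can reach, win only if it is unclaimed" are all assumptions made for illustration.

```python
def arbitrate(reach_a: list[str], reach_b: list[str]) -> dict[str, bool]:
    """Return which half of a 50/50 split re-forms the cluster.

    reach_a / reach_b list the arbiters (lock disks or a quorum server) that
    each half can reach, in the order it tries them. Half "A" is assumed to
    get there first. Toy rule: a half goes to the first arbiter it can reach
    and wins only if nobody has claimed that arbiter yet.
    """
    claimed: set[str] = set()
    result: dict[str, bool] = {}
    for name, reach in (("A", reach_a), ("B", reach_b)):
        if reach and reach[0] not in claimed:
            claimed.add(reach[0])
            result[name] = True    # this half forms the new cluster
        else:
            result[name] = False   # this half reboots
    return result


def is_split_brain(result: dict[str, bool]) -> bool:
    """Both halves believe they won: two clusters running the same packages."""
    return all(result.values())


# Single lock disk or quorum server: both halves compete for the same arbiter.
print(arbitrate(["lock1"], ["lock1"]))                   # {'A': True, 'B': False}

# Dual lock disks where each half can only reach the disk at its own site.
print(is_split_brain(arbitrate(["lock1"], ["lock2"])))   # True

# Quorum server when one half has lost all networking.
print(arbitrate(["qs"], []))                             # {'A': True, 'B': False}
```

In this model the split brain only appears when the two halves end up at different arbiters, which is the situation that a single arbiter (one lock disk or one quorum server) rules out by construction.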



A single lock disk is still the preferred choice because it doesn't involve the split-brain possibility of a dual-lock and requires no network access (unless it's in a SAN), and it is configured by ServiceGuard without additional software loads such as Quorum Server. However, the other two methods serve their purpose when their method makes more sense.

NOTES:
PLEASE refer to the Managing MC/ServiceGuard manual for details about each arbitration method.
David de Beer
Valued Contributor

Re: lock disk vs. quorum server


Emiel,

You can put your quorum server anywhere on the network. You can put it at the clients' site, in their network, if you like.

This way, the package will always run on the right node: the one that your clients can access.

Regards,
David de Beer.
RAC_1
Honored Contributor

Re: lock disk vs. quorum server

I would say, a lock disk. Why?
Ease of management.
No extra work for quorum server.

As you understood, if there is more than one cluster, I would go for a quorum server.
There is no substitute to HARDWORK

Re: lock disk vs. quorum server

Sometimes the speed of the itrc amazes me ;-)
Thank you all for your fast answers!

First I have to point out that if a lock disk will be chosen, it will be a dual lock disk for obvious reasons.

OK, yes, I agree.
A quorum server does have an advantage that focuses on the future if more clusters would come (and they will).

Network simply is not an issue.

I'm leaning towards using a lock disk because of the ease of management. I'm familiar with the statistically worse uptime of cluster systems, and I see a quorum server as one more system that can become a SPOF at the worst possible time.

Are there basic differences in the effect?
Let me rephrase that: can we think of scenarios where one would have a huge advantage over the other?

Let me point out again that all the HW is redundant. Everything is behind UPS. Lock disks are on the SAN. Multiple HB LANs will be used.
melvyn burnard
Honored Contributor

Re: lock disk vs. quorum server

Well, my comment here is that if you go for the dual cluster lock scenario, you could end up in a split-brain scenario. If you have a QS somewhere else on your LAN and half of the nodes lose contact with the other half, i.e. a 50% split, one side may not be able to see any networking, and hence would not be able to reach the QS, and hence you would not get split-brain.
Also, if you decide to change your disc configuration, you could be forced to halt the cluster to re-apply a new cluster lock disc.

The QS can be made HA by having a separate little 2-node cluster running QS as a package.
There are a large number of factors to be considered here. As mentioned, reading the manual may help you to make an informed choice.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rita C Workman
Honored Contributor

Re: lock disk vs. quorum server

Frankly, Stephen has explained all the pro/con points.

So, on small clusters, it's six of one and a half dozen of the other. If you're looking at going above 4 servers, then QS.

I'll be honest I have kept my lock disk all this time, cause I didn't want to bother putting up QS. Well now I'm in a situation where I will have more than 4 servers (for awhile anyway) in one cluster and I have to put in QS.
Dah...this is so simple to set up, why did I bother to wait.

For bad scenarios, you've already read on the split-brain issue, so there you go.

Just my 2cents,
rcw
Florian Heigl (new acc)
Honored Contributor

Re: lock disk vs. quorum server

I'd let every cluster have a cluster lock disk and let all clusters share a quorum server, so that the cluster nodes will be able to come up during either network or SAN connectivity loss.
(You might not get a running cluster package, but the cluster is formed and ready to go as soon as the other issues are resolved, saving uptime.)
Any old D- or A-class will do fine as a quorum server, and I think there should be one in every household ;)
yesterday I stood at the edge. Today I'm one step ahead.

Re: lock disk vs. quorum server

Thanks to all for the answers and thoughts.

This time the application is not that critical, and we only use an HA environment because when a site really has a disaster you want to have as much as possible automated, so that you can concentrate on something other than starting software.

I still haven't decided yet (on my advice, that is; I don't pretend to make any decisions ;-) ) but considering that a split-brain is absolutely not an option, I would say that a quorum server at a 3rd location is the way to go. Still with SG configured in a way that cluster activity is not automatically started after reboot. There's always somebody on standby, so that should cover it in case something happens while the QS is unavailable at the same time.

I would like to thank everybody for their thoughts and experiences. Sometimes just talking about issues makes them much more clear.
Marlou Everson
Trusted Contributor

Re: lock disk vs. quorum server

The quorum server software is really very easy to install. I have actually had my quorum server running on 4 different systems. I have moved it when I knew that the system the quorum server was on was going to have maintenance. I also moved it to a newer, more reliable system that we installed. I did all this without taking the cluster down. When I first installed the quorum server, I added another IP address and host name to the server. So I "float" the IP address and host name to the "new" system when I move it.
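
After floating the address to a new system like this, it can be worth confirming from each cluster node that the quorum server name still answers. A minimal Python sketch follows; the host name is hypothetical, and port 1238 is assumed here to be the quorum server's TCP port, so check the `hacl-qs` entry in /etc/services on your own systems before relying on it.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections and timeouts.
        return False


# Example call (hypothetical floating host name, assumed port):
#   can_reach("qs-float.example.com", 1238)
```

A successful connection only shows that something is listening on that port at the floating address; it does not prove the quorum server will authorize the cluster nodes.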

Also, my quorum server is running on HP-UX 11.11 and the cluster is 11.0.

Marlou
Roberto Martinez_6
Frequent Advisor

Re: lock disk vs. quorum server

Hi, here my two cents...

We have a 2-node cluster with 2 Hitachi cabinets, in 2 different buildings. We have chosen the "single disk in the 1st Hitachi bay" option. I personally do not like this option very much, because if the 1st site burns down, the second server would technically be able to do the whole job, but it has no access to the lock disk, and I'd have to manually restart the cluster and force the remaining node to form the cluster without the lock disk. And all that on a Sunday at 3 a.m., which is, typically, the time when such things happen.
But all of this is much better than getting data inconsistency... I prefer to spend a while on a Sunday rather than a "bigger while" recovering data from... hey! We started backing up the new cluster, didn't we?
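
The site-failure case described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the site names and the function are made up, and it only asks whether any arbiter is located outside the failed site, ignoring whether the SAN or network paths to it survive.

```python
def survivor_can_arbitrate(arbiter_sites: list[str], failed_site: str) -> bool:
    """True if at least one arbiter sits outside the site that was lost."""
    return any(site != failed_site for site in arbiter_sites)


# Where each arbitration layout keeps its arbiter(s); names are illustrative.
layouts = {
    "single lock disk at site1": ["site1"],
    "dual lock disks, one per site": ["site1", "site2"],
    "quorum server at a third site": ["site3"],
}

for name, sites in layouts.items():
    ok = survivor_can_arbitrate(sites, failed_site="site1")
    print(f"{name}: survivor can form cluster automatically = {ok}")
```

With the single lock disk at the failed site the result is False, which matches the manual restart described above; the other two layouts return True, with the dual lock carrying the split-brain trade-off discussed earlier in the thread.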

It wasn't me