Question on Quorum Disk & Fence device in Redhat Cluster
Operating System - Linux
09-19-2010 05:58 AM
Dear All,
As far as I know, in a Red Hat Cluster a fence device is needed to cut off a node's access to a resource (shared disk, etc.) when that node loses contact with the rest of the nodes in the cluster.
And a quorum disk is needed in a split-brain condition.
My query is:
a) Is the only purpose of a fence device to cut off resources and power-cycle a node when it becomes unhealthy?
b) Is a quorum disk only needed to keep cluster operations running without disruption when the majority of nodes lose their fitness (say, in a 3-node cluster, 2 nodes fail)?
Can anyone explain to me in detail what a quorum disk is and why we need it?
What is the difference between a fence device and a quorum disk?
In a two- or three-node cluster, do we need to configure a quorum disk? If yes, then why?
Thanks
Minhaz
09-19-2010 12:30 PM
Solution
Fencing is RedHat Cluster's primary protection against the split-brain condition.
The problem is, if the cluster has only 2 nodes and no quorum disk, what would be a split-brain condition turns into a "fencing war" instead.
The quorum disk is an optional extra tool for deciding which nodes are healthy and which are not. It can affect the main cluster daemons' decisions on when to fence and which nodes to fence.
In 2-node clusters the quorum disk eliminates the possibility of fencing wars if the network connections between the nodes are lost. This is usually the primary reason to use a quorum disk in a 2-node cluster.
A 3-node cluster will become a 2-node cluster while any one of the nodes is down for any reason, e.g. on a planned maintenance downtime. To make the cluster safe from fencing wars even while one node is down, it would be a good thing to set up a quorum disk for 3-node clusters too. However, the quorum disk is much more important for 2-node clusters than for 3-node clusters.
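To make the vote math concrete, here is a minimal sketch of what adding a quorum disk to a 2-node cluster looks like in cluster.conf (RHEL 5 syntax; the label and tuning values are hypothetical, not from the original post):

```xml
<!-- Hypothetical fragment, RHEL 5 cluster.conf syntax.
     With a quorum disk the special two_node mode is disabled:
     each node contributes 1 vote, the qdisk contributes 1 more,
     and quorum requires a majority of the 3 expected votes. -->
<cman two_node="0" expected_votes="3"/>
<quorumd interval="1" tko="10" votes="1" label="example_qdisk"/>
```

With this setup, a node that reboots after being fenced and still cannot see its peer (and does not hold the quorum-disk vote) has only 1 of 3 votes, stays inquorate, and therefore cannot start fencing on its own.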
An example of a fencing war in a 2-node cluster with no quorum disk would be:
1.) All network connections lost between nodes A and B.
2.) Node A decides B has failed and fences it. (At the same time, node B was trying to fence node A using the same rules, but by random chance, was not quite fast enough.)
3.) Since node B was successfully fenced, node A now knows B is down for sure. Node A takes over all the cluster services; node B reboots.
4.) Since the cluster is running in the special 2-node mode, there is no quorum check and the node B can restart cluster daemons with no connection to node A. But because the state of node A is unknown to node B, there is a problem... so node B fences node A.
5.) Since node A was successfully fenced, node B now knows A is down for sure. Node B takes over the cluster services; node A reboots.
6.) Since the cluster is running in the special 2-node mode, there is no quorum check and the node A can restart cluster daemons with no connection to node B. But because the state of node B is unknown to node A, there is a problem... so node A fences node B.
7.) (Go back to step 3.)
This cycle will go on forever until the network connections are restored or the sysadmin stops it manually. Because of the reboot cycle, the nodes can do very little useful work, and the users won't be happy.
The quorum disk can prevent this cycle from happening. It provides (at least) one extra vote to the cluster quorum voting process (which is done by the main RedHat Cluster daemons), and makes the special 2-node mode unnecessary. The quorum voting process will prevent the fenced node from starting cluster operations until the network connections are fixed... so step 4) won't happen.
The quorum disk daemon can also be used to set up extra conditions for node fitness. For example, if all your cluster services need a connection to an external database, you can make qdiskd run a script to check if the database is reachable; if it isn't, the node will be considered unhealthy.
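For example, a database-reachability heuristic like the one described above might be declared like this (the script path and tuning values are hypothetical; the `heuristic` element itself is standard qdiskd configuration):

```xml
<!-- Hypothetical fragment: the node keeps its quorum-disk vote only
     while the heuristic program keeps exiting 0. -->
<quorumd interval="1" tko="10" votes="1" label="example_qdisk">
  <heuristic program="/usr/local/bin/check_db_reachable.sh"
             interval="2" score="1" tko="3"/>
</quorumd>
```

Roughly: if the script fails `tko` consecutive checks, the node loses that heuristic's score and qdiskd considers it unfit.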
In short:
- The fence device is an absolute requirement in all production RedHat clusters. Without a fence device, your RedHat Cluster configuration will not be supported by RedHat, will not be protected from split-brain situations, and will not recover automatically from some other types of hardware failures.
- The cluster will use the fence device to cut off nodes whose state is unknown. As a result, the cluster will know that a fenced node is down for sure. This will allow the automatic failover procedures to continue.
- The sysadmin can use the fencing mechanism to manually halt and/or reboot the nodes for any reason, e.g. to remotely shut down a node for hardware maintenance.
- The quorum disk is optional:
* it's a very good thing to have on a 2-node cluster, to prevent fencing wars
* it's good to have on 3-node clusters too, but slightly less important
* it allows customizable extra health checks: if you need them, you may want to use it on bigger clusters too
* it can be used to allow a single node to keep running the cluster, even if the majority of nodes have failed.
* it supports a maximum of 16 nodes.
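As a sketch of the fence-device side, a typical power-fencing definition in cluster.conf might look like this (fence_ipmilan is a real agent, but the device name, address, and credentials here are placeholders):

```xml
<!-- Hypothetical fragment: an IPMI-based power fence device. -->
<fencedevices>
  <fencedevice agent="fence_ipmilan" name="ipmi-node1"
               ipaddr="192.168.1.101" login="admin" passwd="secret"/>
</fencedevices>
```

For the manual case mentioned above, the sysadmin can trigger the same mechanism by hand with `fence_node <nodename>`.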
Please read:
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/s1-qdisk-considerations-CA.html
MK
09-20-2010 02:47 AM
Re: Question on Quorum Disk & Fence device in Redhat Cluster
WOW, very informative answer MK, Thanks
Not too unfamiliar to anyone who has worked with OpenVMS clusters!
PS: no points for this, just a comment.
Jean-Pierre Huc
I will feel the difference
09-20-2010 09:24 PM
Re: Question on Quorum Disk & Fence device in Redhat Cluster
Thanks MK. Your input is so helpful.
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.
© Copyright 2024 Hewlett Packard Enterprise Development LP