System Administration

What fencing mechanism happens in Linux Clustering ?

sksonkar
Contributor

Hello,

 

What fencing mechanisms are used in Red Hat Linux clustering?

Why does fencing happen, and why do both nodes go down in most cases?

 

Thanks,

Shiv

3 REPLIES
Chhaya_Z
Valued Contributor

Re: What fencing mechanism happens in Linux Clustering ?

Hi,

 

Red Hat supports a variety of fencing methods, including network-power fencing, network-switch fencing, virtual-machine fencing, and storage fencing. You can configure a node with one fencing method or multiple fencing methods. When you configure a node for one fencing method, that is the only fencing method available for fencing that node. When you configure a node for multiple fencing methods, the fencing methods are cascaded from one fencing method to another, according to the order of the fencing methods specified in the cluster configuration file, until a node is successfully fenced.
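
To illustrate the cascading behaviour described above, here is a minimal Python sketch. It is only a model of the idea, not Red Hat's implementation: the `fence_node` function, the `FenceMethod` type, and the two example agents (`power_fence`, `storage_fence`) are hypothetical names invented for this example.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: a fencing method is a name plus a callable that
# returns True when the target node was fenced successfully.
FenceMethod = Tuple[str, Callable[[str], bool]]


def fence_node(node: str, methods: Sequence[FenceMethod]) -> Optional[str]:
    """Try each method in its configured order.

    Returns the name of the first method that succeeds, or None if
    every method failed.
    """
    for name, attempt in methods:
        try:
            if attempt(node):
                return name
        except Exception:
            # A failing agent (e.g. an unreachable device) simply moves
            # the cascade on to the next configured method.
            continue
    return None


# Hypothetical agents, for illustration only.
def power_fence(node: str) -> bool:
    raise ConnectionError("power switch unreachable")


def storage_fence(node: str) -> bool:
    return True


methods = [("power", power_fence), ("storage", storage_fence)]
print(fence_node("node2", methods))  # storage
```

Here the first method fails, so the second method in the list is used; if every method fails, the node is not considered fenced.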

 

Please see the official documentation on fencing below for more details:

 

 http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/ch-fencing.html

 

 

Regards,
Chhaya

I am an HP employee.
Nighwish
Frequent Advisor

Re: What fencing mechanism happens in Linux Clustering ?

Hello

 

Fencing exists to prevent split-brain in a two-node cluster.

 

If both nodes go down, you have a problem with your fencing method.

 

 

When a cluster node loses communication with the other nodes, it fences them to avoid data loss.

 

Review your configuration and fencing method. On HP servers you can use iLO as a fencing device.

 

Regards.

Matti_Kurkela
Honored Contributor

Re: What fencing mechanism happens in Linux Clustering ?

As you say "both the nodes", I assume you're talking about two-node clusters?

 

Two-node clusters are the worst case for RedHat Cluster, since the quorum system must effectively be disabled in two-node operation. When the connection between the two nodes is broken, both nodes will follow the same procedure: "Because I'm still alive, the other node must have failed, either partially or completely. I must fence it to make sure it cannot later spontaneously recover and corrupt the disks I'm writing to." Both nodes will attempt to fence each other: if the fencing is by an external power switch, the switch should accept only one connection at a time, and therefore only one node can succeed in fencing the other. (This is called a "fencing race".)
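
A toy Python model may make the fencing race easier to picture. It assumes a single external power switch that handles one request at a time (modelled with a lock); the `PowerSwitch` class and `run_race` function are hypothetical names made up for this sketch and do not correspond to any real fence agent.

```python
import threading


class PowerSwitch:
    """Hypothetical external switch that serves one request at a time."""

    def __init__(self, nodes):
        self._lock = threading.Lock()
        self.powered = {name: True for name in nodes}

    def fence(self, requester, target):
        # The lock models "only one connection at a time".
        with self._lock:
            if not self.powered[requester]:
                # The requester was already powered off by the other
                # node, so its request never takes effect.
                return False
            self.powered[target] = False
            return True


def run_race():
    switch = PowerSwitch(["A", "B"])
    threads = [
        threading.Thread(target=switch.fence, args=("A", "B")),
        threading.Thread(target=switch.fence, args=("B", "A")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return switch.powered


state = run_race()
survivors = [name for name, on in state.items() if on]
print(survivors)  # exactly one node, either A or B
```

Which node wins is not predictable, but because the switch serializes the requests, exactly one node survives. Without that serialization, both requests could take effect.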

 

But if you're using integrated power control systems (like HP iLO), it's possible for both nodes to simultaneously send the power-off command to each other, causing both nodes to go down.

 

If the node that has been fenced out of a two-node cluster (let's say, node B) is allowed to reboot and restart cluster operations automatically, it will wait for a while for the other node, and then form a one-node cluster of its own. To protect itself, it will attempt to fence the other node (node A) at this point. If the now-fenced node A follows the same procedure node B did (and by default, it does exactly that), it will cause a "reboot-and-fence loop" where both nodes keep rebooting and fencing each other until the network problem that prevents communication between the nodes is fixed.

 

Therefore, if it is not essential that a failed node can recover automatically, you may want to set the fencing action to "power off" instead of the default "hard reboot" in two-node clusters. You must then restart the failed node manually... and you can choose to not start up the cluster services until you have verified that the communication problem that caused the need for fencing is fixed.

 

With three or more nodes, when the cluster becomes split, only a portion that has quorum (i.e. more than 50% of cluster votes) is allowed to fence other nodes. This avoids most of the problematic situations of the 2-node cluster. You can simulate this with two-node clusters if you add a quorum disk to your cluster configuration.
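
The quorum rule can be written as a one-line check. The sketch below is only an illustration: `has_quorum` is a hypothetical helper, and the vote counts (one vote per node, one vote for the quorum disk) are assumptions chosen for the example, not values taken from any particular configuration.

```python
def has_quorum(partition_votes: int, total_votes: int) -> bool:
    """A partition has quorum only with strictly more than 50% of all votes."""
    return 2 * partition_votes > total_votes


# Three nodes with one vote each: a 2-node partition keeps quorum,
# while the isolated single node does not.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False

# Plain two-node cluster, one vote each: after a split, neither side
# has more than 50%, which is why quorum is effectively disabled there.
print(has_quorum(1, 2))  # False

# Two nodes plus a quorum disk (assumed to carry one vote): the node
# that still holds the quorum disk has 2 of 3 votes and may fence.
print(has_quorum(1 + 1, 3))  # True
```

With the quorum disk acting as a tie-breaker, only one side of a split two-node cluster can reach quorum, so only that side fences the other.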

 

If you have a valid RedHat subscription, you should read this RedHat Knowledge Base article:

https://access.redhat.com/kb/docs/DOC-55111

It has a more detailed description of the problem and a list of possible solutions.

MK