HPE Morpheus VM Essentials

Voortman123
Visitor

Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

Hi, I'm new to HPE VME and, like a lot of others I think, we want to test HPE VME to see if it's a good replacement for our current VMware environment. For that we have two HP DL360 Gen9 servers available and an older 3PAR array with SSDs in it. We were able to install HPE VME 8.0.11 on both hosts and present two iSCSI volumes to the cluster, but now one of the hosts gets the status 'Cluster member status is fenced'.

Since this is a warning, could I proceed and start testing? I have now added the 3PAR as a storage server in HPE VME and will test from there.

5 REPLIES
CalvinZito
Neighborhood Moderator

Re: Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

Hi @Voortman123. Hopefully someone with deeper technical expertise also responds, but I think the message you are seeing comes down to external iSCSI storage requiring three "healthy" hosts. I had thought it wouldn't work at all, but I'm now not certain of that.

External iSCSI storage with VME is based on GFS2, and that needs three hosts for quorum. A "fenced" status means the cluster isn't healthy. With only two hosts, if one host loses communication even briefly, the cluster can't decide which is the right host. I'm not sure, but I think the host that is fenced won't work with your iSCSI-connected 3PAR. I think you would want to add another host before you proceed with any testing.

From the documentation:

An HVM cluster using the hyperconverged infrastructure (HCI) Layout consists of at least three hosts. Physical hosts are recommended to experience full performance of the solution. In smaller environments, it is possible to create an HVM cluster with three nested virtual machines, a single physical host (non-HCI only), or a single nested virtual machine (non-HCI only) though performance may be reduced. With just one host it won’t be possible to migrate workloads between hosts or take advantage of automatic failover.
HPE Morpheus VM Essentials Software Documentation v8.0.11, under "Base Cluster Details" on page 75. Note: you can change the document version using the selector at the top left of the page.


I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
DiegoDelgado
HPE Pro

Re: Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

Hello @Voortman123,
As Calvin says, GFS2 datastores require quorum because the locking mechanism they use depends on a quorate cluster. Starting with 8.0.11 we allow creating such datastores with only two nodes, because we are in the process of qualifying this kind of setup with an external quorum witness. In my own testing the datastores sometimes get unfenced after a while and you can work with the setup, but you risk HA operations not working.
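If you want to watch whether the datastores unfence on your test setup, one low-level way is to look at the DLM lockspaces that GFS2 uses. This is a sketch using the standard dlm tooling, not an HVM-specific command, so I'm assuming the dlm package is present on the nodes:

# Show DLM node and fencing status as seen from this host
sudo dlm_tool status

# List DLM lockspaces (one per mounted GFS2 datastore)
sudo dlm_tool ls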



I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
SergeMe
Occasional Advisor

Re: Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

Almost certainly a quorum issue. It could be caused by Corosync packets being delayed if iSCSI traffic and cluster heartbeats share the same physical interface, but more likely there is a "death match" where both nodes try to fence each other simultaneously.

The best way to fix this is to introduce a QDevice (quorum device) that acts as a third-party witness. This is a small service (corosync-qnetd) running on a separate machine (even a Raspberry Pi or a VM outside the cluster); a rough sketch follows.
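The setup looks roughly like this. This is a sketch only: 'witness-host' is a placeholder, the package manager depends on your distro, and you should check what HPE supports before changing a VME cluster:

# On the witness machine (outside the cluster):
sudo apt install corosync-qnetd
sudo pcs qdevice setup model net --enable --start

# On one of the cluster nodes (install corosync-qdevice on both nodes first):
sudo apt install corosync-qdevice
sudo pcs quorum device add model net host=witness-host algorithm=ffsplit

# Verify the cluster now counts the witness vote:
sudo pcs quorum status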

Another potential workaround is to configure a fencing delay to prevent a "death match". You can do this by running the following commands on one of the two nodes:

# Command to find your <fence_device_name> for the following command
sudo pcs stonith status   # ('pcs stonith show' on older pcs versions)

# Output example:
# Full List of Resources:
#   * fence_idrac_node1     (stonith:fence_idrac):   Started node-01
#   * fence_idrac_node2     (stonith:fence_idrac):   Started node-02
#
# your fence device names are fence_idrac_node1 and fence_idrac_node2
#
# NOTE: You might only have one shared device (like fence_ipmilan or fence_scsi), or separate devices for each node

# The following tells the cluster: "If you are trying to fence node1, wait 30 seconds first; if you are fencing node2, shoot immediately."
# (Per-node delays in pcmk_delay_base require Pacemaker 2.1 or later.)
sudo pcs stonith update <fence_device_name> pcmk_delay_base="node1:30s"


This ensures that in a true split-brain scenario one node "Shoots The Other Node In The Head" (STONITH) immediately while the other waits, which guarantees one survivor rather than two dead nodes.
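To confirm the delay is in place afterwards, you can dump the device configuration (again a sketch; <fence_device_name> is whatever the stonith status command reported for you):

# Show the device's instance attributes; look for pcmk_delay_base=node1:30s
sudo pcs stonith config <fence_device_name>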

Hope that helps!

Voortman123
Visitor

Re: Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

Hi, I had already found that the supported cluster configuration using iSCSI is at least three hosts for quorum, but since this is just a test cluster, two should work if we accept the downsides. However, I also don't like open issues, even in a test environment, because I want to simulate a production environment (failover, migration, backup, restore and so on).

@DiegoDelgado An external quorum witness would be very nice for us to use, so how do I configure this? If that means installing another version of HPE VME, no problem!

@SergeMe where do I configure the 'fencing' stuff?

As said in my first post, I'm new to HPE VME.

SergeMe
Occasional Advisor

Re: Created a 2-node cluster for testing purposes, but one host is 'Fenced'?

To configure the fencing parameters, you'll need to SSH into one of your HVM nodes. 
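Once you're on a node, a few read-only commands give you a quick picture of cluster health. These are standard Pacemaker/Corosync tools, so the exact output format depends on the versions HVM ships:

# Overall cluster state, resources and fencing devices
sudo pcs status

# Quorum details: expected votes, total votes, and whether this partition is quorate
sudo corosync-quorumtool -s

# Fencing devices only
sudo pcs stonith status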

I offered the above as a suggestion for a workaround, since the cluster services in an HVM cluster are built on open-source components such as Pacemaker and Corosync.

Given that this is your test cluster, I would view it as a good opportunity to learn more about the underlying open-source components. For production you of course need to stay within the boundaries of the scenarios that HPE supports for HVM, not least to ensure you're not changing settings that conflict with what Morpheus takes control of (although in this case I doubt that would happen).