StoreVirtual Storage

Re: Loss of quorum with a two-node VSA cluster and NFS witness

Valued Contributor

Loss of quorum with a two-node VSA cluster and NFS witness

Hi all,

Just a quick question for the community out there. We have had a number of incidents in which our two-node VSA clusters with an NFS witness lost quorum during cluster upgrades.

Has anyone else seen this behaviour?

We've logged a support job but that hasn't been very helpful so far.

The odd thing is that we can lose a node with no problem and the cluster maintains quorum, yet initiating a cluster upgrade causes quorum to drop.

In the meantime we have started deploying a full FOM, which doesn't have the same issue. It's a bit of a pain, as this requires an additional machine/hypervisor, but at least it is reliable.

This is affecting 12.6 and 12.7 on vSphere, using an NFS witness on the local LAN hosted on a Windows server running NFS services. Also, the MG reports the NFS quorum witness as healthy, and data is being written to the share fine.




Re: Loss of quorum with a two-node VSA cluster and NFS witness

Dear @BenLoveday

Thank you for the detailed question.

I believe this matches the customer advisory below, which says this may happen due to high network latency.

CUSTOMER ADVISORY: HP StoreVirtual 4000 and HP P4000 G2 Storage, HP StoreVirtual VSA and HP LeftHand P4000 Virtual SAN Appliance Software - Operations May Fail Following Quorum Witness Configuration in Two-System Management Groups with High Network Latency:

The Failover Manager is a specialized version of the LeftHand OS software designed to operate as a manager and provide automated failover capability. The Quorum Witness (applicable only to two-node clusters) is an alternative to the Virtual Manager (manual failover) and a low-bandwidth alternative to the Failover Manager (automatic failover).
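
To make the idea of a tie-breaker concrete, here is a minimal sketch of generic strict-majority quorum arithmetic. It is only an illustration of the general principle and is not how LeftHand OS implements the Quorum Witness internally; the function name and the "three votes" model (two nodes plus one tie-breaker) are assumptions made for the example.

```python
def has_quorum(votes_up: int, votes_total: int) -> bool:
    """Return True when a strict majority of the configured votes is reachable.

    Generic illustration only; not LeftHand OS's actual algorithm.
    """
    return votes_up > votes_total // 2


# Hypothetical model: two storage nodes plus one tie-breaker = 3 votes.
print(has_quorum(2, 3))  # one node down, tie-breaker still reachable
print(has_quorum(1, 3))  # one node down and tie-breaker unreachable
print(has_quorum(1, 2))  # two nodes with no tie-breaker, one node down
```

In this model, a two-node group survives the loss of one node only while the tie-breaker remains reachable, which is why a witness or FOM is needed at all.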

Hope this clarifies your query.


I work for HPE.


Valued Contributor

Re: Loss of quorum with a two-node VSA cluster and NFS witness

Hi Vikas,

Appreciate your reply; however, I am aware of this advisory and we have no latency issues on our local LAN. We have been able to replicate this many times across many different sites, and it occurs only with the NFS witness.

The NFS witness works fine normally, and in the event of a node failing, quorum is maintained as expected. However, when we attempt to perform patching/upgrades on the cluster, quorum is lost. Also, the CMC does not report that the NFS witness is lost and still shows the cluster as being healthy.

Our engineers have monitored the network and witness during the upgrade process and not once did the NFS share go offline or have increased latency. It seems to be more of a bug in the LHOS.
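
For anyone wanting to run a similar check during an upgrade window, a minimal sketch of a share probe could look like the following. It simply writes and syncs a small file on the mounted share at a fixed interval and logs how long each write took. It is an illustration only: the mount path, interval, sample count, and warning threshold are all assumptions to adjust for your environment, and it measures latency from whichever host runs the script, not from the VSA nodes themselves.

```python
import os
import tempfile
import time


def probe_write_latency(share_path: str) -> float:
    """Create, write, and fsync a small temp file under share_path.

    Returns the elapsed time in seconds. Raises OSError if the share
    is unavailable or not writable.
    """
    start = time.perf_counter()
    fd, tmp_path = tempfile.mkstemp(prefix="witness_probe_", dir=share_path)
    try:
        os.write(fd, b"probe")
        os.fsync(fd)  # push the write to the server before stopping the clock
    finally:
        os.close(fd)
        os.remove(tmp_path)
    return time.perf_counter() - start


def monitor(share_path, interval_s=1.0, samples=5, warn_s=0.5):
    """Probe the share `samples` times; return latencies (None on failure)."""
    results = []
    for _ in range(samples):
        stamp = time.strftime("%H:%M:%S")
        try:
            latency = probe_write_latency(share_path)
        except OSError as exc:
            latency = None
            print(f"{stamp} probe FAILED: {exc}")
        else:
            flag = " SLOW" if latency > warn_s else ""
            print(f"{stamp} write latency {latency * 1000:.1f} ms{flag}")
        results.append(latency)
        time.sleep(interval_s)
    return results


if __name__ == "__main__":
    # Hypothetical mount point for the witness share; replace with your own.
    monitor("/mnt/nfs_witness", interval_s=1.0, samples=60)
```

Running this for the whole duration of the upgrade gives a timestamped log that can be lined up against the moment quorum drops.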