HPE SimpliVity

Unexpected Downtime

Frequent Advisor

We have a client with a 2-node SVT cluster running 3.7.8 with ESXi 6.5.

We needed to shut down the nodes one at a time to increase their RAM.

Here is what we did and what occurred:

  1. vMotioned all the VMs from node #2 to node #1.
  2. Shut down the OVC on node #2.
  3. Then shut down node #2 via vCenter.
  4. Added the new RAM to node #2.
  5. Powered up node #2.
  6. Waited for all the VMs (at that point on node #1) to go green in vCenter, indicating the data was synced.
  7. Now that node #2 was in full operation, we vMotioned the VMs from node #1 to node #2.
  8. Shut down the OVC on node #1. This completed successfully.
  9. Then shut down node #1 via vCenter.

At this point, we started experiencing issues with the VMs running on node #2. These VMs intermittently stopped responding to our pings, and their IP addresses disappeared from vCenter.

The host (node #1) never finished shutting down, so after waiting a reasonable time we cold-booted it.

While node #1 was rebooting, we noticed the VMs on node #2 beginning to respond again.

After reviewing the VMware event logs, we believe the issues began right after we shut down the OVC on node #1. 
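For anyone wanting to line up ping results against event-log timestamps in a case like this, here is a minimal sketch. It assumes you already have timestamped ping samples per VM (collected however you like); the function names `outage_windows` and `began_after` and the tolerance value are hypothetical and not part of any HPE or VMware tool.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Turn time-ordered (timestamp, reachable) ping samples into outage windows.

    Each window is (first_failed_sample, first_later_success). The end is
    None if the target was still unreachable at the last sample.
    """
    windows = []
    start = None
    for ts, reachable in samples:
        if not reachable and start is None:
            start = ts                      # outage begins
        elif reachable and start is not None:
            windows.append((start, ts))     # outage ends
            start = None
    if start is not None:
        windows.append((start, None))       # still down at end of data
    return windows


def began_after(windows, event_time, tolerance):
    """True if the first outage started within `tolerance` after `event_time`."""
    if not windows:
        return False
    first_start = windows[0][0]
    return event_time <= first_start <= event_time + tolerance
```

Feeding in the time the OVC shutdown task appears in the vCenter event log as `event_time` gives a quick yes/no on whether the first ping loss followed it closely. It only shows correlation in time, not cause.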

We are greatly concerned about this unexpected behavior. We feel this is a routine operation and should not have resulted in any interruptions.

Can anyone explain what we did wrong?





Re: Unexpected Downtime

How did you shut down the OVC? Was it via the safe shutdown option?

Where is the arbiter located?

If the VMs were in HA sync and the arbiter was functioning, then something went wrong and should be investigated by support.
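The two conditions above can be written down as a simple pre-shutdown gate. This is only a sketch of the reasoning, assuming the HA sync state of each VM and the arbiter status are read by hand from the vCenter plug-in and typed in; `VmState` and `safe_to_shut_down_ovc` are hypothetical names and no SimpliVity API is called.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    name: str
    ha_synced: bool  # as observed by the operator, not queried from any API


def safe_to_shut_down_ovc(vms, arbiter_reachable):
    """Return (ok, reasons): ok is True only if every VM is in HA sync
    and the arbiter is functioning. reasons lists what blocked the shutdown."""
    reasons = []
    for vm in vms:
        if not vm.ha_synced:
            reasons.append(f"VM {vm.name} is not in HA sync")
    if not arbiter_reachable:
        reasons.append("arbiter is not reachable")
    return (not reasons, reasons)
```

If the gate passes and an outage still follows, that points at something beyond these two checks, which is where a support case and log analysis come in.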



I am an HPE employee
Frequent Advisor

Re: Unexpected Downtime

It was a normal shutdown using the vCenter plug-in.

The Arbiter is located on a Windows box on the management VLAN.

Someone from HPE mentioned that we didn't wait for HA sync before shutting down the OVC. However, in another post you stated that the OVC will not shut down if HA sync has not completed.

We recently restarted both hosts in another 2-node cluster sequentially, in exactly the same manner, and did not suffer any outage. The only difference is that the other cluster runs 4.0.1 and this one runs 3.7.8.

Respected Contributor

Re: Unexpected Downtime

Hi @tonymcmillan 

Thanks for the additional info. If the arbiter for this cluster is not hosted on the cluster itself, and if all VMs were in HA sync, then in theory, as DaveOB mentioned, there should not have been an issue.

If you would like us to investigate what may have occurred, please create a support ticket, as in-depth log analysis will be required.

