Hello,
In my 3-node Ceph cluster (HVM 1.2 HCI Ceph Cluster on HVM/Ubuntu 24.04, created with version 8.0.8, patched to 8.0.9) I've created some HVM instances with Windows virtual machines (W2022 & W2019). The virtual machines are placed on mvm-volumes and the placement strategy is set to "Auto". The VMs have the VirtIO drivers installed and all received a DHCP IP.
For testing, I cut power to one of the cluster hosts (host3), where one of the VMs was running, via an iLO "cold boot". I expected this VM to fail over to one of the other hosts (host1 or host2) and restart automatically, but that did not happen. After host3 finished rebooting, the VM remained in a "power off" state on the same host3.
Am I missing something?
Did you enable "Automatic power on VMs" in the Cloud and Cluster settings?
--> This is needed for HA (which is what you are testing).
Did you configure heartbeating on one of the cluster datastores?
--> This is needed for fast HA.
Without heartbeating, HA will only be triggered at the next cluster sync (by default every 5 minutes).
With heartbeating, HA will be triggered almost instantly (within seconds).
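To illustrate the general idea behind datastore heartbeating (a generic sketch only, not how HVM actually implements it): each host periodically touches its own file on a shared datastore, and the other hosts treat a stale file as a sign that the host is down and its VMs should be restarted elsewhere. The path, interval, and timeout below are made up for illustration.

```python
import socket
import time
from pathlib import Path

# Generic datastore-heartbeat sketch (illustrative only, not HVM's code).
# Every host touches its own file on the shared datastore; a peer whose file
# goes stale is assumed dead, which is what allows the fast HA restart.

HB_DIR = Path("/mnt/shared-datastore/heartbeats")  # hypothetical shared path
INTERVAL = 5      # seconds between heartbeat writes (assumed)
STALE_AFTER = 30  # seconds without an update before a host counts as down (assumed)

def write_heartbeat(host: str) -> None:
    """Update this host's heartbeat file on the shared datastore."""
    HB_DIR.mkdir(parents=True, exist_ok=True)
    (HB_DIR / host).touch()

def stale_hosts() -> list[str]:
    """Return hosts whose heartbeat file has not been touched recently."""
    now = time.time()
    return [f.name for f in HB_DIR.iterdir() if now - f.stat().st_mtime > STALE_AFTER]

if __name__ == "__main__":
    me = socket.gethostname()
    while True:
        write_heartbeat(me)
        down = [h for h in stale_hosts() if h != me]
        if down:
            print(f"stale heartbeats (HA restart candidates): {down}")
        time.sleep(INTERVAL)
```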
- Automatic power on VMs was already enabled
- Datastore Heartbeating was not enabled
I've enabled heartbeating on both of the cluster's default "directory pool" datastores, "morpheus-cloud-init" and "morpheus-images", but the outcome is still the same. If I "cold boot" host3, all VMs on this host stay on host3 with status "powered off", even after host3 has finished rebooting.
If I do a manual placement to host1 or host2, migration works without problems. If I put host3 in maintenance mode, all VMs are moved to the other hosts.
Is there a log where I can troubleshoot HA issues?
Thank you!
Hello,
I noticed that when I enable "datastore heartbeating", it creates the folder "mvm-hb" on that datastore. I checked all hosts for this folder in /var/morpheus/kvm/images, and the path does not exist on host2, but it does exist on host1 and host3. If I put host2 in maintenance mode and retry the failover (cold boot of host3), it works: all VMs are restarted on host1. So I guess the disk is not correctly mounted on host2.
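In case it helps anyone else, here is a minimal check I would run (assuming passwordless SSH to the hosts; the hostnames and the path come from this thread, everything else is just a sketch) to verify the heartbeat folder exists on every node and to see what is mounted at the datastore path if it doesn't:

```python
import subprocess

# Minimal sketch: verify the "mvm-hb" folder exists on every host and, if not,
# show what backs the datastore path on that host. Assumes passwordless SSH.

HOSTS = ["host1", "host2", "host3"]
HB_DIR = "/var/morpheus/kvm/images/mvm-hb"

for host in HOSTS:
    # 'test -d' exits 0 only if the directory exists on that host
    check = subprocess.run(["ssh", host, "test", "-d", HB_DIR])
    status = "present" if check.returncode == 0 else "MISSING"
    print(f"{host}: {HB_DIR} {status}")
    if check.returncode != 0:
        # Show what (if anything) is mounted at the datastore path
        subprocess.run(["ssh", host, "findmnt", "--target", "/var/morpheus/kvm/images"])
```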
I'm experiencing the same issue as Bosch. When I power off one of the hosts in the cluster, my VM doesn't automatically migrate to another host; it just stays powered off on the same host.
All my VMs are stored on shared NFS storage accessible by all three nodes, but failover doesn’t occur.
I’ve already enabled everything mentioned above. Is there anything I might be missing?