u/UniiMiinD

[Help] How to achieve Instance HA (Masakari) on a 3-Node Hyperconverged cluster? (Kolla-Ansible Pacemaker conflict)

Hi everyone,

I’m looking for some architectural advice. I have 3 powerful bare-metal servers and I want to deploy a highly available OpenStack cloud on them. Because I only have 3 nodes, they need to be hyperconverged (running both Control and Compute services on all 3 nodes).

My primary requirement is Instance HA—if one of the physical nodes suddenly dies, I need the VMs to automatically evacuate and restart on the surviving nodes. Naturally, I looked into Masakari.
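For context, evacuation can only work if the two surviving nodes have enough free capacity for the dead node's VMs. Here is a rough sketch of that check. The RAM figures and the `can_survive_single_failure` helper are hypothetical and only for illustration; it looks at aggregate RAM and ignores CPU, overcommit ratios, and per-VM placement.

```python
def can_survive_single_failure(capacity_gb, used_gb):
    """Return True if, for every node, its VMs' RAM fits into the free RAM
    of the remaining nodes.

    capacity_gb / used_gb: dicts mapping node name -> GB of RAM.
    Aggregate check only; it does not model where each VM would land.
    """
    for failed in capacity_gb:
        free_elsewhere = sum(
            capacity_gb[n] - used_gb[n] for n in capacity_gb if n != failed
        )
        if used_gb[failed] > free_elsewhere:
            return False
    return True


# Hypothetical numbers: three identical 512 GB nodes.
capacity = {"node1": 512, "node2": 512, "node3": 512}
light_load = {"node1": 300, "node2": 300, "node3": 300}
heavy_load = {"node1": 400, "node2": 400, "node3": 400}

print(can_survive_single_failure(capacity, light_load))  # True
print(can_survive_single_failure(capacity, heavy_load))  # False
```

With three equal nodes this works out to keeping each node at or below roughly two thirds of its capacity.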

I am currently using Kolla-Ansible, but I've hit an architectural roadblock:

  • Masakari's host-monitor relies on Pacemaker/Corosync to detect host failures.
  • In Kolla, Controller nodes run the full pacemaker service, while Compute nodes run pacemaker_remote.
  • Because my nodes are both Control and Compute, Kolla-Ansible tries to deploy both pacemaker and pacemaker_remote on the same host. The two roles conflict, which breaks the deployment and the host monitoring.
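To make the conflict concrete, here is a minimal sketch that finds hosts landing in both Pacemaker groups. It assumes the groups are named `hacluster` and `hacluster-remote` and are children of `control` and `compute`, which is how I understand the stock multinode inventory (please check against your own file). The parser is a simplified stand-in for Ansible's real inventory loader, and the node names are placeholders.

```python
import re

# Assumed inventory layout; group names should be verified against your own file.
INVENTORY = """
[control]
node1
node2
node3

[compute]
node1
node2
node3

[hacluster:children]
control

[hacluster-remote:children]
compute
"""


def parse_groups(inventory_text):
    """Parse a minimal INI-style inventory into {section name: [entries]}."""
    groups, current = {}, None
    for raw in inventory_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        header = re.fullmatch(r"\[([^\]]+)\]", line)
        if header:
            current = header.group(1)
            groups.setdefault(current, [])
        elif current is not None:
            # Keep only the host/group name, ignoring inline host variables.
            groups[current].append(line.split()[0])
    return groups


def resolve_hosts(groups, name, seen=None):
    """Expand a group to its hosts, following [name:children] sections."""
    if seen is None:
        seen = set()
    if name in seen:  # guard against cyclic group definitions
        return set()
    seen.add(name)
    hosts = set(groups.get(name, []))
    for child in groups.get(name + ":children", []):
        hosts |= resolve_hosts(groups, child, seen)
    return hosts


groups = parse_groups(INVENTORY)
overlap = resolve_hosts(groups, "hacluster") & resolve_hosts(groups, "hacluster-remote")
print(sorted(overlap))  # ['node1', 'node2', 'node3']
```

On a hyperconverged layout every node shows up in the overlap, which is exactly the situation I am stuck in.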

I am open to any changes necessary to get this working. My questions for the community are:

  1. Is there a clean workaround in Kolla-Ansible for this? Has anyone successfully deployed Masakari on hyperconverged nodes using Kolla?
  2. Alternative Masakari Drivers: I’ve read that Masakari can technically use Consul or direct libvirt polling instead of Pacemaker. Is it worth trying to hack Kolla to use Consul + external IPMI fencing scripts, or is that a maintenance nightmare?
  3. Different Deployment Tools: Do other deployment tools (like OpenStack-Ansible, Kolla-K8s, or Canonical/Sunbeam) handle Instance HA on hyperconverged nodes better than Kolla-Ansible?
  4. The Proxmox Route: Would it be better to just install Proxmox on the bare metal for node-level HA, and run the OpenStack Control and Compute nodes as VMs on top? (I'm worried about the nested virtualization performance penalty here.)

Any advice, documentation, or reality-checks would be hugely appreciated. Thanks in advance!

u/UniiMiinD — 6 days ago