Techniques are disclosed for maintaining high availability (HA) for virtual machines (VMs) running on host systems of a host cluster, where each host system executes an HA module in a plurality of HA modules and a storage module in a plurality of storage modules, where the host cluster aggregates, via the plurality of storage modules, locally-attached storage resources of the host systems to provide an object store, where persistent data for the VMs is stored as per-VM storage objects across the locally-attached storage resources comprising the object store, and where a failure causes the plurality of storage modules to observe a network partition in the host cluster that the plurality of HA modules do not observe.
In one embodiment, a host
system in the host cluster executing a first HA module invokes an API exposed by the plurality of storage modules for persisting metadata for a VM to the object store. If the API invocation is not processed successfully, the host system: (1) identifies a subset of second HA modules in the plurality of HA modules; (2) issues an accessibility query for the VM to the subset of second HA modules in parallel, the accessibility query being configured to determine whether the VM is accessible to the respective host systems of the subset of second HA modules; and (3) if at least one second HA module in the subset indicates that the VM is accessible to its respective host system, transmits a command to the at least one second HA module to invoke the API on its respective host system.
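
The fallback described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names HAModule, PersistError, persist_metadata, is_vm_accessible, and persist_with_fallback are hypothetical, the storage API is simulated in memory, the "subset" is assumed to be every other HA module, and trying responders in order until one succeeds is an assumed policy.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


class PersistError(Exception):
    """Raised by the simulated storage API when metadata cannot be persisted."""


@dataclass
class HAModule:
    """Hypothetical stand-in for the HA module running on one host system."""
    host: str
    accessible_vms: Set[str] = field(default_factory=set)
    persisted: Dict[str, dict] = field(default_factory=dict)

    def persist_metadata(self, vm_id: str, metadata: dict) -> None:
        # Simulated storage-module API: fails when this host cannot reach
        # the VM's storage object (e.g. it is on the wrong side of a partition).
        if vm_id not in self.accessible_vms:
            raise PersistError(f"{self.host} cannot access {vm_id}")
        self.persisted[vm_id] = metadata

    def is_vm_accessible(self, vm_id: str) -> bool:
        # Simulated accessibility query answered by this HA module.
        return vm_id in self.accessible_vms


def persist_with_fallback(first: HAModule, modules: List[HAModule],
                          vm_id: str, metadata: dict) -> Optional[str]:
    """Return the host that persisted the metadata, or None if none could."""
    try:
        first.persist_metadata(vm_id, metadata)
        return first.host
    except PersistError:
        pass  # API call failed locally; fall through to the peer-based path

    # (1) Identify a subset of second HA modules (assumption: all other modules).
    subset = [m for m in modules if m is not first]
    if not subset:
        return None

    # (2) Issue the accessibility query to the subset in parallel.
    with ThreadPoolExecutor(max_workers=len(subset)) as pool:
        answers = list(pool.map(lambda m: m.is_vm_accessible(vm_id), subset))

    # (3) Command a module that reported the VM accessible to invoke the API.
    for peer, accessible in zip(subset, answers):
        if accessible:
            try:
                peer.persist_metadata(vm_id, metadata)
                return peer.host
            except PersistError:
                continue  # assumed policy: try the next responder
    return None
```

In this sketch, a host that cannot reach a VM's storage object delegates the write to a peer that can, which is the behavior the abstract attributes to the first HA module when the storage modules observe a partition that the HA modules do not.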