Proxmox VE has shipped one of those updates that may make little noise outside the virtualization world but can significantly change day-to-day cluster operations. The pve-manager 9.1.8 update, together with pve-ha-manager 5.2.0, introduces a new scheduling mode called Dynamic Load and an automatic rebalancing option for HA-managed resources. It is not a clone of VMware DRS, but it closes a very specific gap: a cluster that stays unbalanced after a failure or after new capacity is added.

Until now, Proxmox high availability behavior was effective but fairly direct. If a node failed, HA restarted the affected virtual machines or containers on the surviving cluster members. The problem came later. Once the failed node returned to the cluster, workloads did not automatically move back or redistribute themselves. The administrator had to migrate them manually, rely on scripts, or accept that one node remained underused while the others continued carrying more work than necessary.

What Dynamic Load Adds to Proxmox HA

The new feature sits in the cluster resource scheduler, under Datacenter options. Proxmox now adds a smarter layer for deciding where to place HA workloads and, if automatic rebalancing is enabled, when to move them to improve the overall cluster distribution.

This does not turn Proxmox into a platform with full VMware-style DRS. It is more accurate to describe it as load-balanced HA. The system observes CPU and memory usage, compares the state of the nodes and decides whether a migration would improve the overall balance. It does not try to predict future load or model complex long-term scenarios. It works with the current state and with the thresholds configured by the administrator.
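As a rough illustration of what "comparing the state of the nodes" can mean, the sketch below reduces each node to a single load score and measures how uneven those scores are across the cluster. The scoring function, the choice of the coefficient of variation as the imbalance measure, and the node names and figures are all assumptions made for this example; the formula Proxmox actually uses is not described here.

```python
from statistics import mean, pstdev


def node_load(cpu_frac: float, mem_frac: float) -> float:
    """Combine CPU and memory usage (each 0.0 to 1.0) into one score.

    Taking the larger of the two is an illustrative choice, not
    necessarily what the Proxmox scheduler does.
    """
    return max(cpu_frac, mem_frac)


def cluster_imbalance(loads: list[float]) -> float:
    """Spread of node loads relative to their average.

    0.0 means every node carries the same load; larger values mean
    a more uneven cluster.
    """
    avg = mean(loads)
    if avg == 0:
        return 0.0
    return pstdev(loads) / avg


# Hypothetical three-node cluster: (cpu usage, memory usage) per node.
nodes = {
    "pve1": (0.80, 0.70),
    "pve2": (0.75, 0.85),
    "pve3": (0.10, 0.15),  # e.g. a node that has just rejoined
}

loads = [node_load(cpu, mem) for cpu, mem in nodes.values()]
print(f"imbalance: {cluster_imbalance(loads):.3f}")
```

A scheduler built on a measure like this only needs the current readings, which matches the point above: it reacts to the present state and does not forecast.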

That distinction matters. VMware DRS has spent years acting as a continuous optimizer, with reservations, limits, affinity policies, automation and recommendations in environments that can scale to thousands of virtual machines. Proxmox takes a simpler and more transparent approach. For many three-, five- or seven-node clusters, that simplicity is not a serious limitation. It is exactly what was needed: the system no longer depends on a person to redistribute workloads sensibly.

| Scheduling mode | What it does |
| --- | --- |
| Default | Keeps the historical behavior, without considering node load |
| Basic | Distributes resources based on the number of HA workloads per node |
| Static-Load | Uses configured maximum CPU and memory quotas to decide placement |
| Dynamic-Load | Uses real observed CPU and memory metrics from the nodes |
| Automatic Rebalance | Allows existing HA workloads to be moved if it improves cluster balance |
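To make the difference between the modes concrete, here is a minimal sketch in which the same three nodes produce a different placement choice depending on what the scheduler looks at. The `Node` fields, the three `pick_*` helpers and all figures are hypothetical, and only memory is considered to keep the example short; the real scheduler also takes CPU into account.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    ha_count: int              # HA resources currently placed on the node
    configured_mem_gib: float  # sum of the guests' configured maximum memory
    used_mem_gib: float        # memory the guests are actually using
    total_mem_gib: float       # physical memory of the node


def pick_basic(nodes: list[Node]) -> Node:
    """Count-based: choose the node with the fewest HA resources."""
    return min(nodes, key=lambda n: n.ha_count)


def pick_static(nodes: list[Node]) -> Node:
    """Static load: choose by configured maximums, ignoring real usage."""
    return min(nodes, key=lambda n: n.configured_mem_gib / n.total_mem_gib)


def pick_dynamic(nodes: list[Node]) -> Node:
    """Dynamic load: choose by observed usage."""
    return min(nodes, key=lambda n: n.used_mem_gib / n.total_mem_gib)


nodes = [
    Node("pve1", ha_count=2, configured_mem_gib=96, used_mem_gib=20, total_mem_gib=128),
    Node("pve2", ha_count=5, configured_mem_gib=40, used_mem_gib=38, total_mem_gib=128),
    Node("pve3", ha_count=4, configured_mem_gib=64, used_mem_gib=10, total_mem_gib=128),
]

print("basic   ->", pick_basic(nodes).name)
print("static  ->", pick_static(nodes).name)
print("dynamic ->", pick_dynamic(nodes).name)
```

With these made-up numbers each strategy selects a different node, which is why the choice of mode matters when guests are sized generously but use little, or the reverse.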

In practice, the clearest use case is a node failure. Previously, the cluster recovered, but the distribution remained uneven. Now, with the right scheduler mode and automatic rebalancing enabled, Proxmox can move workloads back or redistribute them across nodes when the calculation shows that migration improves the overall state.

It is also useful when expanding a cluster. If a new node is added, administrators no longer need to wait for future workloads to use it or move virtual machines manually. The system can start using it for HA-managed resources and migrate existing workloads when doing so improves balance.

How It Avoids Becoming a Migration Storm

The main concern with any automatic rebalancing system is that it may start moving virtual machines endlessly. A small load variation triggers one migration, another node becomes slightly busier as a result, and another migration follows. In virtualization, bad automation can be worse than no automation.

Proxmox tries to avoid this through several controls. The imbalance threshold defines how much imbalance must exist before the scheduler even considers acting. The minimum imbalance improvement defines how much better the cluster must look after a proposed migration before it is allowed. The hold duration introduces a minimum interval so that the same workload is not moved repeatedly in a short period.
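The way these three controls interact can be sketched as a single gate that a proposed migration must pass. This is a simplified model and not the Proxmox implementation: the `RebalancePolicy` field names are invented for the example, and treating the minimum improvement as an absolute difference in imbalance is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RebalancePolicy:
    imbalance_threshold: float  # do nothing while imbalance is at or below this
    min_improvement: float      # required drop in imbalance (assumed absolute)
    hold_seconds: float         # minimum time between moves of one resource


def should_migrate(
    policy: RebalancePolicy,
    current_imbalance: float,
    projected_imbalance: float,
    seconds_since_last_move: float,
) -> bool:
    """Return True only if a proposed migration passes all three controls."""
    # 1. The cluster must be uneven enough to be worth acting on.
    if current_imbalance <= policy.imbalance_threshold:
        return False
    # 2. The move must improve the picture by a meaningful amount.
    if current_imbalance - projected_imbalance < policy.min_improvement:
        return False
    # 3. The same resource must not have been moved too recently.
    if seconds_since_last_move < policy.hold_seconds:
        return False
    return True


policy = RebalancePolicy(imbalance_threshold=0.3, min_improvement=0.1, hold_seconds=600)

# Clearly uneven cluster, large gain, resource untouched for an hour.
print(should_migrate(policy, 0.5, 0.2, 3600))
# Same gain, but the resource was moved two minutes ago.
print(should_migrate(policy, 0.5, 0.2, 120))
```

Raising any of the three values makes the rebalancer more conservative, which is the tuning lever discussed in the next paragraph.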

This design makes sense because it lets administrators tune how aggressive the system should be. In a small cluster with a few critical workloads, a slightly more reactive behavior may be desirable. In an environment with many virtual machines, sensitive shared storage or Ceph, a more conservative profile may be better to avoid operational noise, unnecessary network traffic or migrations at inconvenient times.

The feature also respects affinity rules. If two resources have positive affinity and should run together, the scheduler treats them as one unit for load calculations and migration decisions. If negative affinity rules exist to keep workloads apart, the scheduler respects them when deciding placement. This matters because many HA architectures depend on separating redundant components, controllers, databases or services that should not run on the same node.
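A small sketch of how affinity can constrain a rebalancing decision, under stated assumptions: positive-affinity groups are treated as one bundle that moves together, and a node is ruled out as a target if it hosts a resource that the bundle must be kept apart from. The function names, the data layout and the resource IDs are hypothetical, not the scheduler's internal structures.

```python
def bundle_of(resource: str, positive_groups: list[set[str]]) -> set[str]:
    """Return every resource that must move together with `resource`."""
    for group in positive_groups:
        if resource in group:
            return set(group)
    return {resource}


def allowed_nodes(
    bundle: set[str],
    placement: dict[str, str],
    negative_groups: list[set[str]],
    nodes: list[str],
) -> list[str]:
    """Return the nodes that may host the bundle without breaking a keep-apart rule."""
    blocked: set[str] = set()
    for group in negative_groups:
        if bundle & group:
            # Any node hosting another member of this group is off limits.
            for other in group - bundle:
                if other in placement:
                    blocked.add(placement[other])
    return [node for node in nodes if node not in blocked]


nodes = ["pve1", "pve2", "pve3"]
placement = {"vm:100": "pve1", "vm:101": "pve1", "vm:200": "pve2", "vm:201": "pve3"}
positive_groups = [{"vm:100", "vm:101"}]   # must run together
negative_groups = [{"vm:100", "vm:200"}]   # must run apart

bundle = bundle_of("vm:101", positive_groups)
print(sorted(bundle))
print(allowed_nodes(bundle, placement, negative_groups, nodes))
```

Here `vm:101` has no keep-apart rule of its own, yet it cannot go to `pve2` because it travels with `vm:100`, which does.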

There is also an upcoming improvement that points to finer control: the auto-rebalance property per HA resource. The idea, already being worked on in Proxmox patches, is to let administrators define which resources may or may not be candidates for automatic migration during rebalancing. By default, the value is expected to be enabled, but disabling it for a specific VM would prevent that VM from being moved by the automatic rebalancer.
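The intended effect of such a per-resource switch can be pictured as a simple filter applied before any migration is considered. Because the property is still in development, the `auto_rebalance` key and the dictionary layout below are placeholders for illustration only, not the final configuration syntax.

```python
# Hypothetical per-resource settings; a missing key means "use the default".
resources = {
    "vm:100": {"auto_rebalance": True},
    "vm:101": {"auto_rebalance": False},  # pinned: never moved by the rebalancer
    "ct:200": {},
}


def rebalance_candidates(resources: dict[str, dict]) -> list[str]:
    """Return the resources the automatic rebalancer is allowed to move.

    The default is assumed to be enabled, as the patches are expected to do.
    """
    return [sid for sid, cfg in resources.items() if cfg.get("auto_rebalance", True)]


print(rebalance_candidates(resources))
```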

Why This Matters for VMware Migrations

The absence of DRS has been one of the recurring arguments against Proxmox in VMware migration discussions. It was not always decisive, but it came up often. For large environments with thousands of virtual machines and advanced capacity policies, DRS still offers more depth. For medium-sized, departmental, managed-service or private-cloud clusters, the need was more basic: Proxmox had to understand real load and rebalance without constant manual intervention.

This update changes that conversation. It is no longer accurate to say that Proxmox has no automatic balancing mechanism for HA workloads. It does, although with a more limited scope than DRS. The right question becomes different: does the organization really need a complex predictive optimizer, or is load-aware placement and automatic HA rebalancing enough?

For many scenarios, the second option will be sufficient. A Proxmox cluster with HA, Ceph or shared storage, integrated backups, Veeam support and now workload rebalancing covers more and more use cases that previously felt more comfortable on vSphere because of operational maturity. It does not remove the need for good design, proper sizing and failure testing, but it reduces manual work in daily operations.

It also fits the Proxmox philosophy: adding useful capabilities without building an excessively opaque layer. An administrator can understand why a workload moved: there was an imbalance, the improvement exceeded the configured minimum, and the resource was not blocked by rules or conditions. That operational traceability has value, especially for smaller teams that do not want to argue with a hard-to-explain scheduling “brain”.

What This Feature Is Not

The new rebalancing feature does not replace good architecture. If a cluster is poorly sized, if virtual machines are oversized, if storage is the bottleneck or if the migration network is not ready, the scheduler will not perform miracles. It will move workloads between nodes, but it will not fix a bad procurement decision or a design with no growth margin.

It is also not a capacity planning tool. It does not predict whether the cluster will run out of RAM, CPU or IOPS in six months. That still requires historical metrics, observability, planning and regular review. The rebalancer works on the present, not on a future demand model.

Another limitation is that it applies within each cluster. Proxmox Datacenter Manager can provide visibility and management across clusters, but this feature does not turn multiple clusters into a single global pool capable of rebalancing workloads across sites. For most medium-sized organizations, that will not be a problem, although it will matter for more complex distributed environments.

Administrators should also review versions before enabling anything. In some repositories, the pve-manager interface exposed options before all required components were available in the same branch. The feature depends on compatible versions of pve-ha-manager and other cluster packages, so the sensible approach is to check with pveversion -v, read the update notes and test in a controlled environment.
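One way to make that version check repeatable is to parse the `pveversion -v` output and compare it against the package versions named in this article. The sketch below treats pve-manager 9.1.8 and pve-ha-manager 5.2.0 as minimums, which is an assumption based on the versions mentioned above, and the sample text is illustrative rather than captured from a real node.

```python
import re

# Versions named in the article, treated here as minimums (an assumption).
REQUIRED = {
    "pve-manager": (9, 1, 8),
    "pve-ha-manager": (5, 2, 0),
}


def parse_pveversion(output: str) -> dict[str, tuple[int, ...]]:
    """Extract 'package: x.y.z' pairs from `pveversion -v` style output."""
    versions: dict[str, tuple[int, ...]] = {}
    for line in output.splitlines():
        match = re.match(r"^([\w.+-]+):\s+(\d+(?:\.\d+)*)", line)
        if match:
            name, dotted = match.groups()
            versions[name] = tuple(int(part) for part in dotted.split("."))
    return versions


def outdated(versions: dict[str, tuple[int, ...]], required=REQUIRED) -> list[str]:
    """Return the required packages that are missing or below the minimum."""
    return [
        name
        for name, minimum in required.items()
        if versions.get(name, (0,)) < minimum
    ]


# Illustrative sample, not real output.
SAMPLE = """\
pve-manager: 9.1.8 (running version: 9.1.8/0123456789abcdef)
pve-ha-manager: 5.1.0
"""

versions = parse_pveversion(SAMPLE)
print("needs attention:", outdated(versions))
```

On a real node the input would come from running the command, for example with `subprocess.run(["pveversion", "-v"], capture_output=True, text=True, check=True).stdout`, and the check should be repeated on every cluster member since packages can differ between nodes.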

For production, the reasonable recommendation is not to enable it on a Friday afternoon. First, test it with representative workloads, simulate node failure and recovery, observe migration behavior, adjust thresholds and validate the impact on network, storage and services. The migrations are real and can affect sensitive applications, even if the system is designed to behave conservatively.

The update confirms a broader trend: Proxmox is maturing quickly as an enterprise virtualization platform. Not by copying VMware feature by feature, but by closing practical gaps that made production adoption harder. Automatic HA rebalancing is not the final word in resource scheduling, but it is an important improvement for organizations that want to operate Proxmox clusters with less manual intervention and more confidence after failures, expansions or workload changes.

Frequently Asked Questions

Does Proxmox now have DRS like VMware?

Not exactly. Proxmox now includes automatic HA rebalancing and load-aware scheduling, but not a full equivalent of VMware's predictive DRS. For many medium-sized clusters, however, it may solve the most common operational problem.

What is Dynamic Load in Proxmox?

Dynamic Load is a scheduling mode that uses real CPU and memory metrics to decide placement and rebalancing of HA resources inside the cluster.

Does rebalancing move all virtual machines?

No. It applies to HA-managed resources and respects affinity rules. Proxmox is also working on an auto-rebalance option to control which resources may be moved automatically.

Should it be enabled directly in production?

The cautious approach is to test it first in a representative environment. Thresholds, migration impact, storage behavior and affinity rules should be validated before allowing it to act on critical workloads.
