Workload Optimization
How DevZero generates and applies workload optimization recommendations -- rightsizing, replica scaling, and live migration.
Reducing sampling rates in the zxporter settings gives DevZero less usage data to work with, which will reduce the efficiency and effectiveness of its recommendations.
Recommendation Modes
Different workloads have different risk tolerances, traffic profiles, and scaling behavior. Recommendation modes allow you to choose how aggressive or conservative the system should be when reducing resources.
In automated mode, the write operator currently applies only the recommended requests and leaves limits unchanged. This is a reliability measure.
| Mode | Requests | Limits | Notes |
|---|---|---|---|
| Balanced | Set to max observed usage; reductions are capped at 50% of current requests. | Set to 75% of current limits, never below the new request. | Default. Recommended for most workloads. |
| Aggressive | Set to the P90 of observed usage (current and historical). | Set to the greater of 1.5x max observed usage or 75% of current limits. | Backed by reinforcement learning. |
| Conservative | Set to 1.2x max observed usage. | Left unchanged from current values. | For critical or stateful workloads. |
Example
For a workload with peak observed usage of 4 cores and 12 Gi over the past 12 hours -- currently requesting 9 cores and 32 Gi, with limits set to 14 cores and 48 Gi:
Balanced mode:
- CPU Requests: 4 cores (max observed usage)
- Memory Requests: 12 Gi (max observed usage)
- CPU Limits: 10.5 cores (75% of current 14-core limit)
- Memory Limits: 36 Gi (75% of current 48 Gi limit)
Requests are set to the maximum observed usage (P100). If this would reduce requests by more than 50%, the cut is capped at 50% of current requests. Limits are set to 75% of current values, but always >= the recommended requests.
Conservative mode:
- CPU Requests: 4.8 cores (1.2x max usage)
- Memory Requests: 14.4 Gi (1.2x max usage)
- CPU Limits: 14 cores (unchanged)
- Memory Limits: 48 Gi (unchanged)
Requests are set to 1.2x the max observed usage, providing extra headroom. Limits are left unchanged.
Aggressive mode:
- CPU Requests: 3.2 cores (P90 of usage)
- Memory Requests: 9 Gi (P90 of usage)
- CPU Limits: 6 cores
- Memory Limits: 18 Gi
Requests are based on the P90 percentile of observed usage. Limits are set to the greater of 1.5x max usage or 75% of current limits. Optimized for cost, assuming workloads can tolerate throttling.
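The Balanced and Conservative rules from the table can be sketched in a few lines of Python. This is an illustrative reading of the documented rules, not DevZero's implementation: the `Resources` type, the function names, and the sample workload are hypothetical, and the 50% cap is assumed to mean that a new request never falls below half of the current request. Aggressive mode is left out because it depends on the usage distribution (P90) and a learned component that the table does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A request/limit pair for one resource (cores or Gi). Hypothetical type."""
    request: float
    limit: float


def balanced(current: Resources, max_usage: float) -> Resources:
    """Balanced: follow max observed usage, but cap the cut at 50%."""
    # Assumption: "capped to avoid more than 50% drop" means the new request
    # is never lower than half of the current request.
    request = max(max_usage, 0.5 * current.request)
    # Limits drop to 75% of the current limit, never below the new request.
    limit = max(0.75 * current.limit, request)
    return Resources(request, limit)


def conservative(current: Resources, max_usage: float) -> Resources:
    """Conservative: 1.2x max observed usage, limits left unchanged."""
    return Resources(1.2 * max_usage, current.limit)


if __name__ == "__main__":
    # Hypothetical workload: requesting 8 cores with a 12-core limit.
    cpu = Resources(request=8.0, limit=12.0)
    print(balanced(cpu, max_usage=6.0))      # 25% cut, request follows usage
    print(balanced(cpu, max_usage=2.0))      # 75% cut would exceed the cap
    print(conservative(cpu, max_usage=6.0))  # extra headroom, limit untouched
```

Each function handles one resource at a time, so CPU and memory would be computed with separate calls.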
Replica Count Adjustments (HPA-Aware)
In some cases, DevZero will recommend adjusting the replica count if a workload is significantly over-provisioned.
Applies only when:
- The workload has more than one replica
- CPU, GPU, or network bandwidth data is available
Mode multipliers:
- Aggressive: 1.0 (assumes optimal resource use)
- Balanced: 1.5 (allows a buffer)
- Conservative: 2.0 (assumes higher future demand)
Node Recommendations
Running multiple node autoscalers in the same cluster at the same time is not recommended.
Node recommendations consider:
- Instance availability in the region and availability zone
- Current shape of node groups (CPU/Memory/GPU/network bandwidth)
- Taints, tolerations, and affinity/anti-affinity rules
- Cloud provider pricing
- Number of pods running on each node and their utilization
Live Migration
Live migration allows you to checkpoint running workloads and restore them on different nodes without losing application state.
Install the agent and scheduler alongside the DevZero Workload Operator. Navigate to your cluster's Overview page in the dashboard, click Operators, and select Workload Operator to get the pre-configured Helm install command.
Label nodes that support live migration:
kubectl label node <node-name> dakr.devzero.io/checkpoint-node=true
Validate the label and ensure the containerd shim is present:
kubectl get nodes -l "dakr.devzero.io/checkpoint-node=true"
kubectl logs daemonset/dakr-dakr-operator-agent -n dakr-operator -c installer
Run workloads on labeled nodes (set nodeSelector to target them).
In the dashboard, apply workload recommendations that have live migration enabled.
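For the step that runs workloads on labeled nodes, a pod template can target those nodes with a nodeSelector on the label applied earlier. The fragment below is a minimal sketch: the Deployment name, app label, and image are placeholders, and only the nodeSelector key and value come from the labeling command above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app         # hypothetical label
  template:
    metadata:
      labels:
        app: example-app
    spec:
      nodeSelector:
        # Label values are strings, so "true" must be quoted.
        dakr.devzero.io/checkpoint-node: "true"
      containers:
        - name: app
          image: registry.example.com/example-app:1.0   # placeholder image
```

Pods created from this template are only scheduled onto nodes that carry the checkpoint label.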