Workload Rule
Manage per-workload vertical and horizontal scaling rules with Terraform.
The devzero_workload_rule resource configures vertical and horizontal scaling for a specific Kubernetes workload (CPU, memory, GPU, and HPA rules), along with emergency response and per-container overrides.
resource "devzero_workload_rule" "auto" {
  cluster_id    = "<YOUR_CLUSTER_ID>"
  namespace     = "production"
  kind          = "Deployment"
  name          = "my-api"
  auto_generate = true
}
resource "devzero_workload_rule" "manual" {
  cluster_id = "<YOUR_CLUSTER_ID>"
  namespace  = "production"
  kind       = "Deployment"
  name       = "my-api"

  action_triggers    = ["on_schedule", "on_detection"]
  cron_schedule      = "0 2 * * *"
  detection_triggers = ["pod_creation", "pod_update", "pod_evict"]

  cpu_rule = {
    enabled                   = true
    min_request               = 10
    max_request               = 32000
    target_percentile         = 0.95
    limits_adjustment_enabled = true
    limit_multiplier          = 1.0
  }

  memory_rule = {
    enabled                   = true
    min_request               = 67108864    # 64Mi in bytes
    max_request               = 68719476736 # 64Gi in bytes
    target_percentile         = 0.95
    limits_adjustment_enabled = true
  }

  hpa_rule = {
    enabled            = true
    min_replicas       = 1
    max_replicas       = 10
    target_utilization = 0.70
    primary_metric     = "cpu"
  }

  emergency_response = {
    oom_enabled               = true
    oom_memory_multiplier     = 1.5
    cpu_throttling_enabled    = true
    cpu_throttling_threshold  = 0.20
    cpu_throttling_multiplier = 1.25
  }

  live_migration_enabled        = false
  use_in_place_vertical_scaling = false
}
resource "devzero_workload_rule" "per_container" {
  cluster_id = "<YOUR_CLUSTER_ID>"
  namespace  = "production"
  kind       = "Deployment"
  name       = "my-multi-container-app"

  action_triggers    = ["on_detection"]
  detection_triggers = ["pod_creation"]

  containers = [
    {
      container_name = "app"
      cpu_rule = {
        enabled     = true
        min_request = 10
        max_request = 32000
      }
      memory_rule = {
        enabled     = true
        min_request = 67108864    # 64Mi in bytes
        max_request = 68719476736 # 64Gi in bytes
      }
    },
    {
      container_name = "sidecar"
      cpu_rule = {
        enabled     = true
        max_request = 500
      }
    }
  ]
}

| Parameter | Type | Description |
| --- | --- | --- |
| `cluster_id` | `string` | ID of the cluster this rule targets |
| `kind` | `string` | Kubernetes workload kind |
| `name` | `string` | Name of the Kubernetes workload |
| `namespace` | `string` | Kubernetes namespace of the workload |

| Parameter | Type | Description |
| --- | --- | --- |
| `action_triggers` | `list(string)` | When to apply recommendations: `"on_detection"`, `"on_schedule"` |
| `auto_generate` | `bool` | When true, the engine generates all rule fields automatically; manual overrides are ignored |
| `containers` | `list(object)` | Per-container rule configurations (see Containers) |
| `cpu_rule` | `object` | CPU vertical scaling rule (see Resource Rule) |
| `cron_schedule` | `string` | 5-field UTC cron expression for scheduled application |
| `defragmentation_schedule` | `string` | Cron expression for node defragmentation |
| `detection_triggers` | `list(string)` | Events that trigger a recommendation: `"pod_creation"`, `"pod_update"`, `"pod_evict"` |
| `emergency_response` | `object` | Emergency response for OOM and CPU throttle events (see Emergency Response) |
| `gpu_rule` | `object` | GPU vertical scaling rule (see Resource Rule) |
| `hpa_rule` | `object` | Horizontal scaling rule (see HPA Rule) |
| `live_migration_enabled` | `bool` | Allow live pod migration when applying recommendations |
| `memory_rule` | `object` | Memory vertical scaling rule (see Resource Rule) |
| `scheduler_plugins` | `list(string)` | Kubernetes scheduler plugins to activate |
| `use_in_place_vertical_scaling` | `bool` | Use in-place pod vertical scaling instead of pod restarts |

| Attribute | Type | Description |
| --- | --- | --- |
| `id` | `string` | Unique identifier of the workload rule |
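
The `id` attribute can be read like any other Terraform resource attribute. As a minimal sketch, the output below exposes the identifier of the `auto` rule declared in the first example on this page; the output name `workload_rule_id` is a hypothetical choice, not something the provider defines.

```hcl
# Hypothetical output name; refers to the "auto" resource from the first example.
output "workload_rule_id" {
  description = "Unique identifier of the auto-generated workload rule"
  value       = devzero_workload_rule.auto.id
}
```

After `terraform apply`, `terraform output workload_rule_id` prints the value.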

Resource Rule

Used by cpu_rule, memory_rule, and gpu_rule at both workload and per-container level.

| Parameter | Type | Description |
| --- | --- | --- |
| `enabled` | `bool` | Enable this resource axis rule |
| `min_request` | `number` | Minimum resource request (millicores for CPU, bytes for memory/GPU) |
| `max_request` | `number` | Maximum resource request (millicores for CPU, bytes for memory/GPU) |
| `target_percentile` | `number` | Percentile of usage data used as the recommendation target (0–1) |
| `limits_adjustment_enabled` | `bool` | Whether to also adjust resource limits alongside requests |
| `limits_removal_enabled` | `bool` | Actively remove limits from workloads |
| `limit_multiplier` | `number` | Multiplier applied to the request to derive the resource limit |
| `max_scale_up_percent` | `number` | Maximum percentage increase allowed in a single cycle |
| `max_scale_down_percent` | `number` | Maximum percentage decrease allowed in a single cycle |

max_scale_up_percent and max_scale_down_percent are available on workload-level cpu_rule, memory_rule, and gpu_rule but not on per-container rules.
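
The following sketch shows one way a workload-level rule might combine request bounds with per-cycle scaling caps. It makes several assumptions: the resource label `bounded` and the locals `mib` and `gib` are hypothetical helpers, and the percent fields are assumed to use a 0–100 scale, which the table above does not specify. Check the provider schema before relying on these values.

```hcl
# Hypothetical helper values so byte quantities stay readable.
locals {
  mib = 1024 * 1024
  gib = 1024 * local.mib
}

resource "devzero_workload_rule" "bounded" {
  cluster_id = "<YOUR_CLUSTER_ID>"
  namespace  = "production"
  kind       = "Deployment"
  name       = "my-api"

  action_triggers    = ["on_detection"]
  detection_triggers = ["pod_creation"]

  cpu_rule = {
    enabled                = true
    min_request            = 100  # millicores
    max_request            = 4000 # millicores
    target_percentile      = 0.95
    max_scale_up_percent   = 50 # assumption: 0-100 scale
    max_scale_down_percent = 20 # assumption: 0-100 scale
  }

  memory_rule = {
    enabled                   = true
    min_request               = 64 * local.mib # 67108864 bytes
    max_request               = 4 * local.gib  # 4294967296 bytes
    limits_adjustment_enabled = true
    limit_multiplier          = 1.5 # limit derived as request * 1.5
  }
}
```

Computing byte values from locals avoids hand-typed constants such as 68719476736, which are easy to get wrong by a digit.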

HPA Rule

| Parameter | Type | Description |
| --- | --- | --- |
| `enabled` | `bool` | Enable horizontal (replica) scaling |
| `min_replicas` | `number` | Minimum number of replicas |
| `max_replicas` | `number` | Maximum number of replicas |
| `primary_metric` | `string` | Primary metric: `"cpu"`, `"memory"`, `"gpu"`, `"network_ingress"`, `"network_egress"` |
| `target_utilization` | `number` | Target CPU utilization ratio (0–1) |
| `target_memory_utilization` | `number` | Target memory utilization ratio (0–1), tuned independently of CPU |
| `max_replica_change_percent` | `number` | Maximum percentage change in replica count per cycle |
| `composite_formula` | `string` | Formula combining multiple metric weights. Example: `"0.6*cpu + 0.4*memory"` |
| `metrics` | `list(object)` | Additional metric triggers (see HPA Metrics) |
| `behavior` | `object` | Fine-grained scale-up/scale-down policies (see HPA Behavior) |
| `fallback` | `object` | Replica fallback when metrics are unavailable (see HPA Fallback) |

CPU/Memory/Network triggers are auto-generated from primary_metric. Use metrics for additional sources such as Prometheus.

HPA Metrics

| Parameter | Type | Description |
| --- | --- | --- |
| `type` | `string` | Metric source type. Example: `"CPU"`, `"Memory"`, `"prometheus"` |
| `target_utilization` | `string` | Target utilization as a decimal string. Example: `"0.70"` |
| `target_value` | `string` | Absolute target value as a string. Example: `"50000000"` |
| `weight` | `string` | Weight for composite formula scaling (0–1 decimal string). Example: `"0.5"` |
| `server_address` | `string` | Prometheus server URL. Example: `"http://prometheus:9090"` |
| `query` | `string` | PromQL query string. Example: `"sum(rate(http_requests_total[2m]))"` |
| `metadata` | `map(string)` | Free-form key-value metadata for external scalers |
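
As an illustration of an additional metric source, the sketch below adds one Prometheus trigger to an HPA rule, reusing the example server URL and query from the table above. The resource label `hpa_prometheus`, the replica bounds, and the `target_value` of `"100"` are illustrative assumptions. The sketch also assumes that unused fields of a `metrics` entry can simply be omitted.

```hcl
resource "devzero_workload_rule" "hpa_prometheus" {
  cluster_id = "<YOUR_CLUSTER_ID>"
  namespace  = "production"
  kind       = "Deployment"
  name       = "my-api"

  action_triggers    = ["on_detection"]
  detection_triggers = ["pod_creation"]

  hpa_rule = {
    enabled            = true
    min_replicas       = 2
    max_replicas       = 20
    primary_metric     = "cpu" # the CPU trigger is auto-generated from this
    target_utilization = 0.70

    # Extra trigger on request rate; target_value is passed as a string.
    metrics = [
      {
        type           = "prometheus"
        server_address = "http://prometheus:9090"
        query          = "sum(rate(http_requests_total[2m]))"
        target_value   = "100" # illustrative threshold, not a recommendation
      }
    ]
  }
}
```

Note that `target_utilization` is a number on `hpa_rule` but a decimal string inside a `metrics` entry, per the two tables.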

HPA Behavior

| Parameter | Type | Description |
| --- | --- | --- |
| `stabilization_window_seconds` | `number` | Seconds to wait before acting on a scaling signal to avoid flapping |
| `select_policy` | `string` | Which policy wins when multiple match: `"Max"`, `"Min"`, `"Disabled"` |
| `policies` | `list(object)` | List of step policies |

Each policy object requires:

| Parameter | Type | Description |
| --- | --- | --- |
| `type` | `string` | Policy type: `"Pods"` or `"Percent"` |
| `value` | `number` | Policy value (pod count or percent) |
| `period_seconds` | `number` | Period over which the policy applies |

HPA Fallback

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `replicas` | `number` | Yes | Number of replicas to fall back to when metrics are unavailable |
| `behavior` | `string` | No | Fallback strategy: `"static"`, `"currentReplicas"`, `"currentReplicasIfHigher"`, `"currentReplicasIfLower"` |
| `failure_threshold` | `number` | No | Consecutive metric failures before activating fallback |
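
A fallback block might look like the sketch below, which pins the workload to a fixed replica count after repeated metric failures. The resource label `hpa_fallback` and the specific numbers are assumptions chosen for illustration; only the field names and the `"static"` value come from the table above.

```hcl
resource "devzero_workload_rule" "hpa_fallback" {
  cluster_id = "<YOUR_CLUSTER_ID>"
  namespace  = "production"
  kind       = "Deployment"
  name       = "my-api"

  action_triggers    = ["on_detection"]
  detection_triggers = ["pod_creation"]

  hpa_rule = {
    enabled            = true
    min_replicas       = 2
    max_replicas       = 10
    primary_metric     = "cpu"
    target_utilization = 0.70

    fallback = {
      replicas          = 3        # required: replica count used during fallback
      behavior          = "static" # optional: one of the strategies listed above
      failure_threshold = 3        # optional: consecutive failures before fallback
    }
  }
}
```

Keeping `replicas` between `min_replicas` and `max_replicas` is a choice made for this sketch; the source does not state whether the provider enforces it.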

Emergency Response

| Parameter | Type | Description |
| --- | --- | --- |
| `oom_enabled` | `bool` | React to OOM kills by increasing memory |
| `oom_memory_multiplier` | `number` | Multiplier applied to memory on OOM reaction |
| `cpu_throttling_enabled` | `bool` | React to CPU throttling by increasing CPU request |
| `cpu_throttling_threshold` | `number` | Throttle ratio threshold that triggers a reaction (0–1) |
| `cpu_throttling_multiplier` | `number` | Multiplier applied to CPU request on throttle reaction |

Containers

When containers is set, the workload-level cpu_rule, memory_rule, and gpu_rule are ignored. Each container entry accepts:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `container_name` | `string` | Yes | Name of the container this config applies to |
| `cpu_rule` | `object` | No | CPU resource rule (see Resource Rule) |
| `memory_rule` | `object` | No | Memory resource rule (see Resource Rule) |
| `gpu_rule` | `object` | No | GPU resource rule (see Resource Rule) |