Kubernetes infrastructure operations that actually scale

DevZero continuously analyzes your Kubernetes clusters and cuts infrastructure costs without touching application performance. Unlike Karpenter, it adjusts CPU and memory in place using MPA. No node replacement. No restarts.

[Animation: Initial State, Applying Policy, Optimized. Before/After view; initial state shown.]
6 nodes, all on-demand:
c5.4xlarge · On-demand · $0.680/hr
c5.2xlarge · On-demand · $0.340/hr
c5.xlarge · On-demand · $0.170/hr
+3 more nodes
Avg utilization: 28% · 72% idle waste

All 6 nodes running on expensive on-demand instances

Companies that slashed their Kubernetes spend using DevZero

DATABAHN
Starburst
Fi
Outerbounds
Codilas
personality pool
Onnitech
OpenObserve
Parsimo
Dentira

Every resize stays inside your policy

DevZero adjusts CPU and memory based on observed workload behavior. Choose Conservative, Balanced, or Aggressive per namespace. Hard min/max boundaries define what recommendations can never cross. autoApply is off by default. Changes queue for approval until you enable it. Models are per workload, not cluster-wide averages.
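The boundary and approval behavior described above can be sketched in a few lines. This is a minimal illustration, not DevZero's implementation: the `Policy` class, its field names, and the `plan_resize` helper are all hypothetical, and only the ideas (hard min/max bounds, autoApply off by default) come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Policy:
    # Hypothetical fields for illustration; not DevZero's actual schema.
    min_cpu_m: int            # hard lower bound, in millicores
    max_cpu_m: int            # hard upper bound, in millicores
    auto_apply: bool = False  # off by default, as described above


def plan_resize(recommended_cpu_m: int, policy: Policy) -> Tuple[int, str]:
    """Clamp a recommendation to the policy bounds and decide how it is handled."""
    bounded = max(policy.min_cpu_m, min(recommended_cpu_m, policy.max_cpu_m))
    action = "apply" if policy.auto_apply else "queue_for_approval"
    return bounded, action


production = Policy(min_cpu_m=500, max_cpu_m=4000)
dev = Policy(min_cpu_m=50, max_cpu_m=4000, auto_apply=True)

print(plan_resize(120, production))  # clamped up to the 500m floor, queued
print(plan_resize(680, dev))         # within bounds, applied automatically
```

Clamping before the approval decision means that even a queued recommendation never shows a value outside the policy's bounds.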

Cost Overview (Current Cost / Usage Cost)
Avg. Monthly Projection: $136.84K/mo (▼2%). Average estimate based on request patterns during the period.
Right-Sized Projection: $44.36K/mo (▼68%). Estimated monthly cost if requests were optimized to usage.
Period Cost Delta: ▼2.0%. Comparison of total cost against the previous period.
[Chart: cost over time, Apr 1 00:00 to Apr 1 12:00; y-axis $60.00 to $210.00]
Cost attribution
Cluster / month: $14k total infra spend
Saved / month: $6.3k after rightsizing
Cluster efficiency: 83/100, up from 41/100
Cost by team:
ml-team · $5,060 · ↓ 31%
data · $3,840 · ↑ 4%
platform · $1,820 · ↓ 59%
web · $0 · ↓ 64%
$6,340 attributed savings this month · Trend data: 30-day lookback per team

Spend mapped to the workload consuming it

Cost attribution follows the same namespace filters and label selectors in your policies. The data matches exactly how your team scoped their rules. Drill from cluster to individual pod. Trend data surfaces cost drift before it becomes a problem.
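One way to picture attribution that reuses policy scoping is a roll-up that applies a namespace filter and label selector before grouping by a label. The sketch below is an assumption-laden illustration: the pod records, the `team` label key, and the `attribute_costs` function are made up for the example and are not DevZero's data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


def attribute_costs(
    pods: Iterable[dict],
    namespaces: Optional[Set[str]] = None,
    selector: Optional[Dict[str, str]] = None,
    group_by: str = "team",
) -> Dict[str, float]:
    """Sum pod cost per label value, after applying the same scope a policy would use."""
    totals: Dict[str, float] = defaultdict(float)
    for pod in pods:
        if namespaces is not None and pod["namespace"] not in namespaces:
            continue  # outside the namespace filter
        if selector and any(pod["labels"].get(k) != v for k, v in selector.items()):
            continue  # label selector does not match
        totals[pod["labels"].get(group_by, "unlabeled")] += pod["cost"]
    return dict(totals)


# Made-up pod records, purely illustrative.
pods = [
    {"namespace": "production", "labels": {"team": "ml-team"}, "cost": 12.5},
    {"namespace": "production", "labels": {"team": "ml-team"}, "cost": 7.5},
    {"namespace": "production", "labels": {"team": "data"}, "cost": 4.0},
    {"namespace": "dev", "labels": {"team": "data"}, "cost": 3.0},
]

print(attribute_costs(pods, namespaces={"production"}))
```

Grouping at the pod level and summing upward is what makes a drill-down from cluster to individual pod possible: every total is the sum of the records beneath it.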

One policy or fifty, the fleet adapts

Target by namespace, label, kind, or name pattern. Precedence is clear when rules overlap. autoApply is set per policy; production waits for approval while dev runs automatically. Node group policies are separate. Instance families, spot vs on-demand, and consolidation aggressiveness are independent from workload rules.

The fleet adapts
Most specific policy wins · precedence is always clear
Priority 1 · By name · api-server · wins
Priority 2 · By label · api-server · 2nd match
Priority 3 · By namespace · production · 3rd match
Priority 4 · Default · fallback · last resort
autoApply behavior per namespace
autoApply: false · production · waits for manual approval
autoApply: true, mode: aggressive · dev / staging · applies changes automatically
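The "most specific policy wins" ordering (name, then label, then namespace, then default) can be sketched as a small resolver. Everything here is hypothetical: the `Workload` and `Policy` classes, their fields, and the use of glob-style name patterns are assumptions for illustration, not DevZero's policy format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import Dict, List, Optional


@dataclass
class Workload:
    name: str
    namespace: str
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Policy:
    # Hypothetical shape: at most one selector is set; none set means default.
    title: str
    auto_apply: bool = False
    name_pattern: Optional[str] = None
    label_selector: Optional[Dict[str, str]] = None
    namespace: Optional[str] = None

    def rank(self, w: Workload) -> Optional[int]:
        """Return a priority (lower wins) if the policy matches, else None."""
        if self.name_pattern is not None:
            return 1 if fnmatch(w.name, self.name_pattern) else None
        if self.label_selector is not None:
            ok = all(w.labels.get(k) == v for k, v in self.label_selector.items())
            return 2 if ok else None
        if self.namespace is not None:
            return 3 if w.namespace == self.namespace else None
        return 4  # default fallback matches everything


def resolve(w: Workload, policies: List[Policy]) -> Optional[Policy]:
    """Pick the matching policy with the lowest rank."""
    matches = [(p.rank(w), p) for p in policies]
    matches = [(r, p) for r, p in matches if r is not None]
    best = min(matches, key=lambda m: m[0], default=None)
    return best[1] if best else None


policies = [
    Policy("default"),
    Policy("prod-namespace", namespace="production"),
    Policy("api-label", label_selector={"app": "api-server"}),
    Policy("api-by-name", name_pattern="api-server*"),
]

api = Workload("api-server-7f6d9", "production", {"app": "api-server"})
worker = Workload("worker-1", "production")
print(resolve(api, policies).title)     # matched by name, the most specific rule
print(resolve(worker, policies).title)  # falls through to the namespace rule
```

Ranking by selector type rather than by declaration order keeps the outcome stable when policies are added or reordered.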
Workload · Requests Based Cost · Optimizations · Health · Namespace
ml--nar-earth-performan… · CPU: $5.3172 · Mem.: $0.7060 · GPU: $21.80 · total $27.8314 · Not Optimized · Update · search-aws-ml…
ml--nar-earth-staging-u… · CPU: $5.5440 · Mem.: $0.3409 · GPU: $15.8C · total $21.6873 · Not Optimized · Update · search-aws-ml…
ml--nar-mars-performan… · CPU: $4.7951 · Mem.: $0.6284 · GPU: $16.32 · total $21.7511 · Not Optimized · Update · search-aws-ml…
ml--mars-staging-use1 · CPU: $5.4751 · Mem.: $0.3218 · GPU: $13.65 · total $19.4502 · Not Optimized · Update · search-aws-ml…
ml--mercury-staging-use1 · CPU: $8.4314 · Mem.: $0.6993 · GPU: $0.941 · total $10.0718 · Not Optimized · Update · search-aws-ml…
nuwa-embedding-servic… · CPU: $2.4192 · Mem.: $0.9325 · GPU: $0.00 · total $3.3517 · Not Optimized · Update · search-aws-ml…
ml--apac-mars-perform… · CPU: $4.0049 · Mem.: $0.5248 · GPU: $20.15 · total $24.6867 · Not Optimized · Update · search-aws-ml…
ml--earth-performance-… · CPU: $4.0644 · Mem.: $0.5397 · GPU: $20.17 · total $24.7751 · Not Optimized · Update · search-aws-ml…

Different policies per workload, not per cluster

Write scoped policies per workload type. Tight boundaries and manual approval for production. Aggressive P90 rightsizing for dev and staging. A dedicated GPU policy targets NVIDIA accelerator labels directly. Node group policies handle spot preference and consolidation aggressiveness per pool, fully independent from workload rules.
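To make "P90 rightsizing" concrete, here is a minimal sketch that sets a CPU request from the 90th percentile of observed usage. The nearest-rank percentile method, the 15% headroom factor, and the min/max defaults are assumptions chosen for the example; the page does not say how DevZero computes its recommendations.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p90_cpu_request(
    usage_m: Sequence[float],
    headroom: float = 1.15,  # assumed safety margin, not a DevZero default
    min_m: int = 50,         # assumed policy floor, in millicores
    max_m: int = 8000,       # assumed policy ceiling, in millicores
) -> int:
    """Recommend a CPU request (millicores) from P90 usage plus headroom, within bounds."""
    target = math.ceil(percentile_nearest_rank(usage_m, 90) * headroom)
    return int(min(max_m, max(min_m, target)))


# Made-up usage samples in millicores; one outlier spike at 1900m.
usage = [200, 220, 250, 300, 310, 320, 400, 450, 600, 1900]
print(percentile_nearest_rank(usage, 90))  # 600
print(p90_cpu_request(usage))              # 690
```

Using P90 rather than the maximum ignores the single 1900m spike, which is why this setting suits dev and staging better than latency-sensitive production workloads.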

Built for platform engineers

The controls your team actually needs, not a dashboard that requires a PhD to interpret.

cgroup v2 · live (sub-second)
CPU throttle: 78% · Mem PSI stall: 43ms · OOM events: 0

prometheus · 15s avg
CPU usage: 38% · Mem usage: 51% · Throttle visible: averaged out

P99 latency spike · api-server-7f6d9
throttle_pct=0.78 · caught in 180ms window · Detected

DevZero reads CPU throttle rates, memory pressure (PSI), and OOM events directly from cgroup v2, not 15-second Prometheus averages. Sub-second anomalies that cause P99 latency spikes are visible and actionable.
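The signals named above live in standard cgroup v2 interface files: `cpu.stat` (with `nr_periods` and `nr_throttled`), `memory.pressure` (PSI), and `memory.events` (with `oom` and `oom_kill`). The sketch below parses the text of those files and derives a throttle ratio between two samples. It is an illustration of the file formats only; the helper names and the sample numbers are made up, and it says nothing about how DevZero collects the data.

```python
from typing import Dict


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse 'key value' lines, the format of cpu.stat and memory.events."""
    out: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            out[parts[0]] = int(parts[1])
    return out


def parse_pressure(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI lines such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    out: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        out[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return out


def throttle_ratio(prev_cpu_stat: str, curr_cpu_stat: str) -> float:
    """Fraction of CFS periods that were throttled between two cpu.stat samples."""
    prev, curr = parse_flat_keyed(prev_cpu_stat), parse_flat_keyed(curr_cpu_stat)
    periods = curr["nr_periods"] - prev["nr_periods"]
    if periods <= 0:
        return 0.0
    return (curr["nr_throttled"] - prev["nr_throttled"]) / periods


# Made-up file contents; on a real node these would be read from the
# workload's cgroup directory under /sys/fs/cgroup.
prev = "usage_usec 5000000\nnr_periods 1000\nnr_throttled 100\nthrottled_usec 250000\n"
curr = "usage_usec 5090000\nnr_periods 1050\nnr_throttled 139\nthrottled_usec 310000\n"
pressure = (
    "some avg10=1.25 avg60=0.40 avg300=0.10 total=43000\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
events = "low 0\nhigh 0\nmax 3\noom 0\noom_kill 0\n"

print(throttle_ratio(prev, curr))                 # 39 of 50 periods throttled
print(parse_pressure(pressure)["some"]["avg10"])
print(parse_flat_keyed(events)["oom_kill"])
```

Because the counters are cumulative, the sampling interval is set by how often the files are read, which is what allows windows much shorter than a 15-second scrape.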

EKS: 841 pods managed
GKE: 519 pods managed
AKS: 302 pods managed
On-prem: 178 pods managed
Single tenant: 1,840 pods · 4 clusters
$41k Monthly Savings · 3 Cloud Providers · 4 Clusters · 1 Tenant

One DevZero tenant manages rightsizing across EKS, GKE, AKS, and on-prem clusters. Monthly savings and optimization metrics are aggregated in a single view across all cloud providers.

DEVZERO: cpu.request: 2000m → 680m · conf=0.94 · Triggered
SLACK · #PLATFORM-RIGHTSIZING: Advisory mode · saves $2.84/hr · Posted
GITHUB · HELM-VALUES PATCH: resources.requests.cpu: "680m" · PR open

Advisory mode surfaces recommendations where your team already works. GitHub PR integration proposes changes before anything touches production. Slack integration posts to the on-call channel, keeping the right people informed.

What our customers say

We were essentially able to reduce the cost of that cluster by about 75%. On AWS, DevZero demonstrated they could achieve significantly higher savings than we initially thought possible.

Mihir Nair

Head of Architecture, Databahn

Stop paying for CPU and memory you're not using
Start optimizing your infrastructure today

Connect your cluster in under 30 minutes. No code changes. No pod restarts. First savings visible within 24 hours.