If you’re running AI workloads on Kubernetes, chances are your average GPU/CPU utilization is below 25%, wasting thousands of dollars per cluster. In this hands-on workshop with NVIDIA and DevZero, we’ll show you how to measure what’s actually being used, uncover why GPUs go underutilized, and implement fixes that improve performance and unlock real efficiency gains.
You’ll discover:
- Why most GPU/CPU clusters run at just 15–25% utilization, and how increasing that by even 10–20% can save hundreds of thousands of dollars in wasted compute
- How to go beyond nvidia-smi, leveraging DCGM and Kubernetes integrations for granular GPU/CPU visibility (see the metrics-query sketch after this list)
- Workload-specific optimization strategies like checkpoint/restore for training, right-sizing memory for inference, and cost‑effective node selection
- How NVIDIA MIG and container-level isolation let teams safely share GPUs and CPUs without stepping on each other (see the MIG pod-request sketch after this list)
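
To make the "beyond nvidia-smi" point concrete, here is a minimal sketch of pulling per-GPU utilization from a Prometheus server that scrapes NVIDIA's dcgm-exporter. The Prometheus address, the one-hour averaging window, and the label names are assumptions about a typical deployment, not the workshop's exact setup.

```python
# Minimal sketch: report per-GPU and fleet-average utilization from
# dcgm-exporter metrics scraped by Prometheus. The Prometheus address is
# an assumption; DCGM_FI_DEV_GPU_UTIL is a metric dcgm-exporter emits by default.
import requests

PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"   # assumed in-cluster address
QUERY = "avg_over_time(DCGM_FI_DEV_GPU_UTIL[1h])"          # 1h mean utilization (%) per GPU

def gpu_utilization() -> dict:
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": QUERY},
        timeout=10,
    )
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    # Hostname and gpu are labels dcgm-exporter typically attaches;
    # adjust them to match your scrape config.
    return {
        (r["metric"].get("Hostname", "?"), r["metric"].get("gpu", "?")): float(r["value"][1])
        for r in results
    }

if __name__ == "__main__":
    per_gpu = gpu_utilization()
    for (host, gpu), util in sorted(per_gpu.items()):
        print(f"{host} GPU {gpu}: {util:.1f}%")
    if per_gpu:
        print(f"Fleet average: {sum(per_gpu.values()) / len(per_gpu):.1f}%")
```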
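
As an illustration of GPU sharing, the sketch below requests a single MIG slice through the Kubernetes Python client. It assumes the NVIDIA device plugin is advertising MIG profiles as extended resources (for example nvidia.com/mig-1g.5gb); the namespace, container image, and profile name are placeholders rather than a prescribed configuration.

```python
# Minimal sketch: schedule a pod onto one MIG slice instead of a whole GPU,
# so several teams can share a single A100/H100. Namespace, image tag, and
# MIG profile are illustrative assumptions.
from kubernetes import client, config

def request_mig_slice(namespace: str = "default") -> None:
    config.load_kube_config()  # use config.load_incluster_config() when running in-cluster

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="mig-demo"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="cuda-smoke-test",
                    image="nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04",
                    command=["nvidia-smi", "-L"],  # list the devices visible to the container
                    resources=client.V1ResourceRequirements(
                        # Request one 1g.5gb MIG slice via the device plugin's extended resource.
                        limits={"nvidia.com/mig-1g.5gb": "1"},
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)

if __name__ == "__main__":
    request_mig_slice()
```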
You’ll walk away with an understanding of the resources each workload type requires, concrete tools to measure GPU/CPU utilization, and a clear roadmap for right-sizing your infrastructure.

