Key Concepts

Core concepts in DevZero -- operators, recommendations, policies, and live migration.

Operators

DevZero uses the Kubernetes operator pattern to extend cluster functionality. Each operator has a specific responsibility:

| Operator | Role | Cluster Access |
| --- | --- | --- |
| Read (zxporter) | Collect metrics | Read-only |
| Write (dakr-operator) | Apply recommendations, update CRDs | Read-write (scoped) |
| Node (dzkarp) | Manage node lifecycle | Node management |
| Network (zxporter-netmon) | Monitor traffic | Read-only |
| Security (dakr-security) | Scan vulnerabilities (KSPM) | Read-only |
| Scheduler (dz-scheduler) | Optimize pod placement | Scheduler bindings |

Recommendations

A recommendation is a suggested change to a Kubernetes resource. DevZero generates recommendations by comparing actual resource usage against current requests and limits.

If enabled in the policy, DevZero applies recommendations both reactively and proactively (by analyzing historical behavior).

Each recommendation includes:

  • Target resource -- the deployment, statefulset, or daemonset to modify
  • Current values -- existing CPU/memory/GPU requests and limits, and replica counts
  • Recommended values -- suggested values based on observed utilization
  • Confidence level -- based on the amount of historical data available
  • Projected savings -- estimated cost reduction

Recommendations can be reviewed in the dashboard and applied manually, or auto-applied based on policies.
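To make the comparison of usage against requests concrete, here is a minimal Python sketch. It is not DevZero's algorithm: the percentile-plus-headroom rule, the `Recommendation` class, the sample-count confidence heuristic, and every field and function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """Hypothetical container mirroring the fields listed above."""
    target: str               # e.g. "deployment/checkout"
    current_cpu_m: int        # current CPU request, in millicores
    recommended_cpu_m: int    # suggested CPU request, in millicores
    confidence: float         # 0.0 to 1.0, grows with the amount of history
    projected_savings_m: int  # millicores freed if the change is applied


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1-100)."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def recommend_cpu(target, current_cpu_m, usage_m, headroom_pct=15, full_history=1000):
    """Assumed rule: 95th-percentile usage plus a fixed headroom percentage."""
    p95 = percentile(usage_m, 95)
    recommended = (p95 * (100 + headroom_pct) + 99) // 100  # round up
    # Assumed heuristic: confidence scales linearly with the sample count.
    confidence = min(1.0, len(usage_m) / full_history)
    savings = max(0, current_cpu_m - recommended)
    return Recommendation(target, current_cpu_m, recommended, confidence, savings)
```

For example, with a 500m request and 100 usage samples whose 95th percentile is 100m, this sketch suggests a 115m request with a low confidence value, because only a tenth of the assumed full history is available.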

Policies

A policy defines the rules for how DevZero optimizes your cluster. Policies control:

  • Scope -- which namespaces, labels, or workload types are targeted
  • Aggressiveness -- how tightly to fit resources to usage (conservative, moderate, aggressive)
  • Guardrails -- minimum and maximum resource values that recommendations must stay within
  • Approval mode -- automatic application vs. audit-only
  • Schedule -- time windows when changes are allowed (e.g., business hours only)
  • Optimization application mode -- spec patching vs. checkpoint/restore
  • In-cluster MPA -- using a cluster-local MPA (multi-dimensional pod autoscaler) for latency-sensitive workflows
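The way scope, aggressiveness, guardrails, approval mode, and schedule might combine into a single decision can be sketched in Python. This is an illustrative model only: the `Policy` class, the `decide` function, and the headroom percentages per aggressiveness level are hypothetical and do not reflect DevZero's actual policy schema.

```python
from dataclasses import dataclass
from datetime import time

# Assumed headroom per aggressiveness level; real values are not documented here.
HEADROOM_PCT = {"conservative": 30, "moderate": 15, "aggressive": 5}


@dataclass
class Policy:
    """Hypothetical policy covering a subset of the controls listed above."""
    namespaces: set       # scope
    aggressiveness: str   # "conservative", "moderate", or "aggressive"
    min_cpu_m: int        # guardrail: lower bound, in millicores
    max_cpu_m: int        # guardrail: upper bound, in millicores
    auto_apply: bool      # approval mode: True = automatic, False = audit-only
    window_start: time    # schedule: start of the allowed change window
    window_end: time      # schedule: end of the window (same day, no midnight wrap)

    def in_scope(self, namespace):
        return namespace in self.namespaces

    def clamp(self, cpu_m):
        """Keep a recommended value inside the guardrails."""
        return max(self.min_cpu_m, min(self.max_cpu_m, cpu_m))

    def change_allowed(self, now):
        return self.window_start <= now < self.window_end


def decide(policy, namespace, observed_cpu_m, now):
    """Return (action, target_cpu_m), where action is 'skip', 'audit', or 'apply'."""
    if not policy.in_scope(namespace):
        return ("skip", None)
    headroom = HEADROOM_PCT[policy.aggressiveness]
    target = policy.clamp(observed_cpu_m * (100 + headroom) // 100)
    if policy.auto_apply and policy.change_allowed(now):
        return ("apply", target)
    return ("audit", target)
```

In this sketch, a recommendation that falls outside the schedule window is downgraded to audit rather than dropped, so it can still be reviewed. That is a design choice of the example, not documented product behavior.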

Live Migration

Live migration (also known as checkpoint/restore) allows DevZero to resize workloads without restarting pods. This is critical for stateful applications (databases, caches, queues) that can't tolerate downtime.

Currently, live migration is only supported for workloads that don't use init containers.
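The init-container restriction can be checked before choosing an application mode. The Python helper below is a hypothetical sketch: it takes a pod (or pod template) `spec` as a plain dictionary and looks only at the standard Kubernetes `initContainers` field. Any other eligibility rules DevZero applies are not covered.

```python
def live_migration_eligible(pod_spec: dict) -> bool:
    """Return True if the pod spec declares no init containers.

    `pod_spec` is the `spec` section of a Pod or pod template, as a dict.
    A missing, null, or empty `initContainers` field all count as "none".
    """
    return not pod_spec.get("initContainers")


# Example pod specs (hypothetical workloads).
plain = {"containers": [{"name": "db", "image": "postgres:16"}]}
with_init = {
    "initContainers": [{"name": "migrate", "image": "migrator:1"}],
    "containers": [{"name": "app", "image": "app:1"}],
}
```

Here `live_migration_eligible(plain)` is true and `live_migration_eligible(with_init)` is false, so the second workload would have to be resized by spec patching instead.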

CRDs

DevZero operators use Custom Resource Definitions (CRDs) to represent their state in Kubernetes:

  • WorkloadRecommendation -- a pending or applied recommendation for a workload
  • NodeGroupRecommendation -- a recommendation for node pool changes
  • NodeClass, NodePool, NodeClaim -- cluster autoscaling state tracking
  • VulnerabilityReport -- security scan results for a container image
  • ConfigAuditReport -- configuration compliance results
  • ClusterComplianceReport -- cluster-wide compliance status

Supported Resource Types

DevZero can optimize the Kubernetes resource types listed below. Unless explicitly noted, optimizations can be applied using either traditional Kubernetes workload spec patching or checkpoint/restore (live migration).

Core Workloads

| Resource | Kind | Notes |
| --- | --- | --- |
| Deployment | Deployment | Most common workload type |
| StatefulSet | StatefulSet | Databases, caches, queues |
| DaemonSet | DaemonSet | Per-node workloads |
| Job | Job | Batch workloads (only checkpoint/restore) |
| CronJob | CronJob | Scheduled batch workloads |
| ReplicaSet | ReplicaSet | Usually managed by a Deployment |
| ReplicationController | ReplicationController | Legacy controller |
| Pod | Pod | Standalone pods (only checkpoint/restore) |
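The per-kind restrictions in the core workloads table can be encoded as a small lookup. The Python sketch below is illustrative only: the mode names `spec-patch` and `checkpoint-restore` and the `supported_modes` function are hypothetical labels, not DevZero identifiers.

```python
SPEC_PATCH = "spec-patch"                  # hypothetical label for spec patching
CHECKPOINT_RESTORE = "checkpoint-restore"  # hypothetical label for live migration

# Core workload kinds from the table above.
CORE_KINDS = {
    "Deployment", "StatefulSet", "DaemonSet", "Job", "CronJob",
    "ReplicaSet", "ReplicationController", "Pod",
}

# Kinds the table marks as "only checkpoint/restore".
CHECKPOINT_ONLY_KINDS = {"Job", "Pod"}


def supported_modes(kind: str) -> list:
    """Return the application modes available for a core workload kind."""
    if kind not in CORE_KINDS:
        raise ValueError(f"not a core workload kind: {kind}")
    if kind in CHECKPOINT_ONLY_KINDS:
        return [CHECKPOINT_RESTORE]
    return [SPEC_PATCH, CHECKPOINT_RESTORE]
```

A caller could combine this lookup with the policy's optimization application mode to reject configurations that request spec patching for a standalone Pod or a Job.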

Extended Workloads

DevZero also supports third-party resource types commonly found in data and ML platforms:

| Resource | Kind | Project |
| --- | --- | --- |
| Argo Rollout | Rollout | Argo Rollouts |
| Kubeflow Notebook | Notebook | Kubeflow |
| Volcano Job | VolcanoJob | Volcano |
| Spark Application | SparkApplication | Spark Operator |
| Scheduled Spark Application | ScheduledSparkApplication | Spark Operator |
| CNPG Cluster | Cluster | CloudNativePG |
