Container Checkpoint/Restore with CRIU

Debo Ray

Co-Founder, CEO

July 8, 2025
Container restarts are a common occurrence in production systems. Whether the cause is a node failure, scheduled maintenance, or resource rebalancing, a traditional container restart means losing all in-memory state and forcing applications to rebuild their working set from scratch. For stateful applications, this translates to service interruption, degraded performance, and potential impact on end users.

CRIU (Checkpoint/Restore In Userspace) changes this equation entirely. Instead of killing and restarting containers, CRIU enables live migration of running containers with full state preservation, including memory contents, open file descriptors, and network connections.

Note: The "liveness" of live migration is implementation-specific. It depends on (a) whether the checkpointed process keeps running after the snapshot is taken, (b) how long it takes to restore the checkpointed process, and (c) whether there is downtime between the checkpointed application serving live traffic and the restored process being ready to serve it.

The Problem with Traditional Container Restarts

When Kubernetes reschedules a pod or Docker restarts a container, the process is destructive:

  1. SIGTERM sent to main process
  2. SIGKILL after a grace period
  3. All memory state is discarded
  4. New container starts from scratch
  5. Application rebuilds caches, reconnects to databases, and reloads configuration
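The first two steps of the sequence above can be sketched with Python's standard library. This is a minimal illustration, not how Docker or Kubernetes implement it: the function name `stop_like_a_runtime` and the 10-second default grace period are assumptions made for the example.

```python
import subprocess
import sys


def stop_like_a_runtime(proc: subprocess.Popen, grace_seconds: float = 10.0) -> str:
    """Send SIGTERM, wait for the grace period, then fall back to SIGKILL.

    Returns the name of the signal sent last before the process was reaped.
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()
        return "SIGKILL"


if __name__ == "__main__":
    # Hypothetical stand-in for a containerized service.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_like_a_runtime(child, grace_seconds=5.0))
```

Either way the process exits, and everything it held in memory is gone. That loss is what checkpoint/restore is meant to avoid.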

For a web application with a 2 GB in-memory cache, this can mean 30-60 seconds of degraded performance while the cache rebuilds. For a machine learning inference service that must reload its models, the restart can take several minutes.

How CRIU Works: Process State Serialization

CRIU runs in userspace and relies on several Linux kernel interfaces to capture and restore complete process state:

Memory Dumping

CRIU uses /proc/PID/pagemap and /proc/PID/maps to identify all memory regions belonging to a process tree. It then:

  • Freezes the process tree by attaching to each task with ptrace(PTRACE_SEIZE) and stopping it with PTRACE_INTERRUPT
  • Dumps all memory pages to disk
  • Captures memory mapping information (heap, stack, shared libraries)
  • Records memory protection flags and special mappings
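To make the region discovery step concrete, here is a small sketch that parses the text format of /proc/PID/maps. It is not CRIU's code and it only enumerates regions, it does not dump pages. The `MemoryRegion` type, the function names, and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryRegion:
    start: int
    end: int
    perms: str  # protection flags, e.g. "rw-p"
    path: str   # "" for anonymous mappings, "[heap]", "[stack]", or a file path

    @property
    def size(self) -> int:
        return self.end - self.start


def parse_maps(text: str) -> List[MemoryRegion]:
    """Parse the contents of a /proc/<pid>/maps file."""
    regions = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Line format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        start_hex, end_hex = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append(MemoryRegion(int(start_hex, 16), int(end_hex, 16), fields[1], path))
    return regions


def read_regions(pid: str = "self") -> List[MemoryRegion]:
    """Read the mappings of a live process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return parse_maps(f.read())


# Illustrative sample, not taken from a real process.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
00e03000-00e24000 rw-p 00000000 00:00 0 [heap]
7ffc0a1f0000-7ffc0a211000 rw-p 00000000 00:00 0 [stack]
35b1a21000-35b1a22000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    for region in parse_maps(SAMPLE):
        print(f"{region.size:>10} bytes  {region.perms}  {region.path or '(anonymous)'}")
```

On a Linux host, `read_regions()` returns the same structure for the current process, which gives a rough idea of how much memory a checkpoint would need to cover.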

File Descriptor Preservation

Every open file descriptor is catalogued and preserved:

  • Regular files: path and offset position
  • Sockets: protocol state, connection endpoints, buffer contents
  • Pipes: buffer data and connection topology
  • Device files: state-dependent handling
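The kind of per-descriptor information listed above is visible from userspace through /proc. The sketch below is an illustration of that idea, not CRIU's implementation, and its function names are assumptions. It reads the link target of each descriptor and the offset recorded in fdinfo.

```python
import os
import tempfile
from typing import Dict, Optional


def read_offset(pid: str, fd: int) -> Optional[int]:
    """Return the file offset recorded in /proc/<pid>/fdinfo/<fd>, if available."""
    try:
        with open(f"/proc/{pid}/fdinfo/{fd}") as f:
            for line in f:
                if line.startswith("pos:"):
                    return int(line.split()[1])
    except OSError:
        return None
    return None


def list_open_fds(pid: str = "self") -> Dict[int, dict]:
    """Map each open descriptor to its target and offset (Linux only)."""
    result = {}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        fd = int(name)
        try:
            # Targets look like "/path/to/file", "socket:[12345]", or "pipe:[678]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were iterating
        result[fd] = {"target": target, "offset": read_offset(pid, fd)}
    return result


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        for fd, info in sorted(list_open_fds().items()):
            print(fd, info["target"], info["offset"])
```

Regular files are the easy case, since a path and an offset are enough. Sockets and pipes show up only as opaque identifiers here, which hints at why CRIU needs additional kernel interfaces to capture their state.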

Process Tree Topology

CRIU reconstructs the exact process hierarchy:

  • Parent-child relationships
  • Process groups and sessions
  • Signal handlers and pending signals
  • CPU registers and execution state
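Parent-child relationships, process groups, and sessions can all be read from /proc/PID/stat. The following sketch rebuilds a parent-to-children map from those files. The `ProcStat` type and the function names are assumptions for illustration, and signal state and registers are out of scope here.

```python
import os
from collections import defaultdict
from typing import Dict, List, NamedTuple


class ProcStat(NamedTuple):
    pid: int
    comm: str
    state: str
    ppid: int
    pgrp: int
    session: int


def parse_stat(line: str) -> ProcStat:
    """Parse the leading fields of a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the LAST closing parenthesis.
    left, right = line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    fields = right.split()
    return ProcStat(int(pid_str), comm, fields[0],
                    int(fields[1]), int(fields[2]), int(fields[3]))


def build_tree(stats: List[ProcStat]) -> Dict[int, List[int]]:
    """Map each parent PID to the list of its child PIDs."""
    children = defaultdict(list)
    for s in stats:
        children[s.ppid].append(s.pid)
    return dict(children)


def read_all_stats() -> List[ProcStat]:
    """Collect stat entries for every visible process (Linux only)."""
    stats = []
    for name in os.listdir("/proc"):
        if name.isdigit():
            try:
                with open(f"/proc/{name}/stat") as f:
                    stats.append(parse_stat(f.read()))
            except OSError:
                pass  # process exited while we were scanning
    return stats
```

On restore, a hierarchy like this has to be recreated in the right order so that each process ends up with the expected parent, process group, and session.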

Practical Implementation with Docker

Let's walk through a real checkpoint/restore scenario. First, ensure CRIU is installed and your kernel supports the necessary features:
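A minimal sketch of that check and a checkpoint/restore round trip, written as a thin Python wrapper around the CLI tools. It assumes `criu` and `docker` are on the PATH and that the Docker daemon has experimental features enabled. The container name `web` and checkpoint name `cp1` are hypothetical, and the wrapper defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def criu_check_cmd() -> List[str]:
    """Command that asks CRIU to verify kernel support."""
    return ["criu", "check"]


def checkpoint_cmd(container: str, checkpoint: str, leave_running: bool = False) -> List[str]:
    """Command that checkpoints a running container."""
    cmd = ["docker", "checkpoint", "create"]
    if leave_running:
        # Without this flag the container is stopped after the checkpoint.
        cmd.append("--leave-running")
    return cmd + [container, checkpoint]


def restore_cmd(container: str, checkpoint: str) -> List[str]:
    """Command that starts a stopped container from a checkpoint."""
    return ["docker", "start", "--checkpoint", checkpoint, container]


def run(cmd: List[str], dry_run: bool = True) -> None:
    print("+", " ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "web" and "cp1" are placeholder names for this example.
    run(criu_check_cmd())
    run(checkpoint_cmd("web", "cp1"))
    run(restore_cmd("web", "cp1"))
```

Passing `dry_run=False` executes the commands for real. It is worth trying this against a throwaway container first, since checkpoint support in Docker is experimental.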

Integration with containerd

Production Considerations

Performance Impact

Checkpoint operations aren't free:

  • Memory dump time: ~100 MB/s for typical workloads
  • Network freeze duration: 10-500 ms depending on connection count
  • Restore time: Usually 2-5x faster than cold start
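These figures allow a quick back-of-the-envelope estimate. The sketch below applies them to the 2 GB cache example from earlier. It assumes the dump rate is constant and reads "2-5x faster" as restore time equal to cold-start time divided by the speedup. Both function names are made up for illustration, and real numbers should come from measurement.

```python
from typing import Tuple


def estimate_dump_seconds(resident_mb: float, throughput_mb_per_s: float = 100.0) -> float:
    """Time to write the memory image at a constant dump rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return resident_mb / throughput_mb_per_s


def estimate_restore_range(cold_start_seconds: float,
                           min_speedup: float = 2.0,
                           max_speedup: float = 5.0) -> Tuple[float, float]:
    """Best-case and worst-case restore time for a given cold-start time."""
    return (cold_start_seconds / max_speedup, cold_start_seconds / min_speedup)


if __name__ == "__main__":
    # 2 GB of resident memory at ~100 MB/s.
    print(f"dump: {estimate_dump_seconds(2048):.2f} s")
    # A service whose cold start takes 60 s (hypothetical figure).
    best, worst = estimate_restore_range(60)
    print(f"restore: {best:.0f}-{worst:.0f} s")
```

At these rates a 2 GB image takes roughly 20 seconds to dump, so the size of the working set matters when deciding whether checkpointing fits inside a maintenance window.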

Kernel Requirements

CRIU requires specific kernel features:

CONFIG_CHECKPOINT_RESTORE=y
CONFIG_NAMESPACES=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
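One way to check these options on a host is to parse the kernel build configuration. The sketch below assumes the config is exposed at /proc/config.gz or /boot/config-<release>, which depends on the distribution and kernel build. The function names are illustrative, and running `criu check` is the more thorough test.

```python
import gzip
import os
import platform
from typing import Dict, Iterable, List, Optional

REQUIRED = (
    "CONFIG_CHECKPOINT_RESTORE",
    "CONFIG_NAMESPACES",
    "CONFIG_PID_NS",
    "CONFIG_NET_NS",
)


def parse_kernel_config(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping comments such as '# CONFIG_X is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        options[key] = value
    return options


def missing_options(options: Dict[str, str], required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required options that are not built in (=y)."""
    return [name for name in required if options.get(name) != "y"]


def load_kernel_config() -> Optional[str]:
    """Try the two common locations for the running kernel's config."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    path = f"/boot/config-{platform.release()}"
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    return None


if __name__ == "__main__":
    text = load_kernel_config()
    if text is None:
        print("kernel config not found on this host")
    else:
        print("missing:", missing_options(parse_kernel_config(text)) or "none")
```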

Security Implications

Checkpoint images contain complete process memory:

  • Encrypt checkpoint storage
  • Implement access controls
  • Treat checkpoint images as sensitive: secrets held in memory (keys, tokens, credentials) are captured in the dump
  • Validate checkpoint integrity
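For the integrity item, one simple approach is to record a SHA-256 digest of every file in the checkpoint directory at dump time and compare before restore. This is a sketch under that assumption, not a CRIU feature. The function names and the sample file name are illustrative, and the manifest itself has to be stored or signed somewhere an attacker cannot modify it.

```python
import hashlib
import hmac
import os
import tempfile
from typing import Dict


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large page images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(image_dir: str) -> Dict[str, str]:
    """Map each file's path (relative to image_dir) to its SHA-256 digest."""
    manifest = {}
    for root, _dirs, files in os.walk(image_dir):
        for name in sorted(files):
            full = os.path.join(root, name)
            manifest[os.path.relpath(full, image_dir)] = sha256_file(full)
    return manifest


def verify_manifest(image_dir: str, expected: Dict[str, str]) -> bool:
    """Return True only if the file set and every digest match the manifest."""
    actual = build_manifest(image_dir)
    if actual.keys() != expected.keys():
        return False  # a file was added or removed
    return all(hmac.compare_digest(actual[k], expected[k]) for k in expected)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as image_dir:
        # Stand-in for a real checkpoint image file.
        with open(os.path.join(image_dir, "pages-1.img"), "wb") as f:
            f.write(b"example page data")
        manifest = build_manifest(image_dir)
        print("intact:", verify_manifest(image_dir, manifest))
```

This detects accidental corruption and tampering with the images, but it does not replace encryption, since the digests say nothing about who can read the dump.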

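Limitations and Gotchas

Network Connections

  • Established TCP connections can be restored (CRIU provides the --tcp-established option for this), but they may need re-establishment if the peer has timed out or the container's address changes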
  • UDP sockets restore more reliably
  • External services may timeout during migration

File System Dependencies

  • Absolute paths must exist on restore host
  • Mounted volumes need identical configuration
  • Device files may not be portable

Container Runtime Integration

  • Docker checkpoint support is experimental
  • Kubernetes native support is limited at the time of writing
  • Custom orchestration often required

Advanced Use Cases

Database Migration

For databases with large buffer pools:

Stateful Service Scaling

CRIU enables novel scaling patterns:

  • Checkpoint running instance
  • Restore multiple copies for instant horizontal scaling
  • Preserve expensive initialization state

Future: Kubernetes Integration

Several projects are working on Kubernetes integration:

  • Kubernetes Enhancement Proposal (KEP-2008, Forensic Container Checkpointing) for native checkpoint/restore
  • Podman checkpoint integration with CRI-O
  • Third-party operators for automated live migration, one of which is DevZero

Conclusion

CRIU transforms container restart from a disruptive operation into seamless live migration. While not suitable for every workload, it's particularly valuable for:

  • Stateful applications with expensive initialization
  • Services with large in-memory caches
  • Long-running computations that need migration
  • Zero-downtime maintenance scenarios

The technology is production-ready for specific use cases, though broader ecosystem integration is still evolving. For organizations running stateful workloads at scale, CRIU provides a powerful tool for achieving true zero-downtime operations.

Ready to implement live migration in your infrastructure? Start with non-critical workloads, measure the performance characteristics, and gradually expand to more critical services as you build operational confidence.

Frequently Asked Questions

What are the top zero-downtime migration solutions for cloud-native applications in 2026?

Zero-downtime solutions for cloud-native Kubernetes workloads each address a different problem. DevZero uses CRIU to migrate and resize running containers in place across CPU, memory, and GPU workloads. Blue/green deployments are the standard approach for application upgrades but require 2x infrastructure during transition. Flagger handles canary and blue/green deployments with automated traffic shifting, best for application updates. OpenKruise provides open-source in-place pod updates for resource changes. Argo Rollouts handles progressive delivery but not node-to-node workload migration. CRIU-based tools cover the gap when the goal is resizing or migrating an already-running workload rather than rolling out a new version.

What are the best alternatives to traditional Kubernetes migration methods like rolling updates and cordon/drain?

Traditional Kubernetes migration methods — rolling updates, blue/green deployments, and VPA-based resource changes — all involve some form of pod restart or resource duplication. CRIU (Checkpoint/Restore in Userspace) is the foundational technology for true zero-downtime alternatives. CRIU creates a complete snapshot of a running process — memory, file descriptors, pipes, network connections, timers — and restores it on the target node. The workload continues running with no client-visible interruption. Platforms built on CRIU (DevZero among them) apply this to live rightsizing and bin-packing migration: adjusting CPU, memory, and GPU without restarting containers, and consolidating workloads onto fewer instances. For GPU workloads, CRIU checkpoint/restore also enables training jobs to be moved to spot instances mid-run, eliminating the need for costly on-demand GPU reservations.

How does CRIU differ from Kubernetes rolling updates for stateful workload migration?

Kubernetes rolling updates work by starting new pods with updated configurations and terminating old pods once the new ones are ready. For stateless services this is seamless, but stateful workloads lose their in-memory state, open connections, and local cache during the transition. CRIU works differently: instead of starting a new process, it serializes the runtime state of the existing process (memory pages, open file descriptors, TCP connection state, and timers), checkpoints it to disk, and restores it on the target host. The process resumes where it left off with its state intact, although, as noted under Limitations and Gotchas, some network connections may need to be re-established. This is what makes CRIU a foundation for zero-downtime migration of stateful Kubernetes workloads.

What types of workloads are best suited for CRIU-based checkpoint and restore?

CRIU works best for workloads where in-memory state, open connections, or accumulated cache have significant value. Ideal candidates include LLM inference servers (where restarting means rebuilding GPU memory state and reconnecting clients), database sidecars and caches (where restart means cache warmup time), long-running ML training jobs (where restart means re-running potentially hours of computation), and any stateful service with persistent TCP connections. CRIU is less suited for workloads with external hardware dependencies (beyond standard GPU support), workloads that rely on kernel namespaces that CRIU cannot serialize, or services with very short lifetimes where checkpoint overhead exceeds the benefit.

Is CRIU production-ready for Kubernetes workloads in 2026?

CRIU has been production-ready for specific Kubernetes use cases since 2023 and is increasingly used in production at scale. CRIU-based live migration is deployed today across EKS, GKE, and AKS workloads, including by DevZero. The technology is stable for CPU and memory workload migration, and GPU checkpoint/restore for NVIDIA A100 and H100 is production-validated for training jobs. The key operational consideration is testing each workload class before enabling CRIU-based migration in production — most platforms (DevZero included) offer per-workload compatibility scoring and a phased rollout mode so teams can enable CRIU migration incrementally, starting with low-risk workloads.
