Container restarts are a common occurrence in production systems. Whether the cause is a node failure, scheduled maintenance, or resource rebalancing, a traditional container restart means losing all in-memory state and forcing the application to rebuild its working set from scratch. For stateful applications, this translates to service interruption, degraded performance, and potential impact on end users.
CRIU (Checkpoint/Restore In Userspace) changes this equation. Instead of killing and restarting a container, CRIU checkpoints the running container and restores it later, on the same host or a different one, with its state preserved: memory contents, open file descriptors, and network connections. This is the building block for live migration.
Note: How “live” a live migration is depends on the implementation. Three questions determine it: (a) Does the process keep running while it is being checkpointed? (b) How long does it take to restore the checkpointed process? (c) How much downtime separates the moment the checkpointed application stops serving live traffic from the moment the restored process is ready to serve it?
The Problem with Traditional Container Restarts
When Kubernetes reschedules a pod or Docker restarts a container, the process is destructive:
- SIGTERM sent to main process
- SIGKILL after a grace period
- All memory state is discarded
- New container starts from scratch
- Application rebuilds caches, reconnects to databases, and reloads configuration
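The sequence above can be sketched in a few lines of shell. This is only an illustration of the TERM, grace period, KILL pattern; `stop_like_runtime` is a hypothetical helper name, and real runtimes implement this internally rather than with a script like this.

```shell
# Hypothetical helper: mimic the SIGTERM -> grace period -> SIGKILL sequence.
# Usage: stop_like_runtime PID [GRACE_SECONDS]
stop_like_runtime() {
  local pid=$1
  local grace=${2:-10}                        # seconds to wait before SIGKILL
  kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
  while [ "$grace" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0    # no longer signalable: it exited
    sleep 1
    grace=$((grace - 1))
  done
  kill -KILL "$pid" 2>/dev/null               # grace period expired
  return 0
}
```

Whichever signal ends the process, nothing it held in memory survives, which is exactly the cost the rest of this article tries to avoid.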
For a web application with a 2GB in-memory cache, this may result in 30-60 seconds of degraded performance while the cache rebuilds. For a machine learning inference service with loaded models, the restart time could be several minutes.
How CRIU Works: Process State Serialization
CRIU runs in userspace, as its name says, and relies on several Linux kernel interfaces to capture and restore complete process state:
Memory Dumping
CRIU uses /proc/PID/pagemap and /proc/PID/maps to identify all memory regions belonging to a process tree. It then:
- Freezes the process tree by attaching to each task with ptrace(PTRACE_SEIZE) and interrupting it
- Dumps all memory pages to disk
- Captures memory mapping information (heap, stack, shared libraries)
- Records memory protection flags and special mappings
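To get a feel for what /proc/PID/maps exposes, here is a minimal sketch that adds up the size of every mapping in maps-format text. `maps_total_kib` is a hypothetical helper, not part of CRIU, and the total is only an upper bound on a dump: it counts address ranges, not the pages that are actually populated.

```shell
# Hypothetical helper: sum the sizes (in KiB) of all mappings read from stdin.
# Each input line starts with "START-END" in hex, as in /proc/PID/maps.
maps_total_kib() {
  local total=0 range rest start end
  while read -r range rest; do
    [ -n "$range" ] || continue
    start=${range%-*}   # hex address before the dash
    end=${range#*-}     # hex address after the dash
    total=$(( total + (0x$end - 0x$start) / 1024 ))
  done
  echo "$total"
}
```

On a Linux host, `maps_total_kib < /proc/$$/maps` reports the figure for the current shell.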
File Descriptor Preservation
Every open file descriptor is catalogued and preserved:
- Regular files: path and offset position
- Sockets: protocol state, connection endpoints, buffer contents
- Pipes: buffer data and connection topology
- Device files: state-dependent handling
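The categories above can be seen from userspace by reading the symlinks in /proc/PID/fd. The sketch below is a rough classifier based only on the link target's prefix; both function names are hypothetical, and CRIU's own handling is far more detailed.

```shell
# Hypothetical helper: classify a /proc/PID/fd link target by its prefix.
classify_fd_target() {
  case "$1" in
    socket:*)     echo socket ;;
    pipe:*)       echo pipe ;;
    anon_inode:*) echo anon_inode ;;   # eventfd, epoll, timerfd, ...
    /dev/*)       echo device ;;       # rough: anything under /dev
    /*)           echo file ;;
    *)            echo unknown ;;
  esac
}

# Hypothetical helper: print "fd <TAB> kind <TAB> target" for each fd of a PID.
list_fds() {
  local fd target
  for fd in /proc/"$1"/fd/*; do
    target=$(readlink "$fd") || continue
    printf '%s\t%s\t%s\n' "${fd##*/}" "$(classify_fd_target "$target")" "$target"
  done
}
```

Running `list_fds $$` on a Linux host lists the descriptors of the current shell.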
Process Tree Topology
CRIU reconstructs the exact process hierarchy:
- Parent-child relationships
- Process groups and sessions
- Signal handlers and pending signals
- CPU registers and execution state
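Part of this topology is visible in /proc/PID/stat, whose first fields are the PID, the command name in parentheses, the state, the parent PID, the process group, and the session. A minimal sketch for pulling those out follows; `stat_topology` is a hypothetical helper, not a CRIU tool.

```shell
# Hypothetical helper: read one /proc/PID/stat line on stdin and print
# the process's place in the tree.
stat_topology() {
  local line pid rest
  read -r line
  pid=${line%% *}        # first field
  rest=${line##*") "}    # drop "PID (comm) "; comm may contain spaces or ')'
  # Deliberately unquoted so the fields split: $1=state $2=ppid $3=pgrp $4=session
  set -- $rest
  echo "pid=$pid ppid=$2 pgrp=$3 sid=$4"
}
```

For example, `stat_topology < /proc/$$/stat` prints the parent, process group, and session of the current shell.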
Practical Implementation with Docker
Let's walk through a real checkpoint/restore scenario. First, ensure CRIU is installed and your kernel supports the necessary features:
# Install CRIU
sudo apt install criu
# Check kernel compatibility (needs root; the older --ms flag is deprecated)
sudo criu check
# Verify Docker experimental features
docker version --format '{{.Server.Experimental}}'
Start a container with checkpoint support enabled:
# Run container with checkpoint support
docker run -d --name webapp \
  --security-opt seccomp=unconfined \
  --cap-add SYS_PTRACE \
  --cap-add SYS_ADMIN \
  nginx:latest
# Generate some state
docker exec webapp bash -c "echo 'test data' > /tmp/state.txt"
Create a checkpoint:
# Checkpoint the container
docker checkpoint create webapp checkpoint1
# Verify container is stopped
docker ps -a
Restore from checkpoint:
# Restore container from checkpoint
docker start --checkpoint checkpoint1 webapp
# Verify state preservation
docker exec webapp cat /tmp/state.txt
Integration with containerd
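For more advanced use cases, containerd exposes checkpoint/restore through its ctr client, with CRIU doing the work underneath:
# Checkpoint the container together with its running task
# (flags differ between containerd releases; confirm with: ctr containers checkpoint --help)
ctr containers checkpoint --task mycontainer checkpoint1
# Restore on the same or a different node (the checkpoint image must be available there)
ctr containers restore --live mycontainer checkpoint1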
Production Considerations
Performance Impact
Checkpoint operations aren't free. The figures below are rough and depend heavily on hardware and workload, so measure your own:
- Memory dump time: ~100MB/sec for typical workloads
- Network freeze duration: 10-500ms depending on connection count
- Restore time: Usually 2-5x faster than cold start
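A back-of-the-envelope estimate follows directly from the dump rate: divide the resident memory by the throughput. The helper below is a hypothetical sketch that uses the ~100MB/sec figure above as its default; substitute a rate measured on your own hardware.

```shell
# Hypothetical helper: estimate dump time in whole seconds, rounded up.
# Usage: estimate_dump_seconds MEMORY_MB [RATE_MB_PER_SEC]
estimate_dump_seconds() {
  local mem_mb=$1
  local rate=${2:-100}   # assumed default, taken from the rough figure above
  echo $(( (mem_mb + rate - 1) / rate ))
}
```

With the default rate, `estimate_dump_seconds 2048` gives 21 seconds for the 2GB cache from earlier, and `estimate_dump_seconds 8192` gives 82 seconds for an 8GB process.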
Kernel Requirements
CRIU requires specific kernel features:
- CONFIG_CHECKPOINT_RESTORE=y
- CONFIG_NAMESPACES=y
- CONFIG_PID_NS=y
- CONFIG_NET_NS=y
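One way to check these options is to scan the kernel config of the running system. The sketch below reads config text on stdin and prints any of the four options that are not set to y; `missing_kernel_options` is a hypothetical helper, and where the config lives (for example /boot/config-$(uname -r) or /proc/config.gz) depends on the distribution.

```shell
# Hypothetical helper: print every required option that is not "=y"
# in the kernel config text supplied on stdin.
missing_kernel_options() {
  local cfg opt
  cfg=$(cat)
  for opt in CONFIG_CHECKPOINT_RESTORE CONFIG_NAMESPACES CONFIG_PID_NS CONFIG_NET_NS; do
    # -x: the whole line must match; -q: no output, exit status only
    printf '%s\n' "$cfg" | grep -qx "${opt}=y" || echo "$opt"
  done
}
```

Typical usage would be `missing_kernel_options < "/boot/config-$(uname -r)"` or `zcat /proc/config.gz | missing_kernel_options`; empty output means all four options are enabled.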
Security Implications
Checkpoint images contain complete process memory, so treat them as sensitive data:
- Encrypt checkpoint storage
- Implement access controls
- Assume secrets held in memory (keys, tokens, passwords) are present in the dump
- Validate checkpoint integrity
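For the integrity point, a checksum manifest is a simple starting place. The sketch below assumes the checkpoint is a directory of image files and uses GNU coreutils; `seal_checkpoint` and `verify_checkpoint` are hypothetical names. A plain checksum catches corruption in transit, but not deliberate tampering unless the manifest is signed or stored separately.

```shell
# Hypothetical helper: write a SHA256SUMS manifest for every file in a
# checkpoint directory (assumes GNU find, sort, xargs and sha256sum).
seal_checkpoint() {
  (cd "$1" && find . -type f ! -name SHA256SUMS -print0 \
     | sort -z | xargs -0 sha256sum > SHA256SUMS)
}

# Hypothetical helper: return non-zero if any file no longer matches the manifest.
verify_checkpoint() {
  (cd "$1" && sha256sum --quiet -c SHA256SUMS)
}
```

Run `seal_checkpoint` right after the checkpoint is created and `verify_checkpoint` on the restore host before starting the container.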
Limitations and Gotchas
Network Connections
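- Established TCP connections can be restored only when this is explicitly enabled (CRIU's --tcp-established option) and the restored container keeps the same IP address; otherwise clients must reconnect
- UDP sockets restore more reliably because they carry no connection state
- External services may time out during migration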
File System Dependencies
- Absolute paths must exist on restore host
- Mounted volumes need identical configuration
- Device files may not be portable
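A cheap preflight check on the restore host is to confirm that the paths the application depends on actually exist there. The sketch below reads one path per line and prints the ones that are missing; `missing_paths` is a hypothetical helper, and building the list of required paths is left to you.

```shell
# Hypothetical helper: read one path per line on stdin and print those
# that do not exist on this host.
missing_paths() {
  local p
  while IFS= read -r p; do
    [ -n "$p" ] || continue      # skip blank lines
    [ -e "$p" ] || echo "$p"
  done
}
```

Empty output means every listed path is present; it says nothing about whether the contents or mount options match the source host.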
Container Runtime Integration
- Docker checkpoint support is experimental
- Kubernetes native support is limited at the time of writing
- Custom orchestration is often required
Advanced Use Cases
Database Migration
For databases with large buffer pools:
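# Checkpoint MySQL container with 8GB buffer pool; keep the generated name for the restore
CHECKPOINT="checkpoint-$(date +%s)"
docker checkpoint create mysql-prod "$CHECKPOINT"
# Restore in <30 seconds vs 5+ minutes cold start
# (on a new node, the container and the checkpoint files must be made available there first)
docker start --checkpoint "$CHECKPOINT" mysql-prod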
Stateful Service Scaling
CRIU enables novel scaling patterns:
- Checkpoint running instance
- Restore multiple copies for instant horizontal scaling
- Preserve expensive initialization state
Future: Kubernetes Integration
Several projects are working on Kubernetes integration:
- Kubernetes Enhancement Proposal (KEP) 2008, Forensic Container Checkpointing, which adds a kubelet API for checkpointing a container
- Checkpoint/restore support in Podman and CRI-O
- Third-party operators for automated live migration, one of which is DevZero
Conclusion
CRIU transforms container restart from a disruptive operation into seamless live migration. While not suitable for every workload, it's particularly valuable for:
- Stateful applications with expensive initialization
- Services with large in-memory caches
- Long-running computations that need migration
- Zero-downtime maintenance scenarios
The technology is production-ready for specific use cases, though broader ecosystem integration is still evolving. For organizations running stateful workloads at scale, CRIU provides a powerful tool for achieving true zero-downtime operations.
Ready to implement live migration in your infrastructure? Start with non-critical workloads, measure the performance characteristics, and gradually expand to more critical services as you build operational confidence.