Using NVIDIA GPUs with Kubernetes After Dockershim Removal and Switching to Containerd
Kubernetes relies on the Container Runtime Interface (CRI) to communicate with container runtimes. Docker Engine never implemented CRI natively, so Kubernetes historically included a built-in dockershim component to bridge the gap. With dockershim removed in Kubernetes 1.24, users are migrating to CRI-compliant runtimes such as containerd. While plenty of documentation covers GPU setup with Docker in Kubernetes, resources for containerd-based clusters are less common. This walkthrough focuses on that transition while ensuring Pods can access NVIDIA hardware.
All steps assume a working Kubernetes cluster with containerd already installed, and skip OS-specific prerequisites beyond minimal examples. The core goal remains the same regardless of runtime: make host GPUs visible and usable inside containers.
Step 1: Install NVIDIA Host Drivers
NVIDIA kernel drivers are required on every GPU-enabled worker node. The official .run installer works across most Linux distributions and simplifies cleanup, though it requires compilation tools and kernel headers matching the running kernel.
For Debian/Ubuntu-based systems:
# Install required build dependencies
apt update && apt install -y gcc make linux-headers-$(uname -r)
# Download driver script (adjust version based on GPU model)
wget https://us.download.nvidia.com/tesla/470.239.06/NVIDIA-Linux-x86_64-470.239.06.run
# Make executable and run in silent mode
chmod +x NVIDIA-Linux-x86_64-470.239.06.run
./NVIDIA-Linux-x86_64-470.239.06.run --silent
# Verify installation
nvidia-smi
A successful verification shows GPU details, driver version, and no errors.
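For scripted checks, nvidia-smi also supports machine-readable queries via --query-gpu and --format=csv. A minimal sketch of parsing that output is below; the GPU name and driver version shown are sample values, not output from a real node:

```shell
# On a GPU node you would run:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
# Here we parse a representative output line instead (sample values):
sample='Tesla T4, 470.239.06'

# Fields are comma-plus-space separated; the second field is the driver version
driver=$(printf '%s' "$sample" | awk -F', ' '{print $2}')
echo "driver: $driver"
```

A non-empty driver version matching the installer you ran is a good sign the kernel module loaded correctly.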
Step 2: Set Up NVIDIA Container Toolkit
nvidia-container-runtime (now part of the NVIDIA Container Toolkit) wraps the OCI runtime and, when a container requests GPU access, injects GPU devices, CUDA libraries, and environment variables into the container via a prestart hook.
Configure Package Repository
# Import GPG key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
# Add repository for Debian/Ubuntu
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
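The sed expression above rewrites each deb line so the repository is pinned to the imported signing key. Applied to a representative line from the list file (the exact contents of the upstream file may vary), it behaves like this:

```shell
# A representative line from nvidia-container-toolkit.list (sample)
line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# The same sed expression used in the pipeline above: prepend the
# [signed-by=...] option to every "deb https://" entry
echo "$line" | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
```

Using # as the sed delimiter avoids escaping the slashes in the URL and key path.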
Install Toolkit
apt update && apt install -y nvidia-container-toolkit
Step 3: Integrate Toolkit with containerd
containerd uses a TOML configuration file. First, generate a default config if none exists, then update runtime settings to use NVIDIA’s modified OCI runtime as the default for CRI workloads.
# Create config directory
mkdir -p /etc/containerd
# Generate default config
containerd config default | tee /etc/containerd/config.toml
Edit /etc/containerd/config.toml to modify these sections:
- Set default_runtime_name to "nvidia" in the CRI containerd plugin
- Add or update the nvidia runtime entry
- Use io.containerd.runc.v2 for the runtime type
...
[plugins."io.containerd.grpc.v1.cri"]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd]
    snapshotter = "overlayfs"
    default_runtime_name = "nvidia"
    no_pivot = false
    ...
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
        runtime_type = "io.containerd.runc.v2"
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
        base_runtime_spec = ""
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
          BinaryName = "nvidia-container-runtime"
...
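Before restarting, a quick grep can confirm the two key settings landed in the file. The sketch below writes a sample file so it can run anywhere; on a real node, point CONFIG at /etc/containerd/config.toml instead:

```shell
# Sample file standing in for the edited config; on a node use:
#   CONFIG=/etc/containerd/config.toml
CONFIG=/tmp/config-sample.toml
cat > "$CONFIG" <<'EOF'
    default_runtime_name = "nvidia"
          BinaryName = "nvidia-container-runtime"
EOF

# Both settings must be present for CRI workloads to default to the NVIDIA runtime
grep -q 'default_runtime_name = "nvidia"' "$CONFIG" && echo "default runtime: ok"
grep -q 'BinaryName = "nvidia-container-runtime"' "$CONFIG" && echo "nvidia binary: ok"
```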
Restart containerd to apply changes:
systemctl restart containerd && systemctl status containerd
Step 4: Deploy NVIDIA Device Plugin
Kubernetes uses Device Plugins to advertise specialized hardware to the scheduler. The official NVIDIA plugin registers GPUs as nvidia.com/gpu resources.
# Deploy a recent stable version
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.5/nvidia-device-plugin.yml
Check plugin DaemonSet status:
kubectl get pod -n kube-system -l name=nvidia-device-plugin-ds
Successful logs from a plugin pod show NVML initialization, GRPC server startup, and registration with the local kubelet:
kubectl logs -n kube-system -l name=nvidia-device-plugin-ds
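Once the plugin registers with the kubelet, each GPU node should advertise nvidia.com/gpu under both Capacity and Allocatable. On a live cluster you would run kubectl describe node <gpu-node>; the sketch below greps a representative excerpt of that output (sample values, not from a real cluster):

```shell
# Representative excerpt of `kubectl describe node` for a one-GPU node (sample)
cat > /tmp/node-describe.txt <<'EOF'
Capacity:
  cpu:             8
  nvidia.com/gpu:  1
Allocatable:
  cpu:             8
  nvidia.com/gpu:  1
EOF

# The GPU resource should appear twice: once under Capacity, once under Allocatable
grep -c 'nvidia.com/gpu' /tmp/node-describe.txt
```

If Allocatable shows 0 while Capacity shows 1, the plugin registered but the device is unhealthy; check the plugin pod logs.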
Step 5: Validate GPU Functionality
First test with containerd’s native CLI (ctr) to confirm runtime integration works outside Kubernetes:
# Pull CUDA base image
ctr image pull docker.io/nvidia/cuda:11.8.0-base-ubuntu22.04
# Run test with GPU 0
ctr run --rm -t --gpus 0 docker.io/nvidia/cuda:11.8.0-base-ubuntu22.04 gpu-test nvidia-smi
Next, test within a Kubernetes Pod using a simple CUDA vector addition workload:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-vector-demo
  namespace: default
spec:
  restartPolicy: Never
  containers:
    - name: cuda-vector-calc
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.8.0-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
      command: ["/bin/sh"]
      args: ["-c", "/usr/local/bin/vectorAdd"]
Apply the manifest and monitor status:
kubectl apply -f gpu-vector-demo.yaml
kubectl get pod gpu-vector-demo
Once the pod completes, check logs for a successful test message:
kubectl logs gpu-vector-demo
Out of the box, the device plugin advertises whole GPUs, so Kubernetes schedules them to one container at a time; sharing a single GPU across multiple containers requires extra configuration such as the plugin's time-slicing mode, NVIDIA MIG on supported hardware, or third-party tools.
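As one illustration, recent versions of the NVIDIA device plugin support an opt-in time-slicing mode configured through a ConfigMap. The sketch below is an assumption-laden example, not a drop-in manifest: the ConfigMap name and replica count are placeholders, and the plugin must be deployed with this config explicitly enabled (for example via its Helm chart options), so consult the plugin's documentation for your version:

```yaml
# Hypothetical time-slicing config for the NVIDIA device plugin;
# name and replicas are example values
apiVersion: v1
kind: ConfigMap
metadata:
  name: nvidia-device-plugin-config
  namespace: kube-system
data:
  config.yaml: |
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 4
```

With a config like this active, a node with one physical GPU would advertise four nvidia.com/gpu resources, with no memory or fault isolation between the sharing containers.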