GPU-Accelerated Video Processing with FFmpeg and OpenCV 4.8 in CUDA-Enabled Docker Containers
Building a high-performance video processing environment involves integrating CUDA 12.0, cuDNN 8, and the NVIDIA Video Codec SDK with FFmpeg and OpenCV. This configuration enables hardware-accelerated decoding and encoding directly within a containerized environment.
Docker Container Configuration
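To access GPU hardware for video tasks, start a container from the official NVIDIA CUDA development image. Ensure the NVIDIA_DRIVER_CAPABILITIES environment variable includes the video capability in addition to compute and utility; without it, the decode and encode driver libraries are not exposed inside the container. The --gpus flag relies on the NVIDIA Container Toolkit being installed on the host.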
docker run -dit --gpus all \
-h gpu-processor \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
--name media-gpu-env \
nvidia/cuda:12.0.1-cudnn8-devel-ubuntu20.04 bash
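Once inside the container, it can be worth confirming that the capability list actually reached the environment. The following is a minimal sketch, assuming the variable was passed as in the command above; missing_capabilities is a hypothetical helper name for illustration, not part of any NVIDIA tooling.

```python
import os


def missing_capabilities(value, required=("compute", "utility", "video")):
    """Return the required capabilities absent from a
    NVIDIA_DRIVER_CAPABILITIES-style string (comma-separated, or 'all')."""
    granted = {item.strip() for item in value.split(",") if item.strip()}
    if "all" in granted:
        # 'all' enables every driver capability, including video.
        return []
    return [cap for cap in required if cap not in granted]


# Read the value set by `docker run -e ...`; empty when the variable is unset.
current = os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "")
print("Missing capabilities:", missing_capabilities(current) or "none")
```

If video shows up as missing, the container has to be recreated with the corrected -e option, since the capability set is applied when the container starts.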
Building FFmpeg with NVENC and CUVID Support
1. System Dependencies
Install the necessary build tools and media libraries required for a full-featured FFmpeg installation.
apt-get update && apt-get upgrade -y
apt-get install -y build-essential cmake git pkg-config unzip yasm \
libx264-dev libx265-dev libvpx-dev libfdk-aac-dev libmp3lame-dev \
libopus-dev libass-dev libfreetype6-dev libsdl2-dev libtool \
libva-dev libvdpau-dev libvorbis-dev libxcb1-dev libxcb-shm0-dev \
libxcb-xfixes0-dev zlib1g-dev
2. Install NVIDIA Codec Headers
FFmpeg requires specific headers to interface with the NVIDIA hardware codecs.
git clone https://github.com/FFmpeg/nv-codec-headers.git
cd nv-codec-headers
git checkout n12.0.16.0
make && make install
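The headers install a pkg-config file named ffnvcodec, which FFmpeg's configure script looks up later. A small sketch for checking that the lookup works before starting the long FFmpeg build is shown here; the helper names are hypothetical, and the check assumes pkg-config is on the PATH and searches the default install location used by make install.

```python
import subprocess


def ffnvcodec_version():
    """Return the version pkg-config reports for ffnvcodec, or None if unavailable."""
    try:
        result = subprocess.run(
            ["pkg-config", "--modversion", "ffnvcodec"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None  # pkg-config itself is not installed
    if result.returncode != 0:
        return None  # package not found on the pkg-config search path
    return result.stdout.strip()


def version_tuple(version):
    """Turn a dotted version such as '12.0.16.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


found = ffnvcodec_version()
print("ffnvcodec:", found if found else "not visible to pkg-config")
```

A result of None usually means PKG_CONFIG_PATH does not include the directory where ffnvcodec.pc was installed.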
3. Configure Video Codec SDK
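Download the Video Codec SDK (e.g., version 12.2.72) from the NVIDIA developer site, then copy its headers and library stubs into the CUDA toolkit directory so that both FFmpeg and OpenCV can find them at build time.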
# Assuming the SDK is extracted in /tmp/video_sdk
cp /tmp/video_sdk/Interface/* /usr/local/cuda/include/
cp /tmp/video_sdk/Lib/linux/stubs/x86_64/* /usr/local/cuda/lib64/stubs/
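A quick way to confirm the copy succeeded is to look for the files the later builds depend on. This is a sketch only: the file names below are assumed from the usual Video Codec SDK layout and may differ between SDK releases, and missing_files is a hypothetical helper.

```python
from pathlib import Path

# File names assumed from the usual SDK layout; adjust if your release differs.
EXPECTED_HEADERS = ("cuviddec.h", "nvcuvid.h", "nvEncodeAPI.h")
EXPECTED_STUBS = ("libnvcuvid.so", "libnvidia-encode.so")


def missing_files(directory, names):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# Paths match the cp destinations used above.
print("Missing headers:", missing_files("/usr/local/cuda/include", EXPECTED_HEADERS))
print("Missing stubs:", missing_files("/usr/local/cuda/lib64/stubs", EXPECTED_STUBS))
```

Two empty lists suggest the SDK files are where the FFmpeg and OpenCV builds will look for them.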
4. Compile FFmpeg
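Configure the build to enable CUDA, CUVID, and NVENC. Set the installation prefix to a custom directory to avoid conflicts with any FFmpeg already installed on the system. The commands below assume the FFmpeg 5.1 source tree has been extracted to /tmp/ffmpeg-5.1. Because --enable-nonfree is required for libnpp and libfdk-aac, the resulting binaries cannot be redistributed.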
cd /tmp/ffmpeg-5.1
./configure --prefix=/usr/local/ffmpeg_cuda \
--enable-cuda-nvcc --enable-cuvid --enable-nvenc \
--enable-nonfree --enable-libnpp \
--extra-cflags=-I/usr/local/cuda/include \
--extra-ldflags=-L/usr/local/cuda/lib64 \
--enable-gpl --enable-libx264 --enable-libx265 \
--enable-shared --enable-libass --enable-libfdk-aac \
--enable-libfreetype --enable-libmp3lame --enable-libopus
make -j$(nproc)
make install
Add the new binaries to the system path:
export PATH=/usr/local/ffmpeg_cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/ffmpeg_cuda/lib:$LD_LIBRARY_PATH
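To check that the new binary was built with NVENC, the encoder list can be filtered for names ending in _nvenc. The sketch below parses the text printed by ffmpeg -hide_banner -encoders; the function names are hypothetical, and the parsing assumes the usual column layout in which the encoder name is the second field on each line.

```python
import subprocess


def nvenc_encoders(encoders_output):
    """Pick encoder names ending in '_nvenc' out of `ffmpeg -encoders` text."""
    names = []
    for line in encoders_output.splitlines():
        fields = line.split()
        # Assumed layout: <flags> <name> <description...>
        if len(fields) >= 2 and fields[1].endswith("_nvenc"):
            names.append(fields[1])
    return names


def list_nvenc_encoders(ffmpeg="ffmpeg"):
    """Run the given ffmpeg binary and return its NVENC encoders (empty if absent)."""
    try:
        result = subprocess.run(
            [ffmpeg, "-hide_banner", "-encoders"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return []  # binary not found on PATH
    return nvenc_encoders(result.stdout)


print("NVENC encoders:", list_nvenc_encoders())
```

An empty list means either that the binary on the PATH is not the one just built or that the NVENC options were not picked up during configuration.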
Compiling OpenCV 4.8.0 with CUDA Support
1. Source Preparation
Download both the main OpenCV repository and the contribution modules, checking out the same 4.8.0 tag in each so the two trees stay compatible.
git clone https://github.com/opencv/opencv.git -b 4.8.0
git clone https://github.com/opencv/opencv_contrib.git -b 4.8.0
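2. System Integration Fixes
Create an unversioned symbolic link for the CUVID library so the linker can find it, then append the nvcuvid.h include to the cudacodec precompiled header so the module compiles against the NVIDIA codec headers copied earlier. The libnvcuvid.so.1 file is provided by the host driver and only appears in the container when the video capability is enabled.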
ln -s /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 /usr/lib/x86_64-linux-gnu/libnvcuvid.so
# Patch precomp.hpp in opencv_contrib to include the codec header
echo "#include <nvcuvid.h>" >> opencv_contrib/modules/cudacodec/src/precomp.hpp
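Note that the echo command appends the include every time it runs, so repeating the step leaves duplicate lines in the header. An idempotent variant could look like the following sketch; ensure_line is a hypothetical helper, and the path is the same one used in the command above.

```python
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file unless an identical line already exists.

    Returns True when the file was modified, False when it was left alone.
    """
    target = Path(path)
    text = target.read_text()
    if line in (existing.strip() for existing in text.splitlines()):
        return False
    # Keep the appended line on its own row even if the file lacks a final newline.
    separator = "" if text.endswith("\n") or not text else "\n"
    target.write_text(text + separator + line + "\n")
    return True


# Usage, run from the directory that contains the opencv_contrib checkout:
# ensure_line("opencv_contrib/modules/cudacodec/src/precomp.hpp",
#             "#include <nvcuvid.h>")
```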
3. Build with CMake
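Use CMake to generate the build files, ensuring WITH_CUDA, WITH_NVCUVID, and BUILD_opencv_cudacodec are enabled. Set CUDA_ARCH_BIN to the compute capability of the target GPU; the value 8.6 used here corresponds to Ampere cards such as the RTX 30 series.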
mkdir -p opencv/build && cd opencv/build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_NVCUVID=ON \
-D WITH_FFMPEG=ON \
-D BUILD_opencv_cudacodec=ON \
-D CUDA_ARCH_BIN=8.6 \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D BUILD_opencv_python3=ON \
-D PYTHON3_EXECUTABLE=$(which python3) ..
make -j$(nproc)
make install
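The CUDA_ARCH_BIN value has to match the GPU that will run the code. One way to look it up is nvidia-smi --query-gpu=compute_cap --format=csv,noheader, which prints one value per GPU on reasonably recent drivers. The sketch below parses that output; the function names are hypothetical, and availability of the compute_cap field on a given driver is an assumption.

```python
import subprocess


def compute_capabilities(smi_output):
    """Return the unique 'major.minor' values found in nvidia-smi output, sorted."""
    caps = set()
    for line in smi_output.splitlines():
        major, dot, minor = line.strip().partition(".")
        if dot and major.isdigit() and minor.isdigit():
            caps.add((int(major), int(minor)))
    return [f"{major}.{minor}" for major, minor in sorted(caps)]


def query_compute_capabilities():
    """Ask nvidia-smi for the compute capability of each GPU (empty list on failure)."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return []  # nvidia-smi is not available in this environment
    return compute_capabilities(result.stdout)


print("Detected compute capabilities:", query_compute_capabilities())
```

Building for an architecture the GPU does not support tends to surface only at runtime, so checking this value before the long OpenCV build can save a rebuild.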
Verifying Accelerated Hardware Access
After installation, verify that the cudacodec module is functional in Python. If the reader returns a frame, OpenCV is decoding the video on the GPU and the frame initially resides in GPU memory.
import cv2
# Check OpenCV version
print(f"OpenCV Version: {cv2.__version__}")
# Initialize GPU-based video reader
stream_path = "sample_video.mp4"
gpu_reader = cv2.cudacodec.createVideoReader(stream_path)
# Retrieve a frame from the GPU memory
success, gpu_frame = gpu_reader.nextFrame()
if success:
    # Transfer frame from GPU to CPU memory for analysis
    cpu_frame = gpu_frame.download()
    print(f"Captured frame resolution: {cpu_frame.shape}")
else:
    print("No frame could be decoded from the stream")
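If the reader cannot be created, the build summary returned by cv2.getBuildInformation() is a useful first place to look. The sketch below extracts the CUDA-related entries from that text; the helper names are hypothetical, and the parsing assumes the summary uses "NVIDIA CUDA", "NVIDIA GPU arch", and "cuDNN" as line labels, which may vary between OpenCV versions.

```python
def cuda_build_summary(build_info):
    """Extract CUDA-related entries from text shaped like cv2.getBuildInformation()."""
    wanted = ("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN")
    summary = {}
    for line in build_info.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted:
            summary[key] = value.strip()
    return summary


def has_nvcuvid(summary):
    """True when the CUDA entry reports YES and lists NVCUVID among its features."""
    cuda = summary.get("NVIDIA CUDA", "")
    return cuda.startswith("YES") and "NVCUVID" in cuda


# Usage on a machine with the freshly built module:
# import cv2
# info = cuda_build_summary(cv2.getBuildInformation())
# print(info, has_nvcuvid(info))
```

When NVCUVID is missing from the CUDA entry, the usual cause is that CMake did not find the codec headers or the libnvcuvid symbolic link during configuration.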