As GPU-accelerated computing becomes essential in AI, HPC, and scientific computing, developers increasingly turn to containers for reproducible, scalable, and efficient development environments. Docker, when paired with NVIDIA’s CUDA, provides a clean, consistent, and portable platform for building and deploying GPU-powered applications.
This guide walks you through how to compile and run CUDA code using nvcc inside a Docker container, leveraging NVIDIA's official CUDA images.
Prerequisites
Before you start, ensure you have the following installed:
- Docker Engine
- NVIDIA Container Toolkit (nvidia-docker)
Required to provide GPU access to Docker containers:
# For Debian/Ubuntu
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo systemctl restart docker
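On some setups you may also need to register the NVIDIA runtime with Docker explicitly before the verification step below will work. The nvidia-ctk helper ships with the toolkit; a typical sequence is:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker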
Verify GPU access:
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
Example: CUDA Vector Addition Program
Create a simple CUDA program, vector_add.cu:
// vector_add.cu
#include <iostream>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void vector_add(float *a, float *b, float *c, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) c[idx] = a[idx] + b[idx];
}

int main() {
    const int N = 512;
    size_t size = N * sizeof(float);

    // Allocate and initialize host buffers.
    float *h_a = new float[N];
    float *h_b = new float[N];
    float *h_c = new float[N];
    for (int i = 0; i < N; ++i) {
        h_a[i] = i;
        h_b[i] = i * 2;
    }

    // Allocate device buffers and copy the inputs over.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, size);
    cudaMalloc(&d_b, size);
    cudaMalloc(&d_c, size);
    cudaMemcpy(d_a, h_a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, size, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all N elements.
    vector_add<<<(N + 255) / 256, 256>>>(d_a, d_b, d_c, N);

    // Copy the result back (this memcpy waits for the kernel to finish).
    cudaMemcpy(h_c, d_c, size, cudaMemcpyDeviceToHost);

    for (int i = 0; i < 5; ++i)
        std::cout << h_a[i] << " + " << h_b[i] << " = " << h_c[i] << std::endl;

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    delete[] h_a; delete[] h_b; delete[] h_c;
    return 0;
}
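The listing above omits error handling for brevity. One common pattern, shown here only as a minimal sketch (CUDA_CHECK is an illustrative name, not part of the program above), is to wrap each CUDA API call and check the kernel launch:
// error checking sketch, for illustration only
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA API call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            std::fprintf(stderr, "CUDA error %s at %s:%d\n",          \
                         cudaGetErrorString(err), __FILE__, __LINE__);\
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Usage:
//   CUDA_CHECK(cudaMalloc(&d_a, size));
//   vector_add<<<blocks, threads>>>(d_a, d_b, d_c, N);
//   CUDA_CHECK(cudaGetLastError());        // catches launch errors
//   CUDA_CHECK(cudaDeviceSynchronize());   // catches runtime errors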
Create a Dockerfile
Here's a Dockerfile to build the CUDA code inside a container:
# Dockerfile
FROM nvidia/cuda:12.3.2-devel-ubuntu22.04
RUN apt-get update && apt-get install -y build-essential
COPY vector_add.cu /workspace/vector_add.cu
WORKDIR /workspace
RUN nvcc -o vector_add vector_add.cu
CMD ["./vector_add"]
Build and Run the Container
Step 1: Build the image
docker build -t cuda-vector-add .
Step 2: Run with GPU access
docker run --rm --gpus all cuda-vector-add
Expected output:
0 + 0 = 0
1 + 2 = 3
2 + 4 = 6
3 + 6 = 9
4 + 8 = 12
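On a multi-GPU host you can also target a specific device instead of all, using the device selector syntax from Docker's --gpus documentation (device indices follow nvidia-smi ordering):
docker run --rm --gpus '"device=0"' cuda-vector-add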
Interactive Development Using Volumes
You can mount your source code dynamically for faster iteration:
docker run --rm -it --gpus all -v "$PWD":/workspace -w /workspace nvidia/cuda:12.3.2-devel-ubuntu22.04 bash
Inside the container:
nvcc -o vector_add vector_add.cu
./vector_add
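You can also compile and run in a single step without opening an interactive shell, for example:
docker run --rm --gpus all -v "$PWD":/workspace -w /workspace \
  nvidia/cuda:12.3.2-devel-ubuntu22.04 \
  bash -c "nvcc -o vector_add vector_add.cu && ./vector_add"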
Feature Comparison: Local vs. Docker-Based CUDA Development
| Feature | Native CUDA Development | Docker + CUDA Container |
| --- | --- | --- |
| Portability | Tied to local setup | Cross-platform and replicable |
| Isolation | Shared environment | Fully isolated and reproducible |
| Environment Setup Time | Manual (may vary by system) | One-time Dockerfile |
| Ease of Scaling to Cloud | Needs reconfiguration | Plug-and-play with container images |
| GPU Access | Direct | Requires nvidia-container-toolkit |
| Version Control of Toolchain | Manual version tracking | Fixed by Docker image tag |
Pro Tips
- Use nvidia/cuda:<version>-devel-ubuntu<version> for full development with nvcc.
- For runtime-only containers, use nvidia/cuda:<version>-runtime.
- Use .dockerignore to avoid copying unnecessary files.
- Consider using multi-stage builds to separate compilation and runtime for leaner images (see the sketch after this list).
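As a starting point for the multi-stage approach, a minimal sketch might look like this (image tags mirror the ones used earlier; adjust to your CUDA version):
# Dockerfile - illustrative multi-stage sketch
# Stage 1: compile with the full devel image
FROM nvidia/cuda:12.3.2-devel-ubuntu22.04 AS build
WORKDIR /workspace
COPY vector_add.cu .
RUN nvcc -O2 -o vector_add vector_add.cu

# Stage 2: ship only the binary on the slimmer runtime image
FROM nvidia/cuda:12.3.2-runtime-ubuntu22.04
WORKDIR /workspace
COPY --from=build /workspace/vector_add .
CMD ["./vector_add"]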
Conclusion
Using Docker with nvcc is a powerful way to simplify your CUDA development workflow. It eliminates environment inconsistencies and provides a reproducible, scalable path from local development to deployment, whether on bare-metal servers, Kubernetes clusters, or the cloud.