
How to use an NVIDIA GPU in a Docker container

2022-01-27 01:21:45 mikes zhang

Docker containers don't see your system's GPU automatically. That causes degraded performance for GPU-dependent workloads, such as machine learning frameworks. Here's how to expose your host's NVIDIA GPU to your containers.

Making GPUs work in Docker

Docker containers share your host's kernel but bring their own operating system and software packages. This means they lack the NVIDIA drivers used to interact with your GPU. By default, Docker doesn't even add GPUs to containers, so a plain docker run won't see your hardware at all.
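As a quick illustration, here's what happens with a stock ubuntu:20.04 image (any image without NVIDIA tooling behaves the same way):

# a plain container has no NVIDIA driver binaries and no GPU devices mounted,
# so Docker fails with an "executable file not found" error
docker run --rm ubuntu:20.04 nvidia-smi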

In a nutshell, getting GPU workloads running is a two-step process: install the drivers within your image, then instruct Docker to add GPU devices to your containers at runtime.

This guide focuses on modern versions of CUDA and Docker. The latest release of the NVIDIA Container Toolkit is designed specifically for combinations of CUDA 10 and Docker Engine 19.03 and later. Older builds of CUDA, Docker, and the NVIDIA drivers may require additional steps.

Adding the NVIDIA drivers

Make sure the NVIDIA drivers are working properly on your host before you continue with your Docker configuration. You should be able to successfully run nvidia-smi and see your GPU's name, driver version, and CUDA version.
To use your GPU with Docker, start by adding the NVIDIA Container Toolkit to your host. This integrates with Docker Engine to automatically configure your containers for GPU support.

Add the toolkit's package repository to your system using the example commands:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
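The distribution variable resolves to the identifier used in the repository URL; you can check it before running the curl commands:

# prints e.g. ubuntu20.04 on Ubuntu 20.04
. /etc/os-release; echo $ID$VERSION_ID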

Next, install the nvidia-docker2 package on your host:

sudo apt-get update
sudo apt-get install -y nvidia-docker2

Restart the Docker daemon to complete the installation:

sudo systemctl restart docker

The Container Toolkit should now be operational. You're ready to start a test container.
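One quick sanity check, assuming the installation succeeded, is to confirm that Docker now lists an nvidia runtime:

# the output should include "nvidia" alongside the default runc runtime
docker info | grep -i runtimes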

Starting a container with GPU access

Because Docker doesn't provide your system's GPUs by default, you need to create containers with the --gpus flag for your hardware to show up. You can either specify particular devices to enable or use the all keyword.

The nvidia/cuda images are preconfigured with the CUDA binaries and GPU tools. Start a container and run the nvidia-smi command to check that your GPU is accessible. The output should match what you saw when running nvidia-smi on your host. The reported CUDA version may differ, depending on the toolkit versions on your host and in your selected container image.

docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
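If you only want to expose particular devices, the --gpus flag also accepts a device list or a count (syntax from the Docker documentation; device indexes match the ordering shown by nvidia-smi):

# expose only the first GPU
docker run -it --gpus '"device=0"' nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi

# expose any two GPUs
docker run -it --gpus 2 nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi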


Selecting a base image

Using one of the nvidia/cuda tags is the quickest and easiest way to get your GPU workload running in Docker. Many different variants are available; they provide a matrix of operating system, CUDA version, and NVIDIA software options. The images are built for multiple architectures.

Each tag has this format:

11.4.0-base-ubuntu20.04

11.4.0 – CUDA version.
base – Image flavor .
ubuntu20.04 – Operating system version .

Three different image flavors are available. The base image is a minimal option with the essential CUDA runtime binaries. runtime is a more fully-featured option that includes the CUDA math libraries and NCCL for cross-GPU communication. The third variant, devel, gives you everything from runtime as well as headers and development tools for creating custom CUDA images.
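Applying the tag format above, the three flavors of the same CUDA and OS combination look like this (check Docker Hub for the tags currently published):

docker pull nvidia/cuda:11.4.0-base-ubuntu20.04     # minimal CUDA runtime
docker pull nvidia/cuda:11.4.0-runtime-ubuntu20.04  # adds math libraries and NCCL
docker pull nvidia/cuda:11.4.0-devel-ubuntu20.04    # adds headers and development tools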

If one of these images will work for you, use it as the base in your Dockerfile. You can then use regular Dockerfile instructions to install your programming languages, copy in your source code, and configure your application. It removes the complexity of manual GPU setup steps.

FROM nvidia/cuda:11.4.0-base-ubuntu20.04
RUN apt-get update
RUN apt-get install -y python3 python3-pip
RUN pip3 install tensorflow-gpu

COPY tensor-code.py .
ENTRYPOINT ["python3", "tensor-code.py"]

Build and run this image, starting the container with the --gpus flag, to launch your TensorFlow workload with GPU acceleration.
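For example (tensor-app is an arbitrary tag chosen for this sketch):

docker build -t tensor-app .
docker run -it --gpus all tensor-app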

Manually configuring your image

If you need to choose a different base, you can manually add CUDA support to your image. The best way to achieve this is to reference the official NVIDIA Dockerfiles.

Copy the instructions used to add the CUDA package repository, install the library, and link it into your path. We can't reproduce all the steps in this guide, as they vary by CUDA version and operating system.

Pay attention to the environment variables at the end of the Dockerfile: they define how containers using your image integrate with the NVIDIA Container Runtime:

ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
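When a container runs under the NVIDIA runtime, these variables can also be overridden per container. A minimal sketch, assuming the legacy --runtime=nvidia mode that nvidia-docker2 installs:

# restrict the container to the first GPU;
# "utility" mounts nvidia-smi, "compute" enables CUDA
docker run -it --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  ubuntu:20.04 nvidia-smi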

Once CUDA is installed and the environment variables are set, your image should detect your GPU. This gives you better control over the contents of your image, but you may need to adjust the instructions as new CUDA versions release.

How does it work?

The NVIDIA Container Toolkit is a collection of packages which wrap container runtimes like Docker with an interface to the NVIDIA driver on the host. The libnvidia-container library is responsible for providing an API and CLI that expose your system's GPUs to containers via the runtime wrapper.
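libnvidia-container ships a command-line tool you can use to inspect what it detects on the host, for instance:

# prints the driver version and the GPU devices libnvidia-container can see
nvidia-container-cli info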

The nvidia-container-toolkit component implements a container runtime prestart hook. This means it's notified when a new container is about to start. It looks at the GPUs you want to attach and invokes libnvidia-container to handle container creation.

The hook is enabled by nvidia-container-runtime. This wraps your "real" container runtime, such as containerd or runc, to ensure the NVIDIA prestart hook is run. Your existing runtime continues the container start process after the hook executes. Once the Container Toolkit is installed, you'll see the NVIDIA runtime selected in your Docker daemon configuration file.
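The nvidia-docker2 package registers that runtime in /etc/docker/daemon.json; after installation the file typically contains an entry along these lines:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}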

Summary

Using an NVIDIA GPU inside a Docker container requires you to add the NVIDIA Container Toolkit to the host. This integrates the NVIDIA drivers with your container runtime.

Calling docker run with the --gpus flag makes your hardware visible to the container. This must be set on each container you launch, even after the Container Toolkit has been installed.
NVIDIA provides preconfigured CUDA Docker images that you can use as a quick starter for your application. If you need something more specific, refer to the official Dockerfiles to assemble your own that's still compatible with the Container Toolkit.

Copyright notice
Author: mikes zhang. Please include the original link when reprinting. Thank you.
https://en.cdmana.com/2022/01/202201270121404865.html
