This article shows how to access GPUs from Docker Swarm services. In essence, we need to do two things:
- Set the nodes in the cluster to advertise their GPUs as Docker generic resources;
- Have the service specify the constraint that it needs GPU resources.
Once these are both in place, the swarm orchestrator can automatically allocate services that need GPUs to nodes that have GPUs, without us needing to manually place tasks on specific nodes. Yay!
However, please note that only one Docker service replica can be assigned to a given GPU; there is no time-sharing between services on a single node. Practically this means you need at least as many nodes with GPUs as tasks that require them. If you have 5 nodes with GPUs and start 6 replicas of your service, one replica will stay pending due to lack of resources.
This article assumes you are already familiar with a number of concepts. Here are some resources for more background information:
- A nice introduction to Docker images and containers;
- A tutorial on Docker Swarm and service creation;
- An introduction to specifying constraints on Docker services.
Why GPUs and Docker Swarm?
Why might you want to access GPUs from Docker Swarm services? For this article I’ll assume that you want to rapidly train a lot of neural networks using Apache Spark. We can use Docker Swarm to manage our Spark cluster, deploying the Spark master on one node and replicating the Spark workers across the remaining nodes. With this architecture, we can direct each worker to train a single network, and use the GPU on a given worker node to speed up the training time.
Accessing the GPU from your own software
Before we get to Spark workers and Docker services, we need to ensure that our neural network training code can access the GPU in the first place. The nodes in the cluster should have an NVIDIA GPU (e.g. AWS EC2 instance types starting with p), and the NVIDIA CUDA toolkit installed.
You also need a framework for designing and training neural networks, such as TensorFlow or Theano, or the higher-level wrapper Keras. If installing these Python packages yourself, make sure to install the GPU-enabled versions, e.g.:
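For instance, at the time of writing, TensorFlow's GPU build ships as a separate pip package (a sketch; exact package names can change between releases):

```shell
# The GPU-enabled TensorFlow build is a separate package from plain "tensorflow";
# Keras sits on top of it as a higher-level wrapper.
pip install tensorflow-gpu keras
```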
If running on EC2, Amazon provides an AMI for their GPU-enabled nodes that comes with CUDA, TensorFlow, and Python already installed.
Now your Keras or TensorFlow neural network program should run on the GPU!
Accessing the GPU from a Docker container
Containers are great for abstracting away the details of the native system that we’re running on, but the GPU is one of the details that gets abstracted away! In order for a Docker container to access the GPU, we need to use nvidia-docker instead of docker to run containers.
On Linux we install nvidia-docker through the package manager in the usual way (e.g. apt-get). Then launching a container becomes:
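For example, running nvidia-smi inside a CUDA base image to confirm the container sees the GPU (the image here is illustrative; your own image would work the same way):

```shell
# Same syntax as `docker run`, but the container gets access to the host GPU.
nvidia-docker run --rm nvidia/cuda nvidia-smi
```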
If we launch our Keras program in this container, it will run on the GPU!
Originally, nvidia-docker didn’t support Docker Swarm. This meant that Spark workers couldn’t be replicated across nodes in a cluster. The work-around was to manually allocate a Spark worker to a specific node by issuing an nvidia-docker run command on that node, instead of issuing a service create --replicas request to the swarm manager. That gets the job done, but it misses all the nice benefits of orchestration.
In December 2017, nvidia-docker2 was released, which supports Docker Swarm. Yay! The rest of this article draws from a GitHub comment from January this year explaining how to use nvidia-docker with Docker Swarm. If you previously had nvidia-docker installed, you need to uninstall it and switch to nvidia-docker2 for swarm support. For example:
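On Ubuntu, the switch might look like this (a sketch; the exact steps depend on your distribution and the nvidia-docker install instructions):

```shell
# Remove the old nvidia-docker package...
sudo apt-get purge -y nvidia-docker
# ...and install nvidia-docker2, which registers an "nvidia" runtime with Docker
sudo apt-get install -y nvidia-docker2
# Reload the Docker daemon configuration so it picks up the new runtime
sudo pkill -SIGHUP dockerd
```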
Accessing the GPU from a Docker service
So how do we get Docker services to use the GPU? Well, in addition to the requirements above (CUDA, nvidia-docker2), we need to do three more things:
- Configure the Docker daemon on each node to advertise its GPU
- Make the Docker daemon on each node default to using the nvidia runtime
- Add a constraint to our Docker service specifying that it needs a GPU
Once we take these steps, the orchestrator will be able to see which nodes have GPUs and which services require them, and deploy our services accordingly!
Configuring the Docker daemon
The first step is to find the identifier of the GPU on a specific node, so we can pass it to the daemon later. We find it and store it in an environment variable with this command:
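On a GPU node the command looks like this; since the actual UUID varies per GPU, the parsing step is demonstrated below on an illustrative sample line of nvidia-smi -a output:

```shell
# On a real GPU node (requires the NVIDIA driver):
#   GPU_ID=$(nvidia-smi -a | grep UUID | awk '{print substr($4,1,12)}')

# The same parsing, demonstrated on an illustrative line of `nvidia-smi -a` output:
sample='    GPU UUID                        : GPU-c143e771-58a3-0c5a-ab12-345678901234'
GPU_ID=$(echo "$sample" | grep UUID | awk '{print substr($4,1,12)}')
echo "$GPU_ID"   # GPU-c143e771
```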
What this is doing is running nvidia-smi -a, finding the line containing ‘UUID’, then extracting the first 12 characters of the 4th column of this line. You can see an example of the output of nvidia-smi -a in the comment here; line 19 contains the UUID, and columns 1, 2, and 3 are ‘GPU’, ‘UUID’, and ‘:’ respectively. The first 12 characters of column 4 should be enough to uniquely identify this GPU. If we run echo $GPU_ID, we can see the identifier looks something like GPU- followed by the first eight characters of the UUID.
Docker is launched and managed as a service through systemd. We can change its default behaviour by adding an override file for the docker service. This file should contain the following lines:
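A sketch of the override file, assuming the conventional systemd drop-in location /etc/systemd/system/docker.service.d/override.conf (the UUID prefix shown is illustrative; substitute the literal value of $GPU_ID from above, since systemd does not expand shell variables):

```ini
# /etc/systemd/system/docker.service.d/override.conf  (conventional drop-in path)
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --default-runtime=nvidia --node-generic-resource gpu=GPU-c143e771
```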
Note: the second line is essential, because it clears any previously set ExecStart commands. You’ll get an error if this is missing.
What is the third line doing? Three things: it’s saying that when we start the Docker daemon we want the default runtime to be the nvidia runtime provided by nvidia-docker2 (instead of Docker’s standard runtime), and that this node provides a generic resource of type gpu. (The name gpu could be anything, but it should be the same across all the nodes in our cluster so that the orchestrator sees which nodes offer the same resource type.) Finally, it’s saying that on this specific node, the generic gpu resource has the identifier we previously stored in $GPU_ID.
Next, we modify the file /etc/nvidia-container-runtime/config.toml to allow the GPU to be advertised as a swarm resource. Uncomment or add the following line to this file:
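The relevant setting is swarm-resource, which tells the runtime which Docker generic-resource environment variable to look for:

```toml
# in /etc/nvidia-container-runtime/config.toml
swarm-resource = "DOCKER_RESOURCE_GPU"
```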
After taking these three steps, we need to reload the Docker daemon (to pick up the new configuration override file), and start it:
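With systemd, that looks like:

```shell
# Pick up the new override file...
sudo systemctl daemon-reload
# ...then start the Docker daemon (use restart if it is already running)
sudo systemctl start docker
```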
Scripting these steps
It’s a bit tedious to manually take these steps on every node in our cluster. They can be scripted as follows:
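A sketch of a per-node setup script combining the steps above (the override path and the sed pattern for uncommenting swarm-resource are assumptions based on the defaults discussed earlier):

```shell
#!/bin/bash
set -e

# 1. Find this node's GPU identifier
GPU_ID=$(nvidia-smi -a | grep UUID | awk '{print substr($4,1,12)}')

# 2. Write a systemd override so dockerd defaults to the nvidia runtime
#    and advertises the GPU as a generic resource
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<EOF | sudo tee /etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --default-runtime=nvidia --node-generic-resource gpu=${GPU_ID}
EOF

# 3. Allow the GPU to be advertised as a swarm resource
sudo sed -i 's/^#\?swarm-resource/swarm-resource/' /etc/nvidia-container-runtime/config.toml

# 4. Reload systemd and start Docker with the new configuration
sudo systemctl daemon-reload
sudo systemctl start docker
```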
Adding a service constraint
Now our cluster nodes are advertising to the swarm that they offer access to a GPU. The final step is to ensure that the service requests a GPU. We do this by adding --generic-resource "gpu=1" to the docker service create command. The full command looks something like this:
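For example, with an illustrative service name and Spark worker image:

```shell
# spark-worker and my-spark-worker-image are illustrative names;
# each replica will be placed on a node that advertises a gpu resource
docker service create \
  --replicas 3 \
  --name spark-worker \
  --generic-resource "gpu=1" \
  my-spark-worker-image
```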
The name of the generic resource being requested (gpu here) should match the name of the resource being advertised by the nodes.
Congratulations! The Docker swarm orchestrator will now distribute your Spark workers onto nodes with GPU capability.