Recently, I had a chance to work on an AI project in which most of the AI components are written in Python and use Conda as the package manager.

These components had to be built as Docker images to ensure portability, so I researched how to use Conda with Docker and succeeded after many tries.

In my project, I built Docker images for various AI components. Briefly, the components do the following work:

  • Get an image (in binary form) from a queue service (NATS Streaming)
  • Convert the image and detect objects in it (using TensorFlow Serving)
  • Process the output and store the data in a database (PostgreSQL)

The system can run on CPU or, where available, on GPU.

In this article, I will show you how to use Docker with Conda, demonstrated by a simple Python script that uses OpenCV. It’s a step-by-step guide that walks you through creating a Dockerfile for a Conda project.

Docker and Conda Overview

Nowadays, Docker is a familiar name to developers. It is becoming more and more popular as a containerization platform used to develop, ship, and deploy containers. Many cloud platforms, such as Azure, AWS, and GCP, support Docker. We may discuss Docker in the cloud in another article.

Docker is an open platform that allows us to quickly and easily build, deploy, and run applications using containers. Using Docker brings benefits in portability, agility, isolation, and scalability. If you need to review Docker, please take a look at the following link:

Conda, a platform-independent package manager, is particularly well known in data analysis and scientific computing. If you have just started as a Python developer and are having trouble setting up environments while working on various projects, Conda would be a nice choice for you.

Conda is an open-source package and environment manager that helps us create and manage separate environments for applications. Because Conda provides “virtual environment” capabilities, we can avoid library version conflicts (which are typical in Python projects). If you need to review Conda, please take a look at the following link:

Using Conda with Docker (sample project)

Step 1: Preparation

For this sample project, we prepare the following files:

A simple Python script “run.py” that converts an input image to grayscale and displays the result. We will use OpenCV in this script. We will also have an image “sample.jpg” for testing the script.

A Conda environment file “environment.yml” that is used to create the Conda environment with its dependencies.

A Docker file “Dockerfile” that is used to build a Docker image for the Conda project.
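To make the first of these files concrete, here is a minimal sketch of what “run.py” could look like. It is my illustration rather than the exact script from the project: the function names `load_grayscale` and `show` are hypothetical, and only standard OpenCV calls (`imread`, `cvtColor`, `imshow`, `waitKey`, `destroyAllWindows`) are used.

```python
def load_grayscale(path):
    """Read the image at `path` and return it converted to grayscale."""
    # Imported inside the function so that importing this module
    # does not itself require OpenCV to be installed.
    import cv2

    image = cv2.imread(path)  # imread returns None when the file cannot be read
    if image is None:
        raise FileNotFoundError(f"could not read image: {path}")
    # OpenCV loads colour images in BGR channel order.
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)


def show(image, title="grayscale"):
    """Display `image` in a window and wait for a key press."""
    import cv2

    cv2.imshow(title, image)
    cv2.waitKey(0)  # block until any key is pressed
    cv2.destroyAllWindows()
```

With these two helpers, the body of the script reduces to `show(load_grayscale("sample.jpg"))`.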

Let’s take a closer look at the Docker file:

  • In the first 3 lines, we declare the base image and the working directory, and copy the content of the target folder to the image’s working directory (as in other Docker files)
  • Next, we create the Conda environment from the “environment.yml” file (just as we do on the local machine)
  • Next, we activate the Conda environment by using the RUN instruction. RUN executes the activate command on top of the current image and commits the result, so that the following commands can be executed in the target (activated) Conda environment.
  • Finally, we execute the script using the CMD instruction
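The four steps above can be sketched as follows. This is a sketch under my own assumptions, not the project’s actual files: the environment name `myenv`, the `conda-forge` channel, the Python version, and the helper `write_project_files` are all hypothetical, and the base image is assumed to be `continuumio/miniconda3` (which installs Conda under /opt/conda). For the activation step I swap in a simpler stand-in, putting the environment’s bin directory first on PATH with ENV, instead of an activate command in RUN. The two files are kept as Python strings so they can be written out next to run.py.

```python
from pathlib import Path

ENV_NAME = "myenv"  # hypothetical name; must match `name:` in environment.yml

ENVIRONMENT_YML = f"""\
name: {ENV_NAME}
channels:
  - conda-forge
dependencies:
  - python=3.9
  - opencv
"""

DOCKERFILE = f"""\
# Base image, working directory, and project files
FROM continuumio/miniconda3
WORKDIR /app
COPY . .
# Create the Conda environment described in environment.yml
RUN conda env create -f environment.yml
# Stand-in for activation: put the environment's bin directory first on PATH
ENV PATH=/opt/conda/envs/{ENV_NAME}/bin:$PATH
# Run the script with the environment's Python
CMD ["python", "run.py"]
"""


def write_project_files(directory="."):
    """Write environment.yml and Dockerfile into `directory` and return their paths."""
    target = Path(directory)
    env_path = target / "environment.yml"
    docker_path = target / "Dockerfile"
    env_path.write_text(ENVIRONMENT_YML)
    docker_path.write_text(DOCKERFILE)
    return env_path, docker_path
```

The point to notice is the coupling: the environment name in “environment.yml” and the path used in the Dockerfile have to agree, which is why both are generated from the same `ENV_NAME`.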

Step 2: Execution

Build the Docker image using the command:

This may take a few minutes, depending on the dependencies we configured.
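If you prefer to drive the build from Python, a minimal sketch looks like this. The helper names and the image tag `conda-opencv-demo` are hypothetical; the only Docker option used is `-t`, which tags the resulting image.

```python
import subprocess


def docker_build_command(tag, context="."):
    """Return the argument list for `docker build -t <tag> <context>`."""
    return ["docker", "build", "-t", tag, context]


def build_image(tag, context="."):
    """Run the build; raises CalledProcessError if Docker exits with an error."""
    subprocess.run(docker_build_command(tag, context), check=True)
```

Calling `build_image("conda-opencv-demo")` from the project folder builds the image from the Dockerfile in that folder.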

Run the Docker image using the command:

Since we need to display an image from inside the Docker container, we need to configure the docker run command.

xhost + : grants access to the X server (needed to display the image)

Option : Description
--rm : automatically remove the container when it exits
--net : connect a container to a network; --net=host makes the container share the network stack of the local machine, so it can reach ports on the local machine
--ipc : IPC mode to use; --ipc=host makes the container share the IPC namespace of the local machine, including shared memory
-e : set environment variables; in this command, we set the DISPLAY variable of the container to the DISPLAY value of the local machine

In short, the command allows the container to access the X server on the local machine to display the result. If you are interested in the configuration options for the “run” command, please read the article: https://docs.docker.com/engine/reference/commandline/run/
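The options described above can be assembled as in the following sketch. The helper names and the image name `conda-opencv-demo` are hypothetical, and falling back to `:0` when DISPLAY is unset is my assumption.

```python
import os
import subprocess


def docker_run_command(image, display=None):
    """Return the argument list for a `docker run` that can reach the local X server."""
    if display is None:
        # Reuse the DISPLAY value of the local machine (":0" is an assumed fallback).
        display = os.environ.get("DISPLAY", ":0")
    return [
        "docker", "run",
        "--rm",                     # remove the container when it exits
        "--net=host",               # share the local machine's network stack
        "--ipc=host",               # share the local machine's IPC namespace
        "-e", f"DISPLAY={display}", # pass DISPLAY into the container
        image,
    ]


def run_container(image):
    """Start the container; raises CalledProcessError if Docker exits with an error."""
    subprocess.run(docker_run_command(image), check=True)
```

Remember that `xhost +` still has to be run on the local machine first; the helper only builds and runs the `docker run` part.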

We can view the grayscale output:

Conda with Docker in an AI project: Problems and Fixes

With the previous section, you should have enough material to build your own Dockerfile and enable Conda in a Docker image. The following points come from my own experience, and I hope they are useful to you:

Building the Docker image takes a long time

Creating a fresh Conda environment requires installing many libraries, especially when heavy libraries are involved, so rebuilding the Docker image over and over during development can cost a lot of time. To prevent this, we can separate the Dockerfile into 2 parts.

  • In the first part, we build from the base image, copy the environment.yml file, and create the Conda environment only. We call the result the “base Conda image”
  • In the second part, we build from the base Conda image and do the rest of the work (copy the source code, run the target file…)

For example, we can separate the Dockerfile from the previous section into 2 parts as below:

Part A to build base Conda image:

Part B to build application image:

After that, each time you need to update the source code, you only have to rebuild your image using the second Docker file. You do not have to go through the whole process of creating a new Conda environment.
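As an illustration of the split, here is a sketch of the two parts. It rests on my own assumptions rather than the project’s actual files: the file names `Dockerfile.base` and `Dockerfile.app`, the tag `conda-base`, the environment name `myenv`, and the `continuumio/miniconda3` base image are hypothetical, and activation is again replaced by putting the environment’s bin directory on PATH.

```python
from pathlib import Path

BASE_TAG = "conda-base"  # hypothetical tag for the base Conda image
ENV_NAME = "myenv"       # hypothetical; must match `name:` in environment.yml

# Part A: only the environment. Rebuilt when environment.yml changes.
PART_A = """\
FROM continuumio/miniconda3
WORKDIR /app
COPY environment.yml .
RUN conda env create -f environment.yml
"""

# Part B: only the application. Rebuilt whenever the source code changes.
PART_B = f"""\
FROM {BASE_TAG}
WORKDIR /app
COPY . .
ENV PATH=/opt/conda/envs/{ENV_NAME}/bin:$PATH
CMD ["python", "run.py"]
"""


def write_dockerfiles(directory="."):
    """Write both parts into `directory` and return their paths (base, app)."""
    target = Path(directory)
    base_path = target / "Dockerfile.base"
    app_path = target / "Dockerfile.app"
    base_path.write_text(PART_A)
    app_path.write_text(PART_B)
    return base_path, app_path
```

With these files, the base image would be built once with `docker build -f Dockerfile.base -t conda-base .`, and the application image after each code change with `docker build -f Dockerfile.app -t conda-opencv-demo .` (the second tag is also hypothetical).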

Logging

Using the following command line to run the Python file in the target Conda environment works, too, but no log is output to the screen.

Instead, activating Conda using RUN and executing the Python script afterwards (as in the sample file) produces logs when the Docker container runs.
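Whichever way the script is started, it can help to make the script itself write its log lines straight to standard output. The sketch below assumes that output buffering can contribute to missing logs in a container, which is my assumption and not something I verified in this project; the helper name `make_logger` is hypothetical.

```python
import logging
import sys


def make_logger(name="run", stream=None):
    """Return a logger that writes each record to `stream` (stdout by default)."""
    if stream is None:
        stream = sys.stdout
    handler = logging.StreamHandler(stream)  # flushes the stream after every record
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate lines
    logger.propagate = False     # keep records from also going to the root logger
    return logger
```

In run.py this would be used as `log = make_logger()` followed by calls such as `log.info("image converted")`.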

That’s it!

I hope this article has given you what you need to use Conda with Docker. By using Docker, your development and deployment work can become faster, more efficient, and more portable. And if you are currently working on an AI project, I hope the above content is as helpful for your adventure as it was for mine.

Tran Gia Quoc Hung – Solution & Technology Unit, FPT Software
