At Amplify, Python is one of the languages we use to build AI-driven data pipelines. Python is the natural choice for data processing and for interfacing with LLMs through frameworks like LangChain. However, the tooling around Python for things like dependency management and deployment frequently leaves developers frustrated. If Python is the natural choice for large parts of our data pipelines, how can we make working with it, well, better? Let’s dive into it.
We’ve recently set out to redesign and improve our main data pipeline and AI agent architecture, which gave us a chance to implement a more modern, standard Python stack. A typical Python application usually consists of a requirements file to use with pip and the large, official Python Docker image (a whopping 1.02GB!). We’ve found that this setup often causes developer and deployment headaches, though an exhaustive list of those issues is outside the scope of this blog. Before we discuss how we attempted to improve upon the usual Python stack, we should at least lay out the goals we had at Amplify for our latest projects:
- Dependency resolution should be reliable and reproducible.
- Production Docker images should be as small as possible.
- A debugger should be available during local development.
- The same Dockerfile should be utilized in local development and production.

The first goal of our new Python stack was to find a solution to avoid the dependency resolution issues that can arise during builds with pip. Poetry, a packaging tool for Python, is a project that has been on my radar for some time and has been around for even longer. We finally got a chance to integrate Poetry into our Python stack at Amplify, and we were not disappointed! Poetry provides a better way to manage Python dependencies, with lock files for synchronized dependencies and optional dependency groups. As you’ll see later, optional dependency groups help keep Docker image sizes down for production while allowing us to enable the Python debugger during local development.
The easiest way to install Poetry is with pipx, so go ahead and do that. Once Poetry is installed, we can initialize our example project by running poetry init:
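A typical poetry init session looks roughly like the following; the prompts are abbreviated here, and the exact output may differ depending on your Poetry version and answers:

$ pipx install poetry
$ poetry init

This command will guide you through creating your pyproject.toml config.

Package name [demo]:
Version [0.1.0]:
Description []:
Author [Michael Fox <mfox@amplify.security>, n to skip]:
License []:
Compatible Python versions [^3.12]:

Would you like to define your main dependencies interactively? (yes/no) [yes] no
Would you like to define your development dependencies interactively? (yes/no) [yes] no
Do you confirm generation? (yes/no) [yes] yes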
Let's add uvicorn and fastapi to our demo project. You can do this by running poetry add uvicorn and poetry add fastapi.
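For each package, Poetry resolves a compatible version, records the constraint in pyproject.toml, and pins the exact version in poetry.lock. The output should look something like this (abbreviated; resolved versions will vary):

$ poetry add uvicorn
Using version ^0.29.0 for uvicorn

Updating dependencies
Resolving dependencies...

$ poetry add fastapi
Using version ^0.111.0 for fastapi

Updating dependencies
Resolving dependencies...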
Lastly, we set package-mode = false in Poetry's config, as we will only be using it for dependency management, and this will result in a smaller overall image size later. After running the commands above, your pyproject.toml should be similar to the following (with a corresponding poetry.lock file generated alongside it):
# pyproject.toml
[tool.poetry]
name = "demo"
version = "0.1.0"
description = ""
authors = ["Michael Fox <mfox@amplify.security>"]
readme = "README.md"
package-mode = false

[tool.poetry.dependencies]
python = "^3.12"
uvicorn = "^0.29.0"
fastapi = "^0.111.0"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
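At this point you can sanity-check the environment, an optional step, by letting Poetry build the virtual environment and importing one of the new dependencies:

$ poetry install
$ poetry run python -c "import fastapi; print(fastapi.__version__)"
0.111.0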
At Amplify, we use FastAPI ASGI apps with Uvicorn to build event workers utilizing our open source messaging adapter, Carrier. For the purposes of this demo, we’re going to expose a simple ping endpoint to verify our Python environment and, later, our Docker image. Create a file, server.py, with a FastAPI endpoint like the one below. We will also add functionality to run uvicorn programmatically, which will be necessary later.
# server.py
import os

import uvicorn
from fastapi import FastAPI

UVICORN_RELOAD = os.getenv("UVICORN_RELOAD", "False").lower() in ("true", "1")
UVICORN_HOST = os.getenv("UVICORN_HOST", "0.0.0.0")
UVICORN_PORT = int(os.getenv("UVICORN_PORT", "8000"))

app = FastAPI()

@app.get("/ping")
def ping():
    return {"ping": "pong"}

if __name__ == "__main__":
    uvicorn.run("server:app", host=UVICORN_HOST, port=UVICORN_PORT, reload=UVICORN_RELOAD)
Running server.py with python server.py should start Uvicorn, load the FastAPI ASGI app, and expose a /ping endpoint on port 8000. One thing to note here is that we specify 0.0.0.0 as the default for UVICORN_HOST because we will be running in a Docker container for local testing later, but you can configure this however works best in your environment. For example, if we were deploying this application as a Kubernetes pod and only pod-local networking was necessary, we could configure UVICORN_HOST as 127.0.0.1 to reduce exposure. The UVICORN_RELOAD environment variable, used here to enable dynamic reloads on code updates, will be passed to uvicorn later in our Docker Compose stack.
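As a quick check, start the server and hit the endpoint from another terminal. The log lines below are representative Uvicorn output and may differ slightly in your environment:

$ python server.py
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

$ curl http://localhost:8000/ping
{"ping":"pong"}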
A debugger should be a non-negotiable feature of any functional development environment. When it comes to debuggers, don’t take my word for it: John Carmack himself is one of the most outspoken proponents of debuggers. The debugpy package offers excellent Python debugger support for VS Code (and hopefully PyCharm soon), so we will install it in our project within a Poetry dependency group using poetry add debugpy --group debug:
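The output should look something like the following (abbreviated; your resolved version may differ):

$ poetry add debugpy --group debug
Using version ^1.8.1 for debugpy

Updating dependencies
Resolving dependencies...

Package operations: 1 install, 0 updates, 0 removals

  - Installing debugpy (1.8.1)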
Once we’ve added debugpy, we want to make the debug dependency group optional so that it will not be installed in our production Docker image. Create a [tool.poetry.group.debug] table in pyproject.toml and include optional = true. Once done, your pyproject.toml should look similar to the following:
# pyproject.toml
[tool.poetry]
name = "demo"
version = "0.1.0"
description = ""
authors = ["Michael Fox <mfox@amplify.security>"]
readme = "README.md"
package-mode = false

[tool.poetry.dependencies]
python = "^3.12"
uvicorn = "^0.29.0"
fastapi = "^0.111.0"

[tool.poetry.group.debug]
optional = true

[tool.poetry.group.debug.dependencies]
debugpy = "^1.8.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
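With the group marked optional, a plain poetry install now skips debugpy entirely; you opt in with the --with flag. For example:

# production-style install: no debugger
$ poetry install

# local development: include the optional debug group
$ poetry install --with debug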
As mentioned previously, one of our goals was to use the same Dockerfile for both production and local development. However, we also want to be able to run the Python debugger when developing locally, and given that debugpy is over 20MB, that conflicts with our goal of keeping production images small! We will use Docker build arguments in conjunction with Poetry's optional dependency groups to satisfy both of these goals with the same Dockerfile. Let’s take a look:
# Dockerfile
FROM python:3.12-alpine AS builder

ARG INSTALL_DEBUGPY

# Set environment variables
ENV POETRY_NO_INTERACTION=1 \
    POETRY_VIRTUALENVS_IN_PROJECT=1 \
    POETRY_VIRTUALENVS_CREATE=1 \
    POETRY_CACHE_DIR=/tmp/.poetry

# Install poetry
RUN pip install poetry==1.8.3

# Add demo user
RUN adduser -D demo && \
    mkdir -p /home/demo/app && \
    chown demo:demo /home/demo/app

WORKDIR /home/demo/app
USER demo

COPY pyproject.toml poetry.lock ./

# Install dependencies, pulling in the optional debug group only when the
# INSTALL_DEBUGPY build argument is set (POSIX [ ] test, since /bin/sh in
# the alpine image is not bash)
RUN if [ -z "${INSTALL_DEBUGPY}" ]; then \
        poetry install --no-root; \
    else \
        poetry install --no-root --with debug; \
    fi

FROM python:3.12-alpine AS runtime

# Expose fastapi port
EXPOSE 8000

# Add demo user
RUN adduser -D demo && \
    mkdir -p /home/demo/app && \
    chown demo:demo /home/demo/app

WORKDIR /home/demo/app
USER demo

# Set environment variables
ENV VIRTUAL_ENV=.venv \
    PATH=/home/demo/app/.venv/bin:$PATH

# Copy virtual environment
COPY --from=builder /home/demo/app/${VIRTUAL_ENV} ${VIRTUAL_ENV}

# Copy server.py
COPY server.py server.py

# Set entrypoint
ENTRYPOINT ["python", "server.py"]
In this Dockerfile, we use the builder pattern to install Poetry and all dependencies into a virtual environment. We install the optional dependency group debug only if the INSTALL_DEBUGPY Docker build argument is set. We then copy only the virtual environment from the build image into our runtime image to ensure the smallest possible final image. Finally, we set the Docker image to run our server.py ASGI app. As an added bonus, we configure the Docker image to drop privileges and run as a non-root user, demo. Without debugpy installed, the final image size is only 111MB, which is over 9X smaller than the full Python Docker image and still smaller than the base slim Docker image!
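If you want to verify the size difference yourself, build both variants and compare (demo and demo-debug are arbitrary tag names):

# production image, no debugger
$ docker build -t demo .

# local development image with debugpy baked in
$ docker build -t demo-debug --build-arg INSTALL_DEBUGPY=1 .

$ docker images | grep demo

The demo image should come in around the 111MB mentioned above, with demo-debug roughly 20MB larger.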
alpine vs slim

For the purposes of this blog (and our actual pipeline at Amplify) we use the alpine Python Docker image tag as our base image. The alpine Python image is only 57.1MB compared to the full-fat 1.02GB latest image and the still slimmed-down 130MB slim image. However, Alpine Linux is built against the musl C library (as opposed to glibc, the default in most other Linux distributions), which does not always play well with certain Python packages. For greenfield development, it doesn’t hurt to use alpine as a base until an irreplaceable dependency just won’t play nice. However, if you are migrating a legacy Python application with existing dependencies, you may run into unintended consequences, and slim might be a safer bet.
The final task in getting this Python stack ready for active development is running it locally with Docker Compose. Two things that will be useful for local testing are Uvicorn's hot reloading and exposing the debugger. For hot reloading, we will set the UVICORN_RELOAD environment variable, which server.py will pass to uvicorn. For exposing the debugger, we will set the Docker Compose build context to include the INSTALL_DEBUGPY build argument, expose port 5678 for the debugger, and run server.py with the debugpy module. This is what the docker-compose.yml file will look like:
# docker-compose.yml
---
version: "3.3"
services:
  demo:
    build:
      context: .
      args:
        INSTALL_DEBUGPY: "True"
    ports:
      - "8000:8000"
      - "5678:5678"
    volumes:
      - ./server.py:/home/demo/app/server.py # this will allow hot reload when the file changes
    environment:
      UVICORN_RELOAD: "True"
    entrypoint:
      - "python"
      - "-m"
      - "debugpy"
      - "--listen"
      - "0.0.0.0:5678"
      - "server.py"
Bring up the Docker Compose stack with docker-compose up. This command will build the image with debugging enabled and run the Docker Compose stack. You can then access the FastAPI endpoint at http://localhost:8000/ping and fiddle with changing the response in server.py without restarting the Docker Compose stack. Uvicorn will automatically reload the ASGI application and you can continue to develop without restarts.
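For example, with the stack running, you can edit the handler's response, save, and immediately see the change. This is a hypothetical session assuming you change ping() to return {"ping": "hello"}:

$ curl http://localhost:8000/ping
{"ping":"pong"}

# edit ping() in server.py, save, then:
$ curl http://localhost:8000/ping
{"ping":"hello"}

In the docker-compose logs you should see Uvicorn notice the file change and restart its worker process.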
To attach the debugger to our container running with Docker Compose, create a VS Code launch configuration by creating a .vscode directory in the root of your project and adding a launch.json file to it. The file should contain a configuration for a remote attach Python debugger:
// .vscode/launch.json
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Remote Attach",
            "type": "debugpy",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "."
                }
            ]
        }
    ]
}
That’s all there is to it! You can now add breakpoints to your code within VS Code and attach to the running container using the Run and Debug menu.
If you made it this far, we created a Python stack which:

- manages dependencies reproducibly with Poetry lock files and optional dependency groups.
- uses the same Dockerfile for local development and production.
- produces a production image of only 111MB with dependencies.
- supports hot reloading and remote debugging in local development with debugpy.
This is the opinionated way we build modern Python applications, including our AI-driven data pipelines, at Amplify. Hopefully you found something interesting or useful to take away and incorporate into your own projects! As usual, feel free to drop by and follow us on LinkedIn or GitHub to keep in touch and hear about the latest developments at Amplify.