Dockerfile generation

This advanced guide describes the Dockerfile generation process. BentoML uses Jinja2 templates, which give users more control over the Dockerfile generated for a Bento.

Warning

This section is not meant to be a complete reference on Jinja2. It gives a quick overview of how Jinja2 is used in conjunction with BentoML. For a full reference on Jinja2, please refer to its Template Designer Documentation.

Tip

A major advantage of this feature is that it allows developers and libraries, such as 🚀 bentoctl, to extend a 🍱.

The Dockerfile generation process is as follows:
  • Retrieve the options provided in the docker section of bentofile.yaml.

  • Determine the DistroSpec for the corresponding distro. This includes the supported Python versions, CUDA versions, and architecture types.

  • From this spec, together with the options from BentoBuildConfig, generate a Dockerfile from a set of Jinja templates.
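The second step can be pictured as a validation of the user's options against a per-distro spec. The following is a minimal sketch, not BentoML's implementation: the `DistroSpec` class here is a hypothetical stand-in, the option keys (`distro`, `python_version`, `cuda_version`) are assumed to mirror the docker section, and the version and architecture values are placeholders rather than BentoML's real support matrix.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DistroSpec:
    """Hypothetical stand-in for the spec resolved for one distro."""

    name: str
    python_versions: Tuple[str, ...]
    cuda_versions: Tuple[str, ...]
    architectures: Tuple[str, ...]


# Placeholder values for illustration only.
DEBIAN = DistroSpec(
    name="debian",
    python_versions=("3.8", "3.9"),
    cuda_versions=("11.6",),
    architectures=("amd64", "arm64"),
)


def check_options(
    spec: DistroSpec, python_version: str, cuda_version: Optional[str] = None
) -> None:
    """Raise ValueError if the requested versions are not in the spec."""
    if python_version not in spec.python_versions:
        raise ValueError(f"{spec.name} does not support Python {python_version}")
    if cuda_version is not None and cuda_version not in spec.cuda_versions:
        raise ValueError(f"{spec.name} does not support CUDA {cuda_version}")


# Options as they might be read from the docker section (assumed keys).
options = {"distro": "debian", "python_version": "3.9", "cuda_version": None}
check_options(DEBIAN, options["python_version"], options["cuda_version"])
```

Only after such a check would the templates be rendered with the validated options.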

One of the most powerful features of Jinja is template inheritance. BentoML relies on it to let users fully customize the structure of a Bento’s Dockerfile.

Note

To use a custom Dockerfile template, users have to provide a file that follows the Jinja2 template syntax. The template file should have one of the extensions .j2, .template, or .jinja.
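A quick pre-flight check on the file name can catch a wrong extension before building. This is a minimal sketch: the helper name `has_template_extension` is hypothetical, and whether BentoML itself compares extensions case-sensitively is not stated here, so the check below is an exact match.

```python
from pathlib import PurePath

# Extensions listed in the note above.
TEMPLATE_EXTENSIONS = {".j2", ".template", ".jinja"}


def has_template_extension(filename: str) -> bool:
    """Return True if the file name ends with one of the listed extensions."""
    return PurePath(filename).suffix in TEMPLATE_EXTENSIONS


for name in ("Dockerfile.template", "custom.j2", "Dockerfile"):
    print(name, has_template_extension(name))
```

A plain `Dockerfile` has no suffix at all, so it is rejected by this check.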

Tip

Warning: If you pass in a generic Dockerfile and then run bentoml build, the Bento builds without any errors. However, running bentoml containerize on that Bento will not work.

This is expected behaviour from Jinja2, which accepts any file as a template.

We decided not to validate the template file, because we want users to be able to customize it to their own needs.

So make sure you know what you are doing if you try something like the above.

In addition to Jinja, BentoML also uses Docker Buildx, which enables users to build Bentos that support multiple architectures. Under the hood, the generated Dockerfile leverages BuildKit, which enables some advanced Dockerfile features.

Note

Currently, the only BuildKit frontend that we support is docker/dockerfile:1.4-labs. Please contact us if you need support for a different build frontend.

Writing custom Dockerfile template

To construct a custom Dockerfile template, users have to provide an extends block at the beginning of the Dockerfile template Dockerfile.template, followed by the given base template name:

{% extends "python_debian.j2" %}

The following templates per distro are available for extending:

alpine

base_alpine.j2:

{% extends "base.j2" %}
{% block SETUP_BENTO_BASE_IMAGE %}
{{ super() }}

# Install helpers
RUN --mount=type=cache,from=cached,target=/var/cache/apk \
    apk add --update bash gcc libc-dev shadow musl-dev build-base \
		linux-headers g++

ENV ENV /root/.bashrc

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

{% if __options__system_packages is not none %}
# Install user-defined system package
RUN --mount=type=cache,from=cached,target=/var/cache/apk \
    apk add --update {{ __options__system_packages | join(' ') }}
{% endif -%}
{% endblock %}

miniconda_alpine.j2:

{% extends "base_alpine.j2" %}
{% set __environment_yml__=expands_bento_path("env", "conda", "environment.yml", bento_path=bento__path) %}
{% block SETUP_BENTO_ENVARS %}
{{ super() }}

RUN --mount=type=cache,mode=0777,target=/opt/conda/pkgs bash <<EOF
set -euxo pipefail

SAVED_PYTHON_VERSION={{ __python_version_full__ }}
PYTHON_VERSION=${SAVED_PYTHON_VERSION%.*}

echo "Installing Python $PYTHON_VERSION with conda..."
conda install -y -n base pkgs/main::python=$PYTHON_VERSION pip

if [ -f {{ __environment_yml__ }} ]; then
  # set pip_interop_enabled to improve conda-pip interoperability. Conda can use
  # pip-installed packages to satisfy dependencies.
  echo "Updating conda base environment with environment.yml"
  conda config --set pip_interop_enabled True
  conda env update -n base -f {{ __environment_yml__ }}
  conda clean --all
fi
EOF
{% endblock %}

python_alpine.j2:

{% extends "base_alpine.j2" %}
{% block SETUP_BENTO_ENTRYPOINT %}
{{ super() }}

RUN rm -rf /var/cache/apk/*
{% endblock %}

debian

base_debian.j2:

{% extends "base.j2" %}
{% block SETUP_BENTO_BASE_IMAGE %}
{{ super() }}
USER root

ENV DEBIAN_FRONTEND=noninteractive
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt \
    --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt \
	apt-get update -y \
    && apt-get install -q -y --no-install-recommends --allow-remove-essential \
		ca-certificates gnupg2 bash build-essential \
    && apt-get clean

{% if __options__system_packages is not none %}
# Install user-defined system package
RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt \
    --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt \
    apt-get install -q -y --no-install-recommends --allow-remove-essential \
    {{ __options__system_packages | join(' ') }} \
    && apt-get clean

{% endif -%}
RUN rm -rf /var/lib/{apt,cache,log}

{% endblock %}

miniconda_debian.j2:

{% extends "base_debian.j2" %}
{% set __environment_yml__=expands_bento_path("env", "conda", "environment.yml", bento_path=bento__path) %}
{% block SETUP_BENTO_ENVARS %}
{{ super() }}

RUN --mount=type=cache,mode=0777,target=/opt/conda/pkgs bash <<EOF
set -euxo pipefail

SAVED_PYTHON_VERSION={{ __python_version_full__ }}
PYTHON_VERSION=${SAVED_PYTHON_VERSION%.*}

echo "Installing Python $PYTHON_VERSION with conda..."
conda install -y -n base pkgs/main::python=$PYTHON_VERSION pip

if [ -f {{ __environment_yml__ }} ]; then
  # set pip_interop_enabled to improve conda-pip interoperability. Conda can use
  # pip-installed packages to satisfy dependencies.
  echo "Updating conda base environment with environment.yml"
  conda config --set pip_interop_enabled True
  conda env update -n base -f {{ __environment_yml__ }}
  conda clean --all
fi
EOF
{% endblock %}

cuda_debian.j2:

{% extends "base_debian.j2" %}
{% block SETUP_BENTO_BASE_IMAGE %}
{{ super() }}

RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt \
    --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt bash <<EOF
set -euxo pipefail

apt-get update -y
apt-get install -y --no-install-recommends --allow-remove-essential \
		software-properties-common curl

# add deadsnakes ppa to install python
add-apt-repository ppa:deadsnakes/ppa
apt-get update -y

apt-get install -y --no-install-recommends --allow-remove-essential \
		{% if "ubuntu22.04" in __base_image__ and __options__python_version == "3.10" %}python{{ __options__python_version }}-jammy{% else %}python{{ __options__python_version }}{% endif %} \
		python{{ __options__python_version }}-dev \
		python{{ __options__python_version }}-distutils

apt-get clean
EOF

RUN ln -sf /usr/bin/python{{ __options__python_version }} /usr/bin/python3 && \
	ln -sf /usr/bin/pip{{ __options__python_version }} /usr/bin/pip3

RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm -rf get-pip.py

{% endblock %}
{% block SETUP_BENTO_ENVARS %}

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

{{ super() }}
{% endblock %}

Note

For cuda_debian.j2, we use NVIDIA’s nvidia/cuda image with Ubuntu variants, because NVIDIA does not maintain a Debian image. Ubuntu is a good substitute for Debian, as Ubuntu is Debian-based.

python_debian.j2:

{% extends "base_debian.j2" %}
{% block SETUP_BENTO_ENVARS %}

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

{{ super() }}
{% endblock %}

Adding conda to CUDA-based template

Tip

Warning: The miniconda install scripts provided by ContinuumIO (now Anaconda, Inc.) support Python 3.7 to 3.9. Make sure that you set a supported Python version under docker.python_version.

If you need to use conda for CUDA images, use the following template (partially extracted from ContinuumIO/docker-images):

{% extends "base_debian.j2" %}
{# Make sure to change the correct python_version and conda version accordingly. #}
{# example: py38_4.10.3 #}
{# refers to https://repo.anaconda.com/miniconda/ for miniconda3 base #}
{% set conda_version="py39_4.11.0" %}
{% set conda_path="/opt/conda" %}
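{% set conda_exec= [conda_path, "bin", "conda"] | join("/") %}
{# __environment_yml__ is used below; define it the same way miniconda_debian.j2 does. #}
{% set __environment_yml__=expands_bento_path("env", "conda", "environment.yml", bento_path=bento__path) %}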
{% block SETUP_BENTO_BASE_IMAGE %}
FROM debian:bullseye-slim as conda-build

RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt \
    --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt \
    apt-get update -y && \
    apt-get install -y --no-install-recommends --allow-remove-essential \
        software-properties-common \
        bzip2 \
        ca-certificates \
        git \
        libglib2.0-0 \
        libsm6 \
        libxext6 \
        libxrender1 \
        mercurial \
        openssh-client \
        procps \
        subversion \
        wget && \
    apt-get clean

ENV PATH {{ conda_path }}/bin:$PATH

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

ARG CONDA_VERSION={{ conda_version }}

RUN bash <<EOF
set -ex

UNAME_M=$(uname -m)

if [ "${UNAME_M}" = "x86_64" ]; then
    MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh";
    SHA256SUM="4ee9c3aa53329cd7a63b49877c0babb49b19b7e5af29807b793a76bdb1d362b4";
elif [ "${UNAME_M}" = "s390x" ]; then
    MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-s390x.sh";
    SHA256SUM="e5e5e89cdcef9332fe632cd25d318cf71f681eef029a24495c713b18e66a8018";
elif [ "${UNAME_M}" = "aarch64" ]; then
    MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-aarch64.sh";
    SHA256SUM="00c7127a8a8d3f4b9c2ab3391c661239d5b9a88eafe895fd0f3f2a8d9c0f4556";
elif [ "${UNAME_M}" = "ppc64le" ]; then
    MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-ppc64le.sh";
    SHA256SUM="8ee1f8d17ef7c8cb08a85f7d858b1cb55866c06fcf7545b98c3b82e4d0277e66";
fi

wget "${MINICONDA_URL}" -O miniconda.sh -q && echo "${SHA256SUM} miniconda.sh" > shasum

if [ "${CONDA_VERSION}" != "latest" ]; then
    sha256sum --check --status shasum;
fi

mkdir -p /opt
sh miniconda.sh -b -p {{ conda_path }} && rm miniconda.sh shasum

find {{ conda_path }}/ -follow -type f -name '*.a' -delete
find {{ conda_path }}/ -follow -type f -name '*.js.map' -delete
{{ conda_exec }} clean -afy
EOF

{{ super() }}

ENV PATH {{ conda_path }}/bin:$PATH

COPY --from=conda-build {{ conda_path }} {{ conda_path }}

RUN bash <<EOF
ln -s {{ conda_path }}/etc/profile.d/conda.sh /etc/profile.d/conda.sh
echo ". {{ conda_path }}/etc/profile.d/conda.sh" >> ~/.bashrc
echo "{{ conda_exec }} activate base" >> ~/.bashrc
EOF

{% endblock %}
{% block SETUP_BENTO_ENVARS %}

SHELL [ "/bin/bash", "-eo", "pipefail", "-c" ]

{{ super() }}

RUN --mount=type=cache,mode=0777,target=/opt/conda/pkgs bash <<EOF
SAVED_PYTHON_VERSION={{ __python_version_full__ }}
PYTHON_VERSION=${SAVED_PYTHON_VERSION%.*}

echo "Installing Python $PYTHON_VERSION with conda..."
{{ conda_exec }} install -y -n base pkgs/main::python=$PYTHON_VERSION pip

if [ -f {{ __environment_yml__ }} ]; then
# set pip_interop_enabled to improve conda-pip interoperability. Conda can use
# pip-installed packages to satisfy dependencies.
echo "Updating conda base environment with environment.yml"
{{ conda_exec }} config --set pip_interop_enabled True || true
{{ conda_exec }} env update -n base -f {{ __environment_yml__ }}
{{ conda_exec }} clean --all
fi
EOF
{% endblock %}

About BentoML’s templates inheritance:

All internal templates are located here

The BentoML internal Dockerfile templates are named in the format <release_type>_<distro>.j2, where the release type is one of the following:

| Release type | Description |
|---|---|
| base | A base setup for all supported distros. |
| cuda | CUDA-supported templates. |
| miniconda | Conda-supported templates. |
| python | Python releases. |

Each base_<distro>.j2 template extends base.j2.

The template hierarchy is as follows:

.
└── base.j2
    ├── base_alpine.j2
    │   ├── miniconda_alpine.j2
    │   └── python_alpine.j2
    ├── base_amazonlinux.j2
    │   └── python_amazonlinux.j2
    ├── base_debian.j2
    │   ├── cuda_debian.j2
    │   ├── miniconda_debian.j2
    │   └── python_debian.j2
    └── base_ubi8.j2
        ├── cuda_ubi8.j2
        └── python_ubi8.j2
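
The naming format and the hierarchy above can be captured in a small lookup table. This is a minimal sketch for illustration, not part of BentoML: the parent mapping is transcribed from the tree, while the helper names `template_name` and `inheritance_chain` are hypothetical.

```python
# Parent of each internal template, transcribed from the hierarchy above.
TEMPLATE_PARENTS = {
    "base.j2": None,
    "base_alpine.j2": "base.j2",
    "miniconda_alpine.j2": "base_alpine.j2",
    "python_alpine.j2": "base_alpine.j2",
    "base_amazonlinux.j2": "base.j2",
    "python_amazonlinux.j2": "base_amazonlinux.j2",
    "base_debian.j2": "base.j2",
    "cuda_debian.j2": "base_debian.j2",
    "miniconda_debian.j2": "base_debian.j2",
    "python_debian.j2": "base_debian.j2",
    "base_ubi8.j2": "base.j2",
    "cuda_ubi8.j2": "base_ubi8.j2",
    "python_ubi8.j2": "base_ubi8.j2",
}


def template_name(release_type: str, distro: str) -> str:
    """Build a <release_type>_<distro>.j2 name and check that it is listed."""
    name = f"{release_type}_{distro}.j2"
    if name not in TEMPLATE_PARENTS:
        raise KeyError(f"no internal template named {name}")
    return name


def inheritance_chain(name: str) -> list:
    """Return the template followed by each of its ancestors, ending at base.j2."""
    chain = []
    current = name
    while current is not None:
        chain.append(current)
        current = TEMPLATE_PARENTS[current]
    return chain


print(inheritance_chain(template_name("cuda", "debian")))
# ['cuda_debian.j2', 'base_debian.j2', 'base.j2']
```

Not every combination exists: for example, the tree lists no miniconda template for ubi8, so `template_name("miniconda", "ubi8")` raises `KeyError` in this sketch.
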
A custom template can extend the bento_autotemplate object:

{% extends bento_autotemplate %}

Doing so ensures that the generated Dockerfile is compatible with a Bento.

About writing Dockerfile.template

The Dockerfile template is a mix of Jinja2 syntax and Dockerfile syntax. BentoML sets both trim_blocks and lstrip_blocks to True in the Jinja template environment.

Make sure that your Dockerfile instructions are unindented, as if you were writing a normal Dockerfile.

Refer to Jinja Whitespace Control for details.

The following example Dockerfile template takes advantage of a multi-stage build to isolate the installation of a local library, mypackage:

{% extends bento_autotemplate %}
{% block SETUP_BENTO_BASE_IMAGE %}
FROM --platform=$BUILDPLATFORM python:3.7-slim as buildstage
RUN mkdir /tmp/mypackage

WORKDIR /tmp/mypackage/
COPY mypackage .
RUN python setup.py sdist && mv dist/mypackage-0.0.1.tar.gz mypackage.tar.gz

{{ super() }}
{% endblock %}
{% block SETUP_BENTO_COMPONENTS %}
{{ super() }}
COPY --from=buildstage /tmp/mypackage/mypackage.tar.gz /tmp/wheels/
RUN --network=none pip install --find-links /tmp/wheels mypackage
{% endblock %}

Note

Notice that all Dockerfile instructions are written as if the Jinja logic were not there 🚀.

Blocks

BentoML defines a set of blocks under the bento_autotemplate object.

The exported blocks that users can extend are as follows:

| Blocks | Definition |
|---|---|
| SETUP_BENTO_BASE_IMAGE | Instructions to set up multi-architecture support and base images, and to install user-defined system packages. |
| SETUP_BENTO_USER | Set up the bento user with the correct UID, GID, and directory for a 🍱. |
| SETUP_BENTO_ENVARS | Add user environment variables (if specified) and other variables required by BentoML. |
| SETUP_BENTO_COMPONENTS | Set up components for a 🍱, including installing pip packages, running setup scripts, installing bentoml, etc. |
| SETUP_BENTO_ENTRYPOINT | Finalize ports and set ENTRYPOINT and CMD for the 🍱. |

Note

All the defined blocks are prefixed with SETUP_BENTO_. This ensures that users can extend the blocks defined by BentoML without sacrificing the flexibility of a Jinja template.

To extend a given block, add {{ super() }} at any point inside the block. This ensures that the block inherits the content of the corresponding SETUP_BENTO_* block defined by BentoML.
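
The position of {{ super() }} decides where your instructions land relative to the ones BentoML generates. The following plain-Python sketch only imitates that substitution for illustration; it does not use Jinja2, the function name `expand_block` is hypothetical, and the parent content is a placeholder for whatever BentoML renders in the block.

```python
SUPER_CALL = "{{ super() }}"


def expand_block(child_lines, parent_lines):
    """Replace each super() marker in the child block with the parent's lines."""
    result = []
    for line in child_lines:
        if line.strip() == SUPER_CALL:
            result.extend(parent_lines)
        else:
            result.append(line)
    return result


# Placeholder for the instructions BentoML generates in this block.
parent = ["# instructions generated by BentoML"]

child = [
    "RUN echo 'before the BentoML instructions'",
    "{{ super() }}",
    "RUN echo 'after the BentoML instructions'",
]

print("\n".join(expand_block(child, parent)))
```

A child block that never calls {{ super() }} replaces the parent block entirely, which is why omitting it can produce a Dockerfile that no longer works with a Bento.
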

The following are examples of how to use custom blocks:

{% extends bento_autotemplate %}
{% block SETUP_BENTO_USER %}
{{ super() }}
ENV CUSTOM_USER_VAR=foobar
{% endblock %}

{% extends bento_autotemplate %}
{% block SETUP_BENTO_COMPONENTS %}
RUN --mount=type=ssh git clone git@github.com:myorg/myproject.git myproject
{{ super() }}
{% endblock %}

{% extends bento_autotemplate %}
{% block SETUP_BENTO_BASE_IMAGE %}
FROM --platform=$BUILDPLATFORM tensorflow/tensorflow:latest-devel as tf
{{ super() }}
COPY --from=tf /tf /tf
...
{% endblock %}

An example of a custom Dockerfile template:

{% extends bento_autotemplate %}
{% set bento__home = "/tmp" %}
{% block SETUP_BENTO_ENTRYPOINT %}
{{ super() }}
RUN --mount=type=cache,mode=0777,target=/root/.cache/pip \
    pip install awslambdaric==2.0.0 mangum==0.12.3

ENTRYPOINT [ "/usr/bin/python3", "-m", "awslambdaric" ]

{% endblock %}

About setting up 🍱 ENTRYPOINT and CMD

As seen in the example above, the flexibility of a Jinja template also extends to setting up ENTRYPOINT and CMD.

From the Dockerfile documentation:

Only the last ENTRYPOINT instruction in the Dockerfile will have an effect.

By default, a Bento sets:

ENTRYPOINT [ "{{ bento__entrypoint }}" ]

CMD ["bentoml", "serve", "{{ bento__path }}", "--production"]

This means that if you have multiple ENTRYPOINT instructions, you have to make sure that the last ENTRYPOINT runs bentoml when docker run is used on the 🍱 container.
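
To see which ENTRYPOINT takes effect, you can scan a rendered Dockerfile for the last such instruction. This is a minimal sketch with a hypothetical helper, `effective_entrypoint`: it assumes a single-stage Dockerfile with one instruction per line, and the paths in the sample are placeholders rather than the values BentoML renders.

```python
def effective_entrypoint(dockerfile_text):
    """Return the last ENTRYPOINT instruction line, or None if there is none."""
    last = None
    for raw_line in dockerfile_text.splitlines():
        parts = raw_line.strip().split(None, 1)
        # Dockerfile instruction names are case-insensitive.
        if parts and parts[0].upper() == "ENTRYPOINT":
            last = raw_line.strip()
    return last


# Placeholder rendering: a default ENTRYPOINT followed by a custom one.
rendered = "\n".join([
    'ENTRYPOINT [ "/path/to/bento/entrypoint.sh" ]',
    'CMD ["bentoml", "serve", "/path/to/bento", "--production"]',
    'ENTRYPOINT [ "/usr/bin/python3", "-m", "awslambdaric" ]',
])

print(effective_entrypoint(rendered))
# ENTRYPOINT [ "/usr/bin/python3", "-m", "awslambdaric" ]
```
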

If you need to set up a different ENTRYPOINT, use the ENTRYPOINT instruction under the SETUP_BENTO_ENTRYPOINT block as follows:

{% extends bento_autotemplate %}
{% block SETUP_BENTO_ENTRYPOINT %}
{{ super() }}

...
ENTRYPOINT [ "{{ bento__entrypoint }}", "python", "-m", "awslambdaric" ]
{% endblock %}

Tip

{{ bento__entrypoint }} is the path to the BentoML entrypoint, nothing special here 😏

Read more about CMD and ENTRYPOINT interaction here.

Customizing Bento variables

In the example Dockerfile template above, we also set the bento__home variable. BentoML exposes some variables that users can modify to fit their needs.

The following variables can be set in a custom Dockerfile template:

| Variables | Description |
|---|---|
| bento__home | Bento home directory, defaults to /home/{{ bento__user }} |
| bento__user | Bento user, defaults to bentoml |
| bento__uid_gid | UID and GID for the user, defaults to 1034:1034 |
| bento__path | Bento path, defaults to /home/{{ bento__user }}/bento |

If any of the aforementioned fields are set with {% set ... %}, then we will use your value instead, otherwise a default value will be used.
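
The fallback behaviour can be modelled as a merge of user-set values over the documented defaults. This is a minimal sketch, not BentoML's code: the function name `resolve_bento_variables` is hypothetical, and it assumes that the defaults for bento__home and bento__path follow a custom bento__user, as the default expressions above suggest.

```python
def resolve_bento_variables(overrides=None):
    """Return the variables with user-set values taking precedence over defaults."""
    values = dict(overrides or {})
    user = values.setdefault("bento__user", "bentoml")
    values.setdefault("bento__uid_gid", "1034:1034")
    values.setdefault("bento__home", f"/home/{user}")
    values.setdefault("bento__path", f"/home/{user}/bento")
    return values


# Equivalent of {% set bento__home = "/tmp" %} in a custom template.
print(resolve_bento_variables({"bento__home": "/tmp"}))
```

With only bento__home overridden, the other three variables keep their defaults, so bento__path is still /home/bentoml/bento in this model.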
