Building Bentos#
What is a Bento?#
Bento 🍱 is a file archive with all the source code, models, data files and dependency configurations required for running a user-defined bentoml.Service, packaged into a standardized format.
While bentoml.Service standardizes the inference API definition, including the serving logic, runner initialization, and API input/output types, Bento standardizes how to reproduce the required environment for running a bentoml.Service in production.
Note
"Bento Build" is essentially the build process in traditional software development, where source code files are converted into standalone artifacts that are ready to deploy. BentoML reimagines this process for Machine Learning model delivery, optimizing the workflow both for interactive model development and for working with automated training pipelines.
The Build Command#
A Bento can be created with the bentoml build CLI command
with a bentofile.yaml
build file. Here's an example from the tutorial:
service: "service:svc" # Same as the argument passed to `bentoml serve`
labels:
owner: bentoml-team
stage: dev
include:
- "*.py" # A pattern for matching which files to include in the bento
python:
packages: # Additional pip packages required by the service
- scikit-learn
- pandas
» bentoml build
Building BentoML service "iris_classifier:dpijemevl6nlhlg6" from build context "/home/user/gallery/quickstart"
Packing model "iris_clf:zy3dfgxzqkjrlgxi"
Locking PyPI package versions..
Successfully built Bento(tag="iris_classifier:dpijemevl6nlhlg6")
Similar to saving a model, a unique version tag is automatically generated for the newly created Bento.
It is also possible to customize the Bento version string by passing it to the --version CLI option. However, this is generally not recommended; only use it if your team has a very specific naming convention for deployable artifacts, e.g.:
» bentoml build --version 1.0.1
Note
The Bento build process requires importing the bentoml.Service object defined in the project. This means the build environment must have all of the service's dependencies installed.
Support for building from a docker environment is on the roadmap, see #2495.
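Builds can also be triggered from Python, which can be convenient inside automated training pipelines. Below is a minimal sketch using the bentoml.bentos.build API, mirroring the bentofile.yaml above; argument names follow the 1.0 Python API, so verify them against your installed version:

import bentoml

# Programmatic equivalent of running `bentoml build` with the
# bentofile.yaml shown earlier; returns the newly created Bento.
bento = bentoml.bentos.build(
    "service:svc",
    build_ctx=".",
    labels={"owner": "bentoml-team", "stage": "dev"},
    include=["*.py"],
    python=dict(packages=["scikit-learn", "pandas"]),
)
print(bento.tag)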
Advanced Project Structure#
For projects that are part of a larger codebase and interact with other local python modules, or for projects containing multiple Bentos/Services, it may not be possible to put all service definition code and bentofile.yaml under the project's root directory.
BentoML allows placing the service definition file and bentofile anywhere in the project directory. In this case, the user needs to provide the build_ctx and bentofile arguments to the bentoml build CLI command.
- build_ctx: the build context, i.e. your Python project's working directory. This is where you start the Python interpreter during development so that your local python modules can be imported properly. Defaults to the current directory where bentoml build is run.
- bentofile: a .yaml file that specifies the Bento Build Options. Defaults to the bentofile.yaml file under the build context.
They can also be customized via the CLI command, e.g.:
» bentoml build -f ./src/my_project_a/bento_fraud_detect.yaml ./src/
Managing Bentos#
Bentos are the unit of deployment in BentoML, and one of the most important artifacts to keep track of in your model deployment workflow.
Local Bento Store#
Similar to Models, Bentos built locally can be managed via the bentoml CLI commands:
» bentoml list
Tag Size Creation Time Path
iris_classifier:nvjtj7wwfgsafuqj 16.99 KiB 2022-05-17 21:36:36 ~/bentoml/bentos/iris_classifier/nvjtj7wwfgsafuqj
iris_classifier:jxcnbhfv6w6kvuqj 19.68 KiB 2022-04-06 22:02:52 ~/bentoml/bentos/iris_classifier/jxcnbhfv6w6kvuqj
» bentoml get iris_classifier:latest
service: service:svc
name: iris_classifier
version: nvjtj7wwfgsafuqj
bentoml_version: 1.0.0
creation_time: '2022-05-17T21:36:36.436878+00:00'
labels:
  owner: bentoml-team
  project: gallery
models:
- tag: iris_clf:nb5vrfgwfgtjruqj
  module: bentoml.sklearn
  creation_time: '2022-05-17T21:36:27.656424+00:00'
runners:
- name: iris_clf
  runnable_type: SklearnRunnable
  models:
  - iris_clf:nb5vrfgwfgtjruqj
  resource_config:
    cpu: 4.0
    nvidia_gpu: 0.0
apis:
- name: classify
  input_type: NumpyNdarray
  output_type: NumpyNdarray
» bentoml delete iris_classifier:latest -y
Bento(tag="iris_classifier:nvjtj7wwfgsafuqj") deleted
Import and Export#
Bentos can be exported to a standalone archive file outside of the store, for sharing Bentos between teams or moving between different deployment stages. For example:
> bentoml export iris_classifier:latest .
INFO [cli] Bento(tag="iris_classifier:nvjtj7wwfgsafuqj") exported to ./iris_classifier-nvjtj7wwfgsafuqj.bento
> bentoml import ./iris_classifier-nvjtj7wwfgsafuqj.bento
INFO [cli] Bento(tag="iris_classifier:nvjtj7wwfgsafuqj") imported
Note
Bentos can be exported to or imported from AWS S3, GCS, FTP, Dropbox, etc. For example, with S3:
pip install fs-s3fs # Additional dependency required for working with s3
bentoml import s3://bentoml.com/quickstart/iris_classifier.bento
bentoml export iris_classifier:latest s3://my_bucket/my_prefix/
For the full list of filesystem plugins usable for upload, see the list provided by the pyfilesystem library.
Testing#
After you build a Bento, it's essential to test it locally before containerizing it or pushing it to BentoCloud for production deployment. Local testing ensures that the Bento behaves as expected and helps identify any potential issues. Here are two methods to test a Bento locally.
Via BentoML CLI#
You can easily serve a Bento using the BentoML CLI. Replace BENTO_TAG
with your specific Bento tag (for example, iris_classifier:latest
) in the following command.
bentoml serve BENTO_TAG
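Once the server is running (by default at http://127.0.0.1:3000), you can send a test request against one of its endpoints. A minimal sketch, assuming the classify endpoint from the iris_classifier quickstart:

curl -X POST \
   -H "Content-Type: application/json" \
   --data '[[4.9, 3.0, 1.4, 0.2]]' \
   http://127.0.0.1:3000/classify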
Via bentoml.Server API#
For those working within scripting environments or running Python-based tests where using the CLI might be
difficult, the bentoml.Server
API offers a more programmatic way to serve and interact with your Bento.
It gives you detailed control over the server lifecycle, especially useful for debugging and iterative testing.
The following example uses the Bento iris_classifier:latest
created in the quickstart Deploy an Iris classification model with BentoML
to create an HTTP server. Note that GrpcServer
is also available.
from bentoml import HTTPServer
import numpy as np
# Initialize the server with the Bento
server = HTTPServer("iris_classifier:latest", production=True, port=3000, host='0.0.0.0')
# Start the server without blocking the current thread
server.start(blocking=False)
# Get a client to make requests to the server
client = server.get_client()
# Send a request using the client
result = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
print(result)
# Stop the server to free up resources
server.stop()
Alternatively, you can manage the serverβs lifecycle using a context manager. This ensures that the server is automatically stopped once you exit the with
block.
from bentoml import HTTPServer
import numpy as np
server = HTTPServer("iris_classifier:latest", production=True, port=3000, host='0.0.0.0')
with server.start() as client:
result = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
print(result)
Push and Pull#
Yatai provides a centralized Bento repository that comes with flexible APIs and a Web UI for managing all Bentos created by your team. It can be configured to store Bento files on cloud blob storage such as AWS S3, MinIO or GCS, and to automatically build docker images when a new Bento is pushed.
» bentoml push iris_classifier:latest
Successfully pushed Bento "iris_classifier:nvjtj7wwfgsafuqj"
» bentoml pull iris_classifier:nvjtj7wwfgsafuqj
Successfully pulled Bento "iris_classifier:nvjtj7wwfgsafuqj"

Bento Management API#
Similar to Managing Models, equivalent Python APIs are also provided for managing Bentos:
import bentoml
bento = bentoml.get("iris_classifier:latest")
print(bento.tag)
print(bento.path)
print(bento.info.to_dict())
import bentoml
bentos = bentoml.list()
import bentoml
bentoml.export_bento('my_bento:latest', '/path/to/folder/my_bento.bento')
bentoml.import_bento('/path/to/folder/my_bento.bento')
Note
Bentos can be exported to or imported from AWS S3, GCS, FTP, Dropbox, etc. For example: bentoml.export_bento('my_bento:latest', 's3://my_bucket/folder')
If your team has Yatai set up, you can also push local Bentos to Yatai. It provides APIs and a Web UI for managing all Bentos created by your team, stores Bento files on cloud blob storage such as AWS S3, MinIO or GCS, and automatically builds docker images when a new Bento is pushed.
import bentoml
bentoml.push("iris_classifier:nvjtj7wwfgsafuqj")
bentoml.pull("iris_classifier:nvjtj7wwfgsafuqj")
import bentoml
bentoml.delete("iris_classifier:nvjtj7wwfgsafuqj")
Whatβs inside a Bento#
It is possible to view the generated files in a specific Bento. Simply use the
-o/--output
option of the bentoml get
command to find the file path to
the Bento archive directory.
» cd $(bentoml get iris_classifier:latest -o path)
» tree
.
├── README.md
├── apis
│   └── openapi.yaml
├── bento.yaml
├── env
│   ├── docker
│   │   ├── Dockerfile
│   │   └── entrypoint.sh
│   └── python
│       ├── requirements.lock.txt
│       ├── requirements.txt
│       └── version.txt
├── models
│   └── iris_clf
│       ├── latest
│       └── nb5vrfgwfgtjruqj
│           ├── model.yaml
│           └── saved_model.pkl
└── src
    ├── locustfile.py
    ├── service.py
    └── train.py
- The src directory contains files specified under the include field in bentofile.yaml. These files are relative to the user Python code's CWD (current working directory), which makes importing relative modules and file paths inside user code possible.
- The models directory contains all models required by the Service. This is automatically determined from the bentoml.Service object's runners list.
- The apis directory contains all API definitions, including API specs that are generated from the bentoml.Service object's API definitions.
- The env directory contains all environment-related files that help bootstrap the Bento 🍱. These files are generated from the Bento Build Options specified in bentofile.yaml.
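If you prefer inspecting a Bento programmatically, the same archive path and metadata are available through the Python API shown earlier; a minimal sketch:

import bentoml

# Locate the Bento archive directory without shelling out to the CLI.
bento = bentoml.get("iris_classifier:latest")
print(bento.path)            # path to the archive directory listed above
print(bento.info.to_dict())  # parsed contents of bento.yaml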
Note
Users should never change files in the generated Bento archive, unless it's for debugging purposes.
Bento Build Options#
Build options are specified in a .yaml file, which customizes the final Bento produced. By convention, this file is named bentofile.yaml.
In this section, we will go over all the build options, including defining dependencies, configuring files to include, and customizing docker image settings.
Service#
service
is a required field which specifies where the
bentoml.Service
object is defined.
In the tutorial, we defined service: "service:svc", which can be interpreted as:
- service refers to the Python module (the service.py file)
- svc refers to the bentoml.Service object created in service.py, with svc = bentoml.Service(...)
Tip
This is synonymous with how the bentoml serve command specifies a bentoml.Service target.
┌──────────────────────────────┐
│ bentofile.yaml               │
│                              │
│   service: "service:svc"     │
│              │               │
└──────────────┼───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ bash                         │
│                              │
│ » bentoml serve service:svc  │
└──────────────────────────────┘
Description#
The description field allows users to customize the documentation for any given Bento. The description contents must be plain text, optionally in Markdown format. The description can be specified either inline in bentofile.yaml, or via a file path to an existing text file:
service: "service.py:svc"
description: |
## Description For My Bento π±
Use **any markdown syntax** here!
> BentoML is awesome!
include:
...
service: "service.py:svc"
description: "file: ./README.md"
include:
...
Tip
When pointing to a description file, the path can be either absolute or relative. The file must exist at the given path when the bentoml build command runs; for a relative path, the base is the build_ctx, which defaults to the directory from which bentoml build was executed.
Labels#
labels
are key-value pairs that are attached to an object.
In BentoML, both Bento
and Model
can have labels attached to them. Labels are intended to
be used to specify identifying attributes of Bentos/Models that are meaningful and
relevant to users, but do not directly imply semantics to the rest of the system.
Labels can be used to organize models and Bentos in Yatai, which also allows users to add or modify labels at any time.
labels:
owner: bentoml-team
stage: not-ready
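Labels can also be attached when saving a model. A minimal sketch, assuming a scikit-learn model and the labels keyword argument of save_model:

import bentoml
from sklearn import datasets, svm

# Train a toy classifier, then attach labels when saving it
# to the local model store.
iris = datasets.load_iris()
clf = svm.SVC(gamma="scale")
clf.fit(iris.data, iris.target)

bentoml.sklearn.save_model(
    "iris_clf",
    clf,
    labels={"owner": "bentoml-team", "stage": "not-ready"},
)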
Files to include#
In the example above, the *.py pattern includes every Python file under the build_ctx. You can also use other wildcard and directory patterns:
...
include:
- "data/"
- "**/*.py"
- "config/*.json"
- "path/to/a/file.csv"
If the include field is not specified, BentoML will include all files under the build_ctx directory, except those explicitly excluded, as demonstrated in Files to exclude below.
See also
Both include and exclude fields support gitignore-style pattern matching.
Files to exclude#
If there are a lot of files under the working directory, another approach is to specify only the files that should be ignored. The exclude field specifies the pathspecs (similar to .gitignore files) of files to be excluded from the final Bento build. The pathspecs are relative to the build_ctx directory.
...
include:
- "data/"
- "**/*.py"
exclude:
- "tests/"
- "secrets.key"
Users can also opt to place a .bentoignore file in the build_ctx directory. Here is what a .bentoignore file looks like:
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints/
training_data/
Note
exclude
is always applied after include
.
Python Packages#
Required Python packages for a given Bento can be specified under the python.packages field. When a package name is left without a version, BentoML will lock the package to the version available in the current environment when running bentoml build. Users can also specify a desired version, install from a custom PyPI source, or install from a GitHub repo:
python:
  packages:
    - "numpy"
    - "matplotlib==3.5.1"
    - "package>=0.2,<0.3"
    - "torchvision==0.9.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cpu"
    - "git+https://github.com/username/mylib.git@main"
Note
There's no need to specify bentoml as a dependency here, since BentoML will add the current version of BentoML to the Bento's dependency list by default. Users can override this by specifying a different BentoML version.
To use a variant of BentoML with additional features such as gRPC, tracing exporters, or pydantic validation, specify the desired variant under the python.packages field:
python:
  packages:
    # Choose the variant(s) you need:
    - "bentoml[grpc]"            # enable gRPC functionality
    - "bentoml[aws]"             # AWS interop
    - "bentoml[io-json]"         # Pydantic validation for the JSON IO descriptor
    - "bentoml[io-image]"        # Pillow dependencies for the Image IO descriptor
    - "bentoml[io-pandas]"       # Pandas dependencies for the PandasDataFrame descriptor
    - "bentoml[tracing-jaeger]"  # Jaeger tracing exporter
    - "bentoml[tracing-zipkin]"  # Zipkin tracing exporter
    - "bentoml[tracing-otlp]"    # OTLP tracing exporter
If you already have a
requirements.txt
file that defines python packages for your project, you may also supply a path to the
requirements.txt
file directly:
python:
  requirements_txt: "./project-a/ml-requirements.txt"
Pip Install Options#
Additional pip install arguments can also be provided. Note that these arguments are applied to all packages defined in python.packages, as well as to the requirements_txt file, if provided.
python:
  requirements_txt: "./requirements.txt"
  index_url: "https://my.mirror.com/simple"
  no_index: False
  trusted_host:
    - "pypi.python.org"
    - "my.mirror.com"
  find_links:
    - "https://download.pytorch.org/whl/cu80/stable.html"
  extra_index_url:
    - "https://<other api token>:@my.mirror.com/pypi/simple"
    - "https://pypi.python.org/simple"
  pip_args: "--pre -U --force-reinstall"
Note
BentoML by default will cache pip artifacts across all local image builds to speed up the build process.
If you want to force a re-download instead of using the cache, you can specify the pip_args: "--no-cache-dir" option in your bentofile.yaml, or use the --no-cache option of the bentoml containerize command, e.g.:
» bentoml containerize my_bento:latest --no-cache
PyPI Package Locking#
By default, BentoML automatically locks all package versions, as well as all packages in their dependency graph, to the versions found in the current build environment, and generates a requirements.lock.txt file. This process uses pip-compile under the hood.
If you have already specified a version for all packages, you can optionally disable this behavior by setting the lock_packages field to false:
python:
  requirements_txt: "requirements.txt"
  lock_packages: false
Python Wheels#
Python .whl files are also supported as a type of dependency to include in a Bento. Simply provide a path to your .whl files under the wheels field.
python:
  wheels:
    - ./lib/my_package.whl
If the wheel is hosted on a local network without TLS, you can indicate
that the domain is safe to pip with the trusted_host
field.
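For example, a hypothetical in-house package index serving wheels over plain HTTP might be configured like this (the host and port below are made up for illustration):

python:
  index_url: "http://192.168.0.42:8080/simple"  # hypothetical internal index
  trusted_host:
    - "192.168.0.42"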
Python Options Table#
| Field | Description |
| --- | --- |
| requirements_txt | The path to a custom requirements.txt file |
| packages | Packages to include in this Bento |
| lock_packages | Whether to lock the packages or not |
| index_url | Inputs for the `--index-url` pip argument |
| no_index | Whether to include the `--no-index` pip argument |
| trusted_host | List of trusted hosts used as inputs using the `--trusted-host` pip argument |
| find_links | List of links to find as inputs using the `--find-links` pip argument |
| extra_index_url | List of extra index urls as inputs using the `--extra-index-url` pip argument |
| pip_args | Any additional pip arguments that you would like to add when installing a package |
| wheels | List of paths to wheels to include in the Bento |
Models#
You can specify the model to be used for building a Bento using a string model tag or a dictionary, which will be written to the bento.yaml file in the Bento package.
When you start from an existing project, you can download models from Yatai to your local model store with these configurations by running bentoml models pull. Note that you need to log in to Yatai first by running bentoml yatai login.
See the following example for details. If you don't define models in bentofile.yaml, the model specified in the service is used to build the Bento.
models:
  - "iris_clf:latest"  # A string model tag
  - tag: "iris_clf:version1"  # A dictionary
    filter: "label:staging"
    alias: "iris_clf_v1"
- tag: The name and version of the model, separated by a colon.
- filter: This field uses the same filter syntax as Yatai. You use a filter to list specific models, such as models with the same label. You can add multiple comma-separated filters to a model.
- alias: An alias for the model. If this is specified, you can use it directly in code like bentoml.models.get(alias).
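With the alias above, service code does not need to hard-code a specific model version; a minimal sketch:

import bentoml

# Resolve the model through the alias declared in bentofile.yaml,
# instead of referencing a specific version tag.
model = bentoml.models.get("iris_clf_v1")
print(model.tag)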
Conda Options#
Conda dependencies can be specified under the conda field. For example:
conda:
  channels:
    - default
  dependencies:
    - h2o
  pip:
    - "scikit-learn==1.2.0"
When the channels field is left unspecified, BentoML will use the community-maintained conda-forge channel as the default.
Optionally, you can export all dependencies from a preexisting conda environment to
an environment.yml
file, and provide this file in your bentofile.yaml
config:
Export conda environment:
» conda env export > environment.yml
In your bentofile.yaml
:
conda:
  environment_yml: "./environment.yml"
Note
Unlike Python packages, BentoML does not support locking conda package versions automatically. It is recommended that users specify package versions in the config file.
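For example, pinning the conda dependency from the earlier snippet (the version below is made up for illustration):

conda:
  channels:
    - default
  dependencies:
    - "h2o=3.36.0"  # hypothetical pinned version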
See also
When conda options are provided, BentoML will select a docker base image that comes with Miniconda pre-installed in the generated Dockerfile. Note that only the debian and alpine distros support conda. Learn more in the Docker Options section below.
Conda Options Table#
| Field | Description |
| --- | --- |
| environment_yml | Path to a conda environment file to copy into the Bento. If specified, this file will overwrite any additional option specified |
| channels | Custom conda channels to use. If not specified, will use `conda-forge` |
| dependencies | Custom conda dependencies to include in the environment |
| pip | The specific `pip` dependencies to install in the conda environment |
Docker Options#
BentoML makes it easy to deploy a Bento as a Docker container. This section discusses the available options for customizing the docker image generated from a Bento.
Here's a basic Docker options configuration:
docker:
  distro: debian
  python_version: "3.8.12"
  cuda_version: "11.6.2"
  system_packages:
    - libblas-dev
    - liblapack-dev
    - gfortran
  env:
    FOO: value1
    BAR: value2
Note
BentoML leverages BuildKit, a cache-efficient builder toolkit, to containerize Bentos 🍱.
BuildKit ships with Docker 18.09 and newer, so if you are using Docker via Docker Desktop, BuildKit is available by default.
If you are using a standalone version of Docker, you can install BuildKit by following the instructions here.
OS Distros#
The following OS distros are currently supported in BentoML:
- debian: the default, similar to Ubuntu
- alpine: a minimal Docker image based on Alpine Linux
- ubi8: Red Hat Universal Base Image
- amazonlinux: Amazon Linux 2
Some of the distros may not support using conda or specifying CUDA for GPU. Here is the support matrix for all distros:
| Distro | Available Python Versions | Conda Support | CUDA Support (GPU) |
| --- | --- | --- | --- |
| debian | 3.7, 3.8, 3.9, 3.10 | Yes | Yes |
| alpine | 3.7, 3.8, 3.9, 3.10 | Yes | No |
| ubi8 | 3.8, 3.9 | No | Yes |
| amazonlinux | 3.7, 3.8 | No | No |
GPU support#
The cuda_version field specifies the target CUDA version to install in the generated docker image. Currently, the following CUDA versions are supported:
"11.6.2"
"11.4.3"
"11.2.2"
BentoML will also install the additional packages required for the given target CUDA version.
docker:
  cuda_version: "11.6.2"
If you need a different CUDA version that is not currently supported in BentoML, you can install it by specifying it in the system_packages field or via a setup_script. The following demonstrates how to install a custom CUDA version via conda.
Add the following to your bentofile.yaml:
conda:
  channels:
    - conda-forge
    - nvidia
    - defaults
  dependencies:
    - cudatoolkit-dev=10.1
    - cudnn=7.6.4
    - cxx-compiler=1.0
    - mpi4py=3.0 # installs cuda-aware openmpi
    - matplotlib=3.2
    - networkx=2.4
    - numba=0.48
    - pandas=1.0
Then proceed with bentoml build
and bentoml containerize
respectively:
» bentoml build
» bentoml containerize <bento>:<tag>
Setup Script#
For advanced Docker customization, you can also use a setup_script to inject an arbitrary user-provided script during the image build process. For example, with NLP projects you can pre-download NLTK data in the image:
In your bentofile.yaml
:
...
python:
  packages:
    - nltk
docker:
  setup_script: "./setup.sh"
In the setup.sh
file:
#!/bin/bash
set -euxo pipefail
echo "Downloading NLTK data.."
python -m nltk.downloader all
Now build a new Bento and then run bentoml containerize MY_BENTO --progress plain to view the docker image build progress. The newly built docker image will contain the pre-downloaded NLTK dataset.
Tip
When working with bash scripts, it is recommended to add set -euxo pipefail at the beginning. In particular, when set -e is missing, the script can fail silently without raising an exception during bentoml containerize. Learn more about the Bash Set builtin.
It is also possible to provide a Python script for initializing the docker image. Hereβs an example:
In bentofile.yaml
:
...
python:
  packages:
    - nltk
docker:
  setup_script: "./setup.py"
In the setup.py
file:
#!/usr/bin/env python
import nltk
print("Downloading NLTK data..")
nltk.download('treebank')
Note
Pay attention to #!/bin/bash and #!/usr/bin/env python in the first line of the example scripts above. These are known as shebangs, and they are required in a setup script provided to BentoML.
The setup script is always executed after the specified Python packages, conda dependencies, and system packages are installed. Thus, users can import and utilize those libraries in their setup script for the initialization process.
Enable features for your Bento#
Users can optionally pass the --enable-features flag to bentoml containerize to enable additional features for the generated Bento container image.
| `--enable-features` value | Feature |
| --- | --- |
| `aws` | adding AWS interop (currently file upload to S3) |
| `grpc` | enable gRPC functionalities in BentoML |
| `grpc-channelz` | enable gRPC Channelz for debugging purposes |
| `grpc-reflection` | enable gRPC Reflection |
| `io-image` | adding Pillow dependencies to Image IO descriptor |
| `io-json` | adding Pydantic validation to JSON IO descriptor |
| `io-pandas` | adding Pandas dependencies to PandasDataFrame descriptor |
| `tracing-jaeger` | enable Jaeger Exporter for distributed tracing |
| `tracing-otlp` | enable OTLP Exporter for distributed tracing |
| `tracing-zipkin` | enable Zipkin Exporter for distributed tracing |
| `monitor-otlp` | enable Monitoring feature |
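For example, to containerize a Bento with gRPC and Jaeger tracing enabled (flag values taken from the table above; adjust to the features you need):

» bentoml containerize iris_classifier:latest --enable-features=grpc,tracing-jaeger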
Advanced Options#
For advanced customization of the generated docker images, see Advanced Containerization.
Docker Options Table#
| Field | Description |
| --- | --- |
| distro | The OS distribution on the Docker image. Defaults to `debian`. |
| python_version | Specify which python version to include on the Docker image [3.7, 3.8, 3.9, 3.10]. Defaults to the Python version in the build environment. |
| cuda_version | Specify the cuda version to install on the Docker image [`11.6.2`, `11.4.3`, `11.2.2`]. |
| system_packages | Declare system packages to be installed in the container. |
| env | Declare environment variables in the generated Dockerfile. |
| setup_script | A python or shell script that executes during docker build time. |
| base_image | A user-provided docker base image. This will override all other custom attributes of the image. |
| dockerfile_template | Customize the generated dockerfile by providing a Jinja2 template that extends the default dockerfile. |