Getting Started

There are three parts to the BentoML workflow.

  1. Save Models

  2. Define and Debug Services

  3. Build and Deploy Bentos

Installation

BentoML is distributed as a Python package and can be installed from PyPI:

pip install bentoml --pre

Note: the --pre flag is required because BentoML 1.0 is still a preview release.
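
Once installed, the bentoml CLI should be available on your path; you can sanity-check the installation with the following command (the version shown will vary by release):

> bentoml --version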

Save Models

We begin by saving a trained model instance to BentoML's local model store. If the models you wish to use are already saved to disk, they can also be added to BentoML with the import APIs.
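
As a minimal sketch of that step, a scikit-learn classifier could be trained and saved as follows (the SVM estimator and the bundled Iris dataset here are illustrative assumptions; any scikit-learn model works the same way):

import bentoml
import bentoml.sklearn

from sklearn import datasets, svm

# Train a classifier on the Iris dataset (example model)
iris = datasets.load_iris()
clf = svm.SVC(gamma="scale")
clf.fit(iris.data, iris.target)

# Save the model under the name "iris_clf"; BentoML assigns a
# unique version tag (e.g. iris_clf:svcryrt5xgafweb5)
bentoml.sklearn.save("iris_clf", clf)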

The framework-specific API, bentoml.sklearn.save(), saves the Iris Classifier to a local model store managed by BentoML. You can then load a runner for this model with the load_runner() API:

import numpy as np

import bentoml.sklearn

iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")
iris_clf_runner.run(np.array([5.9, 3.0, 5.1, 1.8]))

Models can also be managed via the bentoml models CLI command. For more information, run bentoml models --help.

> bentoml models list iris_clf

Tag                        Module           Path                                                 Size      Creation Time
iris_clf:svcryrt5xgafweb5  bentoml.sklearn  /home/user/bentoml/models/iris_clf/svcryrt5xgafweb5  5.81 KiB  2022-01-25 08:34:16

Define and Debug Services

Services are the core components of BentoML, where the serving logic is defined. With the model saved in the model store, we can define the service by creating a Python file bento.py with the following contents:

# bento.py
import bentoml
import bentoml.sklearn
import numpy as np

from bentoml.io import NumpyNdarray

# Load the runner for the latest ScikitLearn model we just saved
iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")

# Create the iris_classifier service with the ScikitLearn runner
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Create API function with pre- and post- processing logic
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_ndarray: np.ndarray) -> np.ndarray:
    # Define pre-processing logic
    result = iris_clf_runner.run(input_ndarray)
    # Define post-processing logic
    return result

In this example, we defined the input and output types to be numpy.ndarray. More options, such as pandas.DataFrame and PIL.Image, are also supported. To see all supported options, see API and IO Descriptors.
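
For instance, the same service could expose an additional endpoint that accepts a pandas.DataFrame. This is a sketch assuming it is added to the bento.py above (the endpoint name predict_df is hypothetical, not from the original guide):

import pandas as pd

from bentoml.io import PandasDataFrame

# Additional API endpoint on the same svc, taking a DataFrame as input
@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def predict_df(input_df: pd.DataFrame) -> np.ndarray:
    # Convert the DataFrame to an ndarray before invoking the runner
    return iris_clf_runner.run(input_df.to_numpy())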

We now have everything we need to serve our first request. Launch the server in development mode by running the bentoml serve command in the current working directory. The --reload option lets the server pick up changes made to the bento.py module without restarting:

> bentoml serve ./bento.py:svc --reload

[10:18:42 AM] INFO     Starting development BentoServer from "./bento.py:svc"
[10:18:42 AM] INFO     Service imported from source: bentoml.Service(name="iris_classifier", import_str="bento:svc", working_dir="/home/user/devel/bentoml-quickstart")
[10:18:42 AM] INFO     Will watch for changes in these directories: ['/home/user/devel/bentoml-quickstart']
              INFO     Uvicorn running on http://127.0.0.1:5000 (Press CTRL+C to quit)
              INFO     Started reloader process [97796] using statreload
[10:18:43 AM] INFO     Started server process [97808]
              INFO     Waiting for application startup.
              INFO     Application startup complete.

We can then send requests to the newly started service with any HTTP client:

import requests

response = requests.post(
    "http://127.0.0.1:5000/predict",
    headers={"content-type": "application/json"},
    data="[5, 4, 3, 2]",
)
print(response.text)

Build and Deploy Bentos

Once we are happy with the service definition, we can build the model and service into a bento. A bento is the distribution format for a service; it contains all the information required to run or deploy that service, such as models and dependencies. For more information about building bentos, see Building Bentos.

To build a Bento, first create a bentofile.yaml in your project directory:

# bentofile.yaml
service: "bento.py:svc"
include:
  - "*.py"
python:
  packages:
    - scikit-learn

Next, use the bentoml build CLI command in the same directory to build a bento.

> bentoml build

[10:25:51 AM] INFO     Building BentoML service "iris_classifier:foereut5zgw3ceb5" from build context "/home/user/devel/bentoml-quickstart"
              INFO     Packing model "iris_clf:svcryrt5xgafweb5" from "/home/user/bentoml/models/iris_clf/svcryrt5xgafweb5"
              INFO
                       ██████╗░███████╗███╗░░██╗████████╗░█████╗░███╗░░░███╗██╗░░░░░
                       ██╔══██╗██╔════╝████╗░██║╚══██╔══╝██╔══██╗████╗░████║██║░░░░░
                       ██████╦╝█████╗░░██╔██╗██║░░░██║░░░██║░░██║██╔████╔██║██║░░░░░
                       ██╔══██╗██╔══╝░░██║╚████║░░░██║░░░██║░░██║██║╚██╔╝██║██║░░░░░
                       ██████╦╝███████╗██║░╚███║░░░██║░░░╚█████╔╝██║░╚═╝░██║███████╗
                       ╚═════╝░╚══════╝╚═╝░░╚══╝░░░╚═╝░░░░╚════╝░╚═╝░░░░░╚═╝╚══════╝

              INFO     Successfully built Bento(tag="iris_classifier:foereut5zgw3ceb5") at "/home/user/bentoml/bentos/iris_classifier/foereut5zgw3ceb5/"

Built bentos are saved to the local bento store, which you can view using the bentoml list CLI command.

> bentoml list

Tag                               Service    Path                                                          Size       Creation Time
iris_classifier:foereut5zgw3ceb5  bento:svc  /home/user/bentoml/bentos/iris_classifier/foereut5zgw3ceb5  13.97 KiB  2022-01-25 10:25:51

We can serve bentos from the bento store using the bentoml serve CLI command; adding the --production option serves the bento in production mode.

> bentoml serve iris_classifier:latest --production

[09:04:18 PM] INFO     Starting production BentoServer from "iris_classifier:latest"
              INFO     Service loaded from Bento store: bentoml.Service(tag="iris_classifier:2qcg23t5zgzlseb5", path="/home/user/bentoml/bentos/iris_classifier/2qcg23t5zgzlseb5")
[09:04:19 PM] INFO     Service loaded from Bento store: bentoml.Service(tag="iris_classifier:2qcg23t5zgzlseb5", path="/home/user/bentoml/bentos/iris_classifier/2qcg23t5zgzlseb5")
[09:04:19 PM] INFO     Service loaded from Bento store: bentoml.Service(tag="iris_classifier:2qcg23t5zgzlseb5", path="/home/user/bentoml/bentos/iris_classifier/2qcg23t5zgzlseb5")
[09:04:19 PM] INFO     Started server process [28395]
              INFO     Waiting for application startup.
[09:04:19 PM] INFO     Started server process [28396]
              INFO     Waiting for application startup.
              INFO     Application startup complete.
              INFO     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
              INFO     Application startup complete.
              INFO     Uvicorn running on socket /run/user/1000/tmpy16ao7fo/140574878932496.sock (Press CTRL+C to quit)

Lastly, we can containerize bentos as Docker images using the bentoml containerize CLI command and manage bentos at scale using the model and bento management service.
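
For example, to containerize the bento built above (this assumes a working Docker installation on the machine):

> bentoml containerize iris_classifier:latest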