Getting Started

In this guide we will show you how to create a local web service for your machine learning model(s), then package that web service into a self-contained archive (a Bento) that is ready for production deployment. The source code for this guide is available in the gallery project.

There are three parts to the BentoML workflow.

  1. Save your Model

    Once model training is complete, use one of BentoML's framework-specific APIs to save your model in BentoML's standard format.

  2. Define your Service

    Now that the model is stored in our standard format, we can define the web service that will host it. In this definition, you can easily add pre- and post-processing code alongside your model inference.

  3. Build and Deploy your Bento

    Finally, let BentoML build your deployable artifact (your Bento) and assist you in deploying it to the cloud service of your choice.

Installation

BentoML is distributed as a Python package and can be installed from PyPI:

pip install bentoml --pre

** The --pre flag is required because BentoML 1.0 is still a preview release.

Save Models

We begin by saving a trained model instance to BentoML’s local model store. The local model store is used to version your models as well as control which models are packaged with your bento.

If the models you wish to use are already saved to disk or available in a cloud repository, they can also be added to BentoML with the import APIs.
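As a minimal sketch of what that might look like (this assumes the bentoml.models.import_model API; the local archive path below is hypothetical, so adjust both to your setup):

import bentoml

# Hypothetical path to a model archive previously exported from another
# model store; import_model registers it in the local model store
model = bentoml.models.import_model("./iris_clf.bentomodel")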

import bentoml

from sklearn import svm
from sklearn import datasets

# Load predefined training set to build an example model
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)

# Call to bentoml.<FRAMEWORK>.save(<MODEL_NAME>, model)
# In order to save to BentoML's standard format in a local model store
bentoml.sklearn.save("iris_clf", clf)

# [08:34:16 AM] INFO     Successfully saved Model(tag="iris_clf:svcryrt5xgafweb5",
#                        path="/home/user/bentoml/models/iris_clf/svcryrt5xgafweb5/")
# Tag(name='iris_clf', version='svcryrt5xgafweb5')

bentoml.sklearn.save() will save the iris classifier to a local model store managed by BentoML. See the ML framework-specific APIs for all supported modeling libraries.

You can then load the model and run it inline using the bentoml.<FRAMEWORK>.load(<TAG>) API:
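For example, a minimal sketch (load returns the original scikit-learn estimator, so its native predict method is available):

# Load the saved model back as a native scikit-learn object
model = bentoml.sklearn.load("iris_clf:latest")
model.predict([[5.9, 3.0, 5.1, 1.8]])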

Or you can use our performance-optimized runners via the bentoml.<FRAMEWORK>.load_runner(<TAG>) API:

import numpy as np

iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")
iris_clf_runner.run(np.array([5.9, 3.0, 5.1, 1.8]))

Models can also be managed via the bentoml models CLI command. For more information use bentoml models --help.

> bentoml models list iris_clf

Tag                        Module           Path                                                 Size      Creation Time
iris_clf:svcryrt5xgafweb5  bentoml.sklearn  /home/user/bentoml/models/iris_clf/svcryrt5xgafweb5  5.81 KiB  2022-01-25 08:34:16

Define and Debug Services

Services are the core components of BentoML, where the serving logic is defined. With the model saved in the model store, we can define the service by creating a Python file service.py with the following contents:

# service.py
import numpy as np
import bentoml
from bentoml.io import NumpyNdarray

# Load the runner for the latest ScikitLearn model we just saved
iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")

# Create the iris_classifier service with the ScikitLearn runner
# Multiple runners may be specified if needed in the runners array
# When packaged as a bento, the runners listed here will be included
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Create an API function with pre- and post-processing logic, registered via the svc.api decorator
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    # Define pre-processing logic
    result = iris_clf_runner.run(input_series)
    # Define post-processing logic
    return result

In this example, we defined the input and output types to be numpy.ndarray. More options, such as pandas.DataFrame and PIL.Image, are also supported. To see all supported options, see API and IO Descriptors.
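For instance, a DataFrame-based endpoint might look like the following sketch (classify_df is a hypothetical endpoint name; the svc and iris_clf_runner objects are assumed to be defined as above):

# Hypothetical variant of the endpoint accepting a pandas DataFrame
import numpy as np
import pandas as pd
from bentoml.io import NumpyNdarray, PandasDataFrame

@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def classify_df(input_df: pd.DataFrame) -> np.ndarray:
    # Convert the DataFrame to an ndarray before inference
    return iris_clf_runner.run(input_df.to_numpy())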

We now have everything we need to serve our first request. Launch the server in development mode by running the bentoml serve command in the current working directory. The --reload option lets the server reflect any changes made to the service.py module without restarting:

> bentoml serve ./service.py:svc --reload

02/24/2022 02:43:40 INFO     [cli] Starting development BentoServer from "./service.py:svc" running on http://127.0.0.1:3000 (Press CTRL+C to quit)
02/24/2022 02:43:41 INFO     [dev_api_server] Service imported from source: bentoml.Service(name="iris_classifier", import_str="service:svc", working_dir="/home/user/gallery/quickstart")
02/24/2022 02:43:41 INFO     [dev_api_server] Will watch for changes in these directories: ['/home/user/gallery/quickstart']
02/24/2022 02:43:41 INFO     [dev_api_server] Started server process [25915]
02/24/2022 02:43:41 INFO     [dev_api_server] Waiting for application startup.
02/24/2022 02:43:41 INFO     [dev_api_server] Application startup complete.

We can then send requests to the newly started service with any HTTP client:

import requests

response = requests.post(
    "http://127.0.0.1:3000/classify",
    headers={"content-type": "application/json"},
    data="[5,4,3,2]",
)
print(response.text)
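Equivalently, the same request can be sent from the command line with any HTTP client, for example curl:

> curl -X POST -H "content-type: application/json" --data "[5,4,3,2]" http://127.0.0.1:3000/classify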

BentoML optimizes your service in a number of ways. For example, it is built on Starlette and Uvicorn, among the fastest Python web frameworks and servers, in order to serve your model efficiently at scale.

For more information on our performance optimizations please see BentoServer.

Build and Deploy Bentos

Once we are happy with the service definition, we can build the model and service into a bento. Bentos are the distribution format for services and contain all the information required to run or deploy them, such as models and dependencies. For more information about building bentos, see Building Bentos.

To build a Bento, first create a file named bentofile.yaml in your project directory:

# bentofile.yaml
service: "service.py:svc"  # The import path of your service: <service_filename>:<service_variable_name>
description: "file: ./README.md"
labels:
  owner: bentoml-team
  stage: demo
include:
  - "*.py"  # A pattern for matching which files to include in the bento
python:
  packages:
    - scikit-learn  # Additional pip packages to be included in the bento
    - pandas

Next, use the bentoml build CLI command in the same directory to build a bento.

> bentoml build

02/24/2022 02:47:06 INFO     [cli] Building BentoML service "iris_classifier:dpijemevl6nlhlg6" from build context "/home/user/gallery/quickstart"
02/24/2022 02:47:06 INFO     [cli] Packing model "iris_clf:tf773jety6jznlg6" from "/home/user/bentoml/models/iris_clf/tf773jety6jznlg6"
02/24/2022 02:47:06 INFO     [cli] Locking PyPI package versions..
02/24/2022 02:47:08 INFO     [cli]
                            ██████╗░███████╗███╗░░██╗████████╗░█████╗░███╗░░░███╗██╗░░░░░
                            ██╔══██╗██╔════╝████╗░██║╚══██╔══╝██╔══██╗████╗░████║██║░░░░░
                            ██████╦╝█████╗░░██╔██╗██║░░░██║░░░██║░░██║██╔████╔██║██║░░░░░
                            ██╔══██╗██╔══╝░░██║╚████║░░░██║░░░██║░░██║██║╚██╔╝██║██║░░░░░
                            ██████╦╝███████╗██║░╚███║░░░██║░░░╚█████╔╝██║░╚═╝░██║███████╗
                            ╚═════╝░╚══════╝╚═╝░░╚══╝░░░╚═╝░░░░╚════╝░╚═╝░░░░░╚═╝╚══════╝

02/24/2022 02:47:08 INFO     [cli] Successfully built Bento(tag="iris_classifier:dpijemevl6nlhlg6") at "/home/user/bentoml/bentos/iris_classifier/dpijemevl6nlhlg6/"

Bentos built will be saved in the local bento store, which you can view using the bentoml list CLI command.

> bentoml list

Tag                               Service      Path                                                        Size       Creation Time
iris_classifier:dpijemevl6nlhlg6  service:svc  /home/user/bentoml/bentos/iris_classifier/dpijemevl6nlhlg6  19.46 KiB  2022-02-24 10:47:08

We can serve bentos from the bento store using the bentoml serve CLI command. Adding the --production option serves the bento in production mode:

> bentoml serve iris_classifier:latest --production

02/24/2022 03:01:19 INFO     [cli] Service loaded from Bento store: bentoml.Service(tag="iris_classifier:dpijemevl6nlhlg6", path="/Users/ssheng/bentoml/bentos/iris_classifier/dpijemevl6nlhlg6")
02/24/2022 03:01:19 INFO     [cli] Starting production BentoServer from "bento_identifier" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
02/24/2022 03:01:20 INFO     [iris_clf] Service loaded from Bento store: bentoml.Service(tag="iris_classifier:dpijemevl6nlhlg6", path="/Users/ssheng/bentoml/bentos/iris_classifier/dpijemevl6nlhlg6")
02/24/2022 03:01:20 INFO     [api_server] Service loaded from Bento store: bentoml.Service(tag="iris_classifier:dpijemevl6nlhlg6", path="/Users/ssheng/bentoml/bentos/iris_classifier/dpijemevl6nlhlg6")
02/24/2022 03:01:20 INFO     [iris_clf] Started server process [28761]
02/24/2022 03:01:20 INFO     [iris_clf] Waiting for application startup.
02/24/2022 03:01:20 INFO     [api_server] Started server process [28762]
02/24/2022 03:01:20 INFO     [api_server] Waiting for application startup.
02/24/2022 03:01:20 INFO     [api_server] Application startup complete.
02/24/2022 03:01:20 INFO     [iris_clf] Application startup complete.

Lastly, we can containerize bentos as Docker images using the bentoml containerize CLI command and manage bentos at scale using the model and bento management service.
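For example, to containerize the bento built above and run the resulting image locally (a sketch; it assumes Docker is installed, that the generated image is tagged with the bento tag, and that the image entrypoint accepts the serve --production arguments):

> bentoml containerize iris_classifier:latest
> docker run -p 3000:3000 iris_classifier:dpijemevl6nlhlg6 serve --production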

Further Reading