1.0 Migration Guide#
BentoML version 1.0.0 APIs are backward incompatible with version 0.13.1. However, most of the common functionality can be achieved with the new version. We will guide and demonstrate the migration by transforming the quickstart gallery project from BentoML version 0.13.1 to 1.0.0. Complete each migration action called out like the section below.
💡 Migration Task
Install BentoML version 1.0.0 by running the following command.
> pip install bentoml
Train Models#
First, the quickstart project begins by training a Scikit-Learn classifier on the iris dataset. Running python train.py produces a trained classifier model.
from sklearn import svm
from sklearn import datasets
# Load training data
iris = datasets.load_iris()
X, y = iris.data, iris.target
# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)
BentoML version 1.0.0 introduces the model store concept to help improve model management during development.
Once we are happy with the trained model, we can save the model instance with the save_model() framework API to persist it in the model store. Optionally, you may attach custom labels, metadata, or custom objects like tokenizers to be saved alongside the model. See Save A Trained Model to learn more.
💡 Migration Task
Append the model saving logic below to train.py and run python train.py.
import bentoml

saved_model = bentoml.sklearn.save_model("iris_clf", clf)
print(f"Model saved: {saved_model}")
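If you want to attach the optional labels, metadata, or custom objects mentioned above, the following is a minimal sketch; the label, metadata, and custom object values are purely illustrative, not part of the quickstart:

import bentoml

# Save the classifier with optional labels, metadata, and custom
# objects attached (the values below are hypothetical examples)
saved_model = bentoml.sklearn.save_model(
    "iris_clf",
    clf,
    labels={"owner": "bentoml-team", "stage": "dev"},
    metadata={"accuracy": clf.score(X, y)},
    custom_objects={"feature_names": iris.feature_names},
)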
You can view and manage all saved models via the bentoml models CLI command.
> bentoml models list
Tag Module Size Creation Time Path
iris_clf:zy3dfgxzqkjrlgxi bentoml.sklearn 5.81 KiB 2022-05-19 08:36:52 ~/bentoml/models/iris_clf/zy3dfgxzqkjrlgxi
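Saved models can also be loaded back programmatically for a quick sanity check; a minimal sketch, assuming the iris_clf model saved above:

import bentoml

# Load the latest saved model back as a native scikit-learn object
# and run a prediction on a single iris sample
model = bentoml.sklearn.load_model("iris_clf:latest")
print(model.predict([[5.9, 3.0, 5.1, 1.8]]))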
Define Services#
Next, we will transform the service definition module and break down each section in detail.
💡 Migration Task
Update the service definition module service.py from the BentoML 0.13.1 specification (first listing below) to the 1.0.0 specification (second listing).
import pandas as pd
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):
    @api(input=DataframeInput(), batch=True)
    def predict(self, df: pd.DataFrame):
        return self.artifacts.model.predict(df)
import numpy as np
import pandas as pd
import bentoml
from bentoml.io import NumpyNdarray, PandasDataFrame
iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])
@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def predict(input_series: pd.DataFrame) -> np.ndarray:
    result = iris_clf_runner.predict.run(input_series)
    return result
Environment#
BentoML version 0.13.1 relies on the @env decorator API for defining the environment settings and dependencies of the service. Typical arguments of the environment decorator include Python dependencies (e.g. pip_packages, pip_index_url), Conda dependencies (e.g. conda_channels, conda_dependencies), and Docker options (e.g. setup_sh, docker_base_image).
@env(pip_packages=["scikit-learn", "pandas"])
BentoML version 1.0.0 no longer relies on the environment decorator. Environment settings and service dependencies are defined in the bentofile.yaml file in the project directory. Its contents specify the bentoml build options when building bentos.
💡 Migration Task
Save the contents below to the bentofile.yaml file in the same directory as service.py.
service: "service.py:svc"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  packages:
    - scikit-learn
    - pandas
Artifacts#
BentoML version 0.13.1 provides the @artifacts decorator API for users to specify the trained models required by a BentoService. The specified artifacts are automatically serialized and deserialized when saving and loading a BentoService.
@artifacts([SklearnModelArtifact('model')])
BentoML 1.0.0 leverages a combination of the model store and runner APIs for specifying the required models at runtime. A runner represents a unit of computation that can be executed on a remote Python worker and scaled independently; methods on the model can be invoked by calling the run function on the runner.
iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
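A runner can also be exercised on its own outside a service; a minimal sketch using init_local(), which is intended for development and testing only:

import bentoml

# Instantiate the runner in-process (testing only) and call the
# model's predict method through the runner interface
iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
iris_clf_runner.init_local()
print(iris_clf_runner.predict.run([[5.9, 3.0, 5.1, 1.8]]))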
API#
BentoML version 0.13.1 defines the inference API through the @api decorator. Input and output types can be specified through adapters. The service converts the inference request from HTTP to the desired format specified by the input adapter, in this case, a pandas.DataFrame object.
@api(input=DataframeInput(), batch=True)
def predict(self, df: pd.DataFrame):
    return self.artifacts.model.predict(df)
BentoML version 1.0.0 also provides a similar @svc.api decorator. The inference API is no longer defined within a service class; its association with the service is declared through the @svc.api decorator on the bentoml.Service instance. Input and output specifications are defined by IO descriptor arguments passed to the @svc.api decorator. Similar to the adapters, they describe the expected data types, validate that the input and output conform to the expected format and schema, and convert them from and to the specified native types. In addition, multiple inputs and outputs can be defined with the Multipart IO descriptor, e.g. input=Multipart(image=Image(), metadata=JSON()); see the sketch after the code below.
@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def predict(input_series: pd.DataFrame) -> np.ndarray:
    result = iris_clf_runner.predict.run(input_series)
    return result
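As a sketch of a multi-input API, assuming the svc and iris_clf_runner defined above; the endpoint name and the params payload are hypothetical:

import numpy as np
import pandas as pd
from bentoml.io import JSON, Multipart, NumpyNdarray, PandasDataFrame

# A hypothetical endpoint accepting a DataFrame plus a JSON payload
@svc.api(
    input=Multipart(series=PandasDataFrame(), params=JSON()),
    output=NumpyNdarray(),
)
def predict_multipart(series: pd.DataFrame, params: dict) -> np.ndarray:
    # params is ignored in this sketch; it could carry request options
    return iris_clf_runner.predict.run(series)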
BentoML version 1.0.0 supports defining the inference API as an asynchronous coroutine. Asynchronous APIs are preferred when the processing logic is IO-bound or invokes multiple runners simultaneously, which makes them ideal for fetching features and calling remote APIs.
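A minimal asynchronous variant of the predict API above, assuming the same svc and iris_clf_runner; async_run awaits the runner without blocking the event loop:

import numpy as np
import pandas as pd
from bentoml.io import NumpyNdarray, PandasDataFrame

@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
async def async_predict(input_series: pd.DataFrame) -> np.ndarray:
    # Await the runner's async interface instead of the blocking run()
    return await iris_clf_runner.predict.async_run(input_series)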
Test Services#
To improve development agility, BentoML version 1.0.0 adds the capability to test the service in development before saving. Executing the bentoml serve command brings up an API server for rapid development iterations; development mode is the default. The --reload option makes the development API server reload upon every change to the service module.
> bentoml serve --reload
To bring up the API server and runners in a production-like setting, run with the --production option. In production mode, API servers and runners run in separate processes to maximize server utilization and parallelism.
> bentoml serve --production
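With the server running, the endpoint can be exercised with any HTTP client; a minimal sketch using requests, sending one iris sample as JSON to the predict endpoint:

import requests

# POST a single sample to the predict API on the local server
response = requests.post(
    "http://127.0.0.1:3000/predict",
    headers={"content-type": "application/json"},
    json=[[5.9, 3.0, 5.1, 1.8]],
)
print(response.text)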
Building Bentos#
Next, we will build the service into a bento and save it to the bento store. Building a service into a bento persists the service for distribution. This operation is unique to BentoML version 1.0.0; the comparable operation in version 0.13.1 was saving the service to disk by calling the save() function on the service instance (first listing below).
💡 Migration Task
Run the bentoml build command from the same directory as service.py and bentofile.yaml.
# import the IrisClassifier class defined above
from bento_service import IrisClassifier
# Create a iris classifier service instance
iris_classifier_service = IrisClassifier()
# Pack the newly trained model artifact
from sklearn import svm
from sklearn import datasets
# Load training data
iris = datasets.load_iris()
X, y = iris.data, iris.target
# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)
iris_classifier_service.pack('model', clf)
# Save the prediction service to disk for model serving
saved_path = iris_classifier_service.save()
> bentoml build
Building BentoML service "iris_classifier:6otbsmxzq6lwbgxi" from build context "/home/user/gallery/quickstart"
Packing model "iris_clf:zy3dfgxzqkjrlgxi"
Locking PyPI package versions..
(BentoML ASCII art banner)
Successfully built Bento(tag="iris_classifier:6otbsmxzq6lwbgxi")
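Bentos can also be built programmatically with the Python API; a minimal sketch mirroring the bentofile.yaml above (the CLI with bentofile.yaml remains the primary path):

import bentoml

# Programmatic counterpart of `bentoml build` with the options
# from bentofile.yaml
bento = bentoml.bentos.build(
    "service.py:svc",
    labels={"owner": "bentoml-team", "project": "gallery"},
    include=["*.py"],
    python={"packages": ["scikit-learn", "pandas"]},
)
print(bento.tag)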
You can view and manage all saved bentos via the bentoml list CLI command.
> bentoml list
Tag Size Creation Time Path
iris_classifier:6otbsmxzq6lwbgxi 16.48 KiB 2022-07-01 16:03:44 ~/bentoml/bentos/iris_classifier/6otbsmxzq6lwbgxi
Serve Bentos#
We can serve a saved bento in production mode by running the bentoml serve command with the --production option. The API servers and runners will run in separate processes to maximize server utilization and parallelism.
> bentoml serve iris_classifier:latest --production
2022-07-06T02:02:30-0700 [INFO] [] Starting production BentoServer from "." running on http://0.0.0.0:3000 (Press CTRL+C to quit)
2022-07-06T02:02:31-0700 [INFO] [runner-iris_clf:1] Setting up worker: set CPU thread count to 10
Generate Docker Images#
Similar to version 0.13.1, we can generate Docker images from bentos using the bentoml containerize command in BentoML version 1.0.0. See Containerize Bentos to learn more.
> bentoml containerize iris_classifier:latest
Building docker image for Bento(tag="iris_classifier:6otbsmxzq6lwbgxi")...
Successfully built docker image "iris_classifier:6otbsmxzq6lwbgxi"
You can run the docker image to start the service.
> docker run -p 3000:3000 iris_classifier:6otbsmxzq6lwbgxi
2022-07-01T21:57:47+0000 [INFO] [] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
2022-07-01T21:57:47+0000 [INFO] [] Starting production BentoServer from "/home/bentoml/bento" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
2022-07-01T21:57:48+0000 [INFO] [api_server:1] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
2022-07-01T21:57:48+0000 [INFO] [runner-iris_clf:1] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
2022-07-01T21:57:48+0000 [INFO] [api_server:2] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
2022-07-01T21:57:48+0000 [INFO] [runner-iris_clf:1] Setting up worker: set CPU thread count to 4
2022-07-01T21:57:48+0000 [INFO] [api_server:3] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
2022-07-01T21:57:48+0000 [INFO] [api_server:4] Service loaded from Bento directory: bentoml.Service(tag="iris_classifier:6otbsmxzq6lwbgxi", path="/home/bentoml/bento/")
Deploy Bentos#
BentoML version 0.13.1 supports deployment of bentos to various cloud providers, including Google Cloud Platform, Amazon Web Services, and Microsoft Azure. To better support DevOps workflows, cloud deployment of bentos has been moved to a separate project, 🚀 bentoctl, which focuses on deployment tasks. bentoctl is a CLI tool for deploying your machine learning models to any cloud platform.
Manage Bentos#
BentoML version 0.13.1 relies on Yatai as a bento registry to help teams collaborate and manage bentos. In addition to bento management, the 🦄️ Yatai project has since expanded into a platform for deploying large-scale model serving workloads on Kubernetes. Yatai standardizes BentoML deployment, provides a UI for managing all your ML models and deployments in one place, and enables advanced GitOps and CI/CD workflows.
🎉 Ta-da, you have migrated your project to BentoML 1.0.0. Have more questions? Join the BentoML Slack community.