PyCaret

Users can now use PyCaret with BentoML through the following API: load, save, and load_runner, as follows:

import bentoml
import pandas as pd

from pycaret.datasets import get_data
from pycaret.classification import setup as pycaret_setup
from pycaret.classification import save_model
from pycaret.classification import tune_model
from pycaret.classification import create_model
from pycaret.classification import predict_model
from pycaret.classification import finalize_model

dataset = get_data("credit")
data = dataset.sample(frac=0.95, random_state=786)
data_unseen = dataset.drop(data.index)
data.reset_index(inplace=True, drop=True)
data_unseen.reset_index(inplace=True, drop=True)

pycaret_setup(data=data, target="default", session_id=123, silent=True)
dt = create_model("dt")
tuned_dt = tune_model(dt)
final_dt = finalize_model(tuned_dt)

# `save` a given classifier and retrieve the corresponding tag:
tag = bentoml.pycaret.save("cls", final_dt)

# retrieve metadata with `bentoml.models.get`:
metadata = bentoml.models.get(tag)

# `load` the model back in memory:
model = bentoml.pycaret.load("cls:latest")

# Run a given model under the `Runner` abstraction with `load_runner`:
input_data = pd.read_csv("/path/to/csv")  # read the CSV file into a DataFrame
runner = bentoml.pycaret.load_runner(tag)
runner.run(input_data)

Note

You can find more examples for PyCaret in our gallery repo.

bentoml.pycaret.save(name, model, *, labels=None, custom_objects=None, metadata=None)

Save a model instance to BentoML modelstore.

Parameters
  • name (str) – Name for given model instance. This should pass Python identifier check.

  • model (Union[sklearn.pipeline.Pipeline, xgboost.Booster, lightgbm.basic.Booster]) – Instance of model to be saved

  • labels (Dict[str, str], optional, default to None) – user-defined labels for managing models, e.g. team=nlp, stage=dev

  • custom_objects (Dict[str, Any], optional, default to None) – user-defined additional Python objects to be saved alongside the model, e.g. a tokenizer instance, a preprocessor function, or a model configuration JSON

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.
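The constraints on name and labels listed above can be checked before calling save. The following is a minimal sketch, assuming that the "Python identifier check" corresponds to str.isidentifier(); check_save_args is a hypothetical helper written for illustration and is not part of BentoML.

```python
from typing import Dict, Optional


def check_save_args(name: str, labels: Optional[Dict[str, str]] = None) -> None:
    """Hypothetical pre-flight check; not a BentoML function."""
    # Assumption: the identifier check is equivalent to str.isidentifier().
    if not name.isidentifier():
        raise ValueError(f"{name!r} is not a valid Python identifier")
    # labels is documented as Dict[str, str], so keys and values must be strings.
    for key, value in (labels or {}).items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"label {key!r}={value!r} must map str to str")


# "my_model" passes; a name such as "my-model" would raise ValueError.
check_save_args("my_model", labels={"team": "nlp", "stage": "dev"})
```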

Returns

A tag in the format name:version, where name is the user-defined model name and version is generated by BentoML.

Return type

Tag
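To illustrate the name:version format, here is a small sketch that splits a tag string into its two parts. ParsedTag and split_tag are hypothetical names used only for this illustration; they are not BentoML's Tag class, which may behave differently.

```python
from typing import NamedTuple, Optional


class ParsedTag(NamedTuple):
    """Hypothetical container for the two parts of a tag string."""
    name: str
    version: Optional[str]


def split_tag(tag: str) -> ParsedTag:
    """Split "name:version" at the first colon; version is None if absent."""
    name, sep, version = tag.partition(":")
    if not name:
        raise ValueError(f"tag {tag!r} has no name part")
    return ParsedTag(name, version if sep else None)


parsed = split_tag("cls:latest")
print(parsed.name, parsed.version)
```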

Examples:

import bentoml

from pycaret.classification import *
from pycaret.datasets import get_data

# get data
dataset = get_data('credit')
data = dataset.sample(frac=0.95, random_state=786)

# setup is needed to save
exp_clf101 = setup(data = data, target = 'default', session_id=123)

best_model = compare_models() # get the best model
tuned_best = tune_model(best_model)
final_model = finalize_model(tuned_best)

# NOTE: pycaret setup config will be saved with the model
bentoml.pycaret.save("my_model", final_model)

bentoml.pycaret.load(tag, model_store=BentoMLContainer.model_store)

Load a model from BentoML local modelstore with given name.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.

Returns

A PyCaret model instance.

Return type

Any

Examples:

import bentoml
dt = bentoml.pycaret.load("my_model:latest")

bentoml.pycaret.load_runner(tag, *, name=None)

Runner represents a unit of serving logic that can be scaled horizontally to maximize throughput. bentoml.pycaret.load_runner implements a Runner class that wraps around a PyCaret model, which optimizes it for the BentoML runtime.

Parameters

  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

Returns

A Runner instance for the bentoml.pycaret model.

Return type

Runner

Examples:

import bentoml
from pycaret.datasets import get_data
dataset = get_data('credit')

# pre-process data
data = dataset.sample(frac=0.95, random_state=786)
data_unseen = dataset.drop(data.index)
data.reset_index(inplace=True, drop=True)
data_unseen.reset_index(inplace=True, drop=True)

# initialize runner
# NOTE: loading the model will load the saved pycaret config too
runner = bentoml.pycaret.load_runner(tag="my_model:latest")

prediction = runner.run_batch(input_data=data_unseen)
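
The hold-out split used in the examples on this page (sample 95% of the rows, then drop them to obtain the unseen remainder) can be checked in isolation. The sketch below uses a small synthetic DataFrame as a stand-in for the credit dataset; the column names are assumptions made for illustration only.

```python
import pandas as pd

# Synthetic stand-in for the "credit" dataset (hypothetical columns).
dataset = pd.DataFrame(
    {"feature": range(100), "default": [i % 2 for i in range(100)]}
)

# Sample 95% of the rows for training; the fixed seed makes it reproducible.
data = dataset.sample(frac=0.95, random_state=786)

# Rows whose index was not sampled form the unseen hold-out set.
data_unseen = dataset.drop(data.index)

# Reset both indices so each frame is numbered from zero again.
data = data.reset_index(drop=True)
data_unseen = data_unseen.reset_index(drop=True)

print(len(data), len(data_unseen))
```

Because drop removes exactly the sampled index labels, the two frames are disjoint and together cover the whole dataset.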