BentoService API

BentoService

class bentoml.BentoService

BentoService is the base component for building prediction services using BentoML.

BentoService provides an abstraction for describing the model artifacts and environment dependencies required by a prediction service, and lets users create inference APIs that define the inference logic and how the underlying model is served. Each BentoService can contain multiple models and serve multiple inference APIs.

Usage example:

>>>  from bentoml import BentoService, env, api, artifacts
>>>  from bentoml.adapters import DataframeInput
>>>  from bentoml.artifact import SklearnModelArtifact
>>>
>>>  @artifacts([SklearnModelArtifact('clf')])
>>>  @env(pip_dependencies=["scikit-learn"])
>>>  class MyMLService(BentoService):
>>>
>>>     @api(input=DataframeInput())
>>>     def predict(self, df):
>>>         return self.artifacts.clf.predict(df)
>>>
>>>  if __name__ == "__main__":
>>>     bento_service = MyMLService()
>>>     bento_service.pack('clf', trained_classifier_model)
>>>     bento_service.save_to_dir('/bentoml_bundles')
name()

Returns

BentoService name

versioneer()

Function used to generate a new version string when saving a new BentoService bundle. Users can also override this method to customize the version string format.
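
For example, a service class may override versioneer to stamp bundles with a date-based version (a minimal sketch; the class name and version format here are illustrative, not BentoML conventions):

>>> from datetime import datetime
>>>
>>> from bentoml import BentoService
>>>
>>> class DateVersionedService(BentoService):
>>>
>>>     def versioneer(self):
>>>         # illustrative date-based version string, e.g. "20200101.120000"
>>>         return datetime.utcnow().strftime("%Y%m%d.%H%M%S")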

set_version(version_str=None)

Set the version of this BentoService instance. Once the version is set explicitly via set_version, the self.versioneer method will no longer be invoked when saving this BentoService.

inference_apis

Return a list of user-defined API functions

Returns

List of Inference API objects

Return type

list(InferenceAPI)

artifacts

Returns the ArtifactCollection instance declared on this BentoService class

Returns

A dictionary-like collection of packed artifacts, mapping artifact name to BentoServiceArtifact instance

Return type

ArtifactCollection

pack(name, *args, **kwargs)

The BentoService#pack method packs a trained model instance into this BentoService instance, making it ready for BentoService#save.

Parameters
  • name – name of the declared model artifact

  • args – args passed to the target model artifact being packed

  • kwargs – kwargs passed to the target model artifact being packed

Returns

this BentoService instance
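
Because pack returns the service instance, pack and save calls can be chained (a sketch assuming the MyMLService class and trained_classifier_model from the usage example above):

>>> saved_path = MyMLService().pack('clf', trained_classifier_model).save()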

pack(*args, **kwargs)

Deprecated: Legacy BentoService#pack class method, no longer supported

save(base_path=None, version=None)

Save and register this BentoService via BentoML’s built-in model management system. By default, BentoML keeps track of all the SavedBundle’s files and metadata in the local file system under the $BENTOML_HOME (~/bentoml) directory. Users can also configure BentoML to save their BentoService to a shared database or cloud object storage such as AWS S3.

Parameters
  • base_path – optional - override repository base path

  • version – optional - save with version override

Returns

saved_path: file path to where the BentoService is saved
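
For example (a sketch assuming the MyMLService class and trained_classifier_model from the usage example above; the exact path layout depends on the local BentoML repository configuration):

>>> svc = MyMLService()
>>> svc.pack('clf', trained_classifier_model)
>>> saved_path = svc.save()
>>> # saved_path can later be passed to bentoml.load to restore the service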

save_to_dir(path, version=None)

Save this BentoService along with all its artifacts, source code and dependencies to the target path, which is assumed to exist and be empty. If the target path is not empty, this call may overwrite existing files in it.

Parameters
  • path (str) – Destination where the bento service will be saved

  • version – optional - save with version override
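
For example (a sketch; the target directory is illustrative and is assumed to exist and be empty):

>>> svc = MyMLService()
>>> svc.pack('clf', trained_classifier_model)
>>> svc.save_to_dir('/tmp/my_ml_service')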

api

bentoml.api(*args, input: bentoml.adapters.base_input.BaseInputAdapter = None, output: bentoml.adapters.base_output.BaseOutputAdapter = None, api_name=None, api_doc=None, mb_max_batch_size=2000, mb_max_latency=10000, **kwargs)

A decorator exposed as bentoml.api for defining an Inference API in a BentoService class.

Parameters
  • input – InputAdapter instance of the inference API

  • output – OutputAdapter instance of the inference API

  • api_name – API name, defaults to the user-defined callback function’s name

  • api_doc – user-facing documentation of the inference API; defaults to the user-defined callback function’s docstring

  • mb_max_batch_size – The maximum size of a request batch accepted by this inference API. This parameter governs the throughput/latency trade-off and avoids batches so large that they exceed a resource constraint (e.g. the GPU memory needed to hold the entire batch’s data). Default: 2000.

  • mb_max_latency – The latency goal of this inference API in milliseconds. Default: 10000.

Example usage:

>>> from bentoml import BentoService, api
>>> from bentoml.adapters import JsonInput, DataframeInput
>>>
>>> class FraudDetectionAndIdentityService(BentoService):
>>>
>>>     @api(input=JsonInput())
>>>     def fraud_detect(self, json_list):
>>>         # user-defined callback function that processes inference requests
>>>         pass
>>>
>>>     @api(input=DataframeInput(input_json_orient='records'))
>>>     def identity(self, df):
>>>         # user-defined callback function that processes inference requests
>>>         pass
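
The mb_max_batch_size and mb_max_latency parameters described above can be tuned per inference API (a sketch; the class name and the chosen limits are illustrative):

>>> from bentoml import BentoService, api
>>> from bentoml.adapters import JsonInput
>>>
>>> class LowLatencyService(BentoService):
>>>
>>>     @api(input=JsonInput(), mb_max_batch_size=128, mb_max_latency=300)
>>>     def predict(self, json_list):
>>>         # each call may receive a micro-batch of parsed JSON inputs;
>>>         # batches are capped at 128 items and ~300ms of queueing delay
>>>         return json_list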

Deprecated syntax before version 0.8.0:

>>> from bentoml import BentoService, api
>>> from bentoml.handlers import JsonHandler, DataframeHandler  # deprecated
>>>
>>> class FraudDetectionAndIdentityService(BentoService):
>>>
>>>     @api(JsonHandler)  # deprecated
>>>     def fraud_detect(self, parsed_json_list):
>>>         # do something
>>>         pass
>>>
>>>     @api(DataframeHandler, input_json_orient='records')  # deprecated
>>>     def identity(self, df):
>>>         # do something
>>>         pass

env

bentoml.env(pip_dependencies: List[str] = None, auto_pip_dependencies: bool = False, requirements_txt_file: str = None, conda_channels: List[str] = None, conda_dependencies: List[str] = None, setup_sh: str = None, docker_base_image: str = None)

Define environment and dependencies required for the BentoService being created

Parameters
  • pip_dependencies – list of pip dependencies required, specified by package name alone or pinned to a version as {package_name}=={package_version}

  • auto_pip_dependencies – (Beta) whether to automatically find all the required pip dependencies and pin their versions

  • requirements_txt_file – pip dependencies in the form of a requirements.txt file, this can be a relative path to the requirements.txt file or the content of the file

  • conda_channels – list of extra conda channels to be used

  • conda_dependencies – list of conda dependencies required

  • setup_sh – user-defined setup bash script; it is executed at docker image build time

  • docker_base_image – used for customizing the docker container image built from the BentoML saved bundle. The base image must either have both bash and conda installed, or have bash, pip and python installed, in which case the user must ensure the python version matches the one used in the BentoService bundle
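
For example (a sketch; the package names, conda channel and setup script path are illustrative):

>>> from bentoml import BentoService, env
>>>
>>> @env(
>>>     pip_dependencies=['scikit-learn', 'pandas==1.0.5'],
>>>     conda_channels=['conda-forge'],
>>>     conda_dependencies=['libgcc'],
>>>     setup_sh='./setup.sh',
>>> )
>>> class MyMLService(BentoService):
>>>     pass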

artifacts

bentoml.artifacts(artifacts: List[bentoml.artifact.artifact.BentoServiceArtifact])

Define artifacts required to be bundled with a BentoService

Parameters

artifacts (list(bentoml.artifact.BentoServiceArtifact)) – A list of artifacts required by this BentoService
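
For example, a service bundling two models (a sketch; the artifact names are arbitrary labels used later with pack, and trained_classifier and fitted_scaler are assumed to be trained scikit-learn objects):

>>> from bentoml import BentoService, artifacts
>>> from bentoml.artifact import SklearnModelArtifact
>>>
>>> @artifacts([
>>>     SklearnModelArtifact('classifier'),
>>>     SklearnModelArtifact('scaler'),
>>> ])
>>> class TwoModelService(BentoService):
>>>     pass
>>>
>>> svc = TwoModelService()
>>> svc.pack('classifier', trained_classifier)
>>> svc.pack('scaler', fitted_scaler)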

ver

bentoml.ver(major, minor)

Decorator for specifying the version of a custom BentoService.

Parameters
  • major (int) – Major version number for Bento Service

  • minor (int) – Minor version number for Bento Service

BentoML uses semantic versioning for BentoService distribution:

  • MAJOR is incremented when you make breaking API changes

  • MINOR is incremented when you add new functionality without breaking the existing API or functionality

  • PATCH is incremented when you make backwards-compatible bug fixes

‘Patch’ is provided (or auto-generated) when calling BentoService#save, while ‘Major’ and ‘Minor’ are defined with the @ver decorator.

>>>  from bentoml import BentoService, artifacts, ver
>>>  from bentoml.artifact import PickleArtifact
>>>
>>>  @ver(major=1, minor=4)
>>>  @artifacts([PickleArtifact('model')])
>>>  class MyMLService(BentoService):
>>>     pass
>>>
>>>  svc = MyMLService()
>>>  svc.pack("model", trained_classifier)
>>>  svc.set_version("2019-08.iteration20")
>>>  svc.save()
>>>  # The final produced BentoService bundle will have version:
>>>  # "1.4.2019-08.iteration20"

save

bentoml.save(bento_service, base_path=None, version=None)

Save and register the given BentoService via BentoML’s built-in model management system. By default, BentoML keeps track of all the SavedBundle’s files and metadata in the local file system under the $BENTOML_HOME (~/bentoml) directory. Users can also configure BentoML to save their BentoService to a shared database or cloud object storage such as AWS S3.

Parameters
  • bento_service – target BentoService instance to be saved

  • base_path – optional - override repository base path

  • version – optional - save with version override

Returns

saved_path: file path to where the BentoService is saved

save_to_dir

bentoml.save_to_dir(bento_service, path, version=None, silent=False)

Save the given BentoService along with all its artifacts, source code and dependencies to the target path, which is assumed to exist and be empty. If the target path is not empty, this call may overwrite existing files in it.

Parameters
  • bento_service (bentoml.service.BentoService) – a BentoService instance to be saved

  • path (str) – Destination where the bento service will be saved

  • version (str) – Override the service version with the given version string

  • silent (boolean) – whether to hide the log message showing the target save path

load

bentoml.load(bundle_path)

Load a BentoService from a local file path or an S3 path

Parameters

bundle_path (str) – The path that contains a saved BentoService bundle; both local file paths and S3 paths are supported

Returns

a loaded BentoService instance

Return type

bentoml.service.BentoService
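
For example (a sketch; saved_path comes from a prior save call, predict is an inference API defined on the saved service, and input_df is an assumed test dataframe):

>>> import bentoml
>>>
>>> svc = bentoml.load(saved_path)
>>> prediction = svc.predict(input_df)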