ONNX#

About this page

This is an API reference for ONNX in BentoML. Please refer to the ONNX framework guide (/frameworks/onnx) for more information about how to use ONNX in BentoML.

bentoml.onnx.save_model(name: str, model: onnx.ModelProto, *, signatures: dict[str, ModelSignatureDict | ModelSignature] | None = None, labels: dict[str, str] | None = None, custom_objects: dict[str, t.Any] | None = None, external_modules: t.List[ModuleType] | None = None, metadata: dict[str, t.Any] | None = None) bentoml.Model[source]#

Save an ONNX model instance to the BentoML model store.

Parameters
  • name (str) – The name to give to the model in the BentoML store. This must be a valid Tag name.

  • model (ModelProto) – The ONNX model to be saved.

  • signatures (dict[str, ModelSignatureDict | ModelSignature], optional) – Signatures of predict methods to be used. If not provided, the signatures default to {"run": {"batchable": False}}. Because BentoML runs inference through onnxruntime.InferenceSession, the only allowed method name is "run". See ModelSignature for more details.

  • labels (dict[str, str], optional) – A default set of management labels to be associated with the model. An example is {"training-set": "data-1"}.

  • custom_objects (dict[str, Any], optional) –

    Custom objects to be saved with the model. An example is {"my-normalizer": normalizer}.

    Custom objects are currently serialized with cloudpickle, but this implementation is subject to change.

  • external_modules (List[ModuleType], optional, default to None) – User-defined additional Python modules to be saved alongside the model or custom objects, e.g. a tokenizer module, preprocessor module, or model configuration module.

  • metadata (dict[str, Any], optional) –

    Metadata to be associated with the model. An example is {"bias": 4}.

    Metadata is intended for display in a model management UI and therefore must be a default Python type, such as str or int.
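As a sketch, the signatures dict below enables adaptive batching for the run method (the only allowed method name); the batch_dim value shown is an illustrative assumption, not a required setting:

```python
# A signatures dict for save_model. "run" is the only valid method name,
# because inference goes through onnxruntime.InferenceSession.run.
signatures = {
    "run": {
        "batchable": True,  # allow adaptive batching for this method
        "batch_dim": 0,     # batch along the first axis (illustrative)
    }
}
```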

Returns

A BentoML Model containing the saved ONNX model instance.

Return type

Model

Example:

import os

import bentoml
import onnx
import torch
import torch.nn as nn

class ExtendedModel(nn.Module):
    def __init__(self, D_in, H, D_out):
        # In the constructor we instantiate two nn.Linear modules and assign
        # them as member variables.
        super().__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    def forward(self, x, bias):
        # In the forward function we accept a Tensor of input data and an
        # optional bias.
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred + bias


N, D_in, H, D_out = 64, 1000, 100, 1
x = torch.randn(N, D_in)
model = ExtendedModel(D_in, H, D_out)

input_names = ["x", "bias"]
output_names = ["output1"]

tmpdir = "/tmp/model"
os.makedirs(tmpdir, exist_ok=True)
model_path = os.path.join(tmpdir, "test_torch.onnx")
torch.onnx.export(
    model,
    (x, torch.Tensor([1.0])),
    model_path,
    input_names=input_names,
    output_names=output_names,
)

# save_model expects an onnx.ModelProto, so load the exported file first
onnx_model = onnx.load(model_path)
bento_model = bentoml.onnx.save_model(
    "onnx_model",
    onnx_model,
    signatures={"run": {"batchable": True}},
)
bentoml.onnx.load_model(bento_model: str | Tag | bentoml.Model, *, providers: ProvidersType | None = None, session_options: ort.SessionOptions | None = None) ort.InferenceSession[source]#

Load the ONNX model with the given tag from the local BentoML model store.

Parameters
  • bento_model (str | Tag | Model) – Either the tag of the model to get from the store, or a BentoML Model instance to load the model from.

  • providers (List[Union[str, Tuple[str, Dict[str, Any]]]], optional, default to None) – Execution providers to use, as provided by the user. By default BentoML will use ["CPUExecutionProvider"] when loading a model.

  • session_options (onnxruntime.SessionOptions, optional, default to None) – Custom onnxruntime.SessionOptions to configure the session for a specific use case. Defaults to None.
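A providers value can mix bare provider names with (name, options) pairs, matching what onnxruntime.InferenceSession accepts. A minimal sketch (the CUDA options shown are illustrative, and provider availability depends on the installed onnxruntime build):

```python
# Providers are tried in order; CUDAExecutionProvider requires an
# onnxruntime-gpu build, while CPUExecutionProvider is always available.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0}),  # provider with options
    "CPUExecutionProvider",                       # bare provider name
]
```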

Returns

An ONNX Runtime inference session created from the ONNX model loaded from the model store.

Return type

onnxruntime.InferenceSession

Example:

import bentoml
sess = bentoml.onnx.load_model("my_onnx_model")
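The returned session is a plain onnxruntime.InferenceSession. As a sketch, the feed below assumes a model exported with input names "x" and "bias" as in the save_model example above; the shapes are illustrative:

```python
import numpy as np

# Build the input feed keyed by the ONNX graph's input names.
feed = {
    "x": np.random.randn(64, 1000).astype(np.float32),
    "bias": np.array([1.0], dtype=np.float32),
}
# outputs = sess.run(["output1"], feed)  # sess from bentoml.onnx.load_model(...)
```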
bentoml.onnx.get(tag_like: str | bentoml._internal.tag.Tag) Model[source]#

Get the BentoML model with the given tag.

Parameters

tag_like – The tag of the model to retrieve from the model store.

Returns

A BentoML Model with the matching tag.

Return type

Model

Example:

import bentoml
# target model must be from the BentoML model store
model = bentoml.onnx.get("onnx_resnet50")
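tag_like accepts either a bare model name or a name:version string; a bare name resolves to the latest version in the store. A small sketch of the tag format (the version string here is illustrative):

```python
# A BentoML tag is "name" or "name:version"; "name" alone means the
# latest version of that model in the store.
tag_like = "onnx_resnet50:v1"  # illustrative version
name, _, version = tag_like.partition(":")
```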