MXNet Gluon

Users can now use MXNet Gluon with BentoML through the following APIs: load, save, and load_runner, as follows:

import mxnet
import mxnet.gluon as gluon
import bentoml


def train_gluon_classifier() -> gluon.nn.HybridSequential:
    net = mxnet.gluon.nn.HybridSequential()
    net.hybridize()
    net.forward(mxnet.nd.array([0]))
    return net

model = train_gluon_classifier()

# `save` a model to BentoML modelstore:
tag = bentoml.gluon.save("gluon_block", model)

# retrieve metadata with `bentoml.models.get`:
metadata = bentoml.models.get(tag)

# `load` the model back in memory:
loaded: gluon.Block = bentoml.gluon.load(tag)

Note

You can find more examples for MXNet Gluon in our gallery repo.

bentoml.gluon.save(name, model, *, labels=None, custom_objects=None, metadata=None)

Save a model instance to BentoML modelstore.

Parameters
  • name (str) – Name for the given model instance. This should pass the Python identifier check.

  • model (mxnet.gluon.HybridBlock) – Instance of gluon.HybridBlock model to be saved.

  • labels (Dict[str, str], optional, default to None) – user-defined labels for managing models, e.g. team=nlp, stage=dev

  • custom_objects (Dict[str, Any], optional, default to None) – user-defined additional python objects to be saved alongside the model, e.g. a tokenizer instance, preprocessor function, model configuration json

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.

Returns

A tag with the format name:version, where name is the user-defined model name and version is a version generated by BentoML.

Return type

Tag

Examples:

import mxnet
import mxnet.gluon as gluon
import bentoml


def train_gluon_classifier() -> gluon.nn.HybridSequential:
    net = mxnet.gluon.nn.HybridSequential()
    net.hybridize()
    net.forward(mxnet.nd.array([0]))
    return net

model = train_gluon_classifier()

tag = bentoml.gluon.save("gluon_block", model)

bentoml.gluon.load(tag, mxnet_ctx=None, model_store=BentoMLContainer.model_store)

Load a model from BentoML local modelstore with given tag.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • mxnet_ctx (mxnet.context.Context, optional, default to cpu) – Device context on which the model should be loaded, e.g. mxnet.cpu() or mxnet.gpu(0). If None, the model is loaded on CPU.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.

Returns

an instance of mxnet.gluon.Block from BentoML modelstore.

Return type

mxnet.gluon.Block

Examples:

import bentoml

model = bentoml.gluon.load(tag)

bentoml.gluon.load_runner(tag, predict_fn_name='__call__', *, name=None)

Runner represents a unit of serving logic that can be scaled horizontally to maximize throughput. bentoml.gluon.load_runner implements a Runner class that wraps around a mxnet.gluon.Block model, optimizing it for the BentoML runtime.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • predict_fn_name (str, default to __call__) – Name of the inference method to call on the model. Defaults to __call__.

Returns

A Runner instance for the target bentoml.gluon model

Return type

Runner

Examples:

import mxnet
import bentoml

runner = bentoml.gluon.load_runner("gluon_block")

data = mxnet.nd.array([0.0])  # example input batch
runner.run_batch(data)