About this page

This is an API reference for PyTorch in BentoML. Please refer to the PyTorch guide for more information about how to use PyTorch in BentoML.


You can find more examples for PyTorch in our BentoML/examples directory.

bentoml.pytorch.save_model(name: str, model: torch.nn.Module, *, signatures: ModelSignaturesType | None = None, labels: t.Dict[str, str] | None = None, custom_objects: t.Dict[str, t.Any] | None = None, external_modules: t.List[ModuleType] | None = None, metadata: t.Dict[str, t.Any] | None = None) -> bentoml.Model

Save a model instance to the BentoML model store.

Parameters:

  • name (str) – Name for the given model instance. It must be a valid Python identifier.

  • model (torch.nn.Module) – Instance of model to be saved

  • signatures (ModelSignaturesType, optional, default to None) – A dictionary of method names and their corresponding signatures.

  • labels (Dict[str, str], optional, default to None) – user-defined labels for managing models, e.g. team=nlp, stage=dev

  • custom_objects (Dict[str, Any], optional, default to None) – user-defined additional Python objects to be saved alongside the model, e.g. a tokenizer instance, preprocessor function, model configuration JSON

  • external_modules (List[ModuleType], optional, default to None) – user-defined additional python modules to be saved alongside the model or custom objects, e.g. a tokenizer module, preprocessor module, model configuration module

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.
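
As a rough illustration of how the optional parameters above fit together, the sketch below collects them in a dictionary and forwards them to save_model. The signature entry, the label, custom object and metadata values, and the helper name save_with_options are all assumptions made for this example. bentoml is imported inside the helper only so that the dictionary can be inspected without BentoML installed.

```python
from typing import Any, Dict

# Optional keyword arguments for bentoml.pytorch.save_model.
# All values here are illustrative assumptions, not defaults.
save_kwargs: Dict[str, Any] = {
    # Mark the model's __call__ as batchable along dimension 0.
    "signatures": {"__call__": {"batchable": True, "batch_dim": 0}},
    "labels": {"team": "nlp", "stage": "dev"},
    "custom_objects": {"id_to_word": {0: "hello", 1: "world"}},
    "metadata": {"accuracy": 0.92},
}


def save_with_options(model):
    """Hypothetical helper: save `model` as "ngrams" with the options above."""
    import bentoml  # requires BentoML and PyTorch to be installed

    return bentoml.pytorch.save_model("ngrams", model, **save_kwargs)
```

Keeping the options in one dictionary makes it easy to reuse the same labels and metadata across several saved models.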


Returns:

The saved model. Its tag has the format name:version, where name is the user-defined model name and version is generated by BentoML.

Return type:

bentoml.Model

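import torch
import torch.nn as nn
import torch.nn.functional as F
import bentoml

# Placeholder values, assumed here so the example runs; use your own vocabulary and sizes.
vocab = ["the", "quick", "brown", "fox"]
EMBEDDING_DIM = 10
CONTEXT_SIZE = 2

class NGramLanguageModeler(nn.Module):

    def __init__(self, vocab_size, embedding_dim, context_size):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.linear1 = nn.Linear(context_size * embedding_dim, 128)
        self.linear2 = nn.Linear(128, vocab_size)

    def forward(self, inputs):
        # Flatten the context-word embeddings into a single row vector.
        embeds = self.embeddings(inputs).view((1, -1))
        out = F.relu(self.linear1(embeds))
        out = self.linear2(out)
        # Log-probabilities over the vocabulary.
        log_probs = F.log_softmax(out, dim=1)
        return log_probs

# save_model returns a bentoml.Model; its tag identifies the saved model.
bento_model = bentoml.pytorch.save_model("ngrams", NGramLanguageModeler(len(vocab), EMBEDDING_DIM, CONTEXT_SIZE))
tag = bento_model.tag
# example tag: ngrams:20201012_DE43A2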
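
To make the name:version tag format concrete, here is a minimal sketch that splits a tag string into its two parts. split_tag is a hypothetical helper based on plain string handling; it is not part of BentoML and does not reproduce BentoML's own tag validation.

```python
from typing import Optional, Tuple


def split_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split a "name:version" string; the version is None when it is omitted."""
    name, sep, version = tag.partition(":")
    return name, (version if sep else None)


print(split_tag("ngrams:20201012_DE43A2"))  # ('ngrams', '20201012_DE43A2')
print(split_tag("ngrams"))                   # ('ngrams', None)
```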

Integration with Torch Hub and BentoML:

import torch
import bentoml

resnet50 = torch.hub.load("pytorch/vision", "resnet50", pretrained=True)
# trained a custom resnet50

bento_model = bentoml.pytorch.save_model("resnet50", resnet50)

bentoml.pytorch.load_model(bentoml_model: str | bentoml._internal.tag.Tag | bentoml._internal.models.model.Model, device_id: Optional[str] = 'cpu') -> torch.nn.Module

Load a model from the BentoML model store, given its tag or a BentoML Model.

Parameters:

  • bentoml_model (Union[str, Tag, Model]) – Tag of a saved model in the local BentoML model store, or a BentoML Model instance.

  • device_id (str, optional, default to cpu) – Device to put the loaded model on, e.g. cpu or cuda:0. Refer to device attributes.


Returns:

An instance of torch.nn.Module loaded from the BentoML model store.

Return type:

torch.nn.Module

import bentoml
model = bentoml.pytorch.load_model('lit_classifier:latest', device_id="cuda:0")
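
Since device_id is a plain string, it can be chosen at runtime rather than hard-coded. The sketch below shows one possible way to do that. pick_device_id is a hypothetical helper introduced for this example, and it takes the CUDA availability as an argument so that it runs without PyTorch installed.

```python
def pick_device_id(cuda_available: bool, index: int = 0) -> str:
    """Return a device string suitable for the device_id parameter."""
    return f"cuda:{index}" if cuda_available else "cpu"


print(pick_device_id(True))   # cuda:0
print(pick_device_id(False))  # cpu

# With PyTorch and BentoML installed, the result could be used like this:
#   device_id = pick_device_id(torch.cuda.is_available())
#   model = bentoml.pytorch.load_model("lit_classifier:latest", device_id=device_id)
```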

bentoml.pytorch.get(tag_like: str | bentoml._internal.tag.Tag) -> Model