PaddlePaddle

Users can now use PaddlePaddle with BentoML through the following APIs: save, load, and load_runner, as shown below:

import random
import numpy as np

import bentoml
import paddle
import paddle.nn as nn
from paddle.static import InputSpec

IN_FEATURES = 13
OUT_FEATURES = 1

# training hyperparameters for this example
SEED = 42
BATCH_SIZE = 8
EPOCH_NUM = 5


def set_random_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    paddle.seed(seed)
    paddle.framework.random._manual_program_seed(seed)


class LinearModel(nn.Layer):
    def __init__(self):
        super(LinearModel, self).__init__()
        self.fc = nn.Linear(IN_FEATURES, OUT_FEATURES)

    @paddle.jit.to_static(input_spec=[InputSpec(shape=[IN_FEATURES], dtype="float32")])
    def forward(self, x):
        return self.fc(x)


def train_paddle_model() -> "LinearModel":
    set_random_seed(SEED)
    model = LinearModel()
    loss = nn.MSELoss()
    adam = paddle.optimizer.Adam(parameters=model.parameters())

    train_data = paddle.text.datasets.UCIHousing(mode="train")

    loader = paddle.io.DataLoader(
        train_data, batch_size=BATCH_SIZE, shuffle=True, drop_last=True, num_workers=2
    )

    model.train()
    for _ in range(EPOCH_NUM):
        for feature, label in loader():
            out = model(feature)
            loss_fn = loss(out, label)
            loss_fn.backward()
            adam.step()
            adam.clear_grad()
    return model


model = train_paddle_model()
# `save` a pretrained model to BentoML modelstore:
tag = bentoml.paddle.save(
    "linear_model", model, input_spec=InputSpec(shape=[IN_FEATURES], dtype="float32")
)

# retrieve metadata with `bentoml.models.get`:
metadata = bentoml.models.get(tag)

# `load` the model back in memory:
loaded = bentoml.paddle.load(tag)

# `load` with custom config:
conf = paddle.inference.Config(
    metadata.path + "/saved_model.pdmodel", metadata.path + "/saved_model.pdiparams"
)
conf.enable_memory_optim()
conf.set_cpu_math_library_num_threads(1)
paddle.set_device("cpu")
loaded_with_customs: nn.Layer = bentoml.paddle.load(tag, config=conf)

# Load a given model under `Runner` abstraction with `load_runner`
runner = bentoml.paddle.load_runner(tag)

# `pd_dataframe` is a pandas DataFrame holding input features
runner.run_batch(pd_dataframe.to_numpy().astype(np.float32))
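
Such a runner can then be composed into a Service for serving over HTTP. The following is a minimal sketch, assuming the bentoml.Service and bentoml.io.NumpyNdarray APIs from the same BentoML release; the service name "linear_regression" and the predict endpoint are illustrative:

import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

# assumes the "linear_model" saved above; ":latest" resolves the newest version
runner = bentoml.paddle.load_runner("linear_model:latest")

svc = bentoml.Service("linear_regression", runners=[runner])

@svc.api(input=NumpyNdarray(dtype="float32"), output=NumpyNdarray())
def predict(input_arr: np.ndarray) -> np.ndarray:
    # delegate inference to the runner
    return runner.run_batch(input_arr)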

We also offer import_from_paddlehub, which enables users to import models from PaddleHub and use them with BentoML:

import os

import bentoml

test_text = ["这家餐厅很好吃", "这部电影真的很差劲"]  # "This restaurant is delicious", "This movie is really bad"
tag = bentoml.paddle.import_from_paddlehub("senta_bilstm")
runner = bentoml.paddle.load_runner(tag, infer_api_callback="sentiment_classify")
results = runner.run_batch(None, texts=test_text, use_gpu=False, batch_size=1)

assert results[0]["positive_probs"] == 0.9407
assert results[1]["positive_probs"] == 0.02

# import from a local PaddleHub module; refer to
# https://paddlehub.readthedocs.io/en/release-v2.1/index.html
current_dir = os.path.dirname(os.path.realpath(__file__))
senta_path = os.path.join(current_dir, "senta_test")

# save given paddlehub module to BentoML modelstore
tag = bentoml.paddle.import_from_paddlehub(senta_path)

# load back into memory:
module = bentoml.paddle.load(tag)
assert module.sentiment_classify(texts=test_text)[0]["sentiment"] == "negative"

Note

You can find more examples for PaddlePaddle in our gallery repo.

bentoml.paddle.save(name, model, *, input_spec=None, labels=None, custom_objects=None, metadata=None, model_store=<simple_di.providers.SingletonFactory object>)

Save a model instance to BentoML modelstore.

Parameters
  • name (str) – Name for the given model instance. This should pass the Python identifier check.

  • model (Union[paddle.nn.Layer, paddle.inference.Predictor, StaticFunction]) – Instance of paddle.nn.Layer, decorated functions, or paddle.inference.Predictor to be saved.

  • input_spec (Union[List[InputSpec], Tuple[InputSpec, ...]], optional, default to None) – Describes the input of the saved model’s forward method, which can be described by InputSpec or an example Tensor. Non-tensor type arguments, such as int, float, string, or lists/dicts of them, are also supported. If None, all input variables of the original Layer’s forward method become the inputs of the saved model. Generally, using this is NOT RECOMMENDED unless you know what you are doing.

  • labels (Dict[str, str], optional, default to None) – user-defined labels for managing models, e.g. team=nlp, stage=dev

  • custom_objects (Dict[str, Any], optional, default to None) – user-defined additional python objects to be saved alongside the model, e.g. a tokenizer instance, preprocessor function, or model configuration json (see the sketch after this parameter list).

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.
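
As a sketch of the optional labels, custom_objects, and metadata arguments (the normalize helper below is hypothetical, and model is assumed to be a trained paddle.nn.Layer as in the examples on this page):

import bentoml

# hypothetical preprocessor, saved alongside the model via custom_objects
def normalize(arr):
    return (arr - arr.mean()) / arr.std()

tag = bentoml.paddle.save(
    "linear_model",
    model,
    labels={"team": "nlp", "stage": "dev"},
    custom_objects={"preprocessor": normalize},
    metadata={"dataset": "UCIHousing"},
)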

Returns

A tag with the format name:version, where name is the user-defined model name and version is a version generated by BentoML.

Return type

Tag
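
The returned Tag can be passed around directly, or via its name:version string form; a small sketch (tag.name and tag.version are assumed attributes of bentoml.Tag):

tag = bentoml.paddle.save("linear_model", model)
print(tag.name, tag.version)  # "linear_model" and a version generated by BentoML
loaded = bentoml.paddle.load(str(tag))  # the "name:version" string form also works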

Examples:

import random
import numpy as np

import bentoml
import paddle
import paddle.nn as nn
from paddle.static import InputSpec

IN_FEATURES = 13
OUT_FEATURES = 1

# training hyperparameters for this example
SEED = 42
BATCH_SIZE = 8
EPOCH_NUM = 5


def set_random_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    paddle.seed(seed)
    paddle.framework.random._manual_program_seed(seed)


class LinearModel(nn.Layer):
    def __init__(self):
        super(LinearModel, self).__init__()
        self.fc = nn.Linear(IN_FEATURES, OUT_FEATURES)

    @paddle.jit.to_static(input_spec=[InputSpec(shape=[IN_FEATURES], dtype="float32")])
    def forward(self, x):
        return self.fc(x)


def train_paddle_model() -> "LinearModel":
    set_random_seed(SEED)
    model = LinearModel()
    loss = nn.MSELoss()
    adam = paddle.optimizer.Adam(parameters=model.parameters())

    train_data = paddle.text.datasets.UCIHousing(mode="train")

    loader = paddle.io.DataLoader(
        train_data, batch_size=BATCH_SIZE, shuffle=True, drop_last=True, num_workers=2
    )

    model.train()
    for _ in range(EPOCH_NUM):
        for feature, label in loader():
            out = model(feature)
            loss_fn = loss(out, label)
            loss_fn.backward()
            adam.step()
            adam.clear_grad()
    return model

model = train_paddle_model()
# `save` a pretrained model to BentoML modelstore:
tag = bentoml.paddle.save(
    "linear_model", model, input_spec=InputSpec(shape=[IN_FEATURES], dtype="float32")
)

bentoml.paddle.load(tag, config=None, model_store=<simple_di.providers.SingletonFactory object>, **kwargs)

Load a model from the BentoML local modelstore with the given tag.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • config (paddle.inference.Config, optional, default to None) – Model config to be used to create a Predictor instance.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.

Returns

An instance of one of paddle.inference.Predictor, paddlehub.module.module.RunModule, or paddlehub.module.module.ModuleV1 from the BentoML modelstore.

Return type

Union[paddle.inference.Predictor, module.RunModule, module.ModuleV1]

Examples:

import bentoml

model = bentoml.paddle.load(tag)
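
To control how the underlying Predictor is created, a custom paddle.inference.Config can be passed via config. A sketch, assuming a CUDA-enabled paddlepaddle-gpu installation and the saved_model.* file names produced by bentoml.paddle.save in the walkthrough above:

import bentoml
import paddle

metadata = bentoml.models.get(tag)
conf = paddle.inference.Config(
    metadata.path + "/saved_model.pdmodel", metadata.path + "/saved_model.pdiparams"
)
conf.enable_use_gpu(100, 0)  # 100 MB initial GPU memory pool on device 0
model = bentoml.paddle.load(tag, config=conf)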

bentoml.paddle.load_runner(tag, *, infer_api_callback='run', gpu_mem_pool_mb=0, name=None, config=None)

Runner represents a unit of serving logic that can be scaled horizontally to maximize throughput. bentoml.paddle.load_runner() implements a Runner class that wraps around either paddle.inference.Predictor or paddlehub.module.Module, optimizing it for the BentoML runtime.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • infer_api_callback (str, optional, default to run) – Inference API function that will be used during run_batch. If tag points to a hub.Module, then infer_api_callback should be changed to the corresponding API endpoint offered by the given module.

  • config (paddle.inference.Config, optional, default to None) – Config for inference. If None is specified, then use BentoML default.

  • gpu_mem_pool_mb (int, optional, default to 0) – Amount of GPU memory (in MB) to allocate for the memory pool. By default no GPU memory pool is allocated.

Returns

Runner instance for the given bentoml.paddle model

Return type

Runner

Examples:

import bentoml
import numpy as np

runner = bentoml.paddle.load_runner(
    tag,
    gpu_mem_pool_mb=1024,  # optionally reserve a GPU memory pool (see Parameters above)
)

# `pd_dataframe` is a pandas DataFrame holding input features
_ = runner.run_batch(pd_dataframe.to_numpy().astype(np.float32))

bentoml.paddle.import_from_paddlehub(model_name, name=None, version=None, branch=None, source=None, update=False, ignore_env_mismatch=False, hub_module_home=None, keep_download_from_hub=False, *, labels=None, custom_objects=None, metadata=None, model_store=<simple_di.providers.SingletonFactory object>)

Import a model from PaddleHub and save it under the BentoML modelstore.

Parameters
  • model_name (str) – Name of a PaddleHub model. This can be either a path to a Hub module or a model name, for both v2 and v1.

  • name (str, optional, default to None) – Name for the given model instance. This should pass the Python identifier check. If not specified, BentoML will save the model under model_name. (See the sketch after this parameter list for combining these options.)

  • version (str, optional, default to None) – The version limit of the module, which only takes effect when name is specified. When the local Module does not meet the specified version conditions, PaddleHub will re-request the server to download the appropriate Module. Defaults to None, which means the local Module will be used. If the Module does not exist, PaddleHub will download the latest version available from the server according to the usage environment.

  • source (str, optional, default to None) – URL of a git repository. If this parameter is specified, PaddleHub will no longer download the specified Module from the default server, but will look for it in the specified repository.

  • update (bool, optional, default to False) – Whether to update the locally cached git repository, only takes effect when the source is specified.

  • branch (str, optional, default to None) – The branch of the specified git repository.

  • ignore_env_mismatch (bool, optional, default to False) – Whether to ignore the environment mismatch when installing the Module.

  • hub_module_home (str, optional, default to None) – Location for your PaddleHub cache. If None, the default PaddleHub cache under $HOME/.paddlehub is used.

  • keep_download_from_hub (bool, optional, default to False) – Whether to re-download pretrained model from hub.

  • labels (Dict[str, str], optional, default to None) – user-defined labels for managing models, e.g. team=nlp, stage=dev

  • custom_objects (Dict[str, Any], optional, default to None) – user-defined additional python objects to be saved alongside the model, e.g. a tokenizer instance, preprocessor function, or model configuration json

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.
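
A sketch combining a few of these options (the version string below is hypothetical; check PaddleHub for the versions a given module actually publishes):

import bentoml

tag = bentoml.paddle.import_from_paddlehub(
    "senta_bilstm",
    name="my_senta",   # store under a custom name instead of "senta_bilstm"
    version="1.2.0",   # hypothetical version constraint
    labels={"team": "nlp"},
)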

Returns

A tag with the format name:version, where name is the user-defined model name and version is a version generated by BentoML.

Return type

Tag

Examples:

import bentoml

tag = bentoml.paddle.import_from_paddlehub("senta_bilstm")