Keras

Keras models (with a TensorFlow v1 or v2 backend) can be used with BentoML through the following API: load, save, and load_runner, as shown below:

import bentoml
import tensorflow as tf
import tensorflow.keras as keras

def KerasSequentialModel() -> keras.models.Model:
    net = keras.models.Sequential(
        (
            keras.layers.Dense(
                units=1,
                input_shape=(5,),
                use_bias=False,
                kernel_initializer=keras.initializers.Ones(),
            ),
        )
    )

    opt = keras.optimizers.Adam(0.002, 0.5)
    net.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])
    return net

model = KerasSequentialModel()

# `save` a given model and retrieve the corresponding tag:
tag = bentoml.keras.save("keras_model", model, store_as_json_and_weights=True)

# retrieve metadata with `bentoml.models.get`:
metadata = bentoml.models.get(tag)

# retrieve the session used to save the Keras model with `bentoml.keras.get_session()`:
session = bentoml.keras.get_session()
session.run(tf.global_variables_initializer())
with session.as_default():
    # `load` the model back into memory:
    loaded = bentoml.keras.load("keras_model:latest")

Note

You can find more examples for Keras in our gallery repo.

bentoml.keras.save(name, model, *, save_format='tf', store_as_json_and_weights=False, labels=None, custom_objects=None, metadata=None)

Save a model instance to BentoML modelstore.

Parameters
  • name (str) – Name for the given model instance. It must be a valid Python identifier.

  • model (tensorflow.keras.Model) – Instance of the Keras model to be saved to BentoML modelstore.

  • save_format (str, optional, default to tf) – Whether to store the Keras model or weights in the tf format or the older h5 format.

  • custom_objects (Dict[str, Any], optional, default to None) – Dictionary of Keras custom objects, if specified.

  • store_as_json_and_weights (bool, optional, default to False) – Whether to store Keras model as JSON and weights.

  • metadata (Dict[str, Any], optional, default to None) – Custom metadata for given model.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.
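The constraint on name can be checked before calling save. The sketch below assumes that the "Python identifier check" mentioned above corresponds to Python's built-in str.isidentifier(); the helper is_valid_model_name is hypothetical and is not part of BentoML.

```python
def is_valid_model_name(name: str) -> bool:
    """Hypothetical pre-check: assumes the rule matches str.isidentifier()."""
    return name.isidentifier()


# Underscores are allowed in identifiers; hyphens and leading digits are not.
print(is_valid_model_name("keras_model"))  # True
print(is_valid_model_name("keras-model"))  # False
print(is_valid_model_name("1model"))       # False
```

Note that str.isidentifier() also accepts Python keywords such as "class", so the actual check in BentoML may be stricter or looser than this sketch.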

Returns

A tag in the format name:version, where name is the user-defined model name and version is generated by BentoML.

Return type

Tag
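To illustrate the name:version format, the following sketch splits a tag string into its two parts. The helper split_tag is hypothetical and is not BentoML's Tag class; treating a missing version as "latest" is an assumption based on the examples on this page, which use both "keras_model" and "keras_model:latest".

```python
from typing import Tuple


def split_tag(tag: str) -> Tuple[str, str]:
    """Hypothetical helper: split 'name:version' into (name, version)."""
    name, sep, version = tag.partition(":")
    if not name:
        raise ValueError(f"tag {tag!r} has no name part")
    # Assumption: a tag without an explicit version refers to the latest one.
    if not sep or not version:
        version = "latest"
    return (name, version)


print(split_tag("keras_model:latest"))  # ('keras_model', 'latest')
print(split_tag("keras_model"))         # ('keras_model', 'latest')
```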

Examples:

import bentoml
import tensorflow as tf
import tensorflow.keras as keras


def custom_activation(x):
    return tf.nn.tanh(x) ** 2


class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = tf.Variable(units, name="units")

    def call(self, inputs, training=False):
        if training:
            return inputs * self.units
        else:
            return inputs

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units.numpy()})
        return config


def KerasSequentialModel() -> keras.models.Model:
    net = keras.models.Sequential(
        (
            keras.layers.Dense(
                units=1,
                input_shape=(5,),
                use_bias=False,
                kernel_initializer=keras.initializers.Ones(),
            ),
        )
    )

    opt = keras.optimizers.Adam(0.002, 0.5)
    net.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])
    return net

model = KerasSequentialModel()

# `save` a given model and retrieve the corresponding tag:
tag = bentoml.keras.save("keras_model", model)

# `save` a given model with custom objects definition:
custom_objects = {
    "CustomLayer": CustomLayer,
    "custom_activation": custom_activation,
}
custom_tag = bentoml.keras.save("custom_obj_keras", model, custom_objects=custom_objects)

bentoml.keras.load(tag, model_store=<simple_di.providers.SingletonFactory object>)

Load a model from the BentoML local modelstore by its tag.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • model_store (ModelStore, default to BentoMLContainer.model_store) – BentoML modelstore, provided by DI Container.

Returns

An instance of the user's keras.Model loaded from the BentoML modelstore.

Return type

keras.Model

Examples:

import bentoml
import tensorflow as tf

# retrieve the session used to save the Keras model with `bentoml.keras.get_session()`:
session = bentoml.keras.get_session()
session.run(tf.global_variables_initializer())
with session.as_default():
    # `load` the model back in memory:
    loaded = bentoml.keras.load("keras_model")

bentoml.keras.load_runner(tag, *, predict_fn_name='predict', device_id='CPU:0', partial_kwargs=None, name=None)

A Runner represents a unit of serving logic that can be scaled horizontally to maximize throughput. bentoml.keras.load_runner implements a Runner class that wraps a Keras model, with TensorFlow as the backend, and optimizes it for the BentoML runtime.

Parameters
  • tag (Union[str, Tag]) – Tag of a saved model in BentoML local modelstore.

  • predict_fn_name (str, optional, default to predict) – Inference function to be used.

  • partial_kwargs (Dict[str, Any], optional, default to None) – Dictionary of predict() kwargs that can be shared across different models.

  • device_id (str, optional, default to the first CPU) – Optional device to put the given model on. Refers to logical devices in the TensorFlow documentation.
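The roles of predict_fn_name and partial_kwargs can be pictured with plain Python. The sketch below is an analogy only, not BentoML's implementation: it assumes the inference function is looked up on the model by name and that partial_kwargs are pre-bound to it. FakeModel and resolve_predict_fn are hypothetical names used so the example runs without TensorFlow.

```python
from functools import partial
from typing import Any, Callable, Dict, Optional


class FakeModel:
    """Stand-in for a Keras model, so the sketch runs without TensorFlow."""

    def predict(self, inputs, scale=1):
        return [x * scale for x in inputs]


def resolve_predict_fn(
    model: Any,
    predict_fn_name: str = "predict",
    partial_kwargs: Optional[Dict[str, Any]] = None,
) -> Callable:
    # Look up the inference function by name, then pre-bind the shared kwargs.
    fn = getattr(model, predict_fn_name)
    return partial(fn, **(partial_kwargs or {}))


predict = resolve_predict_fn(FakeModel(), "predict", {"scale": 2})
print(predict([1, 2, 3]))  # [2, 4, 6]
```

Binding the kwargs once means every later call uses the same settings without repeating them at each call site.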

Returns

A Runner instance for the bentoml.keras model.

Return type

Runner

Examples:

import bentoml

# load the saved model into a BentoML runner (`tag` is the value returned by `bentoml.keras.save`):
runner = bentoml.keras.load_runner(tag)
with bentoml.keras.get_session().as_default():
    runner.run_batch([1,2,3])

bentoml.keras.get_session()

Return the TF1 session used by bentoml.keras.

Warning

This function is intended only for use with TensorFlow v1.

Example:

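import bentoml
import tensorflow as tf

session = bentoml.keras.get_session()
# Initialize variables in the graph/model
session.run(tf.global_variables_initializer())
with session.as_default():
    # `tag` is the value returned by an earlier `bentoml.keras.save` call
    loaded = bentoml.keras.load(tag)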