API IO Descriptors#

IO Descriptors are used for describing the input and output spec of a Service API. Below is the list of built-in IO Descriptors, along with the APIs for defining a custom IO Descriptor.

NumPy ndarray#

class bentoml.io.NumpyNdarray(*args, **kwargs)[source]#

NumpyNdarray defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type numpy.ndarray as specified in your API function signature.

Sample implementation of a sklearn service:

# sklearn_svc.py
import bentoml
from bentoml.io import NumpyNdarray
import bentoml.sklearn

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_arr):
    res = runner.run(input_arr)
    return res

Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

% curl -X POST -H "Content-Type: application/json" --data '[[5,4,3,2]]' http://0.0.0.0:3000/predict

[1]%
Parameters
  • dtype (numpy.typings.DTypeLike, optional, default to None) – Data type users wish to convert their inputs/outputs to. Refer to NumPy's array dtypes documentation for more information.

  • enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.

  • shape (Tuple[int, ...], optional, default to None) –

    Given shape that an array will be converted to. For example:

    import numpy as np
    from bentoml.io import NumpyNdarray
    
    @svc.api(input=NumpyNdarray(shape=(3,1), enforce_shape=True), output=NumpyNdarray())
    def predict(input_array: np.ndarray) -> np.ndarray:
        # input_array is guaranteed to have shape (3, 1)
        result = runner.run(input_array)
        return result
    

  • enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.
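The interaction between dtype/shape and their enforce_* flags can be sketched conceptually in plain Python (no BentoML or NumPy; the helper coerce below is hypothetical): without enforcement the input is converted to the declared type, with enforcement a mismatched input is rejected instead.

```python
# Conceptual sketch of the dtype policy (not BentoML's implementation):
# enforce=False converts values to the declared type, enforce=True rejects
# values that do not already match it.
def coerce(values, target=float, enforce=False):
    if enforce and not all(isinstance(v, target) for v in values):
        raise TypeError("input dtype does not match the declared dtype")
    return [target(v) for v in values]

print(coerce([1, 2, 3]))                 # converted to floats: [1.0, 2.0, 3.0]
print(coerce([1.5, 2.5], enforce=True))  # already floats, passes through
```

shape and enforce_shape follow the same pattern, with the check applied to the array's dimensions rather than its element type.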

Returns

IO Descriptor that represents an np.ndarray.

Return type

IODescriptor

classmethod NumpyNdarray.from_sample(sample_input, enforce_dtype=True, enforce_shape=True)[source]#

Create a NumpyNdarray IO Descriptor from given inputs.

Parameters
  • sample_input (np.ndarray) – Given sample np.ndarray data

  • enforce_dtype (bool, optional, default to True) – Enforce a certain data type. dtype must be specified at function signature. If you don’t want to enforce a specific dtype then change enforce_dtype=False.

  • enforce_shape (bool, optional, default to True) – Enforce a certain shape. shape must be specified at function signature. If you don’t want to enforce a specific shape then change enforce_shape=False.

Returns

NumpyNdarray IODescriptor inferred from the given user inputs.

Return type

NumpyNdarray

Example:

import numpy as np
from bentoml.io import NumpyNdarray
arr = np.array([[1, 2, 3]])
inp = NumpyNdarray.from_sample(arr)

...
@svc.api(input=inp, output=NumpyNdarray())
def predict(input_array: np.ndarray) -> np.ndarray: ...

Tabular Data with Pandas#

Note

The pandas package is required to use the bentoml.io.PandasDataFrame or bentoml.io.PandasSeries. Install it with pip install pandas and add it to your bentofile.yaml’s Python packages list.

class bentoml.io.PandasDataFrame(*args, **kwargs)[source]#

PandasDataFrame defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type pd.DataFrame as specified in your API function signature.

Sample implementation of a sklearn service:

# sklearn_svc.py
import bentoml
import pandas as pd
import numpy as np
from bentoml.io import PandasDataFrame
import bentoml.sklearn

input_spec = PandasDataFrame.from_sample(pd.DataFrame(np.array([[5,4,3,2]])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=PandasDataFrame())
def predict(input_arr):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return pd.DataFrame(res)

Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:
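For the default records orient, a request payload can be sketched with the standard library alone (the column names "0"–"3" mirror the unnamed columns of the sample DataFrame above; a real payload depends on your schema):

```python
import json

# One row of four features, encoded in the "records" orient:
# a list of {column -> value} mappings. Column names "0"-"3" match the
# unnamed columns of the sample DataFrame above.
payload = json.dumps([{"0": 5, "1": 4, "2": 3, "3": 2}])
print(payload)
# A client could then POST it, e.g.:
#   curl -X POST -H "Content-Type: application/json" \
#        --data '[{"0": 5, "1": 4, "2": 3, "3": 2}]' http://0.0.0.0:3000/predict
```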

Parameters
  • orient (str, optional, default to records) –

    Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:

    • split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}

    • records - List[Any]: [{column -> value}, …, {column -> value}]

    • index - Dict[str, Any]: {idx -> {column -> value}}

    • columns - Dict[str, Any]: {column -> {index -> value}}

    • values - List[Any]: Values arrays

  • columns (List[str], optional, default to None) – List of column names that users wish to update.

  • apply_column_names (bool, optional, default to False) – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.

  • dtype (Union[bool, Dict[str, Any]], optional, default to None) – Data Type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. Else if it is a dictionary of column to dtype, then applies those to incoming dataframes. If False, then don’t infer dtypes at all (only applies to the data). This is not applicable when orient='table'.

  • enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.

  • shape (Tuple[int, ...], optional, default to None) –

    Optional shape check that users can specify for their incoming HTTP requests. We will only check the number of columns you specified for your given shape:

    import pandas as pd
    from bentoml.io import PandasDataFrame
    
    df = pd.DataFrame([[1, 2, 3]])  # shape (1,3)
    inp = PandasDataFrame.from_sample(df)
    
    @svc.api(input=PandasDataFrame(
      shape=(51, 10), enforce_shape=True
    ), output=PandasDataFrame())
    def infer(input_df: pd.DataFrame) -> pd.DataFrame: ...
    # if input_df has shape (40, 9), an error will be raised
    

  • enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.

  • default_format (str, optional, default to json) –

    The default serialization format to use if the request does not specify a headers['content-type']. It is also the serialization format used for the response. Possible values are:

    • json - JSON text format (inferred from content-type “application/json”)

    • parquet - Parquet binary format (inferred from content-type “application/octet-stream”)

    • csv - CSV text format (inferred from content-type “text/csv”)
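The orient values above can be illustrated with a one-row, two-column table. The plain dictionaries below mirror the JSON layouts that pandas DataFrame.to_json(orient=...) emits (a stdlib sketch, not generated by pandas):

```python
import json

# The same table (one row, columns "a" and "b", values 1 and 2)
# rendered in each orient value.
table_as = {
    "split":   {"columns": ["a", "b"], "index": [0], "data": [[1, 2]]},
    "records": [{"a": 1, "b": 2}],
    "index":   {"0": {"a": 1, "b": 2}},
    "columns": {"a": {"0": 1}, "b": {"0": 2}},
    "values":  [[1, 2]],
}
for orient, layout in table_as.items():
    print(orient, json.dumps(layout))
```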

Returns

IO Descriptor that represents a pd.DataFrame.

Return type

IODescriptor

classmethod PandasDataFrame.from_sample(sample_input, orient='records', apply_column_names=True, enforce_shape=True, enforce_dtype=False, default_format='json')[source]#

Create a PandasDataFrame IO Descriptor from given inputs.

Parameters
  • sample_input (pd.DataFrame) – Given sample pd.DataFrame data

  • orient (str, optional, default to records) –

    Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:

    • split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}

    • records - List[Any]: [{column -> value}, …, {column -> value}]

    • index - Dict[str, Any]: {idx -> {column -> value}}

    • columns - Dict[str, Any]: {column -> {index -> value}}

    • values - List[Any]: Values arrays

  • apply_column_names (bool, optional, default to True) – Update incoming DataFrame columns. columns is inferred from the sample input. If you don’t want to enforce specific column names then change apply_column_names=False.

  • enforce_dtype (bool, optional, default to False) – Enforce a certain data type. The dtype is inferred from the sample input. If you want to enforce the inferred dtype then change enforce_dtype=True.

  • enforce_shape (bool, optional, default to True) – Enforce a certain shape. The shape is inferred from the sample input. If you don’t want to enforce a specific shape then change enforce_shape=False.

  • default_format (str, optional, default to json) –

    The default serialization format to use if the request does not specify a headers['content-type']. It is also the serialization format used for the response. Possible values are:

    • json - JSON text format (inferred from content-type “application/json”)

    • parquet - Parquet binary format (inferred from content-type “application/octet-stream”)

    • csv - CSV text format (inferred from content-type “text/csv”)
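As an example of the csv format above, the same one-row sample can be serialized client-side with the standard library and sent with a "Content-Type: text/csv" header instead of JSON (the header row of column names is hypothetical):

```python
import csv
import io

# Encode one row of features as CSV; a client would POST this body with
# the header "Content-Type: text/csv".
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["0", "1", "2", "3"])  # column names
writer.writerow([5, 4, 3, 2])          # feature values
body = buf.getvalue()
print(body)
```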

Returns

PandasDataFrame IODescriptor inferred from the given user inputs.

Return type

bentoml._internal.io_descriptors.PandasDataFrame

Example:

import pandas as pd
from bentoml.io import PandasDataFrame
arr = [[1,2,3]]
inp = PandasDataFrame.from_sample(pd.DataFrame(arr))

...
@svc.api(input=inp, output=PandasDataFrame())
def predict(inputs: pd.DataFrame) -> pd.DataFrame: ...

class bentoml.io.PandasSeries(*args, **kwargs)[source]#

PandasSeries defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type pd.Series as specified in your API function signature.

Sample implementation of a sklearn service:

# sklearn_svc.py
import bentoml
import pandas as pd
import numpy as np
from bentoml.io import PandasSeries
import bentoml.sklearn

input_spec = PandasSeries.from_sample(pd.Series(np.array([5,4,3,2])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=PandasSeries())
def predict(input_arr):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return pd.Series(res)

Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Parameters
  • orient (str, optional, default to records) –

    Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:

    • split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}

    • records - List[Any]: [{column -> value}, …, {column -> value}]

    • index - Dict[str, Any]: {idx -> {column -> value}}

    • columns - Dict[str, Any]: {column -> {index -> value}}

    • values - List[Any]: Values arrays

  • columns (List[str], optional, default to None) – List of column names that users wish to update.

  • apply_column_names (bool, optional, default to False) – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.

  • dtype (Union[bool, Dict[str, Any]], optional, default to None) – Data Type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. Else if it is a dictionary of column to dtype, then applies those to incoming dataframes. If False, then don’t infer dtypes at all (only applies to the data). This is not applicable when orient='table'.

  • enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.

  • shape (Tuple[int, …], optional, default to None) –

    Optional shape check that users can specify for their incoming HTTP requests. We will only check the number of columns you specified for your given shape. Example usage:

    import pandas as pd
    from bentoml.io import PandasSeries
    
    sr = pd.Series([1, 2, 3])  # shape (3,)
    inp = PandasSeries.from_sample(sr)
    
    ...
    @svc.api(input=inp, output=PandasSeries())
    # the shape inferred from the sample above is valid
    def predict(input_series: pd.Series) -> pd.Series:
        result = runner.run(input_series)
        return result
    
    @svc.api(input=PandasSeries(shape=(51,10), enforce_shape=True), output=PandasSeries())
    def infer(input_series: pd.Series) -> pd.Series: ...
    # if input_series has shape (40,9), an error will be raised
    

  • enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.

Returns

IO Descriptor that represents a pd.Series.

Return type

IODescriptor

Structured Data with JSON#

Note

For common structured data, we recommend using the JSON descriptor, as it provides the most flexibility. Users can also define a schema for the JSON data via a Pydantic model and use it for data validation.
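The kind of checking a Pydantic model automates can be sketched with the standard library alone (the schema below, a "sepal" list of numbers plus a "target" string, is made up for illustration; the helper validate is hypothetical):

```python
import json

# Stdlib-only sketch of schema validation for a JSON payload.
# A Pydantic model declares the same constraints declaratively.
def validate(payload: str) -> dict:
    data = json.loads(payload)
    if not isinstance(data.get("target"), str):
        raise ValueError("'target' must be a string")
    sepal = data.get("sepal")
    if not (isinstance(sepal, list) and all(isinstance(x, (int, float)) for x in sepal)):
        raise ValueError("'sepal' must be a list of numbers")
    return data

print(validate('{"sepal": [5.0, 4.0], "target": "setosa"}'))
```

With a real Pydantic model passed as pydantic_model, invalid requests are rejected before your API function runs.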

class bentoml.io.JSON(*args, **kwargs)[source]#

JSON defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from a JSON representation as specified in your API function signature.

Sample implementation of a sklearn service:

# sklearn_svc.py
import pandas as pd
import numpy as np
from bentoml.io import PandasDataFrame, JSON
import bentoml.sklearn

input_spec = PandasDataFrame.from_sample(pd.DataFrame(np.array([[5,4,3,2]])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=JSON())
def predict(input_arr: pd.DataFrame):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return {"res": pd.DataFrame(res).to_json(orient='records')}

Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Parameters
  • pydantic_model (pydantic.BaseModel, optional, default to None) – Pydantic model schema.

  • validate_json (bool, optional, default to True) – If True, then use Pydantic model specified above to validate given JSON.

  • json_encoder (Type[json.JSONEncoder], default to bentoml._internal.io_descriptor.json.DefaultJsonEncoder) – JSON encoder class.

Returns

IO Descriptor in JSON format.

Return type

IODescriptor

Texts#

bentoml.io.Text is commonly used for NLP applications:

class bentoml.io.Text(*args, **kwargs)[source]#

Text defines API specification for the inputs/outputs of a Service. Text represents strings for all incoming requests/outgoing responses as specified in your API function signature.

Sample implementation of a GPT2 service:

# gpt2_svc.py
import bentoml
from bentoml.io import Text
import bentoml.transformers

# If you don't have a gpt2 model previously saved under BentoML modelstore
# tag = bentoml.transformers.import_from_huggingface_hub('gpt2')
runner = bentoml.transformers.load_runner('gpt2',tasks='text-generation')

svc = bentoml.Service("gpt2-generation", runners=[runner])

@svc.api(input=Text(), output=Text())
def predict(input_arr):
    res = runner.run_batch(input_arr)
    return res[0]['generated_text']

Users can then serve this service with bentoml serve:

% bentoml serve ./gpt2_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "gpt2-generation" defined in "gpt2_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Note

Text is not designed to take any args or kwargs during initialization.

Returns

IO Descriptor that represents a string.

Return type

IODescriptor

Images#

Note

The Pillow package is required to use the bentoml.io.Image. Install it with pip install Pillow and add it to your bentofile.yaml’s Python packages list.

class bentoml.io.Image(*args, **kwargs)[source]#

Image defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from images as specified in your API function signature.

Sample implementation of a transformers service for object detection:

#obj_detc.py
import bentoml
from bentoml.io import Image, JSON
import bentoml.transformers

tag='google_vit_large_patch16_224:latest'
runner = bentoml.transformers.load_runner(tag, tasks='image-classification',
                                          device=-1,
                                          feature_extractor="google/vit-large-patch16-224")

svc = bentoml.Service("vit-object-detection", runners=[runner])

@svc.api(input=Image(), output=JSON())
def predict(input_img):
    res = runner.run_batch(input_img)
    return res

Users can then serve this service with bentoml serve:

% bentoml serve ./obj_detc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "vit-object-detection" defined in "obj_detc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Parameters
  • pilmode (str, optional, default to RGB) – Color mode for PIL.

  • mime_type (str, optional, default to image/jpeg) – Return MIME type of the starlette.responses.Response; only available when used as an output descriptor.

Returns

IO Descriptor that represents an image as either a PIL.Image.Image or a np.ndarray.

Return type

IODescriptor

Files#

class bentoml.io.File(kind='binaryio', mime_type=None)[source]#

File defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from file-like objects as specified in your API function signature.

Sample implementation of a service that accepts and returns a file:

# vit_svc.py
import bentoml
from bentoml.io import File

svc = bentoml.Service("vit-object-detection")

@svc.api(input=File(), output=File())
def predict(input_pdf):
    return input_pdf

Users can then serve this service with bentoml serve:

% bentoml serve ./vit_svc.py:svc --auto-reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "vit-object-detection" defined in "vit_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Parameters

mime_type (str, optional, default to None) – Return MIME type of the starlette.responses.Response; only available when used as an output descriptor.

Returns

IO Descriptor that represents file-like objects.

Return type

IODescriptor

Multipart Payloads#

Note

io.Multipart makes it possible to compose a multipart payload from multiple other IO Descriptor instances. For example, you may create a Multipart input that contains an image file and additional metadata in JSON.

class bentoml.io.Multipart(*args, **kwargs)[source]#

Multipart defines API specification for the inputs/outputs of a Service, where inputs/outputs of a Service can receive/send a multipart request/responses as specified in your API function signature.

Sample implementation of a sklearn service:

# sklearn_svc.py
import bentoml
from bentoml.io import NumpyNdarray, Multipart, JSON
import bentoml.sklearn

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])
input_spec = Multipart(arr=NumpyNdarray(), annotations=JSON())
output_spec = Multipart(output=NumpyNdarray(), result=JSON())

@svc.api(input=input_spec, output=output_spec)
def predict(arr, annotations):
    res = runner.run(arr)
    return {"output":res, "result":annotations}

Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --reload

(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000

Users can then send requests to the newly started service with any client:

Parameters

inputs (Dict[str, IODescriptor]) –

Dictionary whose keys name the parts of a Multipart request/response and whose values are IODescriptors supported by BentoML. Currently, Multipart supports Image, NumpyNdarray, PandasDataFrame, PandasSeries, Text, and File.

Make sure to match the input params in an API function to the keys defined under Multipart:

+----------------------------------------------------------------+
|                                                                |
|   +--------------------------------------------------------+   |
|   |                                                        |   |
|   |    Multipart(arr=NumpyNdarray(), annotations=JSON())   |   |
|   |                                                        |   |
|   +----------------+-----------------------+---------------+   |
|                    |                       |                   |
|                    |                       |                   |
|                    |                       |                   |
|                    +----+        +---------+                   |
|                         |        |                             |
|         +---------------v--------v---------+                   |
|         |  def predict(arr, annotations):  |                   |
|         +----------------------------------+                   |
|                                                                |
+----------------------------------------------------------------+
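A client-side request body for the two parts expected by predict(arr, annotations) above can be sketched with the standard library alone (real clients would typically use an HTTP library's multipart support instead of building the body by hand):

```python
import json
import uuid

# Build a multipart/form-data body with two parts named to match the
# API function's parameters: "arr" and "annotations".
boundary = uuid.uuid4().hex
parts = {
    "arr": json.dumps([[5, 4, 3, 2]]),
    "annotations": json.dumps({"source": "manual"}),
}
lines = []
for name, value in parts.items():
    lines.append(f"--{boundary}")
    lines.append(f'Content-Disposition: form-data; name="{name}"')
    lines.append("")
    lines.append(value)
lines.append(f"--{boundary}--")
body = "\r\n".join(lines) + "\r\n"
content_type = f"multipart/form-data; boundary={boundary}"
print(content_type)
```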

Returns

IO Descriptor that represents a Multipart request/response.

Return type

IODescriptor

Custom IODescriptor#

Note

The IODescriptor base class can be extended to support custom data formats for your APIs, if the built-in descriptors do not fit your needs.

class bentoml.io.IODescriptor(*args, **kwargs)[source]#

IODescriptor describes the input/output data format of an InferenceAPI defined in a bentoml.Service. This is an abstract base class for extending new HTTP endpoint IO descriptor types in BentoServer.
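The general pattern can be sketched without BentoML (the class and method names below are invented for illustration and are NOT BentoML's actual interface): a descriptor pairs a decoder that turns request bytes into a Python object with an encoder that does the reverse. To extend BentoML itself, subclass bentoml.io.IODescriptor and consult its source for the required abstract methods.

```python
from abc import ABC, abstractmethod

# Pattern sketch only: method names are hypothetical, not BentoML's API.
class WireCodec(ABC):
    """Pairs a request decoder with a response encoder."""

    @abstractmethod
    def decode(self, raw: bytes) -> object: ...

    @abstractmethod
    def encode(self, obj: object) -> bytes: ...

class Utf8Text(WireCodec):
    """A Text-like codec: bytes <-> str via UTF-8."""

    def decode(self, raw: bytes) -> str:
        return raw.decode("utf-8")

    def encode(self, obj: object) -> bytes:
        return str(obj).encode("utf-8")

codec = Utf8Text()
print(codec.decode(b"hello"))  # hello
```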