API IO Descriptors#
IO Descriptors describe the input and output specification of a Service API. Here's a list of the built-in IO Descriptors, along with the APIs for building a custom IO Descriptor.
NumPy ndarray#
Note
The numpy package is required to use bentoml.io.NumpyNdarray. Install it with pip install numpy and add it to your bentofile.yaml under either the Python or Conda packages list. Refer to Build Options.
- class bentoml.io.NumpyNdarray(dtype: str | ext.NpDTypeLike | None = None, enforce_dtype: bool = False, shape: tuple[int, ...] | None = None, enforce_shape: bool = False)[source]#
NumpyNdarray defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, numpy.ndarray as specified in your API function signature.
A sample service implementation:
service.py
    from __future__ import annotations

    from typing import TYPE_CHECKING, Any

    import bentoml
    from bentoml.io import NumpyNdarray

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    runner = bentoml.sklearn.get("sklearn_model_clf").to_runner()

    svc = bentoml.Service("iris-classifier", runners=[runner])

    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def predict(input_arr: NDArray[Any]) -> NDArray[Any]:
        return runner.run(input_arr)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "Content-Type: application/json" \
        --data '[[5,4,3,2]]' http://0.0.0.0:3000/predict

    # [1]%
request.py
    import requests

    # The body is a JSON list of lists, matching the cURL call above.
    requests.post(
        "http://0.0.0.0:3000/predict",
        headers={"content-type": "application/json"},
        data='[[5,4,3,2]]',
    ).text
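Where the requests package is not available, the same call can be prepared with the standard library. The following is a minimal sketch, not part of BentoML: the helper name build_predict_request is hypothetical, and the URL assumes the sample service is running locally on port 3000. It only builds the request so the payload can be inspected without a running server.

```python
import json
import urllib.request


def build_predict_request(rows, url="http://0.0.0.0:3000/predict"):
    """Build (but do not send) a JSON POST request for a NumpyNdarray endpoint."""
    # A 2-D array is sent as a JSON list of lists, one inner list per row.
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request([[5, 4, 3, 2]])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (requires the service to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the body with json.dumps rather than a hand-written string avoids quoting mistakes when the array comes from program data.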
- Parameters:
dtype – Data type users wish to convert their inputs/outputs to. Refer to arrays dtypes for more information.
enforce_dtype – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.
shape –
Given shape that an array will be converted to. For example:
service.py
    import numpy as np
    from bentoml.io import NumpyNdarray

    @svc.api(input=NumpyNdarray(shape=(2,2), enforce_shape=False), output=NumpyNdarray())
    async def predict(input_array: np.ndarray) -> np.ndarray:
        # input_array will be reshaped to (2,2)
        result = await runner.async_run(input_array)
        return result
When enforce_shape=True is provided, BentoML will raise an exception if the input array received does not match the shape provided.
If shape is specified, then both bentoml.io.NumpyNdarray.from_http_request() and bentoml.io.NumpyNdarray.from_proto() will reshape the input array before sending it to the API function.
enforce_shape – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.
- Returns:
IO Descriptor that represents a np.ndarray.
- Return type:
IODescriptor
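The difference between reshaping and enforcing a shape can be modelled without NumPy. The sketch below is an illustration of the documented behaviour on nested Python lists, not BentoML's implementation; the helper names (infer_shape, flatten, reshape, apply_shape) are hypothetical. Raising an error when a reshape is impossible is an assumption of this sketch, since the parameter description does not say what happens in that case.

```python
from math import prod


def infer_shape(data):
    """Return the shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def flatten(data):
    """Flatten a nested list into a flat list of scalars."""
    if not isinstance(data, list):
        return [data]
    return [x for item in data for x in flatten(item)]


def reshape(flat, shape):
    """Rebuild a nested list of the given shape from a flat list."""
    if len(shape) == 1:
        return list(flat)
    step = prod(shape[1:])
    return [reshape(flat[i * step:(i + 1) * step], shape[1:]) for i in range(shape[0])]


def apply_shape(data, shape, enforce_shape=False):
    """Reject a mismatch when enforcing; otherwise try to reshape."""
    actual = infer_shape(data)
    if enforce_shape:
        if actual != shape:
            raise ValueError(f"expected shape {shape}, got {actual}")
        return data
    flat = flatten(data)
    # Assumption of this sketch: an impossible reshape is reported as an error.
    if len(flat) != prod(shape):
        raise ValueError(f"cannot reshape {len(flat)} elements into {shape}")
    return reshape(flat, shape)


print(apply_shape([1, 2, 3, 4], (2, 2)))
```

With enforce_shape=False a flat four-element input is accepted and rearranged into two rows; with enforce_shape=True the same input is rejected because its shape (4,) differs from (2, 2).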
- classmethod NumpyNdarray.from_sample(sample: ext.NpNDArray | t.Sequence[t.Any]) ext.NpNDArray #
Create a NumpyNdarray IO Descriptor from given inputs.
- Parameters:
sample – Given sample np.ndarray data. It also accepts a sequence-like data type that can be converted to np.ndarray.
enforce_dtype – Enforce a certain data type. dtype must be specified in the function signature. If you don't want to enforce a specific dtype, set enforce_dtype=False.
enforce_shape – Enforce a certain shape. shape must be specified in the function signature. If you don't want to enforce a specific shape, set enforce_shape=False.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    from __future__ import annotations

    from typing import TYPE_CHECKING, Any

    import bentoml
    from bentoml.io import NumpyNdarray

    import numpy as np

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    input_spec = NumpyNdarray.from_sample(np.array([[1,2,3]]))

    @svc.api(input=input_spec, output=NumpyNdarray())
    async def predict(input: NDArray[np.int16]) -> NDArray[Any]:
        return await runner.async_run(input)
- Raises:
BadInput – If the given sample is of type numpy.generic. This exception is also raised if a np.ndarray cannot be created from the given sample.
- async NumpyNdarray.from_proto(field: pb.NDArray | bytes) ext.NpNDArray [source]#
Process the incoming protobuf request and convert it to numpy.ndarray.
- Parameters:
request – Incoming RPC request message.
context – grpc.ServicerContext
- Returns:
A np.array constructed from the given protobuf message.
- Return type:
numpy.ndarray
Note
Currently, pb.NDArray and serialized_bytes are supported as valid inputs. serialized_bytes will be prioritised over pb.NDArray if both are provided. Serialized bytes have a specialized representation and should not be used by users directly.
- async NumpyNdarray.from_http_request(request: Request) ext.NpNDArray [source]#
Process incoming requests and convert incoming objects to numpy.ndarray.
- Parameters:
request – Incoming request.
- Returns:
A numpy.ndarray object. This can then be used inside user-defined logic.
- async NumpyNdarray.to_http_response(obj: ext.NpNDArray, ctx: Context | None = None)[source]#
Process the given object and convert it to an HTTP response.
- Parameters:
obj – np.ndarray that will be serialized to JSON.
ctx – Context object that contains information about the request.
- Returns:
HTTP response of type starlette.responses.Response. This can be accessed via cURL or any external web traffic.
Tabular Data with Pandas#
To use the IO descriptor, install bentoml with the extra io-pandas dependency:
    pip install "bentoml[io-pandas]"
Note
The pandas package is required to use bentoml.io.PandasDataFrame or bentoml.io.PandasSeries. Install it with pip install pandas and add it to your bentofile.yaml under either the Python or Conda packages list. Refer to Build Options.
- class bentoml.io.PandasDataFrame(orient: ext.DataFrameOrient = 'records', columns: list[str] | None = None, apply_column_names: bool = False, dtype: bool | ext.PdDTypeArg | None = None, enforce_dtype: bool = False, shape: tuple[int, ...] | None = None, enforce_shape: bool = False, default_format: t.Literal['json', 'parquet', 'csv'] = 'json')[source]#
PandasDataFrame defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, pd.DataFrame as specified in your API function signature.
A sample service implementation:
service.py
    from __future__ import annotations

    import bentoml
    import pandas as pd
    import numpy as np
    from bentoml.io import PandasDataFrame

    input_spec = PandasDataFrame.from_sample(pd.DataFrame(np.array([[5,4,3,2]])))

    runner = bentoml.sklearn.get("sklearn_model_clf").to_runner()

    svc = bentoml.Service("iris-classifier", runners=[runner])

    @svc.api(input=input_spec, output=PandasDataFrame())
    def predict(input_arr):
        res = runner.run(input_arr)
        return pd.DataFrame(res)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "Content-Type: application/json" \
        --data '[{"0":5,"1":4,"2":3,"3":2}]' http://0.0.0.0:3000/predict

    # [{"0": 1}]%
request.py
    import requests

    requests.post(
        "http://0.0.0.0:3000/predict",
        headers={"content-type": "application/json"},
        data='[{"0":5,"1":4,"2":3,"3":2}]',
    ).text
- Parameters:
orient –
Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
  - split - dict[str, Any] ↦ {idx → [idx], columns → [columns], data → [values]}
  - records - list[Any] ↦ [{column → value}, ..., {column → value}]
  - index - dict[str, Any] ↦ {idx → {column → value}}
  - columns - dict[str, Any] ↦ {column → {index → value}}
  - values - dict[str, Any] ↦ Values arrays
columns – List of column names that users wish to update.
apply_column_names – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.
dtype – Data type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. If it is a dictionary mapping columns to dtype, those dtypes are applied to incoming dataframes. If False, dtypes are not inferred at all (this only applies to the data). This is not applicable for orient='table'.
enforce_dtype – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.
shape –
Optional shape check that users can specify for their incoming HTTP requests. Only the number of columns in the given shape is checked:
service.py
    import pandas as pd
    from bentoml.io import PandasDataFrame

    df = pd.DataFrame([[1, 2, 3]])  # shape (1,3)
    inp = PandasDataFrame.from_sample(df)

    @svc.api(
        input=PandasDataFrame(shape=(51, 10), enforce_shape=True),
        output=PandasDataFrame()
    )
    def predict(input_df: pd.DataFrame) -> pd.DataFrame:
        # if input_df has shape (40,9),
        # an error will be raised
        ...
enforce_shape – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.
default_format –
The default serialization format to use if the request does not specify a Content-Type header. It is also the serialization format used for the response. Possible values are json, parquet, and csv.
- Returns:
IO Descriptor that represents a pd.DataFrame.
- Return type:
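The orient layouts listed for PandasDataFrame are easier to compare side by side on a concrete table. The sketch below rebuilds each layout with plain Python containers for the single-row frame used in the cURL example. It is an illustration only: the helper to_orient is hypothetical, it does not call pandas, and labels are written as strings because JSON object keys are always strings.

```python
import json


def to_orient(columns, index, rows, orient="records"):
    """Lay out a small table (column labels, row labels, row values) in a given orient."""
    if orient == "split":
        return {"columns": columns, "index": index, "data": rows}
    if orient == "records":
        return [dict(zip(columns, row)) for row in rows]
    if orient == "index":
        return {idx: dict(zip(columns, row)) for idx, row in zip(index, rows)}
    if orient == "columns":
        return {
            col: {idx: row[i] for idx, row in zip(index, rows)}
            for i, col in enumerate(columns)
        }
    if orient == "values":
        return rows
    raise ValueError(f"unsupported orient: {orient!r}")


columns = ["0", "1", "2", "3"]
index = ["0"]
rows = [[5, 4, 3, 2]]

for orient in ("split", "records", "index", "columns", "values"):
    print(orient, json.dumps(to_orient(columns, index, rows, orient)))
```

The records layout is the one sent in the sample request, which is why that request body is a list containing one object per row.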
- classmethod PandasDataFrame.from_sample(sample: ext.PdDataFrame) ext.PdDataFrame #
Create a PandasDataFrame IO Descriptor from given inputs.
- Parameters:
sample – Given sample pd.DataFrame data.
orient –
Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
  - split - dict[str, Any] ↦ {idx → [idx], columns → [columns], data → [values]}
  - records - list[Any] ↦ [{column → value}, ..., {column → value}]
  - index - dict[str, Any] ↦ {idx → {column → value}}
  - columns - dict[str, Any] ↦ {column → {index → value}}
  - values - dict[str, Any] ↦ Values arrays
  - table - dict[str, Any] ↦ {schema: { schema }, data: { data }}
apply_column_names – Update incoming DataFrame columns. columns must be specified in the function signature. If you don't want to enforce specific column names, set apply_column_names=False.
enforce_dtype – Enforce a certain data type. dtype must be specified in the function signature. If you don't want to enforce a specific dtype, set enforce_dtype=False.
enforce_shape – Enforce a certain shape. shape must be specified in the function signature. If you don't want to enforce a specific shape, set enforce_shape=False.
default_format –
The default serialization format to use if the request does not specify a Content-Type header. It is also the serialization format used for the response. Possible values are json, parquet, and csv.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    import pandas as pd
    from bentoml.io import PandasDataFrame

    arr = [[1,2,3]]
    input_spec = PandasDataFrame.from_sample(pd.DataFrame(arr))

    @svc.api(input=input_spec, output=PandasDataFrame())
    def predict(inputs: pd.DataFrame) -> pd.DataFrame: ...
- async PandasDataFrame.from_proto(field: pb.DataFrame | bytes) ext.PdDataFrame [source]#
Process the incoming protobuf request and convert it to pandas.DataFrame.
- Parameters:
request – Incoming RPC request message.
context – grpc.ServicerContext
- Returns:
A pandas.DataFrame object. This can then be used inside user-defined logic.
- async PandasDataFrame.from_http_request(request: Request) ext.PdDataFrame [source]#
Process incoming requests and convert incoming objects to pd.DataFrame.
- Parameters:
request (starlette.requests.Requests) – Incoming request.
- Returns:
A pd.DataFrame object. This can then be used inside user-defined logic.
- Raises:
BadInput – Raised when the incoming request is badly formatted.
- async PandasDataFrame.to_http_response(obj: ext.PdDataFrame, ctx: Context | None = None) Response [source]#
Process the given object and convert it to an HTTP response.
- Parameters:
obj (pd.DataFrame) – pd.DataFrame that will be serialized to JSON or parquet.
- Returns:
HTTP response of type starlette.responses.Response. This can be accessed via cURL or any external web traffic.
- class bentoml.io.PandasSeries(orient: ext.SeriesOrient = 'records', dtype: ext.PdDTypeArg | None = None, enforce_dtype: bool = False, shape: tuple[int, ...] | None = None, enforce_shape: bool = False)[source]#
PandasSeries defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, pd.Series as specified in your API function signature.
A sample service implementation:
service.py
    import bentoml
    import pandas as pd
    import numpy as np
    from bentoml.io import PandasSeries

    runner = bentoml.sklearn.get("sklearn_model_clf").to_runner()

    svc = bentoml.Service("iris-classifier", runners=[runner])

    @svc.api(input=PandasSeries(), output=PandasSeries())
    def predict(input_arr):
        res = runner.run(input_arr)  # type: np.ndarray
        return pd.Series(res)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "Content-Type: application/json" \
        --data '[{"0":5,"1":4,"2":3,"3":2}]' http://0.0.0.0:3000/predict

    # [{"0": 1}]%
request.py
    import requests

    requests.post(
        "http://0.0.0.0:3000/predict",
        headers={"content-type": "application/json"},
        data='[{"0":5,"1":4,"2":3,"3":2}]',
    ).text
- Parameters:
orient –
Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
  - split - dict[str, Any] ↦ {idx → [idx], columns → [columns], data → [values]}
  - records - list[Any] ↦ [{column → value}, ..., {column → value}]
  - index - dict[str, Any] ↦ {idx → {column → value}}
  - columns - dict[str, Any] ↦ {column → {index → value}}
  - values - dict[str, Any] ↦ Values arrays
columns – List of column names that users wish to update.
apply_column_names – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.
dtype – Data type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. If it is a dictionary mapping columns to dtype, those dtypes are applied to incoming dataframes. If False, dtypes are not inferred at all (this only applies to the data). This is not applicable for orient='table'.
enforce_dtype – Whether to enforce a certain data type. If enforce_dtype=True, then dtype must be specified.
shape –
Optional shape check that users can specify for their incoming HTTP requests. Only the number of columns in the given shape is checked:
service.py
    import pandas as pd
    from bentoml.io import PandasSeries

    @svc.api(input=PandasSeries(shape=(51,), enforce_shape=True), output=PandasSeries())
    def infer(input_series: pd.Series) -> pd.Series:
        # if input_series has shape (40,), it will error
        ...
enforce_shape – Whether to enforce a certain shape. If enforce_shape=True, then shape must be specified.
- Returns:
IO Descriptor that represents a pd.Series.
- Return type:
- classmethod PandasSeries.from_sample(sample: ext.PdSeries | t.Sequence[t.Any]) ext.PdSeries #
Create a PandasSeries IO Descriptor from given inputs.
- Parameters:
sample – Given sample pd.Series data.
orient –
Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
  - split - dict[str, Any] ↦ {idx → [idx], columns → [columns], data → [values]}
  - records - list[Any] ↦ [{column → value}, ..., {column → value}]
  - index - dict[str, Any] ↦ {idx → {column → value}}
  - table - dict[str, Any] ↦ {schema: { schema }, data: { data }}
enforce_dtype – Enforce a certain data type. dtype must be specified in the function signature. If you don't want to enforce a specific dtype, set enforce_dtype=False.
enforce_shape – Enforce a certain shape. shape must be specified in the function signature. If you don't want to enforce a specific shape, set enforce_shape=False.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    import pandas as pd
    from bentoml.io import PandasSeries

    arr = [1,2,3]
    # from_sample expects a Series (or a sequence), per its signature
    input_spec = PandasSeries.from_sample(pd.Series(arr))

    @svc.api(input=input_spec, output=PandasSeries())
    def predict(inputs: pd.Series) -> pd.Series: ...
- async PandasSeries.from_proto(field: pb.Series | bytes) ext.PdSeries [source]#
Process the incoming protobuf request and convert it to pandas.Series.
- Parameters:
request – Incoming RPC request message.
context – grpc.ServicerContext
- Returns:
A pandas.Series object. This can then be used inside user-defined logic.
- async PandasSeries.from_http_request(request: Request) ext.PdSeries [source]#
Process incoming requests and convert incoming objects to pd.Series.
- Parameters:
request – Incoming request.
- Returns:
A pd.Series object. This can then be used inside user-defined logic.
- async PandasSeries.to_http_response(obj: t.Any, ctx: Context | None = None) Response [source]#
Process the given object and convert it to an HTTP response.
- Parameters:
obj – pd.Series that will be serialized to JSON.
- Returns:
HTTP response of type starlette.responses.Response. This can be accessed via cURL or any external web traffic.
Structured Data with JSON#
Note
For common structured data, we recommend using the JSON descriptor, as it provides the most flexibility. Users can also define a schema for the JSON data via a Pydantic model and use it for data validation.
To use the IO descriptor with Pydantic, install bentoml with the extra io-json dependency:
    pip install "bentoml[io-json]"
This installs Pydantic alongside BentoML. Then add pydantic to your bentofile.yaml under either the Python or Conda packages list. Refer to Build Options. We also provide an example project using Pydantic for request validation.
    ...
    python:
      packages:
        - pydantic
    ...
    conda:
      channels:
        - conda-forge
      dependencies:
        - pydantic
- class bentoml.io.JSON(*, pydantic_model: type[pydantic.main.BaseModel] | None = None, validate_json: bool | None = None, json_encoder: type[json.encoder.JSONEncoder] = <class 'bentoml._internal.io_descriptors.json.DefaultJsonEncoder'>)[source]#
JSON defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, a JSON representation as specified in your API function signature.
A sample service implementation:
service.py
    from __future__ import annotations

    import typing
    from typing import TYPE_CHECKING
    from typing import Any
    from typing import Optional

    import bentoml
    from bentoml.io import NumpyNdarray
    from bentoml.io import JSON

    import numpy as np
    import pandas as pd
    from pydantic import BaseModel

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    iris_clf_runner = bentoml.sklearn.get("iris_clf_with_feature_names:latest").to_runner()

    svc = bentoml.Service("iris_classifier_pydantic", runners=[iris_clf_runner])

    class IrisFeatures(BaseModel):
        sepal_len: float
        sepal_width: float
        petal_len: float
        petal_width: float

        # Optional field
        request_id: Optional[int]

        # Use custom Pydantic config for additional validation options
        class Config:
            extra = 'forbid'

    input_spec = JSON(pydantic_model=IrisFeatures)

    @svc.api(input=input_spec, output=NumpyNdarray())
    def classify(input_data: IrisFeatures) -> NDArray[Any]:
        if input_data.request_id is not None:
            print("Received request ID: ", input_data.request_id)

        input_df = pd.DataFrame([input_data.dict(exclude={"request_id"})])
        return iris_clf_runner.run(input_df)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "content-type: application/json" \
        --data '{"sepal_len": 6.2, "sepal_width": 3.2, "petal_len": 5.2, "petal_width": 2.2}' \
        http://127.0.0.1:3000/classify

    # [2]%
request.py
    import requests

    # The API function above is named classify, so the route is /classify.
    requests.post(
        "http://0.0.0.0:3000/classify",
        headers={"content-type": "application/json"},
        data='{"sepal_len": 6.2, "sepal_width": 3.2, "petal_len": 5.2, "petal_width": 2.2}',
    ).text
- Parameters:
pydantic_model – Pydantic model schema. When used, the inference API callback will receive an instance of the specified pydantic_model class.
json_encoder – JSON encoder class. By default, BentoML implements a custom JSON encoder that provides additional serialization support for numpy arrays, pandas dataframes, and dataclass-like objects (attrs, dataclass, etc.). If you wish to use a custom encoder, make sure it supports the aforementioned objects.
- Returns:
IO Descriptor that represents JSON format.
- Return type:
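Since json_encoder expects a json.JSONEncoder subclass, a custom encoder follows the standard library pattern of overriding default(). The following is a minimal sketch under stated assumptions: ExampleEncoder and Prediction are hypothetical names, and the encoder only covers dataclasses and dates. A real replacement for the default encoder would also need to handle NumPy arrays and pandas objects, as the parameter description requires.

```python
import dataclasses
import datetime
import json


class ExampleEncoder(json.JSONEncoder):
    """Hypothetical encoder: adds dataclass and date support on top of the defaults."""

    def default(self, o):
        # Dataclass instances are converted to plain dicts.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        # Dates and datetimes are written as ISO 8601 strings.
        if isinstance(o, (datetime.date, datetime.datetime)):
            return o.isoformat()
        # Defer to the base class, which raises TypeError for unsupported types.
        return super().default(o)


@dataclasses.dataclass
class Prediction:
    label: str
    score: float
    created: datetime.date


payload = json.dumps(
    Prediction("setosa", 0.98, datetime.date(2024, 1, 1)), cls=ExampleEncoder
)
print(payload)
```

Going by the class signature, such an encoder would be passed as JSON(json_encoder=ExampleEncoder).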
- classmethod JSON.from_sample(sample: str | Dict[str, Any] | BaseModel | None) str | Dict[str, Any] | BaseModel | None #
Create a JSON IO Descriptor from given inputs.
- Parameters:
sample –
A JSON-like datatype, which can be a dict, str, or list. sample also accepts a Pydantic model.
    from pydantic import BaseModel

    class IrisFeatures(BaseModel):
        sepal_len: float
        sepal_width: float
        petal_len: float
        petal_width: float

    input_spec = JSON.from_sample(
        IrisFeatures(sepal_len=1.0, sepal_width=2.0, petal_len=3.0, petal_width=4.0)
    )

    @svc.api(input=input_spec, output=NumpyNdarray())
    async def predict(input: NDArray[np.int16]) -> NDArray[Any]:
        return await runner.async_run(input)
json_encoder – Optional JSON encoder.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    from __future__ import annotations

    import bentoml
    from typing import Any
    from bentoml.io import JSON

    input_spec = JSON.from_sample({"Hello": "World", "foo": "bar"})

    @svc.api(input=input_spec, output=JSON())
    async def predict(input: dict[str, Any]) -> dict[str, Any]:
        return await runner.async_run(input)
- Raises:
BadInput – If the given sample is not a valid JSON string, bytes, or a supported nested type.
Texts#
bentoml.io.Text is commonly used for NLP applications:
- class bentoml.io.Text(content_type: Literal['text/plain', 'text/event-stream'] = 'text/plain')[source]#
Text defines the API specification for the inputs/outputs of a Service. Text represents strings for all incoming requests and outgoing responses, as specified in your API function signature.
A sample GPT2 service implementation:
service.py
    from __future__ import annotations

    import bentoml
    from bentoml.io import Text

    runner = bentoml.tensorflow.get('gpt2:latest').to_runner()

    svc = bentoml.Service("gpt2-generation", runners=[runner])

    @svc.api(input=Text(), output=Text())
    def predict(text: str) -> str:
        res = runner.run(text)
        return res['generated_text']
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "Content-Type: text/plain" \
        --data 'Not for nothing did Orin say that people outdoors.' \
        http://0.0.0.0:3000/predict
request.py
    import requests

    requests.post(
        "http://0.0.0.0:3000/predict",
        headers={"content-type": "text/plain"},
        data='Not for nothing did Orin say that people outdoors.',
    ).text
Note
Text is not designed to take any args or kwargs during initialization.
- Returns:
IO Descriptor that represents strings type.
- Return type:
- classmethod Text.from_sample(sample: str | bytes) str #
Create a Text IO Descriptor from given inputs.
- Parameters:
sample – Given sample text.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    @svc.api(input=bentoml.io.Text.from_sample('Bento box is'), output=bentoml.io.Text())
    def predict(inputs: str) -> str: ...
Images#
To use the IO descriptor, install bentoml with the extra io-image dependency:
    pip install "bentoml[io-image]"
Note
The Pillow package is required to use bentoml.io.Image. Install it with pip install Pillow and add it to your bentofile.yaml under either the Python or Conda packages list. Refer to Build Options.
- class bentoml.io.Image(pilmode: _Mode | None = 'RGB', mime_type: str = 'image/jpeg', *, allowed_mime_types: t.Iterable[str] | None = None)[source]#
Image defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, images as specified in your API function signature.
A sample object detection service:
service.py
    from __future__ import annotations

    from typing import TYPE_CHECKING
    from typing import Any

    import numpy as np

    import bentoml
    from bentoml.io import Image
    from bentoml.io import NumpyNdarray

    # Aliased so it does not shadow the bentoml.io.Image descriptor
    from PIL.Image import Image as PILImage

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    runner = bentoml.tensorflow.get('image-classification:latest').to_runner()

    svc = bentoml.Service("vit-object-detection", runners=[runner])

    @svc.api(input=Image(), output=NumpyNdarray(dtype="float32"))
    async def predict_image(f: PILImage) -> NDArray[Any]:
        assert isinstance(f, PILImage)
        arr = np.array(f) / 255.0
        assert arr.shape == (28, 28)

        # We are using a greyscale image and our model expects one
        # extra channel dimension
        arr = np.expand_dims(arr, (0, 3)).astype("float32")  # reshape to [1, 28, 28, 1]
        return await runner.async_run(arr)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    # we will run on our input image test.jpg
    # the image can be downloaded from http://images.cocodataset.org/val2017/000000039769.jpg
    % curl -H "Content-Type: multipart/form-data" \
        -F 'fileobj=@test.jpg;type=image/jpeg' \
        http://0.0.0.0:3000/predict_image

    # [{"score":0.8610631227493286,"label":"Egyptian cat"},
    # {"score":0.08770329505205154,"label":"tabby, tabby cat"},
    # {"score":0.03540956228971481,"label":"tiger cat"},
    # {"score":0.004140055272728205,"label":"lynx, catamount"},
    # {"score":0.0009498853469267488,"label":"Siamese cat, Siamese"}]%
request.py
    import requests

    # Let requests set the multipart Content-Type header itself, so that it
    # includes the boundary; setting it by hand would omit the boundary.
    with open('test.jpg', 'rb') as f:
        requests.post(
            "http://0.0.0.0:3000/predict_image",
            files={"upload_file": f},
        ).text
- Parameters:
pilmode – Color mode for PIL. Defaults to RGB.
mime_type – The MIME type of the file type that this descriptor should return. Only relevant when used as an output descriptor.
allowed_mime_types – A list of MIME types to restrict input to.
- Returns:
IO Descriptor that represents either a PIL.Image.Image or a np.ndarray representing an image.
- Return type:
- classmethod Image.from_sample(sample: ImageType | str) ImageType #
Create an Image IO Descriptor from given inputs.
- Parameters:
sample – Given file-like object, or a path to a file.
pilmode – Optional color mode for PIL. Defaults to RGB.
mime_type – The MIME type of the file type that this descriptor should return. If not specified, from_sample will try to infer the MIME type from the file extension.
allowed_mime_types – An optional list of MIME types to restrict input to.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    from __future__ import annotations

    import typing as t

    import bentoml
    from bentoml.io import Image
    import numpy as np

    input_spec = Image.from_sample("/path/to/image.jpg")

    @svc.api(input=input_spec, output=Image())
    async def predict(input: t.IO[t.Any]) -> t.IO[t.Any]:
        return await runner.async_run(input)
- Raises:
InvalidArgument – If the given sample is not a valid image type.
MissingDependencyException – If filetype is not installed.
BadInput – If the given sample from file can't be parsed with Pillow.
Files#
- class bentoml.io.File(kind: FileKind = 'binaryio', mime_type: str | None = None, **kwargs: t.Any)[source]#
File defines the API specification for the inputs/outputs of a Service, where inputs are converted to, or outputs are converted from, file-like objects as specified in your API function signature.
A sample ViT service:
service.py
    from __future__ import annotations

    import io
    from typing import TYPE_CHECKING
    from typing import Any

    import bentoml
    from bentoml.io import File
    from bentoml.io import NumpyNdarray

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    runner = bentoml.tensorflow.get('image-classification:latest').to_runner()

    svc = bentoml.Service("vit-pdf-classifier", runners=[runner])

    @svc.api(input=File(), output=NumpyNdarray(dtype="float32"))
    async def predict(input_pdf: io.BytesIO[Any]) -> NDArray[Any]:
        return await runner.async_run(input_pdf)
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -H "Content-Type: multipart/form-data" \
        -F 'fileobj=@test.pdf;type=application/pdf' \
        http://0.0.0.0:3000/predict
request.py
    import requests

    # Let requests set the multipart Content-Type header itself, so that it
    # includes the boundary; setting it by hand would omit the boundary.
    with open('test.pdf', 'rb') as f:
        requests.post(
            "http://0.0.0.0:3000/predict",
            files={"upload_file": f},
        ).text
- Parameters:
kind – The kind of file-like object to be used. Currently, the only accepted value is binaryio.
mime_type – MIME type of the returned starlette.response.Response. Only available when used as an output descriptor.
- Returns:
IO Descriptor that represents file-like objects.
- Return type:
- classmethod File.from_sample(sample: IOBase | IO[bytes] | FileLike[bytes] | str) IOBase | IO[bytes] | FileLike[bytes] #
Create a File IO Descriptor from given inputs.
- Parameters:
sample – Given file-like object, or a path to a file.
kind – The kind of file-like object to be used. Currently, the only accepted value is binaryio.
mime_type – Optional MIME type for the descriptor. If not provided, from_sample will try to infer the MIME type from the file extension.
- Returns:
IODescriptor from the given user inputs.
- Return type:
Example:
service.py
    from __future__ import annotations

    import typing as t

    import bentoml
    from bentoml.io import File

    input_spec = File.from_sample("/path/to/file.pdf")

    @svc.api(input=input_spec, output=File())
    async def predict(input: t.IO[t.Any]) -> t.IO[t.Any]:
        return await runner.async_run(input)
- Raises:
MissingDependencyException – If filetype is not installed.
ValueError – If the MIME type cannot be inferred automatically.
BadInput – Any other error that may occur while figuring out the MIME type.
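Inferring a MIME type from a file extension, as from_sample is described to do, can be illustrated with the standard library. This is a sketch of the idea only: it uses Python's mimetypes module rather than the filetype package named above, and the helper infer_mime_type is hypothetical. Raising ValueError for an unknown extension mirrors the documented failure mode.

```python
import mimetypes


def infer_mime_type(path):
    """Guess a MIME type from the file extension; raise ValueError if unknown."""
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"could not infer MIME type for {path!r}")
    return mime_type


print(infer_mime_type("/path/to/file.pdf"))
print(infer_mime_type("/path/to/image.jpg"))
```

Because this looks only at the extension, a misnamed file is reported with the wrong type; passing mime_type explicitly avoids relying on inference.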
Multipart Payloads#
Note
Multipart makes it possible to compose a multipart payload from multiple other IO Descriptor instances. For example, you may create a Multipart input that contains an image file and additional metadata in JSON.
- class bentoml.io.Multipart(**inputs: IODescriptor[Any])[source]#
Multipart defines the API specification for the inputs/outputs of a Service, where the inputs/outputs can receive/send multipart requests/responses as specified in your API function signature.
A sample service implementation:
service.py
    from __future__ import annotations

    from typing import TYPE_CHECKING
    from typing import Any

    import bentoml
    from bentoml.io import NumpyNdarray
    from bentoml.io import Multipart
    from bentoml.io import JSON

    if TYPE_CHECKING:
        from numpy.typing import NDArray

    runner = bentoml.sklearn.get("sklearn_model_clf").to_runner()

    svc = bentoml.Service("iris-classifier", runners=[runner])

    input_spec = Multipart(arr=NumpyNdarray(), annotations=JSON())
    output_spec = Multipart(output=NumpyNdarray(), result=JSON())

    @svc.api(input=input_spec, output=output_spec)
    async def predict(
        arr: NDArray[Any], annotations: dict[str, Any]
    ) -> dict[str, NDArray[Any] | dict[str, Any]]:
        # async_run is the awaitable counterpart of run
        res = await runner.async_run(arr)
        return {"output": res, "result": annotations}
Users can then serve this service with bentoml serve:
    % bentoml serve ./service.py:svc --reload
Users can then send requests to the newly started service with any client:
    % curl -X POST -H "Content-Type: multipart/form-data" \
        -F annotations=@test.json -F arr='[5,4,3,2]' \
        http://0.0.0.0:3000/predict

    # --b1d72c201a064ecd92a17a412eb9208e
    # Content-Disposition: form-data; name="output"
    # content-length: 1
    # content-type: application/json

    # 1
    # --b1d72c201a064ecd92a17a412eb9208e
    # Content-Disposition: form-data; name="result"
    # content-length: 13
    # content-type: application/json

    # {"foo":"bar"}
    # --b1d72c201a064ecd92a17a412eb9208e--
Note
The following code snippet uses requests_toolbelt. Install it with pip install requests-toolbelt.
request.py
    import requests
    from requests_toolbelt.multipart.encoder import MultipartEncoder

    m = MultipartEncoder(
        fields={
            "field0": "value",
            "field1": "value",
            "field2": ("filename", open("test.json", "rb"), "application/json"),
        }
    )

    requests.post(
        "http://0.0.0.0:3000/predict", data=m, headers={"Content-Type": m.content_type}
    )
- Parameters:
inputs –
Dictionary whose keys are the input definitions for a Multipart request/response and whose values are IODescriptors supported by BentoML. Currently, Multipart supports Image, NumpyNdarray, PandasDataFrame, PandasSeries, Text, and File.
Make sure that the input parameters in the API function signature match the keys defined under Multipart:
    +----------------------------------------------------------------+
    |                                                                |
    |   +--------------------------------------------------------+   |
    |   |                                                        |   |
    |   |    Multipart(arr=NumpyNdarray(), annotations=JSON())   |   |
    |   |                                                        |   |
    |   +---------------+-----------------------+----------------+   |
    |                   |                       |                    |
    |                   |                       |                    |
    |                   |                       |                    |
    |                   +-----+        +--------+                    |
    |                         |        |                             |
    |         +---------------v--------v---------+                   |
    |         |  def predict(arr, annotations):  |                   |
    |         +----------------------------------+                   |
    |                                                                |
    +----------------------------------------------------------------+
- Returns:
IO Descriptor that represents a Multipart request/response.
- Return type:
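The multipart wire format itself can be assembled by hand when no helper library is available. The sketch below is a minimal illustration rather than BentoML or requests code: encode_multipart is a hypothetical helper, and the field names arr and annotations are chosen to match the keys of the Multipart spec in the sample service. It returns both the body and the Content-Type header value, since the boundary in the header must match the one used in the body.

```python
import json
import uuid


def encode_multipart(fields):
    """Encode {name: (content_type, bytes)} as a multipart/form-data body.

    Returns (body, content_type_header).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, (content_type, payload) in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(f"Content-Type: {content_type}".encode())
        lines.append(b"")  # blank line separates part headers from the part body
        lines.append(payload)
    lines.append(f"--{boundary}--".encode())  # closing delimiter
    lines.append(b"")
    body = b"\r\n".join(lines)
    return body, f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {
        "arr": ("application/json", json.dumps([5, 4, 3, 2]).encode()),
        "annotations": ("application/json", json.dumps({"foo": "bar"}).encode()),
    }
)
print(content_type)
print(body.decode())
```

Each part name becomes one keyword argument of the API function, which is why the names in the request have to line up with the keys given to Multipart.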
Custom IODescriptor#
Note
The IODescriptor base class can be extended to support custom data formats for your APIs if the built-in descriptors do not fit your needs.