API IO Descriptors¶
IODescriptor base¶
- class bentoml.io.IODescriptor(*args, **kwargs)¶
IODescriptor describes the input/output data format of an InferenceAPI defined in a bentoml.Service. This is an abstract base class for extending new HTTP endpoint IO descriptor types in BentoServer.
bentoml.io.JSON¶
- class bentoml.io.JSON(*args, **kwargs)¶
JSON defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from a JSON representation as specified in your API function signature.

Sample implementation of a sklearn service:
# sklearn_svc.py
import pandas as pd
import numpy as np

import bentoml
import bentoml.sklearn
from bentoml.io import PandasDataFrame, JSON

input_spec = PandasDataFrame.from_sample(pd.DataFrame(np.array([[5,4,3,2]])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=JSON())
def predict(input_arr: pd.DataFrame):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return {"res": pd.DataFrame(res).to_json(orient='records')}
Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    headers={"content-type": "application/json"},
    data='[{"0":5,"1":4,"2":3,"3":2}]',
).text

% curl -X POST -H "Content-Type: application/json" \
    --data '[{"0":5,"1":4,"2":3,"3":2}]' \
    http://0.0.0.0:3000/predict

{"res":"[{\"0\":1}]"}
- Parameters
  - pydantic_model (pydantic.BaseModel, optional, default to None) – Pydantic model schema.
  - validate_json (bool, optional, default to True) – If True, use the Pydantic model specified above to validate the given JSON.
  - json_encoder (Type[json.JSONEncoder], default to ~bentoml._internal.io_descriptor.json.DefaultJsonEncoder) – JSON encoder class.
- Returns
IO Descriptor that represents JSON format.
- Return type
IODescriptor
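To illustrate what pydantic_model buys you: when validate_json is left at its default, incoming request bodies are checked against the model before they reach the API function. A minimal sketch of such a schema — the IrisFeatures model and its field names are illustrative assumptions, not part of BentoML:

```python
from pydantic import BaseModel

# Hypothetical schema for a four-feature payload; the field names
# are assumptions for illustration only.
class IrisFeatures(BaseModel):
    sepal_len: float
    sepal_width: float
    petal_len: float
    petal_width: float

# The descriptor would then be constructed as, e.g.:
#   JSON(pydantic_model=IrisFeatures, validate_json=True)
# A request body matching the schema validates cleanly:
sample = IrisFeatures(sepal_len=5.0, sepal_width=4.0, petal_len=3.0, petal_width=2.0)
print(sample.sepal_len)
```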
bentoml.io.NumpyNdarray¶
- class bentoml.io.NumpyNdarray(*args, **kwargs)¶
NumpyNdarray defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type numpy.ndarray as specified in your API function signature.

Sample implementation of a sklearn service:
# sklearn_svc.py
import bentoml
import bentoml.sklearn
from bentoml.io import NumpyNdarray

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_arr):
    res = runner.run(input_arr)
    return res
Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

% curl -X POST -H "Content-Type: application/json" \
    --data '[[5,4,3,2]]' \
    http://0.0.0.0:3000/predict

[1]
- Parameters
  - dtype (numpy.typing.DTypeLike, optional, default to None) – Data type users wish to convert their inputs/outputs to. Refer to NumPy's dtype documentation for more information.
  - enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True then dtype must be specified.
  - shape (Tuple[int, ...], optional, default to None) – Given shape that an array will be converted to. For example:

    import numpy as np
    from bentoml.io import NumpyNdarray

    @svc.api(
        input=NumpyNdarray(shape=(3,1), enforce_shape=True),
        output=NumpyNdarray(),
    )
    async def predict(input_array: np.ndarray) -> np.ndarray:
        # input_array will have shape (3,1)
        result = await runner.run(input_array)
        return result

  - enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True then shape must be specified.
- Returns
IO Descriptor that represents an np.ndarray.
- Return type
IODescriptor
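For intuition, the dtype coercion described above amounts to a plain NumPy conversion of the parsed JSON body. A sketch, assuming the request body was '[[5,4,3,2]]' and dtype=np.float32:

```python
import numpy as np

# The JSON body '[[5,4,3,2]]' parses to a nested list; with
# dtype=np.float32 the API function would receive an array like this:
payload = [[5, 4, 3, 2]]
arr = np.asarray(payload, dtype=np.float32)
print(arr.dtype, arr.shape)  # float32 (1, 4)
```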
- classmethod NumpyNdarray.from_sample(sample_input, enforce_dtype=True, enforce_shape=True)¶
Create a NumpyNdarray IO Descriptor from given inputs.
- Parameters
  - sample_input (np.ndarray) – Given sample np.ndarray data.
  - enforce_dtype (bool, optional, default to True) – Enforce a certain data type. dtype must be specified at function signature. If you don't want to enforce a specific dtype then change enforce_dtype=False.
  - enforce_shape (bool, optional, default to True) – Enforce a certain shape. shape must be specified at function signature. If you don't want to enforce a specific shape then change enforce_shape=False.
- Returns
NumpyNdarray IODescriptor from the given user inputs.
- Return type
NumpyNdarray
Example:
import numpy as np
from bentoml.io import NumpyNdarray

arr = [[1,2,3]]
inp = NumpyNdarray.from_sample(arr)

@svc.api(input=inp, output=NumpyNdarray())
def predict(input_array: np.ndarray) -> np.ndarray:
    ...
bentoml.io.File¶
- class bentoml.io.File(kind='binaryio', mime_type=None)¶
File defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from file-like objects as specified in your API function signature.

Sample implementation of a ViT service:
# vit_svc.py
import bentoml
from bentoml.io import File

svc = bentoml.Service("vit-object-detection")

@svc.api(input=File(), output=File())
def predict(input_pdf):
    return input_pdf
Users can then serve this service with bentoml serve:

% bentoml serve ./vit_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "vit-object-detection" defined in "vit_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    files={"upload_file": open('test.pdf', 'rb')},
).text
# requests sets the multipart Content-Type header (including the
# boundary) automatically when files= is used

% curl -H "Content-Type: multipart/form-data" \
    -F 'fileobj=@test.pdf;type=application/pdf' \
    http://0.0.0.0:3000/predict
- Parameters
  - mime_type (str, optional, default to None) – Return MIME type of the starlette.responses.Response, only available when used as an output descriptor.
- Returns
IO Descriptor that represents file-like objects.
- Return type
IODescriptor
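The mime_type string follows standard MIME naming. When hardcoding a type is undesirable, the standard-library mimetypes module can derive one from a filename — a convenience sketch, not a BentoML requirement:

```python
import mimetypes

# Derive a MIME string suitable for File(mime_type=...) from a filename.
mt, _ = mimetypes.guess_type("test.pdf")
print(mt)  # application/pdf
```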
bentoml.io.Image¶
- class bentoml.io.Image(*args, **kwargs)¶
Image defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from images as specified in your API function signature.

Sample implementation of a transformers service for object detection:
# obj_detc.py
import bentoml
import bentoml.transformers
from bentoml.io import Image, JSON

tag = 'google_vit_large_patch16_224:latest'
runner = bentoml.transformers.load_runner(
    tag,
    tasks='image-classification',
    device=-1,
    feature_extractor="google/vit-large-patch16-224",
)

svc = bentoml.Service("vit-object-detection", runners=[runner])

@svc.api(input=Image(), output=JSON())
def predict(input_img):
    res = runner.run_batch(input_img)
    return res
Users can then serve this service with bentoml serve:

% bentoml serve ./obj_detc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "vit-object-detection" defined in "obj_detc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    files={"upload_file": open('test.jpg', 'rb')},
).text
# requests sets the multipart Content-Type header (including the
# boundary) automatically when files= is used

# the sample input image test.jpg can be fetched from
# http://images.cocodataset.org/val2017/000000039769.jpg
% curl -H "Content-Type: multipart/form-data" \
    -F 'fileobj=@test.jpg;type=image/jpeg' \
    http://0.0.0.0:3000/predict

[{"score":0.8610631227493286,"label":"Egyptian cat"},
 {"score":0.08770329505205154,"label":"tabby, tabby cat"},
 {"score":0.03540956228971481,"label":"tiger cat"},
 {"score":0.004140055272728205,"label":"lynx, catamount"},
 {"score":0.0009498853469267488,"label":"Siamese cat, Siamese"}]
- Parameters
  - pilmode (str, optional, default to RGB) – Color mode for PIL.
  - mime_type (str, optional, default to image/jpeg) – Return MIME type of the starlette.responses.Response, only available when used as an output descriptor.
- Returns
IO Descriptor that represents either a PIL.Image.Image or an np.ndarray image.
- Return type
IODescriptor
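To see what pilmode controls, here is the equivalent manual conversion with Pillow — a sketch assuming Pillow and NumPy are installed, not BentoML internals:

```python
import numpy as np
from PIL import Image as PILImage

# A small grayscale sample image; pilmode="RGB" means the descriptor
# hands the API function image data in RGB, i.e. the equivalent of:
img = PILImage.new("L", (4, 4))   # mode "L" = single-channel grayscale
rgb = img.convert("RGB")          # what pilmode="RGB" amounts to
arr = np.asarray(rgb)
print(arr.shape)  # (4, 4, 3): height, width, three color channels
```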
bentoml.io.Text¶
- class bentoml.io.Text(*args, **kwargs)¶
Text defines API specification for the inputs/outputs of a Service. Text represents strings for all incoming requests/outgoing responses as specified in your API function signature.

Sample implementation of a GPT2 service:
# gpt2_svc.py
import bentoml
import bentoml.transformers
from bentoml.io import Text

# If you don't have a gpt2 model previously saved under the BentoML model store:
# tag = bentoml.transformers.import_from_huggingface_hub('gpt2')
runner = bentoml.transformers.load_runner('gpt2', tasks='text-generation')

svc = bentoml.Service("gpt2-generation", runners=[runner])

@svc.api(input=Text(), output=Text())
def predict(input_arr):
    res = runner.run_batch(input_arr)
    return res[0]['generated_text']
Users can then serve this service with bentoml serve:

% bentoml serve ./gpt2_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "gpt2-generation" defined in "gpt2_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    headers={"content-type": "text/plain"},
    data='Not for nothing did Orin say that people outdoors down here just scuttle in vectors from air conditioning to air conditioning.',
).text

% curl -X POST -H "Content-Type: text/plain" \
    --data 'Not for nothing did Orin say that people outdoors down here just scuttle in vectors from air conditioning to air conditioning.' \
    http://0.0.0.0:3000/predict
Note
Text is not designed to take any args or kwargs during initialization.
- Returns
IO Descriptor that represents strings.
- Return type
IODescriptor
bentoml.io.PandasDataFrame¶
- class bentoml.io.PandasDataFrame(*args, **kwargs)¶
PandasDataFrame defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type pandas.DataFrame as specified in your API function signature.

Sample implementation of a sklearn service:
# sklearn_svc.py
import bentoml
import pandas as pd
import numpy as np

import bentoml.sklearn
from bentoml.io import PandasDataFrame

input_spec = PandasDataFrame.from_sample(pd.DataFrame(np.array([[5,4,3,2]])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=PandasDataFrame())
def predict(input_arr):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return pd.DataFrame(res)
Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    headers={"content-type": "application/json"},
    data='[{"0":5,"1":4,"2":3,"3":2}]',
).text

% curl -X POST -H "Content-Type: application/json" \
    --data '[{"0":5,"1":4,"2":3,"3":2}]' \
    http://0.0.0.0:3000/predict

[{"0": 1}]
- Parameters
  - orient (str, optional, default to records) – Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
    - split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}
    - records - List[Any]: [{column -> value}, ..., {column -> value}]
    - index - Dict[str, Any]: {idx -> {column -> value}}
    - columns - Dict[str, Any]: {column -> {index -> value}}
    - values - Dict[str, Any]: Values arrays
  - columns (List[str], optional, default to None) – List of column names that users wish to update.
  - apply_column_names (bool, optional, default to False) – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.
  - dtype (Union[bool, Dict[str, Any]], optional, default to None) – Data type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. Else if it is a dictionary of column to dtype, then applies those to incoming dataframes. If False, then don't infer dtypes at all (only applies to the data). This is not applicable when orient='table'.
  - enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True then dtype must be specified.
  - shape (Tuple[int, ...], optional, default to None) – Optional shape check that users can specify for their incoming HTTP requests. Only the number of columns in the given shape is checked:

    import pandas as pd
    from bentoml.io import PandasDataFrame

    @svc.api(
        input=PandasDataFrame(shape=(51,10), enforce_shape=True),
        output=PandasDataFrame(),
    )
    def infer(input_df: pd.DataFrame) -> pd.DataFrame:
        # an input_df with shape (40,9) will raise an error
        ...
  - enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True then shape must be specified.
  - default_format (str, optional, default to json) – The default serialization format to use if the request does not specify headers['content-type']. It is also the serialization format used for the response. Possible values are:
    - json - JSON text format (inferred from content-type "application/json")
    - parquet - Parquet binary format (inferred from content-type "application/octet-stream")
    - csv - CSV text format (inferred from content-type "text/csv")
- Returns
IO Descriptor that represents a pd.DataFrame.
- Return type
IODescriptor
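The orient values map directly onto pandas' own JSON serialization. A quick sketch of the two most common orients for the DataFrame used in the example above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[5, 4, 3, 2]]))

# orient='records' (the default): one JSON object per row
records = df.to_json(orient="records")
print(records)  # [{"0":5,"1":4,"2":3,"3":2}]

# orient='split': separate columns/index/data arrays
split = df.to_json(orient="split")
print(split)
```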
- classmethod PandasDataFrame.from_sample(sample_input, orient='records', apply_column_names=True, enforce_shape=True, enforce_dtype=False, default_format='json')¶
Create a PandasDataFrame IO Descriptor from given inputs.
- Parameters
  - sample_input (pd.DataFrame) – Given sample pd.DataFrame data.
  - orient (str, optional, default to records) – Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
    - split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}
    - records - List[Any]: [{column -> value}, ..., {column -> value}]
    - index - Dict[str, Any]: {idx -> {column -> value}}
    - columns - Dict[str, Any]: {column -> {index -> value}}
    - values - Dict[str, Any]: Values arrays
  - apply_column_names (bool, optional, default to True) – Update incoming DataFrame columns. columns must be specified at function signature. If you don't want to enforce specific column names then change apply_column_names=False.
  - enforce_dtype (bool, optional, default to False) – Enforce a certain data type. dtype must be specified at function signature. If you want to enforce a specific dtype then change enforce_dtype=True.
  - enforce_shape (bool, optional, default to True) – Enforce a certain shape. shape must be specified at function signature. If you don't want to enforce a specific shape then change enforce_shape=False.
  - default_format (str, optional, default to json) – The default serialization format to use if the request does not specify headers['content-type']. It is also the serialization format used for the response. Possible values are:
    - json - JSON text format (inferred from content-type "application/json")
    - parquet - Parquet binary format (inferred from content-type "application/octet-stream")
    - csv - CSV text format (inferred from content-type "text/csv")
- Returns
PandasDataFrame IODescriptor from the given user inputs.
- Return type
bentoml._internal.io_descriptors.PandasDataFrame
Example:
import pandas as pd
from bentoml.io import PandasDataFrame

arr = [[1,2,3]]
inp = PandasDataFrame.from_sample(pd.DataFrame(arr))

@svc.api(input=inp, output=PandasDataFrame())
def predict(inputs: pd.DataFrame) -> pd.DataFrame:
    ...
bentoml.io.PandasSeries¶
- class bentoml.io.PandasSeries(*args, **kwargs)¶
PandasSeries defines API specification for the inputs/outputs of a Service, where either inputs will be converted to or outputs will be converted from type pandas.Series as specified in your API function signature.

Sample implementation of a sklearn service:
# sklearn_svc.py
import bentoml
import pandas as pd
import numpy as np

import bentoml.sklearn
from bentoml.io import PandasSeries

input_spec = PandasSeries.from_sample(pd.Series(np.array([5,4,3,2])))

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

@svc.api(input=input_spec, output=PandasSeries())
def predict(input_arr):
    res = runner.run_batch(input_arr)  # type: np.ndarray
    return pd.Series(res)
Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --auto-reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests

requests.post(
    "http://0.0.0.0:3000/predict",
    headers={"content-type": "application/json"},
    data='[{"0":5,"1":4,"2":3,"3":2}]',
).text

% curl -X POST -H "Content-Type: application/json" \
    --data '[{"0":5,"1":4,"2":3,"3":2}]' \
    http://0.0.0.0:3000/predict

[{"0": 1}]
- Parameters
  - orient (str, optional, default to records) – Indication of expected JSON string format. Compatible JSON strings can be produced by pandas.io.json.to_json() with a corresponding orient value. Possible orients are:
    - split - Dict[str, Any]: {idx -> [idx], columns -> [columns], data -> [values]}
    - records - List[Any]: [{column -> value}, ..., {column -> value}]
    - index - Dict[str, Any]: {idx -> {column -> value}}
    - columns - Dict[str, Any]: {column -> {index -> value}}
    - values - Dict[str, Any]: Values arrays
  - columns (List[str], optional, default to None) – List of column names that users wish to update.
  - apply_column_names (bool, optional, default to False) – Whether to update incoming DataFrame columns. If apply_column_names=True, then columns must be specified.
  - dtype (Union[bool, Dict[str, Any]], optional, default to None) – Data type users wish to convert their inputs/outputs to. If it is a boolean, then pandas will infer dtypes. Else if it is a dictionary of column to dtype, then applies those to incoming dataframes. If False, then don't infer dtypes at all (only applies to the data). This is not applicable when orient='table'.
  - enforce_dtype (bool, optional, default to False) – Whether to enforce a certain data type. If enforce_dtype=True then dtype must be specified.
  - shape (Tuple[int, ...], optional, default to None) – Optional shape check that users can specify for their incoming HTTP requests. Only the number of columns in the given shape is checked. Example usage:

    import pandas as pd
    from bentoml.io import PandasSeries

    @svc.api(
        input=PandasSeries(shape=(51,10), enforce_shape=True),
        output=PandasSeries(),
    )
    def infer(input_df: pd.DataFrame) -> pd.DataFrame:
        # an input_df with shape (40,9) will raise an error
        ...
  - enforce_shape (bool, optional, default to False) – Whether to enforce a certain shape. If enforce_shape=True then shape must be specified.
- Returns
IO Descriptor that represents a pd.Series.
- Return type
IODescriptor
bentoml.io.Multipart¶
- class bentoml.io.Multipart(*args, **kwargs)¶
Multipart defines API specification for the inputs/outputs of a Service, where a Service can receive or send multipart requests/responses as specified in your API function signature.

Sample implementation of a sklearn service:
# sklearn_svc.py
import bentoml
import bentoml.sklearn
from bentoml.io import NumpyNdarray, Multipart, JSON

runner = bentoml.sklearn.load_runner("sklearn_model_clf")

svc = bentoml.Service("iris-classifier", runners=[runner])

input_spec = Multipart(arr=NumpyNdarray(), annotations=JSON())
output_spec = Multipart(output=NumpyNdarray(), result=JSON())

@svc.api(input=input_spec, output=output_spec)
def predict(arr, annotations):
    res = runner.run(arr)
    return {"output": res, "result": annotations}
Users can then serve this service with bentoml serve:

% bentoml serve ./sklearn_svc.py:svc --reload
(Press CTRL+C to quit)
[INFO] Starting BentoML API server in development mode with auto-reload enabled
[INFO] Serving BentoML Service "iris-classifier" defined in "sklearn_svc.py"
[INFO] API Server running on http://0.0.0.0:3000
Users can then send requests to the newly started service with any client:

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

m = MultipartEncoder(
    fields={
        'field0': 'value',
        'field1': 'value',
        'field2': ('filename', open('test.json', 'rb'), 'application/json'),
    }
)
requests.post(
    'http://0.0.0.0:3000/predict',
    data=m,
    headers={'Content-Type': m.content_type},
)

% curl -X POST -H "Content-Type: multipart/form-data" \
    -F annotations=@test.json \
    -F arr='[5,4,3,2]' \
    http://0.0.0.0:3000/predict

--b1d72c201a064ecd92a17a412eb9208e
Content-Disposition: form-data; name="output"
content-length: 1
content-type: application/json

1
--b1d72c201a064ecd92a17a412eb9208e
Content-Disposition: form-data; name="result"
content-length: 13
content-type: application/json

{"foo":"bar"}
--b1d72c201a064ecd92a17a412eb9208e--
- Parameters
  - inputs (Dict[str, IODescriptor]) – Dictionary whose keys are the input definitions for a Multipart request/response and whose values are IODescriptors supported by BentoML. Currently, Multipart supports Image, NumpyNdarray, PandasDataFrame, PandasSeries, Text, and File.

Make sure to match the input params of the API function to the keys defined under Multipart:

    +------------------------------------------------------+
    |                                                      |
    |   Multipart(arr=NumpyNdarray(), annotations=JSON())  |
    |                                                      |
    +---------------+---------------------+----------------+
                    |                     |
                    |                     |
            +-------v---------------------v-------+
            |   def predict(arr, annotations):    |
            +-------------------------------------+
- Returns
IO Descriptor that represents a Multipart request/response.
- Return type
IODescriptor