Offline Batch Serving
The BentoML CLI lets you load and run packaged models directly from the command line, without starting a server to handle requests (hence offline batch serving). There are three main classes of input adapters through which you can perform offline batch serving.
StringInput
DataframeInput, JSONInput, and TFTensorInput all inherit from StringInput. This class of adapter should be used for any input that is string-like (e.g. JSON, CSV, regular string, raw bytes).
Query with CLI command
Example with DataframeInput. Here, we give the data to the input adapter in the form of a flat JSON string using the --input flag:
$ bentoml run IrisClassifier:latest predict --input '[{"sw": 1, "sl": 2, "pw": 1, "pl": 2}]'
You can also pass file data to any subclass of StringInput using the --input-file flag:
$ bentoml run IrisClassifier:latest predict --format csv --input-file test.csv
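The two inputs above can be generated from the same records. The following is a minimal sketch using only the Python standard library, not part of BentoML itself. The column names (sw, sl, pw, pl) and the file name test.csv come from the examples above; the helper names and the CSV header row are assumptions made for illustration.

```python
import csv
import io
import json
import shlex
from pathlib import Path

# Records using the column names from the --input example above.
rows = [{"sw": 1, "sl": 2, "pw": 1, "pl": 2}]


def to_input_json(records):
    """Serialize the records as a JSON array, suitable for --input."""
    return json.dumps(records)


def to_csv_text(records):
    """Render the records as CSV text with a header row (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # shlex.quote makes the JSON payload safe to paste into a POSIX shell.
    payload = shlex.quote(to_input_json(rows))
    print("bentoml run IrisClassifier:latest predict --input " + payload)
    # Write the same records to the file used with --input-file.
    Path("test.csv").write_text(to_csv_text(rows))
```

Generating both forms from one list of records keeps the inline JSON and the CSV file in sync when you switch between --input and --input-file.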
FileInput
ImageInput inherits from FileInput. This class of adapter should be used mostly for image data (e.g. JPG, PNG).
Query with CLI command
Example with ImageInput. We provide the image data to the input adapter by specifying the image file we want to run inference on using the --input-file flag:
$ bentoml run PyTorchFashionClassifier:latest predict --input-file test.jpg
Alternatively, we can run inference on all images in a folder and specify the batch size using the --max-batch-size flag:
$ bentoml run PyTorchFashionClassifier:latest predict \
--input-file folder/*.jpg --max-batch-size 10
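To make the effect of a maximum batch size concrete, here is a minimal sketch of how a list of matched files splits into consecutive groups of at most ten. It illustrates the general idea only and is not BentoML's implementation; the function name batches is hypothetical, and the folder/*.jpg pattern is taken from the command above.

```python
from glob import glob


def batches(items, max_batch_size):
    """Yield consecutive slices of items, each at most max_batch_size long."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


if __name__ == "__main__":
    # Same glob pattern as the CLI example; sorted for a stable order.
    files = sorted(glob("folder/*.jpg"))
    for index, batch in enumerate(batches(files, 10)):
        print(f"batch {index}: {len(batch)} file(s)")
```

With 25 matching files, this grouping yields two batches of ten and a final batch of five.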
MultiFileInput
AnnotatedImageInput and MultiImageInput both inherit from MultiFileInput. This class of adapter should mostly be used for models that require multiple images (e.g. models that require a depth map along with a regular image).
Query with CLI command
Example with MultiImageInput. We provide image data to the input adapter using CLI flags of the form --input-<name> or --input-file-<name>:
$ bentoml run PyTorchFashionClassifier:latest predict \
--input-file-imageX testx.jpg \
--input-file-imageY testy.jpg
As with FileInput, we can run inference on all file pairs in the given folders, ten pairs per batch, by specifying the --max-batch-size flag:
$ bentoml run PyTorchFashionClassifier:latest predict --max-batch-size 10 \
--input-file-imageX folderx/*.jpg \
--input-file-imageY foldery/*.jpg
Note: when running inference with MultiFileInput over folders, ensure that the files to be paired have the same file-name prefix in each folder. For example:
folderx:
- 1.jpg
- 2.jpg
...
foldery:
- 1.jpg
- 2.jpg
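Before launching a long run, it can help to check that the two folders actually line up. The sketch below is a hypothetical pre-flight check written with the standard library only; it is not a BentoML feature. It assumes that "same prefix" means the file name without its extension, and the helper names pair_by_stem and list_names are made up for illustration.

```python
from pathlib import Path


def pair_by_stem(names_x, names_y):
    """Pair file names whose stems (name without extension) match.

    Returns (pairs, only_x, only_y), where only_x and only_y hold the
    stems that appear in just one of the two listings.
    """
    by_stem_x = {Path(name).stem: name for name in names_x}
    by_stem_y = {Path(name).stem: name for name in names_y}
    common = sorted(by_stem_x.keys() & by_stem_y.keys())
    pairs = [(by_stem_x[stem], by_stem_y[stem]) for stem in common]
    only_x = sorted(by_stem_x.keys() - by_stem_y.keys())
    only_y = sorted(by_stem_y.keys() - by_stem_x.keys())
    return pairs, only_x, only_y


def list_names(folder, pattern="*.jpg"):
    """Return the names of the files in folder that match pattern."""
    return [path.name for path in Path(folder).glob(pattern)]


if __name__ == "__main__":
    # Folder names follow the example listing above.
    pairs, only_x, only_y = pair_by_stem(
        list_names("folderx"), list_names("foldery")
    )
    print(f"{len(pairs)} matched pair(s)")
    if only_x or only_y:
        print("unmatched in folderx:", only_x)
        print("unmatched in foldery:", only_y)
```

Any stems reported as unmatched point to files that have no counterpart in the other folder and are worth fixing before running the CLI command.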