fast.ai#
fastai is a popular deep learning library that provides high-level components allowing practitioners to achieve state-of-the-art results in standard deep learning domains, as well as low-level components that researchers can combine to build new approaches. To learn more about fastai, visit their documentation.
BentoML provides native support for fastai, and this guide provides an overview of how to use BentoML with fastai.
Compatibility#
BentoML requires fastai version 2 or higher. fastai version 1 is not supported; if you are using fastai version 1, consider implementing a Custom Runner instead, as sketched below.
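For reference, a custom Runner wrapping a fastai v1 learner might look roughly like the following. This is a minimal sketch, not the official integration: the fastai v1 load_learner import path and the export directory are assumptions, and the Runnable simply delegates to the learner's predict.
import bentoml

class FastaiV1Runnable(bentoml.Runnable):
    # CPU only; fastai v1 inference is handled by the learner itself
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = False

    def __init__(self):
        # fastai v1 API: loads an export.pkl from the given directory
        # ("path/to/export_dir" is a placeholder)
        from fastai.basic_train import load_learner
        self.learner = load_learner("path/to/export_dir")

    @bentoml.Runnable.method(batchable=False)
    def predict(self, text: str):
        return self.learner.predict(text)

runner = bentoml.Runner(FastaiV1Runnable, name="fastai_v1_runner")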
Saving a trained fastai learner#
This example is based on Transfer Learning with text from fastai.
from fastai.basics import URLs
from fastai.metrics import accuracy
from fastai.text.data import DataBlock
from fastai.text.data import TextBlock
from fastai.text.data import untar_data
from fastai.text.data import CategoryBlock
from fastai.text.models import AWD_LSTM
from fastai.text.learner import text_classifier_learner
from fastai.data.transforms import parent_label
from fastai.data.transforms import get_text_files
from fastai.data.transforms import GrandparentSplitter
# Download IMDB dataset
path = untar_data(URLs.IMDB)
# Create IMDB DataBlock
imdb = DataBlock(
    blocks=(TextBlock.from_folder(path), CategoryBlock),
    get_items=get_text_files,
    get_y=parent_label,
    splitter=GrandparentSplitter(valid_name="test"),
)
dls = imdb.dataloaders(path)
# define a Learner object
learner = text_classifier_learner(
    dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy
)
# quickly fine tune the model
learner.fine_tune(4, 1e-2)
# output:
# epoch train_loss valid_loss accuracy time
# 0 0.453252 0.395130 0.822080 36:45
learner.predict("I really liked that movie!")
# output:
# ('pos', TensorText(1), TensorText([0.1216, 0.8784]))
After training, use save_model to save the Learner instance to the BentoML model store.
import bentoml

bentoml.fastai.save_model("fastai_sentiment", learner)
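save_model returns a Model object whose tag identifies the saved learner in the model store. Like other BentoML framework integrations, it also accepts optional labels and metadata; the values below are illustrative assumptions, not required fields.
model = bentoml.fastai.save_model(
    "fastai_sentiment",
    learner,
    labels={"owner": "nlp-team"},  # illustrative label (assumption)
    metadata={"dataset": "IMDB"},  # illustrative metadata (assumption)
)
print(model.tag)  # e.g. fastai_sentiment:<generated-version>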
To verify that the saved learner can be loaded properly:
learner = bentoml.fastai.load_model("fastai_sentiment:latest")
learner.predict("I really liked that movie!")
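You can also confirm the model exists in the local model store from the command line:
bentoml models list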
Building a Service using fastai#
See also
Building a Service: more information on creating a prediction service with BentoML.
import bentoml
import numpy as np
from bentoml.io import Text
from bentoml.io import NumpyNdarray
runner = bentoml.fastai.get("fastai_sentiment:latest").to_runner()
svc = bentoml.Service("fast_sentiment", runners=[runner])
@svc.api(input=Text(), output=NumpyNdarray())
async def classify_text(text: str) -> np.ndarray:
    # returns the sentiment score of the given text
    res = await runner.predict.async_run(text)
    return np.asarray(res[-1])
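Assuming the service code above is saved as service.py (an assumed file name), you can serve it locally with the BentoML CLI and test it with a plain-text request; the endpoint path follows the API function name:
bentoml serve service.py:svc --reload

curl -X POST -H "content-type: text/plain" \
    --data "I really liked that movie!" \
    http://127.0.0.1:3000/classify_text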
When constructing a bentofile.yaml, there are two ways to include fastai as a dependency: via python or via conda:
python:
  packages:
    - fastai

conda:
  channels:
    - fastchan
  dependencies:
    - fastai
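For reference, a complete bentofile.yaml for this example might look like the following sketch (service.py is an assumed file name based on the example above). The bento can then be built with bentoml build.
service: "service.py:svc"
include:
  - "service.py"
python:
  packages:
    - fastai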
Using Runners#
See also
See Using Runners doc for a general introduction to the Runner concept and its usage.
runner.predict.run is generally a drop-in replacement for learner.predict, regardless of the learner type, for executing the prediction in the model runner. A fastai runner accepts the same input type as the given learner. For example, a Runner created from a Tabular learner will accept a pandas.DataFrame as input, whereas a runner based on a Text learner will accept a str as input, as in the sketch below.
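The following sketch contrasts the two cases. The "fastai_tabular" model name and its columns are hypothetical, and init_local() is used only for local debugging, not in production.
import bentoml
import pandas as pd

# Text learner runner: takes a str, mirroring learner.predict
text_runner = bentoml.fastai.get("fastai_sentiment:latest").to_runner()
text_runner.init_local()  # for local debugging only
text_runner.predict.run("I really liked that movie!")

# Hypothetical tabular learner saved as "fastai_tabular": takes a row
# from a pandas.DataFrame, again mirroring learner.predict
tab_runner = bentoml.fastai.get("fastai_tabular:latest").to_runner()
tab_runner.init_local()
df = pd.DataFrame([{"age": 49, "education": "Bachelors"}])
tab_runner.predict.run(df.iloc[0])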
Using PyTorch layer#
Since fastai is built on top of PyTorch, it is also possible to use the PyTorch model inside a fastai learner directly for inference. Note that by using the PyTorch layer, you will not be able to use fastai Learner features such as .predict(), .get_preds(), etc.
To get the PyTorch model, access it via learner.model:
import bentoml
bentoml.pytorch.save_model(
    "my_pytorch_model", learner.model, signatures={"__call__": {"batchable": True}}
)
Learn more about using PyTorch with BentoML here.
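As a rough sketch, the saved PyTorch model can then be loaded as a runner. Keep in mind that the raw model expects numericalized token-id tensors rather than raw strings, so the fastai preprocessing pipeline must be applied separately; the dummy input below is purely illustrative.
import bentoml
import torch

pt_runner = bentoml.pytorch.get("my_pytorch_model:latest").to_runner()
pt_runner.init_local()  # for local debugging only

# Illustrative dummy batch of token ids; real inputs must come from the
# fastai tokenizer/numericalizer that produced the training DataLoaders.
dummy_ids = torch.zeros((1, 72), dtype=torch.long)
output = pt_runner.run(dummy_ids)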
Using GPU#
Since fastai doesn’t support using GPU for inference, BentoML can only support CPU inference with fastai models. Additionally, if the model was trained with mixed_precision, the loaded model will be converted back to FP32. See mixed precision to learn more.
If you need to use GPU for inference, you can use the PyTorch layer instead.
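With the PyTorch layer, GPU placement is handled through BentoML's runner resource configuration. A sketch of such a configuration file follows; the exact resource values are deployment-specific assumptions.
runners:
  resources:
    nvidia.com/gpu: 1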
Adaptive batching#
fastai’s Learner.predict does not support taking batch input for inference, hence the adaptive batching feature in BentoML is not available for fastai models. The default signature has batchable set to False.
If you need to use adaptive batching for inference, you can use the PyTorch layer; batching behavior can then be tuned as sketched below.
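When serving the PyTorch model saved above with batchable=True, adaptive batching can be tuned through BentoML's runner configuration; the values below are illustrative, not recommendations.
runners:
  batching:
    enabled: true
    max_batch_size: 64
    max_latency_ms: 10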