fast.ai

fastai is a popular deep learning library that provides high-level components for practitioners to get state-of-the-art results in standard deep learning domains, as well as low-level components for researchers to build new approaches. To learn more about fastai, see its documentation.

BentoML provides native support for fastai, and this guide provides an overview of how to use BentoML with fastai.

Compatibility

BentoML requires fastai version 2 or higher to be installed.

BentoML does not support fastai version 1. If you are using fastai version 1, consider using a Custom Runner.
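
For example, a typical installation of both libraries from PyPI looks like this:

pip install bentoml "fastai>=2.0.0"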

Saving a trained fastai learner

This example is based on the Transfer Learning with text tutorial from fastai.

from fastai.basics import URLs
from fastai.metrics import accuracy
from fastai.text.data import DataBlock
from fastai.text.data import TextBlock
from fastai.text.data import untar_data
from fastai.text.data import CategoryBlock
from fastai.text.models import AWD_LSTM
from fastai.text.learner import text_classifier_learner
from fastai.data.transforms import parent_label
from fastai.data.transforms import get_text_files
from fastai.data.transforms import GrandparentSplitter

# Download IMDB dataset
path = untar_data(URLs.IMDB)

# Create IMDB DataBlock
imdb = DataBlock(
    blocks=(TextBlock.from_folder(path), CategoryBlock),
    get_items=get_text_files,
    get_y=parent_label,
    splitter=GrandparentSplitter(valid_name="test"),
)
dls = imdb.dataloaders(path)

# define a Learner object
learner = text_classifier_learner(
    dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy
)

# quickly fine tune the model
learner.fine_tune(4, 1e-2)

# output:
# epoch     train_loss  valid_loss  accuracy  time
# 0         0.453252    0.395130    0.822080  36:45

learner.predict("I really liked that movie!")

# output:
# ('pos', TensorText(1), TensorText([0.1216, 0.8784]))

After training, use save_model to save the Learner instance to the BentoML model store.

import bentoml

bentoml.fastai.save_model("fastai_sentiment", learner)

To verify that the saved learner can be loaded properly:

learner = bentoml.fastai.load_model("fastai_sentiment:latest")

learner.predict("I really liked that movie!")

Building a Service using fastai

See also

Building a Service: more information on creating a prediction service with BentoML.

import bentoml

import numpy as np

from bentoml.io import Text
from bentoml.io import NumpyNdarray

runner = bentoml.fastai.get("fastai_sentiment:latest").to_runner()

svc = bentoml.Service("fast_sentiment", runners=[runner])


@svc.api(input=Text(), output=NumpyNdarray())
async def classify_text(text: str) -> np.ndarray:
    # return the predicted class probabilities for the given text
    res = await runner.predict.async_run(text)
    return np.asarray(res[-1])
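
Assuming the service code above is saved as service.py (the file name here is just an example), you can serve it locally with the BentoML CLI and send a request to the classify_text endpoint:

bentoml serve service.py:svc

# in another terminal
curl -X POST -H "content-type: text/plain" \
    --data "I really liked that movie!" \
    http://127.0.0.1:3000/classify_text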

When constructing a bentofile.yaml, there are two ways to include fastai as a dependency: via python packages or via conda.

Via python packages:

python:
  packages:
    - fastai

Via conda (using the fastchan channel):

conda:
  channels:
  - fastchan
  dependencies:
  - fastai
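
Once the bentofile.yaml is in place, the Bento can be built with the BentoML CLI:

bentoml build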

Using Runners

See also

See the Using Runners documentation for a general introduction to the Runner concept and its usage.

runner.predict.run is generally a drop-in replacement for learner.predict, regardless of the learner type, when executing predictions in the model runner. A fastai runner accepts the same input type as the underlying learner.

For example, a Runner created from a Tabular learner will accept a pandas.DataFrame as input, whereas a Text learner based runner will accept a str as input.
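
As a quick sketch, the text classifier runner saved above can be exercised locally as follows (init_local is intended for local testing and debugging only; inside a Service the runner lifecycle is managed for you):

import bentoml

runner = bentoml.fastai.get("fastai_sentiment:latest").to_runner()
runner.init_local()  # for local testing/debugging only

# a Text learner based runner takes a plain string, just like learner.predict
runner.predict.run("I really liked that movie!")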

Using the PyTorch layer

Since fastai is built on top of PyTorch, it is also possible to use the PyTorch model inside a fastai Learner directly for inference. Note that when using the PyTorch layer, you will not be able to use fastai Learner features such as .predict() and .get_preds().

To get the PyTorch model, access it via learner.model:

import bentoml

bentoml.pytorch.save_model(
    "my_pytorch_model", learner.model, signatures={"__call__": {"batchable": True}}
)
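
As a minimal sketch, the saved PyTorch model can then be loaded back through a bentoml.pytorch runner. The raw AWD_LSTM classifier expects numericalized token tensors rather than raw strings, so this example reuses the fastai dataloaders for preprocessing (the model name my_pytorch_model follows the snippet above):

import bentoml

runner = bentoml.pytorch.get("my_pytorch_model:latest").to_runner()
runner.init_local()  # for local testing/debugging only

# build a numericalized batch with the fastai dataloaders, then run the raw model
xb = learner.dls.test_dl(["I really liked that movie!"]).one_batch()[0]
preds = runner.run(xb)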

Learn more about using PyTorch with BentoML in the PyTorch guide.

Using GPU

Since fastai does not support GPU inference, BentoML only supports CPU inference for fastai models.

Additionally, if the model uses mixed_precision, the loaded model will be converted to FP32. See fastai's mixed precision documentation to learn more.

If you need to use GPU for inference, you can use the PyTorch layer.

Adaptive batching

fastai's Learner.predict does not support batch input for inference, so BentoML's adaptive batching feature is not available for fastai models.

The default signature has batchable set to False.

If you need to use adaptive batching for inference, you can use the PyTorch layer.