Deploying to Google Cloud Run¶
Google Cloud Run is a fully manged compute platform that automatically scales. It is great alternative to run BentoService that requires more computing power. Cloud Run is serverless. It abstracts away infrastructure management, so you can focus on building service.
This guide demonstrates how to deploy a scikit-learn based iris classifier model with BentoML to Google Cloud Run. The same deployment steps are also applicable for models trained with other machine learning frameworks, see more BentoML examples here.
Prerequisites¶
Google cloud CLI tool
Install instruction: https://cloud.google.com/sdk/docs/install
Docker is installed and running on the machine.
Install instruction: https://docs.docker.com/install
Python 3.6 or above and required packages bentoml and scikit-learn:
pip install bentoml scikit-learn
Create Google cloud project¶
$ gcloud components update
All components are up to date.
$ gcloud projects create irisclassifier-gcloud-run
# Sample output
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/irisclassifier-gcloud-run].
Waiting for [operations/cp.6403723248945195918] to finish...done.
Enabling service [cloudapis.googleapis.com] on project [irisclassifier-gcloud-run]...
Operation "operations/acf.15917ed1-662a-484b-b66a-03259041bf43" finished successfully.
$ gcloud config set project irisclassifier-gcloud-run
Updated property [core/project]
Build and push BentoML model service image to GCP repository¶
Run the example project from the quick start guide to create the BentoML saved bundle for deployment:
git clone git@github.com:bentoml/BentoML.git
pip install -r ./bentoml/guides/quick-start/requirements.txt
python ./bentoml/guides/quick-start/main.py
Verify the saved bundle created:
$ bentoml get IrisClassifier:latest
# Sample output
{
"name": "IrisClassifier",
"version": "20200121141808_FE78B5",
"uri": {
"type": "LOCAL",
"uri": "/Users/bozhaoyu/bentoml/repository/IrisClassifier/20200121141808_FE78B5"
},
"bentoServiceMetadata": {
"name": "IrisClassifier",
"version": "20200121141808_FE78B5",
"createdAt": "2020-01-21T22:18:25.079723Z",
"env": {
"condaEnv": "name: bentoml-IrisClassifier\nchannels:\n- defaults\ndependencies:\n- python=3.7.3\n- pip\n",
"pipDependencies": "bentoml==0.5.8\nscikit-learn",
"pythonVersion": "3.7.3"
},
"artifacts": [
{
"name": "model",
"artifactType": "SklearnModelArtifact"
}
],
"apis": [
{
"name": "predict",
"InputType": "DataframeInput",
"docs": "BentoService API"
}
]
}
}
The BentoML saved bundle created can now be used to start a REST API Server hosting the BentoService and available for sending test request:
# Start BentoML API server:
bentoml serve IrisClassifier:latest
# Send test request:
curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
http://localhost:5000/predict
Use gcloud CLI to build the docker image
# Find the local path of the latest version IrisClassifier saved bundle
$ saved_path=$(bentoml get IrisClassifier:latest --print-location --quiet)
$ cd $saved_path
$ gcloud builds submit --tag gcr.io/irisclassifier-gcloud-run/iris-classifier
# Sample output
Creating temporary tarball archive of 15 file(s) totalling 15.8 MiB before compression.
Uploading tarball of [.] to [gs://irisclassifier-gcloud-run_cloudbuild/source/1587430763.39-03422068242448efbcfc45f2aed218d3.tgz]
Created [https://cloudbuild.googleapis.com/v1/projects/irisclassifier-gcloud-run/builds/9c0f3ef4-11c0-4089-9406-1c7fb9c7e8e8].
Logs are available at [https://console.cloud.google.com/cloud-build/builds/9c0f3ef4-11c0-4089-9406-1c7fb9c7e8e8?project=349498001835]
----------------------------- REMOTE BUILD OUTPUT ------------------------------
...
...
...
DONE
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ID CREATE_TIME DURATION SOURCE IMAGES STATUS
9c0f3ef4-11c0-4089-9406-1c7fb9c7e8e8 2020-04-21T00:59:38+00:00 5M22S gs://irisclassifier-gcloud-run_cloudbuild/source/1587430763.39-03422068242448efbcfc45f2aed218d3.tgz gcr.io/irisclassifier-gcloud-run/iris-classifier (+1 more) SUCCESS
Deploy the image to Google Cloud Run¶
Use your browser, go into the Google Cloud Console, select project irisclassifier-gcloud-run and navigate to the CloudRun page
Click Create Service on the top of the navigation bar
In the Create Cloud Run service page:
Select container image URL from the selection menu, choose allow Unauthenticated invocations from the Authentication section

Expand Show Optional Revision Settings and change Container Port from `8080` to `5000`

After successful deployment, you can fin the service endpoint URL at top of the page.

Validate Google cloud run deployment with sample data¶
Copy the service URL from the screen
$ curl -i \
--header "Content-Type: application/json" \
--request POST \
-d '[[5.1, 3.5, 1.4, 0.2]]' \
https://iris-classifier-7v6yobzlcq-uw.a.run.app/predict
# Sample output
[0]
Clean up deployed service on Google Cloud Run¶
Navigate to the manage resources page in Google Cloud Console.
In the project list, select the project you want to delete and click the delete icon
In the dialog, type the projectID sentiment-gcloud-run and then click Shut down to delete the project.