Deploying to Kubernetes Cluster¶
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications, and has become the de facto standard for running applications in production. Machine learning services can also take advantage of Kubernetes' ability to deploy quickly and scale based on demand.
This guide demonstrates how to serve a scikit-learn based iris classifier model with BentoML on a Kubernetes cluster. The same deployment steps also apply to models trained with other machine learning frameworks; see more BentoML examples here.
Prerequisites¶
Before starting this guide, make sure you have the following:
A Kubernetes cluster.
minikube is the recommended way to run Kubernetes locally: https://kubernetes.io/docs/tasks/tools/install-minikube/
Kubernetes guide: https://kubernetes.io/docs/setup/
kubectl CLI tool
Install instructions: https://kubernetes.io/docs/tasks/tools/install-kubectl/
Docker and Docker Hub are properly configured on your system
Install instructions: https://docs.docker.com/install
Python 3.6 or above and required packages: bentoml and scikit-learn
pip install bentoml scikit-learn
Kubernetes deployment with BentoML¶
Run the example project from the quick start guide to create the BentoML saved bundle for deployment:
git clone git@github.com:bentoml/BentoML.git
pip install -r ./bentoml/guides/quick-start/requirements.txt
python ./bentoml/guides/quick-start/main.py
Verify that the saved bundle was created:
$ bentoml get IrisClassifier:latest
# Sample output
{
"name": "IrisClassifier",
"version": "20200121141808_FE78B5",
"uri": {
"type": "LOCAL",
"uri": "/Users/bozhaoyu/bentoml/repository/IrisClassifier/20200121141808_FE78B5"
},
"bentoServiceMetadata": {
"name": "IrisClassifier",
"version": "20200121141808_FE78B5",
"createdAt": "2020-01-21T22:18:25.079723Z",
"env": {
"condaEnv": "name: bentoml-IrisClassifier\nchannels:\n- defaults\ndependencies:\n- python=3.7.3\n- pip\n",
"pipDependencies": "bentoml==0.5.8\nscikit-learn",
"pythonVersion": "3.7.3"
},
"artifacts": [
{
"name": "model",
"artifactType": "SklearnModelArtifact"
}
],
"apis": [
{
"name": "predict",
"inputType": "DataframeInput",
"docs": "BentoService API"
}
]
}
}
The BentoML saved bundle can now be used to start a REST API server hosting the BentoService, ready to receive test requests:
# Start BentoML API server:
bentoml serve IrisClassifier:latest
# Send test request:
curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
http://localhost:5000/predict
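The same test request can also be sent from Python. The sketch below uses only the standard library and assumes the API server started by bentoml serve is listening on localhost:5000:

```python
import json
import urllib.request

def predict(samples, url="http://localhost:5000/predict"):
    # POST a JSON batch of feature rows to the model server and
    # return the decoded predictions
    body = json.dumps(samples).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the server running:
# predict([[5.1, 3.5, 1.4, 0.2]])
```

Because the DataframeInput API accepts a JSON list of rows, a single request can carry a whole batch of samples.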
Deploy BentoService to Kubernetes¶
BentoML provides a convenient way to containerize the model API server with Docker:
Find the SavedBundle directory with the bentoml get command
Run docker build on the SavedBundle directory, which contains a generated Dockerfile
Run the resulting Docker image to start a container serving the model
# Find the local path of the latest version IrisClassifier saved bundle
saved_path=$(bentoml get IrisClassifier:latest --print-location --quiet)
# Replace {docker_username} with your Docker Hub username
docker build -t {docker_username}/iris-classifier $saved_path
docker push {docker_username}/iris-classifier
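To complete the third step above, the image can be tested locally before pushing by running it as a container that exposes the API server port:

```shell
# Run the model server image locally, mapping port 5000
docker run -p 5000:5000 {docker_username}/iris-classifier

# In another terminal, send the same test request as before
curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \
  http://localhost:5000/predict
```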
The following is an example YAML file specifying the resources required to run and expose a BentoML model server in a Kubernetes cluster. Replace {docker_username} with your Docker Hub username and save the file as iris-classifier.yaml:
# iris-classifier.yaml
apiVersion: v1
kind: Service
metadata:
labels:
app: iris-classifier
name: iris-classifier
spec:
ports:
- name: predict
port: 5000
targetPort: 5000
selector:
app: iris-classifier
type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: iris-classifier
name: iris-classifier
spec:
selector:
matchLabels:
app: iris-classifier
template:
metadata:
labels:
app: iris-classifier
spec:
containers:
- image: {docker_username}/iris-classifier
imagePullPolicy: IfNotPresent
name: iris-classifier
ports:
- containerPort: 5000
Use the kubectl CLI to deploy the model server to the Kubernetes cluster:
kubectl apply -f iris-classifier.yaml
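Deployment progress and the created resources can then be inspected with kubectl:

```shell
# Wait until the Deployment has finished rolling out
kubectl rollout status deployment/iris-classifier

# List the pods and the Service created from the manifest
kubectl get pods -l app=iris-classifier
kubectl get service iris-classifier
```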
Make a prediction with curl:
curl -i \
--request POST \
--header "Content-Type: application/json" \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
http://$(minikube ip):5000/predict
Monitor model server metrics with Prometheus¶
The BentoML API server exposes metrics for Prometheus to scrape, so the deployed model server can be monitored by a Prometheus installation running in the cluster.
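Metrics are served on the API server's /metrics endpoint. How Prometheus discovers scrape targets depends on how it is installed; for installations that use annotation-based discovery, a sketch of the pod-template annotations (added under the Deployment in iris-classifier.yaml) might look like:

```yaml
# Sketch: annotation-based Prometheus scrape hints on the pod template.
# Whether these annotations are honored depends on your Prometheus
# discovery configuration.
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "5000"
```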
Remove deployment¶
kubectl delete -f iris-classifier.yaml