Bento building and deployment#
Bentos are the distribution unit in the BentoML ecosystem, containing all the necessary files for building AI applications, such as models, source code, and dependencies. Once you build a Bento and fully test it, you can deploy it on BentoCloud for better management and scaling. To accelerate the application delivery lifecycle and facilitate the coordination between different team members, you can follow some development best practices.
This page contains a list of best practices for building and deploying Bentos.
Create an automated CI/CD workflow#
Automated Continuous Integration (CI) and Continuous Deployment (CD) workflows streamline the process of integrating code changes from multiple contributors and ensuring they are ready for production. Leveraging CI/CD offers benefits such as faster time-to-market, enhanced code quality, and reduced manual intervention, making it an effective tool for application development.
To set up an automated CI/CD workflow on GitHub for Bento Deployments, consider the following GitHub Actions.
setup-bentoml-action: Bootstraps BentoML and any required tools to run BentoML in a CI process.
build-bento-action: Builds Bentos either from a GitHub repository or from a specified context path.
deploy-bento-action: Builds and pushes a Bento to BentoCloud by default and optionally creates a new Deployment or updates an existing Deployment on BentoCloud.
containerize-push-action: Creates and pushes Bento images to any container registry (for example, Docker Hub, GHCR, ECR, and GCR).
Depending on your specific needs, you might choose a single action or combine several for a comprehensive workflow. Consider this BentoCloud CI/CD example, a text summarization application built on a Transformer model. The example uses some of the GitHub Actions listed above and includes:
models.py: Downloads the model that powers the application from Hugging Face the first time it is run. It retrieves the model tag from the bentofile.yaml file and checks whether a model with that tag exists locally; if not, it pulls the model from BentoCloud.
requirements.txt: Dependencies required to run the application.
bentofile.yaml: Specifies the configurations used to build the Bento.
service.py: Specifies the BentoML Service to define the serving logic and expose the API endpoint.
deployment.json: The JSON configuration used to create or update the application’s Bento Deployment on BentoCloud.
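The "check locally, otherwise pull" behavior described for models.py can be sketched as follows. This is a minimal illustration, not the code from the example repository: `ensure_model`, `exists_locally`, and `pull` are hypothetical names, and the local store and BentoCloud are replaced by in-memory stand-ins so that the snippet runs without BentoML installed. In a real script, the two callables might wrap `bentoml.models.get` and `bentoml.models.pull`; that mapping is an assumption.

```python
from typing import Callable


def ensure_model(
    tag: str,
    exists_locally: Callable[[str], bool],
    pull: Callable[[str], None],
) -> str:
    """Return "local" if the tagged model is already present, else pull it."""
    if exists_locally(tag):
        return "local"
    pull(tag)
    return "pulled"


# In-memory stand-ins (hypothetical) for the local Model Store and BentoCloud.
local_store = {"summarizer:v1"}


def fake_exists(tag: str) -> bool:
    return tag in local_store


def fake_pull(tag: str) -> None:
    # A real implementation would download the model here.
    local_store.add(tag)


print(ensure_model("summarizer:v1", fake_exists, fake_pull))
print(ensure_model("summarizer:v2", fake_exists, fake_pull))
```

Passing the lookup and pull operations in as callables keeps the decision logic testable in CI without network access.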
Two automated workflows, Build Bento and Deploy to BentoCloud, are available to simplify the Bento building and deployment processes respectively. The following diagram presents the general workflow from model training to cloud deployment.
The Build Bento workflow is designed to automate the process of building a Bento whenever there are changes pushed to the main branch, excluding those made to the deployment.json file. Specifically, this workflow does the following with the help of deploy-bento-action:
Pull the model specified in bentofile.yaml from BentoCloud. This is done by running bentoml models pull, which pulls the model specified in bentofile.yaml if the model does not exist locally. If you have a new model version, make sure it has already been uploaded to BentoCloud using bentoml models push; you can then update models.py to trigger the workflow.
Build a new Bento based on the new model or the new configuration.
Push this new Bento to BentoCloud.
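The three steps above correspond roughly to the following command sequence. The sketch only assembles and prints the commands (a dry run); `build_workflow_commands` and the Bento tag are hypothetical, and in the example repository these steps are carried out by deploy-bento-action rather than by a script like this.

```python
import shlex
from typing import List


def build_workflow_commands(bento_tag: str) -> List[List[str]]:
    """Return the CLI calls for the pull, build, and push steps, in order."""
    return [
        # Pull the model listed in bentofile.yaml if it is missing locally.
        ["bentoml", "models", "pull"],
        # Build a new Bento from the build configuration in bentofile.yaml.
        ["bentoml", "build"],
        # Push the freshly built Bento to BentoCloud.
        ["bentoml", "push", bento_tag],
    ]


# Dry run: print the commands instead of executing them.
for cmd in build_workflow_commands("summarization:latest"):
    print(shlex.join(cmd))
```

To execute the steps for real, each command list could be passed to `subprocess.run(cmd, check=True)` in an environment where BentoML is installed and logged in to BentoCloud.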
As the deployment option is set to skip, this workflow does not create or update any Deployment on BentoCloud. To do so, create a separate pull request to update the Bento tag in deployment.json, which triggers the “Deploy to BentoCloud” workflow.
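Updating the Bento tag in deployment.json can be scripted before opening the pull request. The following is a minimal sketch under stated assumptions: the helper `set_bento_tag` is hypothetical, and so is the `bento` key, because the actual schema of deployment.json is not shown here. Adjust the key to match your file.

```python
import json
import tempfile
from pathlib import Path


def set_bento_tag(config_path: Path, new_tag: str, key: str = "bento") -> bool:
    """Store new_tag under `key`; return True only if the file was changed."""
    config = json.loads(config_path.read_text())
    if config.get(key) == new_tag:
        return False  # Nothing to commit, so no workflow needs to be triggered.
    config[key] = new_tag
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return True


# Demonstration against a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "deployment.json"
    path.write_text(json.dumps({"name": "demo", "bento": "summarization:v1"}))
    changed = set_bento_tag(path, "summarization:v2")
    print(changed, json.loads(path.read_text())["bento"])
```

Returning whether the file changed lets a calling script skip the commit when the tag is already current.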
Deploy to BentoCloud#
The Deploy to BentoCloud workflow is designed to automate the deployment process on BentoCloud whenever there’s a change to the deployment.json file in the repository.
You can use it to update your Deployment’s configuration (for example, the maximum and the minimum number of replicas allowed for scaling).
Make sure the Bento Deployment specified in the deployment.json file is already created on BentoCloud when you use this workflow. The workflow only updates an existing Deployment; it does not create a new one.
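The trigger rules of the two workflows can be summarized as a small decision function. This is a sketch for reasoning about the rules, not the workflow definitions themselves: `select_workflows` and the workflow labels are hypothetical, and the sketch assumes that the deploy workflow is not restricted to a particular branch, which the description above does not specify.

```python
from typing import Iterable, Set

DEPLOYMENT_CONFIG = "deployment.json"


def select_workflows(branch: str, changed_files: Iterable[str]) -> Set[str]:
    """Return the workflows that a push with these changed files would start."""
    files = set(changed_files)
    workflows: Set[str] = set()
    # Build Bento: pushes to main that touch anything other than deployment.json.
    if branch == "main" and any(f != DEPLOYMENT_CONFIG for f in files):
        workflows.add("build-bento")
    # Deploy to BentoCloud: any change to deployment.json.
    if DEPLOYMENT_CONFIG in files:
        workflows.add("deploy-to-bentocloud")
    return workflows


print(select_workflows("main", ["service.py"]))
print(select_workflows("main", ["deployment.json"]))
```

Note that a single push touching both service.py and deployment.json would start both workflows under these rules, which is one reason to update the Bento tag in a separate pull request.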
We recommend you carefully consider the conditions that will trigger deployments and updates to Bentos on BentoCloud when designing an automated CI/CD workflow. Factors such as file modifications or branch changes can serve as deployment triggers. See Plan your deployment strategy wisely to learn more.
Use orchestration tools#
Machine learning models often need to evolve over time, especially when they’re exposed to new or changing data. Regular retraining ensures your models remain accurate and relevant.
For seamless integration, consider an orchestration job that updates the model tag in the bentofile.yaml file. Once this modification is committed to the code repository, it can trigger the pipeline described earlier, ensuring that your application always uses the most up-to-date model.
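An orchestration job could rewrite the model tag with a few lines of Python after each retraining run. The sketch below assumes that the models section of bentofile.yaml lists tags as `- "name:version"` entries; the function `update_model_tag`, the model name, and the sample file contents are hypothetical. It uses a regular expression rather than a YAML parser to stay dependency-free and to leave the rest of the file untouched.

```python
import re


def update_model_tag(bentofile_text: str, model_name: str, new_version: str) -> str:
    """Replace the version of `model_name` in list entries like - "name:old"."""
    pattern = re.compile(
        r"^(?P<prefix>\s*-\s*[\"']?)" + re.escape(model_name) + r":[^\"'\s]+",
        flags=re.MULTILINE,
    )
    # A function replacement avoids backreference surprises in new_version.
    return pattern.sub(
        lambda m: f"{m.group('prefix')}{model_name}:{new_version}",
        bentofile_text,
    )


# Made-up bentofile.yaml contents for illustration.
sample = 'service: "service:svc"\nmodels:\n  - "summarization-model:v1"\n'
print(update_model_tag(sample, "summarization-model", "v2"))
```

The job would then write the result back to bentofile.yaml and commit it, which starts the build pipeline described earlier.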
You can also integrate unit and end-to-end tests into the CI process so that they run once new Bentos are built. See Test Bentos to learn more.
Plan your deployment strategy wisely#
Effective deployment is more than just pushing changes. You may also want to select the right strategy that aligns with your team’s workflow, your application’s requirements, and the overall project lifecycle. In terms of deployment automation, consider the following:
Automatic deployment with each commit to main. Deploying Bentos automatically with each commit to the main branch ensures that your application is always up-to-date with the latest changes. This strategy is beneficial for projects that have rigorous testing in place and require frequent updates. It guarantees that once changes pass all checks and reviews, they’re immediately deployed, ensuring your users always have access to the latest features and fixes.
Semi-automatic deployment. This option provides a balance between automation and manual oversight. Deployments are triggered based on specific conditions, giving teams more control over when and what gets deployed. Semi-automatic deployment triggers include:
Changes to deployment.json: Deployments can be triggered by changes to the deployment.json file, ensuring that only updates to the configuration lead to a new deployment.
New pushes to the deployment branch: Instead of deploying from the main branch, consider a dedicated deployment branch. Changes are merged into this branch when they are ready for deployment, separating feature development from the deployment process.
Release tags: Deployments can be triggered based on release tags. This approach is especially useful for version-controlled applications. When a new version is ready, a tag is created, prompting the deployment process. This ensures that only stable, vetted versions of the application get deployed.
Manual CI job triggers: Sometimes, the best approach is human judgment. Allowing team members to manually trigger deployment jobs ensures that deployments only happen when the team is confident about the changes. This method is particularly useful during critical periods or when deploying significant or potentially disruptive updates.
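To compare the strategies above, it can help to write each one down as an explicit rule. The following sketch is illustrative only: `Strategy`, `Event`, and `should_deploy` are hypothetical names, and the event model (push, tag, manual) is a simplification of what a CI system actually reports.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Strategy(Enum):
    EVERY_COMMIT_TO_MAIN = "every-commit-to-main"
    DEPLOYMENT_FILE = "deployment-file"
    DEPLOYMENT_BRANCH = "deployment-branch"
    RELEASE_TAG = "release-tag"
    MANUAL = "manual"


@dataclass(frozen=True)
class Event:
    kind: str                            # "push", "tag", or "manual"
    ref: str = ""                        # branch or tag name, if any
    changed_files: Tuple[str, ...] = ()  # files touched by a push


def should_deploy(strategy: Strategy, event: Event) -> bool:
    """Decide whether `event` triggers a deployment under `strategy`."""
    if strategy is Strategy.EVERY_COMMIT_TO_MAIN:
        return event.kind == "push" and event.ref == "main"
    if strategy is Strategy.DEPLOYMENT_FILE:
        return event.kind == "push" and "deployment.json" in event.changed_files
    if strategy is Strategy.DEPLOYMENT_BRANCH:
        return event.kind == "push" and event.ref == "deployment"
    if strategy is Strategy.RELEASE_TAG:
        return event.kind == "tag"
    # Strategy.MANUAL: only an explicit human trigger deploys.
    return event.kind == "manual"


push_to_main = Event(kind="push", ref="main", changed_files=("service.py",))
print(should_deploy(Strategy.EVERY_COMMIT_TO_MAIN, push_to_main))
print(should_deploy(Strategy.DEPLOYMENT_FILE, push_to_main))
```

Writing the rules this way makes it easy to see which events each strategy ignores, for example that a push to main never deploys under the release-tag strategy.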