Deployment details#
Deployment details refer to the properties of a Bento Deployment, such as its metadata, status, monitoring metrics, and revision records. You set some of these properties when you create the Deployment, and you also have the option to edit some of them as needed after creation.
For more information on how to view these properties, see View Deployment details.
Overview#
Overview#
Property |
Description |
---|---|
Name |
The name of the Deployment, serving as a unique identifier for the Bento Deployment on BentoCloud. |
URL |
The endpoint at which the Bento Deployment is accessible. |
Bento |
The specific Bento package being used for this Deployment, which includes the model, source code, and dependencies. |
Endpoint Access Type |
The level of access allowed to the endpoint, including Public and Protected. |
Description |
A brief explanation of the Bento Deployment. |
Creator |
The user who created this Bento Deployment. |
Created at |
The date and time when the Bento Deployment was initially created. |
Component Status#
Property |
Description |
---|---|
Name |
The name of the component, such as API Server or a specific Runner. |
Instance Type |
The classification of the cloud instance that the component is using. |
Resources |
The allocated computing resources for the component, such as CPU and memory. |
Status |
The current operational status of the component, indicated by one of the status lights:
|
Events#
This section displays a chronological record of system events related to the Deploymentâs operation on BentoCloud. Each entry logs the actions performed by the system, such as scaling, component status changes, and updates to the Bento configurations. These logs are instrumental for auditing, debugging, and understanding the behavior of the Deployment over time.
You can search for event keywords in the search bar.
Note
The Events section is also available on the Replicas and Revisions tabs.
Replicas#
Property |
Description |
---|---|
Group |
The group type the replica belongs to, including API Server and Runner. |
Name |
The name of the API Server or Runner replica. |
Status |
The status of the replica, indicating whether the replica is running or if it needs your attention. Possible statuses include |
Cluster |
The name of the cloud cluster where the replica is running. |
Node |
The specific node within the cluster where the replica is running. |
Start Time |
The time when the replica was deployed. |
Operation |
Operations that you can perform:
|
Logging#
Property |
Description |
---|---|
Advanced Search |
Search for logs using advanced filters. It supports regular expressions. |
Component |
The component whose logs need to be displayed. |
Max Lines |
The maximum number of log entries displayed. |
Logs volume |
A histogram that displays the volume of log entries over a selected period, providing a visual representation of the activity levels and potential anomalies related to the Deploymentâs operation. |
Time |
The timestamp of when each log entry is recorded. |
Unique labels |
Labels that provide context for the logs, such as the node, cluster, and Deployment name. |
Wrap lines |
Wrap long lines for easier readability. |
Prettify JSON |
Formats JSON log entries to be more human-readable. |
Deduplication |
Filters out duplicate log entries. Options including âNoneâ, âExactâ, âNumbersâ and âSignatureâ provide different methods to identify and remove repeated logs. |
Display results |
Sort log entries chronologically, either from the newest or the oldest. |
Download |
Download logs in text or JSON format. |
Note
The Logging tab integrates Grafana Loki. For more information, see the Loki documentation.
Monitoring#
Property |
Description |
---|---|
Number of Replicas |
Displays the count of active replicas in the Deployment. It provides a detailed count for both the total number of replicas and the individual number of API Servers and Runners. |
Request Percentile Response Times |
Displays the response time across different percentiles (for example, P95), giving insight into the range of response speeds that users may experience. It includes the total response time as well as the response time by API endpoints. |
Request Volume |
Tracks the number of requests made to the Deployment over time, which is helpful in understanding the demand and traffic patterns. It includes the total request volume as well as the volume by API endpoints. |
Request Success Rate (non-4|5xx responses) |
Indicates the proportion of requests that are successfully processed without any client-side (4xx) or server-side (5xx) errors. It includes the total success rate as well as the success rate by API endpoints. |
CPU Usage |
Displays the CPU utilization of the Deployment, which includes the total usage as well as the usage by individual components. |
Memory Usage |
Displays the amount of memory being used over time, which includes the total usage as well as the usage by individual components. |
GPU Usage |
Displays the GPU utilization of the Deployment, which includes the total usage as well as the usage by individual components. |
GPU Memory Usage |
Displays the memory usage on the GPU, offering insights into how memory-intensive the GPU tasks are. It includes the total usage as well as the usage by individual components. |
GPU Memory Bandwidth Usage |
Displays the bandwidth usage of the GPU memory, providing data on how intensively the memory is being accessed and at what rate data is being transferred. It includes the total usage as well as the usage by individual components. |
Revisions#
Property |
Description |
---|---|
ID |
A unique identifier assigned to each revision of the Deployment. |
Deployment Targets |
The Bento used to create the Deployment. |
Creator |
The user who created this Bento Deployment. |
Created at |
The date and time when the Bento Deployment was initially created. |
Operation |
Actions that can be performed on each revision. For example, you can roll back your Deployment to a specific version. |