Deploying to AWS ECS (Elastic Container Service)

AWS ECS (Elastic Container Service) is a fully managed container orchestration service. With AWS Fargate, a serverless compute engine for containers, ECS provides the benefits of AWS Lambda without sacrificing computing performance. It is a good fit for more advanced ML prediction services that require more computing power than AWS Lambda offers, while still taking advantage of the benefits that AWS Lambda brings.

Prerequisites

  1. An active AWS account, with the AWS CLI installed and configured on the machine

  2. Docker installed and running on the machine

  3. The AWS ECS CLI tool (ecs-cli)

AWS ECS deployment with BentoML

We will walk through deploying a BentoService to ECS, validating the result with sample data, and then removing the service and cleaning up the AWS resources.

We will use the IrisClassifier BentoService from the getting started guide (https://docs.bentoml.org/en/latest/quickstart.html).

> bentoml get IrisClassifier:20200121141808_FE78B5

{
  "name": "IrisClassifier",
  "version": "20200121141808_FE78B5",
  "uri": {
    "type": "LOCAL",
    "uri": "/Users/bozhaoyu/bentoml/repository/IrisClassifier/20200121141808_FE78B5"
  },
  "bentoServiceMetadata": {
    "name": "IrisClassifier",
    "version": "20200121141808_FE78B5",
    "createdAt": "2020-01-21T22:18:25.079723Z",
    "env": {
      "condaEnv": "name: bentoml-IrisClassifier\nchannels:\n- defaults\ndependencies:\n- python=3.7.3\n- pip\n",
      "pipDependencies": "bentoml==0.5.8\nscikit-learn",
      "pythonVersion": "3.7.3"
    },
    "artifacts": [
      {
        "name": "model",
        "artifactType": "SklearnModelArtifact"
      }
    ],
    "apis": [
      {
        "name": "predict",
        "handlerType": "DataframeHandler",
        "docs": "BentoService API"
      }
    ]
  }
}
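The uri field in this output is the local directory that the docker build step uses later. As a minimal sketch, assuming the JSON printed by bentoml get has been captured as a string, the path could be read in Python like this. The helper name saved_bundle_path is hypothetical and not part of BentoML.

```python
import json


def saved_bundle_path(bentoml_get_output: str) -> str:
    """Return the local directory of a saved BentoService bundle.

    Hypothetical helper: it only reads the "uri" object shown in the
    `bentoml get` output above and is not part of the BentoML API.
    """
    info = json.loads(bentoml_get_output)
    uri = info["uri"]
    if uri["type"] != "LOCAL":
        raise ValueError(f"expected a LOCAL bundle, got {uri['type']}")
    return uri["uri"]


# Trimmed copy of the output above, keeping only the fields the helper reads.
sample_get_output = """
{
  "name": "IrisClassifier",
  "version": "20200121141808_FE78B5",
  "uri": {
    "type": "LOCAL",
    "uri": "/Users/bozhaoyu/bentoml/repository/IrisClassifier/20200121141808_FE78B5"
  }
}
"""

print(saved_bundle_path(sample_get_output))
```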

Build and push docker image to AWS ECR (Elastic Container Registry)

Docker login with AWS ECR

> aws ecr get-login --region us-west-2 --no-include-email

docker login -u AWS -p eyJ.................OOH https://account_id.dkr.ecr.us-west-2.amazonaws.com

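Copy the output of the previous step and run it in the terminal. (The get-login subcommand exists only in AWS CLI version 1; version 2 replaces it with aws ecr get-login-password, whose output is piped to docker login --password-stdin.)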

> docker login -u AWS -p eyJ.................OOH https://account_id.dkr.ecr.us-west-2.amazonaws.com

Login Succeeded

Create AWS ECR repository

> aws ecr create-repository --repository-name irisclassifier-ecs

{
    "repository": {
        "repositoryArn": "arn:aws:ecr:us-west-2:192023623294:repository/irisclassifier-ecs",
        "registryId": "192023623294",
        "repositoryName": "irisclassifier-ecs",
        "repositoryUri": "192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs",
        "createdAt": 1576542447.0,
        "imageTagMutability": "MUTABLE",
        "imageScanningConfiguration": {
            "scanOnPush": false
        }
    }
}

Build the docker image, tagged with the repositoryUri from the previous step

> cd /Users/bozhaoyu/bentoml/repository/IrisClassifier/20200121141808_FE78B5
> docker build . --tag=192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs

Step 1/12 : FROM continuumio/miniconda3:4.7.12
...
...
...
Successfully built 19d21c608b08
Successfully tagged 192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs:latest
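The image tag follows the repositoryUri pattern returned by create-repository: account ID, region, and repository name. A small sketch of that naming scheme, for illustration only, is shown here. The helper ecr_image_uri is hypothetical; the latest tag is the one docker applied in the build output above because no tag was given.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Compose an ECR image reference in the form used by the docker tag above."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Values taken from the create-repository output in this guide.
image = ecr_image_uri("192023623294", "us-west-2", "irisclassifier-ecs")
print(image)
```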

Push the built docker image to AWS ECR

> docker push 192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs

The push refers to repository [192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs]
...
...
785a656a85507b3717c83e8a1d4c901605c4fa301364c7c18fc30346 size: 2225

Prepare AWS for ECS deployment

Set up IAM role

Create task-execution-assume-role.json

> cat task-execution-assume-role.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
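If you prefer to generate the file rather than type it by hand, the following sketch writes the same trust policy with Python's standard library. The function names are hypothetical; the policy content is copied from the JSON above.

```python
import json
from pathlib import Path


def ecs_task_assume_role_policy() -> dict:
    """Trust policy that lets ECS tasks assume the role (same content as above)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def write_policy(path: Path) -> Path:
    """Write the policy as indented JSON and return the path written."""
    path.write_text(json.dumps(ecs_task_assume_role_policy(), indent=2) + "\n")
    return path


if __name__ == "__main__":
    print(write_policy(Path("task-execution-assume-role.json")))
```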

Create IAM role

> aws iam --region us-west-2 create-role --role-name ecsTaskExecutionRole \
  --assume-role-policy-document file://task-execution-assume-role.json

{
    "Role": {
        "Path": "/",
        "RoleName": "ecsTaskExecutionRole",
        "RoleId": "AROASZNL76Z7C7Q7SZJ4D",
        "Arn": "arn:aws:iam::192023623294:role/ecsTaskExecutionRole",
        "CreateDate": "2019-12-17T01:04:08Z",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ecs-tasks.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }
    }
}

Attach the AmazonECSTaskExecutionRolePolicy managed policy to the role

> aws iam --region us-west-2 attach-role-policy --role-name ecsTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

Configure ECS CLI

Create ECS CLI profile

> ecs-cli configure profile --access-key AWS_ACCESS_KEY_ID --secret-key AWS_SECRET_ACCESS_KEY --profile-name tutorial-profile

Create ECS cluster profile configuration

> ecs-cli configure --cluster tutorial --default-launch-type FARGATE --config-name tutorial --region us-west-2

Prepare ECS cluster for deployment

Start the ECS cluster with the ECS CLI profile we created in the earlier step

> ecs-cli up --cluster-config tutorial --ecs-profile tutorial-profile

INFO[0001] Created cluster                               cluster=tutorial region=us-west-2
INFO[0002] Waiting for your cluster resources to be created...
INFO[0002] Cloudformation stack status                   stackStatus=CREATE_IN_PROGRESS
INFO[0063] Cloudformation stack status                   stackStatus=CREATE_IN_PROGRESS
VPC created: vpc-0465d14ba04402f80
Subnet created: subnet-0d23851806f3db403
Subnet created: subnet-0dece5451f1a3b8b2
Cluster creation succeeded.
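The VPC and subnet IDs printed here are needed again in the following steps. A minimal sketch for pulling them out of captured output, assuming the "VPC created:" and "Subnet created:" line format shown above (parse_cluster_resources is a hypothetical helper, not part of ecs-cli):

```python
import re


def parse_cluster_resources(ecs_cli_up_output: str) -> dict:
    """Extract the VPC ID and subnet IDs from `ecs-cli up` output.

    Assumes the "VPC created:" and "Subnet created:" line format shown above.
    """
    vpc = re.search(r"^VPC created: (vpc-[0-9a-f]+)$", ecs_cli_up_output, re.MULTILINE)
    subnets = re.findall(r"^Subnet created: (subnet-[0-9a-f]+)$", ecs_cli_up_output, re.MULTILINE)
    if vpc is None:
        raise ValueError("no 'VPC created:' line found")
    return {"vpc_id": vpc.group(1), "subnet_ids": subnets}


# The last four lines of the output above.
sample_up_output = """\
VPC created: vpc-0465d14ba04402f80
Subnet created: subnet-0d23851806f3db403
Subnet created: subnet-0dece5451f1a3b8b2
Cluster creation succeeded.
"""

resources = parse_cluster_resources(sample_up_output)
print(resources)
```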

Use the VPC ID from the previous command to get the security group ID

> aws ec2 describe-security-groups --filters Name=vpc-id,Values=vpc-0465d14ba04402f80 \
  --region us-west-2

{
    "SecurityGroups": [
        {
            "Description": "default VPC security group",
            "GroupName": "default",
            "IpPermissions": [
                {
                    "IpProtocol": "-1",
                    "IpRanges": [],
                    "Ipv6Ranges": [],
                    "PrefixListIds": [],
                    "UserIdGroupPairs": [
                        {
                            "GroupId": "sg-0258b891f053e077b",
                            "UserId": "192023623294"
                        }
                    ]
                }
            ],
            "OwnerId": "192023623294",
            "GroupId": "sg-0258b891f053e077b",
            "IpPermissionsEgress": [
                {
                    "IpProtocol": "-1",
                    "IpRanges": [
                        {
                            "CidrIp": "0.0.0.0/0"
                        }
                    ],
                    "Ipv6Ranges": [],
                    "PrefixListIds": [],
                    "UserIdGroupPairs": []
                }
            ],
            "VpcId": "vpc-0465d14ba04402f80"
        }
    ]
}
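Only the GroupId value is needed from this response. As a sketch, assuming the JSON has been captured as a string, it could be extracted as follows (security_group_ids is a hypothetical helper):

```python
import json


def security_group_ids(describe_output: str, vpc_id: str) -> list:
    """Return GroupId values of the security groups that belong to vpc_id."""
    groups = json.loads(describe_output)["SecurityGroups"]
    return [g["GroupId"] for g in groups if g.get("VpcId") == vpc_id]


# Trimmed copy of the response above, keeping only the fields the helper reads.
sample_sg_output = """
{
  "SecurityGroups": [
    {
      "GroupName": "default",
      "GroupId": "sg-0258b891f053e077b",
      "VpcId": "vpc-0465d14ba04402f80"
    }
  ]
}
"""

print(security_group_ids(sample_sg_output, "vpc-0465d14ba04402f80"))
```

The AWS CLI's own --query option can do similar filtering directly on the command line.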

Use the security group ID from the previous command to allow inbound traffic on port 5000

> aws ec2 authorize-security-group-ingress --group-id sg-0258b891f053e077b --protocol tcp \
  --port 5000 --cidr 0.0.0.0/0 --region us-west-2

Deploying BentoService to ECS cluster

Create a docker-compose.yaml file, using the image tag from the previous steps

version: '3'
services:
  web:
    image: 192023623294.dkr.ecr.us-west-2.amazonaws.com/irisclassifier-ecs
    ports:
      - "5000:5000"
    logging:
      driver: awslogs
      options:
        awslogs-group: irisclassifier-aws-ecs
        awslogs-region: us-west-2
        awslogs-stream-prefix: web

Create ecs-params.yaml with the subnet IDs printed when starting the ECS cluster and the security group ID from the describe-security-groups output

version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 0.5GB
    cpu_limit: 256
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - subnet-0d23851806f3db403
        - subnet-0dece5451f1a3b8b2
      security_groups:
        - sg-0258b891f053e077b
      assign_public_ip: ENABLED
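Because the subnet and security group IDs differ for every cluster, it can be convenient to fill them in programmatically. The sketch below renders the same file layout with plain string formatting; render_ecs_params is a hypothetical helper, and the default task size simply mirrors the values above.

```python
def render_ecs_params(subnets, security_groups, mem_limit="0.5GB", cpu_limit=256):
    """Render ecs-params.yaml text matching the layout shown above."""
    lines = [
        "version: 1",
        "task_definition:",
        "  task_execution_role: ecsTaskExecutionRole",
        "  ecs_network_mode: awsvpc",
        "  task_size:",
        f"    mem_limit: {mem_limit}",
        f"    cpu_limit: {cpu_limit}",
        "run_params:",
        "  network_configuration:",
        "    awsvpc_configuration:",
        "      subnets:",
    ]
    lines += [f"        - {subnet}" for subnet in subnets]
    lines.append("      security_groups:")
    lines += [f"        - {group}" for group in security_groups]
    lines.append("      assign_public_ip: ENABLED")
    return "\n".join(lines) + "\n"


ecs_params = render_ecs_params(
    subnets=["subnet-0d23851806f3db403", "subnet-0dece5451f1a3b8b2"],
    security_groups=["sg-0258b891f053e077b"],
)
print(ecs_params)
```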

After creating ecs-params.yaml, we can deploy our BentoService to the ECS cluster

> ecs-cli compose --project-name tutorial-bentoml-ecs service up --create-log-groups \
  --cluster-config tutorial --ecs-profile tutorial-profile


INFO[0000] Using ECS task definition                     TaskDefinition="tutorial-bentoml-ecs:1"
WARN[0001] Failed to create log group sentiment-aws-ecs in us-west-2: The specified log group already exists
INFO[0001] Updated ECS service successfully              desiredCount=1 force-deployment=false service=tutorial-bentoml-ecs
INFO[0017] (service tutorial-bentoml-ecs) has started 1 tasks: (task ecd119f0-b159-42e6-b86c-e6a62242ce7a).  timestamp="2019-12-17 01:05:23 +0000 UTC"
INFO[0094] Service status                                desiredCount=1 runningCount=1 serviceName=tutorial-bentoml-ecs
INFO[0094] (service tutorial-bentoml-ecs) has reached a steady state.  timestamp="2019-12-17 01:06:40 +0000 UTC"
INFO[0094] ECS Service has reached a stable state        desiredCount=1 runningCount=1 serviceName=tutorial-bentoml-ecs

Now that the service is created, we can use the ecs-cli compose service ps command to check the service’s status

> ecs-cli compose --project-name tutorial-bentoml-ecs service ps \
  --cluster-config tutorial --ecs-profile tutorial-profile

Name                                      State    Ports                        TaskDefinition          Health
ecd119f0-b159-42e6-b86c-e6a62242ce7a/web  RUNNING  34.212.49.46:5000->5000/tcp  tutorial-bentoml-ecs:1  UNKNOWN
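The Ports column holds the public address to send requests to. A rough sketch for reading it from captured output, assuming the whitespace-separated column layout shown above (running_endpoints is a hypothetical helper, not part of ecs-cli):

```python
import re


def running_endpoints(ps_output: str) -> list:
    """Return host:port strings for RUNNING tasks in `service ps` output.

    Assumes the column layout shown above, where the Ports column looks
    like "34.212.49.46:5000->5000/tcp".
    """
    endpoints = []
    for line in ps_output.splitlines():
        columns = line.split()
        if len(columns) >= 3 and columns[1] == "RUNNING":
            match = re.match(r"(\d+\.\d+\.\d+\.\d+:\d+)->", columns[2])
            if match:
                endpoints.append(match.group(1))
    return endpoints


sample_ps_output = """\
Name                                      State    Ports                        TaskDefinition          Health
ecd119f0-b159-42e6-b86c-e6a62242ce7a/web  RUNNING  34.212.49.46:5000->5000/tcp  tutorial-bentoml-ecs:1  UNKNOWN
"""

print(running_endpoints(sample_ps_output))
```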

Testing ECS service with sample data

> curl -i \
  --request POST \
  --header "Content-Type: application/json" \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \
  http://34.212.49.46:5000/predict

[0]
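The same request can be made from Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and the endpoint is the public address reported by the ps command above, which will differ for your deployment.

```python
import json
import urllib.request


def build_predict_request(endpoint: str, rows: list) -> urllib.request.Request:
    """Build the same POST request as the curl command above, without sending it."""
    return urllib.request.Request(
        url=f"http://{endpoint}/predict",
        data=json.dumps(rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(endpoint: str, rows: list):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_predict_request(endpoint, rows), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_predict_request("34.212.49.46:5000", [[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)
```

Calling predict with the same arguments sends the request; it requires the ECS service to be running, so it is not invoked in the sketch itself.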

Clean up AWS ECS Deployment

Delete the service on AWS ECS

> ecs-cli compose --project-name tutorial-bentoml-ecs service down --cluster-config tutorial \
  --ecs-profile tutorial-profile

INFO[0000] Updated ECS service successfully              desiredCount=0 force-deployment=false service=tutorial-bentoml-ecs
INFO[0000] Service status                                desiredCount=0 runningCount=1 serviceName=tutorial-bentoml-ecs
INFO[0016] Service status                                desiredCount=0 runningCount=0 serviceName=tutorial-bentoml-ecs
INFO[0016] (service tutorial-bentoml-ecs) has stopped 1 running tasks: (task ecd119f0-b159-42e6-b86c-e6a62242ce7a).  timestamp="2019-12-17 01:15:37 +0000 UTC"
INFO[0016] ECS Service has reached a stable state        desiredCount=0 runningCount=0 serviceName=tutorial-bentoml-ecs
INFO[0016] Deleted ECS service                           service=tutorial-bentoml-ecs
INFO[0016] ECS Service has reached a stable state        desiredCount=0 runningCount=0 serviceName=tutorial-bentoml-ecs

Shutting down the AWS ECS cluster

> ecs-cli down --force --cluster-config tutorial --ecs-profile tutorial-profile

INFO[0001] Waiting for your cluster resources to be deleted...
INFO[0001] Cloudformation stack status                   stackStatus=DELETE_IN_PROGRESS
INFO[0062] Deleted cluster                               cluster=tutorial