Create
Prerequisites
Install and run Docker on your machine.
Subscribe to the AMI with GPU support (for GPU clusters).
Create an IAM user with AdministratorAccess and programmatic access (a CLI sketch follows this list).
You may need to request limit increases for your desired instance types.
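If you prefer to script the IAM setup, a minimal sketch with the AWS CLI is shown below; the user name cortex-admin is an arbitrary example, and the access key printed by the last command is what you use to give the Cortex CLI programmatic access.
# create the IAM user (the name "cortex-admin" is just an example)
aws iam create-user --user-name cortex-admin
# attach the AdministratorAccess managed policy
aws iam attach-user-policy \
  --user-name cortex-admin \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
# create an access key for programmatic access (note the AccessKeyId and SecretAccessKey in the output)
aws iam create-access-key --user-name cortex-admin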
Create a cluster on your AWS account
# install the CLI
pip install cortex
# create a cluster
cortex cluster up cluster.yaml
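After the cluster comes up, the same CLI can inspect and delete it via cortex cluster info and cortex cluster down. How the cluster is identified (config file vs. --name/--region flags) varies by CLI version, so treat these lines as a sketch:
# print the cluster's status and endpoints
cortex cluster info cluster.yaml
# spin the cluster down and delete the AWS resources it created
cortex cluster down cluster.yaml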
cluster.yaml
# cluster name
cluster_name: cortex
# AWS region
region: us-east-1
# list of availability zones for your region
availability_zones: # default: 3 random availability zones in your region, e.g. [us-east-1a, us-east-1b, us-east-1c]
# list of cluster node groups; the lower the index, the higher the node group's priority
node_groups:
- name: ng-cpu # name of the node group
instance_type: m5.large # instance type
min_instances: 1 # minimum number of instances
max_instances: 5 # maximum number of instances
instance_volume_size: 50 # disk storage size per instance (GB)
instance_volume_type: gp2 # instance volume type [gp2 | io1 | st1 | sc1]
# instance_volume_iops: 3000 # instance volume iops (only applicable to io1)
spot: false # whether to use spot instances
- name: ng-gpu
instance_type: g4dn.xlarge
min_instances: 1
max_instances: 5
instance_volume_size: 50
instance_volume_type: gp2
# instance_volume_iops: 3000
spot: false
...
# subnet visibility [public (instances will have public IPs) | private (instances will not have public IPs)]
subnet_visibility: public
# NAT gateway (required when using private subnets) [none | single | highly_available (a NAT gateway per availability zone)]
nat_gateway: none
# API load balancer scheme [internet-facing | internal]
api_load_balancer_scheme: internet-facing
# operator load balancer scheme [internet-facing | internal]
# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator
operator_load_balancer_scheme: internet-facing
# to install Cortex in an existing VPC, you can provide a list of subnets for your cluster to use
# subnet_visibility (specified above in this file) must match your subnets' visibility
# this is an advanced feature (not recommended for first-time users) and requires your VPC to be configured correctly; see https://eksctl.io/usage/vpc-networking/#use-existing-vpc-other-custom-configuration
# here is an example:
# subnets:
# - availability_zone: us-west-2a
# subnet_id: subnet-060f3961c876872ae
# - availability_zone: us-west-2b
# subnet_id: subnet-0faed05adf6042ab7
# restrict access to APIs by cidr blocks/ip address ranges
api_load_balancer_cidr_white_list: [0.0.0.0/0]
# restrict access to the Operator by cidr blocks/ip address ranges
operator_load_balancer_cidr_white_list: [0.0.0.0/0]
# additional tags to assign to AWS resources (all resources will automatically be tagged with cortex.dev/cluster-name: <cluster_name>)
tags: # <string>: <string> map of key/value pairs
# SSL certificate ARN (only necessary when using a custom domain)
ssl_certificate_arn:
# List of IAM policies to attach to your Cortex APIs
iam_policy_arns: ["arn:aws:iam::aws:policy/AmazonS3FullAccess"]
# primary CIDR block for the cluster's VPC
vpc_cidr: 192.168.0.0/16
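For reference, here is a minimal sketch of a private-cluster configuration that uses only keys documented above; the values are illustrative, and with internal load balancers the VPC Peering note above applies.
# minimal private-cluster sketch (illustrative values)
cluster_name: cortex
region: us-east-1
node_groups:
  - name: ng-cpu
    instance_type: m5.large
    min_instances: 1
    max_instances: 5
subnet_visibility: private               # instances will not have public IPs
nat_gateway: single                      # required when using private subnets
api_load_balancer_scheme: internal
operator_load_balancer_scheme: internal  # requires VPC Peering for CLI access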
The Docker images used by the cluster can also be overridden by adding any of these keys to your cluster configuration file (default values are shown below, followed by an example override):
image_operator: quay.io/cortexlabs/operator:0.34.0
image_manager: quay.io/cortexlabs/manager:0.34.0
image_downloader: quay.io/cortexlabs/downloader:0.34.0
image_request_monitor: quay.io/cortexlabs/request-monitor:0.34.0
image_async_gateway: quay.io/cortexlabs/async-gateway:0.34.0
image_cluster_autoscaler: quay.io/cortexlabs/cluster-autoscaler:0.34.0
image_metrics_server: quay.io/cortexlabs/metrics-server:0.34.0
image_inferentia: quay.io/cortexlabs/inferentia:0.34.0
image_neuron_rtd: quay.io/cortexlabs/neuron-rtd:0.34.0
image_nvidia: quay.io/cortexlabs/nvidia:0.34.0
image_fluent_bit: quay.io/cortexlabs/fluent-bit:0.34.0
image_istio_proxy: quay.io/cortexlabs/istio-proxy:0.34.0
image_istio_pilot: quay.io/cortexlabs/istio-pilot:0.34.0
image_prometheus: quay.io/cortexlabs/prometheus:0.34.0
image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:0.34.0
image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:0.34.0
image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:0.34.0
image_prometheus_dcgm_exporter: quay.io/cortexlabs/prometheus-dcgm-exporter:0.34.0
image_prometheus_kube_state_metrics: quay.io/cortexlabs/prometheus-kube-state-metrics:0.34.0
image_prometheus_node_exporter: quay.io/cortexlabs/prometheus-node-exporter:0.34.0
image_kube_rbac_proxy: quay.io/cortexlabs/kube-rbac-proxy:0.34.0
image_grafana: quay.io/cortexlabs/grafana:0.34.0
image_event_exporter: quay.io/cortexlabs/event-exporter:0.34.0
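For example, to pull a couple of these images from a private mirror rather than quay.io, set the corresponding keys in cluster.yaml; registry.example.com below is a placeholder for your own registry.
# override selected images in cluster.yaml (registry.example.com is a placeholder)
image_operator: registry.example.com/cortexlabs/operator:0.34.0
image_manager: registry.example.com/cortexlabs/manager:0.34.0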