Install

Last updated 4 years ago

Prerequisites

  1. Docker must be installed and running on your machine (to verify, check that running docker ps does not return an error)

  2. Subscribe to the EKS-optimized AMI with GPU Support (for GPU clusters)

  3. An IAM user with AdministratorAccess and programmatic access (see security if you'd like to use less privileged credentials after spinning up your cluster)

  4. You may need to request a limit increase for your desired instance type
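
The Docker check in the first prerequisite can be scripted. The following is a minimal sketch, not part of Cortex: the function name docker_is_running and its runner parameter are hypothetical and exist only for illustration. It treats an exit status of 0 from docker ps as "installed and running".

```python
import subprocess


def docker_is_running(runner=subprocess.run) -> bool:
    """Return True if `docker ps` exits with status 0.

    `runner` is a hypothetical injection point (defaults to subprocess.run)
    so the check can be exercised without a real Docker daemon.
    """
    try:
        result = runner(
            ["docker", "ps"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except (FileNotFoundError, PermissionError):
        # The docker binary is missing or cannot be executed.
        return False
    return result.returncode == 0
```

Calling docker_is_running() before spinning up a cluster gives an early, readable failure instead of an error partway through installation.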

Spin up Cortex on your AWS account

# install the CLI
pip install cortex

# spin up Cortex on your AWS account
cortex cluster up  # or: cortex cluster up --config cluster.yaml (see configuration options below)
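
If you drive the CLI from a script, the two invocations above can be assembled programmatically. This is only a sketch: cluster_up_command is a hypothetical helper, and it uses nothing beyond the cortex cluster up command and the --config flag shown above.

```python
from typing import List, Optional


def cluster_up_command(config_path: Optional[str] = None) -> List[str]:
    """Build the argument list for `cortex cluster up`.

    When config_path is given, `--config <path>` is appended.
    """
    command = ["cortex", "cluster", "up"]
    if config_path is not None:
        command += ["--config", config_path]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True). A list avoids shell quoting problems when the configuration path contains spaces.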

Configure Cortex

# cluster.yaml

# EKS cluster name
cluster_name: cortex

# AWS region
region: us-east-1

# list of availability zones for your region
availability_zones:  # default: 3 random availability zones in your region, e.g. [us-east-1a, us-east-1b, us-east-1c]

# instance type
instance_type: m5.large

# minimum number of instances
min_instances: 1

# maximum number of instances
max_instances: 5

# disk storage size per instance (GB)
instance_volume_size: 50

# instance volume type [gp2 | io1 | st1 | sc1]
instance_volume_type: gp2

# instance volume iops (only applicable to io1)
# instance_volume_iops: 3000

# subnet visibility [public (instances will have public IPs) | private (instances will not have public IPs)]
subnet_visibility: public

# NAT gateway (required when using private subnets) [none | single | highly_available (a NAT gateway per availability zone)]
nat_gateway: none

# API load balancer scheme [internet-facing | internal]
api_load_balancer_scheme: internet-facing

# operator load balancer scheme [internet-facing | internal]
# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator
operator_load_balancer_scheme: internet-facing

# to install Cortex in an existing VPC, you can provide a list of subnets for your cluster to use
# subnet_visibility (specified above in this file) must match your subnets' visibility
# this is an advanced feature (not recommended for first-time users) and requires your VPC to be configured correctly; see https://eksctl.io/usage/vpc-networking/#use-existing-vpc-other-custom-configuration
# here is an example:
# subnets:
#   - availability_zone: us-west-2a
#     subnet_id: subnet-060f3961c876872ae
#   - availability_zone: us-west-2b
#     subnet_id: subnet-0faed05adf6042ab7

# additional tags to assign to AWS resources (all resources will automatically be tagged with cortex.dev/cluster-name: <cluster_name>)
tags:  # <string>: <string> map of key/value pairs

# enable spot instances
spot: false

# SSL certificate ARN (only necessary when using a custom domain)
ssl_certificate_arn:

# list of IAM policies to attach to your Cortex APIs
iam_policy_arns: ["arn:aws:iam::aws:policy/AmazonS3FullAccess"]

# primary CIDR block for the cluster's VPC
vpc_cidr: 192.168.0.0/16
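
Several constraints are stated only in the comments of the configuration above: the allowed values in square brackets, the NAT gateway requirement for private subnets, and the io1-only IOPS setting. The sketch below checks them on an already-parsed configuration dictionary. It is not Cortex's own validation, and check_cluster_config is a hypothetical name. It makes two assumptions: the sample values gp2 and none are the defaults when a key is absent, and min_instances must not exceed max_instances.

```python
from typing import Any, Dict, List

# Allowed values, copied from the bracketed lists in the cluster.yaml comments.
ALLOWED_VALUES = {
    "instance_volume_type": {"gp2", "io1", "st1", "sc1"},
    "subnet_visibility": {"public", "private"},
    "nat_gateway": {"none", "single", "highly_available"},
    "api_load_balancer_scheme": {"internet-facing", "internal"},
    "operator_load_balancer_scheme": {"internet-facing", "internal"},
}


def check_cluster_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = config.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key}: {value!r} is not one of {sorted(allowed)}")

    # A NAT gateway is required when using private subnets.
    # Assumption: a missing nat_gateway key means "none", as in the sample file.
    if config.get("subnet_visibility") == "private":
        if config.get("nat_gateway", "none") == "none":
            problems.append("nat_gateway: required when subnet_visibility is private")

    # instance_volume_iops is only applicable to io1 volumes.
    # Assumption: a missing instance_volume_type key means "gp2".
    if "instance_volume_iops" in config:
        if config.get("instance_volume_type", "gp2") != "io1":
            problems.append("instance_volume_iops: only applicable to io1")

    # Assumption (not stated in the source): min must not exceed max.
    low, high = config.get("min_instances"), config.get("max_instances")
    if low is not None and high is not None and low > high:
        problems.append("min_instances: must not exceed max_instances")
    return problems
```

For example, check_cluster_config({"subnet_visibility": "private"}) reports the missing NAT gateway, while the sample values shown above produce no problems.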

The Docker images used by the Cortex cluster can also be overridden, although this is rarely necessary. To override them, add any of these keys to your cluster configuration file (default values are shown):

image_operator: quay.io/cortexlabs/operator:0.30.0
image_manager: quay.io/cortexlabs/manager:0.30.0
image_downloader: quay.io/cortexlabs/downloader:0.30.0
image_request_monitor: quay.io/cortexlabs/request-monitor:0.30.0
image_cluster_autoscaler: quay.io/cortexlabs/cluster-autoscaler:0.30.0
image_metrics_server: quay.io/cortexlabs/metrics-server:0.30.0
image_inferentia: quay.io/cortexlabs/inferentia:0.30.0
image_neuron_rtd: quay.io/cortexlabs/neuron-rtd:0.30.0
image_nvidia: quay.io/cortexlabs/nvidia:0.30.0
image_fluent_bit: quay.io/cortexlabs/fluent-bit:0.30.0
image_istio_proxy: quay.io/cortexlabs/istio-proxy:0.30.0
image_istio_pilot: quay.io/cortexlabs/istio-pilot:0.30.0
image_prometheus: quay.io/cortexlabs/prometheus:0.30.0
image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:0.30.0
image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:0.30.0
image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:0.30.0
image_prometheus_dcgm_exporter: quay.io/cortexlabs/prometheus-dcgm-exporter:0.30.0
image_prometheus_kube_state_metrics: quay.io/cortexlabs/prometheus-kube-state-metrics:0.30.0
image_prometheus_node_exporter: quay.io/cortexlabs/prometheus-node-exporter:0.30.0
image_kube_rbac_proxy: quay.io/cortexlabs/kube-rbac-proxy:0.30.0
image_grafana: quay.io/cortexlabs/grafana:0.30.0
image_event_exporter: quay.io/cortexlabs/event-exporter:0.30.0
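
In the defaults above, each image name is the key with the image_ prefix removed and underscores replaced by hyphens. If you mirror the images to another registry, the override entries could be generated rather than typed by hand. The following sketch relies on that naming pattern. The helper image_overrides and the registry registry.example.com/cortexlabs are hypothetical, and the sketch assumes your mirror keeps the same image names and tags.

```python
from typing import Dict, Iterable


def image_overrides(keys: Iterable[str], registry: str, tag: str) -> Dict[str, str]:
    """Map cluster.yaml image keys to image references in `registry`.

    Example: "image_request_monitor" -> "<registry>/request-monitor:<tag>".
    """
    overrides = {}
    for key in keys:
        if not key.startswith("image_"):
            raise ValueError(f"not an image key: {key!r}")
        # Strip the "image_" prefix and convert underscores to hyphens.
        name = key[len("image_"):].replace("_", "-")
        overrides[key] = f"{registry.rstrip('/')}/{name}:{tag}"
    return overrides


# Hypothetical mirror registry; replace with your own.
for key, image in image_overrides(
    ["image_operator", "image_request_monitor"],
    "registry.example.com/cortexlabs",
    "0.30.0",
).items():
    print(f"{key}: {image}")
```

The printed lines are in the same key: value form as the list above, so they can be pasted into the cluster configuration file.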