0.32
Multi-instance

Cortex can be configured to provision different instance types to improve workload performance and reduce cloud infrastructure spend.

Best practices

Node groups are prioritized by their position in the node_groups list: groups with lower indices have higher priority.

  1. Small instance node groups should be listed before large instance node groups.

  2. CPU node groups should be listed before GPU/Inferentia node groups.

  3. Spot node groups should always be listed before on-demand node groups.
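The ordering rules above can be checked before deploying. Here is a minimal sketch (not part of Cortex, just an illustration) that validates rule 3 against a node_groups list; the dicts mirror the cluster.yaml keys, and `spot` defaults to false when omitted, as in the examples below:

```python
def spot_groups_first(node_groups):
    """Return True if no on-demand group appears before a spot group."""
    seen_on_demand = False
    for group in node_groups:
        if group.get("spot", False):
            if seen_on_demand:
                # a spot group listed after an on-demand group violates rule 3
                return False
        else:
            seen_on_demand = True
    return True

node_groups = [
    {"name": "cpu-spot", "instance_type": "m5.large", "spot": True},
    {"name": "cpu-on-demand", "instance_type": "m5.large"},
    {"name": "gpu-on-demand", "instance_type": "g4dn.xlarge"},
]
print(spot_groups_first(node_groups))  # True: spot is listed first
```

The same pattern extends to rules 1 and 2 if you rank instance types by size and accelerator type.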

Examples

CPU spot, CPU on-demand, and GPU on-demand

# cluster.yaml

node_groups:
  - name: cpu-spot
    instance_type: m5.large
    spot: true
  - name: cpu-on-demand
    instance_type: m5.large
  - name: gpu-on-demand
    instance_type: g4dn.xlarge

CPU on-demand, GPU on-demand, and Inferentia on-demand

# cluster.yaml

node_groups:
  - name: cpu-on-demand
    instance_type: m5.large
  - name: gpu-on-demand
    instance_type: g4dn.xlarge
  - name: inferentia-on-demand
    instance_type: inf1.xlarge

3 CPU spot and 1 CPU on-demand

# cluster.yaml

node_groups:
  - name: cpu-1
    instance_type: t3.medium
    spot: true
  - name: cpu-2
    instance_type: m5.2xlarge
    spot: true
  - name: cpu-3
    instance_type: m5.8xlarge
    spot: true
  - name: cpu-4
    instance_type: m5.24xlarge
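Each node group can also bound its own size. A hedged sketch, assuming the per-node-group min_instances and max_instances fields from the cluster configuration reference (verify the field names against your Cortex version):

```yaml
# cluster.yaml (sketch; field names assumed from the cluster configuration reference)

node_groups:
  - name: cpu-spot
    instance_type: m5.large
    spot: true
    min_instances: 0   # allow the spot group to scale to zero
    max_instances: 5
  - name: cpu-on-demand
    instance_type: m5.large
    min_instances: 1   # keep at least one on-demand instance available
    max_instances: 5
```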
