
Install

Cortex currently relies on cloud-provider-specific functionality such as load balancers and storage. Kubernetes clusters on the following cloud providers are supported: AWS (EKS) and GCP (GKE).

Cortex uses Helm to install the Cortex operator and its dependencies on your Kubernetes cluster.

AWS

Prerequisites

  • kubectl

  • AWS CLI

  • helm 3

  • EKS cluster

    • at least 3 t3.medium (2 vCPU, 4 GB mem) instances

You may install Cortex in any namespace in your cluster. In the guide that follows, the "default" namespace is assumed; if you're using a different namespace, replace all occurrences of "default" with the name of your namespace.

Note that installing Cortex on your Kubernetes cluster does not provide some cluster-level features, such as cluster autoscaling and spot instances with on-demand backup.

Download Cortex charts

wget https://s3-us-west-2.amazonaws.com/get-cortex/0.28.0/charts/cortex-0.28.0.tar.gz
tar -xzf cortex-0.28.0.tar.gz

Create a bucket in S3

The Cortex operator will use this bucket to store API state and dependencies.
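As a minimal sketch, the bucket could be created with the AWS CLI's `aws s3 mb` command. The helper name `create_cortex_bucket`, the bucket name, and the region below are placeholders chosen for illustration, not values taken from the Cortex documentation.

```shell
# Hypothetical helper: creates an S3 bucket for the Cortex operator.
# $1 = bucket name (must be globally unique), $2 = AWS region.
create_cortex_bucket() {
  aws s3 mb "s3://$1" --region "$2"
}

# Example invocation (placeholder values):
#   create_cortex_bucket my-cortex-bucket us-west-2
```

Creating the bucket in the same region as the EKS cluster is the usual choice, though whether Cortex requires it is not stated here.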

Credentials

The credentials need to have at least these permissions.

Install Cortex

Define a values.yaml with the following information provided:
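The keys expected in values.yaml come from the chart itself and are not guessed at here. Assuming a completed values.yaml and an extracted chart directory, the install step with Helm 3 might look roughly like the sketch below. The helper name `install_cortex`, the release name `cortex`, and the `./charts` path in the example are assumptions.

```shell
# Hypothetical helper: installs the extracted chart as a Helm release.
# $1 = namespace, $2 = path to the extracted chart directory, $3 = values file.
# The release name "cortex" is an assumption, not confirmed by the source.
install_cortex() {
  helm install cortex "$2" --namespace "$1" -f "$3"
}

# Example invocation (placeholder paths):
#   install_cortex default ./charts values.yaml
```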

Configure Cortex client

Wait for the load balancers to be provisioned and connected to your cluster.

Get the Cortex operator endpoint:
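One way to read a load balancer address from a Kubernetes Service is `kubectl get service` with a JSONPath expression. The sketch below assumes you know the name of the Service that fronts the operator; that name is passed in as an argument rather than guessed, and `get_lb_hostname` is a hypothetical helper. On AWS, load balancers are typically exposed by hostname.

```shell
# Hypothetical helper: prints the load balancer hostname of a Service.
# $1 = namespace, $2 = name of the operator's Service (look it up with
#      `kubectl get services --namespace <namespace>`).
get_lb_hostname() {
  kubectl get service "$2" --namespace "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Example invocation (placeholder Service name):
#   CORTEX_OPERATOR_ENDPOINT=$(get_lb_hostname default my-operator-service)
```

An empty result usually means the load balancer has not been assigned yet.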

You can use the curl command below to verify that your load balancer is ready. Setup can take 5-10 minutes to complete. Until the load balancer is initialized, expect the verification request to fail with "Could not resolve host" errors or timeouts.
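Because the endpoint can take several minutes to come up, a retry loop around curl saves re-running the check by hand. This is a sketch only: `wait_for_operator` is a hypothetical helper, and the verification URL is supplied by the caller. The defaults (60 attempts, 10 seconds apart) cover roughly the 10-minute window mentioned above.

```shell
# Hypothetical helper: polls a URL until curl succeeds or attempts run out.
# $1 = full verification URL
# $2 = maximum number of attempts (default 60)
# $3 = seconds to wait between attempts (default 10)
wait_for_operator() {
  url="$1"
  attempts="${2:-60}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors (4xx/5xx);
    # --max-time bounds each request so a hung connection cannot stall the loop.
    if curl --silent --fail --max-time 5 "$url"; then
      echo
      return 0
    fi
    echo "attempt $i/$attempts: endpoint not ready yet" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example invocation (placeholder URL):
#   wait_for_operator "http://$CORTEX_OPERATOR_ENDPOINT/<verification-path>"
```

The function prints the response body on success and returns non-zero if the endpoint never answers, so it can gate the client configuration step in a script.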

A successful response looks like this:

Once you receive a successful response, configure your Cortex client:

Using GPU/Inf resources on your cluster

The following tolerations are added to Deployments and Jobs orchestrated by Cortex.

GCP

Prerequisites

  • kubectl

  • gsutil

  • helm 3

  • GKE cluster

    • at least 2 n1-standard-2 (2 vCPU, 7.5 GB mem) instances (with monitoring and logging disabled)

You may install Cortex in any namespace in your cluster. In the guide that follows, the "default" namespace is assumed; if you're using a different namespace, replace all occurrences of "default" with the name of your namespace.

Note that installing Cortex on your Kubernetes cluster does not provide some cluster-level features, such as cluster autoscaling and preemptible instances.

Download Cortex charts

Create a bucket in GCS

The Cortex operator will use this bucket to store API state and dependencies.
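As a minimal sketch, the bucket could be created with `gsutil mb`. The helper name `create_cortex_gcs_bucket`, the bucket name, and the location are placeholders for illustration.

```shell
# Hypothetical helper: creates a GCS bucket for the Cortex operator.
# $1 = bucket name (must be globally unique), $2 = bucket location.
create_cortex_gcs_bucket() {
  gsutil mb -l "$2" "gs://$1"
}

# Example invocation (placeholder values):
#   create_cortex_gcs_bucket my-cortex-bucket us-central1
```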

Credentials

The credentials need to have at least these permissions.

Install Cortex

Define a values.yaml with the following information provided:

Configure Cortex client

Wait for the load balancers to be provisioned and connected to your cluster.

Get the Cortex operator endpoint:
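On GKE, a LoadBalancer Service usually reports an IP address rather than a hostname, so the JSONPath expression reads the `ip` field. As a sketch, with `get_lb_ip` as a hypothetical helper and the Service name supplied by the caller:

```shell
# Hypothetical helper: prints the load balancer IP of a Service.
# $1 = namespace, $2 = name of the operator's Service (look it up with
#      `kubectl get services --namespace <namespace>`).
get_lb_ip() {
  kubectl get service "$2" --namespace "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Example invocation (placeholder Service name):
#   CORTEX_OPERATOR_ENDPOINT=$(get_lb_ip default my-operator-service)
```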

You can use the curl command below to verify that your load balancer is ready. Setup can take 5-10 minutes to complete. Until the load balancer is initialized, expect the verification request to fail with "Could not resolve host" errors or timeouts.

A successful response looks like this:

Once you receive a successful response, configure your Cortex client:

Using GPU resources on your cluster

The following tolerations are added to Deployments and Jobs orchestrated by Cortex.
