Metrics


Last updated 4 years ago

Cortex includes Prometheus for metrics collection and Grafana for visualization. You can monitor your APIs with the default Grafana dashboards, or create custom metrics and dashboards.

Accessing the dashboard

The dashboard URL is displayed once you run a cortex get <api_name> command.

Alternatively, you can access it at http://<operator_url>/dashboard. Run the following command to get the operator URL:

cortex env list
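
As a small illustration, the dashboard address can be derived from the operator URL like this. It is only a sketch: the helper name `dashboard_url` and the sample endpoint are hypothetical, not part of the Cortex CLI or client, and it assumes the operator URL may have been copied with or without a scheme.

```python
def dashboard_url(operator_url: str) -> str:
    """Build the Grafana dashboard URL from a Cortex operator URL.

    Hypothetical helper: it only applies the http://<operator_url>/dashboard
    pattern described above.
    """
    host = operator_url.strip()
    # Drop a scheme if one was copied along with the endpoint.
    for scheme in ("http://", "https://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
            break
    # Drop trailing slashes so the path is joined cleanly.
    host = host.rstrip("/")
    if not host:
        raise ValueError("operator URL is empty")
    return f"http://{host}/dashboard"


# Placeholder endpoint, not a real cluster.
print(dashboard_url("operator.example.com"))
```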

If your operator load balancer is configured to be internal, there are a few options for accessing the dashboard:

  1. Access the dashboard from a machine that has VPC Peering configured to your cluster's VPC, or which is inside of your cluster's VPC.

  2. Run kubectl port-forward -n default grafana-0 3000:3000 to forward Grafana's port to your local machine, and access the dashboard on http://localhost:3000 (see the instructions for setting up kubectl).

  3. Set up VPN access to your cluster's VPC.

Default credentials

The dashboard is protected with username / password authentication. The default credentials are:

  • Username: admin

  • Password: admin

You will be prompted to change the admin user password the first time you log in.

Selecting an API

You can select one or more APIs to visualize in the top left corner of the dashboard.

Selecting a time range

Grafana allows you to select a time range on which the metrics will be visualized. You can do so in the top right corner of the dashboard.

Note: Cortex retains at most 2 weeks of data at any time, so older time ranges will show no metrics.
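
To make the retention limit concrete, here is a minimal sketch of how a requested time range could be clamped to the two-week window before it is used in a query. The function name `clamp_to_retention` is hypothetical and is not part of Cortex or Grafana; the only fact taken from this page is the two-week retention period.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Retention window stated in the note above.
RETENTION = timedelta(weeks=2)


def clamp_to_retention(
    start: datetime, end: datetime, now: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Restrict [start, end] to the period for which data can still exist."""
    now = now or datetime.now(timezone.utc)
    oldest = now - RETENTION
    # Nothing newer than "now" and nothing older than the retention limit.
    end = min(end, now)
    start = max(start, oldest)
    if start >= end:
        raise ValueError("requested range lies outside the retention window")
    return start, end


now = datetime(2021, 6, 15, tzinfo=timezone.utc)
start, end = clamp_to_retention(now - timedelta(days=30), now, now=now)
print(start.isoformat(), end.isoformat())
```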

Available dashboards

More than one dashboard is available by default. You can view the available dashboards by accessing the Grafana menu: Dashboards -> Manage -> Cortex folder.

The dashboards that Cortex ships with are the following:

  • RealtimeAPI

  • BatchAPI

  • Cluster resources

  • Node resources

Exposed metrics

Cortex exposes additional metrics through Prometheus that may be useful. To see the available metrics, open the Explore menu in Grafana and press the Metrics button.

You can use any of these metrics to set up your own dashboards.

Custom user metrics

You can export your own custom metrics by using the MetricsClient class in your predictor code. This lets you create custom metrics from your deployed API that can later be used in your own custom dashboards.

Code examples showing how to use custom metrics are provided in the documentation for each API kind: RealtimeAPI, AsyncAPI, BatchAPI, and TaskAPI.

Metric types

Currently, only 3 metric types are supported. Each is converted to its respective Prometheus type:

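  • Counter - a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.

  • Gauge - a single numerical value that can arbitrarily go up and down.

  • Histogram - samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.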
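
The difference between the three types may be easier to see in code. The following is a pure-Python illustration of their semantics only. It is not how Cortex or Prometheus implements them, and the class names and bucket bounds are chosen purely for illustration.

```python
from bisect import bisect_left
from typing import List, Sequence


class Counter:
    """Cumulative value that can only go up (or restart from zero)."""

    def __init__(self) -> None:
        self.value = 0.0

    def increment(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter cannot decrease")
        self.value += amount


class Gauge:
    """Single numerical value that can be set higher or lower at will."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


class Histogram:
    """Counts observations in buckets and keeps a running sum."""

    def __init__(self, upper_bounds: Sequence[float]) -> None:
        self.upper_bounds = sorted(upper_bounds)
        # One extra bucket catches values above the largest bound.
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        # Index of the first bucket whose upper bound is >= value.
        self.bucket_counts[bisect_left(self.upper_bounds, value)] += 1
        self.sum += value
        self.count += 1

    def cumulative_counts(self) -> List[int]:
        # Prometheus reports buckets cumulatively: each bucket also
        # includes every observation from the smaller buckets.
        total, out = 0, []
        for c in self.bucket_counts:
            total += c
            out.append(total)
        return out


latency = Histogram([50, 100, 250])  # illustrative bounds in milliseconds
for ms in (30, 100, 120, 300):
    latency.observe(ms)
print(latency.cumulative_counts(), latency.sum)
```

A counter suits totals such as the number of requests served, a gauge suits values that fluctuate such as active connections, and a histogram suits distributions such as inference latency.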

Pushing metrics

  • Counter

     metrics.increment('my_counter', value=1, tags={"tag": "tag_name"})
  • Gauge

     metrics.gauge('active_connections', value=1001, tags={"tag": "tag_name"})
  • Histogram

     metrics.histogram('inference_time_milliseconds', 120, tags={"tag": "tag_name"})
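
The calls above can be combined inside a predictor, for example to count requests and time inference. This sketch assumes the predictor receives the metrics client as a constructor argument named `metrics_client`; check the predictor documentation for your API kind, since the exact constructor signature is not shown on this page. `RecordingMetricsClient` is a hypothetical stand-in so the example runs without a cluster, and the "inference" step is a placeholder.

```python
import time
from typing import Any, Dict, List, Optional, Tuple


class RecordingMetricsClient:
    """Hypothetical stand-in that records calls instead of pushing them."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, float, Optional[Dict[str, str]]]] = []

    def increment(self, metric: str, value: float = 1,
                  tags: Optional[Dict[str, str]] = None) -> None:
        self.calls.append(("increment", metric, value, tags))

    def gauge(self, metric: str, value: float,
              tags: Optional[Dict[str, str]] = None) -> None:
        self.calls.append(("gauge", metric, value, tags))

    def histogram(self, metric: str, value: float,
                  tags: Optional[Dict[str, str]] = None) -> None:
        self.calls.append(("histogram", metric, value, tags))


class PythonPredictor:
    # Assumption: the metrics client is injected through the constructor.
    def __init__(self, config: Dict[str, Any], metrics_client: Any) -> None:
        self.metrics = metrics_client
        self.tags = {"model_version": config.get("model_version", "v1")}

    def predict(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        start = time.perf_counter()
        # Placeholder for real model inference.
        result = {"length": len(payload.get("text", ""))}
        elapsed_ms = (time.perf_counter() - start) * 1000

        # One counter tick per request, plus the measured latency.
        self.metrics.increment("model_calls", value=1, tags=self.tags)
        self.metrics.histogram("inference_time_milliseconds", elapsed_ms,
                               tags=self.tags)
        return result


client = RecordingMetricsClient()
predictor = PythonPredictor({"model_version": "v2"}, client)
print(predictor.predict({"text": "hello"}))
```

Keeping the tags in one place, as `self.tags` does here, makes it easier to filter every metric from the same model version in a dashboard.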

Metrics client class reference

from typing import Dict


class MetricsClient:

    def gauge(self, metric: str, value: float, tags: Dict[str, str] = None):
        """
        Record the value of a gauge.

        Example:
        >>> metrics.gauge('active_connections', 1001, tags={"protocol": "http"})
        """
        pass

    def increment(self, metric: str, value: float = 1, tags: Dict[str, str] = None):
        """
        Increment the value of a counter.

        Example:
        >>> metrics.increment('model_calls', 1, tags={"model_version": "v1"})
        """
        pass

    def histogram(self, metric: str, value: float, tags: Dict[str, str] = None):
        """
        Set the value in a histogram metric

        Example:
        >>> metrics.histogram('inference_time_milliseconds', 120, tags={"model_version": "v1"})
        """
        pass

Grafana also lets you manage access for multiple users and teams. For more information on this topic, check the Grafana documentation.
