
Metrics


Last updated 3 years ago

The cortex get and cortex get API_NAME commands display the request time (averaged over the past 2 weeks) and response code counts (summed over the past 2 weeks) for your APIs:

cortex get

env      api                         status   up-to-date   requested   last update   avg request   2XX
cortex   iris-classifier             live     1            1           17m           24ms          1223
cortex   text-generator              live     1            1           8m            180ms         433
cortex   image-classifier-resnet50   live     2            2           1h            32ms          1121126
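If you want to use these numbers in a script, the column-aligned output can be split into records. The following is a minimal sketch, not part of the Cortex tooling: the helper name `parse_cortex_get` is hypothetical, and it assumes that columns are separated by at least two spaces and that no cell is empty, as in the sample output above.

```python
import re


def parse_cortex_get(output: str) -> list[dict[str, str]]:
    """Split the table printed by `cortex get` into one dict per API.

    Assumption: columns are separated by two or more spaces, while
    header names such as "last update" contain only single spaces.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        cells = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, cells)))
    return rows


# Sample text copied from the output shown above.
sample = """\
env      api                         status   up-to-date   requested   last update   avg request   2XX
cortex   iris-classifier             live     1            1           17m           24ms          1223
cortex   text-generator              live     1            1           8m            180ms         433
cortex   image-classifier-resnet50   live     2            2           1h            32ms          1121126
"""

rows = parse_cortex_get(sample)
for row in rows:
    print(row["api"], row["avg request"], row["2XX"])
```

All values are kept as strings (for example `24ms`), so convert units yourself before doing arithmetic on them.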

The cortex get API_NAME command also provides a link to a Grafana dashboard:

[Image: Grafana dashboard]

Metrics in the dashboard

| Panel | Description | Note |
| --- | --- | --- |
| Request Rate | Request rate, computed over every minute, of an API | |
| In Flight Request | Active in-flight requests for an API. | In-flight requests are recorded every 10 seconds, which will correspond to the minimum resolution. |
| Active Replicas | Active replicas for an API | |
| 2XX Responses | Request rate, computed over a minute, for responses with status code 2XX of an API | |
| 4XX Responses | Request rate, computed over a minute, for responses with status code 4XX of an API | |
| 5XX Responses | Request rate, computed over a minute, for responses with status code 5XX of an API | |
| p99 Latency | 99th percentile latency, computed over a minute, for an API | Value might not be accurate because the histogram buckets are not dynamically set. |
| p90 Latency | 90th percentile latency, computed over a minute, for an API | Value might not be accurate because the histogram buckets are not dynamically set. |
| p50 Latency | 50th percentile latency, computed over a minute, for an API | Value might not be accurate because the histogram buckets are not dynamically set. |
| Average Latency | Average latency, computed over a minute, for an API | |
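
The note about percentile accuracy can be illustrated with a small sketch. It assumes the latency panels are derived from a histogram with fixed bucket boundaries and that a percentile is estimated by linear interpolation inside the bucket that contains it, which is how Prometheus's `histogram_quantile` works. The function below is a simplified re-implementation for illustration only, and the bucket boundaries and request counts are hypothetical, not the values Cortex uses.

```python
def histogram_quantile(q, upper_bounds, cumulative_counts):
    """Estimate the q-quantile (0 < q <= 1) from cumulative bucket counts.

    upper_bounds[i] is the upper edge of bucket i, and cumulative_counts[i]
    is the number of observations less than or equal to that edge.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = cumulative_counts[-1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper, count in zip(upper_bounds, cumulative_counts):
        if count >= rank:
            in_bucket = count - lower_count
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper - lower_bound) * fraction
        lower_bound, lower_count = upper, count
    return upper_bounds[-1]


# Hypothetical bucket edges in milliseconds.
bounds_ms = [10.0, 50.0, 100.0, 500.0, 1000.0]
# 100 requests that each took exactly 120 ms all land in the (100, 500] bucket.
counts = [0, 0, 0, 100, 100]

p99 = histogram_quantile(0.99, bounds_ms, counts)
p50 = histogram_quantile(0.50, bounds_ms, counts)
print(f"estimated p99: {p99:.0f} ms, estimated p50: {p50:.0f} ms")
```

In this example every request took 120 ms, yet the estimates are about 496 ms for p99 and 300 ms for p50, because the estimator only knows that the requests fell somewhere between 100 ms and 500 ms. Narrower buckets around the typical latency would reduce this error.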