Example

Create APIs that can respond to prediction requests in real-time.

Implement

$ mkdir text-generator && cd text-generator
$ touch predictor.py requirements.txt text_generator.yaml
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        # runs once per replica: download and load the text-generation model
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        # payload is the parsed JSON request body
        return self.model(payload["text"])[0]
# requirements.txt

transformers
torch
# text_generator.yaml

- name: text-generator
  kind: RealtimeAPI
  predictor:
    type: python
    path: predictor.py
  compute:
    gpu: 1
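
Optionally, you can sanity-check the predictor locally before deploying (a minimal sketch, assuming predictor.py is in the current directory and the packages in requirements.txt are installed; the empty config dict mirrors the API definition above, which passes no config):

# test_local.py (optional local check, not part of the deployment)

from predictor import PythonPredictor

predictor = PythonPredictor(config={})  # config is empty since text_generator.yaml sets none
print(predictor.predict({"text": "hello world"}))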

Deploy

$ cortex deploy text_generator.yaml

Monitor

$ cortex get text-generator --watch

Stream logs

$ cortex logs text-generator

Make a request

$ curl http://***.elb.us-west-2.amazonaws.com/text-generator -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
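
Equivalently, from Python (a sketch; replace the placeholder URL with your API's endpoint, which the cortex get command displays):

# request.py (illustrative; the endpoint URL below is a placeholder)

import requests

endpoint = "http://***.elb.us-west-2.amazonaws.com/text-generator"  # replace with your endpoint
response = requests.post(endpoint, json={"text": "hello world"})
print(response.json())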

Delete

$ cortex delete text-generator