Multi-model

Example

Deploy several models in a single API to improve resource utilization.

Define a multi-model API

# multi_model.py

import cortex

class PythonPredictor:
    def __init__(self, config):
        # Model 1: sentiment analysis via a Hugging Face pipeline
        from transformers import pipeline
        self.analyzer = pipeline(task="sentiment-analysis")

        # Model 2: fastText language identification, downloaded at startup
        import wget
        import fasttext
        wget.download(
            "https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin", "/tmp/model"
        )
        self.language_identifier = fasttext.load_model("/tmp/model")

    def predict(self, query_params, payload):
        # Route the request based on the "model" query parameter
        model = query_params.get("model")
        if model == "sentiment":
            return self.analyzer(payload["text"])[0]
        elif model == "language":
            # fastText labels look like "__label__en"; keep the ISO code
            return self.language_identifier.predict(payload["text"])[0][0][-2:]

requirements = ["tensorflow", "transformers", "wget", "fasttext"]

api_spec = {"name": "multi-model", "kind": "RealtimeAPI"}

cx = cortex.client("aws")
cx.create_api(api_spec, predictor=PythonPredictor, requirements=requirements)
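The routing in `predict` can be sketched independently of the real models. The stub below is illustrative only (not part of Cortex's API): placeholder callables stand in for the two models, and it also returns an error for unknown model names instead of `None`:

```python
# Sketch of the query-parameter dispatch used in PythonPredictor.predict,
# with stub callables standing in for the real models.
class StubPredictor:
    def __init__(self):
        self.models = {
            "sentiment": lambda text: {"label": "POSITIVE", "score": 0.99},
            "language": lambda text: "en",
        }

    def predict(self, query_params, payload):
        model = query_params.get("model")
        if model not in self.models:
            return {"error": f"unknown model: {model}"}
        return self.models[model](payload["text"])

predictor = StubPredictor()
```

Each request picks a model at runtime, so one API (and one set of instances) serves both workloads.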

Deploy

python multi_model.py
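Once deployed, requests select a model with the `model` query parameter. A minimal sketch of building such a request URL (the endpoint below is a placeholder; the real one is reported by `cortex get multi-model`):

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the URL from `cortex get multi-model`.
endpoint = "https://example.com/multi-model"

def request_url(model):
    # The predictor reads query_params["model"] to pick a model.
    return f"{endpoint}?{urlencode({'model': model})}"

print(request_url("sentiment"))
# https://example.com/multi-model?model=sentiment
```

POSTing `{"text": "..."}` to that URL would then be handled by the chosen model.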
