Predictor
Which Predictor you use depends on how your model is exported:
TensorFlow Predictor if your model is exported as a TensorFlow
SavedModelONNX Predictor if your model is exported in the ONNX format
Python Predictor for all other cases
Project files
Cortex makes all files in the project directory (i.e. the directory which contains cortex.yaml) available for use in your Predictor implementation. Python bytecode files (*.pyc, *.pyo, *.pyd), files or folders that start with ., and the api configuration file (e.g. cortex.yaml) are excluded.
The following files can also be added at the root of the project's directory:
.cortexignorefile, which follows the same syntax and behavior as a .gitignore file..envfile, which exports environment variables that can be used in the predictor. Each line of this file must follow theVARIABLE=valueformat.
For example, if your directory looks like this:
./my-classifier/
├── cortex.yaml
├── values.json
├── predictor.py
├── ...
└── requirements.txtYou can access values.json in your Predictor like this:
Python Predictor
Interface
TensorFlow Predictor
Uses TensorFlow version 2.3.0 by default
Interface
Cortex provides a tensorflow_client to your Predictor's constructor. tensorflow_client is an instance of TensorFlowClient that manages a connection to a TensorFlow Serving container to make predictions using your model. It should be saved as an instance variable in your Predictor, and your predict() function should call tensorflow_client.predict() to make an inference with your exported TensorFlow model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your predict() function as well.
When multiple models are defined using the Predictor's models field, the tensorflow_client.predict() method expects a second argument model_name which must hold the name of the model that you want to use for inference (for example: self.client.predict(payload, "text-generator")).
If you need to share files between your predictor implementation and the TensorFlow Serving container, you can create a new directory within /mnt (e.g. /mnt/user) and write files to it. The entire /mnt directory is shared between containers, but do not write to any of the directories in /mnt that already exist (they are used internally by Cortex).
ONNX Predictor
Uses ONNX Runtime version 1.4.0 by default
Interface
Cortex provides an onnx_client to your Predictor's constructor. onnx_client is an instance of ONNXClient that manages an ONNX Runtime session to make predictions using your model. It should be saved as an instance variable in your Predictor, and your predict() function should call onnx_client.predict() to make an inference with your exported ONNX model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your predict() function as well.
When multiple models are defined using the Predictor's models field, the onnx_client.predict() method expects a second argument model_name which must hold the name of the model that you want to use for inference (for example: self.client.predict(model_input, "text-generator")).
Structured logging
You can use Cortex's logger in your predictor implemention to log in JSON. This will enrich your logs with Cortex's metadata, and you can add custom metadata to the logs by adding key value pairs to the extra key when using the logger. For example:
The dictionary passed in via the extra will be flattened by one level. e.g.
To avoid overriding essential Cortex metadata, please refrain from specifying the following extra keys: asctime, levelname, message, labels, and process. Log lines greater than 5 MB in size will be ignored.
Last updated