Commit 1a8167c

Docs for Realtime gRPC (#2018)
(cherry picked from commit b8ac916)
1 parent a90f19d commit 1a8167c

7 files changed, with 474 additions and 43 deletions
docs/start.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ cortex cluster up cluster.yaml
 cortex deploy apis.yaml
 ```
 
-* [RealtimeAPI](workloads/realtime/example.md) - create APIs that respond to prediction requests in real-time.
+* [RealtimeAPI](workloads/realtime/example.md) - create HTTP/gRPC APIs that respond to prediction requests in real-time.
 * [AsyncAPI](workloads/async/example.md) - create APIs that respond to prediction requests asynchronously.
 * [BatchAPI](workloads/batch/example.md) - create APIs that run distributed batch inference jobs.
 * [TaskAPI](workloads/task/example.md) - create APIs that run training or fine-tuning jobs.

docs/summary.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@
 * [Example](workloads/realtime/traffic-splitter/example.md)
 * [Configuration](workloads/realtime/traffic-splitter/configuration.md)
 * [Troubleshooting](workloads/realtime/troubleshooting.md)
-* [Async APIs](workloads/async/async.md)
+* [Async APIs](workloads/async/async-apis.md)
 * [Example](workloads/async/example.md)
 * [Predictor](workloads/async/predictors.md)
 * [Configuration](workloads/async/configuration.md)
File renamed without changes.

docs/workloads/realtime/configuration.md

Lines changed: 6 additions & 3 deletions
@@ -19,7 +19,8 @@
   predictor:
     type: python
     path: <string> # path to a python file with a PythonPredictor class definition, relative to the Cortex root (required)
-    dependencies: # (optional)
+    protobuf_path: <string> # path to a protobuf file (required if using gRPC)
+    dependencies: # (optional)
       pip: <string> # relative path to requirements.txt (default: requirements.txt)
       conda: <string> # relative path to conda-packages.txt (default: conda-packages.txt)
       shell: <string> # relative path to a shell script for system package installation (default: dependencies.sh)
@@ -52,7 +53,8 @@ predictor:
   predictor:
     type: tensorflow
     path: <string> # path to a python file with a TensorFlowPredictor class definition, relative to the Cortex root (required)
-    dependencies: # (optional)
+    protobuf_path: <string> # path to a protobuf file (required if using gRPC)
+    dependencies: # (optional)
       pip: <string> # relative path to requirements.txt (default: requirements.txt)
       conda: <string> # relative path to conda-packages.txt (default: conda-packages.txt)
       shell: <string> # relative path to a shell script for system package installation (default: dependencies.sh)
@@ -88,7 +90,8 @@ predictor:
   predictor:
     type: onnx
     path: <string> # path to a python file with an ONNXPredictor class definition, relative to the Cortex root (required)
-    dependencies: # (optional)
+    protobuf_path: <string> # path to a protobuf file (required if using gRPC)
+    dependencies: # (optional)
       pip: <string> # relative path to requirements.txt (default: requirements.txt)
       conda: <string> # relative path to conda-packages.txt (default: conda-packages.txt)
       shell: <string> # relative path to a shell script for system package installation (default: dependencies.sh)

docs/workloads/realtime/example.md

Lines changed: 53 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,10 @@
 # RealtimeAPI
 
-Create APIs that respond to prediction requests in real-time.
+## HTTP
 
-## Implement
+Create HTTP APIs that respond to prediction requests in real-time.
+
+### Implement
 
 ```bash
 mkdir text-generator && cd text-generator
@@ -41,32 +43,76 @@ torch
     gpu: 1
 ```
 
-## Deploy
+### Deploy
 
 ```bash
 cortex deploy text_generator.yaml
 ```
 
-## Monitor
+### Monitor
 
 ```bash
 cortex get text-generator --watch
 ```
 
-## Stream logs
+### Stream logs
 
 ```bash
 cortex logs text-generator
 ```
 
-## Make a request
+### Make a request
 
 ```bash
 curl http://***.elb.us-west-2.amazonaws.com/text-generator -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
 ```
 
-## Delete
+### Delete
 
 ```bash
 cortex delete text-generator
 ```
+
+## gRPC
+
+To make the above API use gRPC as its protocol, make the following changes (the rest of the steps are the same):
+
+### Add protobuf file
+
+Create a `predictor.proto` file in your project's directory:
+
+```protobuf
+// predictor.proto
+
+syntax = "proto3";
+package text_generator;
+
+service Predictor {
+    rpc Predict (Message) returns (Message);
+}
+
+message Message {
+    string text = 1;
+}
+```
+
+Set the `predictor.protobuf_path` field in the API spec to point to the `predictor.proto` file:
+
+```yaml
+# text_generator.yaml
+
+- name: text-generator
+  kind: RealtimeAPI
+  predictor:
+    type: python
+    path: predictor.py
+    protobuf_path: predictor.proto
+  compute:
+    gpu: 1
+```
+
+### Make a gRPC request
+
+```bash
+grpcurl -plaintext -proto predictor.proto -d '{"text": "hello-world"}' ***.elb.us-west-2.amazonaws.com:80 text_generator.Predictor/Predict
+```
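
The `grpcurl` request added in this commit names the RPC as `<package>.<service>/<method>`, using the `package` and `service` declared in `predictor.proto`. As a rough illustration of how those pieces fit together, the sketch below assembles the same command line in Python. The helper name `build_grpcurl_command` and its parameters are hypothetical and not part of Cortex or grpcurl. Only the `-plaintext`, `-proto`, and `-d` flags shown in the request above are used, and the `***` endpoint placeholder is kept as it appears in the docs.

```python
import json
import shlex


def build_grpcurl_command(endpoint, proto_path, package, service, method, payload, plaintext=True):
    """Return the argument list for a grpcurl call (hypothetical helper, for illustration only)."""
    args = ["grpcurl"]
    if plaintext:
        # Same flag as the documented request: connect without TLS.
        args.append("-plaintext")
    # Point grpcurl at the protobuf file that defines the service.
    args += ["-proto", proto_path]
    # The request message is passed as a JSON document.
    args += ["-d", json.dumps(payload)]
    args.append(endpoint)
    # Fully qualified RPC name: <package>.<service>/<method>.
    args.append(f"{package}.{service}/{method}")
    return args


if __name__ == "__main__":
    cmd = build_grpcurl_command(
        endpoint="***.elb.us-west-2.amazonaws.com:80",  # placeholder endpoint from the docs
        proto_path="predictor.proto",
        package="text_generator",
        service="Predictor",
        method="Predict",
        payload={"text": "hello-world"},
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Building the arguments as a list keeps quoting of the JSON payload out of the caller's hands. The list could also be passed to `subprocess.run`, assuming `grpcurl` is installed and the endpoint is replaced with a real address.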
