| 1 | +# Deploy Image Classification as an API |
| 2 | + |
| 3 | +This example shows how to deploy an image classifier built with PyTorch. It uses a pretrained AlexNet model from torchvision that has been exported to ONNX format. |
| 4 | + |
| 5 | +## Define a deployment |
| 6 | + |
| 7 | +```yaml |
| 8 | +- kind: deployment |
| 9 | + name: image-classifier |
| 10 | + |
| 11 | +- kind: api |
| 12 | + name: alexnet |
| 13 | + model: s3://cortex-examples/image-classifier/alexnet.onnx |
| 14 | + request_handler: alexnet_handler.py |
| 15 | +``` |
| 16 | +
|
| 17 | +A `deployment` specifies a set of resources that are deployed as a single unit. An `api` makes a model available as a web service that can serve real-time predictions. This configuration downloads the model from the `cortex-examples` S3 bucket, then uses the functions defined in `alexnet_handler.py` to preprocess each request payload and postprocess the model's output. |
| 18 | + |
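As a rough mental model of the flow described above (not Cortex's actual implementation), a request passes through a preprocessing step, the model, and a postprocessing step in sequence. The `serve` function and the toy stand-ins below are hypothetical names used only for illustration:

```python
def serve(sample, pre_inference, run_model, post_inference, metadata=None):
    """Hypothetical request flow: preprocess, run inference, postprocess."""
    model_input = pre_inference(sample, metadata)
    prediction = run_model(model_input)
    return post_inference(prediction, metadata)


# Toy stand-ins so the sketch runs without a model or network access.
def toy_pre(sample, metadata):
    return [float(x) for x in sample["values"]]


def toy_model(model_input):
    total = sum(model_input)
    return [x / total for x in model_input]


def toy_post(prediction, metadata):
    labels = ["cat", "dog", "bird"]
    return labels[prediction.index(max(prediction))]


result = serve({"values": [1, 5, 2]}, toy_pre, toy_model, toy_post)
print(result)
```

The point of the sketch is only the ordering: the handler functions wrap the model call, so the client never sees the raw model input or output.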
| 19 | +## Add request handling |
| 20 | + |
| 21 | +The AlexNet model expects the image as a normalized array of RGB pixel values, but the API should accept a simpler input format such as a URL to an image. Likewise, instead of returning the model's raw output, which consists of an array of per-class scores, the API should return the name of the class with the highest score. Define a `pre_inference` function that downloads the image from the specified URL (or decodes it from a base64 string) and converts it to the input the model expects, and a `post_inference` function that returns the name of the highest-scoring class: |
| 22 | + |
| 23 | +```python |
| 24 | +import requests |
| 25 | +import numpy as np |
| 26 | +import base64 |
| 27 | +from PIL import Image |
| 28 | +from io import BytesIO |
| 29 | +from torchvision import transforms |
| 30 | +
|
| 31 | +labels = requests.get( |
| 32 | + "https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt" |
| 33 | +).text.split("\n")[1:] |
| 34 | +
|
| 35 | +
|
| 36 | +# https://github.com/pytorch/examples/blob/447974f6337543d4de6b888e244a964d3c9b71f6/imagenet/main.py#L198-L199 |
| 37 | +normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) |
| 38 | +preprocess = transforms.Compose( |
| 39 | + [transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(), normalize] |
| 40 | +) |
| 41 | +
|
| 42 | +
|
| 43 | +def pre_inference(sample, metadata): |
| 44 | + if "url" in sample: |
| 45 | + image = requests.get(sample["url"]).content |
| 46 | + elif "base64" in sample: |
| 47 | + image = base64.b64decode(sample["base64"]) |
| 48 | +
|
| 49 | + img_pil = Image.open(BytesIO(image)).convert("RGB")  # force 3 channels so Normalize matches |
| 50 | + img_tensor = preprocess(img_pil) |
| 51 | + img_tensor.unsqueeze_(0) |
| 52 | + return img_tensor.numpy() |
| 53 | +
|
| 54 | +
|
| 55 | +def post_inference(prediction, metadata): |
| 56 | + return labels[np.argmax(np.array(prediction).squeeze())] |
| 57 | +``` |
| 58 | + |
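Before deploying, it can help to check the postprocessing logic locally against a fake model output. The following is a minimal sketch that assumes a made-up three-class label list in place of the ImageNet labels and a hand-written prediction in place of real model output:

```python
import numpy as np

# Hypothetical three-class label list; the real handler loads the ImageNet labels.
labels = ["tabby", "Eskimo dog", "goldfish"]


def post_inference(prediction, metadata):
    # Same logic as the handler: drop singleton dimensions, pick the top class.
    return labels[np.argmax(np.array(prediction).squeeze())]


# Fake model output with a leading batch dimension, given as a nested list.
fake_prediction = [[0.1, 0.7, 0.2]]
top_label = post_inference(fake_prediction, metadata=None)
print(top_label)
```

The `squeeze()` call is what lets the same function handle output with or without the batch dimension.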
| 59 | +## Deploy to AWS |
| 60 | + |
| 61 | +`cortex deploy` takes the declarative configuration from `cortex.yaml` and creates it on the cluster: |
| 62 | + |
| 63 | +```bash |
| 64 | +$ cortex deploy |
| 65 | +
|
| 66 | +deployment started |
| 67 | +``` |
| 68 | + |
| 69 | +Behind the scenes, Cortex containerizes the models, makes them servable using ONNX Runtime, exposes the endpoint with a load balancer, and orchestrates the workload on Kubernetes. |
| 70 | + |
| 71 | +You can track the status of the API using `cortex get`: |
| 72 | + |
| 73 | +```bash |
| 74 | +$ cortex get alexnet --watch |
| 75 | +
|
| 76 | +status up-to-date available requested last update avg latency |
| 77 | +live 1 1 1 12s - |
| 78 | +``` |
| 79 | + |
| 80 | +The output above indicates that one replica of the API was requested and one replica is available to serve predictions. Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity. |
| 81 | + |
| 82 | +## Serve real-time predictions |
| 83 | + |
| 84 | +```bash |
| 85 | +$ cortex get alexnet |
| 86 | +
|
| 87 | +url: http://***.amazonaws.com/image-classifier/alexnet |
| 88 | +
|
| 89 | +$ curl http://***.amazonaws.com/image-classifier/alexnet \ |
| 90 | + -X POST -H "Content-Type: application/json" \ |
| 91 | + -d '{"url": "https://bowwowinsurance.com.au/wp-content/uploads/2018/10/akita-700x700.jpg"}' |
| 92 | +
|
| 93 | +"Eskimo dog" |
| 94 | +``` |
| 95 | + |
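The handler also accepts a `"base64"` key instead of `"url"`. A minimal sketch of building such a request body follows; `build_base64_payload` is a hypothetical helper, and the image bytes are a placeholder rather than a real image:

```python
import base64
import json


def build_base64_payload(image_bytes):
    """Encode raw image bytes as the JSON body for the handler's "base64" branch."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"base64": encoded})


# Placeholder bytes; in practice these would be read from an image file in binary mode.
image_bytes = b"\x89PNG placeholder bytes"
payload = build_base64_payload(image_bytes)

# Round trip: this mirrors what pre_inference does with sample["base64"].
decoded = base64.b64decode(json.loads(payload)["base64"])
print(decoded == image_bytes)
```

The resulting string can be sent as the body of the same POST request shown above, in place of the `{"url": ...}` payload.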
| 96 | +Any questions? [chat with us](https://gitter.im/cortexlabs/cortex). |