This repository contains the code and articles from the Neural Bits Newsletter that showcase:
- how to optimize and quantize models for the best performance
- how to serve models efficiently in production environments at scale

| ID | 📝 Article | 💻 Code | Details | Complexity | Tech Stack |
|---|---|---|---|---|---|
| 001 | Inference Engines Profiling | Here | Profile a CNN model across PyTorch, ONNX, TensorRT, and TorchCompile | 🟩🟩⬜ | Python, Jupyter |