
multi-modal-ai/production-hub

Neural Bits Production Hub

This repository collects code and articles from the Neural Bits Newsletter that showcase:

  • how to optimize and quantize models for better performance
  • efficient model serving in production environments at scale

Inference Engines

| ID | 📝 Article | 💻 Code | Details | Complexity | Tech Stack |
|---|---|---|---|---|---|
| 001 | Inference Engines Profiling | Here | Profile a CNN model across PyTorch, ONNX, TensorRT, and TorchCompile | 🟩🟩⬜ | Python, Jupyter |
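As a rough illustration of what profiling one model across several inference engines involves, the sketch below times interchangeable callables with warm-up runs and reports mean, median, and p95 latency. It is a minimal standard-library harness and not the code from the article: `profile_latency`, `fake_backend`, and the `eager`/`optimized` labels are hypothetical names, and the fake workloads stand in for real PyTorch, ONNX, TensorRT, or compiled forward passes.

```python
import statistics
import time
from typing import Callable, Dict


def profile_latency(fn: Callable[[], object], warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time repeated calls to fn and return latency statistics in milliseconds."""
    # Warm-up calls let lazy initialisation, compilation, and caches settle
    # before any measurement is recorded.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
    }


def fake_backend(work: int) -> Callable[[], int]:
    # Hypothetical stand-in for a model forward pass on a given engine.
    def run() -> int:
        return sum(i * i for i in range(work))
    return run


# Assumed backend labels; real entries would wrap each engine's inference call.
backends = {"eager": fake_backend(20_000), "optimized": fake_backend(2_000)}
results = {name: profile_latency(fn) for name, fn in backends.items()}

for name, stats in results.items():
    print(f"{name:>10}: mean={stats['mean_ms']:.3f} ms  p95={stats['p95_ms']:.3f} ms")
```

When the timed callable runs on a GPU, it would also need to synchronize the device before the clock stops (for example with `torch.cuda.synchronize()`), otherwise only the kernel launch is measured.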

Model Deployment

Scale & Production

Optimization & Deployment

About

Hands-on hub for learning techniques to optimize AI models and serve them efficiently in production.
