Deployment¶
Build a prototype / Separate your model and UI¶
Batch Prediction - Run model on each data point and store results in a database¶
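A minimal sketch of the batch pattern, assuming a SQLite table as the result store. The `predict` function is a hypothetical stand-in for a trained model, and the `predictions` table and its column names are illustrative, not taken from any specific project.

```python
import sqlite3


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]  # assumed weights, for illustration only
    return sum(w * x for w, x in zip(weights, features))


def run_batch(rows, conn):
    """Score every (item_id, features) pair and store the results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(item_id INTEGER PRIMARY KEY, score REAL)"
    )
    results = [(item_id, predict(features)) for item_id, features in rows]
    # INSERT OR REPLACE lets the batch job be rerun without duplicate rows.
    conn.executemany(
        "INSERT OR REPLACE INTO predictions (item_id, score) VALUES (?, ?)",
        results,
    )
    conn.commit()
    return len(results)


def lookup(conn, item_id):
    """Serving path: read a precomputed score instead of running the model."""
    row = conn.execute(
        "SELECT score FROM predictions WHERE item_id = ?", (item_id,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_batch([(1, [2.0, 4.0]), (2, [1.0, 0.0])], conn)
    print(lookup(conn, 2))
```

The UI only ever calls something like `lookup`, so serving latency is a database read. The trade-off is that results are only as fresh as the last batch run, and inputs not seen at batch time have no stored prediction.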
Model as a service - Run the model as a separate service¶
API¶
- REST
- gRPC
- GraphQL
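As a rough illustration of the REST option, the sketch below runs a model behind its own HTTP endpoint using only the Python standard library. The `/predict` path, the `{"features": [...]}` payload shape, and the `predict` function are all assumptions made for this example. A real service would more likely use a web framework and load an actual model.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]  # assumed weights, for illustration only
    return sum(w * x for w, x in zip(weights, features))


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # assumed endpoint name
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            score = predict(payload["features"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "bad request body")
            return
        body = json.dumps({"score": score}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def start_server(port=0):
    """Start the model service in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_service(port, features):
    """Client side: what the UI would do instead of importing the model."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/predict",
            body=json.dumps({"features": features}),
            headers={"Content-Type": "application/json"},
        )
        response = conn.getresponse()
        raw = response.read()
        return response.status, (json.loads(raw) if response.status == 200 else None)
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_server()
    print(call_service(server.server_address[1], [1.0, 0.0]))
    server.shutdown()
```

Because the UI talks to the model only through `call_service`, the model can be redeployed, scaled, or rewritten in another stack without touching the UI code.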
Learn the tricks to scale¶
Consider moving your model to the edge when you really need to go fast¶
Dependency Management¶
Monitoring Performance¶
Performance Optimization¶
Revisit later
Edge deployment¶
- NVIDIA - TensorRT
- Android - ML Kit
- iOS - Core ML
- iOS and Android/Python
- TensorFlow - TFLite
- Browser - TensorFlow.js
- Target device agnostic - Apache TVM