Managing the end-to-end machine learning lifecycle with MLflow

Machine learning requires experimenting with datasets, data preparation steps, and algorithms, then deploying models to a production system and retraining them on new data. MLflow is an open source platform for managing the end-to-end machine learning lifecycle.


Please work through the installation instructions and download the data before attending. The internet connection at the venue might not be sufficient.

Instructions and data can be found here: https://github.com/tsterbak/pydataberlin-2019

Machine learning requires experimenting with a wide range of datasets, data preparation steps, and algorithms to build a model that maximizes some target metric. Once you have built a model, you also need to deploy it to a production system, monitor its performance, continuously retrain it on new data, and compare it with alternative models. MLflow offers one possible way to manage this complexity: it is an open source platform for managing the end-to-end machine learning lifecycle.

This tutorial showcases how you can use MLflow end-to-end to:
* Train models and keep track of experiments with MLflow Tracking
* Package the code that trains the model in a reusable and reproducible model format with MLflow Projects
* Deploy the model to an HTTP server that will enable you to score predictions with MLflow Models
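
To give a feel for the last step, here is a minimal sketch of how a client might prepare a scoring request for a model served over HTTP. It assumes a server started locally with `mlflow models serve` on port 5000 and the `dataframe_split` payload accepted by MLflow 2.x; older releases expected a different JSON layout, so check the version you install. The helper name `build_scoring_request` and the feature names are hypothetical and only serve as illustration.

```python
import json
import urllib.request


def build_scoring_request(columns, rows, host="127.0.0.1", port=5000):
    """Build (but do not send) a POST request for an MLflow scoring server.

    Hypothetical helper, not part of MLflow itself.
    """
    # Assumption: MLflow 2.x style payload. Older versions expected the
    # split-oriented dict (columns/data) at the top level instead.
    payload = {
        "dataframe_split": {
            "columns": list(columns),
            "data": [list(row) for row in rows],
        }
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical feature names and values, for illustration only.
request = build_scoring_request(["alcohol", "pH"], [[9.4, 3.51], [9.8, 3.2]])

# Sending the request needs a running model server:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())
```

Building the request separately from sending it keeps the sketch runnable without a server or internet connection, which fits the offline setup recommended above.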


Domain Expertise:

some

Domains:

Data Science, Infrastructure, Machine Learning, Data Engineering

Python Skill Level:

basic

Abstract as a tweet:

How to manage the end-to-end machine learning lifecycle with MLflow.

Link to talk slides:

https://github.com/tsterbak/pydataberlin-2019

Public link to supporting material:

https://github.com/tsterbak/pydataberlin-2019