Optimizing inference for state-of-the-art Python models
08-31, 14:35–14:50 (Europe/Zurich), HS 118

This talk will take state-of-the-art Python models and show how, through advanced inference techniques, we can drastically increase the performance of the models at runtime. You’ll learn about the open-source MLServer project and see live how easily it serves Python-based machine learning models.
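As a taste of what this looks like in practice, here is a minimal sketch of a custom runtime using MLServer's MLModel API. The DoublingModel class and its toy logic are purely illustrative stand-ins for a real model, and decode_args is MLServer's documented helper for mapping V2 inference requests to plain NumPy arrays:

    # models.py -- a toy custom runtime; class name and logic are illustrative
    import numpy as np

    from mlserver import MLModel
    from mlserver.codecs import decode_args


    class DoublingModel(MLModel):
        async def load(self) -> bool:
            # A real runtime would load weights or a pipeline here.
            self.ready = True
            return self.ready

        @decode_args
        async def predict(self, payload: np.ndarray) -> np.ndarray:
            # decode_args converts V2 protocol requests to/from NumPy
            # arrays, matching request inputs to parameter names
            # ("payload" here).
            return payload * 2

Paired with a model-settings.json that points at the class, running "mlserver start ." exposes the model over both REST and gRPC using the V2 inference protocol:

    model-settings.json:
    {
        "name": "doubling-model",
        "implementation": "models.DoublingModel"
    }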


Machine learning models are often created with an emphasis on how they run during training, with little regard for how they’ll perform in production. In this talk, you’ll learn what those production issues are and how to address them, using some state-of-the-art models as examples. We’ll introduce the open-source MLServer project and look at how features such as multi-model serving and adaptive batching can optimize performance for your models (see the sketch below). Finally, you’ll learn how using an inference server locally can speed up the time to deployment when moving to production.
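To make those two features concrete, the sketch below shows how adaptive batching is switched on per model via model-settings.json, and how multi-model serving falls out of MLServer loading every model it finds under one root folder. The model names and values are illustrative; the max_batch_size and max_batch_time fields follow MLServer's documented settings schema at the time of writing:

    models/
    ├── sentiment/
    │   └── model-settings.json
    └── summarizer/
        └── model-settings.json

    models/sentiment/model-settings.json (illustrative values):
    {
        "name": "sentiment",
        "implementation": "runtime.SentimentModel",
        "max_batch_size": 32,
        "max_batch_time": 0.1
    }

Running "mlserver start models/" then serves both models from a single process, and requests to the sentiment model are transparently grouped into batches of up to 32 items, or whatever arrives within 0.1 seconds, before reaching predict().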


Public link to supporting material

https://github.com/SeldonIO/MLServer

Abstract as a tweet

This talk will take state-of-the-art Python models and show how, through advanced inference techniques, we can drastically increase the performance of the models at runtime.

Project Homepage / Git

https://github.com/SeldonIO/MLServer

Domains

Machine Learning

Expected audience expertise: Domain

Some

Expected audience expertise: Python

Some

Ed comes from a cloud computing background and is a strong believer in making deployments as easy as possible for developers. With an education in computational modelling and an enthusiasm for machine learning, Ed has blended his work in ML and cloud-native computing to cement himself firmly in the emerging field of MLOps. An organiser of Tech Ethics London and MLOps London, Ed is heavily involved in developer communities and, thankfully, loves both beer and pizza.