2022-08-31, HS 118
This talk will take state-of-the-art Python models and show how, through advanced inference techniques, we can drastically increase their performance at runtime. You'll learn about the open source MLServer project and see live how easily it can serve Python-based machine learning models.
Machine learning models are often built with an emphasis on how they run during training, with little regard for how they'll perform in production. In this talk, you'll learn what problems this causes and how to address them, using some state-of-the-art models as examples. We'll introduce the open source MLServer project and look at how features such as multi-model serving and adaptive batching can optimize your models' performance, as the sketch below illustrates. Finally, you'll learn how running an inference server locally can speed up the time to deployment when moving to production.
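As a flavour of what the live demo covers, here is a minimal sketch of a custom MLServer runtime using MLServer's documented MLModel API; the class name and the placeholder model are illustrative assumptions, not code from the talk.

from mlserver import MLModel
from mlserver.codecs import decode_args
import numpy as np

class DemoModel(MLModel):
    async def load(self) -> bool:
        # Load your trained model once at startup; a stand-in is used here.
        self._model = lambda x: x * 2  # hypothetical placeholder model
        self.ready = True
        return self.ready

    @decode_args
    async def predict(self, payload: np.ndarray) -> np.ndarray:
        # decode_args maps the V2 inference protocol to and from NumPy arrays.
        return self._model(payload)

Serving this with 'mlserver start .' alongside a model-settings.json that points at the class exposes REST and gRPC endpoints, and, per MLServer's documentation, setting "max_batch_size" and "max_batch_time" in that file is how adaptive batching is switched on.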
Project Homepage / Git:
Domains: Machine Learning
Expected audience expertise: Domain: some
Expected audience expertise: Python: some
Ed comes from a cloud computing background and is a strong believer in making deployments as easy as possible for developers. With an education in computational modelling and an enthusiasm for machine learning, Ed has blended his work in ML and cloud-native computing to cement himself firmly in the emerging field of MLOps. Organiser of Tech Ethics London and MLOps London, Ed is heavily involved in many developer communities and, thankfully, loves both beer and pizza.