PyCon DE & PyData 2025

Building a Self-Hosted MLOps Platform with Kubernetes
2025-04-24, Europium2

Many managed MLOps platforms, while convenient, often fall short on flexibility, require complex integrations, and cause vendor lock-in. In this talk, we’ll share our experience transitioning from managed MLOps tools to a self-hosted solution built on Kubernetes. We’ll focus on how we used open-source tools like Feast, MLflow, and Ray to build a more flexible, scalable, and customizable platform that is now in use at Rewe Digital. By migrating to this self-hosted architecture, we gained greater control over our ML pipelines, reduced our dependency on third-party services, and created a more adaptable infrastructure for our ML workloads.


Talk Outline:

  1. Introduction (5 minutes):
    - The challenges of using managed MLOps platforms: vendor lock-in, integration complexity, and lack of flexibility.
    - Why transitioning to a self-hosted solution on Kubernetes can be beneficial.

  2. Proposed Solution (10 minutes):
    - Why Kubernetes for MLOps?
    - How open-source tools like Feast, MLflow, and Ray come together to form the core of a robust self-hosted MLOps stack.
    - Benefits of building a flexible, scalable platform that fits your needs.

  3. Building the Platform (10 minutes):
    - Practical steps for setting up and configuring Feast, MLflow, and Ray on Kubernetes.
    - Integration strategies and how to manage pipelines, model tracking, and feature storage.

  4. Lessons Learned and Q&A (5 minutes):
    - Challenges and takeaways from the migration process.
    - Q&A


Expected audience expertise (Domain): Intermediate

Expected audience expertise (Python): Novice

I'm Josef, an econometrician turned ML engineer. With a strong background in statistics and causal inference, I have developed my skills through rigorous work at institutions such as the University of Bonn and UC Berkeley, as well as through the design and implementation of ML solutions at the Rewe Group. My passion lies in reducing model and ecosystem complexity, enhancing interpretability, and bridging the gap between academia and production settings in machine learning. I believe that if we do not establish reliable machine learning systems, we risk failing to harness the immense potential they offer for humanity.