PyConDE & PyData Berlin 2024

Build TikTok's Personalized Real-Time Recommendation System in Python with Hopsworks
2024-04-22 11:45-13:15 (Africa/Abidjan), A03-A04

The real-time recommendations engine in TikTok, Monolith, is so good it has been described as "digital crack" (by Andrej Karpathy, former head of AI at Tesla). In this tutorial, we will build the core components of TikTok Monolith (a retrieval and ranking architecture): a stream processing feature pipeline, a two-tower embedding model to support personalized queries based on each user's history/context, and a simple user interface in Python (Streamlit). Our real-time machine learning system will consist of 3 Python programs - the feature pipeline, the training pipeline, and the online inference pipeline - and the ML infrastructure they require will be provided by the open-source Hopsworks platform, including a feature store, vector database, model serving, and model registry.


The real-time recommendations engine in TikTok is so good it has been described as "digital crack" (by Andrej Karpathy, former head of AI at Tesla). It is a retrieval and ranking architecture that uses significant ML infrastructure, including a real-time feature store, a vector database, a model registry, and model serving infrastructure.

In this tutorial, we will build the core components of TikTok Monolith as 3 ML pipelines. The first is a stream processing feature pipeline that takes user actions (clicks, swipes, searches) written to Kafka and computes features that are stored in the Hopsworks online store in less than 1 second.
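
As a rough illustration of what a streaming feature pipeline computes, here is a minimal sketch in plain Python. It does not use Kafka or the Hopsworks API: the event list stands in for a Kafka topic, the `online_store` dict stands in for an online feature store, and the event fields, the 60-second window, and the feature name `clicks_last_window` are all assumptions made for this example.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # hypothetical window length, chosen for illustration


class ClickWindowFeatures:
    """Keeps a per-user sliding window of click timestamps."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._clicks = defaultdict(deque)  # user_id -> click timestamps in window
        self.online_store = {}             # stand-in for an online feature store

    def process(self, event):
        """Consume one event (assumed to arrive in timestamp order)."""
        user_id, ts = event["user_id"], event["ts"]
        window = self._clicks[user_id]
        if event["action"] == "click":
            window.append(ts)
        # Evict clicks that have fallen out of the window.
        while window and window[0] <= ts - self.window_seconds:
            window.popleft()
        # Overwrite the user's feature row with the latest values.
        self.online_store[user_id] = {
            "clicks_last_window": len(window),
            "last_event_ts": ts,
        }


# Hypothetical events; in the tutorial setting these would be read from Kafka.
events = [
    {"user_id": "u1", "action": "click", "ts": 0},
    {"user_id": "u1", "action": "swipe", "ts": 10},
    {"user_id": "u1", "action": "click", "ts": 30},
    {"user_id": "u2", "action": "search", "ts": 40},
    {"user_id": "u1", "action": "click", "ts": 70},
]

pipeline = ClickWindowFeatures()
for event in events:
    pipeline.process(event)

print(pipeline.online_store["u1"])  # {'clicks_last_window': 2, 'last_event_ts': 70}
```

The feature row is rewritten on every event, so a reader of the store always sees the most recent value for a user, which is the property the online inference pipeline relies on.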
We will train a two-tower embedding model to support personalized queries, using training data based on each user's history/context and the videos they did or did not click on.
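
The two-tower idea can be sketched without any ML framework. In the toy version below, each "tower" is reduced to an embedding lookup table (a real two-tower model would typically use neural networks over user and video features), the score is the dot product of the two embeddings, and training is plain stochastic gradient descent on a log loss. The interaction data, embedding size, and learning rate are invented for this example.

```python
import math
import random

random.seed(0)
DIM = 4          # hypothetical embedding size
LR = 0.1         # hypothetical learning rate
EPOCHS = 500


def init_table(keys):
    """One small random embedding per key."""
    return {k: [random.uniform(-0.1, 0.1) for _ in range(DIM)] for k in keys}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# (user, video, clicked) triples; made-up training data.
interactions = [
    ("u1", "cats", 1), ("u1", "dogs", 1), ("u1", "cars", 0),
    ("u2", "cars", 1), ("u2", "cats", 0), ("u2", "dogs", 0),
]

user_tower = init_table(["u1", "u2"])
video_tower = init_table(["cats", "dogs", "cars"])

for _ in range(EPOCHS):
    for user, video, clicked in interactions:
        u, v = user_tower[user], video_tower[video]
        # Gradient of the log loss with respect to the logit dot(u, v).
        err = sigmoid(dot(u, v)) - clicked
        for i in range(DIM):
            u_i, v_i = u[i], v[i]
            u[i] -= LR * err * v_i
            v[i] -= LR * err * u_i


def score(user, video):
    """Predicted click probability for a (user, video) pair."""
    return sigmoid(dot(user_tower[user], video_tower[video]))


print(score("u1", "cats") > score("u1", "cars"))
```

Because the two towers only meet at the dot product, video embeddings can be computed ahead of time and indexed, while the user (query) embedding is computed at request time.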
We will develop an online inference pipeline that takes a user query, encodes it as an embedding to retrieve candidate videos, then uses an online feature store to enrich the candidates before a ranking model personalizes the order of candidates for the client. We will even develop a simple user interface in Python (Streamlit) to show the whole system working visually.
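
The retrieve, enrich, and rank flow described above can be sketched as follows. Everything here is a stand-in chosen for illustration: the dicts replace the vector database and online feature store, brute-force dot-product search replaces approximate nearest-neighbour lookup, and the hand-written linear `rank` function with its weights replaces a trained ranking model.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Stand-in for the query tower: precomputed user embeddings (made-up values).
USER_EMBEDDINGS = {"u1": [1.0, 0.0]}

# Stand-in for a vector database of video embeddings.
VIDEO_INDEX = {
    "v1": [0.9, 0.1],
    "v2": [0.8, 0.3],
    "v3": [-0.7, 0.2],
    "v4": [0.1, 0.9],
}

# Stand-in for an online feature store keyed by video id.
VIDEO_FEATURES = {
    "v1": {"ctr_last_hour": 0.02},
    "v2": {"ctr_last_hour": 0.10},
    "v3": {"ctr_last_hour": 0.50},
    "v4": {"ctr_last_hour": 0.05},
}


def encode_query(user_id):
    """Map a user to a query embedding (here just a lookup)."""
    return USER_EMBEDDINGS[user_id]


def retrieve(query_embedding, k):
    """Return the k videos most similar to the query (brute force)."""
    by_similarity = sorted(
        VIDEO_INDEX,
        key=lambda vid: dot(query_embedding, VIDEO_INDEX[vid]),
        reverse=True,
    )
    return by_similarity[:k]


def rank(query_embedding, candidates):
    """Enrich candidates with fresh features, then order them."""
    def ranking_score(vid):
        similarity = dot(query_embedding, VIDEO_INDEX[vid])
        features = VIDEO_FEATURES[vid]  # feature store lookup
        # Hypothetical hand-set weights in place of a trained ranking model.
        return 0.5 * similarity + 5.0 * features["ctr_last_hour"]

    return sorted(candidates, key=ranking_score, reverse=True)


def recommend(user_id, k=2):
    query_embedding = encode_query(user_id)
    candidates = retrieve(query_embedding, k)
    return rank(query_embedding, candidates)


recommendations = recommend("u1")
print(recommendations)  # ['v2', 'v1']
```

Retrieval narrows the catalogue to a few candidates cheaply, and ranking then reorders only those candidates using fresher features. That is why `v3`, despite its high click-through rate, never reaches the ranking step for this user.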

Our real-time machine learning system will consist of 3 Python programs - the feature pipeline, the training pipeline, and the online inference pipeline - and the ML infrastructure they require will be provided by the open-source Hopsworks platform, including a feature store, vector database, model serving, and model registry.


Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Intermediate

Abstract as a tweet (X) or toot (Mastodon):

The real-time recommendations engine, Monolith, in TikTok is so good it has been described as "digital crack". In 1 hr, we will build Monolith in Python as 3 ML pipelines that run on Hopsworks.

Public link to supporting material, e.g. videos, Github, etc.:

https://github.com/logicalclocks/hopsworks-tutorials/tree/master/advanced_tutorials/recommender-system

Jim Dowling is CEO of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. He is lead architect of the open-source Hopsworks platform, a horizontally scalable data platform for machine learning that includes the industry's first Feature Store. He is writing a book for O'Reilly on ML Systems with a feature store.