2025-12-08 – Abigail Adams
Recommender systems power everything from e-commerce to media streaming, but most pipelines still rely on collaborative filtering or neural models that focus narrowly on user–item interactions. Large language models (LLMs), by contrast, excel at reasoning across unstructured text, contextual information, and explanations.
This tutorial bridges the two worlds. Participants will build a hybrid recommender system that uses structured embeddings for retrieval and integrates an LLM layer for personalization and natural-language explanations. We’ll also discuss practical engineering constraints: scaling, latency, caching, distillation/quantization, and fairness.
By the end, attendees will leave with a working hybrid recommender they can extend for their own data, along with a playbook for when and how to bring LLMs into recommender workflows responsibly.
0 – 10 min | Kickoff & Setup
Overview & goals: From collaborative filtering to hybrid LLM recommenders
Environment setup: notebooks, API keys (e.g., OpenAI/Anthropic), vector DB (e.g., FAISS or Chroma)
Verify everyone can run the starter code
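For attendees who want to check ahead of time, a minimal sanity check along these lines can confirm the environment works end to end. The library choices (sentence-transformers, faiss-cpu) and the OPENAI_API_KEY variable are assumptions for illustration, not the official starter code.

```python
# Environment sanity check -- a sketch assuming sentence-transformers,
# faiss-cpu, and an OpenAI key; swap in Chroma/Anthropic if that is your stack.
import os

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY before the LLM steps"

model = SentenceTransformer("all-MiniLM-L6-v2")      # small, CPU-friendly encoder
vec = model.encode(["hello recommender world"])      # shape (1, 384)

index = faiss.IndexFlatIP(vec.shape[1])              # inner-product index
index.add(np.asarray(vec, dtype="float32"))
print("Environment OK:", index.ntotal, "vector indexed")
```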
10 – 25 min | Step 1 Build the Baseline Recommender
Mini-lecture: Quick recap of user–item collaborative filtering
Hands-on: Implement a simple embedding-based retriever (see the sketch after this step)
Compute item embeddings
Perform top-k retrieval with cosine similarity
Checkpoint: Participants produce a working non-LLM recommender
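For reference, a minimal version of this checkpoint retriever might look like the sketch below; the encoder name, toy item catalog, and top_k value are illustrative assumptions, not the tutorial's dataset.

```python
# Minimal embedding-based retriever: encode items once, then rank by cosine
# similarity at query time. Items and model are illustrative placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

items = [
    "wireless noise-cancelling headphones",
    "vinyl record player with built-in speakers",
    "beginner acoustic guitar kit",
    "bluetooth portable speaker",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
item_vecs = model.encode(items, normalize_embeddings=True)   # unit-length rows

def recommend(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top-k items by cosine similarity to the query text."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = item_vecs @ q          # dot product == cosine on unit vectors
    best = np.argsort(-scores)[:top_k]
    return [(items[i], float(scores[i])) for i in best]

print(recommend("something to listen to music on the go"))
```

Once the catalog grows, swapping the in-memory matrix for the FAISS or Chroma index from setup is a small change to the same interface.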
25 – 45 min | Step 2 Add the LLM Layer
Concept: Why and how to integrate an LLM for reasoning and personalization
Hands-on:
Prompt the LLM with retrieved items + user profile/context (see the sketch after this step)
Generate natural-language recommendations (“Because you liked X…”)
Tune prompts for diversity and tone
Demo: Instructor shows few-shot prompting and reasoning variations
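To make the hands-on portion concrete, here is one possible shape for the LLM layer, assuming the OpenAI Python SDK; the model name, prompt wording, and profile fields are illustrative, not the tutorial's official prompt.

```python
# Sketch of the LLM personalization/explanation layer. Assumes the OpenAI
# Python SDK and OPENAI_API_KEY in the environment; model and prompt are
# placeholders you would tune during the session.
from openai import OpenAI

client = OpenAI()

def explain_recommendations(user_profile: str, candidates: list[str]) -> str:
    """Ask the LLM to pick and explain items from the retrieved candidates."""
    prompt = (
        f"User profile: {user_profile}\n"
        "Candidate items:\n"
        + "\n".join(f"- {item}" for item in candidates)
        + "\n\nRecommend the two best items for this user and justify each in "
          "one sentence starting with 'Because you liked ...'."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",     # placeholder; any chat-capable model works
        messages=[
            {"role": "system", "content": "You are a helpful recommendation assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.7,         # a little diversity in tone
    )
    return response.choices[0].message.content

print(explain_recommendations(
    "enjoys lo-fi playlists and commutes by train",
    ["wireless noise-cancelling headphones", "bluetooth portable speaker"],
))
```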
45 – 65 min | Step 3 Scale and Optimize
Mini-lecture: Real-world engineering constraints
Latency, cost, caching, and distillation
Hands-on:
Implement a local cache for repeated prompts (see the sketch after this step)
Compare quantized vs full LLM inference latency
Discussion: What trade-offs make sense in production
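As one concrete angle on the caching exercise, a sketch of an in-memory prompt cache is shown below; call_llm is a stand-in for the Step 2 function, and the sleep simulates model latency.

```python
# Prompt-cache sketch: identical prompts are served from memory instead of
# re-calling the LLM. call_llm is a stand-in; the sleep fakes model latency.
import hashlib
import time

_cache: dict[str, str] = {}

def call_llm(prompt: str) -> str:
    """Placeholder for the real LLM call built in Step 2."""
    time.sleep(1.0)                          # simulate network + inference time
    return f"response to: {prompt[:40]}"

def cached_call(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)       # cache miss: pay full latency once
    return _cache[key]                       # cache hit: effectively free

for label in ("cold call", "warm call"):
    start = time.perf_counter()
    cached_call("Recommend something for a jazz fan")
    print(f"{label}: {time.perf_counter() - start:.3f}s")
```

The same timing harness can wrap a quantized and a full-precision model to compare inference latency in the second exercise.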
65 – 80 min | Step 4 Fairness & Trust
Mini-lecture: Sources of bias in recommender + LLM layers
Hands-on audit:
Evaluate example outputs for bias or stereotype reinforcement (see the sketch after this step)
Adjust prompts or weighting strategies
Group reflection: Building explainable and fair recommenders
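For the audit, a deliberately simple keyword screen like the one below can seed the discussion; the term list and samples are toy assumptions, and a real audit would combine curated lexicons, aggregate metrics, and human review.

```python
# Toy audit: flag explanations that mention attributes the recommendation
# should not hinge on. Term list and samples are illustrative only.
SENSITIVE_TERMS = {"gender", "women", "men", "elderly", "nationality", "religion"}

def flag_explanation(text: str) -> list[str]:
    """Return any sensitive terms appearing in an LLM-generated explanation."""
    tokens = {t.strip(".,!?").lower() for t in text.split()}
    return sorted(SENSITIVE_TERMS & tokens)

samples = [
    "Because you liked jazz vinyl, you might enjoy this turntable.",
    "Recommended because women usually prefer this style of headphones.",
]
for s in samples:
    hits = flag_explanation(s)
    print("FLAG" if hits else "OK  ", hits, "-", s)
```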
80 – 90 min | Wrap-Up & Next Steps
Review: key building blocks and takeaways
Share code repo + slides + resources for further exploration
Q&A and discussion of deployment patterns
Sheetal Borar is a senior applied scientist at Etsy, where she works on retrieval systems that power large-scale recommenders. She has spoken at PyData Global and PyData NYC, has several publications to her name, and is a strong advocate for knowledge sharing and community building. She has about five years of professional experience building machine learning solutions across multiple industries.
Astha is a Senior Data Scientist at CVS Health, where she leads the design of recommendation engines for digital platforms, helping customers discover the right products and enabling patients to access the appropriate health services and support. She specializes in home screen personalization, leveraging data-driven insights to enhance user experiences. With a strong background in the tech industry, she is now applying her expertise to transform and innovate within the healthcare sector.