Theo van Kraay
Theo is passionate about NoSQL and distributed computing. He joined Microsoft in 2017 and has been in the Cosmos DB Engineering team as a Program Manager since 2019. He currently focuses on AI, programmability, and developer experience for Azure Cosmos DB. He has a master’s degree in Data Science from Dundee University, and lives in the UK with his wife, two boys, and ragcoon cat.
Session
Multi-agent GenAI systems don’t fail because models lack intelligence; they fail because they lack memory.
As LLM applications move from demos to production, semantic memory becomes the defining systems challenge. Agents must remember user preferences, share context across roles, preserve conversational state across sessions, and evolve over time, all without exploding token costs or losing observability.
In this talk, I’ll explore semantic memory as a data engineering problem rather than a prompt engineering trick. Drawing on real-world experience from the Azure Cosmos DB engineering team, we’ll examine how to design layered memory for multi-agent systems in Python: short-term conversational state, episodic event logs, declarative and procedural memory, and retrieval-driven personalization.
Using a practical multi-agent travel planner built with LangGraph, we’ll implement patterns such as session-level versus per-turn persistence, hybrid retrieval design (structured filters plus semantic signals), memory lifecycle management (write, retrieve, summarize, supersede, expire), and checkpointed workflows for reproducibility and debugging.
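As a rough illustration of the lifecycle steps and hybrid retrieval named above, here is a minimal in-memory sketch. It is not the talk's demo code and does not use LangGraph or Azure Cosmos DB. Every name in it (`MemoryItem`, `MemoryStore`, the `kind` and `key` fields) is hypothetical, and the bag-of-words cosine score is only a stand-in for real embeddings. The summarize step is left out for brevity.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryItem:
    user_id: str
    kind: str                     # hypothetical categories, e.g. "preference" or "episode"
    key: str                      # identity used to supersede an older fact
    text: str
    created_at: float             # seconds on an arbitrary clock
    ttl: Optional[float] = None   # lifetime in seconds; None means never expires
    superseded: bool = False


def _vec(text: str) -> dict[str, int]:
    """Toy bag-of-words vector, standing in for a real embedding."""
    counts: dict[str, int] = {}
    for tok in text.lower().split():
        counts[tok] = counts.get(tok, 0) + 1
    return counts


def _cosine(a: dict[str, int], b: dict[str, int]) -> float:
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def _is_live(m: MemoryItem, now: float) -> bool:
    return m.ttl is None or now - m.created_at < m.ttl


class MemoryStore:
    def __init__(self) -> None:
        self.items: list[MemoryItem] = []

    def write(self, item: MemoryItem) -> None:
        # Supersede: flag any older item with the same identity, keep it for audit.
        for old in self.items:
            if (old.user_id, old.kind, old.key) == (item.user_id, item.kind, item.key):
                old.superseded = True
        self.items.append(item)

    def expire(self, now: float) -> int:
        # Expire: physically drop items whose TTL has elapsed; return how many.
        before = len(self.items)
        self.items = [m for m in self.items if _is_live(m, now)]
        return before - len(self.items)

    def retrieve(self, user_id: str, kind: str, query: str,
                 now: float, k: int = 3) -> list[MemoryItem]:
        # Hybrid retrieval: structured filters first, then a semantic-style ranking.
        q = _vec(query)
        candidates = [
            m for m in self.items
            if m.user_id == user_id and m.kind == kind
            and not m.superseded and _is_live(m, now)
        ]
        candidates.sort(key=lambda m: _cosine(q, _vec(m.text)), reverse=True)
        return candidates[:k]


store = MemoryStore()
store.write(MemoryItem("u1", "preference", "seat", "prefers aisle seat on flights", created_at=0))
store.write(MemoryItem("u1", "preference", "seat", "prefers window seat on flights", created_at=10))
store.write(MemoryItem("u1", "preference", "diet", "vegetarian meals only", created_at=5))
store.write(MemoryItem("u1", "episode", "search-1", "searched hotels in Lisbon", created_at=0, ttl=60))

hits = store.retrieve("u1", "preference", "which seat on flights", now=20)
print([m.text for m in hits])
```

In this sketch the older aisle-seat preference stays in the store but is never returned, because the later write with the same key supersedes it. The episodic item is filtered out of retrieval once its TTL has passed, and is removed entirely by the next call to `expire`.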
You’ll leave with practical design heuristics for building agent systems that become more reliable, more efficient, and more explainable over time.
All demonstrations will be in Python and applicable to production-scale systems.