PyCon DE & PyData 2026

Thomas Prexl

Thomas builds LLM applications that create business impact. He co-founded neunzehn innovations GmbH to bring generative AI into companies that need it.

Before that, he ran startup support in Heidelberg: designing accelerators, connecting founders with money and know-how, and launching events like Neurons & Neckar, Sensors & Data Hackathon, and Startup Weekend Rhein-Neckar. Earlier, he worked in marketing and business development in electrical engineering and diagnostics.

He studied at Mannheim, got his doctorate at Basel, teaches at both Heidelberg and Mannheim, and talks about AI when someone asks him to.


Session

04-14, 17:10, 30min
It Works on My Machine: Why LLM Apps Fail Users (Not Tests)
Thomas Prexl, Frank Rust

LLM applications frequently pass tests but fail users in production. This talk examines the gap between evaluation metrics and user experience through three lenses: Expectations (what "working" means to users), Functional (system-level vs. component-level success), and Operational (real-world reliability).

Drawing from production experience, we'll share scenarios of expectation mismatches, silent failures, and undetected drift—plus practical strategies for bridging the gap. The core message: evaluation should answer whether your system serves users, not whether it passes tests.

PyData: Natural Language Processing & Audio (incl. Generative AI NLP)
Palladium [2nd Floor]