PyData London 2026

Adam Hill

Adam is a Staff Data Scientist at ComplyAdvantage, where he is tackling financial crime with advanced analytics, large-scale systems, and the latest in generative and agentic AI.

Before that, he spent eight years in the smart cities space at HAL24K, helping governments and infrastructure providers make better decisions with their data. Along the way, he built and led a team of ten data scientists and helped launch four spin-out ventures.

A recovering astrophysicist, Adam spent a decade analysing data from space telescopes in search of new cosmic phenomena. He’s since redirected that curiosity toward Earth-based problems.

Adam is an active member of the PyData community, the founder of PyData Southampton, and a long-time volunteer with DataKind UK, supporting charities and NGOs with pro-bono data science.


Session

06-07
15:30
45min
From Chat-with-PDF to Quiz-Master: Live-Grading RAG with LLM-as-Judge in Python
Adam Hill

Most RAG demos stop at retrieval and summarisation. In practice, we also need to measure the understanding of users, models, and the source material. This talk introduces a reusable evaluation pattern that turns any document into a live-graded “exam engine” using Python tools including Docling, DeepEval, and Marimo.

We will build a stateful application that generates multiple-choice and free-text questions from complex documents, creates realistic distractors, and scores answers in real time using an LLM-as-judge pipeline. The demo is intentionally playful, but each component maps to a production concern: layout-aware ingestion (tables and figures), synthetic QA dataset creation, semantic grading, and interactive evaluation loops.
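As a rough illustration of the grading loop described above, here is a minimal Python sketch. It is not the speaker's implementation and does not use the Docling, DeepEval, or Marimo APIs. The names `Question`, `QuizSession`, and `token_overlap_judge` are hypothetical, and the judge is a simple token-overlap stand-in for what would be an LLM call in a real pipeline.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Question:
    """One exam item; an empty distractor list marks a free-text question."""
    prompt: str
    reference: str  # answer grounded in the source document
    distractors: List[str] = field(default_factory=list)

    def options(self, rng: random.Random) -> List[str]:
        """Return the reference answer and distractors in shuffled order."""
        choices = [self.reference, *self.distractors]
        rng.shuffle(choices)
        return choices


# A judge maps (question, reference, answer) to a score in [0, 1].
Judge = Callable[[str, str, str], float]


def token_overlap_judge(question: str, reference: str, answer: str) -> float:
    """Hypothetical stand-in for an LLM judge.

    Scores the fraction of reference tokens that also appear in the answer.
    A real pipeline would replace this with a model call and a rubric.
    """
    ref = set(reference.lower().split())
    ans = set(answer.lower().split())
    return len(ref & ans) / len(ref) if ref else 0.0


@dataclass
class QuizSession:
    """Keeps the running state of one exam: questions, judge, and scores."""
    questions: List[Question]
    judge: Judge
    threshold: float = 0.5  # assumed pass mark for free-text answers
    scores: List[float] = field(default_factory=list)

    def grade(self, index: int, answer: str) -> bool:
        """Score one answer, record it, and report pass or fail."""
        q = self.questions[index]
        if q.distractors:
            # Multiple choice: exact match against the reference option.
            score = 1.0 if answer == q.reference else 0.0
        else:
            # Free text: delegate to the pluggable judge.
            score = self.judge(q.prompt, q.reference, answer)
        self.scores.append(score)
        return score >= self.threshold

    @property
    def mean_score(self) -> float:
        """Average score over all graded answers so far."""
        return sum(self.scores) / len(self.scores) if self.scores else 0.0
```

Because the judge is passed in as a plain callable, the overlap heuristic can be swapped for an LLM-backed scorer without touching the session state, which is the part an interactive notebook would hold between answers.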

Attendees will learn how to move beyond passive RAG towards systems that benchmark knowledge, support training workflows, and enable human-in-the-loop evaluation.

Doddington Forum