Abby Tse is Chair of PyData NYC, where she has led a community of over 8,000 data professionals since 2022. She organizes the annual PyData NYC/Boston Conference, a three-day event that brings together 600+ attendees from around the world to explore the latest in data science, machine learning, and AI. Abby is currently an MBA student at Columbia Business School, where she focuses on entrepreneurship and innovation. Previously, she worked at IBM, where she built enterprise AI systems, including large-scale generative AI applications to improve knowledge access.
- Learn to Unlock Document Intelligence with Open-Source AI
Adam is a Staff Data Scientist at ComplyAdvantage, where they are tackling financial crime with advanced analytics, large-scale systems, and the latest in generative and agentic AI.
Before that, he spent eight years in the smart cities space at HAL24K, helping governments and infrastructure providers make better decisions with their data. Along the way, he built and led a team of ten data scientists and helped launch four spin-out ventures.
A recovering astrophysicist, Adam spent a decade analysing data from space telescopes in search of new cosmic phenomena. He’s since redirected that curiosity toward Earth-based problems.
Adam is an active member of the PyData community, the founder of PyData Southampton, and a long-time volunteer with DataKind UK, supporting charities and NGOs with pro-bono data science.
- From Chat-with-PDF to Quiz-Master: Live-Grading RAG with LLM-as-Judge in Python
I am a researcher with a strong track record of transferring core scientific computing skills across very different technical and scientific backgrounds ranging from radiation detection and medical physics to Earth observation. I have worked across disciplines in academic and industry settings and am particularly drawn to complex problems that require continuous learning and close collaboration across different domains.
- What We Expect from XAI - A scientist’s experience between models and users
I study energy policy at the University of Texas at Austin. My work focuses on residential electrification and improving the efficacy of beneficial electrification upgrades.
- What Can LLMs Do with Messy Residential Electrification Data?
Arghyadeep Sarkar is a Senior Data Scientist at Red Hat with ~8 years of experience in data science and artificial intelligence. His career has evolved from traditional machine learning to architecting large-scale Generative AI and LLM-based production systems.
He built strong foundations in statistical modeling, ML pipelines, and applied AI, later specializing in deep learning, NLP, transformers, and Generative AI. He has designed and deployed LLM agents, RAG-based systems, and enterprise conversational platforms, covering the full lifecycle from training and fine-tuning to scalable deployment.
Current Focus
- Building reliable agentic AI systems
- Improving retrieval grounding and RAG quality
- Deploying LLMs and SLMs in production
- Delivering scalable, cost-efficient enterprise AI solutions
He brings a system-first engineering mindset, translating cutting-edge AI research into robust real-world products.
- The Silent Crash: Why Your RAG Evaluation Metrics Are Lying to You
A seasoned software engineer working on both batch and real-time data-intensive Python applications.
- Kafka Streaming, the Pythonic Way
Austen is a computational astrophysicist specialising in scientific machine learning. Currently a postgraduate researcher at the University of Southampton, he will soon be joining the University of Cambridge as an Exoplanetary Data Science Research Associate. His work primarily focuses on accelerating complex physics simulations using "fast-forward" emulator techniques, which he has applied across diverse domains ranging from fusion-energy plasma control at the UK Atomic Energy Authority to extreme weather forecasting at IBM Research.
- Fast-Forward(ing) Models: Accelerating High-Dimensional Inference with AI Emulators
Ben Vincent is Director of InferenceWorks Ltd and a Principal Data Scientist at PyMC Labs, where he has been building Bayesian solutions for real-world business problems since 2021. He created CausalPy, an open-source Python library for causal inference in quasi-experimental settings. He holds a PhD in Neuroscience from the University of Sussex (UK) and previously held a university faculty position for 15 years.
- Did Your Rollout Actually Work? Measuring Phased Launches with Staggered DiD in Python
Carol Chen is a Community Architect at Red Hat, having led several upstream communities including InstructLab, Ansible and ManageIQ. She was also actively involved in open source communities while previously working at Jolla and Nokia. She also has experience in software development and integration from her 12 years in the mobile industry. Carol has spoken at events around the world, including AI_Dev in Paris and OpenSearchCon in Shanghai. On a personal note, Carol plays the timpani in an orchestra in Tampere, Finland, which she now calls home.
- Learn to Unlock Document Intelligence with Open-Source AI
Cedric Clyburn (@cedricclyburn), Senior Developer Advocate at Red Hat, is an enthusiastic software developer with a background in Kubernetes, DevOps, and container tools. Focused on open-source software, he both contributes (e.g., Podman, vLLM) and enjoys speaking, with prior experience at Devoxx, WeAreDevelopers, The Linux Foundation, and more. Cedric also spends (too much) time creating video and written content helping developers learn new topics in emerging technologies, with 2M+ views online. He’s based in New York City and is an organizer of the local Kubernetes Community Day.
- What Can LLMs Do with Messy Residential Electrification Data?
After a career as a Data Scientist and Developer Advocate, Cheuk dedicated her work to the open-source community. Currently, she is working as a developer advocate for JetBrains. She co-founded Humble Data, a beginner Python workshop that has been happening around the world. Cheuk also started and hosted a Python podcast, PyPodCats, which highlights the achievements of underrepresented members in the community. She served on the EuroPython Society board for two years and is now a fellow and director of the Python Software Foundation.
- Unconference- Feminist AI
- Do you know how well your model is doing? Evaluate your LLMs
Chris is a Principal Quantitative Analyst at PyMC Labs and an Adjunct Associate Professor at the Vanderbilt University Medical Center, with 20 years of experience as a data scientist in academia, industry, and government, including 7 years in pro baseball research with the Philadelphia Phillies, New York Yankees, and Milwaukee Brewers.
He is interested in computational statistics, machine learning, Bayesian methods, and applied decision analysis. He hails from Vancouver, Canada and received his Ph.D. from the University of Georgia.
- Flexible Statistical Modeling with Bayesian Additive Regression Trees
- PyMC Code Sprint
Daina Bouquin is Senior Developer Relations Engineer at Anaconda with over 12 years of experience spanning astrophysics, library science, and software development. She previously served as Head Librarian at the Harvard-Smithsonian Center for Astrophysics, where she led projects on software citation, preservation, and recovering the contributions of early women in computing. This work gave her deep familiarity with historical computing collections in addition to experience supporting scientists doing computational research. At Anaconda, she creates educational content and strengthens connections between engineering teams and the broader open source community. She believes documentation isn't just about clarity, it's about building communities where people want to participate.
- Build your castle, dig your moat: AI sovereignty, provenance and compliance
Damian Bemben is a prominent speaker, creative technologist & developer within the Hampshire & Solent tech space. Damian is currently a Senior Software Engineer at Ada Mode - developing groundbreaking "human-in-the-loop" AI applications within highly regulated industrial sectors like civil nuclear.
He holds a First Class Masters in Computer Science from the University of Sheffield. His academic work included a dissertation on robotic locomotion using evolutionary principles, as well as work using AI to monitor air pollution & local renewable energy projects.
Damian is a dedicated educator and community organiser within the local space, who has excelled at translating complex research into accessible insights. He is an active organiser of events within the Hampshire creative & tech space.
- The Clean Energy Graveyard: Using Python & Gemini to Map the UK's Cancelled Renewable's
Daniele is a Data Scientist with expertise in statistics, data science and AI, passionate about exploring the intersection of AI and financial markets.
Since 2023, he has been working at MDPI, one of the largest open-access publishers.
A former national 400m sprinter.
- Building a Scientific Taxonomy at Scale with Graph Clustering, Embeddings, and LLMs
Dmitry Petrov is the creator of the open-source tool DVC (Data Version Control), holds a PhD in Computer Science, previously worked as a Data Scientist at Microsoft, and is now the founder of DataChain.ai, a Python-first data platform for Physical AI.
- From SQL to Python: Building Data Context for Agents and People
After 12+ years architecting and engineering cloud solutions for small and large enterprises, I recognized that AI represents not a replacement for expertise, but its natural evolution. Join me in shaping the semantic layer for AI-ready data.
- Making Databases LLM-Ready: Building Production Semantic Layers with Semantido
Data Engineer in AI Platform at The Economist, PyData Cornwall co-founder, and committed diversity and inclusion ally.
- Observing Agentic AI in Production: MCP Server Tracing with OpenTelemetry and Animal Crossing
Feichi Lu is a Data Scientist at MDPI in Basel, where she works on building data-driven analytics for scientific publishing. She holds a Master’s degree in Data Science from ETH Zürich. Her experience spans large-scale data analysis, semantic modeling, and applied AI.
- Building a Scientific Taxonomy at Scale with Graph Clustering, Embeddings, and LLMs
Lead MLOps Engineer at Climate Policy Radar
- Making tech boring to keep data exciting
Gabriel Lipnik is an AI engineer and applied mathematician at Anexia Digital Engineering, working on production-grade machine learning, artificial intelligence, and optimisation systems. His work focuses on bridging the gap between advanced models and real-world deployment, with a particular interest in MLOps, trustworthy AI, and regulatory-ready ML systems.
He has contributed to large-scale optimization and AI projects in the mobility and infrastructure domain, where reliability, traceability, and operational robustness are critical.
Gabriel is particularly interested in practical approaches to making machine learning systems more transparent, monitorable, and production-ready.
- Your ML Pipeline Meets the EU AI Act
Gergely Daroczi, PhD, has been a passionate open-source package developer for two decades. With over 15 years in the fintech, adtech, healthtech, and other SaaS industries, he has expertise in data science and engineering, as well as cloud infrastructure, in both California and Hungary, with a focus on building scalable data platforms. Gergely maintains a dozen open-source R and Python projects and organizes a tech meetup with 1,800 members in Hungary – along with other open-source and data conferences.
- SELECT instance FROM cloud WHERE workload = ? ORDER BY cost_efficiency
Hitendri Bomble is a Senior Data Scientist at Red Hat, where she builds Generative AI solutions to solve complex business problems. She specializes in working with Large Language Models (LLMs) to create tools that make everyday work more efficient. Deeply rooted in the open-source community, Hitendri focuses on using the latest AI innovations to automate tasks and bring fresh ideas to her team.
- The Silent Crash: Why Your RAG Evaluation Metrics Are Lying to You
Ian is a Scientific Software Developer at QuantStack. He has been an Open Source contributor for over 15 years, is a core maintainer of the libraries Matplotlib and ContourPy and a significant contributor to Bokeh and Datashader. Recently Ian has been involved throughout the Jupyter stack, from kernels and widgets through to JupyterLite.
- JupyterLite: run all your code in a web browser using WebAssembly
Ines Montani is a developer specializing in tools for AI and NLP technology. She’s the co-founder and CEO of Explosion and a core developer of spaCy, a popular open-source library for Natural Language Processing in Python, and Prodigy, a modern annotation tool for creating training data for machine learning models.
- Vibe NLP for Applied NLP
Ivo Dilov has 5 years of industry experience and 10 years of competitive programming, with a focus on high-performance software. For the past 2 and a half years, he has been a senior engineer on ArcticDB, the open-source DataFrame database backed by Man Group and Bloomberg, working in C++ and Python.
- Bridging Pandas and Polars: The Hidden Costs of Dataframe Interoperability
Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU accelerated open source Python tools which mimic APIs from the PyData stack including those of NumPy, pandas and scikit-learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source Kubernetes Python framework kr8s in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.
- Documenting your open source projects for machines
Jeremiah Lowin is the founder and CEO of Prefect and the author of FastMCP. Prefect develops automation tools used across the data and AI ecosystem, and FastMCP has become the standard framework for working with the Model Context Protocol. Before founding Prefect, he spent over a decade leading risk and data initiatives at major investment firms and was a founding member of the Apache Airflow PMC. He lives in Washington, DC.
- Keynote- Jeremiah Lowin- Build Reasonable Software
I am a senior engineering lead/executive director at Morgan Stanley.
I design and build large-scale, enterprise-ready, high-performance financial systems used in production environments where correctness, resilience, and speed matter. My work spans system design, hands-on engineering, and long-term platform evolution in regulated domains.
I place strong emphasis on clean, maintainable architecture—clear domain boundaries, explicit data contracts, and model-driven design. I optimise for systems that remain understandable and adaptable as complexity, scale, and regulatory demands increase.
A significant part of my work focuses on data analytics, complex data modelling, and financial mathematics—including forecasting, liquidity, risk, and regulatory calculations. I enjoy translating mathematically rich problem spaces and large datasets into precise, explainable, and production-grade implementations.
I work with a prototype-to-production mindset, leveraging modern cloud platforms, data tooling, and AI techniques to move quickly while preserving architectural discipline, observability, and operational robustness.
www.linkedin.com/in/kamlesh-shah
- Columnar Thinking - Designing for high-performance execution with Arrow and Polars
Dr. Katrina Riehl is a Principal Technical Product Manager at NVIDIA leading the CUDA Education program. For over two decades, Katrina has worked extensively in the fields of scientific computing, machine learning, data science, and visualization. Most notably, she has helped lead data initiatives at the University of Texas Austin Applied Research Laboratory, Anaconda, Apple, Expedia Group, Cloudflare, and Snowflake. She is an active volunteer in the Python open-source scientific software community and currently serves on the Advisory Council for NumFOCUS.
- GPU Algorithm Authoring with CUDA Tile
The speaker spent over 12 years working in quantitative roles in investment management before returning to academia to study Artificial Intelligence. They are currently completing a Master’s degree in AI and ML in Science, and are particularly interested in how modern machine learning systems behave in practice, especially where modelling assumptions quietly break down.
- Do Multilingual Embeddings Really Share a Semantic Space? Practical Lessons Across Scripts and Languages
Ken Obata is a senior data engineer currently working at Lyft, with over seven years of experience building large-scale data infrastructure at KPMG, Amazon, and Lyft. His current research focuses on scalable text deduplication for LLM training data, where he developed a partition-aware MinHash LSH system that processes hundreds of millions of documents on commodity Spark clusters.
- Beyond Spark MLlib: Deduplicating Common Crawl at Scale
Data Engineer at Climate Policy Radar
- Making tech boring to keep data exciting
Laura is a very technical designer™️, working at Pydantic as Lead Design Engineer. Her side projects include Sweet Summer Child Score (summerchild.dev) and Ethics Litmus Tests (ethical-litmus.site). Laura is passionate about feminism, digital rights and designing for privacy. She speaks, writes and runs workshops at the intersection of design and technology.
- The Human-in-the-Loop is Tired
Lena Shakurova is the founder of ParsLabs (https://parslabs.org), a Conversational AI agency, and Chatbotly (https://chatbotly.co), a no-code platform for building AI assistants trained on custom data.
At ParsLabs, she leads a team blending AI, user research and conversation science to design and develop high-quality AI conversations that sound human. She has a background in NLP and artificial intelligence, with 8+ years of experience and 110+ successful projects building production-ready chatbots and voice assistants.
Lena focuses on ethical, user-first AI, leveraging her expertise in Linguistics & AI to create responsible, high-quality AI solutions. She shares insights on AI innovation and human-centered design through her blog (https://shakurova.io/blog) and LinkedIn (https://www.linkedin.com/in/lena-shakurova/).
- Evaluating multi-turn conversations: A practical guide to AI Agent evals
- From Synthetic Examples to Production Signals: Multimodal Training Data Pipelines with Privacy-Safe Feedback
AI Engineer @xtream
- Reading the Mind of an LLM
Author of Narwhals, heavy contributor to pandas, Polars, and NumPy (stubs). Marco works as Senior Software Engineer at Quansight Labs. His background is in Mathematics. Outside of work he can most likely be spotted at Celtic Folk Sessions.
- The Polars vs SQL differences nobody is talking about
I am Chief Architect at Engineering is Easy, working in aerospace and defence consulting. I hold a PhD in environmental and geospatial modelling, and I have spent over 20 years across climate research, data science, AI, and developer advocacy.
I also run Living is Easy, where I work as a certified mindset consultant focused on how habits, self-image, and mental programming drive results. That work has given me a deep understanding of how paradigms shape behaviour for both individuals and teams.
- The Rules Nobody Writes Down: Decoding and Shifting Team Culture From Any Seat
Data Engineer at Climate Policy Radar
- Making tech boring to keep data exciting
Martin O'Reilly is Director of Research Engineering at the Alan Turing Institute, where he leads a team of software, data and infrastructure engineers who work across the Turing's research portfolio to bridge the gap between research and practice - from AI for weather prediction to AI-assisted air-traffic control. Prior to Turing, Martin spent several years developing software, data standards and engineering practices in the education sector before going back to school to build robots and try and understand the brain by modelling it.
- Keynote- Martin O'Reilly- LLMs and AI agents demystified
Matt Crooks is a Principal Data Scientist at the BBC, where he works in the audiences data science team applying statistical and machine learning models to understand and improve marketing effectiveness and audience engagement. His current work focuses on using data and AI to automate the production of personalised creative assets at scale. Previous work has involved building an ML-powered adaptive learning quiz for BBC Bitesize during Covid. He also previously led and developed the experimentation tooling and best practices at Typeform. Matt holds a PhD in Mathematics from the University of Manchester and began his career in academic research into weather and climate.
- AI-Assisted Creative for Automated Marketing using Python
Michel Semaan is the Analytics Lead for Transaction Banking at Allica Bank, previously a Senior Analytics Engineer at Amazon. Beyond his day job, Michel teaches as a DataCamp instructor with two published SQL courses and as a Python and data science mentor with Great Learning and Springboard.
- Querying the queries: SQL Metaprogramming in Python
Ming Zhao is an open source developer and Developer Advocate at IBM Research, where he helps IBM leverage open technologies while building impactful tools and growing vibrant open-source communities. He’s passionate about making open tech accessible to all and ensuring developers have the tools they need to succeed in the rapidly developing AI space. Ming now leads community efforts around Docling, IBM’s fastest-growing open source project, recently welcomed into the LF AI & Data Foundation.
- Learn to Unlock Document Intelligence with Open-Source AI
Research Scientist/Engineer at NVIDIA focused on Multimodal Synthetic Data Generation
- From Synthetic Examples to Production Signals: Multimodal Training Data Pipelines with Privacy-Safe Feedback
I'm a Data Scientist working in HR Tech and People Analytics with Personio. I'm a big advocate of open source software and regularly contribute to PyMC, PyMC-Marketing and CausalPy. I've worked across a variety of industries, including e-commerce, insurance and gambling, and in each I've tried to find ways to apply statistical best practice to business problems.
I'm always open to chat about scientific Python, philosophy of science, and Bayesian reasoning and decision analysis.
- Hazards on the Causal Path: Bayesian Time-Varying Survival Analysis with PyMC
Neal Richardson is VP of Engineering at Posit and a member of the Apache Software Foundation. He is a maintainer of Apache Arrow, along with many other open-source projects. He holds a Ph.D. in Political Science from the University of California, Berkeley.
- MCP, or not MCP
Nick Radcliffe has used Python since around 2005 (starting with Python 2.1, in the form of Jython) and has been doing what we now call Data Science since around 1986. He is a Visiting Professor in the Maths Department (Operations Research) at the University of Edinburgh and runs Stochastic Solutions Limited, a consulting and software company working in Data Science. Since around 2015 Nick has been developing the ideas of test-driven data analysis (TDDA), which is an approach to quality of data and analytical processes inspired by test-driven development (TDD). The open-source Python TDDA library (for which he is the lead developer) provides support for test-driven data analysis in those areas where software can help.
Nick has previously co-authored two books, one on Sustainability for WWF, and one on a (defunct) Python online tag-based social database called Fluidinfo. By the time of this conference, his latest book, Test-Driven Data Analysis (CRC Press) should be available.
- Test-Driven Data Analysis
Nicolas holds a Ph.D. in applied mathematics from Université Paris Dauphine - PSL, where his research focused on machine learning, with particular emphasis on attention mechanisms and geodesic approaches to segmentation. His work on designing advanced deep learning architectures for complex datasets has led to multiple publications at leading international conferences.
He brings hands-on expertise in self-supervised learning and large-scale optimisation, and is currently contributing to Neuralk's mission to develop the first enterprise tabular foundation model.
- Hands-On with Tabular Foundation Models: From Zero to Strong Baselines
Niek Tax is a Staff Research Scientist and Tech Lead at Meta's Central Applied Science team in London. He focuses on longer-term, foundational work that addresses new opportunities and challenges across Meta, bridging the gap between academic rigour and product teams. Niek has extensive experience overseeing the end-to-end lifecycle of production-grade ML systems, from research to global deployment. His expertise is in uncertainty quantification, including active learning and probability calibration, and he has published articles at NeurIPS and KDD on those topics.
Before joining Meta, Niek worked as an ML engineer at Booking.com and in applied R&D at Philips Research. He holds a PhD in Computer Science from Eindhoven University of Technology, and has authored 35+ peer-reviewed publications with over 2,500 citations.
- Beyond ML Model Calibration: Hands-On Multicalibration with MCGrad
PyData London
- Diversity Scholar Luncheon
- Lightning Talks
I am a data scientist at an international mining group.
- From Noisy Sensors to Events: Event Detection in Sensor data with Kalman Filters and Hidden Markov Models
I am a Senior Machine Learning Scientist at Monzo, where my main focus is around Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and sophisticated data augmentation strategies. With 6 years of experience specializing in Natural Language Processing (NLP), I have a proven track record of building scalable AI systems for high-stakes environments.
Prior to joining Monzo, I was a Machine Learning Engineer at Bumble, leading Trust and Safety initiatives by developing LLM-powered moderation pipelines to ensure platform safety at scale. I also worked as a Senior Data Scientist at ComplyAdvantage, where I applied NLP to financial crime detection, and as a consultant at Sia, focusing on complex question-answering tasks.
I am passionate about the intersection of LLM infrastructure and practical data engineering, specifically solving the "cold-start" problem for niche domains through synthetic data and rigorous validation frameworks.
- When Your Dataset Has Blind Spots: Practical LLM-Based Data Augmentation
Oreolorun is a machine learning engineer with experience in building AI-enabled software features and data processing for AI workflows.
- Building a Browser Agent from Scratch: Teach an LLM to Navigate the Web
Oriol is a computational statistician, working as a maintainer of the ArviZ and PyMC libraries and as Principal Data Scientist with PyMC Labs. He started in academia but left after some years in order to be able to work more freely and collaboratively on open source, software and knowledge sharing. His main areas of interest are data visualization, model and inference diagnostics, model comparison, and prior elicitation. Within open source projects, he has also dedicated a large part of his work to documentation, governance and DEI.
- PyMC Code Sprint
- Model criticism through posterior predictive checks
Paddy Mullen is a full‑stack engineer and data‑tooling builder. An early employee at Anaconda, he contributed to the Bokeh visualization library. He has built data tools and led teams at hedge funds and startups. Since 2023 he has been developing Buckaroo, an interactive dataframe viewer for notebook environments. He is now leading visualization at xorq-labs.
- The Future of Notebooks in a Claude Code World
Prattyush is a Research Software Engineer working in the Granite Feedback Team in IBM Research, based in the UK (Winchester) and the US (New York).
IBM Granite is the family of AI models from IBM, and Prattyush leads product and client engagements to increase adoption of the models across various use-cases. He is a technical leader for Agentic and GenAI applications, leads efforts on educational content, and acts as one of the release managers, contributing to testing and release efforts.
Prattyush is part of the wider AI Foundations organisation and as such regularly contributes to the development of the latest IBM Research technologies, both internally and through open source.
- Production-Ready AI Agents: From LLMs to Small Language Models
Rachel-Lee Nabors spent the better part of their career on web standards and open source and has spearheaded developer education at FAANG and startups, on the React Team, and at the W3C. Now they work to usher in the future with browser builders and Silicon Valley startups, teaching a new generation of builders that “it's not magic; it's just math” and building experiences that adapt information to people. You can find them drinking tea in London or shadowboxing in San Francisco.
- Keynote- Rachel Lee Nabors- The Community Is the Boat
Richard Kehinde Ogunyale is a Senior Software Engineer based in London, UK, with experience building production AI systems, scalable microservices, and machine learning pipelines. He currently works at Partnerize, where he leads projects involving AI-powered solutions, and has previously built RAG systems with vector databases and LLM-powered automation workflows using DAG architectures at scale.
He is passionate about open source, practical AI engineering, and bridging the gap between ML prototypes and reliable production systems.
- Building a Browser Agent from Scratch: Teach an LLM to Navigate the Web
I am a Senior Software Engineer & AI Specialist at DRW, a proprietary trading firm. I was previously the lead AI developer at Qualis Flow, a company that is using the latest AI tech to help decarbonise the construction industry.
I am also the CTO of NeuroGrid Ltd., a software consultancy firm providing data science and software engineering services. Previously I was the co-founder of AgileVentures, where I was the CTO and ran multiple open source charity projects in Ruby, Node, React, React Native, etc.
Before that I was Head of Education and Engineering at the Makers Academy bootcamp, and before that Associate Professor in Computer Science at Hawaii Pacific University, where I taught courses on mobile, games, AI and software engineering, remotely from the UK.
I've been mucking about with computers for over 40 years, starting with early attempts to program games in BASIC on the BBC Micro and ZX Spectrum in the 80s. I studied AstroPhysics, then Cognitive Science and then Computer Science at university, picking up a PhD (building Neural Nets in C and C++) and two masters along the way. I researched mobile agents at Toshiba in Japan (Java) then went freelance for two years (writing tech articles and building the NeuroGrid search engine, and working with the Cerego learning engine - lots of SQL). After that it was researching Peer to Peer at University of Tokyo, and then to University of Hawaii where I was working with collaborative systems, augmented reality and started programming in both PHP and Ruby/Rails.
I taught Computer Science for Hawaii Pacific University remotely from the UK (Software Engineering, Computer Games programming) for five years, during which time I got involved in MOOCs (I co-ran the "Agile Development using Ruby on Rails" course on edX with UCBerkeley) and started AgileVentures, taking it to full UK charity status. I also ran the MakersAcademy coding bootcamp in London for a couple of years, and am now focused on providing AI, data science and software engineering consulting services.
Education:
PhD in Neural Networks
MS in Computer Science
MSc in Cognitive Science & Natural Language
BSc in Physics with AstroPhysics
- Python Leadership and Engineering Excellence BoF
Samuel Colvin is a Python and Rust developer and Founder of Pydantic Inc., backed by Sequoia to build Pydantic Logfire, the only observability tool that traces your AI and your backend together. The Pydantic library, which he created, is downloaded over 580M times per month and is a dependency of virtually every GenAI Python library, including the OpenAI SDK, the Anthropic SDK, the Google Gen AI SDK, LangChain and LlamaIndex.
- Keynote: Samuel Colvin: Pydantic Monty & Logfire: Wild LLMs, from tool calling to computer use
Samuel is a Gen AI Engineer at Capgemini UK, building production multi-agent systems for enterprise clients. He is also the founder of Atlasync AI Ltd, an early-stage AI startup focused on compliance automation. He founded and organises PyData Hull, the UK's newest NumFOCUS chapter. Samuel holds an MSc in AI and Data Science with Distinction from the University of Hull and is AWS certified. His work focuses on multi-agent architectures, RAG pipelines, and agentic observability.
- Building Production Multi-Agent RAG Systems on Serverless AWS
- Mapping the local heat transition: from large-scale geospatial data to real-world impact
Sofia is a principal data scientist at Nesta, working with the sustainable future mission team on decarbonising UK homes. During her time at Nesta, Sofia has worked with energy performance certificates, social media and smart meter data to estimate the cost of low-carbon heating technologies, identify issues faced by homeowners on their low-carbon heating path, understand how people consume energy in their homes, and identify the most suitable low-carbon heating technology for groups of homes.
Prior to joining Nesta, Sofia worked as a data scientist at Imperial College London, assessing the accuracy of crowdsourced data for road traffic collision and injury surveillance. Before this she worked as a research fellow at the Social Physics and Complexity research group, LIP Portugal, on health-related projects such as identifying antibiotic over-prescription and the factors influencing it.
Sofia holds a Bachelor’s degree in Applied Mathematics and a Master’s degree in Data Science and Advanced Analytics.
- Mapping the local heat transition: from large-scale geospatial data to real-world impact
Sujee Maniyam is a Developer Advocate at Nebius, with a background spanning AI, distributed systems, data engineering, and cloud infrastructure. Outside of work, he is usually on a local pickleball court.
"I’m a developer, technical instructor, and entrepreneur, now focusing on Developer Advocacy / Developer Relations for AI. I have worked with AI/ML, Data Engineering, Distributed Systems, Big Data, and Cloud technologies. As an AI Developer Relations Engineer, I combine hands-on engineering with community building to help developers make the most of AI, especially open-source AI. I also influence and shape the product by providing feedback to dev/product teams."
- Using coding agents with open models
Theo is passionate about NoSQL and distributed computing. He joined Microsoft in 2017 and has been in the Cosmos DB Engineering team as a Program Manager since 2019. He currently focuses on AI, programmability, and developer experience for Azure Cosmos DB. He has a master's degree in Data Science from Dundee University, and lives in the UK with his wife, two boys, and ragcoon cat.
- Designing Semantic Memory for Multi-Agent Systems with Python
Thomas Ogden is a Senior ML Engineer in Financial Engineering at Spotify. He builds tools, mostly with probabilistic machine learning on sequences and graphs. He once did a PhD in Quantum Optics theory and still thinks about physics a lot.
- Don’t Call It “The Forecast”: Designing Prediction Systems at Scale
Tun leads AI Engineering at Lenses, where he is focused on helping companies imagine and implement their strategic vision with agentic AI systems fuelled by real-time context. He was previously a Head of Data and Data/ML Engineer at high-growth startups and has spent 20 years building data-intensive applications and leading T-shaped teams.
Tun is a co-organiser of the annual PyData London conference and co-founder of PyData Cornwall. He is a strong advocate in the Python AI engineering community and a contributor to open source AI engineering and Apache Kafka tools.
In his spare time, Tun goes surfing, plays guitar and shoots 35mm film.
- Observing Agentic AI in Production: MCP Server Tracing with OpenTelemetry and Animal Crossing
Viktor Kessler is Co-Founder of Vakamo and the creator of Lakekeeper, an Apache-licensed Iceberg REST Catalog. He’s a big believer in open standards like Apache Iceberg, which he sees as the backbone of today’s modern, composable Data & Analytics systems.
- Governance-as-Code for the Lakehouse: Zero Trust with Iceberg REST Catalog and Policy Engines
Hello world! 👋
I'm Özge Çinko. I'm currently an AI Engineer at ING, working around agentic AI. Before that, I spent two years as an AI Research Engineer at Huawei, where I focused on research-driven AI systems, including recommender systems. I hold a Bachelor's degree in Computer Engineering from Sakarya University. For me, engineering is a creative craft: a way of turning thoughts, emotions, and curiosity into experiences. I care about building technology that feels more purposeful, more human, and more alive. I love researching, building, learning, and exploring because they make me feel alive in the deepest way. I also love expressing myself through writing, speaking, and meaningful conversations, often inspired by art along the way.
- LLM-Based Recommendation Systems: From Embeddings to Real Personalization