PyData Boston 2025

Aayush Gauba

Aayush Gauba is a researcher and developer working at the intersection of machine learning, quantum-inspired models, and AI security. He has created open-source projects such as AIWAF, an adaptive web application firewall, and has published research on quantum-inspired neural architectures and robust learning methods. His work focuses on building practical tools that are both scientifically innovative and accessible to the wider Python community. Outside of research, Aayush is passionate about sharing knowledge through talks, tutorials, and collaborations that bridge theory with real-world application.

Embracing Noise: How Data Corruption Can Make Models Smarter

Abhishek Murthy

Abhishek Murthy, Ph.D., is a Senior Principal ML/AI Architect at Schneider Electric (SE) in Boston, Massachusetts.

At SE, Dr. Murthy develops Machine Learning (ML) algorithms on sensor data, which are critical for the company's energy technology offers. He is currently focused on leveraging these technologies to improve SE's service offerings for data centers.

Dr. Murthy is also an Adjunct Faculty Member at Northeastern University, where he teaches machine learning algorithms for the Internet of Things (IoT).

He holds a Ph.D. in Computer Science from Stony Brook University, State University of New York, and an M.S. in Computer Science from the University at Buffalo. His doctoral research, supported by a National Science Foundation (NSF) Expedition in Computing, focused on developing algorithms for automatically establishing the input-to-output stability of dynamical systems.

Prior to joining SE, he led the Data Science Algorithms team at WHOOP and served as a Senior Data Scientist at Signify (formerly Philips Lighting), where he led research on IoT applications for smart buildings. Dr. Murthy is an active contributor to the field, with several publications, more than 210 citations, 21 awarded patents, and over 40 pending applications.

Applying Foundational Models for Time Series Anomaly Detection

Allen Downey

Allen Downey is a professor emeritus at Olin College and Principal Data Scientist at PyMC Labs. He is the author of several books -- including Think Python, Think Bayes, and Probably Overthinking It -- and a blog about programming and data science. He is a consultant and instructor specializing in Bayesian statistics. He received a Ph.D. in computer science from the University of California, Berkeley, and Bachelor's and Master's degrees from MIT.

The SAT math gap: gender difference or selection bias?

Aman Bhandari

Aman Bhandari leads the corporate data science/AI function at Vertex Pharmaceuticals. This division integrates and scales advanced analytics and AI (e.g. NLP, machine learning, generative AI/LLMs) across disease and business areas including clinical, commercial, manufacturing and HR. Collaborating directly with executive management, the privacy office and IT, he has developed enterprise capabilities and a model for using AI to drive impact.

Prior to joining Vertex in 2017, he held roles at Merck, Genentech, the White House, and the Centers for Medicare and Medicaid Services. In these roles he created the first formal data science team at Merck and, while at the White House, paved the way for the first Chief Data Officers across the U.S. Government. Aman earned his PhD in health services research and a master's in epidemiology, with a focus on using large-scale data to better understand healthcare. He has been an advisor to data/tech initiatives for the World Bank, USAID, Harvard, Cornell, Ashoka Foundation, Knight News Foundation, Boston Children's Hospital and others.

Using Traditional AI and LLMs to Automate Complex and Critical Documents in Healthcare

Astha Puri

Astha is a Senior Data Scientist at CVS Health, where she leads the design of recommendation engines for digital platforms, helping customers discover the right products and enabling patients to access the appropriate health services and support. She specializes in home screen personalization, leveraging data-driven insights to enhance user experiences. With a strong background in the tech industry, she is now applying her expertise to transform and innovate within the healthcare sector.

Hands-On with LLM-Powered Recommenders: Hybrid Architectures for Next-Gen Personalization

Benjamin Batorsky

Ben is the Lead Data Scientist at GoGuardian, working on text-based ML pipelines for student safety. Previously, he led Data Science teams in academia (Northeastern University, MIT) and industry (ThriveHive). He obtained his Masters in Public Health (MPH) from Johns Hopkins and his PhD in Policy Analysis from the Pardee RAND Graduate School. Since 2014, he has been working in data science for government, academia and industry. His major focus has been on Natural Language Processing (NLP) technology and applications. Throughout his career, he has pursued opportunities to contribute to the larger data science community. He has presented his work at conferences, published articles, taught courses in data science and NLP, and is co-organizer of the Boston chapter of PyData. He also contributes to volunteer projects applying data science tools for public good.

Three agents, three frameworks, one talk

Benjamin Lear

Benjamin Lear is a professor of chemistry at the Pennsylvania State University in University Park, PA. There he runs a research group focused on understanding the interactions between nanoscale materials and their chemical environment. In addition to running this research group, he teaches a course on the design of data visualizations, a topic on which he has delivered numerous international presentations and workshops. He is a regular user of Python and has co-authored a forthcoming book from MIT Press that teaches Python to experimental chemists.

Understanding and using color for storytelling in data visualizations

Brandon (Anbang) Wu

Brandon (Anbang) Wu is a Senior Machine Learning Engineer at Quizlet, where he drives search relevance for tens of millions of learners worldwide. Previously at Shopify, he built large-scale recommendation systems that powered product discovery for hundreds of thousands of merchants. Earlier at NBCUniversal’s Fandango, he led machine learning initiatives developing content recommendation algorithms for both theatrical and streaming platforms. Brandon holds master’s degrees in Computer Science from Georgia Tech and Analytics from UCLA.

Unlocking Smarter Typeahead Search: A Hybrid Framework for Large-Scale Query Suggestions

Bryce Casavant

Bryce Casavant is a Senior Data Scientist at WHOOP, specializing in experimentation and marketing analytics. He applies causal inference and geo-lift testing to measure the true impact of media and product initiatives, focusing on rigorous statistical results and turning them into actionable business insights that drive smarter business decisions.

Measuring Media Impact: Practical Geo-Lift Incrementality Testing

Caitlin Lewis

PhD Student
Duke University
Pearson Lab

core dev fastplotlib 🚀

fastplotlib: driving scientific discovery through data visualization

Chuxin Liu

Chuxin Liu, PhD, is a Senior Quantitative Modeling Associate at JPMorgan Chase, focusing on model risk and LLM applications in this space. She is also a WiDS NYC Ambassador and organizer of multiple communities. She is passionate about how AI and automation are reshaping the workforce. Blending her research background with hands-on experience in modeling and community leadership, she speaks on building human-centered AI practices and empowering professionals to adapt in an evolving AI era.

Build Your MCP server
How AI Is Transforming Data Careers — A Panel Discussion

Daina Bouquin

Daina Bouquin is Senior Developer Relations Engineer at Anaconda with over 12 years of experience bridging technical and non-technical contexts across astrophysics, library science, data science, and federal government work. Her past experience as a Data Scientist for the US Administration for Children and Families gave her firsthand experience with the gap between technical AI evaluation and real-world impact, particularly in high-stakes government systems affecting vulnerable populations. This work fundamentally changed how she thinks about responsible AI development and sparked her ongoing interest in how Python developers can build systems that truly help people. At Anaconda, she works to strengthen connections between Anaconda's engineering teams and the broader developer community, creating resources and fostering relationships that help people solve important problems with AI and open source tools.

Is Your LLM Evaluation Missing the Point?

David Jones-Gilardi

A Gen-AI / Agentic nerd with decades of coding experience who loves to learn and help others do the same!

One agent, one job, better AI

Dawn Wages

Dawn Gibson Wages is the Director of Community & Developer Relations at Anaconda. From her early work as Research Developer at Wharton Computing and Instructional Technology, then working with Python developer experience at Microsoft to her current role, she has been consulting on Python developer experiences across the ecosystem -- speaking to thousands of developers in the process. She co-hosts the Sad Python Girls Club podcast and served as Chair of the Python Software Foundation Board.

Dawn is a member of the Wagtail CMS core team, has organized DjangoCon US sponsorship efforts and Django Girls workshops, and mentors through Djangonaut Space. She's founder of At The Root, which developed the first Anti-Racist Ethical Source License.

A frequent conference speaker on Python development topics, Dawn is currently writing "Domain-Driven Django," exploring architecture patterns for Django applications. Her work focuses on gathering insights from the Python ecosystem to improve developer tooling and experiences.

When she is not engaging in Python, she's chilling at home in Philadelphia with her wife and dogs.

The Lifecycle of a Jupyter Environment: From Exploration to Production-Grade Pipelines

Deb Nicholson

Deb Nicholson is a free software policy expert and a passionate community advocate. She is the Executive Director at the Python Software Foundation, which serves as the non-profit steward of the Python programming language. She has previously served the open source ecosystem through her work at the Open Source Initiative, Software Freedom Conservancy, and the Open Invention Network. She lives with her husband and her lucky black cat in Cambridge, Massachusetts.

Who is Python for? EVERYONE (and why that matters)

Deepyaman Datta

Deepyaman is a Senior Vice President on the Core Data Platform team at Goldman Sachs. Previously, he was a Senior Staff Software Engineer at Voltron Data on the Ibis team. Before their acquisition by Voltron Data, he was a Founding Machine Learning Engineer at Claypot AI, working on their real-time feature engineering platform. Prior to that, he led data engineering teams and asset development across a range of industries at QuantumBlack, AI by McKinsey.

Deepyaman is passionate about building and contributing to the broader open-source data ecosystem. Outside of his day job, he helps maintain Kedro, an open-source Python framework for building production-ready data science pipelines, and Pandera, a lightweight Python data validation library.

Data engineering with Python the right way: introducing the composable, Python-native data stack

Dr. Rebecca Bilbro

Dr. Rebecca Bilbro, co-founder and CTO of Rotational Labs, is a trailblazer in applied AI and machine learning engineering. She co-created Yellowbrick, a Python library that enhances model diagnostics by integrating scikit-learn and matplotlib APIs, facilitating more intuitive model steering.

At Rotational Labs, Dr. Bilbro leads initiatives that empower companies to harness their domain expertise and data, resulting in the successful deployment of large language models and data-driven products. Her efforts bridge the gap between data science and engineering, driving AI solutions that are grounded in real-world business needs, informed by research, rigorously prototyped, and built with deployment and data governance in mind.

She is the co-author of Applied Text Analysis with Python (2018, O’Reilly) and Apache Hudi: The Definitive Guide (2025, O’Reilly). Dr. Bilbro earned her Ph.D. from the University of Illinois, Urbana-Champaign, focusing her research on domain-specific languages within engineering.

Where Have All the Metrics Gone?

Eric Ma

As Senior Principal Data Scientist at Moderna, Eric leads the Data Science and Artificial Intelligence (Research) team to accelerate science to the speed of thought. Prior to Moderna, he was at the Novartis Institutes for Biomedical Research conducting biomedical data science research with a focus on using Bayesian statistical methods in the service of discovering medicines for patients. Prior to Novartis, he was an Insight Health Data Fellow in the summer of 2017 and defended his doctoral thesis in the Department of Biological Engineering at MIT in the spring of 2017.

Eric is also an open-source software developer and has led the development of pyjanitor, a clean API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He is also on the core developer team of NetworkX and PyMC. In addition, he gives back to the community through code contributions, blogging, teaching, and writing.

His personal life motto is found in the Gospel of Luke 12:48.

Patterns for Productive Agent-Assisted Programming
Building LLM Agents Made Simple

Gayathri Ramanathan

Gayathri Ramanathan is a Senior Data Scientist at CVS Health with a background in consulting. She has applied machine learning, deep learning, experimentation and AI to solve challenges across industries. She has organised an Analytics for Good Hackathon and led data science career discussions with students and industry professionals navigating different paths in the field. Gayathri is passionate about using data to create operational impact and empowering current and aspiring data professionals to grow in a rapidly evolving field.

How AI Is Transforming Data Careers — A Panel Discussion

Gilberto Hernandez

Gilberto has spent over a decade shaping technical developer education worldwide. To date, he's made complex concepts accessible to over 100,000 students and engineers through both online learning platforms and in-person experiences.

At Codecademy, he authored and launched several of their foundational courses. Since then, he's worn multiple hats as both product manager and technical content creator at industry leading companies, including MongoDB, Domino Data Lab, Plaid, and Snowflake.

Gilberto is passionate about crafting exceptional developer experiences and educational resources. He frequently writes about data engineering, AI, and application development.

Connect with him on LinkedIn: https://www.linkedin.com/in/gilberto-hernandez/

From Notebook to Pipeline: Hands-On Data Engineering with Python

Ian Stokes-Rees

Ian is a Computational Scientist and Software Engineer. His current role is "Partner in AI Engineering" with BCG, a global management consulting firm. He works with BCG's clients around the world to identify opportunities to combine data, technology, and analytics to create step change capabilities in their organization. What does that translate to? On a day-to-day basis it means leading crack teams of BCG engineers and data scientists in the development of AI-driven and (typically) Python-based bespoke solutions which leverage the best tools, technology, and techniques available.

Prior to BCG, Ian was a Product Manager at Anaconda, and has been in the Python community for over 20 years. Ian has a PhD from Oxford where he worked on the CERN LHCb experiment and developed the Python-based distributed computing middleware that managed 10 million queued physics jobs to schedule across a quarter million servers in a globally federated compute environment. He also spent several years at Harvard Medical School collaborating with biophysicists on novel techniques for protein structure discovery.

Ian is a member of the Python Software Foundation and the Open Source Initiative. In his free time he enjoys sailing, cycling, xc skiing, and motorbiking. On rainy days he'll pull out a board game to play with his wife & kids: Ark Nova, Ticket To Ride, and Takenoko are current favorites.

"Save your API Keys for someone else" -- Using the HuggingFace and Ollama ecosystems to run good-enough LLMs on your laptop

Isaac Godfried

Going multi-modal: How to leverage the latest multi-modal LLMs and deep learning models on real-world applications

Ishita Sequeira

I’m a Senior Software Developer at Red Hat, currently working on the Dataverse platform using a data mesh architecture. I am a maintainer of the ArgoCD project and also mentor engineers and contribute to the developer community through ArgoCon reviews and certification initiatives.

Scaling Specialist Knowledge with AI: From Virtual Specialist to Revenue Acceleration Agent

Itamar Turner-Trauring

Itamar Turner-Trauring is a consultant, and writes about Python performance at https://pythonspeed.com/. He helps companies maintain open source software and speed up their data processing code.

In his spare time he is a volunteer with Cambridge Bicycle Safety, and writes about Cambridge local politics at Let's Change Cambridge.

Processing large JSON files without running out of memory

Jacob Tomlinson

Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU-accelerated open source Python tools which mimic APIs from the PyData stack including those of NumPy, pandas and scikit-learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source Kubernetes Python framework kr8s in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.

Accelerating Geospatial Analysis with GPUs

Jake Lorocco

Generative Programming with Mellea: from Agentic Soup to Robust Software

Jaya Venkatesh

Jaya Venkatesh is a software engineer at NVIDIA, working on the RAPIDS ecosystem with a focus on simplifying deployment in the cloud and distributed systems. Previously, Jaya worked as a machine learning engineer at Pixxel Space, where he developed large scale, real-time inferencing models for Earth Observation. He holds a Master’s degree in Computer Science from Arizona State University, where his research project centered on snow melt monitoring in the Arizona region through satellite imagery analysis.

Accelerating Geospatial Analysis with GPUs

Jules Walzer-Goldfeld

Jules was a Mathematics and Computer Science major at Williams College, with an interest in data and data visualization. He is excited about interactivity with data, whether that be tables, emails, dashboards, or fully-fledged websites. He is now working on open source tools for data: namely Great Tables and email in Python for Posit.

Wrappers and Extenders: Companion Packages for Python Projects

Katrina Riehl

Dr. Katrina Riehl is a Principal Technical Product Manager at NVIDIA leading the CUDA Education program. For over two decades, Katrina has worked extensively in the fields of scientific computing, machine learning, data science, and visualization. Most notably, she has helped lead data initiatives at the University of Texas Austin Applied Research Laboratory, Anaconda, Apple, Expedia Group, Cloudflare, and Snowflake. She is an active volunteer in the Python open-source scientific software community and currently serves on the Advisory Council for NumFOCUS.

CUDA Python Kernel Authoring

Konstantin Taletskiy

The JupyterLab Extension Ecosystem: Trends & Signals from PyPI and GitHub

Kushal Kolar

PhD Candidate at NYU. 10+ years of experience using Python for data analysis and machine learning with neuroscience datasets. Core developer of fastplotlib and maintainer of several Python libraries in neuroscience with significant user bases, and a contributor to other libraries such as tslearn.

fastplotlib: driving scientific discovery through data visualization

Leonardo Ferreira

Create your Health Research Agent

Lily Xu

Lily Xu is a Clinical Data Science Director in the corporate data science team at Vertex Pharmaceuticals. She has a wealth of experience in building cutting-edge data science solutions for clinical operations and commercial strategy across multiple disease areas. Lily has spearheaded projects on GenAI for clinical documentation, predictive patient modeling, large-scale claims analytics, data-driven protocol design, and centralized site intelligence applications.

Using Traditional AI and LLMs to Automate Complex and Critical Documents in Healthcare

Luca

Luca Fiaschi is an accomplished tech executive and AI/ML expert with over 15 years of leadership experience in AI, data science, and analytics teams at hypergrowth technology companies. He currently serves as a partner at PyMC Labs, where he drives Gen AI solutions and Bayesian consultancy for enterprise clients. Previously, he led as Chief Data & AI Officer at Mistplay, and held executive roles at HelloFresh, Stitch Fix, Rocket Internet, and Redmart/Alibaba, delivering scalable AI-driven products and revenue growth through advanced personalization and experimentation platforms.

MMM Open-Source Showdown: A Practitioner's Benchmark of PyMC-Marketing vs. Google Meridian

Mike Woodward

Mike Woodward is a VP Data Science with a long history in data science and software engineering. He's worked in cybersecurity, business intelligence, technical software, radio communications, and even fashion. He has degrees from British and American universities and has spoken at numerous international conferences.

Using Cursor (and other AI code gen tools) for data science

Mingxuan Zhao

Ming Zhao is an open source developer and Developer Advocate at IBM, where he helps IBM leverage open technologies while building impactful tools and growing vibrant open-source communities. He’s passionate about making open tech accessible to all and ensuring developers have the tools they need to succeed in the rapidly developing AI space. Ming now leads community efforts around Docling, IBM’s fastest-growing open source project, recently welcomed into the LF AI & Data Foundation.

Learn to Unlock Document Intelligence with Open-Source AI

Morgan Vincent

I’m a Ph.D. candidate in Chemistry at Penn State University with a passion for teaching and educational innovation. My work centers on integrating AI tools into chemistry instruction, designing research-based learning activities, and exploring how students develop conceptual understanding in science. I’m also involved in global agricultural education through the Global Teach Ag Network, where I help connect food science, sustainability, and classroom practice.

Understanding and using color for storytelling in data visualizations

Nathan Fulton

Nathan Fulton is a manager at IBM Research. He is an expert in large language models, formal verification, and reinforcement learning. Nathan earned a bachelor's degree from Carthage College in Computer Science and Mathematics, and a Ph.D. from Carnegie Mellon University's Computer Science Department. During his PhD studies, Nathan was a member of André Platzer's Logical Systems Lab and a core developer of the KeYmaera X theorem prover for hybrid systems. Nathan has previously worked as a Senior Applied Scientist at Amazon Web Services and as a Research Scientist at the MIT-IBM AI Lab.

Generative Programming with Mellea: from Agentic Soup to Robust Software

Naty Clementi

Naty Clementi is a senior software engineer at NVIDIA. She is a former academic with a Master's in Physics and a PhD in Mechanical and Aerospace Engineering to her name. Her work involves contributing to RAPIDS, and in the past she has also contributed to and maintained other open source projects such as Ibis and Dask. She is an active member of PyLadies and an active volunteer and organizer of Women and Gender Expansive Coders DC meetups.

Accelerating Geospatial Analysis with GPUs

Nikunj Doshi

I am Nikunj Doshi, a Cloud, Data & AI Consultant, entrepreneur, and startup founder passionate about empowering tomorrow’s leaders. I hold a Master’s in Information Systems from Northeastern University and a Bachelor’s in Information Technology from Thadomal Shahani Engineering College, which have equipped me with a blend of technical expertise and management skills to drive innovation in cloud computing, DevOps, data analytics, and automation.

As the Founder of Achievers Astra, I have guided over 1,500 international students through career development workshops, personalized mentorship, and strategic planning for professional success. As the Director & Regional Head for North America at Abroad Aashaye, I have helped 5,000+ students navigate the U.S. academic journey and built global partnerships to enhance opportunities for international education.

In my corporate career, I worked with Red Hat as a Cloud Solutions Architect and Cloud Site Reliability Engineer, gaining hands-on experience with AWS, OpenShift, distributed cloud architectures, and large-scale automation. My technical toolkit includes Java, Python, C/C++, R, AWS Cloud services, Microsoft Azure, MongoDB, MySQL, Selenium, Ansible, and Git.

I am passionate about fostering engineering excellence, mentoring future leaders, and contributing to discussions on technology, innovation, and global youth leadership. I have been recognized as a LinkedIn Top Voice and invited as a guest speaker at leading universities, sharing insights on career growth, cloud technologies, and developer culture.

Building Production RAG Systems for Health Care Domains: Clinical Decision

Paddy Mullen

Paddy Mullen is a full‑stack engineer and data‑tooling builder. An early employee at Anaconda, he contributed to the Bokeh visualization library. He has built data tools and led teams at hedge funds and startups. Since 2023 he has been developing Buckaroo, an interactive dataframe viewer for notebook environments.

The Column's the limit: interactive exploration of larger than memory data sets in a notebook with Polars and Buckaroo

Pranav Kompally

AI and Machine Learning Engineering @ Vizit.

Modeling Aesthetic Identity: Building a Digital Twin from Instagram Likes & Visual Preferences

Rodrigo Silva Ferreira

Rodrigo Silva Ferreira is a QA Engineer at Posit, where he contributes to the quality and usability of open-source tools that empower data scientists working in R and Python. He focuses on both manual and automated testing strategies to ensure reliability, performance, and an excellent user experience.

Rodrigo holds a BSc in Chemistry with minors in Applied Math and Arabic from NYU Abu Dhabi and an MSc in Analytical Chemistry from the University of Pittsburgh. Multilingual and globally minded, he enjoys working at the intersection of data, science, and technology, especially when it means building tools that help people better understand and navigate the world through its increasingly complex data.

When Rivers Speak: Analyzing Massive Water Quality Datasets using USGS API and Remote SSH in Positron

Sanjit Paliwal

As a Principal Data Scientist at Verizon, I deliver innovative and impactful data solutions for various business units and functions. I have over seven years of experience in data science, with a focus on Machine Learning, Artificial Intelligence, NLP, Gen AI, Time Series analysis, Visualization, Geospatial analysis, and Statistical Analysis (A/B Testing).

My mission is to leverage data and analytics to solve complex and challenging problems, optimize processes and performance, and generate actionable insights and recommendations. I use Python, SQL, GCP, Tableau, and Git as my main tools to develop, deploy, and monitor data models and pipelines. I also collaborate with cross-functional teams and stakeholders to understand their needs, communicate results, and provide data-driven guidance. I am passionate about learning new skills and technologies, and sharing my knowledge and expertise with others.

No Cloud? No Problem. Local RAG with Embedding Gemma

Sarthak Pattnaik

I have a passion for leveraging data to drive transformative outcomes. My journey spans across diverse roles, including that of a data analyst, data engineer, and currently an AI engineer.

As a Graduate Research Assistant at Boston University, I worked on a wide array of projects that combined data science and machine learning with concepts from finance, advertising, and energy to analyze interesting use cases. I presented my research at prominent conferences like Computer Science and Education in Computer Science (CSECS), ITISE, and NEDSI.

My professional experience encapsulates working with a wide array of AI and Cloud tools like OpenAI and Gemini models, RAGs, Agents using crew.ai, MCPs with Claude, advanced prompting, Snowflake, SnapLogic, and Power BI.

I am on a relentless pursuit of knowledge and excellence, committed to harnessing the power of data for informed decision-making and driving meaningful impact. Let's connect and explore how my versatile skill set can contribute to your data-centric endeavors.

Tracking Policy Evolution Through Clustering: A New Approach to Temporal Pattern Analysis in Multi-Dimensional Data

Sebastian Wallkötter

I'm an engineer and open-source maintainer with a PhD in Computer Science and over a decade of hands-on experience building with AI/ML. Having scaled ImageIO, a foundational Python library, from 2 to 35 million monthly downloads, I know what it takes to build robust, scalable software. I co-founded PyData Stockholm and am deeply integrated into our data community. My current focus is to bring first principles thinking to the GenAI landscape and help developers build more robust systems.

The Boringly Simple Loop Powering GenAI Apps

Serhii Sokolenko

Serhii Sokolenko is a co-founder of Tower.dev. Tower orchestrates Python-native workflows and offers management tools for data lakehouses. Prior to founding Tower, Serhii worked at Databricks, Snowflake and Google on data processing and databases.

Surviving the Agentic Hype with Small Language Models

Sheetal Borar

Sheetal Borar is a senior applied scientist at Etsy, where she works on retrieval systems powering large-scale recommender systems. She has spoken at PyData Global and PyData NYC, has several publications to her name, and is recognized as a strong advocate for knowledge sharing and community building. She has about five years of professional experience building machine learning solutions across multiple industries.

Hands-On with LLM-Powered Recommenders: Hybrid Architectures for Next-Gen Personalization
How AI Is Transforming Data Careers — A Panel Discussion

Shikhar Patel

Shikhar Patel is an AI Data Engineer at Mass General Brigham. Shikhar brings extensive hands-on experience in building LLM and RAG systems for healthcare, particularly in clinical decision support.

Building Production RAG Systems for Health Care Domains: Clinical Decision

Siddharth Shankar

Siddharth Shankar is a Machine Learning Engineer working at Mphais.AI. His current work focuses on multimodal fine-tuning for mortgage and investment banking. Before entering financial AI, he worked on optimization modeling for aviation operations and developed MLOps pipelines that enabled scalable, reproducible machine learning deployment across complex systems.

He earned his Master’s in Computer Science and Information Systems from the University of Maryland, where his research interests lay at the intersection of Machine Learning and Human-Computer Interaction.

Siddharth is passionate about designing AI systems that are not just accurate or efficient, but also trustworthy, compliant, and production-ready.

LLMOps in Practice: Building Secure, Governed Pipelines for Large Language Models

Susan Shu Chang

Susan Shu Chang is a Principal Data Scientist at Elastic (Elasticsearch). She has spoken at six PyCons around the world, and is the author of Machine Learning Interviews (O'Reilly).

Evaluating AI Agents in production with Python

Ted Conway

Ted Conway is a Data Analyst working in the financial sector. Ted studied Computer Science at the University of Illinois and DePaul University.

Fun With Python and Emoji: What Might Adding Pictures to Text Programming Languages Look Like?

Yiwen Liu

Former JPMorgan Vice President
AI Product Manager, Engineer
Now exploring new opportunities with Stealth Startup!!
LinkedIn: www.linkedin.com/in/yiwen-liu-cssbb-24902016

Build Your MCP server
How AI Is Transforming Data Careers — A Panel Discussion

Yunxin Gao

Yunxin holds a Bachelor’s degree in Applied Statistics from the University of Wisconsin–Madison and a Master’s degree in Applied Statistics from New York University, with a focus on data science and big data. Since completing graduate school, Yunxin has worked under Model Risk in the finance industry for the past 2.5 years, where they specialize in evaluating, validating, and interpreting complex quantitative models. Their experience spans statistical modeling, machine learning, and model risk management, with a strong emphasis on translating analytical insights into actionable business decisions.

Rethinking Feature Importance: Evaluating SHAP and TreeSHAP for Tree-Based Machine Learning Models

Zvi Topol

Data professional with 15+ years of experience in software, data engineering, analytics, data science, and AI/ML. Graduate degrees in Computer Science and Statistics. Domain knowledge and background in multiple verticals, including media and entertainment, marketing analytics, and finance.

Uncertainty-Guided AI Red Teaming: Efficient Vulnerability Discovery in LLMs

Fasal Shah

Fasal Shah is a Principal Machine Learning Engineer at Red Hat with over ten years of experience in artificial intelligence and machine learning. He focuses on developing advanced AI systems and applying them to solve real-world problems. Fasal holds a Master’s degree in Machine Learning and Artificial Intelligence and has published peer-reviewed papers in natural language processing and knowledge graphs.

Scaling Specialist Knowledge with AI: From Virtual Specialist to Revenue Acceleration Agent