PyLadiesCon 2025

Making GUI Data Exploration Reproducible with Python
05/12/2025, Main Stream
Language: English

Interactive data exploration tools are excellent for visualizing the data as you are cleaning it, but when a data practitioner analyzes data through drag-and-drop interfaces, the path to reproducibility becomes opaque. This project bridges that gap by capturing UI interactions in the Positron IDE and converting them into clean, readable code across pandas, polars, SQL, and multiple R syntaxes.

This talk will include a demonstration of exploring data with a UI and converting that exploration into reproducible code. We’ll walk through the architecture that makes this possible, from tracking UI changes to generating semantically equivalent code across different data manipulation libraries. We’ll also discuss the challenges and considerations that went into the design.
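To make the idea of "tracking UI changes and generating equivalent code" concrete, here is a minimal sketch in Python. It is not Positron's implementation: the `Filter` and `Sort` records and the `to_pandas`, `to_polars`, and `to_sql` functions are hypothetical names invented for illustration. The assumption is that each UI interaction can be recorded as a small, library-neutral step, and that a separate generator per target turns the recorded steps into code.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical, library-neutral records of what the user did in the UI.
@dataclass(frozen=True)
class Filter:
    column: str
    op: str  # a comparison operator such as ">", "<", or "=="
    value: Union[int, float, str]


@dataclass(frozen=True)
class Sort:
    column: str
    ascending: bool = True


Step = Union[Filter, Sort]


def to_pandas(steps: List[Step], df: str = "df") -> str:
    """Emit one pandas statement per recorded step."""
    lines = []
    for s in steps:
        if isinstance(s, Filter):
            lines.append(f"{df} = {df}[{df}[{s.column!r}] {s.op} {s.value!r}]")
        else:
            lines.append(
                f"{df} = {df}.sort_values({s.column!r}, ascending={s.ascending})"
            )
    return "\n".join(lines)


def to_polars(steps: List[Step], df: str = "df") -> str:
    """Emit one polars statement per recorded step (assumes `import polars as pl`)."""
    lines = []
    for s in steps:
        if isinstance(s, Filter):
            lines.append(f"{df} = {df}.filter(pl.col({s.column!r}) {s.op} {s.value!r})")
        else:
            lines.append(f"{df} = {df}.sort({s.column!r}, descending={not s.ascending})")
    return "\n".join(lines)


def _sql_literal(value: Union[int, float, str]) -> str:
    """Quote strings for SQL; numbers are emitted as-is."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def to_sql(steps: List[Step], table: str = "df") -> str:
    """Fold all steps into a single SELECT with WHERE and ORDER BY clauses."""
    where = [
        f'"{s.column}" {"=" if s.op == "==" else s.op} {_sql_literal(s.value)}'
        for s in steps
        if isinstance(s, Filter)
    ]
    order = [
        f'"{s.column}" {"ASC" if s.ascending else "DESC"}'
        for s in steps
        if isinstance(s, Sort)
    ]
    query = f"SELECT * FROM {table}"
    if where:
        query += " WHERE " + " AND ".join(where)
    if order:
        query += " ORDER BY " + ", ".join(order)
    return query


# Example: the user filtered to age > 30, then sorted by age descending.
steps = [Filter("age", ">", 30), Sort("age", ascending=False)]
print(to_pandas(steps))
print(to_polars(steps))
print(to_sql(steps))
```

Keeping the recorded steps separate from the generators is the design choice this sketch is meant to highlight: supporting another target would mean adding one more generator function, without changing how interactions are captured. A real tool would also need to handle cases this sketch ignores, such as several sorts applied in sequence (where the most recent sort takes precedence in pandas and polars), missing values, and column type differences between libraries.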

This work addresses a critical need in the data science community: tools that enhance usability and reproducibility. Whether you’re building data science tools, analyzing data, or simply frustrated by the gap between exploration and reproduction, this talk will show how thoughtful design can make reproducible science as easy as point-and-click.

I am a software engineer at Posit, PBC, where I build Python-based open source data science tools. I’m an Emeritus Editor in Chief at pyOpenSci, an organization that supports scientific Python tools by offering peer review of packages. I graduated from Florida Polytechnic University with a master's degree in Computer Science, with a focus on Data Science. When not thinking about computers, I enjoy reading and teaching my dog tricks.