Chris Kucharczyk
I'm Chris Kucharczyk, a data scientist and data visualization designer. I live in Oxfordshire, UK.
I currently work at DrivenData, a social enterprise developing machine learning solutions to social impact problems. We host data science competitions and offer data science consulting services.
Session
Publicly available data is rarely analysis-ready, which makes it hard for researchers, organizations, and the public to access the information these datasets contain. One way to address this shortcoming is to "bake" the data into a structured format and ship it alongside code that can be used for analysis. For analytical work in particular, DuckDB provides a performant way to query the structured data in a variety of contexts.
This talk will explore the benefits and tradeoffs of this architectural pattern using the design of scipeds, an open source Python package for analyzing higher-education data in the US, as a case study.
No DuckDB experience is required; beginner Python and programming experience is recommended. This talk is aimed at data practitioners, especially those who work with public datasets.