Your Data Is Leaking: A Hands-On Introduction to Differential Privacy with OpenDP
Data analysis and machine learning often involve sensitive information. But how can we ensure that our analyses and releases do not inadvertently reveal information about the individuals in our data? Traditional approaches have repeatedly proven insufficient: anonymized records can be re-identified by linking them with auxiliary data, and even aggregate statistics can be combined to infer information about individuals.
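As a toy illustration of why exact aggregates alone do not protect individuals, the sketch below shows a simple differencing attack. The dataset, the names, and the helper `total_salary` are all hypothetical and invented for this example.

```python
# Hypothetical toy dataset: names and salaries are invented for illustration.
salaries = {"alice": 52_000, "bob": 61_000, "carol": 58_000, "dave": 75_000}


def total_salary(records, exclude=None):
    """Return an exact aggregate: the salary sum, optionally leaving one person out."""
    return sum(value for name, value in records.items() if name != exclude)


# Two aggregate queries that each look harmless on their own.
everyone = total_salary(salaries)
everyone_but_dave = total_salary(salaries, exclude="dave")

# Subtracting one aggregate from the other isolates a single person's value.
inferred_dave = everyone - everyone_but_dave
print(inferred_dave)  # 75000
```

Neither query names an individual's salary directly, yet together they disclose it exactly.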
Differential privacy is a mathematical framework that offers provable privacy guarantees while still enabling useful data analysis. In this tutorial, we provide a hands-on introduction to differential privacy, covering key concepts relevant to understanding and applying it in practice. The focus will be on practical implementation rather than underlying theory.
Using interactive examples in Python, we will explore the core ideas of differential privacy, highlight its attractive properties and limitations, and demonstrate how to build privacy-preserving analyses using OpenDP, an open-source library for differential privacy with a Python API. Participants will leave equipped to continue exploring differential privacy on their own. Familiarity with the basics of Python programming is helpful, but no prior knowledge of differential privacy is required.
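To give a flavor of one core idea, here is a minimal sketch of the Laplace mechanism applied to a counting query, written with the Python standard library only. It is not OpenDP code and not necessarily what the tutorial presents. The function names, the toy data, and the choice of epsilon are assumptions made for illustration, and the naive floating-point sampling shown here is not suitable for real releases.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a noisy count (illustrative sketch only).

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy data: ages of eight people.
ages = [23, 35, 41, 29, 62, 57, 38, 44]
rng = random.Random(0)  # seeded only to make the demo repeatable

noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(noisy)  # a noisy value near the true count of 4
```

Smaller values of epsilon give a larger noise scale, which means stronger privacy but less accurate answers. A vetted library such as OpenDP is the better choice in practice, since a hand-rolled sampler like this one can leak information through floating-point artifacts.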