PyCon DE & PyData 2025

Unforgettable, that's what you are: Evaluating Machine Unlearning and Forgetting
2025-04-25 , Titanium3

Can deep learning/AI models forget? In this talk, you'll explore the realm of machine unlearning, where researchers and practitioners aim to remove memorized examples from machine learning models. This is increasingly relevant as models grow more overparameterized and as GDPR/privacy concerns mount around large-scale model development and use.


Deep learning memorization is a well-known phenomenon in which deep learning/AI models memorize parts of their training dataset. This happens most often for repeated or novel (outlier) examples, and occurs more frequently in overparameterized models.

This presents problems for guiding machine learning behavior, demanding significant effort in guardrails and output monitoring, and raises the question of whether such models can be GDPR-compliant (e.g. honoring the right to be forgotten).

A growing area of research on machine unlearning, or machine forgetting, has emerged to investigate how a model might unlearn or forget particular memorized examples. In this talk, you'll learn about the field of machine unlearning and related topics like data anonymization, and evaluate exactly what's truly unforgettable. Jokes aside: you'll leave with practical takeaways to apply to your own work in data and machine learning development.
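To make the idea concrete, here is a minimal, purely illustrative sketch (not from the talk; the toy data and nearest-centroid "model" are assumptions for illustration) of the simplest unlearning baseline: exact unlearning by retraining from scratch without the examples to be forgotten, then checking that the retrained model no longer depends on them.

```python
# Illustrative sketch: exact unlearning via retraining on the retain set.
# The "model" is a toy nearest-centroid classifier over 1-D features.
from statistics import mean

def train_centroids(data):
    """Nearest-centroid 'model': per-class mean of the feature."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))

# Hypothetical training data; (8.0, "a") is an outlier we want forgotten.
train = [(0.1, "a"), (0.2, "a"), (8.0, "a"), (4.0, "b"), (4.2, "b")]
forget = [(8.0, "a")]

full = train_centroids(train)
retain = [ex for ex in train if ex not in forget]
unlearned = train_centroids(retain)  # exact unlearning: retrain without it

# The outlier dragged class "a"'s centroid toward 3.0; once forgotten,
# the same query point is classified as "b" instead.
print(predict(full, 3.0), predict(unlearned, 3.0))  # → a b
```

Retraining from scratch is exact but expensive for deep models, which is why the research the talk surveys focuses on approximate unlearning methods and, crucially, on how to evaluate whether forgetting actually happened.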


Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

None

Katharine Jarmul is a privacy activist and an internationally recognized data scientist and lecturer who focuses her work and research on privacy and security in data science and machine learning. You can follow her work via her newsletter, Probably Private (https://probablyprivate.com), or in her recently published book, Practical Data Privacy (O'Reilly, 2023), now also available in German as Data Privacy in der Praxis.