PyConDE & PyData Berlin 2024

Jan Teichert-Kluge

My name is Jan, and I am a research associate at the University of Hamburg, where I am pursuing a PhD in statistics and data science. I hold a master's degree in industrial engineering, which, together with my industry experience, gives me a strong application-oriented background.
I have contributed to the DoubleML package for Python, and my research focuses on Causal ML for unstructured data such as text and images.


Session

04-23
10:30
90min
Using ML to find out the "Why"? A Tutorial in Causal Machine Learning
Oliver Schacht, Jan Teichert-Kluge

Machine learning is mostly used for predicting outcome variables. But in many cases, we are interested in causal questions: Why do customers churn? What is the effect of a price change on sales? How can we optimize personalized marketing campaigns or medical treatments?

This tutorial introduces participants to the field of Causal Machine Learning (Causal ML). We will start with a basic motivation of causal analysis and share insights on how to recognize causal questions in data science. We will dive into the basics of Causal ML: Why can't we simply use off-the-shelf ML methods to answer causal questions? The tutorial will focus on the Double Machine Learning approach and demonstrate the use of Causal ML with the Python library DoubleML (Bach et al., 2022). The general introduction will be complemented by hands-on data examples, interactive discussions, and Q&A sessions. The tutorial is a great starting point for participants to discover Causality/Causal ML and start their own causal data science projects.

References

Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2022), DoubleML: An Object-Oriented Implementation of Double Machine Learning in Python, Journal of Machine Learning Research, 23(53): 1-6, https://www.jmlr.org/papers/v23/21-0862.html

PyData: Machine Learning & Deep Learning & Stats
A03-A04