PyCon DE & PyData 2025

How to use Data Science Superpowers in real life, a Bayesian perspective
2025-04-23 , Helium3

In the data science field, we use all these powerful methods to solve important problems. Most of the time, we do this very well because our data science and machine-learning toolbox fits the problems we tackle quite precisely. Yet, what about our everyday choices or even our most important life decisions? Can we use for our private lives what we advocate for in our jobs or are these choices inherently different?
Many of this real life decisions are a little different than textbook machine-learning problems. There is often less or hard-to-come-by data and the decisions are infrequent, but sometimes very consequential. This talk will dive into what makes everyday decisions difficult to handle with our data science toolbox. It will show how Bayesian thinking can help to reason in such cases, especially when there is not a lot of data to rely on.


In this talk, I want to have a look on decision making from a slightly different angle. In a world that produces an ever growing amount of data in every domain, data scientists can shine with their tools to make data-driven decisions. Often there is even too much data and the most tedious part of the work is to remove the noise from the signal with clever feature engineering. Though the world gets covered more and more by big data, this development is not distributed evenly.

Lots of decisions we need to make in real life do not follow this pattern. In fact, there are often surprisingly few data points that help us here. Yet, are there fundamental differences between everyday decisions and the type of decisions we automate so well with machine learning in our jobs? In this part of the talk, I will attempt a characterisation of both types of decisions. We will have a closer look at what implicit assumptions we make to use our machine learning toolbox. After this we might get a first explanation why these tools might be unsuited to answer questions like ’how longe should I study for an exam’ or ‚’should I accept this new job or not’.

Enter Bayesian statistics: This part of the talk will introduce Bayesian statistics for beginners using simple examples and images. It will highlight the benefits of the method when we are short of data but have some additional experience not encoded in the data. I will show how in these circumstances prior distributions come in really handy.

After laying the groundwork on Bayesian methods we will circle back to the everyday decisions and see how well both things fit together. On a higher level, this will show what makes problems in decision making a great fit for Bayesian methods. I will introduce this using a practical example. The example will deal with the decision how long one should study for a test or exam. Taking a step-by-step approach, we explore how this decision can be informed with just a few data points. Set aside finding the key to successful exam preparation, the example is also helpful to see some of the basics for working with the pymc library.

The talk will end with some more general thoughts. This will answer where to go from here and for which decisions a thorough investigation like the presented one is worthwhile.,Yet, once one is familiar with the basics of Bayesian thinking, there might be shortcuts. I will show that we can use the principles as a great tool to improve discussions about important decisions on a broader scale.


Expected audience expertise: Domain:

Novice

Expected audience expertise: Python:

Novice

I am currently working as a Senior Data Scientist at Ailio. My focus is on helping improve organizations by better utilizing their data. I contribute to these transformation projects by bringing in my broad expertise in data related topics ranging from data engineering and cloud-development (AWS, Azure) over data science and machine learning to communication and leadership skills.

After completing my masters in chemistry, I really started my journey in the data science and machine learning field during my PhD studies in theoretical chemistry. The next step for me was a role as a data scientist in a company developing software in the IT-Security field. For five years, I worked on a system to detect suspicious e-mail traffic using machine learning. Set aside the technical aspect of the job, I also built a small team. From this experience I learnt a lot about leadership and developing software products on a larger scale.

I strongly believe that using the right data to inform important decisions helps organizations of all kinds improve. However, often this is easier said than done. I am always curios to discover and tackle these interesting challenges. Also, I am more than happy to sharing my knowledge and learnings.