2025-04-25, Ferrum
A/B testing is a critical tool for making data-driven decisions, yet its statistical underpinnings—p-values, confidence intervals, and hypothesis testing—are often challenging for those without a background in statistics. Coders frequently encounter these concepts but lack a straightforward way to compute and interpret them using their existing skill set.
This talk presents a practical approach to A/B test evaluations tailored for coders. By utilizing Python’s random number generator and basic loops, it introduces bootstrapping as an accessible method for calculating p-values and confidence intervals directly from data. The goal is to simplify statistical concepts and provide coders with an intuitive understanding of how to evaluate test results without relying on complex formulas or statistical jargon.
Making A/B Test Evaluations Intuitive for Coders: A Python-Based Approach
A/B testing is an essential method for data-driven decision-making, but interpreting the results can be daunting. Jargon around p-values and confidence intervals often creates barriers to understanding. This talk simplifies A/B testing with a practical, Python-powered approach based on bootstrapping, a flexible and accessible method that matches how software engineers think and requires no prior statistical knowledge.
Session Highlights:
- Statistical Significance and Hypothesis Testing:
  - Why is statistical testing crucial for A/B tests? Because a simple comparison of two numbers ignores random variation.
  - Using Python, we’ll simulate "what-if" scenarios by shuffling and resampling data, so participants can compute p-values and see how likely a difference at least as large as the observed one would be under chance alone.
- Confidence Intervals with Bootstrapping:
  - Confidence intervals show the range of plausible values for the true effect.
  - We’ll resample the experiment data repeatedly to estimate variability and construct intuitive confidence intervals, using only basic tools like random number generators and loops, with no advanced math required.
- Key Takeaways:
  - Hands-on skills to compute p-values and confidence intervals using basic programming concepts.
  - Clear, step-by-step demonstrations of shuffling, resampling, and generating statistical insights.
  - Practical knowledge to move beyond black-box libraries and understand the "why" and "how" behind A/B test evaluations.
By the end of the session, attendees will be equipped to demystify A/B testing with a coder-friendly workflow, empowering them to make confident, data-driven decisions in their projects.
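The shuffling idea from the highlights can be sketched with nothing but the standard library. This is a minimal illustration, not the speaker's own code: the function name `permutation_p_value`, the conversion data, and the number of rounds are all assumptions made for the example. Shuffling the pooled observations and re-splitting them is commonly called a permutation test; it simulates a world in which the group label has no effect.

```python
import random

def permutation_p_value(a, b, n_rounds=10_000, seed=0):
    """Share of shuffles whose difference in means is at least as
    extreme as the observed difference (two-sided)."""
    rng = random.Random(seed)
    observed = sum(b) / len(b) - sum(a) / len(a)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_rounds):
        # Shuffle all observations, then split them into two fake groups
        # of the original sizes: the "what if there is no effect" world.
        rng.shuffle(pooled)
        fake_a = pooled[:len(a)]
        fake_b = pooled[len(a):]
        diff = sum(fake_b) / len(fake_b) - sum(fake_a) / len(fake_a)
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, hits / n_rounds

# Hypothetical data: 1 = converted, 0 = did not convert.
control = [1] * 100 + [0] * 900   # 100 of 1000 visitors converted
variant = [1] * 130 + [0] * 870   # 130 of 1000 visitors converted

observed, p_value = permutation_p_value(control, variant, n_rounds=2_000)
print(f"observed lift: {observed:.3f}, p-value: {p_value:.3f}")
```

A small p-value means that shuffles rarely produce a gap as large as the observed one, so chance alone is an unlikely explanation. More rounds give a more stable estimate at the cost of runtime.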
Talk Outline:
- Setting the Stage (5 minutes)
  - What is A/B testing?
  - Why isn't it enough to just compare numbers? Why do we need statistics to interpret results?
- Statistical Significance and P-Values (5 minutes)
  - Statistical tests (t-test, z-test, binomial test) are frequently used, but what is the intuition behind them?
  - Introducing the basic idea of bootstrapping.
- Bootstrapping Explained (8 minutes)
  - Step-by-step illustration of the bootstrapping approach.
  - What is a p-value? An intuitive description using resampling.
- Confidence Intervals Explained (7 minutes)
  - Importance of confidence intervals and how they help interpret results.
  - Intuitive computation of confidence intervals using bootstrapping.
  - Impact of sample size on confidence intervals and certainty.
- Why These Statistics Matter (5 minutes)
  - Discussion on the practical necessity of statistical techniques.
  - How these methods ensure data-driven decision-making in A/B testing.
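The confidence-interval part of the outline can be illustrated in the same spirit. The sketch below assumes a simple percentile bootstrap; the names `bootstrap_ci` and `make_group` and the conversion rates are hypothetical and chosen only for this example. Each round resamples both groups with replacement, and the middle 95% of the resulting differences is taken as the interval.

```python
import random

def bootstrap_ci(a, b, n_rounds=2_000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(b) - mean(a)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_rounds):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(sum(resampled_b) / len(b) - sum(resampled_a) / len(a))
    diffs.sort()
    # Cut off the same number of simulated differences at both ends.
    cut = round((1 - level) / 2 * n_rounds)
    return diffs[cut], diffs[n_rounds - 1 - cut]

def make_group(n, rate):
    """Hypothetical helper: n visitors, a share `rate` of whom converted."""
    converted = round(n * rate)
    return [1] * converted + [0] * (n - converted)

# Same conversion rates, two different sample sizes per group.
for n in (200, 2_000):
    low, high = bootstrap_ci(make_group(n, 0.10), make_group(n, 0.13))
    print(f"n={n} per group: 95% interval for the lift = [{low:.3f}, {high:.3f}]")
```

Running both sample sizes side by side shows the effect named in the outline: with the same underlying rates, the larger sample yields a visibly narrower interval.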
Novice
Expected audience expertise: Python: Novice
Thomas Mayer holds a PhD in Quantitative Language Comparison and brings a profound background in Machine Learning and Natural Language Processing (NLP) to his work. As Team Lead in the Data Intelligence team at HolidayCheck, Thomas combines his passion for data-driven insights with his expertise in linguistics and AI to drive innovation in the travel industry. With a deep understanding of both technical and business challenges, he plays a pivotal role in leveraging data to enhance customer experiences and inform strategic decisions.