PyCon DE & PyData 2025

expectation: A modern take on statistical A/B testing with e-values and martingales
2025-04-23, Helium3

This talk introduces a novel Python library for statistical testing using e-values, offering a refreshing alternative to traditional p-values. We'll explore how this approach enables real-time sequential testing, allowing data scientists to monitor experiments continuously without the statistical penalties of repeated testing. Through practical examples, we'll demonstrate how e-values provide more intuitive evidence measures and enable flexible stopping rules in A/B testing, clinical trials, and anomaly detection. The library implements cutting-edge methods from game-theoretic probability, making advanced sequential testing accessible to Python practitioners. Whether you're conducting A/B tests, monitoring production models, or running clinical trials, this talk will equip you with powerful new tools for sequential data analysis.


Modern data science demands flexible statistical methods that can handle sequential data analysis and continuous monitoring. Traditional p-values, while widely used, are designed for a sample size fixed in advance: if an experiment is checked repeatedly and stopped at the first significant result, the Type I error rate is no longer controlled at the nominal level. This talk introduces a Python library that implements e-values and e-processes, offering a more natural approach to measuring statistical evidence and enabling true sequential testing.
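To make the idea of an e-process concrete, here is a minimal sketch in plain Python. It does not use the `expectation` library's API; the function names `likelihood_ratio_wealth` and `first_rejection_time`, the conversion rates, and the simulated data are all assumptions chosen for illustration. The sketch tests a simple null conversion rate against a simple alternative by tracking a bettor's wealth, which is a running likelihood ratio. Under the null hypothesis this wealth is a nonnegative martingale starting at 1, so by Ville's inequality the probability that it ever reaches 1/alpha is at most alpha, no matter how often we look.

```python
import random


def likelihood_ratio_wealth(observations, p_null, p_alt):
    """Running wealth of a bettor who bets on p_alt against p_null.

    Each observation is 0 or 1. The wealth after t observations is the
    likelihood ratio of the data under p_alt versus p_null, which is an
    e-process for the simple null hypothesis "success rate == p_null".
    """
    wealth = 1.0
    path = []
    for x in observations:
        if x == 1:
            wealth *= p_alt / p_null
        else:
            wealth *= (1 - p_alt) / (1 - p_null)
        path.append(wealth)
    return path


def first_rejection_time(path, alpha):
    """First time (1-indexed) the wealth reaches 1/alpha, or None if never."""
    threshold = 1.0 / alpha
    for t, wealth in enumerate(path, start=1):
        if wealth >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical experiment: the null says 10% conversion, we bet on 15%,
    # and the simulated data are generated with a true rate of 15%.
    rng = random.Random(0)
    data = [1 if rng.random() < 0.15 else 0 for _ in range(2000)]

    path = likelihood_ratio_wealth(data, p_null=0.10, p_alt=0.15)
    stop = first_rejection_time(path, alpha=0.05)
    print("final wealth:", path[-1])
    print("first time wealth >= 1/alpha:", stop)
```

Because the rejection rule only asks whether the wealth has ever crossed the threshold, the experiment can be monitored after every observation and stopped as soon as it does, which is the "peeking" behaviour that fixed-sample p-values do not support.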

Outline:
1. Statistical toolkit
    - Current tools
    - Purpose and fundamental concepts
    - Challenges in modern statistics
    - Type I error concerns
    - Optional stopping problems

2. Sequential testing
    - Origins
    - The concept of sequential testing
    - Peeking

3. e-values
    - What are e-values?
    - Definitions and concepts
    - Betting interpretation
    - Wealth process
    - Ville's inequality
    - Anytime-valid inference
    - p-value vs. e-value differences

4. Python library
    - Architecture
    - Core components
    - Installation and basic setup

5. Demo 1: A/B testing

6. Beyond A/B testing
    - Broader applications
    - Conformal e-testing
    - Confidence sequences

7. Demo 2: It is a versatile library

8. Acknowledgments

Q&A Session


Expected audience expertise: Domain:

Advanced

Expected audience expertise: Python:

Novice

Public link to supporting material, e.g. videos, Github, etc.:

https://github.com/jakorostami/expectation

I am a Machine Learning Engineer at H&M Group and a former Data Scientist at Lidl Sweden. In my professional work I design machine learning services, extract insights, and build meaningful stories for my clients through high-quality modeling, engineering, data mining, and analytics.

I have a Bachelor's degree in Statistics and Probability Theory from Uppsala University in Sweden. As a statistician at heart, I have solid experience with data science, Python, R, time series modeling, simulations, machine learning algorithms, SQL, Excel, Spark, and database technologies, as well as good communication skills.

I have open-sourced two comprehensive Python libraries, both of which you can find on my GitHub. The first, ’expectation’, implements an emerging hypothesis testing framework built on e-values and martingales from game-theoretic statistics. The second, ’supplyseer’, is for computational supply chain and logistics.
