How strong is my opponent? Using Bayesian methods for skill assessment PyConDE & PyData Berlin 2019

Correctly estimating a competitor's skill is a crucial task in sports forecasting and matchmaking. This talk provides an overview of the three most common algorithms for this task: Elo, Glicko2 and Trueskill.

If A beat B and B beat C, should A be ranked higher than C? What seems like an easy question quickly gets more complex once we ask when A last played B, what the scores were, and whether B's grandmother had died just before the match. The question of how to rank contestants correctly has been around for as long as people have been playing sports, and it has not lost its relevance. In online video games, rankings are used to build teams of comparable skill. In betting, a ranking translates directly into the probability of a contestant winning a match. And of course, every chess player worldwide can tell you their Elo rating.
In this talk we will look at the most established ranking algorithms: Elo, Glicko2 and Trueskill. All three are based on Bayesian updating. We will consider the theoretical foundations of the three and compare their use cases and shortcomings. All three algorithms are readily available as Python packages. Using a real-life data set, we will generate a ranking of German Bundesliga teams and compare it to the currently accepted status quo.
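To make the idea of rating updates concrete, here is a minimal sketch of the classic Elo update in plain Python. It is not taken from the talk: the function names, the starting rating of 1500, the K-factor of 32 and the three match results are illustrative assumptions, and the 400-point logistic scale is the convention commonly used in chess. Glicko2 and Trueskill additionally track the uncertainty of each rating, which this sketch does not attempt.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the Elo model (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both new ratings after one match.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k (assumed value) controls how strongly a single result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # What A gains, B loses, so the sum of all ratings stays constant.
    return rating_a + delta, rating_b - delta


# Hypothetical results, NOT real Bundesliga data: (first, second, score of first)
matches = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 0.5)]

# Every contestant starts from the same assumed rating.
ratings = {name: 1500.0 for name in "ABC"}
for first, second, score in matches:
    ratings[first], ratings[second] = update_elo(ratings[first], ratings[second], score)

# Highest rating first.
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking, ratings)
```

The expected score doubles as the model's win probability estimate, which is the link between a ranking and betting odds mentioned above. Because the update depends on the order in which matches are processed, replaying the same results in a different order generally yields slightly different ratings.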

Domains: Algorithms
Domain Expertise: none
Python Skill Level: none
Abstract as a tweet:

Introduction to the ranking algorithms Elo, Glicko2, and Trueskill.

Darina Goldin

Darina came to Data Science via a Ph.D. in Control Science. For the past four years, she has been working on predicting the outcomes of esports matches. By now she has probably implemented and applied every ranking algorithm published since the 1960s.