Offline Ranking Validation - Predicting A/B Test Results
06-13, 12:40–13:00 (Europe/Berlin), Kesselhaus

Implementing a machine learning model for ranking in ecommerce search requires a carefully defined target metric. In our team we validate our target metrics with online tests on live traffic. These tests need long preparation times and must run long enough to yield valid results. Having to choose only a few candidates for the next A/B test is hard and slows us down significantly. So what if we had a way to evaluate the candidates beforehand and make a more informed decision?

We came up with an approach to predict how a given ranking will perform in an onsite test. We leverage historical user interaction data from search events and try to correlate it with ranking metrics like NDCG. This gives us insight into how well a ranking meets user intent. The approach is not meant to replace a real A/B test, but it allows us to narrow down the field of candidates to a manageable number. In this talk we will share our approach to offline ranking validation and how it performed in practice.
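
To make the idea concrete, here is a minimal sketch of how an offline metric such as NDCG could be computed from logged search interactions. It is an illustration under stated assumptions, not the speakers' actual pipeline: the event format, the function names (`relevance_from_log`, `ndcg_at_k`), and the use of click-through rate as a relevance label are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def relevance_from_log(events):
    """Derive relevance labels from logged search events.

    `events` is an iterable of (query, product_id, clicked) tuples.
    Assumption: the click-through rate of a product for a query is
    used as a proxy for its relevance.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for query, product_id, was_clicked in events:
        shown[(query, product_id)] += 1
        clicked[(query, product_id)] += int(was_clicked)

    labels = defaultdict(dict)
    for (query, product_id), impressions in shown.items():
        labels[query][product_id] = clicked[(query, product_id)] / impressions
    return labels


def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """NDCG@k of a candidate ranking against per-product relevance labels."""
    gains = [relevance.get(product_id, 0.0) for product_id in ranked_ids[:k]]
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical log for a single query.
events = [
    ("shoes", "a", True),
    ("shoes", "a", False),
    ("shoes", "b", True),
    ("shoes", "c", False),
]
labels = relevance_from_log(events)

# Two candidate rankings for the same query.
score_candidate_1 = ndcg_at_k(["b", "a", "c"], labels["shoes"])
score_candidate_2 = ndcg_at_k(["c", "a", "b"], labels["shoes"])
```

Averaging such scores over many historical queries gives one offline number per candidate ranking. Those numbers can then be compared with the outcomes of past A/B tests to check how well the offline metric tracks online performance.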

The Search track is presented by OpenSource Connections


Andrea Schütt is a Data Scientist on OTTO’s search team. She is currently working on bringing OTTO’s first learning-to-rank model into production. She has a degree in electrical engineering with a focus on automation.

Yunus is a Data Scientist at OTTO, where he works on bringing OTTO’s first learning-to-rank model into production. Prior to joining OTTO, he worked as a Data Scientist and Engineer at Deloitte, where he developed pragmatic, data-driven solutions for various clients. He holds an M.Sc. in Quantitative Economics with a focus on statistics and time series analysis.