PyConDE & PyData Berlin 2024

The key to reliability - Testing in the field of ML-Ops
04-23, 14:45–15:30 (Europe/Berlin), B05-B06

Testing is a de facto standard in modern software development. With the growing awareness that comes with ML-Ops, testing is becoming more important for the development and operation of machine learning-based components. In this talk we share our view of, and our solution for, testing in the field of machine learning. We present the testing strategy we apply and the lessons learned from four years of operating idealo’s cataloging system.


idealo.de offers a price comparison service for millions of products from a wide variety of categories. It navigates a dynamic landscape of about 3.7 billion offers from more than 50,000 shops. Our central challenge is cataloging this huge volume of offers automatically, and machine learning plays a crucial role in processing the data.

Machine learning components must be considered as part of a more complex domain. In our domain, those components are part of an event-driven, asynchronous architecture. The need to continuously develop, deliver, and train models, together with the need to work smoothly alongside traditional software components, places high demands on stable software development and operations. Testing plays a crucial role here and raises many open questions in the field of machine learning.

In this talk we want to share and present our holistic approach to testing in machine learning. The following aspects are taken into account:
- Introduction to our machine learning lifecycle
- Testing in the context of traditional software development: unit tests, code coverage, contract tests, and tests on infrastructure as code
- Specific challenges of testing in the machine learning domain: end-to-end tests of training pipelines and deployment testing of inference endpoints in operational modes
- The role of logging and monitoring for safe operations
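
To make two of the aspects above concrete, here is a minimal sketch of a unit test and a deployment smoke test for an inference endpoint, written as plain pytest-style test functions. It is not taken from the talk: `normalize_title`, `Prediction`, `smoke_test_endpoint`, and `fake_predict` are hypothetical names, and the stub predictor stands in for a real deployed endpoint.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


def normalize_title(title: str) -> str:
    """Hypothetical preprocessing step: lowercase and collapse whitespace."""
    return " ".join(title.lower().split())


@dataclass
class Prediction:
    """Hypothetical response shape of an inference endpoint."""
    category: str
    score: float


def smoke_test_endpoint(
    predict: Callable[[str], Prediction],
    samples: Sequence[str],
    allowed_categories: set[str],
) -> list[str]:
    """Call the endpoint on a few samples and collect contract violations.

    Returns a list of failure messages; an empty list means the check passed.
    """
    failures = []
    for sample in samples:
        result = predict(normalize_title(sample))
        if result.category not in allowed_categories:
            failures.append(f"{sample!r}: unknown category {result.category!r}")
        if not 0.0 <= result.score <= 1.0:
            failures.append(f"{sample!r}: score {result.score} outside [0, 1]")
    return failures


def fake_predict(title: str) -> Prediction:
    """Stub standing in for a deployed model (assumption for this sketch)."""
    category = "phones" if "phone" in title else "other"
    return Prediction(category=category, score=0.9)


def test_normalize_title():
    # Classic unit test: deterministic input, exact expected output.
    assert normalize_title("  Smart   PHONE X ") == "smart phone x"


def test_endpoint_smoke():
    # Deployment check: validates the response contract, not model quality.
    failures = smoke_test_endpoint(
        fake_predict, ["Smart PHONE X", "Garden chair"], {"phones", "other"}
    )
    assert failures == []
```

In a real setup, `fake_predict` would be replaced by a thin client for the deployed endpoint, so the same check can run after each deployment. The smoke test deliberately asserts only on the shape and range of the response, because exact model outputs may change with every retraining.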

The presented test strategy is based on our four years of experience in operating idealo's cataloging system. Examples follow our tech stack, which includes pytest, CDK, Pactman, AWS SageMaker, GitHub Actions, OpenSearch Kibana and Grafana.


Expected audience expertise: Domain

Intermediate

Expected audience expertise: Python

Intermediate

Abstract as a tweet (X) or toot (Mastodon)

idealo.de presents its holistic approach for testing in machine learning

See also: Presentation Slides (1.5 MB)

Gunar Maiwald has a background in Computer Science. For the last 4 years he has worked as an ML engineer at idealo.de. His professional programming path led him from Perl via TypeScript to Python.

Tobias Senst is a Senior Machine Learning Engineer at idealo internet GmbH. He received his PhD in 2019 from the Technische Universität Berlin under the supervision of Prof. Thomas Sikora and has more than 10 years of experience in Computer Vision and Video Analytics research.

At idealo, he switched from the world of images and videos to Natural Language Processing and is responsible for the operation and development of machine learning models in a production environment.