PyConDE & PyData Berlin 2024

Everything you need to know about change-point detection
2024-04-22, B05-B06

Change-point detection is a crucial processing step when dealing with long and non-stationary time series. It has been applied in many contexts, such as human activity recognition, speech/sound processing and industrial monitoring. This talk guides data scientists, engineers and researchers through the mathematical foundations of this subject, introduces the ruptures Python package for change-point detection, and illustrates algorithms in a biomedical context. By the end, the audience will be able to integrate them into complex data pipelines.


How do you detect an activity change (e.g. walking to running to biking) from smartwatch data? Or abrupt transitions in paleoclimate records? Or when a server failure occurs, using hardware telemetry sensor data (fan speed, acoustic noise, etc.) and software metrics (CPU, memory, I/O, etc.)? If you work with long time series, you will inevitably have to detect changes in the data-generating model.

Change-point detection is a crucial task for such signals. It consists of estimating the timestamps at which the underlying signal model changes. First introduced in the 1950s to monitor quality changes in industrial processes, the subject has since been extended to numerous contexts, such as sound/speech processing, human activity recognition, DNA analysis, analysis of the effects of COVID-19 policies, and software and hardware monitoring. Over several decades, it has generated a large but heterogeneous body of work.
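
To make the idea of "estimating the timestamps" concrete, here is a minimal sketch in plain Python. It is not taken from the talk or from ruptures. It assumes the simplest possible setting: a one-dimensional signal with exactly one change in its mean, located by trying every split and keeping the one that minimises the total squared error around each segment's mean. The function names and the toy signal are made up for illustration.

```python
def squared_error(values):
    """Sum of squared deviations of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def detect_single_mean_shift(signal):
    """Return the index at which splitting the signal in two gives the
    lowest total squared error (assumes exactly one change in the mean)."""
    best_index, best_cost = None, float("inf")
    for t in range(1, len(signal)):
        cost = squared_error(signal[:t]) + squared_error(signal[t:])
        if cost < best_cost:
            best_index, best_cost = t, cost
    return best_index


# Toy signal (illustrative): the mean jumps from about 0 to about 5 at index 5.
signal = [0.1, -0.2, 0.0, 0.3, -0.1, 5.2, 4.9, 5.1, 4.8, 5.0]
change_point = detect_single_mean_shift(signal)
print(change_point)
```

Real signals usually contain an unknown number of changes, and not only in the mean, which is where dedicated search methods and cost functions come in.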

This talk will help data scientists, engineers and researchers navigate this vast literature. We will start by describing the mathematical and algorithmic background behind change-point detection in a high-level and easy-to-understand fashion. Then, we will introduce ruptures, a Python package containing many change-point detection methods, as well as calibration and visualisation routines. Algorithms will be illustrated in a real-world biomedical application.
At the end of the talk, the audience will be able to understand when to use change-point detection algorithms and how to calibrate and integrate them in a complex data pipeline.

Time breakdown:
- Introduction and motivations: 5 min
- Background on change-point detection: 10 min
- Python framework: 5 min
- Illustration on a real-world biomedical data pipeline: 10 min
- Q&A: 5 min


Expected audience expertise: Python:

Intermediate

Public link to supporting material, e.g. videos, Github, etc.:

https://github.com/deepcharles/ruptures

Expected audience expertise: Domain:

Novice

Abstract as a tweet (X) or toot (Mastodon):

How do you detect an activity change from smartwatch data, abrupt climate transitions, or server failures? If you work with long time series, you will inevitably have to detect changes. This talk describes how to do that using ruptures (https://github.com/deepcharles/ruptures).

Charles Truong is a researcher at Centre Borelli, ENS Paris-Saclay, France. His research interests lie at the intersection of signal processing, statistics and machine learning. Most of his work is applied in biomedical and industrial contexts. He is the core developer of ruptures, a Python package dedicated to change-point detection algorithms.