PyConDE & PyData Berlin 2024

Time series anomaly detection with a human-in-the-loop
04-23, 16:00–16:30 (Europe/Berlin), B09

Across industries, the move towards Industry 4.0 solutions means that the amount of collected sensor data keeps growing. At this volume, manual monitoring of the collected time series becomes cumbersome, if not impossible. Yet careful inspection of the data and identification of possible anomalies in it is crucial for detecting problems in the underlying processes. To meet this demand, ZEISS is developing a fully automated time series processing tool that performs ML-based time series anomaly detection with a human-in-the-loop.


Starting from a completely unlabelled dataset, unsupervised anomaly detection is performed. The identified anomaly candidates are presented in a web app to domain experts, who judge whether each flagged time series segment is indeed abnormal or shows expected behaviour, i.e., is a false positive of the anomaly detection. The domain experts' feedback is stored to create a partially labelled dataset. Storing the collected labels has three intended benefits: 1) Metrics can be computed to evaluate the performance of the initially unsupervised anomaly detection run. 2) The number of false positives generated by the algorithm, i.e., time series segments that were incorrectly flagged as anomalies, can be reduced via pattern matching. 3) With a partially labelled dataset, methods that are more specific to the domain problem might be applied, such as semi-supervised anomaly detection or time series classification.
The framework uses open-source tools, and all of its components (data pipelines, anomaly detection, and web app) are deployed to the cloud.
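
The feedback loop described above can be illustrated with a small sketch. The abstract does not name the detection algorithm, the metrics, or the pattern-matching method, so everything below is an assumption made for illustration: a rolling z-score stands in for the unsupervised detector, precision over the reviewed candidates stands in for the metrics, and a z-normalised Euclidean distance stands in for the pattern matching. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def detect_candidates(series, window=20, threshold=4.0):
    """Flag indices that deviate strongly from the trailing window.

    Assumption: a rolling z-score as a stand-in unsupervised detector.
    """
    candidates = []
    for i in range(window, len(series)):
        ref = series[i - window:i]
        std = ref.std()
        if std > 0 and abs(series[i] - ref.mean()) / std > threshold:
            candidates.append(i)
    return candidates


def precision_from_feedback(labels):
    """Share of reviewed candidates that experts confirmed as anomalies.

    `labels` maps a candidate index to True (anomaly) or False (false positive).
    """
    if not labels:
        return float("nan")
    return sum(labels.values()) / len(labels)


def znorm(segment):
    """Rescale a segment to zero mean and unit standard deviation."""
    segment = np.asarray(segment, dtype=float)
    centred = segment - segment.mean()
    std = segment.std()
    return centred / std if std > 0 else centred


def matches_known_false_positive(segment, fp_patterns, max_dist=1.0):
    """True if the segment closely resembles a segment an expert rejected.

    Assumption: all segments have the same length.
    """
    z = znorm(segment)
    return any(np.linalg.norm(z - znorm(p)) <= max_dist for p in fp_patterns)


# Synthetic sensor signal with two injected spikes (illustrative data only).
series = np.sin(np.linspace(0, 20, 400))
series[150] += 3.0
series[300] += 3.0

# Step 1: unsupervised detection on unlabelled data.
candidates = detect_candidates(series)

# Step 2: hypothetical expert feedback collected through the web app.
feedback = {150: True, 300: False}

# Benefit 1: a metric for the unsupervised run, based on the stored labels.
precision = precision_from_feedback(feedback)

# Benefit 2: store rejected segments and suppress look-alike candidates later.
fp_patterns = [series[290:311]]
suppress = matches_known_false_positive(series[290:311], fp_patterns)
```

In such a setup, a candidate for which `matches_known_false_positive` returns `True` would be hidden from the experts rather than shown again. The distance threshold `max_dist` controls how aggressively candidates are suppressed and would need tuning on real data.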


Expected audience expertise: Python

None

Abstract as a tweet (X) or toot (Mastodon)

Across industries, the move towards Industry 4.0 solutions means that the amount of collected sensor data keeps growing. At this volume, manual monitoring of the collected time series becomes cumbersome, if not impossible.

Expected audience expertise: Domain

None

With a background in particle physics, Philipp Millet has worked as a Data Scientist for HotSprings/Umlaut/Accenture on various projects across different domains. In 2023 he joined ZEISS Digital Partners as a Machine Learning Engineer. His focus is on taking Data Science projects from the PoC stage into production.