TSML (Time Series Machine Learning)
2019-07-23, Room 349

TSML is a package for time series data processing, classification, and prediction. It provides a common API for ML libraries from Python's ScikitLearn, R's caret, and native Julia, so that heterogeneous libraries can be combined seamlessly into complex ensembles for robust time-series preprocessing, prediction, clustering, and classification.


Over the past few years, the industrial sector has seen many innovations brought about by automation. Inherent in this automation is the installation of sensor networks for status monitoring and data collection. One of the major challenges in these data-rich environments is how to extract and exploit information from these large volumes of data in order to detect anomalies, discover patterns that reduce downtime and manufacturing errors, reduce energy usage, and so on.

To address these issues, we developed the TSML package. It uses AI and ML libraries from ScikitLearn, caret, and Julia as building blocks for processing large amounts of industrial time series data. It has the following characteristics:
- TS data type clustering/classification for automatic data discovery
- TS aggregation based on date/time interval
- TS imputation based on symmetric Nearest Neighbors
- TS statistical metrics for data quality assessment
- TS ML wrapper with 100+ libraries from caret, ScikitLearn, and Julia
- TS date/value matrix conversion of 1-D TS using sliding windows for ML input
- Common API wrappers for ML libs from JuliaML, PyCall, and RCall
- Pipeline API for high-level description of the processing workflow
- Specific cleaning/normalization workflow based on data type
- Automatic selection of optimised ML model
- Automatic segmentation of time-series data into matrix form for ML training and prediction
- Easily extensible architecture by using just two main interfaces: fit and transform
- Meta-ensembles for robust prediction
- Support for distributed computation for scalability and speed
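
As an illustration of three of the preprocessing steps listed above (aggregation by time interval, imputation from symmetric nearest neighbors, and sliding-window conversion of a 1-D series into a matrix), here is a minimal Python sketch. It is not TSML's API or implementation: the function names, the hourly interval, the use of the median as the aggregate, and the neighborhood size `k` are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from statistics import median


def aggregate_hourly(samples):
    """Group (timestamp, value) pairs into hourly buckets (assumed interval).

    Each bucket is summarized by its median (an assumption); hours with no
    samples are kept as None so that they can be imputed later.
    """
    buckets = {}
    for ts, value in samples:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(hour, []).append(value)
    out = []
    hour, end = min(buckets), max(buckets)
    while hour <= end:
        vals = buckets.get(hour)
        out.append((hour, median(vals) if vals else None))
        hour += timedelta(hours=1)
    return out


def impute_symmetric_nn(values, k=1):
    """Fill each None with the median of the known values among the k
    neighbors on each side (a symmetric window around the gap)."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            left = [x for x in values[max(0, i - k):i] if x is not None]
            right = [x for x in values[i + 1:i + 1 + k] if x is not None]
            neighbors = left + right
            if neighbors:
                filled[i] = median(neighbors)
    return filled


def sliding_windows(values, size, stride=1):
    """Turn a 1-D series into rows of `size` consecutive values."""
    return [values[i:i + size]
            for i in range(0, len(values) - size + 1, stride)]


# Hypothetical sensor readings with a gap between 10:00 and 11:00.
samples = [
    (datetime(2019, 7, 23, 9, 5), 1.0),
    (datetime(2019, 7, 23, 9, 40), 3.0),
    (datetime(2019, 7, 23, 11, 15), 6.0),
    (datetime(2019, 7, 23, 12, 30), 8.0),
]
hourly = aggregate_hourly(samples)
filled = impute_symmetric_nn([v for _, v in hourly], k=1)
windows = sliding_windows(filled, size=3)
```

Each row of `windows` can then serve as one input sample for a learner, for example with the first columns as features and the last column as the value to predict.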

TSML uses a pipeline that iteratively calls the fit and transform families of functions, relying on multiple dispatch to select the correct algorithm. Machine learning functions in TSML are wrappers around the corresponding ScikitLearn, caret, and native Julia ML libraries. More than a hundred classifiers and regression functions are available through a common API.
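
The fit/transform pipeline idea can be sketched in a few lines of Python. This is a conceptual illustration only, not TSML's code: the step types `MeanImputer` and `Standardizer` and the helper `run_pipeline` are hypothetical, and Python's `functools.singledispatch` (dispatch on the first argument only) stands in for Julia's multiple dispatch.

```python
from functools import singledispatch
from statistics import mean, pstdev


class MeanImputer:
    """Hypothetical step: replaces missing values with the training mean."""
    def __init__(self):
        self.fill = None


class Standardizer:
    """Hypothetical step: rescales values to zero mean and unit variance."""
    def __init__(self):
        self.mu = 0.0
        self.sigma = 1.0


@singledispatch
def fit(step, data):
    raise NotImplementedError(f"no fit method for {type(step).__name__}")


@singledispatch
def transform(step, data):
    raise NotImplementedError(f"no transform method for {type(step).__name__}")


@fit.register(MeanImputer)
def _fit_imputer(step, data):
    # Learn the fill value from the observed (non-missing) entries.
    step.fill = mean(x for x in data if x is not None)


@transform.register(MeanImputer)
def _transform_imputer(step, data):
    return [step.fill if x is None else x for x in data]


@fit.register(Standardizer)
def _fit_standardizer(step, data):
    step.mu = mean(data)
    step.sigma = pstdev(data) or 1.0  # avoid dividing by zero on constant data


@transform.register(Standardizer)
def _transform_standardizer(step, data):
    return [(x - step.mu) / step.sigma for x in data]


def run_pipeline(steps, data):
    """Fit each step on the current data, then pass its output to the next."""
    for step in steps:
        fit(step, data)
        data = transform(step, data)
    return data


result = run_pipeline([MeanImputer(), Standardizer()], [1.0, None, 3.0])
```

In this sketch, adding a new processing step means defining one new type plus one `fit` and one `transform` method for it, which mirrors the two-interface extension model described above.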

Full TSML documentation: https://ibm.github.io/TSML.jl/stable/


Co-authors

Joern Ploennigs, Niall Brady

I am a research scientist at the IBM Dublin Research Lab working in the areas of analytics, data mining, optimization, development of intelligent agents using machine learning and evolutionary computation, neuroinformatics, and biomedical engineering.