AutoGluon: AutoML for Tabular, Multimodal and Time Series Data
2023-04-17, B05-B06

AutoML, or automated machine learning, promises to turn raw data into accurate predictions with minimal human intervention, expertise, and manual experimentation. In this talk, we will introduce AutoGluon, a cutting-edge toolkit that enables AutoML for tabular, multimodal and time series data. AutoGluon emphasizes usability, supporting a wide variety of tasks, from regression to time series forecasting and image classification, through a unified and intuitive API. We will focus on tabular and time series tasks, where AutoGluon is the current state of the art, and demonstrate how it can be used to achieve competitive performance on tabular and time series competition data sets. We will also peek under the hood of AutoGluon and discuss the techniques it uses to build and train these models automatically.


AutoGluon is a Python machine learning library that offers cutting-edge accuracy and value-for-compute on a wide variety of tasks. These include regression, classification and quantile regression on tabular data, as well as multimodal tasks such as image classification, image-to-text similarity and text-to-text similarity. A recent addition is AutoGluon-TimeSeries, the library's module for time series forecasting tasks.

AutoGluon is organized into modules for Tabular, Multimodal and Time Series tasks, all of which share an intuitive scikit-learn-like API. With this API, users can fit cutting-edge machine learning models and run inference in as little as three lines of code, without an in-depth understanding of ML. AutoGluon is widely considered the state of the art in tabular tasks, as confirmed by the independent AutoML Benchmark, and is the current top performer on multimodal tasks on the RAFT leaderboard. In this talk, we will focus on the tabular and time series modules and showcase how the library can be used to get competitive results on competition platforms such as Kaggle.
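
To give a feel for the fit-and-predict pattern described above, here is a minimal sketch of how the tabular and time series modules are typically called. The helper names `fit_and_predict` and `forecast`, the file paths, and the column names `item_id`, `timestamp` and `target` are assumptions made for illustration; they are not taken from the talk. The AutoGluon imports sit inside the functions so the sketch can be loaded even where AutoGluon is not installed.

```python
def fit_and_predict(train_path, test_path, label):
    """Train tabular models on one CSV file and predict `label` for another.

    `train_path`, `test_path` and `label` are hypothetical inputs supplied
    by the caller; nothing here is specific to a particular data set.
    """
    # Lazy import: requires `pip install autogluon` only when actually called.
    from autogluon.tabular import TabularDataset, TabularPredictor

    train_data = TabularDataset(train_path)                    # 1. load data
    predictor = TabularPredictor(label=label).fit(train_data)  # 2. fit models
    return predictor.predict(TabularDataset(test_path))        # 3. predict


def forecast(df, prediction_length, target="target"):
    """Forecast `prediction_length` steps ahead for every series in `df`.

    Assumes `df` is a long-format pandas DataFrame with the (assumed)
    columns "item_id", "timestamp" and the `target` column.
    """
    from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

    train_data = TimeSeriesDataFrame.from_data_frame(
        df, id_column="item_id", timestamp_column="timestamp"
    )
    predictor = TimeSeriesPredictor(
        prediction_length=prediction_length, target=target
    ).fit(train_data)
    return predictor.predict(train_data)
```

Both helpers follow the same three steps (load, fit, predict), which is the sense in which the modules share one API. Training time and the set of models tried depend on AutoGluon's defaults and on the installed version.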

AutoGluon also differs significantly under the hood from other AutoML frameworks. The library does not treat AutoML primarily as hyperparameter optimization; instead, it leans heavily on building (stack) ensembles of strong but varied learning algorithms to achieve superior results. We will also showcase some of the theory and building blocks of AutoGluon, describing how we built an AutoML system that treats model ensembling as a central element.
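
The general idea of ensembling varied learners can be illustrated with a small, self-contained sketch. This is not AutoGluon's implementation: the three toy base learners, the fold scheme and the greedy weighting used as the second-level combiner are all assumptions chosen to keep the example short and dependency-free. The sketch produces out-of-fold predictions from each base learner, then learns combination weights from those predictions.

```python
from statistics import mean

# Three deliberately different toy learners for 1-D regression.
# Each takes training data and returns a prediction function.
def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

BASE_LEARNERS = [fit_mean, fit_linear, fit_nearest]

def out_of_fold_predictions(xs, ys, k=3):
    """Predict each point with models that never saw it during training."""
    n = len(xs)
    oof = [[0.0] * len(BASE_LEARNERS) for _ in range(n)]
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        valid_idx = [i for i in range(n) if i % k == fold]
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        for j, fit in enumerate(BASE_LEARNERS):
            model = fit(tx, ty)
            for i in valid_idx:
                oof[i][j] = model(xs[i])
    return oof

def greedy_weights(oof, ys, rounds=10):
    """Second level: repeatedly add (with replacement) the model whose
    inclusion gives the lowest squared error of the averaged prediction."""
    n_models = len(oof[0])
    counts = [0] * n_models
    sums = [0.0] * len(ys)  # running sum of the selected models' predictions
    for r in range(1, rounds + 1):
        def mse_if_added(j):
            return mean(((s + row[j]) / r - y) ** 2
                        for s, row, y in zip(sums, oof, ys))
        best = min(range(n_models), key=mse_if_added)
        counts[best] += 1
        sums = [s + row[best] for s, row in zip(sums, oof)]
    return [c / rounds for c in counts]

def fit_stack(xs, ys, k=3):
    """Return (predict_fn, weights) for the two-level ensemble."""
    weights = greedy_weights(out_of_fold_predictions(xs, ys, k), ys)
    models = [fit(xs, ys) for fit in BASE_LEARNERS]  # refit on all data
    return (lambda x: sum(w * m(x) for w, m in zip(weights, models))), weights
```

Calling `fit_stack(xs, ys)` on two equal-length lists of numbers returns a prediction function together with the learned weights, which sum to one. Learning the weights on out-of-fold predictions rather than on training-set predictions keeps a learner that memorizes its training data, such as the nearest-neighbour one here, from being rewarded for it.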


Expected audience expertise: Python:

Intermediate

Expected audience expertise: Domain:

Novice

Abstract as a tweet:

Learn about #AutoML and @AutoGluon, which can handle a range of tasks from regression to image classification and time series forecasting with state-of-the-art performance. #AutoML #datascience

Public link to supporting material:

http://auto.gluon.ai

Caner Turkmen is a Senior Applied Scientist at Amazon Web Services, where he works on problems at the intersection of machine learning and forecasting, in addition to developing AutoGluon-TimeSeries. Before joining AWS, he worked in the management consulting industry as a data scientist, serving the financial services and telecommunications industries on projects across the globe. Caner’s personal research interests span a range of topics, including forecasting, causal inference, and AutoML.

Oleksandr Shchur is an Applied Scientist at Amazon Web Services, where he works on time series forecasting in AutoGluon. Before joining AWS, he completed a PhD in Machine Learning at the Technical University of Munich, Germany, doing research on probabilistic models for event data. His research interests include machine learning for temporal data and generative modeling.