Common issues with Time Series data and how to solve them PyCon DE & PyData Berlin 2023

Common issues with Time Series data and how to solve them
2023-04-17 15:10–15:40, B05-B06

Time-series data is all around us: from logistics to digital marketing, from pricing to stock
markets. It’s hard to imagine a modern business that has no time series data to forecast.
However, mastering such forecasting is not an easy task.
For this talk, together with other domain experts, I have collected a list of common time
series issues that data professionals regularly run into. After this talk, you will be able to
identify, understand, and resolve such issues. This will include stabilising divergent time
series, organising delayed / irregular data, handling missing values without anomaly propagation,
and reducing the impact of noise and outliers on your forecasting models.

This talk will walk you through four common issues with time series and illustrate them in
the context of energy demand forecasting. For each issue, you will learn how to identify,
understand, and resolve it. The issues are: time series instability, delayed and
irregular time series data, hard-to-impute missing values, and the impact of noise and outliers on
forecasting models. The talk is therefore split into four parts, each with some room for
questions. Each part provides some high-level background, explanations, examples, and
code snippets, while avoiding unnecessary in-depth computations and formulas. The
whole talk is therefore accessible both to specialists with experience in time series analytics and
to those without such experience who want to broaden their
understanding of this field and gain insights into the business problems
they are likely to encounter in the future.
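
To make three of these issues concrete, here is a minimal pandas sketch. It is not taken from the talk: the readings are made up, the variable names are hypothetical, and the cut-off of 5 median absolute deviations is an arbitrary assumption. It aligns irregular timestamps to an hourly grid, masks an obvious spike before imputing so that the anomaly does not leak into the filled values, and then fills the short gap by time-based interpolation.

```python
import pandas as pd

# Hypothetical energy-demand readings: irregular timestamps, a gap
# between 03:02 and 06:00, and one spike (all values are invented).
raw = pd.Series(
    [100.0, 102.0, 101.0, 500.0, 103.0, 104.0],
    index=pd.to_datetime([
        "2023-01-01 00:05", "2023-01-01 01:10", "2023-01-01 02:00",
        "2023-01-01 03:02", "2023-01-01 06:00", "2023-01-01 07:01",
    ]),
)

# Irregular data: align readings to a regular hourly grid.
# Hours without any reading become NaN.
hourly = raw.resample("60min").mean()

# Outliers: flag points far from the median, measured in median
# absolute deviations (MAD). The factor 5 is an assumed threshold.
median = hourly.median()
mad = (hourly - median).abs().median()
is_outlier = (hourly - median).abs() > 5 * mad

# Mask the outlier BEFORE imputing, so it is not used as an anchor
# for the interpolated values.
cleaned = hourly.mask(is_outlier)

# Missing values: time-weighted linear interpolation, limited to
# at most 3 consecutive missing points.
filled = cleaned.interpolate(method="time", limit=3)

print(filled)
```

Masking before interpolating is the key ordering choice here: if the spike were left in place, the values filled into the following gap would be pulled towards it.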

Data scientists and analysts who work with time series data and know at least the
basics of the pandas and scikit-learn Python libraries, as well as what a time series forecasting
problem entails, will benefit the most from this talk. However, less technical
specialists (management, product owners, etc.) can still gain valuable domain knowledge in
this field.

Expected audience expertise: Domain: Novice
Expected audience expertise: Python: Novice
Abstract as a tweet:

Handling time series data is an important but not an easy task. After this talk you will be able to identify, understand, and resolve time series issues such as divergence, delayed data, time series imputation, and the impact of outliers.

Vadim Nelidov

Vadim Nelidov is a Lead Data Science consultant at Xebia Data with diverse experience in the data domain across a variety of industries, from the energy sector and banking to skincare and agriculture. Throughout his years in the data world, Vadim has been combining advanced data science with business insights to make data work with an impact. He aspires to see far beyond what is on the surface and get to the essence of the problems, discovering robust and scalable long-term solutions rather than temporary fixes.

Vadim is passionate about sharing his knowledge and insights, believing that data literacy should not be the privilege of a few. His goal is to help make this a reality. Making the intricacies of data science intelligible and uncovering the regularities hiding in the data is a major source of inspiration for Vadim. With this goal in mind, he combines his years of experience in consulting with his background in statistics, research, and teaching to make this knowledge accessible to businesses and individuals in need.
