Deep Learning for Time Series Cookbook

By : Cerqueira, Luís Roque
Overview of this book

Most organizations, in fields such as finance and beyond, run processes with a time-dependent structure. By leveraging time series analysis and forecasting, these organizations can make informed decisions and optimize their performance. Accurate forecasts help reduce uncertainty and enable better planning of operations. Unlike traditional approaches to forecasting, deep learning can process large amounts of data and help derive complex patterns. Despite its increasing relevance, getting the most out of deep learning requires significant technical expertise. This book guides you through applying deep learning to time series data with the help of easy-to-follow code recipes. You’ll cover time series problems such as forecasting, anomaly detection, and classification. The book also shows you how to solve these problems using different deep neural network architectures, including convolutional neural networks (CNNs) and transformers. As you progress, you’ll use PyTorch, a popular deep learning framework based on Python, to build production-ready prediction solutions. By the end of this book, you’ll have learned how to solve different time series tasks with deep learning using the PyTorch ecosystem.

Dealing with missing values

In this recipe, we’ll cover how to impute missing values in a time series. We’ll discuss different imputation methods and the factors to consider when choosing one. We’ll show an example of how to solve this problem using pandas.

Getting ready

Missing values are an issue that plagues all kinds of data, including time series. Observations are often unavailable for various reasons, such as sensor failure or annotation errors. In such cases, data imputation can be used to overcome this problem. Data imputation works by assigning a value based on some rule, such as the mean or some predefined value.
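
As a minimal sketch of the rule-based idea described above, the following snippet counts the missing values in a small, made-up series and fills them with a predefined constant. The series `toy` and the fill value `0.0` are illustrative assumptions, not part of the recipe's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with two missing observations
toy = pd.Series(
    [20.5, np.nan, 21.0, np.nan, 19.5],
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
)

# Count how many observations are missing
n_missing = int(toy.isna().sum())

# Impute with a predefined value (0.0 is an arbitrary choice for illustration)
imp_const = toy.fillna(0.0)

print(f"Missing observations: {n_missing}")
print(imp_const)
```

A constant is rarely a good choice for a time series, but it makes the mechanics of imputation easy to see before moving on to the methods below.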

How to do it…

We start by simulating missing data. The following code removes 60% of observations from a sample of two years of the solar radiation time series:

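import numpy as np
# Take the first two years of the daily series (series_daily comes from an earlier recipe)
sample_with_nan = series_daily.head(365 * 2).copy()
# Number of observations to remove (60% of the sample)
size_na = int(0.6 * len(sample_with_nan))
# Draw random positions without replacement
idx = np.random.choice(a=range(len(sample_with_nan)),
                       size=size_na,
                       replace=False)
# Assign by position so this works whatever the index type is
sample_with_nan.iloc[idx] = np.nan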

We use the np.random.choice() function from numpy to draw a random sample of positions in the time series. The observations at these positions are then set to a missing value (np.nan).
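
If you do not have the solar radiation data at hand, the same simulation can be tried on a synthetic series. The sketch below is self-contained; `toy_series` is a hypothetical stand-in for `series_daily`, and the seed, mean, and spread are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for series_daily: two years of synthetic daily values
rng = np.random.default_rng(seed=42)
index = pd.date_range("2020-01-01", periods=365 * 2, freq="D")
toy_series = pd.Series(
    rng.normal(loc=200.0, scale=30.0, size=len(index)), index=index
)

sample = toy_series.copy()
size_na = int(0.6 * len(sample))

# Draw positions without replacement so each one is removed at most once
idx = rng.choice(len(sample), size=size_na, replace=False)
sample.iloc[idx] = np.nan

# Share of observations that are now missing
missing_share = sample.isna().mean()
print(f"Missing share: {missing_share:.2%}")
```

Sampling without replacement matters here: with replacement, some positions would be drawn more than once and the share of missing values would fall below the intended 60%.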

In datasets without temporal order, it is common to impute missing values using central statistics such as the mean or median. This can be done as follows:

average_value = sample_with_nan.mean()
imp_mean = sample_with_nan.fillna(average_value)
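
The median can be used in the same way as the mean. The following sketch, on a small made-up series with one extreme value, illustrates how the two statistics can give quite different imputed values; the data is purely illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical series with two gaps and one extreme observation
toy = pd.Series([1.0, np.nan, 3.0, np.nan, 100.0])

# Both statistics ignore missing values by default
imp_mean = toy.fillna(toy.mean())
imp_median = toy.fillna(toy.median())

print(pd.DataFrame({"mean": imp_mean, "median": imp_median}))
```

Here the mean is pulled toward the extreme value, while the median stays at the middle observation, so the median may be the safer choice when the data contains outliers.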

Time series imputation must take into account the temporal nature of observations. This means that the assigned value should follow the dynamics of the series. A more common approach in time series is to impute missing data with the last known observation. This approach is implemented in the ffill() method:

imp_ffill = sample_with_nan.ffill()

Another, less common, approach that uses the order of observations is bfill():

imp_bfill = sample_with_nan.bfill()

The bfill() method imputes missing data with the next available observation in the dataset.
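
To make the difference between the two methods concrete, here is a small sketch on a made-up series with gaps at the start, in the middle, and at the end. The values and dates are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading gap, an inner gap, and a trailing gap
toy = pd.Series(
    [np.nan, 1.0, np.nan, np.nan, 4.0, np.nan],
    index=pd.date_range("2021-01-01", periods=6, freq="D"),
)

# Forward fill: propagate the last known observation
imp_ffill = toy.ffill()

# Backward fill: propagate the next available observation
imp_bfill = toy.bfill()

print(pd.DataFrame({"original": toy, "ffill": imp_ffill, "bfill": imp_bfill}))
```

Note the edge cases: ffill() cannot fill a gap at the very start of the series because there is no earlier observation, and bfill() cannot fill a gap at the very end for the mirror-image reason.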

How it works…

The following figure shows the reconstructed time series after imputation with each method:

Figure 1.2: Imputing missing data with different strategies


The mean imputation approach misses the time series dynamics, while both ffill and bfill lead to a reconstructed time series with dynamics similar to those of the original. Usually, ffill is preferable because it does not break the temporal order of observations. In contrast, bfill uses future information to alter (impute) the past.
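
One way to put a number on this comparison is to measure the reconstruction error at the positions that were removed. The sketch below does this on a synthetic series with a yearly cycle; the series, the seed, and the choice of mean absolute error are assumptions for illustration, not the recipe's own evaluation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
index = pd.date_range("2020-01-01", periods=365 * 2, freq="D")

# Hypothetical smooth series with a yearly cycle
original = pd.Series(
    10 + 5 * np.sin(2 * np.pi * np.arange(len(index)) / 365), index=index
)

# Remove 60% of the observations at random positions
with_nan = original.copy()
idx = rng.choice(len(with_nan), size=int(0.6 * len(with_nan)), replace=False)
with_nan.iloc[idx] = np.nan

candidates = {
    "mean": with_nan.fillna(with_nan.mean()),
    "ffill": with_nan.ffill(),
    "bfill": with_nan.bfill(),
}

# Mean absolute error, computed only where values were removed.
# Gaps that a method could not fill remain NaN and are skipped by .mean().
masked = with_nan.isna()
mae = {
    name: (imputed[masked] - original[masked]).abs().mean()
    for name, imputed in candidates.items()
}

for name, value in mae.items():
    print(f"{name}: {value:.3f}")
```

On a smooth series like this one, you would expect ffill and bfill to track the original much more closely than the mean. Keep in mind that a low error for bfill does not remove its main drawback in a forecasting setting, which is the use of future information.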

There’s more…

The imputation process can also be carried out using some conditions, such as limiting the number of imputed observations. You can learn more about this in the documentation pages of these functions, for example, https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.ffill.html.
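
As a brief sketch of such a condition, the `limit` parameter of ffill() caps how many consecutive missing values are filled. The series below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap of three consecutive missing values
toy = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# Fill at most one consecutive missing value after each known observation
imp_limited = toy.ffill(limit=1)

# Without a limit, the whole gap is filled
imp_unlimited = toy.ffill()

print(pd.DataFrame({"limit=1": imp_limited, "no limit": imp_unlimited}))
```

Limiting the fill can be useful when long gaps should stay visible as missing rather than being covered by a value that may be badly out of date.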
