Learn Feature Extraction for Time Series | Foundations of ML-Based Time Series Forecasting

Feature Extraction for Time Series

Understanding how to transform raw time series data into meaningful features is essential for building effective machine learning models. Three of the most important types of features you can create for time series forecasting are lagged values, rolling statistics (such as rolling means and standard deviations), and calendar features like day of the week or month. Each of these feature types helps your model capture different patterns and dependencies in the data.

import pandas as pd
import numpy as np

# Create a synthetic daily time series
np.random.seed(42)
dates = pd.date_range("2024-01-01", periods=10, freq="D")
values = np.random.randint(10, 100, size=10)
df = pd.DataFrame({"date": dates, "value": values})
df.set_index("date", inplace=True)

# Add a lagged feature (previous day's value)
df["lag_1"] = df["value"].shift(1)

# Add a rolling mean (window of 3 days, excluding current day)
df["rolling_mean_3"] = df["value"].shift(1).rolling(window=3).mean()

# Add a calendar feature: day of week (Monday=0, Sunday=6)
df["day_of_week"] = df.index.dayofweek

print(df)
Lagged values
  • Use lagged features when your target variable depends on its own recent history;
  • Lagged values are essential for capturing autocorrelation and short-term dependencies;
  • Pitfall: shifting in the wrong direction (e.g., a negative shift) pulls future values into the current row, which causes data leakage and unrealistically good forecasts.
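The lag guidance above can be sketched as a small helper. This is a minimal illustration, not a definitive implementation: the function name `add_lags`, the column naming scheme, and the toy series are assumptions introduced here. It rejects zero or negative lags so a wrong-direction shift cannot slip in unnoticed.

```python
import pandas as pd


def add_lags(series: pd.Series, lags=(1, 2)) -> pd.DataFrame:
    """Return the series plus one column per positive lag (hypothetical helper)."""
    if any(k <= 0 for k in lags):
        # A zero or negative shift would expose the current or a future value.
        raise ValueError("lags must be positive integers")
    out = series.to_frame(name="value")
    for k in lags:
        out[f"lag_{k}"] = series.shift(k)
    return out


# Toy daily series with values 10..19 (illustrative data only)
s = pd.Series(
    range(10, 20),
    index=pd.date_range("2024-01-01", periods=10, freq="D"),
)

features = add_lags(s, lags=(1, 2))

# The first rows lack full history and contain NaN; they are usually dropped.
train = features.dropna()
print(train.head())
```

Dropping the incomplete rows is one common choice; whether to drop or impute them depends on how much history you can afford to lose.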
Rolling means and standard deviations
  • Use rolling statistics to smooth out short-term fluctuations and reveal local trends or volatility;
  • Rolling features are useful when recent averages or variability influence the target;
  • Pitfall: including the current or future value in the rolling window can leak information from the target period.
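To make the rolling-window pitfall concrete, the sketch below compares a window that includes the current value with one that is shifted by one step first. The toy series and the variable names (`leaky_mean`, `safe_mean`, `safe_std`) are assumptions chosen for illustration.

```python
import pandas as pd

# Toy daily series (illustrative data only)
s = pd.Series(
    [10.0, 20.0, 30.0, 40.0, 50.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
    name="value",
)

# Leaky: the window ending at day t includes the value at day t itself.
leaky_mean = s.rolling(window=3).mean()

# Safe: shift by one step first, so the window at day t covers t-3..t-1.
safe_mean = s.shift(1).rolling(window=3).mean()
safe_std = s.shift(1).rolling(window=3).std()

print(pd.DataFrame({
    "value": s,
    "leaky_mean": leaky_mean,
    "safe_mean": safe_mean,
    "safe_std": safe_std,
}))
```

On the fourth day the leaky mean already averages in that day's value of 40, while the shifted version only sees 10, 20, and 30.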
Calendar features
  • Use calendar features to capture recurring patterns tied to time, such as weekly seasonality or holidays;
  • Day of week or month features help the model recognize cycles related to the calendar;
  • Pitfall: for highly irregular time series or non-calendar-based data, these features may add noise instead of value.
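The calendar features described above could be derived from a `DatetimeIndex` roughly as follows. The column names and the one-entry `holidays` list are hypothetical placeholders; a real project would supply its own holiday calendar.

```python
import pandas as pd

# One week of daily data starting on a Monday (illustrative data only)
idx = pd.date_range("2024-01-01", periods=7, freq="D")
cal = pd.DataFrame({"value": range(7)}, index=idx)

# Day of week (Monday=0, Sunday=6) and month number
cal["day_of_week"] = cal.index.dayofweek
cal["month"] = cal.index.month

# Weekend flag: Saturday (5) or Sunday (6)
cal["is_weekend"] = (cal.index.dayofweek >= 5).astype(int)

# Hypothetical holiday list, assumed here purely for the example
holidays = pd.to_datetime(["2024-01-01"])
cal["is_holiday"] = cal.index.isin(holidays).astype(int)

print(cal)
```

Unlike lags and rolling statistics, these columns are known in advance for any future date, so they carry no leakage risk on their own.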

1. Which feature type helps capture seasonality in daily sales data?

2. What is a potential risk when using rolling statistics as features?

Section 1. Chapter 3
