Learn Feature Extraction for Time Series | Foundations of ML-Based Time Series Forecasting
Machine Learning for Time Series Forecasting

Feature Extraction for Time Series

Understanding how to transform raw time series data into meaningful features is essential for building effective machine learning models. Three of the most important types of features you can create for time series forecasting are lagged values, rolling statistics (such as rolling means and standard deviations), and calendar features like day of the week or month. Each of these feature types helps your model capture different patterns and dependencies in the data.

import pandas as pd
import numpy as np

# Create a synthetic daily time series
np.random.seed(42)
dates = pd.date_range("2024-01-01", periods=10, freq="D")
values = np.random.randint(10, 100, size=10)
df = pd.DataFrame({"date": dates, "value": values})
df.set_index("date", inplace=True)

# Add a lagged feature (previous day's value)
df["lag_1"] = df["value"].shift(1)

# Add a rolling mean (window of 3 days, excluding current day)
df["rolling_mean_3"] = df["value"].shift(1).rolling(window=3).mean()

# Add a calendar feature: day of week (Monday=0, Sunday=6)
df["day_of_week"] = df.index.dayofweek

print(df)
Lagged values
  • Use lagged features when your target variable depends on its own recent history;
  • Lagged values are essential for capturing autocorrelation and short-term dependencies;
  • Pitfall: a negative shift (e.g., shift(-1)) pulls future values into the current row, which leaks data and produces unrealistically good forecasts.
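The points above can be sketched in a few lines. This is a minimal illustration, assuming a made-up six-day series; the column name `lead_1_LEAKY` is hypothetical and exists only to show what a negative shift does.

```python
import pandas as pd

# Hypothetical toy series; the values are made up for illustration.
s = pd.Series(
    [10, 12, 15, 11, 18, 20],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
    name="value",
)

features = pd.DataFrame({"value": s})

# Positive shifts look backwards: lag_k holds the value from k days earlier.
for k in (1, 2, 3):
    features[f"lag_{k}"] = s.shift(k)

# A negative shift looks forwards: this column holds tomorrow's value,
# so it must never be used as a model input (kept here only as a warning).
features["lead_1_LEAKY"] = s.shift(-1)

# The first rows have no history yet, so drop rows with missing lags.
train_ready = features.drop(columns="lead_1_LEAKY").dropna()
print(train_ready)
```

Dropping the incomplete rows is one possible choice; depending on the model, filling or keeping the missing values may be more appropriate.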
Rolling means and standard deviations
  • Use rolling statistics to smooth out short-term fluctuations and reveal local trends or volatility;
  • Rolling features are useful when recent averages or variability influence the target;
  • Pitfall: including the current or future value in the rolling window can leak information from the target period.
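As a rough sketch of the pitfall above, the snippet below compares a rolling mean built only from past values with one that includes the current value. The series and the column names (in particular `roll_mean_3_LEAKY`) are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy series; the values are made up for illustration.
s = pd.Series(
    [10, 12, 15, 11, 18, 20],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
    name="value",
)

# Shift first so that each window only contains strictly earlier days.
past = s.shift(1)

features = pd.DataFrame({
    "value": s,
    "roll_mean_3": past.rolling(window=3).mean(),  # mean of the 3 previous days
    "roll_std_3": past.rolling(window=3).std(),    # sample std of the 3 previous days
    # Without the shift, the window ends on the current day and therefore
    # contains the very value the model is supposed to predict.
    "roll_mean_3_LEAKY": s.rolling(window=3).mean(),
})

print(features)
```

On the fourth day the safe feature averages 10, 12 and 15, while the leaky one averages 12, 15 and 11, which already includes that day's own value.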
Calendar features
  • Use calendar features to capture recurring patterns tied to time, such as weekly seasonality or holidays;
  • Day of week or month features help the model recognize cycles related to the calendar;
  • Pitfall: for highly irregular time series or non-calendar-based data, these features may add noise instead of value.
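A small sketch of calendar features derived from a daily `DatetimeIndex` follows. The `holidays` list is a hypothetical placeholder; a real project would supply its own dates for the relevant region.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=10, freq="D")
cal = pd.DataFrame(index=idx)

# Basic calendar attributes read straight from the index.
cal["day_of_week"] = idx.dayofweek                    # Monday=0 ... Sunday=6
cal["month"] = idx.month                              # 1 ... 12
cal["is_weekend"] = (idx.dayofweek >= 5).astype(int)  # Saturday or Sunday

# Hypothetical holiday list, assumed for illustration only.
holidays = [pd.Timestamp("2024-01-01")]
cal["is_holiday"] = idx.isin(holidays).astype(int)

print(cal)
```

Unlike lags and rolling statistics, these columns depend only on the date, so they are known in advance for future rows and carry no leakage risk.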

1. Which feature type helps capture seasonality in daily sales data?

2. What is a potential risk when using rolling statistics as features?

Section 1. Chapter 3