Learn Tree-Based Models for Forecasting | Classical ML Models for Time Series
Machine Learning for Time Series Forecasting

Tree-Based Models for Forecasting

Tree-based models, such as decision trees and random forests, have become popular tools for time series forecasting, especially when you represent your time series data as tabular features. Unlike traditional statistical models that require strong assumptions about data distribution or stationarity, tree-based models can flexibly capture nonlinear relationships and interactions among lagged and engineered features. This makes them especially suitable for time series problems where you have already constructed features like previous time steps (lags), rolling means, or calendar variables. Because trees split on thresholds rather than fitting coefficients, they are relatively robust to outliers in the input features and need no feature scaling. They can work with both numerical and categorical variables (in scikit-learn, categorical variables must be encoded as numbers first), which further enhances their applicability to real-world forecasting tasks.
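To make the idea of "tabular features" concrete, here is a minimal sketch of turning a univariate series into a feature table with lags, a rolling mean, and a calendar variable. The helper name `make_features`, the window length, and the synthetic weekly series are assumptions chosen for illustration, not part of the lesson's dataset.

```python
import numpy as np
import pandas as pd


def make_features(series: pd.Series, lags=(1, 2, 3), window=4) -> pd.DataFrame:
    """Build a tabular feature frame from a univariate, datetime-indexed series."""
    df = pd.DataFrame({"value": series})
    # Lag features: the value observed k steps earlier
    for k in lags:
        df[f"lag{k}"] = series.shift(k)
    # Rolling mean over past values only: shift(1) keeps the current value out
    df[f"roll_mean{window}"] = series.shift(1).rolling(window).mean()
    # Calendar feature taken from the index
    df["month"] = series.index.month
    # Rows without a full history contain NaNs and are dropped
    return df.dropna()


# Synthetic weekly series (illustrative data only)
idx = pd.date_range("2020-01-05", periods=60, freq="W")
rng = np.random.default_rng(0)
series = pd.Series(np.linspace(300, 310, 60) + rng.normal(0, 0.2, 60), index=idx)

features = make_features(series)
print(features.head())
```

Shifting before taking the rolling mean matters: without it, the window would include the value you are trying to predict, which leaks the target into the features.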

    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    # Load CO2 dataset
    data = sm.datasets.co2.load_pandas().data
    data = data.rename(columns={"co2": "value"})

    # Ensure datetime index
    data.index = pd.to_datetime(data.index)

    # 1) Resample weekly (CO2 dataset is irregular weekly)
    data = data.resample("W").mean()

    # 2) Interpolate missing values (important!)
    data["value"] = data["value"].interpolate()

    # 3) Create lag features
    data["lag1"] = data["value"].shift(1)
    data["lag2"] = data["value"].shift(2)
    data["lag3"] = data["value"].shift(3)

    # Remove rows with NaNs introduced by shifting
    data = data.dropna()

    # 4) Train-test split
    train_size = int(len(data) * 0.8)
    X = data[["lag1", "lag2", "lag3"]]
    y = data["value"]
    X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
    y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]

    # Make sure both splits are non-empty
    print("Train size:", X_train.shape, "Test size:", X_test.shape)

    # 5) Fit model
    model = RandomForestRegressor(n_estimators=300, random_state=42)
    model.fit(X_train, y_train)

    # 6) Predict
    predictions = model.predict(X_test)

    # Visualization
    plt.figure(figsize=(14, 6))
    plt.plot(y.index, y.values, label="Actual CO₂", color="black")
    plt.plot(y_test.index, predictions, label="Predicted (RF)", color="orange")
    plt.axvline(y_test.index[0], color="gray", linestyle="--", label="Train/Test Split")
    plt.title("Random Forest Forecasting on Weekly CO₂ Concentrations")
    plt.xlabel("Date")
    plt.ylabel("CO₂ Level (ppm)")
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()
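The listing above stops at prediction and a plot. One hedged way to judge a forecast like this numerically is to compare its error with a naive baseline that simply repeats the previous value. The sketch below does that on a synthetic seasonal series (an assumption made so the example runs without downloading data); the choice of mean absolute error and of the last-value baseline are illustrative, not prescribed by the lesson.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic seasonal series without trend (illustrative data only)
rng = np.random.default_rng(42)
t = np.arange(400)
values = 10 + 2 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 0.3, t.size)

# Same lag-feature construction as in the lesson's listing
data = pd.DataFrame({"value": values})
for k in (1, 2, 3):
    data[f"lag{k}"] = data["value"].shift(k)
data = data.dropna()

# Chronological split: no shuffling, the test set is the most recent 20%
train_size = int(len(data) * 0.8)
X, y = data[["lag1", "lag2", "lag3"]], data["value"]
X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Model error versus the naive "tomorrow equals today" forecast (lag1)
mae_rf = mean_absolute_error(y_test, model.predict(X_test))
mae_naive = mean_absolute_error(y_test, X_test["lag1"])
print(f"RF MAE: {mae_rf:.3f}  naive MAE: {mae_naive:.3f}")
```

If the model cannot beat the naive baseline on held-out data, the lag features are probably not adding much beyond the most recent observation.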

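While tree-based models offer flexibility and strong performance, there are important considerations to keep in mind. Overfitting can occur when individual trees are grown very deep, or when your features are noisy or not sufficiently informative. Highly correlated features, which lagged values usually are, mainly make importance scores harder to interpret, because the importance is shared among them. Random forests help mitigate overfitting by averaging predictions across many trees, which reduces variance compared to a single decision tree; adding more trees does not by itself cause overfitting, it only increases training and prediction cost.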
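The variance-reduction argument can be checked with a small experiment. The sketch below, on synthetic data (an assumption for illustration), fits an unconstrained decision tree and two random forests to the same lag features and records training and test error for each. The `min_samples_leaf=5` setting is just one example of limiting tree depth, not a recommended value.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Noisy seasonal series (illustrative data only)
rng = np.random.default_rng(0)
t = np.arange(500)
series = np.sin(2 * np.pi * t / 50) + rng.normal(0, 0.3, t.size)

# Lag matrix: row i holds lag1, lag2, lag3 for the target series[i + 3]
X = np.column_stack([series[2:-1], series[1:-2], series[:-3]])
y = series[3:]

# Chronological split
split = 400
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "single tree (unconstrained)": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "random forest, min_samples_leaf=5": RandomForestRegressor(
        n_estimators=200, min_samples_leaf=5, random_state=0
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_mae = mean_absolute_error(y_train, model.predict(X_train))
    test_mae = mean_absolute_error(y_test, model.predict(X_test))
    results[name] = (train_mae, test_mae)
    print(f"{name}: train MAE {train_mae:.3f}, test MAE {test_mae:.3f}")
```

An unconstrained tree memorizes the training rows, so its training error is essentially zero while its test error is not. The gap between the two columns is the overfitting the paragraph describes.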

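A key advantage of tree-based models is their ability to provide feature importance scores, helping you understand which lagged or engineered features are most influential for predictions. This enhances interpretability, as you can visualize which factors drive the forecast. However, tree-based models have clear limits. They approximate linear relationships with many step-like splits, which is less efficient than a linear model. They also cannot extrapolate: each prediction is an average of target values seen during training, so a forest trained on the levels of a steadily rising series will flatten out near the highest value it saw. Finally, they do not natively model temporal dependencies as sequence models do; any notion of time order has to be supplied through the features.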
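Both points can be seen in a short experiment. The sketch below, on a synthetic upward-trending series (an assumption for illustration), prints the forest's `feature_importances_` and then compares its largest test prediction with the largest training target. The second model, which predicts the one-step change and adds it back to the last observed value, is a common workaround shown here as an assumption of this sketch rather than something the lesson prescribes; `diff_features` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Linear upward trend plus noise (illustrative data only)
rng = np.random.default_rng(1)
t = np.arange(200)
data = pd.DataFrame({"value": 0.5 * t + rng.normal(0, 1.0, t.size)})
for k in (1, 2, 3):
    data[f"lag{k}"] = data["value"].shift(k)
data = data.dropna()

split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]
lags = ["lag1", "lag2", "lag3"]

# Model 1: predict the level directly from lagged levels
level_model = RandomForestRegressor(n_estimators=200, random_state=0)
level_model.fit(train[lags], train["value"])
level_pred = level_model.predict(test[lags])

# Impurity-based importances, one per feature, summing to 1
importances = pd.Series(level_model.feature_importances_, index=lags)
print(importances.sort_values(ascending=False))

# Predictions are averages of training targets, so they cannot exceed the training maximum
print("max training target:", train["value"].max())
print("max level prediction:", level_pred.max())


def diff_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: past one-step changes instead of past levels."""
    return pd.DataFrame({
        "d1": frame["lag1"] - frame["lag2"],
        "d2": frame["lag2"] - frame["lag3"],
    })


# Model 2 (workaround, an assumption of this sketch): predict the change, then add lag1 back
diff_model = RandomForestRegressor(n_estimators=200, random_state=0)
diff_model.fit(diff_features(train), train["value"] - train["lag1"])
diff_pred = test["lag1"] + diff_model.predict(diff_features(test))

mae_level = mean_absolute_error(test["value"], level_pred)
mae_diff = mean_absolute_error(test["value"], diff_pred)
print(f"MAE on levels: {mae_level:.2f}  MAE on differences: {mae_diff:.2f}")
```

Because the changes stay in roughly the same range throughout the series, the second model never has to predict outside what it saw in training, which is why it can follow the trend while the level model cannot.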

1. Why are tree-based models popular for time series forecasting with engineered features?

2. What is a limitation of using decision trees for time series forecasting?


Section 2. Chapter 1
