Machine Learning for Time Series Forecasting

Gradient Boosting for Time Series

Gradient boosting is an ensemble learning technique that builds models sequentially, with each new model correcting the errors of the previous ones. Unlike a single decision tree, which can be prone to overfitting or underfitting, gradient boosting combines many weak learners (typically shallow trees) into a strong predictive model. This approach is especially powerful for time series forecasting, where capturing subtle temporal patterns and non-linear relationships is crucial. Gradient boosting methods, such as scikit-learn's HistGradientBoostingRegressor, are robust to outliers and can handle complex feature interactions, which makes them better suited to forecasting tasks than single trees. The example below builds lagged and rolling-mean features from a synthetic noisy sine wave and evaluates a HistGradientBoostingRegressor with time-ordered cross-validation.

import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# Create a synthetic time series (sinusoidal with noise)
np.random.seed(42)
n = 300
t = np.arange(n)
y = np.sin(0.04 * t) + np.random.normal(scale=0.3, size=n)
df = pd.DataFrame({'y': y})

# Create lagged features and rolling means
df['lag1'] = df['y'].shift(1)
df['lag2'] = df['y'].shift(2)
df['roll3'] = df['y'].rolling(window=3).mean().shift(1)
df['roll7'] = df['y'].rolling(window=7).mean().shift(1)
df = df.dropna()

X = df[['lag1', 'lag2', 'roll3', 'roll7']]
y = df['y']

tscv = TimeSeriesSplit(n_splits=5)
mse_scores = []

# For plotting last fold
last_train_idx, last_test_idx = None, None
last_predictions = None

for train_idx, test_idx in tscv.split(X):
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]

    model = HistGradientBoostingRegressor(max_iter=100)
    model.fit(X_train, y_train)

    preds = model.predict(X_test)
    mse = mean_squared_error(y_test, preds)
    mse_scores.append(mse)

    last_train_idx, last_test_idx = train_idx, test_idx
    last_predictions = preds

print("Mean MSE across folds:", np.mean(mse_scores))
# Visualize
plt.figure(figsize=(14, 6))

# Actual values (full series)
plt.plot(y.index, y.values, label="Actual", color="black")

# Predictions on final fold
plt.plot(y.index[last_test_idx], last_predictions,
         label="Predictions (Last Fold)", color="orange", linewidth=2)

# Train-test split line
plt.axvline(y.index[last_test_idx][0], linestyle="--", color="gray",
            label="Train/Test Split")

plt.title("HistGradientBoosting TS Forecast - Last Fold Visualization")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

When comparing random forests and gradient boosting for time series forecasting, several differences stand out:

  • Random forests average predictions from many independent trees (a minimal sketch of this averaging follows the list);
  • This averaging reduces variance and improves robustness;
  • Some bias may remain if the underlying patterns are complex.
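
As a quick illustration of this averaging, the sketch below fits a random forest on the same lagged features and checks that the ensemble's prediction is simply the mean over its individual trees. It reuses X_train, X_test, and y_train from the last fold of the loop above; the hyperparameters are illustrative, not tuned.

from sklearn.ensemble import RandomForestRegressor
import numpy as np

# Fit a forest of independent trees on the last fold's lagged features
rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)

# Each tree saw a different bootstrap sample; the forest prediction is their mean
per_tree = np.stack([tree.predict(X_test.to_numpy()) for tree in rf.estimators_])
manual_average = per_tree.mean(axis=0)

print(np.allclose(manual_average, rf.predict(X_test)))  # True: the ensemble is an average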

Gradient boosting takes a different approach:

  • Trees are built sequentially, with each tree correcting the errors of the previous one;
  • This process often leads to lower bias and better performance on challenging forecasting tasks;
  • Boosting models can be more sensitive to noise and overfitting if not tuned carefully; a tuning sketch follows below.
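
A minimal sketch of reining in that sensitivity is shown below, again reusing X_train, X_test, y_train, and y_test from the last fold above. The values are illustrative starting points rather than recommendations, and the internal early-stopping validation split is random rather than time-ordered, so the outer TimeSeriesSplit remains the honest evaluation.

from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_squared_error

model = HistGradientBoostingRegressor(
    learning_rate=0.05,      # smaller steps: each tree corrects errors more gently
    max_iter=500,            # allow more trees when the learning rate is small
    max_depth=3,             # keep individual learners shallow and weak
    l2_regularization=1.0,   # penalize extreme leaf values
    early_stopping=True,     # stop adding trees once the validation score stalls
    n_iter_no_change=20,
    random_state=42,
)
model.fit(X_train, y_train)

print("Trees actually built:", model.n_iter_)
print("Last-fold test MSE:", mean_squared_error(y_test, model.predict(X_test)))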

Interpretability also differs:

  • Random forests are typically easier to analyze, as feature importances are more stable and the ensemble is less sensitive to small data changes;
  • Boosting models, while potentially more accurate, may require more careful examination to understand their predictions and feature dependencies; permutation importance, sketched after this list, is one common tool.
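
One way to examine a fitted boosting model is permutation importance: shuffle one feature at a time on held-out data and measure how much the score degrades. The sketch below assumes the boosting model and the last-fold X_test and y_test from the snippets above.

from sklearn.inspection import permutation_importance

# Shuffle each column of the held-out fold and measure the drop in the default R^2 score
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=42)

# A larger mean drop means the model leans more heavily on that lag/rolling feature
for name, imp in sorted(zip(X_test.columns, result.importances_mean), key=lambda p: -p[1]):
    print(f"{name}: {imp:.4f}")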

1. What is a key advantage of gradient boosting over random forests for time series forecasting?

2. Which features are typically most important for boosting models in time series?

