Gradient Boosting for Time Series
Gradient boosting is an ensemble learning technique that builds models sequentially, with each new model correcting the errors of the previous ones. Unlike a single decision tree, which can be prone to overfitting or underfitting, gradient boosting combines many weak learners (typically shallow trees) into a strong predictive model. This approach is especially powerful for time series forecasting, where capturing subtle temporal patterns and non-linear relationships is crucial. Gradient boosting implementations such as scikit-learn's HistGradientBoostingRegressor are largely insensitive to feature scaling and to outliers in the input features, and they can capture complex feature interactions, making them better suited to forecasting tasks than single trees.
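To make the idea of sequentially correcting residuals concrete, here is a minimal hand-rolled sketch (not how scikit-learn implements it internally): each shallow DecisionTreeRegressor is fitted to the residuals left by the ensemble so far, and its scaled prediction is added to the running total. The choice of 50 rounds, depth-2 trees, and a learning rate of 0.1 are illustrative values, not tuned settings.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data: the same kind of noisy sinusoid used in the full example below
rng = np.random.default_rng(0)
t = np.arange(300)
y = np.sin(0.04 * t) + rng.normal(scale=0.3, size=t.size)
X = t.reshape(-1, 1)

# Start from a constant prediction (the mean), then let each shallow tree
# fit the residuals left by the ensemble built so far.
learning_rate = 0.1                       # illustrative value
prediction = np.full_like(y, y.mean())
trees = []

for _ in range(50):
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("Final training MSE:", np.mean((y - prediction) ** 2))

HistGradientBoostingRegressor, used in the full example below, automates this loop and adds histogram binning, regularization, and optional early stopping on top of it.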
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# Create a synthetic time series (sinusoidal with noise)
np.random.seed(42)
n = 300
t = np.arange(n)
y = np.sin(0.04 * t) + np.random.normal(scale=0.3, size=n)
df = pd.DataFrame({'y': y})

# Create lagged features and rolling means
df['lag1'] = df['y'].shift(1)
df['lag2'] = df['y'].shift(2)
df['roll3'] = df['y'].rolling(window=3).mean().shift(1)
df['roll7'] = df['y'].rolling(window=7).mean().shift(1)
df = df.dropna()

X = df[['lag1', 'lag2', 'roll3', 'roll7']]
y = df['y']

tscv = TimeSeriesSplit(n_splits=5)
mse_scores = []

# For plotting last fold
last_train_idx, last_test_idx = None, None
last_predictions = None

for train_idx, test_idx in tscv.split(X):
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]

    model = HistGradientBoostingRegressor(max_iter=100)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)

    mse = mean_squared_error(y_test, preds)
    mse_scores.append(mse)

    last_train_idx, last_test_idx = train_idx, test_idx
    last_predictions = preds

print("Mean MSE across folds:", np.mean(mse_scores))
# Visualize
plt.figure(figsize=(14, 6))

# Actual values (full series)
plt.plot(y.index, y.values, label="Actual", color="black")

# Predictions on final fold
plt.plot(y.index[last_test_idx], last_predictions,
         label="Predictions (Last Fold)", color="orange", linewidth=2)

# Train-test split line
plt.axvline(y.index[last_test_idx][0], linestyle="--", color="gray",
            label="Train/Test Split")

plt.title("HistGradientBoosting TS Forecast - Last Fold Visualization")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
When comparing random forests and gradient boosting for time series forecasting, several differences stand out:
- Random forests average predictions from many independent trees;
- This averaging reduces variance and improves robustness;
- Some bias may remain if the underlying patterns are complex.
Gradient boosting takes a different approach:
- Trees are built sequentially, with each tree correcting the errors of the previous one;
- This process often leads to lower bias and better performance on challenging forecasting tasks;
- Boosting models can be more sensitive to noise and overfitting if not tuned carefully.
Interpretability also differs:
- Random forests are typically easier to analyze, as feature importances are more stable and the ensemble is less sensitive to small data changes;
- Boosting models, while potentially more accurate, may require more careful examination to understand their predictions and feature dependencies.
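To see these trade-offs on the same data, the sketch below reuses the X, y, and tscv objects from the earlier code, fits both a RandomForestRegressor and a HistGradientBoostingRegressor across the same folds, and then inspects the lag and rolling features with permutation_importance (HistGradientBoostingRegressor does not expose feature_importances_, so permutation importance is used for both models). The hyperparameter values shown are illustrative rather than tuned settings.

from sklearn.ensemble import RandomForestRegressor, HistGradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error
import numpy as np

# Assumes X, y, and tscv from the example above are still defined.
models = {
    "RandomForest": RandomForestRegressor(n_estimators=200, random_state=42),
    "HistGradientBoosting": HistGradientBoostingRegressor(
        max_iter=200,
        learning_rate=0.1,    # smaller values usually need more iterations
        max_depth=3,          # shallow trees keep each step a weak learner
        early_stopping=True,  # holds out part of the training data to stop early
        random_state=42,
    ),
}

for name, model in models.items():
    fold_mse = []
    for train_idx, test_idx in tscv.split(X):
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        preds = model.predict(X.iloc[test_idx])
        fold_mse.append(mean_squared_error(y.iloc[test_idx], preds))
    print(f"{name}: mean MSE = {np.mean(fold_mse):.4f}")

    # Permutation importance on the last fold shows which lag/rolling
    # features the fitted model actually relies on.
    result = permutation_importance(
        model, X.iloc[test_idx], y.iloc[test_idx], n_repeats=10, random_state=42
    )
    for col, imp in zip(X.columns, result.importances_mean):
        print(f"  {col}: {imp:.4f}")

On a smooth series like the synthetic one above, the most recent lag and the short rolling mean usually dominate the importances, which connects directly to the second question below.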
1. What is a key advantage of gradient boosting over random forests for time series forecasting?
2. Which features are typically most important for boosting models in time series?