Learn Direct Forecasting | Multi-Step Forecasting Strategies
Machine Learning for Time Series Forecasting

Direct Forecasting

When you want to predict several future values of a time series, the direct strategy trains a separate supervised model for each forecast step you care about. For example, to predict the next two time steps, you build one model that forecasts the value at time t+1 and a completely separate model that forecasts the value at time t+2. Both models see the same inputs (the current value and its lags) but are fitted to different targets, so the t+1 model learns to predict one step ahead and the t+2 model learns to predict two steps ahead, each directly from current and past data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Generate a simple time series (a random walk offset by 10)
np.random.seed(42)
n = 100
data = pd.DataFrame({"y": np.cumsum(np.random.randn(n)) + 10})

# Create lagged features
data["lag1"] = data["y"].shift(1)
data["lag2"] = data["y"].shift(2)

# Create targets for t+1 and t+2 forecasts
data["target_t1"] = data["y"].shift(-1)
data["target_t2"] = data["y"].shift(-2)

# Drop rows with NaN values due to shifting
train = data.dropna()

# Features for both models: the current value plus two lags.
# Without "y", the models would ignore the most recent observation.
features = ["y", "lag1", "lag2"]

# Model for 1-step ahead (t+1)
X_t1 = train[features]
y_t1 = train["target_t1"]
model_t1 = RandomForestRegressor(random_state=0)
model_t1.fit(X_t1, y_t1)

# Model for 2-step ahead (t+2)
X_t2 = train[features]
y_t2 = train["target_t2"]
model_t2 = RandomForestRegressor(random_state=0)
model_t2.fit(X_t2, y_t2)

# Predict from the last row, where the features are known
# but the targets are not yet observed
last_row = data.iloc[[-1]][features]
pred_t1 = model_t1.predict(last_row)
pred_t2 = model_t2.predict(last_row)

print(f"1-step ahead prediction: {pred_t1[0]:.2f}")
print(f"2-step ahead prediction: {pred_t2[0]:.2f}")
```
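The two-model example above extends naturally to any set of horizons by fitting the models in a loop. The following is a minimal sketch under stated assumptions, not a definitive implementation: the helper names `make_lag_frame`, `fit_direct_models`, and `forecast_direct` are hypothetical, and the choices of three lags, three horizons, and 50 trees are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def make_lag_frame(y: pd.Series, n_lags: int) -> pd.DataFrame:
    """Row t holds y[t], y[t-1], ..., y[t-n_lags+1] (lag0 is the current value)."""
    return pd.DataFrame({f"lag{k}": y.shift(k) for k in range(n_lags)})


def fit_direct_models(y: pd.Series, n_lags: int, horizons: list) -> dict:
    """Fit one independent model per horizon h, each targeting y[t+h]."""
    X = make_lag_frame(y, n_lags)
    models = {}
    for h in horizons:
        target = y.shift(-h)
        # Keep only rows where every feature and the target are observed
        mask = X.notna().all(axis=1) & target.notna()
        model = RandomForestRegressor(n_estimators=50, random_state=0)
        model.fit(X[mask], target[mask])
        models[h] = model
    return models


def forecast_direct(models: dict, y: pd.Series, n_lags: int) -> dict:
    """Predict every horizon from the most recent observed window."""
    X_last = make_lag_frame(y, n_lags).iloc[[-1]]
    return {h: float(m.predict(X_last)[0]) for h, m in models.items()}


# Same kind of synthetic random walk as in the example above
np.random.seed(42)
y = pd.Series(np.cumsum(np.random.randn(100)) + 10)

models = fit_direct_models(y, n_lags=3, horizons=[1, 2, 3])
forecasts = forecast_direct(models, y, n_lags=3)
for h, value in forecasts.items():
    print(f"{h}-step ahead prediction: {value:.2f}")
```

Because each horizon has its own target column, longer horizons lose a few more rows at the end of the series, which is why the row mask is computed separately inside the loop.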

The direct and recursive strategies differ in how predictions are generated and how errors propagate. The direct approach, as shown above, trains a distinct model for each forecast horizon, so the prediction at t+2 does not depend on the t+1 prediction. This gives you more flexibility, because you can tailor each model to its specific horizon. It also adds complexity, since you need to train and manage multiple models, one for each forecast step. In contrast, the recursive strategy uses a single model to predict the next step, then feeds that prediction back in as an input to forecast further into the future. This is simpler to implement, but errors can accumulate, since mistakes made in early steps are carried forward into later predictions. The direct strategy avoids this particular form of error propagation, but it may require more data and computational resources because of the larger number of models.
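For contrast, here is a minimal sketch of the recursive strategy described above: a single one-step model whose own prediction is fed back in as the newest input. The helper `forecast_recursive`, the two-value input window, and the 50-tree forest are illustrative assumptions rather than part of the original lesson.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

np.random.seed(42)
y = pd.Series(np.cumsum(np.random.randn(100)) + 10)

# One-step training set: predict y[t+1] from y[t] (lag0) and y[t-1] (lag1)
frame = pd.DataFrame(
    {"lag0": y, "lag1": y.shift(1), "target": y.shift(-1)}
).dropna()

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(frame[["lag0", "lag1"]], frame["target"])


def forecast_recursive(model, history, steps):
    """Roll a one-step model forward, reusing its predictions as inputs."""
    window = list(history[-2:])  # [y[t-1], y[t]]
    preds = []
    for _ in range(steps):
        x = pd.DataFrame({"lag0": [window[-1]], "lag1": [window[-2]]})
        next_value = float(model.predict(x)[0])
        preds.append(next_value)
        # The prediction becomes the newest "observation" for the next step,
        # so any error in it is carried into every later forecast
        window.append(next_value)
    return preds


recursive_preds = forecast_recursive(model, y.to_numpy(), steps=2)
for step, value in enumerate(recursive_preds, start=1):
    print(f"{step}-step ahead (recursive): {value:.2f}")
```

Note that the second forecast here is computed from the first forecast rather than from an observed value, which is exactly the dependency the direct strategy removes.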

1. What is a key advantage of the direct approach for multi-step forecasting?

2. What is a potential downside of training separate models for each horizon?


Section 3. Chapter 2
