Direct Forecasting
When you want to predict multiple future values in a time series, the direct strategy offers a clear and structured approach. With this method, you train a separate supervised model for each forecast step you care about. For example, if you want to predict the next two time steps, you would build one model to forecast the value at time t+1 and a completely separate model to predict the value at time t+2. Each model is trained independently using the available lagged features, so the t+1 model learns to predict one step ahead, while the t+2 model learns to predict two steps ahead, directly from the current and past data.
```python
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Generate a simple time series (a random walk offset by 10)
np.random.seed(42)
n = 100
data = pd.DataFrame({
    "y": np.cumsum(np.random.randn(n)) + 10
})

# Create lagged features
data["lag1"] = data["y"].shift(1)
data["lag2"] = data["y"].shift(2)

# Create targets for t+1 and t+2 forecasts
data["target_t1"] = data["y"].shift(-1)
data["target_t2"] = data["y"].shift(-2)

# Drop rows with NaN values due to shifting
train = data.dropna()

# Features for both models: the current value y(t) plus its two lags.
# Without "y", the models would ignore the most recent observation.
features = ["y", "lag1", "lag2"]

# Model for 1-step ahead (t+1)
X_t1 = train[features]
y_t1 = train["target_t1"]
model_t1 = RandomForestRegressor(random_state=0)
model_t1.fit(X_t1, y_t1)

# Model for 2-step ahead (t+2)
X_t2 = train[features]
y_t2 = train["target_t2"]
model_t2 = RandomForestRegressor(random_state=0)
model_t2.fit(X_t2, y_t2)

# Example: predict the next values from the last available data point.
# The last row has NaN targets but complete features, so it can be used here.
last_row = data.iloc[[-1]][features]
pred_t1 = model_t1.predict(last_row)
pred_t2 = model_t2.predict(last_row)

print(f"1-step ahead prediction: {pred_t1[0]:.2f}")
print(f"2-step ahead prediction: {pred_t2[0]:.2f}")
```
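The same pattern extends to any number of horizons by looping over the steps and keeping one fitted model per horizon. The following is a minimal sketch under assumptions that are not part of the lesson: the helper names `make_lagged_frame` and `fit_direct_models` are hypothetical, `LinearRegression` is swapped in for the random forest to keep the example fast, and the series is a synthetic random walk.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def make_lagged_frame(y, n_lags):
    """Hypothetical helper: columns y (current value) and lag1..lag{n_lags}."""
    frame = pd.DataFrame({"y": y})
    for k in range(1, n_lags + 1):
        frame[f"lag{k}"] = frame["y"].shift(k)
    return frame


def fit_direct_models(y, horizons, n_lags=2):
    """Hypothetical helper: fit one independent model per forecast horizon."""
    frame = make_lagged_frame(y, n_lags)
    features = list(frame.columns)
    models = {}
    for h in horizons:
        target = frame["y"].shift(-h)  # value h steps ahead of each row
        # Keep only rows where every feature and the target are present
        valid = frame.notna().all(axis=1) & target.notna()
        model = LinearRegression()
        model.fit(frame.loc[valid, features], target[valid])
        models[h] = model
    return models, frame[features]


# Synthetic random walk (assumed data, for illustration only)
rng = np.random.default_rng(42)
series = pd.Series(np.cumsum(rng.standard_normal(200)) + 10)

horizons = [1, 2, 3]
models, X = fit_direct_models(series, horizons)

# Every horizon is forecast from the same most recent feature row
latest = X.iloc[[-1]]
forecasts = {h: float(models[h].predict(latest)[0]) for h in horizons}
for h, value in forecasts.items():
    print(f"{h}-step ahead prediction: {value:.2f}")
```

Each horizon drops a different number of trailing rows (the last `h` rows have no target), which is why the valid-row mask is computed inside the loop rather than once up front.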
The direct and recursive strategies differ in how predictions are generated and in how errors propagate. The direct approach, as shown above, trains a distinct model for each forecast horizon, so the prediction at t+2 does not depend on the t+1 prediction. This gives you flexibility: each model can be tailored to its own horizon, for example with different features or hyperparameters. The cost is added complexity, because you must train and maintain one model for every forecasted step.

In contrast, the recursive strategy uses a single model to predict the next step, then feeds that prediction back in as an input to forecast further into the future. This is simpler to implement, but errors can accumulate, because a mistake made at an early step becomes part of the input for every later step. The direct strategy avoids this particular form of error propagation, but it may require more data and computational resources, since the number of models grows with the number of horizons.
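For contrast, the recursive strategy can be sketched as a single one-step model whose predictions are fed back in as inputs. This is an illustrative sketch, not the lesson's implementation: the function name `recursive_forecast`, the choice of `LinearRegression`, and the two-value feature set are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic random walk (assumed data, for illustration only)
rng = np.random.default_rng(0)
y = pd.Series(np.cumsum(rng.standard_normal(200)) + 10)

# One-step training set: predict y(t+1) from y(t) and y(t-1)
frame = pd.DataFrame(
    {"y": y, "lag1": y.shift(1), "target": y.shift(-1)}
).dropna()
features = ["y", "lag1"]
one_step = LinearRegression().fit(frame[features], frame["target"])


def recursive_forecast(model, history, steps):
    """Hypothetical helper: roll a one-step model forward `steps` times."""
    window = list(history[-2:])  # [y(t-1), y(t)]
    preds = []
    for _ in range(steps):
        row = pd.DataFrame({"y": [window[-1]], "lag1": [window[-2]]})
        next_value = float(model.predict(row)[0])
        preds.append(next_value)
        # The prediction becomes an input for the next step, so any
        # error in it is carried into all later forecasts.
        window.append(next_value)
    return preds


preds = recursive_forecast(one_step, y.to_numpy(), steps=3)
print([round(p, 2) for p in preds])
```

Only the first forecast is computed purely from observed values. From the second step onward, at least one input is itself a prediction, which is the dependency the direct strategy removes.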
1. What is a key advantage of the direct approach for multi-step forecasting?
2. What is a potential downside of training separate models for each horizon?