Learn What Makes Time Series Forecasting Unique | Foundations of ML-Based Time Series Forecasting
Machine Learning for Time Series Forecasting

What Makes Time Series Forecasting Unique

Time series forecasting differs from standard regression or classification tasks in machine learning in both the structure of its data and its goal. In typical supervised learning, you are given a dataset of samples that are assumed to be independent, each with features and a corresponding label. The order of the data points does not matter, and shuffling the dataset is often recommended so that training and evaluation are not biased by the order in which the samples were collected.

In time series forecasting, however, the data is inherently ordered in time. Observations are not independent: each one is usually correlated with the observations that came before it, a property known as autocorrelation. Your goal is to predict future values from past data, so the temporal order is essential. The target variable is often a future value of the same series, not a separate label.
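One common way to express "the target is a future value of the same series" is to build a table of lagged values. The sketch below is illustrative only: the helper name `make_supervised`, the `lag_k` column names, and the toy input are assumptions, not part of the lesson.

```python
import pandas as pd


def make_supervised(values, n_lags=3):
    """Turn a 1-D series into lag features plus a one-step-ahead target.

    Hypothetical helper for illustration; names are not from the lesson.
    """
    s = pd.Series(values, name="target")
    # Column lag_k holds the value observed k steps before the target.
    lags = {f"lag_{k}": s.shift(k) for k in range(1, n_lags + 1)}
    frame = pd.DataFrame(lags)
    frame["target"] = s
    # The first n_lags rows have no complete history, so drop them.
    return frame.dropna().reset_index(drop=True)


demo = make_supervised([10, 11, 12, 13, 14, 15], n_lags=2)
print(demo)
```

Each row pairs the recent past (the lag columns) with the value that followed it, which is the form most regression models expect. Rows stay in time order; the features and the label simply come from the same series at different offsets.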

This temporal dependency means that the standard approach of randomly splitting or shuffling data for training and testing can break the very patterns you want your model to learn. Understanding these differences is crucial for building effective machine learning models for forecasting.
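The difference between a chronological split and a shuffled split can be seen using nothing but time indices. This is a minimal sketch; the helper `chronological_split`, the 80/20 ratio, and the random seed are assumptions chosen for illustration.

```python
import numpy as np


def chronological_split(values, test_fraction=0.2):
    """Hold out the most recent observations as the test set (hypothetical helper)."""
    values = np.asarray(values)
    n_test = int(round(len(values) * test_fraction))
    split_at = len(values) - n_test
    return values[:split_at], values[split_at:]


time_index = np.arange(100)

# Chronological split: every training point precedes every test point.
train_idx, test_idx = chronological_split(time_index, test_fraction=0.2)
print("chronological, train ends before test starts:",
      train_idx.max() < test_idx.min())

# Shuffled split: training points end up on both sides of test points.
rng = np.random.default_rng(0)
shuffled = rng.permutation(time_index)
rand_train, rand_test = shuffled[:80], shuffled[80:]
n_future = int((rand_train > rand_test.min()).sum())
print("shuffled, training points later than the earliest test point:", n_future)
```

With the shuffled split, the model is trained on observations that come after some of the points it is later tested on, so the evaluation no longer resembles forecasting an unseen future.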

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Generate a synthetic time series with autocorrelation
np.random.seed(42)
n_points = 100
time = np.arange(n_points)
series = np.zeros(n_points)
for t in range(1, n_points):
    series[t] = 0.8 * series[t-1] + np.random.normal(scale=0.5)

df = pd.DataFrame({'time': time, 'value': series})

plt.figure(figsize=(10, 4))
plt.plot(df['time'], df['value'], marker='o')
plt.title('Synthetic Time Series with Temporal Dependency')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()
```
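The loop in this example is an instance of a first-order autoregressive process, usually written AR(1). The notation below is a standard way to write it and is not taken from the lesson; the symbols \(\phi\), \(\varepsilon_t\), and \(\rho(k)\) are introduced here for illustration.

```latex
x_t = \phi\, x_{t-1} + \varepsilon_t, \qquad \phi = 0.8, \qquad \varepsilon_t \sim \mathcal{N}(0,\ 0.5^2)

\rho(k) = \operatorname{Corr}(x_t,\, x_{t-k}) = \phi^{k} = 0.8^{k}
```

The second line holds for a stationary AR(1) process with \(|\phi| < 1\). Because the simulated series starts at exactly zero, it applies only approximately, once the first few steps have passed. It still gives a useful expectation: values one step apart should correlate at roughly 0.8, two steps apart at roughly 0.64, and the dependence fades geometrically with the lag.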
Definition

Autocorrelation is the correlation between a time series and a lagged copy of itself; it measures how strongly current values relate to past values. Observations in a time series are often not independent: the value at one time point can be highly correlated with the values before it. This is why shuffling is a problem for time series forecasting. It destroys the temporal structure and, with it, the dependencies your model needs to learn.
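The effect of shuffling on autocorrelation can be checked numerically. The sketch below regenerates a series with the same recurrence as the lesson's example, under a few assumptions made for illustration: it uses NumPy's `default_rng` generator rather than `np.random.seed`, and a longer series (2,000 points) so the estimates are less noisy.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_points = 2000  # longer than the lesson's 100 points, for a steadier estimate

# Same recurrence as the lesson: 0.8 * previous value + Gaussian noise.
series = np.zeros(n_points)
for t in range(1, n_points):
    series[t] = 0.8 * series[t - 1] + rng.normal(scale=0.5)

ordered = pd.Series(series)
shuffled = pd.Series(rng.permutation(series))

# Series.autocorr(lag=1) is the Pearson correlation between the series
# and itself shifted by one step.
acf_ordered = ordered.autocorr(lag=1)
acf_shuffled = shuffled.autocorr(lag=1)

print(f"lag-1 autocorrelation, original order: {acf_ordered:.3f}")
print(f"lag-1 autocorrelation, shuffled:       {acf_shuffled:.3f}")
```

The original ordering should give a lag-1 autocorrelation close to the coefficient 0.8, while the shuffled copy should come out near zero. The values are identical in both cases; only their order differs, and the order is where the predictive signal lives.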

1. Why can't you randomly shuffle time series data when preparing it for machine learning forecasting tasks?

2. Which property distinguishes time series forecasting from standard regression?
