Learn Bias–Variance Trade-Off and Ensembles | Introduction to Ensemble Learning
Ensemble Learning Techniques with Python

Bias–Variance Trade-Off and Ensembles

In machine learning, prediction error is composed of three main components: bias, variance, and irreducible error. Bias measures how far, on average, your model's predictions are from the actual values because of simplifying assumptions. High bias means the model is too simple to capture the underlying patterns in the data, causing underfitting. Variance describes how much your model's predictions would change if you used a different training set. High variance means the model is too sensitive to the training data, which results in overfitting and poor generalization to new data.

Note
Definition: Bias

Bias is an error from erroneous assumptions in the learning algorithm. High bias can cause underfitting.

Note
Definition: Variance

Variance is an error from sensitivity to small fluctuations in the training set. High variance can cause overfitting.
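
To see both failure modes concretely, here is a minimal sketch (not part of the lesson's code; it assumes scikit-learn and NumPy are installed, and the tree depths and dataset are illustrative choices) that fits a very shallow and a very deep decision tree to the same noisy data and compares training and test error:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Noisy sine-wave data, similar in spirit to the example later in this chapter
rng = np.random.RandomState(0)
X = rng.rand(200, 1) * 6 - 3                      # inputs in [-3, 3]
y = np.sin(X).ravel() + rng.normal(0, 0.25, 200)  # true signal plus noise
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A very shallow tree (prone to high bias) vs. a very deep tree (prone to high variance)
for depth in (1, 20):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, tree.predict(X_train))
    test_mse = mean_squared_error(y_test, tree.predict(X_test))
    print(f"max_depth={depth}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")

Typically the depth-1 tree shows similar, high error on both sets (underfitting), while the depth-20 tree drives training error close to zero yet does noticeably worse on the test set (overfitting).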

In short: high bias means the model is too simple and underfits, while high variance means the model tracks the noise in its particular training set and overfits; both inflate the error on new data.

Mathematically, the bias–variance trade-off can be described by the following decomposition for the expected squared error at a new input value $x$:

$$\mathbb{E}\left[(y - \hat{f}(x))^2\right] = \text{Bias}(\hat{f}(x))^2 + \text{Variance}(\hat{f}(x)) + \text{Irreducible Error}$$
  • $\text{Bias}(\hat{f}(x))$: the difference between the average prediction of your model and the actual value you are trying to predict;
  • $\text{Variance}(\hat{f}(x))$: the variability of the model prediction for a given data point $x$ due to different training sets;
  • $\text{Irreducible Error}$: the noise inherent in the data, which cannot be reduced by any model.

The trade-off arises because decreasing bias (by making the model more flexible) usually increases variance, and vice versa. The goal is to find a model with the right balance, minimizing total prediction error. Ensemble methods are widely used to tackle this trade-off and achieve better performance.
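
The decomposition can also be checked empirically. The sketch below is an added illustration, not part of the lesson's code: it repeatedly draws fresh training sets from the same noisy process, refits a single depth-3 tree and a bagging ensemble on each, and estimates bias² and variance of their predictions on a fixed test grid (the helper name bias_variance and the repeat count are arbitrary choices):

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

rng = np.random.RandomState(42)
X_test = np.linspace(-3, 3, 100).reshape(-1, 1)
f_true = np.sin(X_test).ravel()  # the true function is known because we simulate the data

def bias_variance(make_model, n_repeats=100):
    """Estimate bias^2 and variance by refitting the model on many fresh training sets."""
    preds = np.empty((n_repeats, len(X_test)))
    for i in range(n_repeats):
        X = rng.rand(80, 1) * 6 - 3
        y = np.sin(X).ravel() + rng.normal(0, 0.25, X.shape[0])
        preds[i] = make_model().fit(X, y).predict(X_test)
    avg_pred = preds.mean(axis=0)                # average prediction across training sets
    bias_sq = np.mean((avg_pred - f_true) ** 2)  # squared bias, averaged over test points
    variance = np.mean(preds.var(axis=0))        # prediction variance, averaged over test points
    return bias_sq, variance

tree_bias, tree_var = bias_variance(lambda: DecisionTreeRegressor(max_depth=3))
bag_bias, bag_var = bias_variance(lambda: BaggingRegressor(
    estimator=DecisionTreeRegressor(max_depth=3), n_estimators=30))

print(f"Single tree: bias^2 = {tree_bias:.4f}, variance = {tree_var:.4f}")
print(f"Bagging:     bias^2 = {bag_bias:.4f}, variance = {bag_var:.4f}")

On a setup like this the two bias² estimates typically come out very close, while the bagging ensemble's variance is noticeably smaller, which is the same effect the plotting example below shows visually.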

import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

# Create a noisy sine wave dataset
np.random.seed(42)
X = np.sort(np.random.rand(80, 1) * 6 - 3, axis=0)  # X in [-3, 3]
y = np.sin(X).ravel() + np.random.normal(0, 0.25, X.shape[0])

# Fit a single decision tree
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

# Fit a bagging ensemble of decision trees
bagging = BaggingRegressor(
    estimator=DecisionTreeRegressor(max_depth=3),
    n_estimators=30,
    random_state=0
)
bagging.fit(X, y)

# Generate test data for predictions
X_test = np.linspace(-3, 3, 500).reshape(-1, 1)
y_true = np.sin(X_test).ravel()
y_tree = tree.predict(X_test)
y_bagging = bagging.predict(X_test)

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(X_test, y_true, label="True Function (sin)", color="green", linewidth=2)
plt.scatter(X, y, label="Training Data", color="gray", alpha=0.5)
plt.plot(X_test, y_tree, label="Single Decision Tree", color="red", linestyle="--")
plt.plot(X_test, y_bagging, label="Bagging Ensemble", color="blue")
plt.title("Variance Reduction with Bagging Ensemble")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()

Which statements about bias and variance in machine learning are correct?

Select the correct answer

