Calibration Methods in Practice
Model Calibration with Python

Histogram Binning for Calibration

Histogram binning is a non-parametric method for calibrating probabilistic predictions from machine learning models. The idea is to divide the interval of predicted probabilities into discrete bins, then adjust the predicted probabilities within each bin to match the empirical frequency of positive outcomes observed in the training data. This approach is simple and intuitive, making it a popular baseline for calibration tasks.

The basic algorithm for histogram binning involves the following steps:

  1. Sort all predicted probabilities and their corresponding true labels;
  2. Divide the probability range (usually [0, 1]) into a fixed number of bins;
  3. For each bin, compute the average true label (i.e., the fraction of positive examples) for predictions falling into that bin;
  4. Replace each predicted probability with the average true label of its bin.

This process ensures that, within each bin, the recalibrated probabilities reflect the observed empirical outcome frequencies, thus correcting systematic biases in the original predictions.

```python
import numpy as np
import pandas as pd

# Example predicted probabilities and true labels
y_pred = np.array([0.05, 0.12, 0.18, 0.22, 0.29, 0.34,
                   0.44, 0.51, 0.63, 0.72, 0.81, 0.93])
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1])

# Define number of bins
n_bins = 4

# Create bins and assign each prediction to a bin
bins = np.linspace(0.0, 1.0, n_bins + 1)
bin_indices = np.digitize(y_pred, bins, right=True) - 1

# Compute empirical probability for each bin
bin_sums = np.zeros(n_bins)
bin_counts = np.zeros(n_bins)
for idx, label in zip(bin_indices, y_true):
    bin_sums[idx] += label
    bin_counts[idx] += 1

# Avoid division by zero
bin_probs = np.zeros(n_bins)
for i in range(n_bins):
    if bin_counts[i] > 0:
        bin_probs[i] = bin_sums[i] / bin_counts[i]

# Calibrate predictions
y_pred_calibrated = np.array([bin_probs[idx] for idx in bin_indices])

# Show original and calibrated predictions in a DataFrame
df = pd.DataFrame({
    "y_pred": y_pred,
    "y_true": y_true,
    "bin": bin_indices,
    "y_pred_calibrated": y_pred_calibrated
})
print(df)
```
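In practice, the bin frequencies should be learned on a held-out calibration set and then applied to unseen predictions, rather than recalibrating the same data they were estimated from. The script above can be refactored into a fit/apply pair; this is a minimal sketch, and the function names `fit_histogram_binning` and `apply_histogram_binning` are illustrative, not a library API:

```python
import numpy as np

def fit_histogram_binning(y_pred, y_true, n_bins=4):
    """Learn per-bin empirical positive frequencies on a calibration set."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    # clip handles the edge case y_pred == 0.0, which digitize maps to index -1
    idx = np.clip(np.digitize(y_pred, bins, right=True) - 1, 0, n_bins - 1)
    bin_probs = np.zeros(n_bins)
    for i in range(n_bins):
        mask = idx == i
        if mask.any():
            bin_probs[i] = y_true[mask].mean()
    return bins, bin_probs

def apply_histogram_binning(y_pred_new, bins, bin_probs):
    """Map new predicted probabilities to their learned bin frequencies."""
    n_bins = len(bin_probs)
    idx = np.clip(np.digitize(y_pred_new, bins, right=True) - 1, 0, n_bins - 1)
    return bin_probs[idx]

# Fit on the calibration data, then calibrate fresh predictions
y_pred = np.array([0.05, 0.12, 0.18, 0.22, 0.29, 0.34,
                   0.44, 0.51, 0.63, 0.72, 0.81, 0.93])
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1])
bins, bin_probs = fit_histogram_binning(y_pred, y_true, n_bins=4)
new_scores = np.array([0.10, 0.60])
print(apply_histogram_binning(new_scores, bins, bin_probs))
```

Splitting fit from apply makes it explicit that the learned table `bin_probs` is the calibrator, and that it can be reused on any future batch of scores.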

While histogram binning is easy to implement and interpret, it comes with several important trade-offs. One key consideration is the choice of the number of bins. Using too few bins can lead to underfitting: important calibration details are missed and different probability ranges are merged together, potentially hiding systematic biases. Using too many bins, on the other hand, can cause overfitting, as each bin may contain too few samples to reliably estimate the empirical probability, making the calibration unstable.
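This trade-off can be seen directly on the twelve example predictions above: as the number of bins grows, occupied bins shrink to one or two samples each and empty bins appear. A quick sketch, reusing the example scores:

```python
import numpy as np

y_pred = np.array([0.05, 0.12, 0.18, 0.22, 0.29, 0.34,
                   0.44, 0.51, 0.63, 0.72, 0.81, 0.93])

empty_by_nbins = {}
for n_bins in (4, 12, 24):
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(y_pred, bins, right=True) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    empty_by_nbins[n_bins] = int((counts == 0).sum())
    print(f"n_bins={n_bins:2d}: smallest occupied bin has "
          f"{counts[counts > 0].min()} sample(s), "
          f"{empty_by_nbins[n_bins]} empty bin(s)")
```

With 4 bins every bin holds at least two samples; with 24 bins every occupied bin holds exactly one sample, so each "empirical frequency" is just a single label, which is the overfitting failure mode described above.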

Another limitation is the requirement for sufficient data in each bin. If the dataset is small or predictions are clustered in a narrow range, some bins may have very few or even no samples, resulting in unreliable or undefined calibrated probabilities. Histogram binning is also less smooth than parametric methods such as Platt scaling, as the calibrated output is piecewise constant rather than continuous. This can be problematic if you require smooth probability estimates.
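For contrast, Platt scaling fits a two-parameter logistic curve sigma(a*s + b) to the scores, giving a smooth, strictly monotone mapping instead of a piecewise-constant one. This is a minimal numpy sketch using gradient descent on the log loss (real implementations, such as scikit-learn's sigmoid calibration, use a dedicated solver and label smoothing):

```python
import numpy as np

def fit_platt(scores, labels, lr=0.5, n_iter=5000):
    """Fit p = sigmoid(a * s + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        grad = p - labels  # derivative of the log loss w.r.t. the logit
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    return a, b

y_pred = np.array([0.05, 0.12, 0.18, 0.22, 0.29, 0.34,
                   0.44, 0.51, 0.63, 0.72, 0.81, 0.93])
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1])

a, b = fit_platt(y_pred, y_true)
grid = np.linspace(0.0, 1.0, 5)
calibrated = 1.0 / (1.0 + np.exp(-(a * grid + b)))
print(calibrated)  # increases smoothly with the score, with no bin jumps
```

Because the fitted curve is continuous, two nearly identical scores always receive nearly identical calibrated probabilities, whereas histogram binning can separate them by a large jump at a bin boundary.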

In summary, histogram binning is a practical and interpretable approach to calibration, but its effectiveness depends on thoughtful bin size selection and having enough data to populate each bin.

1. What is the likely effect of using too few bins when applying histogram binning for calibration?

2. Why might histogram binning be sensitive to the sample size in your dataset?



Section 2. Chapter 3

