Isotonic Regression for Calibration
Isotonic regression is a non-parametric calibration method that transforms the predicted probabilities of a classifier into calibrated probabilities by learning a monotonic, non-decreasing function. Unlike Platt scaling, which fits a logistic regression model (a specific S-shaped curve) to map uncalibrated scores to probabilities, isotonic regression does not assume any particular functional form. Instead, it fits a piecewise constant function that is only required to be non-decreasing. This flexibility allows isotonic regression to adapt to a wider range of calibration issues, making it especially useful when the relationship between predicted scores and true probabilities is not well described by a logistic curve.
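To make the mechanics concrete, here is a minimal, self-contained sketch that fits scikit-learn's IsotonicRegression directly to raw scores and binary outcomes. The synthetic scores, the cubic score-to-probability relationship, and the variable names are illustrative assumptions, not part of the lesson's dataset.

import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Illustrative uncalibrated scores in [0, 1], with a deliberately
# non-logistic relationship between score and true positive rate.
raw_scores = rng.uniform(0, 1, size=500)
labels = rng.binomial(1, raw_scores ** 3)

# Learn a non-decreasing mapping from raw scores to probabilities.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(raw_scores, labels)

# The fitted mapping is monotone: higher raw scores never receive
# lower calibrated probabilities.
print(iso.predict(np.linspace(0, 1, 11)).round(3))

Because the learned mapping never decreases, calibration never reverses the ranking of examples by score; it only changes (and may tie) the probability values attached to them.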
The main advantage of isotonic regression over Platt scaling is its ability to capture more complex, non-linear relationships between the model's raw outputs and the true likelihood of an event. This is particularly beneficial when the classifier's output deviates from the logistic shape, as isotonic regression can adjust to local patterns and irregularities in the data without being restricted by a fixed equation. However, this increased flexibility also means that isotonic regression may be more sensitive to noise, especially when the calibration set is small.
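If you want to see this sensitivity yourself, one option is the rough sketch below, which calibrates the same scores on a tiny and a larger calibration split and compares Brier scores (lower is better). The data, the sample sizes, and the use of a plain logistic fit on the score as a stand-in for Platt scaling are all assumptions made for illustration, not prescriptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Train / calibration / test split on synthetic data
X, y = make_classification(n_samples=4000, n_features=20,
                           n_informative=10, flip_y=0.05, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5,
                                            random_state=0)

base = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
s_cal = base.predict_proba(X_cal)[:, 1]   # scores used to fit the calibrators
s_te = base.predict_proba(X_te)[:, 1]     # scores to be calibrated and evaluated

for n in (100, len(s_cal)):               # tiny vs. full calibration set
    iso = IsotonicRegression(out_of_bounds="clip").fit(s_cal[:n], y_cal[:n])
    sig = LogisticRegression().fit(s_cal[:n].reshape(-1, 1), y_cal[:n])
    b_iso = brier_score_loss(y_te, iso.predict(s_te))
    b_sig = brier_score_loss(y_te, sig.predict_proba(s_te.reshape(-1, 1))[:, 1])
    print(f"n_cal={n:4d}  isotonic Brier={b_iso:.4f}  sigmoid Brier={b_sig:.4f}")

Exact numbers depend on the data and the random splits; the point is simply that the isotonic fit has far more freedom to chase noise when only a handful of calibration samples are available.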
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.calibration import CalibratedClassifierCV, calibration_curve

# Generate synthetic data
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=10,
    class_sep=1.0,
    flip_y=0.05,
    random_state=42
)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42
)

# Base model (typically overconfident)
gb = GradientBoostingClassifier(
    random_state=42,
    learning_rate=0.2,
    n_estimators=30,
    max_depth=3
)
gb.fit(X_train, y_train)

# Calibrated model with isotonic regression (internal CV on train)
calibrated_gb = CalibratedClassifierCV(
    estimator=gb,
    method="isotonic",
    cv=3
)
calibrated_gb.fit(X_train, y_train)

# Predicted probabilities on test set
prob_pos_uncal = gb.predict_proba(X_test)[:, 1]
prob_pos_iso = calibrated_gb.predict_proba(X_test)[:, 1]

# Compute calibration curves
frac_pos_uncal, mean_pred_uncal = calibration_curve(
    y_test, prob_pos_uncal, n_bins=10
)
frac_pos_iso, mean_pred_iso = calibration_curve(
    y_test, prob_pos_iso, n_bins=10
)

# Plot calibration curves
plt.figure(figsize=(8, 6))
plt.plot(mean_pred_uncal, frac_pos_uncal, "s-", label="Uncalibrated")
plt.plot(mean_pred_iso, frac_pos_iso, "s-", label="Isotonic Regression")
plt.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
plt.xlabel("Mean predicted value")
plt.ylabel("Fraction of positives")
plt.title("Calibration Curve: Isotonic Regression vs. Uncalibrated")
plt.legend()
plt.tight_layout()
plt.show()
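A calibration curve is a visual check; to attach a single number to the comparison, you can also compute a proper scoring rule such as the Brier score on the same held-out probabilities. The short snippet below reuses y_test, prob_pos_uncal, and prob_pos_iso from the example above; lower is better.

from sklearn.metrics import brier_score_loss

# Mean squared difference between predicted probability and outcome
print(f"Brier score (uncalibrated): {brier_score_loss(y_test, prob_pos_uncal):.4f}")
print(f"Brier score (isotonic):     {brier_score_loss(y_test, prob_pos_iso):.4f}")

One caveat: isotonic regression can map some scores to exactly 0 or 1, so metrics such as log loss, which penalize confident mistakes without bound, should be interpreted with care on isotonic-calibrated outputs.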
When you apply isotonic regression to calibrate a classifier, you often see a noticeable change in the shape of the calibration curve. Because isotonic regression fits a piecewise constant, non-decreasing function, it can bend and adjust the curve at multiple points, correcting for both overconfidence and underconfidence in different regions of predicted probability. The resulting calibration curve may show steps or flat regions, especially if some probability intervals contain few samples. This flexibility allows the calibration curve to closely follow the empirical relationship between predicted probabilities and observed frequencies, often producing a curve that hugs the diagonal more tightly than Platt scaling when the underlying relationship is not logistic.
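To see that contrast directly, one option, sketched below, is to add a sigmoid-calibrated (Platt-scaled) variant of the same base model to the plot from the example above. The snippet reuses gb, X_train, y_train, X_test, y_test, and the uncalibrated and isotonic curve arrays already computed there.

import matplotlib.pyplot as plt
from sklearn.calibration import CalibratedClassifierCV, calibration_curve

# Same base model, calibrated with Platt scaling (sigmoid) instead
calibrated_sig = CalibratedClassifierCV(estimator=gb, method="sigmoid", cv=3)
calibrated_sig.fit(X_train, y_train)
prob_pos_sig = calibrated_sig.predict_proba(X_test)[:, 1]

frac_pos_sig, mean_pred_sig = calibration_curve(y_test, prob_pos_sig, n_bins=10)

plt.figure(figsize=(8, 6))
plt.plot(mean_pred_uncal, frac_pos_uncal, "s-", label="Uncalibrated")
plt.plot(mean_pred_iso, frac_pos_iso, "s-", label="Isotonic Regression")
plt.plot(mean_pred_sig, frac_pos_sig, "o-", label="Platt Scaling (sigmoid)")
plt.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
plt.xlabel("Mean predicted value")
plt.ylabel("Fraction of positives")
plt.title("Calibration Curve: Isotonic vs. Sigmoid vs. Uncalibrated")
plt.legend()
plt.tight_layout()
plt.show()

When the base model's miscalibration is roughly sigmoid-shaped, the two calibrated curves will look similar; when it is not, the isotonic curve typically tracks the diagonal more closely, at the cost of the extra variance discussed above.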
1. What makes isotonic regression more flexible than Platt scaling?
2. When might isotonic regression overfit?