Platt Scaling: Logistic Calibration
Platt scaling is a widely used probability calibration technique that fits a logistic regression to the outputs (scores or probabilities) of an existing classifier, mapping those outputs to better-calibrated probabilities that reflect the true likelihood of each class. It is most appropriate when your model's predicted probabilities are systematically overconfident or underconfident but its ranking of instances, that is, its ability to distinguish between classes, is still reliable.
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Overconfident model
rf = RandomForestClassifier(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)
probs_uncalibrated = rf.predict_proba(X_test)[:, 1]

# Calibrated RF with Platt scaling
cal_rf = CalibratedClassifierCV(
    estimator=RandomForestClassifier(n_estimators=200, random_state=42),
    method="sigmoid",
    cv=3
)
cal_rf.fit(X_train, y_train)
probs_calibrated = cal_rf.predict_proba(X_test)[:, 1]

brier_uncalibrated = brier_score_loss(y_test, probs_uncalibrated)
brier_calibrated = brier_score_loss(y_test, probs_calibrated)

print(f"Brier RF before calibration: {brier_uncalibrated:.4f}")
print(f"Brier RF after calibration: {brier_calibrated:.4f}")
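Beyond the Brier score, a reliability curve shows where the correction happens. The snippet below is a sketch that continues from the example above; it assumes matplotlib is available and uses scikit-learn's calibration_curve to bin the predictions and compare mean predicted probability with the observed fraction of positives.

import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Bin predictions and compare mean predicted probability to observed frequency
frac_pos_raw, mean_pred_raw = calibration_curve(y_test, probs_uncalibrated, n_bins=10)
frac_pos_cal, mean_pred_cal = calibration_curve(y_test, probs_calibrated, n_bins=10)

plt.plot([0, 1], [0, 1], "k--", label="Perfectly calibrated")
plt.plot(mean_pred_raw, frac_pos_raw, marker="o", label="Uncalibrated RF")
plt.plot(mean_pred_cal, frac_pos_cal, marker="o", label="Platt-scaled RF")
plt.xlabel("Mean predicted probability")
plt.ylabel("Fraction of positives")
plt.legend()
plt.show()

A well-calibrated model's curve lies close to the diagonal; an overconfident model's curve bends away from it at the extremes.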
Platt scaling transforms the predicted probabilities by passing them through a logistic (sigmoid) function fitted to the model's outputs. This process can pull overconfident probabilities closer to the true observed frequencies, resulting in better-calibrated outputs. Platt scaling is most effective when the miscalibration can be corrected by a simple sigmoid-shaped adjustment, such as when the model's outputs are monotonically related to the true probabilities but are systematically too extreme or too conservative. However, if the relationship between predicted scores and true probabilities is highly non-linear or irregular, Platt scaling may not provide sufficient flexibility to achieve good calibration.
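To make the mechanics concrete, the same sigmoid mapping can be fitted by hand: hold out part of the training data as a calibration set, collect the model's uncalibrated scores on it, and fit a one-feature logistic regression from those scores to the true labels. The sketch below assumes the imports and data splits from the example above; names such as rf_manual and platt are illustrative, not scikit-learn API.

from sklearn.linear_model import LogisticRegression

# Hold out a calibration set so the sigmoid is not fitted on the same
# data the forest was trained on (the split size here is an assumption)
X_fit, X_cal, y_fit, y_cal = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42
)

rf_manual = RandomForestClassifier(n_estimators=200, random_state=42)
rf_manual.fit(X_fit, y_fit)

# Uncalibrated scores on the held-out calibration set
scores_cal = rf_manual.predict_proba(X_cal)[:, 1]

# Platt scaling: a one-feature logistic regression learns the sigmoid
# p = 1 / (1 + exp(-(a * s + b))) that maps a score s to a calibrated probability
platt = LogisticRegression()
platt.fit(scores_cal.reshape(-1, 1), y_cal)

# Apply the fitted sigmoid to scores for new data
scores_test = rf_manual.predict_proba(X_test)[:, 1]
probs_platt = platt.predict_proba(scores_test.reshape(-1, 1))[:, 1]

In practice, CalibratedClassifierCV with method="sigmoid" handles this splitting and fitting automatically via cross-validation, which is why the example above simply passes cv=3.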
1. What type of calibration function does Platt scaling apply?
2. In which scenario may Platt scaling not perform well?