Platt Scaling: Logistic Calibration
Platt scaling is a widely used probability calibration technique that fits a logistic regression to the outputs (scores or probabilities) of an existing classifier, mapping those outputs to better-calibrated probabilities that reflect the true likelihood of each class. It is most appropriate when your model's predicted probabilities are systematically overconfident or underconfident but its ranking of instances, that is, its ability to distinguish between classes, is still reliable.
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Overconfident model
rf = RandomForestClassifier(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)
probs_uncalibrated = rf.predict_proba(X_test)[:, 1]

# Calibrated RF with Platt scaling
cal_rf = CalibratedClassifierCV(
    estimator=RandomForestClassifier(n_estimators=200, random_state=42),
    method="sigmoid",
    cv=3
)
cal_rf.fit(X_train, y_train)
probs_calibrated = cal_rf.predict_proba(X_test)[:, 1]

brier_uncalibrated = brier_score_loss(y_test, probs_uncalibrated)
brier_calibrated = brier_score_loss(y_test, probs_calibrated)

print(f"Brier RF before calibration: {brier_uncalibrated:.4f}")
print(f"Brier RF after calibration: {brier_calibrated:.4f}")
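Beyond the Brier score, a reliability curve shows where the correction happens. The snippet below is a sketch that continues from the example above; it assumes matplotlib is available and uses scikit-learn's calibration_curve to bin the predictions and compare mean predicted probability with the observed fraction of positives.

import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Bin predictions and compare mean predicted probability to observed frequency
frac_pos_raw, mean_pred_raw = calibration_curve(y_test, probs_uncalibrated, n_bins=10)
frac_pos_cal, mean_pred_cal = calibration_curve(y_test, probs_calibrated, n_bins=10)

plt.plot([0, 1], [0, 1], "k--", label="Perfectly calibrated")
plt.plot(mean_pred_raw, frac_pos_raw, marker="o", label="Uncalibrated RF")
plt.plot(mean_pred_cal, frac_pos_cal, marker="o", label="Platt-scaled RF")
plt.xlabel("Mean predicted probability")
plt.ylabel("Fraction of positives")
plt.legend()
plt.show()

A well-calibrated model's curve lies close to the diagonal; an overconfident model's curve bends away from it at the extremes.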
Platt scaling transforms the predicted probabilities by passing them through a logistic (sigmoid) function fitted to the model's outputs. This process can pull overconfident probabilities closer to the true observed frequencies, resulting in better-calibrated outputs. Platt scaling is most effective when the miscalibration can be corrected by a simple sigmoid-shaped adjustment, such as when the model's outputs are monotonically related to the true probabilities but are systematically too extreme or too conservative. However, if the relationship between predicted scores and true probabilities is highly non-linear or irregular, Platt scaling may not provide sufficient flexibility to achieve good calibration.
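To make the mechanics concrete, the same sigmoid mapping can be fitted by hand: hold out part of the training data as a calibration set, collect the model's uncalibrated scores on it, and fit a one-feature logistic regression from those scores to the true labels. The sketch below assumes the imports and data splits from the example above; names such as rf_manual and platt are illustrative, not scikit-learn API.

from sklearn.linear_model import LogisticRegression

# Hold out a calibration set so the sigmoid is not fitted on the same
# data the forest was trained on (the split size here is an assumption)
X_fit, X_cal, y_fit, y_cal = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42
)

rf_manual = RandomForestClassifier(n_estimators=200, random_state=42)
rf_manual.fit(X_fit, y_fit)

# Uncalibrated scores on the held-out calibration set
scores_cal = rf_manual.predict_proba(X_cal)[:, 1]

# Platt scaling: a one-feature logistic regression learns the sigmoid
# p = 1 / (1 + exp(-(a * s + b))) that maps a score s to a calibrated probability
platt = LogisticRegression()
platt.fit(scores_cal.reshape(-1, 1), y_cal)

# Apply the fitted sigmoid to scores for new data
scores_test = rf_manual.predict_proba(X_test)[:, 1]
probs_platt = platt.predict_proba(scores_test.reshape(-1, 1))[:, 1]

In practice, CalibratedClassifierCV with method="sigmoid" handles this splitting and fitting automatically via cross-validation, which is why the example above simply passes cv=3.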
1. What type of calibration function does Platt scaling apply?
2. In which scenario may Platt scaling not perform well?