Learn Anomaly Detection Metrics | Unsupervised Learning Metrics
Evaluation Metrics in Machine Learning

Anomaly Detection Metrics

Evaluating anomaly detection models presents unique challenges, especially when dealing with highly imbalanced data. In most real-world anomaly detection scenarios, the vast majority of data points are normal, while only a tiny fraction represent the rare events or anomalies that you are trying to detect. This imbalance means that traditional accuracy metrics can be misleading: a model that always predicts "normal" may achieve high accuracy simply by ignoring the anomalies altogether. As a result, you need evaluation metrics that focus on the model's ability to correctly identify these rare events without being overwhelmed by the large number of normal cases.
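
To make this concrete, the short sketch below (using made-up labels with 1% anomalies) shows that a detector that never flags anything still reaches 99% accuracy while achieving zero recall on the anomalies.

import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical labels: 1,000 points, of which only 10 (1%) are anomalies (label 1)
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

# A trivial "detector" that labels every point as normal
y_pred = np.zeros(1000, dtype=int)

print(accuracy_score(y_true, y_pred))  # 0.99 -- looks impressive
print(recall_score(y_true, y_pred))    # 0.0  -- misses every anomaly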

In the context of anomaly detection, especially with imbalanced datasets, two key metrics are precision and recall. These are defined as:

\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}

\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}
  • Precision answers: "Of all the points the model flagged as anomalies, how many were actually anomalies?";
  • Recall answers: "Of all the actual anomalies, how many did the model correctly flag?".
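
As a quick sanity check of these formulas, the sketch below plugs in hypothetical counts (20 true positives, 10 false positives, 30 false negatives) and confirms the hand-computed values against scikit-learn's precision_score and recall_score.

import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical confusion counts
tp, fp, fn, tn = 20, 10, 30, 940

# Hand-computed from the formulas above
precision = tp / (tp + fp)   # 20 / 30 ≈ 0.667
recall = tp / (tp + fn)      # 20 / 50 = 0.400

# Rebuild label arrays with the same counts and let scikit-learn confirm
y_true = np.array([1] * tp + [0] * fp + [1] * fn + [0] * tn)
y_pred = np.array([1] * tp + [1] * fp + [0] * fn + [0] * tn)

print(precision, precision_score(y_true, y_pred))  # both ≈ 0.667
print(recall, recall_score(y_true, y_pred))        # both 0.400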

The ROC AUC (Receiver Operating Characteristic Area Under Curve) measures the ability of the model to distinguish between classes across all thresholds:

\text{ROC AUC} = \int_{0}^{1} \text{TPR}(\text{FPR}) \, d\text{FPR}

where \text{TPR} is the true positive rate (recall) and \text{FPR} is the false positive rate.
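
To connect this integral to code, the hedged sketch below uses hypothetical labels and scores, samples the ROC curve with roc_curve, and integrates TPR over FPR with the trapezoidal rule via auc; the result matches roc_auc_score.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, auc

# Hypothetical labels (1 = anomaly) and anomaly scores (higher = more anomalous)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.10, 0.20, 0.15, 0.30, 0.25, 0.70, 0.80, 0.35, 0.60, 0.90])

# Sample the ROC curve: TPR as a function of FPR at every threshold
fpr, tpr, thresholds = roc_curve(y_true, scores)

# auc() integrates TPR over FPR with the trapezoidal rule -- the integral above
roc_auc_from_curve = auc(fpr, tpr)

print(roc_auc_from_curve, roc_auc_score(y_true, scores))  # both ≈ 0.952

The full example below puts these metrics together, fitting an Isolation Forest detector on a simulated imbalanced dataset and reporting both PR AUC and ROC AUC.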

from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_recall_curve, roc_auc_score, auc

# Simulate imbalanced data: 1% anomalies
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=2,
    n_redundant=10,
    n_clusters_per_class=1,
    weights=[0.99],
    flip_y=0,
    random_state=42,
)

# With weights=[0.99], class 0 is the 99% majority (normal) and class 1
# is the 1% minority, so y == 1 already marks the anomalies
y_anomaly = y

# Fit Isolation Forest (unsupervised anomaly detection)
clf = IsolationForest(contamination=0.01, random_state=42)
clf.fit(X)

# decision_function: higher means more normal, lower means more anomalous
scores = -clf.decision_function(X)  # flip sign: higher = more anomalous

# Precision-recall curve and its AUC
precision, recall, thresholds = precision_recall_curve(y_anomaly, scores)
pr_auc = auc(recall, precision)

# ROC AUC
roc_auc = roc_auc_score(y_anomaly, scores)

print(f"Precision-Recall AUC: {pr_auc:.3f}")
print(f"ROC AUC: {roc_auc:.3f}")
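
If you also want to inspect the curves rather than only the summary numbers, a minimal plotting sketch (assuming the variables from the block above are still in scope and that matplotlib is available) could look like this:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Reuses y_anomaly, scores, precision, recall from the previous block
fpr, tpr, _ = roc_curve(y_anomaly, scores)

fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(10, 4))

# Precision-recall curve: how precision trades off against recall
ax_pr.plot(recall, precision)
ax_pr.set_xlabel("Recall")
ax_pr.set_ylabel("Precision")
ax_pr.set_title("Precision-Recall curve")

# ROC curve: TPR vs. FPR across all thresholds
ax_roc.plot(fpr, tpr)
ax_roc.plot([0, 1], [0, 1], linestyle="--")  # chance level
ax_roc.set_xlabel("False positive rate")
ax_roc.set_ylabel("True positive rate")
ax_roc.set_title("ROC curve")

plt.tight_layout()
plt.show()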

In rare event detection with imbalanced data, prioritize precision-recall curves over ROC curves, as PR AUC better reflects your model's ability to detect anomalies without excessive false alarms. ROC AUC can overstate performance due to the large number of normal cases. Always choose and tune metrics — such as precision, recall, or PR AUC — based on your specific operational priorities, like minimizing missed anomalies or reducing false positives.
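
As one way to turn such operational priorities into a concrete threshold, the sketch below (reusing precision, recall, and thresholds from the code above, with a hypothetical target of catching at least 80% of anomalies) selects the most precise threshold that still meets that recall target.

import numpy as np

# precision_recall_curve returns len(thresholds) + 1 precision/recall values,
# so drop the final point (recall=0, precision=1) to align them with thresholds
precision_t = precision[:-1]
recall_t = recall[:-1]

TARGET_RECALL = 0.80  # hypothetical operational requirement

# Among thresholds that reach the recall target, take the one with best precision
eligible = recall_t >= TARGET_RECALL
if eligible.any():
    best = np.argmax(np.where(eligible, precision_t, -np.inf))
    print(f"Threshold: {thresholds[best]:.3f}")
    print(f"Recall:    {recall_t[best]:.3f}")
    print(f"Precision: {precision_t[best]:.3f}")
else:
    print("No threshold reaches the target recall.")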


Which statement best describes why precision-recall curves are often preferred over ROC curves for evaluating anomaly detection models on highly imbalanced datasets?

