Learn Anomaly Detection Metrics | Unsupervised Learning Metrics
Evaluation Metrics in Machine Learning

Anomaly Detection Metrics

Evaluating anomaly detection models presents unique challenges, especially when dealing with highly imbalanced data. In most real-world anomaly detection scenarios, the vast majority of data points are normal, while only a tiny fraction represent the rare events or anomalies that you are trying to detect. This imbalance means that traditional accuracy metrics can be misleading: a model that always predicts "normal" may achieve high accuracy simply by ignoring the anomalies altogether. As a result, you need evaluation metrics that focus on the model's ability to correctly identify these rare events without being overwhelmed by the large number of normal cases.
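
To make this concrete, the short sketch below (using made-up labels with 1% anomalies) shows that a detector that never flags anything still reaches 99% accuracy while achieving zero recall on the anomalies.

import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical labels: 1,000 points, of which only 10 (1%) are anomalies (label 1)
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

# A trivial "detector" that labels every point as normal
y_pred = np.zeros(1000, dtype=int)

print(accuracy_score(y_true, y_pred))  # 0.99 -- looks impressive
print(recall_score(y_true, y_pred))    # 0.0  -- misses every anomaly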

In the context of anomaly detection, especially with imbalanced datasets, two key metrics are precision and recall. These are defined as:

\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}

\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}
  • Precision answers: "Of all the points the model flagged as anomalies, how many were actually anomalies?";
  • Recall answers: "Of all the actual anomalies, how many did the model correctly flag?".
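
As a quick sanity check of these formulas, the sketch below plugs in hypothetical counts (20 true positives, 10 false positives, 30 false negatives) and confirms the hand-computed values against scikit-learn's precision_score and recall_score.

import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical confusion counts
tp, fp, fn, tn = 20, 10, 30, 940

# Hand-computed from the formulas above
precision = tp / (tp + fp)   # 20 / 30 ≈ 0.667
recall = tp / (tp + fn)      # 20 / 50 = 0.400

# Rebuild label arrays with the same counts and let scikit-learn confirm
y_true = np.array([1] * tp + [0] * fp + [1] * fn + [0] * tn)
y_pred = np.array([1] * tp + [1] * fp + [0] * fn + [0] * tn)

print(precision, precision_score(y_true, y_pred))  # both ≈ 0.667
print(recall, recall_score(y_true, y_pred))        # both 0.400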

The ROC AUC (Receiver Operating Characteristic Area Under Curve) measures the ability of the model to distinguish between classes across all thresholds:

\text{ROC AUC} = \int_{0}^{1} \text{TPR}(\text{FPR}) \, d\text{FPR}

where \text{TPR} is the true positive rate (recall) and \text{FPR} is the false positive rate.
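
To connect this integral to code, the hedged sketch below uses hypothetical labels and scores, samples the ROC curve with roc_curve, and integrates TPR over FPR with the trapezoidal rule via auc; the result matches roc_auc_score.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, auc

# Hypothetical labels (1 = anomaly) and anomaly scores (higher = more anomalous)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.10, 0.20, 0.15, 0.30, 0.25, 0.70, 0.80, 0.35, 0.60, 0.90])

# Sample the ROC curve: TPR as a function of FPR at every threshold
fpr, tpr, thresholds = roc_curve(y_true, scores)

# auc() integrates TPR over FPR with the trapezoidal rule -- the integral above
roc_auc_from_curve = auc(fpr, tpr)

print(roc_auc_from_curve, roc_auc_score(y_true, scores))  # both ≈ 0.952

The full example below puts these metrics together, fitting an Isolation Forest detector on a simulated imbalanced dataset and reporting both PR AUC and ROC AUC.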

from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_recall_curve, roc_auc_score, auc

# Simulate imbalanced data: 1% anomalies
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=2,
    n_redundant=10,
    n_clusters_per_class=1,
    weights=[0.99],
    flip_y=0,
    random_state=42,
)

# With weights=[0.99], class 0 is the 99% majority (normal) and class 1
# is the 1% minority, so y == 1 already marks the anomalies
y_anomaly = y

# Fit Isolation Forest (unsupervised anomaly detection)
clf = IsolationForest(contamination=0.01, random_state=42)
clf.fit(X)

# decision_function: higher means more normal, lower means more anomalous
scores = -clf.decision_function(X)  # flip sign: higher = more anomalous

# Precision-recall curve and its AUC
precision, recall, thresholds = precision_recall_curve(y_anomaly, scores)
pr_auc = auc(recall, precision)

# ROC AUC
roc_auc = roc_auc_score(y_anomaly, scores)

print(f"Precision-Recall AUC: {pr_auc:.3f}")
print(f"ROC AUC: {roc_auc:.3f}")
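
If you also want to inspect the curves rather than only the summary numbers, a minimal plotting sketch (assuming the variables from the block above are still in scope and that matplotlib is available) could look like this:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Reuses y_anomaly, scores, precision, recall from the previous block
fpr, tpr, _ = roc_curve(y_anomaly, scores)

fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(10, 4))

# Precision-recall curve: how precision trades off against recall
ax_pr.plot(recall, precision)
ax_pr.set_xlabel("Recall")
ax_pr.set_ylabel("Precision")
ax_pr.set_title("Precision-Recall curve")

# ROC curve: TPR vs. FPR across all thresholds
ax_roc.plot(fpr, tpr)
ax_roc.plot([0, 1], [0, 1], linestyle="--")  # chance level
ax_roc.set_xlabel("False positive rate")
ax_roc.set_ylabel("True positive rate")
ax_roc.set_title("ROC curve")

plt.tight_layout()
plt.show()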

In rare event detection with imbalanced data, prioritize precision-recall curves over ROC curves, as PR AUC better reflects your model's ability to detect anomalies without excessive false alarms. ROC AUC can overstate performance due to the large number of normal cases. Always choose and tune metrics — such as precision, recall, or PR AUC — based on your specific operational priorities, like minimizing missed anomalies or reducing false positives.
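
As one way to turn such operational priorities into a concrete threshold, the sketch below (reusing precision, recall, and thresholds from the code above, with a hypothetical target of catching at least 80% of anomalies) selects the most precise threshold that still meets that recall target.

import numpy as np

# precision_recall_curve returns len(thresholds) + 1 precision/recall values,
# so drop the final point (recall=0, precision=1) to align them with thresholds
precision_t = precision[:-1]
recall_t = recall[:-1]

TARGET_RECALL = 0.80  # hypothetical operational requirement

# Among thresholds that reach the recall target, take the one with best precision
eligible = recall_t >= TARGET_RECALL
if eligible.any():
    best = np.argmax(np.where(eligible, precision_t, -np.inf))
    print(f"Threshold: {thresholds[best]:.3f}")
    print(f"Recall:    {recall_t[best]:.3f}")
    print(f"Precision: {precision_t[best]:.3f}")
else:
    print("No threshold reaches the target recall.")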


Which statement best describes why precision-recall curves are often preferred over ROC curves for evaluating anomaly detection models on highly imbalanced datasets?

