1.5 IQR Rule
The 1.5 IQR (Interquartile Range) rule is a simple but effective method for identifying outliers in a dataset. It's based on the spread of the middle 50% of the data (the distance between the first and third quartiles) and is commonly used in anomaly detection.
How to use 1.5 IQR rule
- Calculate the IQR, which is the difference between the 75th percentile (Q3) and the 25th percentile (Q1) of the dataset: IQR = Q3 - Q1;
- Define the lower threshold as Q1 - 1.5 * IQR and the upper threshold as Q3 + 1.5 * IQR;
- Any data point below the lower threshold or above the upper threshold is considered an outlier.
Here is the implementation of this rule:
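import numpy as np

def detect_outliers_iqr(data):
    data = np.array(data)
    # Quartiles: the 25th and 75th percentiles of the data
    q1 = np.percentile(data, 25)
    q3 = np.percentile(data, 75)
    iqr = q3 - q1
    # Thresholds lie 1.5 * IQR beyond each quartile
    lower_threshold = q1 - 1.5 * iqr
    upper_threshold = q3 + 1.5 * iqr
    # Keep only the points outside [lower_threshold, upper_threshold]
    outliers = data[(data < lower_threshold) | (data > upper_threshold)]
    return outliers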
We simply calculate the threshold values and consider all points that fall outside the range between the lower and upper thresholds to be outliers.
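To make the arithmetic concrete, here is a minimal worked sketch on a small made-up sample. The helper name `iqr_bounds` and its parameter `k` are assumptions introduced for illustration only; the sketch relies on the default linear interpolation of `np.percentile`.

```python
import numpy as np

def iqr_bounds(data, k=1.5):
    """Return (lower, upper) thresholds for the k * IQR rule.

    Hypothetical helper for illustration; k=1.5 gives the rule described above.
    """
    data = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up sample: eight ordinary values and one suspiciously large one.
sample = np.array([2, 4, 5, 6, 7, 8, 9, 11, 30])

lower, upper = iqr_bounds(sample)
outliers = sample[(sample < lower) | (sample > upper)]

print(lower, upper)  # -1.0 15.0
print(outliers)      # [30]
```

For this sample Q1 = 5 and Q3 = 9, so IQR = 4, the thresholds are 5 - 6 = -1 and 9 + 6 = 15, and only the value 30 is flagged.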
1.5 IQR rule for commonly used distributions
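As a rough illustration of how the rule behaves on different distributions, the sketch below draws random samples and measures the share of points flagged. The function name `flagged_fraction`, the sample size, and the seed are assumptions chosen for this example. For a normal distribution the thresholds sit about 2.7 standard deviations from the mean, so roughly 0.7% of points are flagged. A uniform distribution has no points beyond the thresholds, while a right-skewed exponential distribution has about 5% of its points flagged, all in the upper tail.

```python
import numpy as np

def flagged_fraction(samples):
    """Fraction of samples outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, q3 = np.percentile(samples, [25, 75])
    iqr = q3 - q1
    mask = (samples < q1 - 1.5 * iqr) | (samples > q3 + 1.5 * iqr)
    return mask.mean()

rng = np.random.default_rng(0)  # fixed seed so repeated runs give the same numbers
n = 100_000                     # arbitrary sample size for the illustration

normal_frac = flagged_fraction(rng.normal(size=n))
uniform_frac = flagged_fraction(rng.uniform(size=n))
expon_frac = flagged_fraction(rng.exponential(size=n))

print(f"normal:      {normal_frac:.4f}")
print(f"uniform:     {uniform_frac:.4f}")
print(f"exponential: {expon_frac:.4f}")
```

The practical point is that the same 1.5 multiplier flags very different shares of points depending on the shape of the data, so a flagged point in heavily skewed data is not necessarily anomalous.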
Pros and cons of using 1.5 IQR rule