Data Anomaly Detection
3-Sigma Rule
The 3-sigma rule, also known as the 68-95-99.7 rule or the empirical rule, is a statistical guideline used in anomaly detection and quality control.
It relies on the properties of the normal distribution to flag data points that lie unusually far from the mean.
Main aspects of this rule
- Normal Distribution Assumption: The 3-sigma rule assumes that the data follows a normal (Gaussian) distribution. In a normal distribution, approximately 68% of the data falls within one standard deviation (sigma) of the mean, approximately 95% falls within two standard deviations, and about 99.7% falls within three standard deviations;
- Identification of Outliers: According to the 3-sigma rule, data points that fall more than three standard deviations away from the mean are considered potential outliers. These data points are significantly different from the majority of the data and are often flagged for further investigation.
3-sigma rule implementation
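A minimal sketch of the rule in Python, using only the standard library. The function name `three_sigma_outliers`, the parameter `k`, and the sample `readings` are hypothetical, chosen for illustration. The sketch assumes the data is roughly normal and uses the population standard deviation.

```python
import statistics


def three_sigma_outliers(values, k=3.0):
    """Return (index, value) pairs lying more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption of this sketch)
    if std == 0:
        # All values are identical, so nothing can be flagged
        return []
    return [(i, x) for i, x in enumerate(values) if abs(x - mean) > k * std]


# Hypothetical sample: 20 readings close to 10 plus one injected anomaly
readings = [10.0, 10.2, 9.8, 10.1, 9.9] * 4 + [25.0]

for index, value in three_sigma_outliers(readings):
    print(f"Potential outlier at index {index}: {value}")
```

The anomaly itself inflates both the mean and the standard deviation, so on very small samples an extreme value may fail to exceed the three-sigma threshold. Flagged points are best treated as candidates for further investigation rather than confirmed anomalies.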