What Should We Do With Detected Outliers | What is Anomaly Detection?
Data Anomaly Detection

What Should We Do With Detected Outliers

How to handle outliers in machine learning depends on their nature and cause, as well as on the goals of the analysis or model. Here are some common approaches:

1. Leave the outliers in place: In some cases, outliers are valid and meaningful data points that should not be removed. If they are not errors and do not significantly affect the overall distribution or analysis, it may be appropriate to keep them in the dataset. Regularization techniques can then be used to reduce their influence on the predictions;

2. Replace outlier values with the mean/median: If there are many outliers or they significantly distort the overall pattern of the data, a basic method is to replace them with the mean or median calculated from the remaining data, excluding the outliers themselves;

Note

This method is suitable only for data whose mean is roughly constant. If the data exhibits a trend, whether linear or nonlinear, a single replacement value will not follow that trend, so this approach cannot be applied effectively.

3. Transform the data: Applying mathematical functions such as logarithms, square roots, or other power functions can reduce the impact of outliers and may improve the accuracy of machine learning models;

4. Treat outliers as a separate class: In classification tasks, outliers may represent a distinct class of data that should be analyzed separately from the rest of the dataset. For example, in fraud detection, outliers may represent fraudulent transactions that require special attention and analysis.
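
Approaches 2 to 4 can be illustrated with a minimal sketch. It assumes outliers are flagged with the 1.5 x IQR rule, which is chosen here only for illustration and is not prescribed by this chapter. The sample array `values`, the helper `iqr_outlier_mask`, and the threshold `k` are hypothetical names introduced for this example.

```python
import numpy as np

# Hypothetical sample: mostly values near 10, plus two extreme readings.
values = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 55.0, 10.3, 9.7, 0.1, 10.0])


def iqr_outlier_mask(x, k=1.5):
    """Return a boolean mask that is True where x lies outside the IQR fences."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


# Assumed detection step: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
mask = iqr_outlier_mask(values)

# Approach 2: replace outliers with the median of the remaining points.
# The median is computed without the outliers, as the text describes.
clean_median = np.median(values[~mask])
replaced = np.where(mask, clean_median, values)

# Approach 3: compress large values with log(1 + x).
# log1p needs inputs greater than -1; it mainly shrinks large, right-tail values.
transformed = np.log1p(values)

# Approach 4: keep every point and carry the flag as a separate label
# (1 = outlier, 0 = regular point) for later analysis or classification.
labels = mask.astype(int)

print("outlier mask:   ", mask)
print("median-replaced:", replaced)
print("log-transformed:", np.round(transformed, 3))
print("labels:         ", labels)
```

In line with the note under approach 2, the median replacement here only makes sense because the sample has no trend. For trending data, a local statistic such as a rolling median might be a better fit, though that goes beyond what this chapter covers.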
