Learn How DBSCAN Works? | DBSCAN
How DBSCAN Works?

DBSCAN operates based on the idea of density reachability. It defines clusters as dense regions of data points separated by areas of lower density. Two key parameters govern its behavior:

  • Epsilon (ε): the radius within which you search for neighboring points;

  • Minimum number of points (MinPts): the minimum number of points required within the ε-radius to form a dense region (including the point itself).

DBSCAN classifies points into three categories:

  • Core points: a point is a core point if at least MinPts points (counting the point itself) lie within its ε-radius;

  • Border points: a point is a border point if it has fewer than MinPts points within its ε-radius but lies within the ε-radius of a core point (i.e., it is directly reachable from that core point);

  • Noise points: a point that is neither a core point nor a border point is considered a noise point.
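The three categories above can be checked directly by counting neighbors. The following is a minimal sketch, not a reference implementation: the function name `classify_points`, the toy coordinates, and the values `eps=1.1` and `min_pts=3` are all assumptions chosen for illustration, and Euclidean distance is assumed.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Return 'core', 'border', or 'noise' for each point (hypothetical helper)."""
    # neighbors[i] holds the indices within eps of point i, including i itself.
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(n) >= min_pts for n in neighbors]
    kinds = []
    for i, n in enumerate(neighbors):
        if is_core[i]:
            kinds.append("core")
        elif any(is_core[j] for j in n):
            # Not dense enough itself, but inside the eps-radius of a core point.
            kinds.append("border")
        else:
            kinds.append("noise")
    return kinds

# Toy data (assumed): a small dense group, one point on its edge, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8)]
kinds = classify_points(points, eps=1.1, min_pts=3)
print(kinds)
# ['core', 'core', 'core', 'core', 'border', 'noise']
```

Here (2, 1) has only two points within its radius (itself and (1, 1)), so it is not a core point, but it lies within the radius of the core point (1, 1) and is therefore a border point.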

Algorithm

  1. Start with an arbitrary unvisited point;

  2. Find all points within its ε-radius;

  3. If the point has at least MinPts neighbors within its ε-radius (counting itself), mark it as a core point and start a new cluster, then expand the cluster recursively by adding every point that is density-reachable from it;

  4. If the point has fewer than MinPts neighbors within its ε-radius, mark it as a border point (if it lies within the ε-radius of a core point) or as a noise point (if it does not). In practice such a point is first labeled noise and is relabeled as a border point if a cluster expanding from a core point later reaches it;

  5. Repeat steps 1-4 until all points are visited.
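The steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not an optimized or canonical implementation: the names `dbscan`, `region_query`, and `NOISE`, the toy coordinates, and the parameter values are hypothetical, Euclidean distance is assumed, and neighbors are found by brute force.

```python
from math import dist

NOISE = -1  # label for points that end up in no cluster (assumed convention)

def region_query(points, i, eps):
    """Indices of all points within eps of point i, including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):            # step 1: pick an unvisited point
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)  # step 2
        if len(neighbors) < min_pts:        # step 4: not dense enough
            labels[i] = NOISE               # provisional; may become a border point
            continue
        cluster_id += 1                     # step 3: core point starts a cluster
        labels[i] = cluster_id
        stack = list(neighbors)
        while stack:                        # expand through reachable points
            j = stack.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id      # earlier "noise" is actually a border point
            elif labels[j] is None:
                labels[j] = cluster_id
                j_neighbors = region_query(points, j, eps)
                if len(j_neighbors) >= min_pts:
                    stack.extend(j_neighbors)  # only core points keep expanding
    return labels

# Toy data (assumed): two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1),
          (5, 6), (5, 5), (6, 5), (9, 9)]
labels = dbscan(points, eps=1.1, min_pts=3)
print(labels)
# [0, 0, 0, 0, 0, 1, 1, 1, -1]
```

Note that (5, 6) is visited before any core point of its cluster, so it is first labeled noise and only becomes a border point of cluster 1 once the expansion from (5, 5) reaches it.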

Imagine a scatter plot of data points. DBSCAN would start by picking a point. If it finds enough neighbors within its ε-radius, it marks it as a core point and starts forming a cluster. It then expands this cluster by checking the neighbors of the core point and their neighbors, and so on. Points that are close to a core point but don't have enough neighbors themselves are marked as border points. Points that are isolated are identified as noise.
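To try this on an actual set of points, one option is the `DBSCAN` class from scikit-learn. This is a sketch that assumes scikit-learn and NumPy are installed; the toy coordinates and parameter values are made up for illustration. In scikit-learn, `eps` plays the role of ε and `min_samples` the role of MinPts (counting the point itself), and noise points receive the label -1.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy scatter (assumed): two dense groups and one isolated point.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1],
              [5, 6], [5, 5], [6, 5], [9, 9]], dtype=float)

model = DBSCAN(eps=1.1, min_samples=3).fit(X)
labels = model.labels_                   # cluster index per point, -1 for noise
core_idx = model.core_sample_indices_    # indices of the core points

# Border points: assigned to a cluster but not core points themselves.
is_core = np.zeros(len(X), dtype=bool)
is_core[core_idx] = True
border_idx = np.flatnonzero((labels != -1) & ~is_core)
noise_idx = np.flatnonzero(labels == -1)

print("labels:", labels.tolist())
print("core:", core_idx.tolist())
print("border:", border_idx.tolist())
print("noise:", noise_idx.tolist())
```

Plotting `X` colored by `labels` should show the two dense groups as separate clusters, with the isolated point marked as noise.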
