Cluster Analysis

How Does the K-Means Algorithm Work?

Initialization

The algorithm begins by randomly selecting K initial cluster centers, also known as centroids. These centroids serve as the starting points for each cluster. A common approach is to randomly choose K data points from the dataset to be the initial centroids.
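As a minimal sketch of this initialization, the snippet below draws K distinct data points at random and uses them as the starting centroids. The function name `init_centroids`, the `seed` parameter, and the sample data are illustrative assumptions, not part of the lesson.

```python
import random

def init_centroids(points, k, seed=None):
    """Pick k distinct data points at random to serve as initial centroids."""
    rng = random.Random(seed)  # a seed (assumed parameter) makes the draw repeatable
    # sample() draws without replacement, so no data point is chosen twice
    return [tuple(p) for p in rng.sample(points, k)]

# Made-up 2-D data for illustration
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5)]
centroids = init_centroids(points, k=2, seed=0)
print(centroids)
```

Because the starting centroids are random, different seeds can lead to different final clusters.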

Assignment Step

In this step, each data point is assigned to the closest centroid. The distance is typically measured using Euclidean distance, but other distance metrics can also be used. Each data point is placed into the cluster represented by the nearest centroid.
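The assignment step can be sketched as follows, assuming Euclidean distance and points stored as tuples. The helper name `assign_clusters` and the sample values are hypothetical.

```python
import math

def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        # Euclidean distance from this point to every centroid
        distances = [math.dist(p, c) for c in centroids]
        # The cluster label is the position of the smallest distance
        labels.append(distances.index(min(distances)))
    return labels

# Made-up data: two points near (1, 1) and two near (9, 9)
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
labels = assign_clusters(points, centroids)
print(labels)  # [0, 0, 1, 1]
```

Swapping `math.dist` for another distance function would change which centroid counts as "closest" without altering the rest of the step.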

Update Step

Once all data points are assigned to clusters, the centroids are recalculated. For each cluster, the new centroid is computed as the mean of all the data points belonging to that cluster. Essentially, the centroid is moved to the center of its cluster.
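A small sketch of the update step is shown below. The name `update_centroids` is hypothetical, and keeping the old centroid when a cluster ends up empty is an assumption made here for simplicity; the lesson does not say how that case is handled.

```python
def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid
            new_centroids.append(old)
            continue
        # zip(*members) groups the coordinates axis by axis, so each mean
        # is taken over one dimension at a time
        new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return new_centroids

# Made-up data with a fixed assignment
points = [(1.0, 1.0), (2.0, 3.0), (8.0, 8.0), (10.0, 10.0)]
labels = [0, 0, 1, 1]
new_centroids = update_centroids(points, labels, [(0.0, 0.0), (9.0, 9.0)])
print(new_centroids)  # [(1.5, 2.0), (9.0, 9.0)]
```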

Iteration

The assignment and update steps are repeated iteratively. In each iteration, data points are reassigned to clusters based on the updated centroids, and then centroids are recalculated based on the new cluster assignments. This iterative process continues until a stopping criterion is met.
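To illustrate the alternation, the sketch below runs the assignment and update steps for a fixed three passes on made-up data, starting from deliberately poor centroids. The helper names `assign` and `update`, the data, and the fixed pass count are all assumptions chosen for the example.

```python
import math

def assign(points, centroids):
    """Index of the nearest centroid (Euclidean) for each point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]

def update(points, labels, centroids):
    """Mean of each cluster's members; an empty cluster keeps its old centroid."""
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new.append(old)
    return new

# Two visually obvious groups, but both starting centroids sit in the first one
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids = [(1.0, 1.0), (2.0, 1.0)]

for iteration in range(1, 4):
    labels = assign(points, centroids)             # reassign with current centroids
    centroids = update(points, labels, centroids)  # recompute from new assignments
    print(iteration, labels, centroids)
```

On this data the first pass wrongly groups (2.0, 1.0) with the distant points, the second pass corrects it, and the third pass changes nothing.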

Convergence

The algorithm stops when one of the following conditions is met:

  • Centroids do not change significantly: the positions of the centroids stabilize, meaning that in subsequent iterations, there is minimal change in their locations;

  • Data point assignments do not change: data points remain in the same clusters, indicating that the cluster structure has become stable;

  • Maximum number of iterations is reached: the algorithm has run for a pre-defined number of iterations. This cap prevents it from running indefinitely.
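The three conditions above could be checked with a helper like the one sketched here. The function name `should_stop`, its arguments, and the default values for `tol` and `max_iter` are illustrative assumptions; a real implementation may test only some of these conditions or measure centroid movement differently.

```python
import math

def should_stop(old_centroids, new_centroids, old_labels, new_labels,
                iteration, tol=1e-4, max_iter=300):
    """Return the reason to stop, or None to keep iterating."""
    # 1. Centroids do not change significantly: the largest shift is within tol
    shift = max(math.dist(a, b) for a, b in zip(old_centroids, new_centroids))
    if shift <= tol:
        return "centroids stable"
    # 2. Data point assignments do not change
    if old_labels == new_labels:
        return "assignments stable"
    # 3. Maximum number of iterations is reached
    if iteration >= max_iter:
        return "max iterations reached"
    return None

# A centroid moved by a full unit and one point switched cluster: keep going
print(should_stop([(0.0, 0.0)], [(1.0, 0.0)], [0, 1], [1, 1], iteration=5))  # None
```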

When the algorithm stops, K-means has partitioned the data into K clusters, with each cluster represented by its centroid. The resulting clusters aim to be internally cohesive and externally separated based on the chosen distance metric and the iterative refinement process.
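One common way to put a number on internal cohesion is the sum of squared distances from each point to its own centroid, where lower means tighter clusters. The sketch below assumes Euclidean distance; the function name `within_cluster_sse` and the sample data are hypothetical.

```python
import math

def within_cluster_sse(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))

# Made-up data: every point lies exactly 1 unit from the centroid of its own group
points = [(1.0, 1.0), (1.0, 3.0), (8.0, 8.0), (10.0, 8.0)]
centroids = [(1.0, 2.0), (9.0, 8.0)]

sensible = within_cluster_sse(points, [0, 0, 1, 1], centroids)  # each point with its neighbor
swapped = within_cluster_sse(points, [1, 1, 0, 0], centroids)   # each point sent to the far centroid
print(sensible < swapped)  # True
```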
