Finding Optimal Number of Clusters Using Silhouette Score

Besides the WSS (within-cluster sum of squares) method, the silhouette score is another useful metric for choosing the number of clusters (K) in K-means. It measures how well each data point fits its assigned cluster compared with the nearest neighboring cluster.

For each data point, the silhouette score considers:

  • Cohesion (a): average distance to the other points within its cluster;

  • Separation (b): average distance to points in the nearest other cluster.

For a single point, the silhouette score is calculated as s = (b - a) / max(a, b), which ranges from -1 to +1.
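To make the formula concrete, here is a minimal sketch that computes the score for one point by hand. The tiny one-dimensional dataset, the labels, and the helper name `silhouette_of_point` are made up for illustration, and the helper assumes the point's cluster contains at least two points.

```python
import numpy as np

def silhouette_of_point(X, labels, i):
    """Silhouette score of point i (hypothetical helper; assumes its cluster has >= 2 points)."""
    dists = np.linalg.norm(X - X[i], axis=1)  # distance from point i to every point
    own = labels == labels[i]
    own[i] = False                            # exclude the point itself from its own cluster
    a = dists[own].mean()                     # cohesion
    # separation: smallest mean distance to any other cluster
    b = min(dists[labels == k].mean() for k in np.unique(labels) if k != labels[i])
    return float((b - a) / max(a, b))

# Illustrative data: cluster 0 = {1, 2, 3}, cluster 1 = {8, 9}
X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0]])
labels = np.array([0, 0, 0, 1, 1])

s0 = silhouette_of_point(X, labels, 0)
print(round(s0, 3))  # a = 1.5, b = 7.5, so s = 6 / 7.5 = 0.8
```

Here the point at 1.0 is much closer to its own cluster than to the other one, so its score is close to +1.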

Score interpretation:

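  • close to +1: the point is well matched to its own cluster and far from neighboring clusters;

  • close to 0: the point lies on or near the boundary between two clusters;

  • close to -1: the point is closer to another cluster and may be assigned to the wrong one.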

To find the optimal K using the silhouette score:

  • Run K-means for a range of K values (e.g., K=2 to a reasonable limit; the score is not defined for a single cluster);

  • For each K, calculate the average silhouette score over all points;

  • Plot the average silhouette score vs. K;

  • Choose K with the highest average silhouette score.
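The steps above can be sketched with scikit-learn as follows. The synthetic dataset (four blobs at hand-picked centers), the range K=2..8, and the variable names are assumptions chosen for illustration, not part of the lesson; the scores are printed rather than plotted to keep the sketch short.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Illustrative data: four well-separated blobs (assumed, not from the lesson)
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.6,
    random_state=42,
)

scores = {}
for k in range(2, 9):  # the silhouette score needs at least 2 clusters
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # average over all points

for k, s in scores.items():
    print(f"K={k}: average silhouette = {s:.3f}")

# Choose the K with the highest average silhouette score
best_k = max(scores, key=scores.get)
print("Best K by silhouette:", best_k)
```

The `scores` dictionary can be passed to any plotting library to draw the average score against K.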

Examining a per-point silhouette plot, which shows the score of every point grouped by cluster, can offer deeper insight into cluster consistency. A good clustering has a high average score and similar scores across points, with few or no negative values.
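As a rough sketch of this per-point view, scikit-learn's `silhouette_samples` returns one score per point, which can then be summarized per cluster. The dataset and the choice of K=3 below are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples

# Illustrative data: three moderately separated blobs (assumed)
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 0], [3, 5]],
    cluster_std=1.0,
    random_state=0,
)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One silhouette score per data point
point_scores = silhouette_samples(X, labels)

# Summarize consistency within each cluster
for c in np.unique(labels):
    s = point_scores[labels == c]
    print(
        f"cluster {c}: n={s.size}, mean={s.mean():.3f}, "
        f"min={s.min():.3f}, share below 0={np.mean(s < 0):.1%}"
    )
```

A cluster with a low mean or a large share of negative scores is a sign that its points sit close to, or inside, a neighboring cluster.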

In summary, while WSS measures only how compact the clusters are, the silhouette score balances cohesion and separation. Using both provides a more robust approach to finding the optimal K.

What does a high average silhouette score (close to +1) indicate when evaluating clustering results?

Section 3. Chapter 4