Learn Cluster Analysis | Description of Track Courses

Clustering is a machine learning technique that groups similar data points into clusters based on their features or characteristics.

The main objective of clustering is to partition a dataset into subsets, or clusters, so that data points within the same cluster are more similar to each other than to data points in other clusters.
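To make the objective concrete, here is a minimal sketch that compares the average distance between points inside a cluster with the average distance between points of different clusters. The two toy clusters and the helper names `mean_within_distance` and `mean_between_distance` are hypothetical, chosen only for illustration, and Euclidean distance is assumed as the measure of similarity.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: two small groups of 2-D points (not taken from any real dataset)
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
cluster_b = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]


def mean_within_distance(points):
    """Average Euclidean distance over all pairs of points in one cluster."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between_distance(points_a, points_b):
    """Average Euclidean distance over all pairs that span the two clusters."""
    total = sum(dist(p, q) for p in points_a for q in points_b)
    return total / (len(points_a) * len(points_b))


within_a = mean_within_distance(cluster_a)
within_b = mean_within_distance(cluster_b)
between = mean_between_distance(cluster_a, cluster_b)

print(f"within A: {within_a:.2f}, within B: {within_b:.2f}, between: {between:.2f}")
```

In a good partition, the within-cluster averages should be clearly smaller than the between-cluster average, as they are for this toy data.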

Applications of clustering

Example

Consider the Iris dataset, which contains four measurements (sepal length, sepal width, petal length, and petal width) for iris flowers belonging to three species: Setosa, Versicolor, and Virginica.

The goal of the clustering task is to group similar iris flowers together based on their attribute measurements without using the species labels.


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset
data = load_iris()
X = data.data

# Standardize the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

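# Create a KMeans clustering model with three clusters
# n_init is set explicitly because its default differs across scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)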

# Fit the model to the scaled data
kmeans.fit(X_scaled)

# Retrieve the cluster label assigned to each data point during fitting
labels = kmeans.labels_

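# Create a colormap with one discrete color per cluster
cmap = plt.get_cmap('viridis', 3)

# Visualize the clusters in 2D using the first two features (Sepal Length and Sepal Width)
# The original, unscaled values are plotted so the axes stay in centimeters
plt.figure(figsize=(10, 6))
for i in range(3):
    cluster_data = X[labels == i]
    # Pass an explicit color: scatter ignores cmap unless c is also given
    plt.scatter(cluster_data[:, 0], cluster_data[:, 1], label=f'Cluster {i}', color=cmap(i))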
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Iris Flower Clustering using K-Means')
plt.legend()
plt.show()
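Because the species labels were not used for clustering, they can serve afterwards as an external check on how well the clusters match the actual species. The sketch below is one possible way to do this, not part of the original example: it assumes scikit-learn's `adjusted_rand_score` and `contingency_matrix` as the comparison tools and repeats the same preprocessing and model settings so that it runs on its own.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import StandardScaler

# Load the data and standardize the features, as in the example above
iris = load_iris()
X_scaled = StandardScaler().fit_transform(iris.data)

# Cluster without looking at the species labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

# The species labels (iris.target) are used for evaluation only.
# The adjusted Rand index is 1.0 for a perfect match and close to 0.0 for a random one.
ari = adjusted_rand_score(iris.target, labels)
print(f"Adjusted Rand index: {ari:.2f}")

# Rows are species, columns are clusters; each cell counts the flowers in both
table = contingency_matrix(iris.target, labels)
print(table)
```

Cluster numbers are arbitrary, so cluster 0 does not have to correspond to the first species. The adjusted Rand index is unaffected by this renumbering, and the contingency table shows which cluster captures which species.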

Section 1. Chapter 5