Cluster Analysis
Clustering is a machine learning technique that groups similar data points into clusters based on their features or characteristics.
The main objective of clustering is to partition a dataset into subsets, or clusters, such that data points within the same cluster are more similar to one another than to data points in other clusters.
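To make "more similar within a cluster than across clusters" concrete, here is a minimal sketch that uses Euclidean distance as the similarity measure. The two synthetic groups, their centers, and the helper `mean_pairwise_distance` are assumptions made up for illustration; they are not part of the lesson's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic groups of 2-D points (assumed toy data for illustration only)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))


def mean_pairwise_distance(p, q):
    """Average Euclidean distance between every point in p and every point in q."""
    diffs = p[:, None, :] - q[None, :, :]
    return float(np.linalg.norm(diffs, axis=-1).mean())


# Within-group averages include each point's zero distance to itself,
# which slightly lowers them; that does not change the comparison here.
within_a = mean_pairwise_distance(group_a, group_a)
within_b = mean_pairwise_distance(group_b, group_b)
between = mean_pairwise_distance(group_a, group_b)

print(f"within A: {within_a:.2f}, within B: {within_b:.2f}, between: {between:.2f}")
```

A good partition of this toy data keeps the within-group averages well below the between-group average. Other distance or similarity measures could be substituted depending on the data.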
Applications of clustering
Example
Let's consider the Iris dataset, which contains measurements of four attributes (sepal length, sepal width, petal length, and petal width) of iris flowers belonging to three species: Setosa, Versicolor, and Virginica.
The goal of the clustering task is to group similar iris flowers together based on their attribute measurements without using the species labels.
```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset
data = load_iris()
X = data.data

# Standardize the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Create a KMeans clustering model
kmeans = KMeans(n_clusters=3, random_state=42)

# Fit the model to the scaled data
kmeans.fit(X_scaled)

# Retrieve the cluster label assigned to each data point
labels = kmeans.labels_

# Create a colormap with one color per cluster
cmap = plt.get_cmap('viridis', 3)

# Visualize the clusters in 2D using the first two features
# (Sepal Length and Sepal Width), plotted in their original units
plt.figure(figsize=(10, 6))
for i in range(3):
    cluster_data = X[labels == i]
    # Pass an explicit color; `cmap=` alone is ignored when no `c` values are given
    plt.scatter(cluster_data[:, 0], cluster_data[:, 1], label=f'Cluster {i}', color=cmap(i))

plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Iris Flower Clustering using K-Means')
plt.legend()
plt.show()
```
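Because the species labels were not used during clustering, they can serve afterwards as a rough sanity check. The sketch below is one possible way to do that, not part of the lesson's own workflow: it repeats the scaling and K-Means steps, then compares the clusters with the species using scikit-learn's `adjusted_rand_score` and `contingency_matrix`. Setting `n_init=10` is an assumption added here so the result does not depend on the default of the installed scikit-learn version.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X_scaled = StandardScaler().fit_transform(iris.data)

# n_init is set explicitly (an assumption of this sketch) to avoid
# version-dependent defaults
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

# Adjusted Rand index: 1.0 means the two groupings are identical,
# values near 0 mean agreement no better than chance
ari = adjusted_rand_score(iris.target, labels)
print(f"Adjusted Rand index: {ari:.2f}")

# Rows are the true species, columns are the cluster ids
table = contingency_matrix(iris.target, labels)
for species, row in zip(iris.target_names, table):
    print(f"{species:<12}{row}")
```

Cluster ids are arbitrary, so cluster 0 need not correspond to the first species; read the table row by row to see which cluster dominates each species. On this dataset, Setosa typically separates cleanly while Versicolor and Virginica overlap.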