Is 4 the Optimal Number of Clusters? | Spectral Clustering
Cluster Analysis in Python
Is 4 the Optimal Number of Clusters?

The last chart left the question of the optimal number of clusters unanswered. The value at 4 clusters looks like a local maximum, but the value at 5 is not significantly lower, so we need to consider both cases.
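The lesson does not say which score that chart plots. As a minimal sketch, assuming a silhouette-style score (where higher is better), the comparison between candidate cluster counts could be reproduced as follows. The helper name `silhouette_by_k` and the fixed `random_state` are illustrative assumptions, not part of the course material.

```python
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score


def silhouette_by_k(features, candidate_ks, random_state=0):
    """Map each candidate number of clusters to its silhouette score.

    Assumption: the silhouette score stands in for whatever metric
    the lesson's chart actually shows.
    """
    scores = {}
    for k in candidate_ks:
        # Same affinity as in the lesson; random_state is fixed only
        # so that repeated runs give the same labels.
        model = SpectralClustering(
            n_clusters=k,
            affinity='nearest_neighbors',
            random_state=random_state,
        )
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores
```

Calling `silhouette_by_k(data.iloc[:, 2:14], range(2, 8))` on the lesson's dataset would return one score per cluster count; two neighbouring counts with nearly equal scores are the situation described above, where both are worth inspecting visually.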

Let's look at the scatter plot of average January vs July temperatures for the case of 4 clusters.

```python
# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import SpectralClustering

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col=0)

# Create the model
model = SpectralClustering(n_clusters=4, affinity='nearest_neighbors')

# Fit the data and predict the labels
data['prediction'] = model.fit_predict(data.iloc[:, 2:14])

# Visualize the results
sns.scatterplot(x='Jan', y='Jul', hue='prediction', data=data)
plt.show()
```

The clustering seems logical: it splits the cities into disjoint groups. But what would the same chart look like for 5 clusters? That will be your task!

Task

  1. Import the SpectralClustering class from sklearn.cluster.
  2. Create a SpectralClustering model with 5 clusters using the 'nearest_neighbors' affinity.
  3. Fit the model to columns 3 through 14 of data and predict the labels. Save the result in the 'prediction' column of data.
  4. Build a seaborn scatter plot of average January (column 'Jan') vs July (column 'Jul') temperatures, colored by cluster (column 'prediction').
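
One way the steps above could be put together is sketched below. The helper name `cluster_and_label` and the fixed `random_state` are assumptions added for illustration and are not required by the task; the column slice follows the lesson's own `data.iloc[:, 2:14]`.

```python
import pandas as pd
from sklearn.cluster import SpectralClustering


def cluster_and_label(data: pd.DataFrame, n_clusters: int = 5) -> pd.DataFrame:
    """Return a copy of `data` with a 'prediction' column of cluster labels."""
    # Step 2: spectral clustering with the nearest-neighbours affinity.
    # random_state is an added assumption, used only for repeatable labels.
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity='nearest_neighbors',
        random_state=0,
    )
    # Step 3: fit on columns 3 through 14 (positions 2..13) and store the labels.
    result = data.copy()
    result['prediction'] = model.fit_predict(data.iloc[:, 2:14])
    return result
```

With `data` loaded as in the 4-cluster example, `labeled = cluster_and_label(data)` followed by `sns.scatterplot(x='Jan', y='Jul', hue='prediction', data=labeled)` and `plt.show()` should draw the 5-cluster chart. Working on a copy keeps the original frame free of the extra column, which matters if the function is called again with a different `n_clusters`.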

Section 4. Chapter 5