Is 4 the Optimal Number of Clusters?
The last chart (displayed below) left the question of the optimal number of clusters unanswered. The value 4 looks like a local maximum, but the value at 5 clusters is not significantly lower than at 4, so we need to consider both cases.
Let's look at the scatter plot of average January vs July temperatures in the case of 4 clusters.
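One way to put numbers behind such a comparison is to score several candidate cluster counts with the same metric. The sketch below is only an illustration. The lesson text does not say which metric the chart used, so the silhouette score is an assumption, and the data is a synthetic stand-in generated with make_blobs rather than the cities dataset.

```python
import warnings

from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data (assumption): 200 points with 12 features,
# mimicking 12 monthly temperature columns. Not the real cities dataset.
X, _ = make_blobs(n_samples=200, n_features=12, centers=4, random_state=0)

scores = {}
for k in range(2, 7):
    model = SpectralClustering(n_clusters=k,
                               affinity='nearest_neighbors',
                               random_state=0)
    with warnings.catch_warnings():
        # Well-separated blobs can yield a disconnected neighbor graph,
        # which only triggers a warning; it is silenced here for readability.
        warnings.simplefilter('ignore')
        labels = model.fit_predict(X)
    # Silhouette score (assumed metric): higher means better-separated clusters
    scores[k] = silhouette_score(X, labels)

for k, score in scores.items():
    print(f'{k} clusters: silhouette = {score:.3f}')
```

When two neighboring values of k score almost the same, as with 4 and 5 here, the metric alone does not settle the choice, and inspecting the resulting clusters visually is a reasonable next step.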
    # Import the libraries
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.cluster import SpectralClustering

    # Read the data
    data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col=0)

    # Create the model
    model = SpectralClustering(n_clusters=4, affinity='nearest_neighbors')

    # Fit the data and predict the labels
    data['prediction'] = model.fit_predict(data.iloc[:, 2:14])

    # Visualize the results
    sns.scatterplot(x='Jan', y='Jul', hue='prediction', data=data)
    plt.show()
The clustering seems logical: it splits the cities into distinct, non-overlapping groups. But what happens if we build the same chart for 5 clusters? That will be your task!

- Import the SpectralClustering class from sklearn.cluster.
- Create a SpectralClustering model with 5 clusters using the 'nearest_neighbors' affinity.
- Fit the model to columns 3-14 of data and predict the labels. Save the result in the 'prediction' column of data.
- Build a seaborn scatter plot of average January (column 'Jan') vs July (column 'Jul') temperatures, colored by cluster (column 'prediction').
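The phrase "columns 3-14" counts columns from 1, while iloc counts positions from 0 and excludes the end of the slice, so the matching slice is iloc[:, 2:14]. A minimal sketch of that mapping follows. It uses a hypothetical one-row frame: the 'Country' and 'Continent' column names are made up for illustration, and only 'Jan' and 'Jul' are confirmed by the lesson.

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Hypothetical layout (assumption): two descriptive columns followed by
# 12 monthly columns. 'Country' and 'Continent' are illustrative names only.
row = {'Country': 'X', 'Continent': 'Y', **{m: 0.0 for m in months}}
df = pd.DataFrame([row], index=['Sample city'])

# Zero-based positions 2..13, i.e. the 3rd through 14th columns
features = df.iloc[:, 2:14]
print(list(features.columns))
```

Under this assumed layout, the slice keeps exactly the 12 monthly columns and leaves the descriptive ones out of the clustering.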