Is 4 the Optimal Number of Clusters?

The last chart (displayed below) left the question of the optimal number of clusters unanswered. The value 4 looks like a local maximum, but the value for 5 is not significantly lower, so we need to consider both cases.
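One way to put the two candidates side by side is to fit the model once per value and compare a quality score. The sketch below is only an illustration under stated assumptions: it uses the silhouette score (the lesson does not say which metric the chart shows) and synthetic data from `make_blobs` as a stand-in for the 12 monthly temperature columns.

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the 12 monthly temperature columns (assumption,
# not the cities dataset used in this lesson)
X, _ = make_blobs(n_samples=120, n_features=12, centers=4,
                  cluster_std=3.0, random_state=0)

scores = {}
for k in (4, 5):
    # Same model settings as in the lesson; random_state is added only
    # to make this sketch reproducible
    model = SpectralClustering(n_clusters=k, affinity='nearest_neighbors',
                               random_state=0)
    labels = model.fit_predict(X)
    # Silhouette score lies in [-1, 1]; higher means better-separated clusters
    scores[k] = silhouette_score(X, labels)

for k, score in scores.items():
    print(f"{k} clusters: silhouette = {score:.3f}")
```

If the two scores are close, a single number will not settle the question, which is why inspecting the scatter plots for both values is still worthwhile.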

Let's look at the scatter plot of average January vs July temperatures for 4 clusters.


# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import SpectralClustering

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Create the model
model = SpectralClustering(n_clusters = 4, affinity = 'nearest_neighbors')

# Fit the data and predict the labels
data['prediction'] = model.fit_predict(data.iloc[:,2:14])

# Visualize the results
sns.scatterplot(x = 'Jan', y = 'Jul', hue = 'prediction', data = data)
plt.show()

The clustering seems logical: it splits the cities into disjoint groups. But what does the same chart look like for 5 clusters? That will be your task!

Task

Import the SpectralClustering class from sklearn.cluster.
Create a SpectralClustering model with 5 clusters using the 'nearest_neighbors' affinity.
Fit the model to columns 3 through 14 of data and predict the labels. Save the result in the 'prediction' column of data.
Build a seaborn scatter plot of average January (column 'Jan') vs July (column 'Jul') temperatures, colored by cluster (column 'prediction').
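The column range in the third step maps onto the `data.iloc[:, 2:14]` slice used in the example above, because `iloc` is zero-based and excludes the end index. A minimal sketch on a hypothetical one-row frame (the column names `c1` to `c15` are made up for illustration and are not the dataset's real columns):

```python
import pandas as pd

# Hypothetical 15-column frame; names c1..c15 are for illustration only
df = pd.DataFrame([range(1, 16)], columns=[f"c{i}" for i in range(1, 16)])

# Positions 2..13 (zero-based) are the 3rd through 14th columns
selected = df.iloc[:, 2:14]

print(list(selected.columns))  # c3 through c14
print(selected.shape[1])       # 12 columns
```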
