Deciding the Number of Clusters | Hierarchical Clustering
Cluster Analysis in Python
course content

1. K-Means Algorithm
2. K-Medoids Algorithm
3. Hierarchical Clustering
4. Spectral Clustering

Deciding the Number of Clusters

Well done! Let's look one more time at all the dendrograms for the weather data.

As you can see, the dendrogram for single linkage is unreadable. Average linkage most likely suggests three clusters: a horizontal line drawn between 75 and 100 intersects one blue line and two green lines. Complete and Ward linkage both suggest four clusters. For complete linkage, draw the horizontal line between 120 and 150 (it intersects two orange and two green lines); for Ward linkage, draw it between 400 and 600. Let's see what the results look like with three clusters and average linkage.
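Reading a cluster count off a dendrogram with a horizontal line can also be done numerically: the number of vertical lines the cut crosses equals the number of flat clusters at that height. The sketch below illustrates this with SciPy's `linkage` and `fcluster`. The data is synthetic (three well-separated groups), not the course's weather data, and the helper name `clusters_at_height` and the cut height of 5.0 are assumptions chosen for this example only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in for the weather data (assumption): three compact,
# well-separated groups of 20 two-dimensional points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=20.0, scale=0.5, size=(20, 2)),
])

# Build the merge tree with average linkage (the same tree a dendrogram draws).
Z = linkage(X, method="average")


def clusters_at_height(Z, height):
    """Count the flat clusters obtained by cutting the tree at `height`.

    This matches counting the vertical lines that a horizontal line
    drawn at `height` crosses in the dendrogram.
    """
    labels = fcluster(Z, t=height, criterion="distance")
    return len(np.unique(labels))


# A cut between the within-group and between-group merge heights.
n_clusters = clusters_at_height(Z, height=5.0)
print(n_clusters)
```

On real data the gap between merge heights is rarely this clean, so it can help to try several cut heights and see over which range the cluster count stays stable.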

Note that in the previous sections we considered the cases of 5 or 4 clusters. Let's see how it works with three.

Task

  1. Import the AgglomerativeClustering class from sklearn.cluster.
  2. Create an AgglomerativeClustering model object named model with 3 clusters and 'average' linkage.
  3. Fit model to the numerical data (columns 3-14) and predict the labels. Save the predicted labels as the 'prediction' column of data.
  4. For the modified DataFrame monthly_data, group the observations in the columns from col by the 'prediction' column, and calculate the mean within each group.
  5. Build a line plot of 'Month' vs 'Temp' for each value of 'Group' using the monthly_data DataFrame.
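The clustering and grouping steps of the task can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the course's solution: the DataFrame contents, the `months` column names, and the `group_means` variable are assumptions made so the example runs on its own. With the course's data you would select the numerical columns by position instead. The final plotting step is left out here.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

# Hypothetical stand-in for the weather data (assumption): 30 locations with
# 12 monthly temperature columns, drawn around three different base levels.
rng = np.random.default_rng(0)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
base = np.repeat([-20.0, 0.0, 20.0], 10)[:, None]
data = pd.DataFrame(base + rng.normal(scale=1.0, size=(30, 12)), columns=months)

# Three clusters with average linkage, as in the task.
model = AgglomerativeClustering(n_clusters=3, linkage="average")

# Fit on the numerical columns only and store the labels as a new column.
data["prediction"] = model.fit_predict(data[months])

# Mean monthly temperature within each predicted cluster.
group_means = data.groupby("prediction")[months].mean()
print(group_means.round(1))
```

Each row of `group_means` is one cluster's average yearly profile, which is the kind of summary the line plot in the last step is meant to compare.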

Section 3. Chapter 6