Deciding the Number of Clusters
Well done! Let's look one more time at all the dendrograms for the weather data.
As you can see, the single linkage method's dendrogram is unreadable. The average linkage method most likely led us to three clusters (if you draw a horizontal line between 75 and 100 you will intersect one blue line, and two green. The complete and ward linkages methods lead us to 4 clusters. For complete linkage, you can draw the horizontal line between 120 and 150 (it will intersect two orange and two green lines), and between 400 and 600 for ward linkage. Let's see what will be the results of using three clusters with average linkage.
Note, that in the previous sections we considered the cases of 5 or 4 clusters. Let's see how it will work now.
Swipe to start coding
- Import the
AgglomerativeClusteringfunction fromsklearn.cluster. - Create
AgglomerativeClusteringmodel object namedmodelwith 3 clusters and using'average'linkage. - Fit the numerical data (columns 3 - 14) to
modeland predict the labels. Save predicted labels as the'prediction'column ofdata. - For modified DataFrame
monthly_datagroup the observations of columns fromcolby'prediction'column, and calculate the mean within each group. - Build line plot
'Month'vs'Temp'for each value of'Group'usingmonthly_dataDataFrame.
Ratkaisu
Kiitos palautteestasi!
single
Kysy tekoälyä
Kysy tekoälyä
Kysy mitä tahansa tai kokeile jotakin ehdotetuista kysymyksistä aloittaaksesi keskustelumme
Tiivistä tämä luku
Explain code
Explain why doesn't solve task
Awesome!
Completion rate improved to 3.57
Deciding the Number of Clusters
Pyyhkäise näyttääksesi valikon
Well done! Let's look one more time at all the dendrograms for the weather data.
As you can see, the single linkage method's dendrogram is unreadable. The average linkage method most likely led us to three clusters (if you draw a horizontal line between 75 and 100 you will intersect one blue line, and two green. The complete and ward linkages methods lead us to 4 clusters. For complete linkage, you can draw the horizontal line between 120 and 150 (it will intersect two orange and two green lines), and between 400 and 600 for ward linkage. Let's see what will be the results of using three clusters with average linkage.
Note, that in the previous sections we considered the cases of 5 or 4 clusters. Let's see how it will work now.
Swipe to start coding
- Import the
AgglomerativeClusteringfunction fromsklearn.cluster. - Create
AgglomerativeClusteringmodel object namedmodelwith 3 clusters and using'average'linkage. - Fit the numerical data (columns 3 - 14) to
modeland predict the labels. Save predicted labels as the'prediction'column ofdata. - For modified DataFrame
monthly_datagroup the observations of columns fromcolby'prediction'column, and calculate the mean within each group. - Build line plot
'Month'vs'Temp'for each value of'Group'usingmonthly_dataDataFrame.
Ratkaisu
Kiitos palautteestasi!
single