Deciding the Number of Clusters
Well done! Let's look one more time at all the dendrograms for the weather data.
As you can see, the single linkage method's dendrogram is unreadable. The average linkage method most likely led us to three clusters (if you draw a horizontal line between 75 and 100 you will intersect one blue line, and two green. The complete and ward linkages methods lead us to 4 clusters. For complete linkage, you can draw the horizontal line between 120 and 150 (it will intersect two orange and two green lines), and between 400 and 600 for ward linkage. Let's see what will be the results of using three clusters with average linkage.
Note, that in the previous sections we considered the cases of 5 or 4 clusters. Let's see how it will work now.
Swipe to start coding
- Import the
AgglomerativeClusteringfunction fromsklearn.cluster. - Create
AgglomerativeClusteringmodel object namedmodelwith 3 clusters and using'average'linkage. - Fit the numerical data (columns 3 - 14) to
modeland predict the labels. Save predicted labels as the'prediction'column ofdata. - For modified DataFrame
monthly_datagroup the observations of columns fromcolby'prediction'column, and calculate the mean within each group. - Build line plot
'Month'vs'Temp'for each value of'Group'usingmonthly_dataDataFrame.
Oplossing
Bedankt voor je feedback!
single
Vraag AI
Vraag AI
Vraag wat u wilt of probeer een van de voorgestelde vragen om onze chat te starten.
Vat dit hoofdstuk samen
Explain code
Explain why doesn't solve task
Awesome!
Completion rate improved to 3.57
Deciding the Number of Clusters
Veeg om het menu te tonen
Well done! Let's look one more time at all the dendrograms for the weather data.
As you can see, the single linkage method's dendrogram is unreadable. The average linkage method most likely led us to three clusters (if you draw a horizontal line between 75 and 100 you will intersect one blue line, and two green. The complete and ward linkages methods lead us to 4 clusters. For complete linkage, you can draw the horizontal line between 120 and 150 (it will intersect two orange and two green lines), and between 400 and 600 for ward linkage. Let's see what will be the results of using three clusters with average linkage.
Note, that in the previous sections we considered the cases of 5 or 4 clusters. Let's see how it will work now.
Swipe to start coding
- Import the
AgglomerativeClusteringfunction fromsklearn.cluster. - Create
AgglomerativeClusteringmodel object namedmodelwith 3 clusters and using'average'linkage. - Fit the numerical data (columns 3 - 14) to
modeland predict the labels. Save predicted labels as the'prediction'column ofdata. - For modified DataFrame
monthly_datagroup the observations of columns fromcolby'prediction'column, and calculate the mean within each group. - Build line plot
'Month'vs'Temp'for each value of'Group'usingmonthly_dataDataFrame.
Oplossing
Bedankt voor je feedback!
single