Cluster Analysis in Python
Comparing the Dynamics
That's an interesting result! The yearly average temperatures differ noticeably for 3 of the clusters (47.3, 60.9, and 79.24), so this looks like a good split.
Now let's visualize the monthly dynamics of average temperatures across clusters and compare the result with the 5 clusters produced by the K-Means algorithm. The respective line plot is below.
Visualize the monthly temperature dynamics across clusters. Follow these steps:
- Import the KMedoids class from sklearn_extra.cluster.
- Create a KMedoids object named model with 4 clusters.
- Fit model to columns 3-15 of data (these are positions, not indices).
- Add the 'prediction' column to data, containing the labels predicted by model.
- Calculate the monthly averages using data and save the result in the d DataFrame:
  - Group the observations by the 'prediction' column.
  - Calculate the mean values.
  - Stack the columns into indices (already done).
  - Reset the indices.
- Assign ['Group', 'Month', 'Temp'] as the column names of d.
- Build a lineplot with 'Month' on the x-axis and 'Temp' on the y-axis for each 'Group' of the d DataFrame (i.e. a separate line and color for each 'Group').
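The steps above can be sketched roughly as follows. This is a minimal illustration, not the course's reference solution. The helper names (add_kmedoids_labels, monthly_means, plot_dynamics) are hypothetical, and reading "columns 3-15" as the slice 2:15 is an assumption that should be adjusted to the actual layout of data. The plotting step assumes seaborn and matplotlib are installed.

```python
import pandas as pd


def add_kmedoids_labels(data, cols=slice(2, 15), n_clusters=4):
    """Fit KMedoids on the columns at the given positions and store the labels."""
    # Imported lazily so the reshaping helper works without scikit-learn-extra.
    from sklearn_extra.cluster import KMedoids

    model = KMedoids(n_clusters=n_clusters)
    # ASSUMPTION: "columns 3-15" is read as 1-based positions, i.e. iloc[:, 2:15].
    model.fit(data.iloc[:, cols])
    data["prediction"] = model.labels_
    return data


def monthly_means(data, month_cols):
    """Return a long-format DataFrame of mean temperature per cluster and month."""
    # Keep only the monthly columns and the labels so mean() sees numeric data.
    subset = data[list(month_cols) + ["prediction"]]
    means = subset.groupby("prediction").mean()
    # stack() moves the month columns into the index; reset_index() flattens it.
    d = means.stack().reset_index()
    d.columns = ["Group", "Month", "Temp"]
    return d


def plot_dynamics(d):
    """Draw one line (and color) per cluster."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.lineplot(x="Month", y="Temp", hue="Group", data=d)
    plt.show()
```

With a data DataFrame already loaded, the helpers would be chained as add_kmedoids_labels(data), then d = monthly_means(data, month_names), then plot_dynamics(d), where month_names is a (hypothetical) list of the monthly column names.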