Contenido del Curso
Cluster Analysis in Python
Cluster Analysis in Python
What is a K-Medoids Algorithm?
You did well in the first section! Now it's time to learn a new algorithm!
So, unlike centroid, medoid has to be a data point - which is the key difference. In the case of large and 'grouped' data there most likely will be an insignificant difference.
The K-Medoids algorithm works similarly to the K-Means, but in the first step, the random points have to be data points. These are medoids. Then, each data point is associated with the closest medoid by some metric. Then, the medoids swap with some data points, and the cost function (sum of the variances for all the points within a cluster) compares with the one in the previous step until we get the minimum possible values.
How to implement the K-Medoids algorithm using Python? The same way you
did with the K-Means algorithm! The key function there is KMedoids
from the
sklearn_extra.cluster
library. The algorithm is the next:
- Create a
KMedoids
model assigned to a certain variable. - Compute
K-Medoids
clustering using the.fit()
method ofKMedoids
object with data set as a parameter. - Predict the labels using the fitted model by applying the
.predict()
function to theKMedoids
object with the data set as a parameter. - (not necessary) Visualize the result of clustering.
For example, let's try to cluster the points using the K-Medoids method. The scatter plot for the data points is below.
Swipe to start coding
Implement the K-Medoids algorithm to divide the points into two groups. To do it, follow the next steps:
- Import
KMedoids
fromsklearn_extra.cluster
. - Create a
KMedoids
model object with 2 clusters assigned to themodel
variable. - Fit the
data
to themodel
. - Predict the labels for
data
usingmodel
. Save the labels within theprediction
variable. - Add
'prediction'
column todata
filled withprediction
values. - Build a scatter plot with
'x'
column on the x-axis,'y'
column on the y-axis, and a'prediction'
column as the color of the point. Do not forget to apply the respective function ofplt
to display the plot.
Solución
¡Gracias por tus comentarios!
What is a K-Medoids Algorithm?
You did well in the first section! Now it's time to learn a new algorithm!
So, unlike centroid, medoid has to be a data point - which is the key difference. In the case of large and 'grouped' data there most likely will be an insignificant difference.
The K-Medoids algorithm works similarly to the K-Means, but in the first step, the random points have to be data points. These are medoids. Then, each data point is associated with the closest medoid by some metric. Then, the medoids swap with some data points, and the cost function (sum of the variances for all the points within a cluster) compares with the one in the previous step until we get the minimum possible values.
How to implement the K-Medoids algorithm using Python? The same way you
did with the K-Means algorithm! The key function there is KMedoids
from the
sklearn_extra.cluster
library. The algorithm is the next:
- Create a
KMedoids
model assigned to a certain variable. - Compute
K-Medoids
clustering using the.fit()
method ofKMedoids
object with data set as a parameter. - Predict the labels using the fitted model by applying the
.predict()
function to theKMedoids
object with the data set as a parameter. - (not necessary) Visualize the result of clustering.
For example, let's try to cluster the points using the K-Medoids method. The scatter plot for the data points is below.
Swipe to start coding
Implement the K-Medoids algorithm to divide the points into two groups. To do it, follow the next steps:
- Import
KMedoids
fromsklearn_extra.cluster
. - Create a
KMedoids
model object with 2 clusters assigned to themodel
variable. - Fit the
data
to themodel
. - Predict the labels for
data
usingmodel
. Save the labels within theprediction
variable. - Add
'prediction'
column todata
filled withprediction
values. - Build a scatter plot with
'x'
column on the x-axis,'y'
column on the y-axis, and a'prediction'
column as the color of the point. Do not forget to apply the respective function ofplt
to display the plot.
Solución
¡Gracias por tus comentarios!