ML Introduction with scikit-learn
Challenge: Creating a Pipeline
In this challenge, you need to put all the preprocessing steps we did together into one pipeline. The dataset is the initial penguins.csv file we started from.
The first step is to remove two useless rows. Then you will have to create a pipeline containing encoding, imputing, and scaling.
You need to encode only two columns, 'sex' and 'island'. Since you do not want to encode the entire X, you must use a ColumnTransformer. Afterward, apply the SimpleImputer and StandardScaler to the entire X.
Here is a reminder of the make_column_transformer() and make_pipeline() functions you will use.
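As a quick refresher, the sketch below shows one way the two functions are typically called. The toy DataFrame and its column names ('color', 'weight') are made up for illustration and are not part of the course dataset. Passing sparse_threshold=0 is an assumption made here so the encoded output is always a dense array that StandardScaler can center.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data, only for illustrating the two functions
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "weight": [1.0, 2.0, 3.0, 4.0],
})

# make_column_transformer takes (transformer, columns) tuples.
# remainder="passthrough" keeps the columns that are not listed;
# sparse_threshold=0 forces a dense output array.
ct = make_column_transformer(
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",
    sparse_threshold=0,
)

# make_pipeline takes the steps in the order they should run
pipe = make_pipeline(ct, StandardScaler())

result = pipe.fit_transform(df)
print(result.shape)
```

Without remainder="passthrough", the ColumnTransformer would drop every column that is not listed, which is its default behavior.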
- Import the correct function for creating a pipeline.
- Make a ColumnTransformer with the OneHotEncoder applied only to columns 'sex' and 'island'.
- Make sure that all other columns remain untouched.
- Create a pipeline containing the ct you just created, a SimpleImputer that fills in missing values with the most frequent value, and a StandardScaler as the last step.
- Transform X using the pipe you created.
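The steps above could be sketched roughly as follows. This is not the course's reference solution: the small DataFrame is a hand-made stand-in for X (the real penguins.csv is not loaded, and the row-removal step is skipped), its numeric column names are assumptions, and sparse_threshold=0 is added here so the encoder output stays dense for the later steps.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for X; missing values appear only in numeric columns
X = pd.DataFrame({
    "island": ["Biscoe", "Dream", "Torgersen", "Dream", "Biscoe"],
    "sex": ["MALE", "FEMALE", "FEMALE", "MALE", "FEMALE"],
    "body_mass_g": [3750.0, 3800.0, np.nan, 3800.0, 4200.0],
    "flipper_length_mm": [181.0, np.nan, 195.0, 181.0, 210.0],
})

# Encode only 'sex' and 'island'; pass every other column through unchanged
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
    sparse_threshold=0,  # assumption: always return a dense array
)

# Encoding first, then imputing with the most frequent value, then scaling
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
)

X_transformed = pipe.fit_transform(X)
print(X_transformed.shape)
```

In the transformed array the one-hot columns come first, followed by the passed-through numeric columns, since ColumnTransformer places the remainder after the listed transformers.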