Challenge: Creating a Pipeline
In this challenge, combine all preprocessing steps into a single pipeline using the original penguins.csv dataset.
- Remove the two rows with insufficient data.
- Build a pipeline that includes encoding, imputing, and scaling.
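The row-removal step might look like the sketch below. The small DataFrame is a made-up stand-in for penguins.csv (its values are not real data), and the rule that a row has "insufficient data" when it is missing more than one value is an assumption, not something the challenge states.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for penguins.csv; the values are invented for illustration.
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap"],
    "island": ["Torgersen", "Torgersen", "Biscoe", "Dream"],
    "culmen_length_mm": [39.1, np.nan, 46.1, 49.5],
    "body_mass_g": [3750.0, np.nan, 4500.0, np.nan],
    "sex": ["MALE", np.nan, "FEMALE", "FEMALE"],
})

# Assumption: a row has "insufficient data" if it is missing two or more values.
# Count the NaNs per row and keep only the rows below that threshold.
df_clean = df[df.isna().sum(axis=1) < 2]

print(df_clean.shape)
```

Rows with a single missing value are kept on purpose, since the imputer in the pipeline can fill those in later.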
You need to encode only two columns, 'sex' and 'island'. Since you do not want to encode the entire X, you must use a ColumnTransformer. Afterward, apply the SimpleImputer and StandardScaler to the entire X.
You will use the make_column_transformer() and make_pipeline() functions.
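As a quick refresher, here is a minimal sketch of how the two helpers fit together. The toy DataFrame and its column names ('color', 'size') are hypothetical and unrelated to the penguins data.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data, used only to illustrate the two helper functions.
X = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1.0, 2.0, 3.0]})

# make_column_transformer takes (transformer, columns) tuples.
# remainder="passthrough" keeps the columns that are not listed;
# the default, remainder="drop", would discard them.
ct = make_column_transformer(
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",
)

# make_pipeline chains the steps in the order given and names them automatically.
pipe = make_pipeline(ct, StandardScaler())

X_out = pipe.fit_transform(X)
print(X_out.shape)
```

Each step receives the output of the previous one, so the scaler here sees the one-hot columns plus the untouched 'size' column.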
Task
- Import the correct function for creating a pipeline.
- Make a ColumnTransformer with the OneHotEncoder applied only to columns 'sex' and 'island'.
- Make sure that all other columns remain untouched.
- Create a pipeline containing the ct you just created, a SimpleImputer that fills in missing values with the most frequent value, and a StandardScaler as the last step.
- Transform X using the pipe you created.
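One way the steps above could come together is sketched below. This is not the course's official solution. The feature matrix X is a small made-up stand-in for the penguins features, with a missing value placed only in a numeric column to keep the example simple.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for X; the values are invented for illustration.
X = pd.DataFrame({
    "island": ["Torgersen", "Biscoe", "Dream", "Biscoe"],
    "culmen_length_mm": [39.1, 46.1, np.nan, 46.1],
    "body_mass_g": [3750.0, 4500.0, 3800.0, 4700.0],
    "sex": ["MALE", "FEMALE", "FEMALE", "MALE"],
})

# Encode only 'sex' and 'island'; pass every other column through unchanged.
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
)

# Encoding first, then imputing with the most frequent value, then scaling.
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
)

X_transformed = pipe.fit_transform(X)
print(X_transformed.shape)
```

The encoded columns come first in the output, followed by the passed-through numeric columns, and every column is standardized by the final step.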