Challenge: Creating a Pipeline
In this challenge, combine all preprocessing steps into a single pipeline using the original penguins.csv dataset.
- Remove the two rows with insufficient data.
- Build a pipeline that includes encoding, imputing, and scaling.
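The row-removal step can be sketched as follows. This is a minimal illustration on a hypothetical four-row stand-in for penguins.csv, and it assumes that "insufficient data" means a row with two or more missing values; the real file and the intended threshold may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for penguins.csv (the real file has more rows and columns).
df = pd.DataFrame({
    "island": ["Torgersen", "Biscoe", None, "Dream"],
    "sex": ["MALE", None, None, "FEMALE"],
    "body_mass_g": [3750.0, 3800.0, np.nan, 4200.0],
})

# Count missing values per row and keep rows with fewer than two.
# The threshold of 2 is an assumption, not something the challenge states.
missing_per_row = df.isna().sum(axis=1)
df_clean = df[missing_per_row < 2]

print(df_clean.shape)
```

Here only the third row, which is missing every value, is dropped; the row missing just 'sex' is kept so the imputer can fill it in later.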
You need to encode only two columns, 'sex' and 'island'. Since you do not want to encode the entire X, you must use a ColumnTransformer. Afterward, apply the SimpleImputer and StandardScaler to the entire X.
Here is a reminder of the make_column_transformer() and make_pipeline() functions you will use.
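As a rough refresher, the sketch below shows how the two helpers are typically called. The toy DataFrame and its column names ('color', 'size') are invented for illustration and are not part of the penguins data.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, made up for this reminder only.
toy = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": [1.0, 2.0, 3.0],
})

# make_column_transformer takes (transformer, columns) tuples.
# remainder="passthrough" keeps the columns that are not listed;
# sparse_threshold=0 forces a dense array so later steps can consume it.
ct = make_column_transformer(
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",
    sparse_threshold=0,
)

# make_pipeline chains the steps in order and names them automatically
# after their lowercased class names.
pipe = make_pipeline(ct, StandardScaler())
step_names = [name for name, _ in pipe.steps]
print(step_names)

result = pipe.fit_transform(toy)
print(result.shape)
```

Unlike the Pipeline and ColumnTransformer constructors, these helper functions do not require you to name each step yourself.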
Task
- Import the correct function for creating a pipeline.
- Make a ColumnTransformer with the OneHotEncoder applied only to columns 'sex' and 'island'.
- Make sure that all other columns remain untouched.
- Create a pipeline containing the ct you just created, a SimpleImputer that fills in missing values with the most frequent value, and a StandardScaler as the last step.
- Transform X using the pipe you created.
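One way the steps above might fit together is sketched below. It is not the course's reference solution: X here is a small hypothetical frame with penguins-like column names, and missing values are placed only in the numeric columns to keep the example simple.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature matrix; the real X comes from penguins.csv.
X = pd.DataFrame({
    "island": ["Torgersen", "Biscoe", "Dream", "Biscoe"],
    "sex": ["MALE", "FEMALE", "FEMALE", "MALE"],
    "body_mass_g": [3750.0, np.nan, 4200.0, 5000.0],
    "flipper_length_mm": [181.0, 186.0, np.nan, 210.0],
})

# Encode only 'sex' and 'island'; pass every other column through unchanged.
# sparse_threshold=0 forces dense output, which StandardScaler needs
# when it centers the data.
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
    sparse_threshold=0,
)

# Encoding first, then imputing with the most frequent value, then scaling.
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
)

X_transformed = pipe.fit_transform(X)
print(X_transformed.shape)
```

In this sketch the two encoded columns expand into five one-hot columns (two for 'sex', three for 'island'), so the transformed array has seven columns in total, with no missing values left after imputation.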