Learn Challenge: Creating a Pipeline | Pipelines
ML Introduction with scikit-learn

Challenge: Creating a Pipeline

In this challenge, combine all preprocessing steps into a single pipeline using the original penguins.csv dataset.

  1. Remove the two rows with insufficient data.
  2. Build a pipeline that includes encoding, imputing, and scaling.
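
The row-removal step can be sketched as follows. This is only an illustration on a small made-up DataFrame, not the real penguins.csv: the column names, the values, and the rule "a row with two or more missing values has insufficient data" are all assumptions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for penguins.csv (made-up values, assumed column names).
df = pd.DataFrame({
    "island": ["Torgersen", "Biscoe", "Dream", "Biscoe"],
    "culmen_length_mm": [39.1, np.nan, 46.5, 45.0],
    "body_mass_g": [3750.0, np.nan, 3500.0, 4200.0],
    "sex": ["MALE", np.nan, "FEMALE", np.nan],
})

# Count missing values per row and keep rows with fewer than two.
# The threshold of two is an assumption about what "insufficient" means.
missing_per_row = df.isna().sum(axis=1)
df_clean = df[missing_per_row < 2]

print(df_clean)
```

Here the second row (three missing values) is dropped, while the last row (one missing value) is kept so the imputer can fill it in later.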

You need to encode only two columns, 'sex' and 'island'. Since you do not want to encode the entire X, you must use a ColumnTransformer. Afterward, apply the SimpleImputer and StandardScaler to the entire X.

Here is a reminder of the make_column_transformer() and make_pipeline() functions you will use.
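
As a rough reminder of how the two helpers fit together, here is a minimal sketch on a made-up two-column DataFrame (the names `X_demo`, `ct_demo`, and `pipe_demo` are hypothetical and not part of the challenge). `make_column_transformer()` takes `(transformer, columns)` tuples, and `make_pipeline()` chains steps in the order given.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up data: one categorical column and one numeric column.
X_demo = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1.0, 2.0, 3.0]})

# Encode only 'color'; remainder='passthrough' keeps the unlisted
# columns unchanged instead of dropping them (the default).
ct_demo = make_column_transformer(
    (OneHotEncoder(), ["color"]),
    remainder="passthrough",
)

# Steps run in order: first the column transformer, then the scaler.
pipe_demo = make_pipeline(ct_demo, StandardScaler())
X_demo_out = pipe_demo.fit_transform(X_demo)

print(X_demo_out.shape)
```

The output has two one-hot columns for `color` followed by the passed-through `size` column.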

Task


  1. Import the correct function for creating a pipeline.
  2. Make a ColumnTransformer with the OneHotEncoder applied only to columns 'sex' and 'island'.
  3. Make sure that all other columns remain untouched.
  4. Create a pipeline containing the ct you just created, a SimpleImputer that fills in missing values with the most frequent value, and a StandardScaler as the last step.
  5. Transform X using the pipe you created.

Solution
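
One possible way to carry out the steps above is sketched below. It is not the course's official solution. It runs on a small made-up DataFrame instead of the real penguins.csv, so the column names, the values, and the "two or more missing values" rule are assumptions. In this toy data the missing values that remain after row removal are in a numeric column only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for penguins.csv (made-up values, assumed column names).
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap", "Gentoo"],
    "island": ["Torgersen", "Torgersen", "Biscoe", "Dream", "Biscoe"],
    "culmen_length_mm": [39.1, np.nan, 46.5, 49.0, np.nan],
    "body_mass_g": [3750.0, np.nan, 5000.0, 3800.0, 4900.0],
    "sex": ["MALE", np.nan, "FEMALE", "FEMALE", "MALE"],
})

# Remove rows with insufficient data (assumed: two or more missing values).
df = df[df.isna().sum(axis=1) < 2]
X = df.drop(columns="species")  # 'species' is assumed to be the target

# One-hot encode only 'sex' and 'island'; pass other columns through.
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array for the later steps
)

# Encode, then impute with the most frequent value, then scale.
pipe = make_pipeline(ct, SimpleImputer(strategy="most_frequent"), StandardScaler())
X_transformed = pipe.fit_transform(X)

print(X_transformed.shape)
```

Setting `sparse_threshold=0` is a defensive choice rather than a requirement of the task: it guarantees a dense array, which `StandardScaler` needs in order to center the data.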

Section 3. Chapter 4