Learn Challenge: Creating a Complete ML Pipeline | Pipelines
ML Introduction with scikit-learn

Challenge: Creating a Complete ML Pipeline

Now create a pipeline that includes a final estimator. Once fitted, it becomes a trained prediction pipeline that can generate predictions for new instances using the .predict() method.

Since a predictor requires the target variable y, encode it separately from the pipeline built for X. Use LabelEncoder to encode the target.
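As a minimal sketch of encoding the target, the snippet below fits a LabelEncoder on a small hand-written series. The values in y and the name label_enc are assumptions made for illustration; in the challenge, y comes from the provided df.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical target values, using the three species names from the lesson
y = pd.Series(['Adelie', 'Gentoo', 'Chinstrap', 'Adelie'])

label_enc = LabelEncoder()
# fit_transform learns the distinct labels and maps each one to an integer code
y_encoded = label_enc.fit_transform(y)

print(label_enc.classes_)  # learned labels, stored in sorted order
print(y_encoded)           # one integer code per element of y
```

LabelEncoder is meant for the target only, which is why it stays outside the pipeline that transforms X.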

Note

Since the predictions are encoded as 0, 1, or 2, the .inverse_transform() method of LabelEncoder can be used to convert them back to the original labels: 'Adelie', 'Chinstrap', or 'Gentoo'.
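The decoding step might look like the sketch below. The array encoded_predictions is a made-up stand-in for the output of .predict(), and label_enc is an assumed variable name.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Fit the encoder on the three species names mentioned in the lesson
label_enc = LabelEncoder()
label_enc.fit(['Adelie', 'Chinstrap', 'Gentoo'])

# Hypothetical encoded predictions, standing in for pipeline.predict(X)
encoded_predictions = np.array([2, 0, 1, 0])

# inverse_transform maps integer codes back to the original string labels
decoded = label_enc.inverse_transform(encoded_predictions)
print(decoded)
```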

Task

You are given a DataFrame named df that contains penguin data. Your task is to build and train a complete machine learning pipeline that preprocesses the data and applies a KNeighborsClassifier model.

  1. Encode the target variable y using the LabelEncoder class.
  2. Create a ColumnTransformer named ct that applies a OneHotEncoder to the 'island' and 'sex' columns, while leaving the other columns unchanged (remainder='passthrough').
  3. Create a pipeline that includes the following steps in order:
    • The ColumnTransformer you defined (ct);
    • A SimpleImputer with the strategy parameter set to 'most_frequent';
    • A StandardScaler for feature scaling;
    • A KNeighborsClassifier as the final model.
  4. Train the pipeline on the features X and target y.
  5. Generate predictions for X using the trained pipeline and print the decoded class names.
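The steps above can be sketched as follows. This is one possible arrangement under stated assumptions, not the course's reference solution: the toy df is invented for illustration, and the column names 'species', 'bill_length_mm', and 'body_mass_g', as well as the step name 'enc', are assumptions that do not come from the lesson.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Toy stand-in for df: every value here is invented for illustration only
df = pd.DataFrame({
    'species': ['Adelie'] * 3 + ['Chinstrap'] * 3 + ['Gentoo'] * 3,
    'island': ['Torgersen', 'Biscoe', 'Dream', 'Dream', 'Dream',
               'Dream', 'Biscoe', 'Biscoe', 'Biscoe'],
    'sex': ['MALE', 'FEMALE', 'FEMALE', 'FEMALE', 'MALE',
            'MALE', 'FEMALE', 'MALE', 'FEMALE'],
    'bill_length_mm': [39.1, 39.5, np.nan, 46.5, 50.0, 51.3, 46.1, 50.0, 48.7],
    'body_mass_g': [3750.0, 3800.0, 3250.0, 3500.0, 3900.0,
                    3650.0, 4500.0, 5700.0, 4450.0],
})

# Split features and target ('species' as the target column is an assumption)
X = df.drop('species', axis=1)
y = df['species']

# 1. Encode the target separately from the feature pipeline
label_enc = LabelEncoder()
y_encoded = label_enc.fit_transform(y)

# 2. One-hot encode the categorical columns, pass the rest through unchanged
ct = ColumnTransformer(
    [('enc', OneHotEncoder(), ['island', 'sex'])],
    remainder='passthrough',
)

# 3. Chain preprocessing and the final estimator in the required order
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy='most_frequent'),
    StandardScaler(),
    KNeighborsClassifier(),
)

# 4. Train on the features and the encoded target
pipe.fit(X, y_encoded)

# 5. Predict, then decode the integer codes back to class names
y_pred = pipe.predict(X)
decoded = label_enc.inverse_transform(y_pred)
print(decoded)
```

The toy frame has nine rows because KNeighborsClassifier uses five neighbors by default and needs at least that many training samples. The imputer comes after the ColumnTransformer so that it works on an all-numeric array.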

Section 3. Chapter 6