ML Introduction with scikit-learn
Challenge: Putting It All Together
In this challenge, you will apply everything you have learned throughout the course, from data preprocessing to training and evaluating the model.
Task
- Encode the target.
- Split the data so that 33% is used for the test set and the remainder for the training set.
- Make a ColumnTransformer to encode only the 'island' and 'sex' columns. Make sure the other columns remain untouched. Use a proper encoder for nominal data.
- Fill the gaps in param_grid to try the following values for the number of neighbors: [1, 3, 5, 7, 9, 12, 15, 20, 25].
- Create a GridSearchCV object with KNeighborsClassifier as the model.
- Construct a pipeline that begins with ct as the first step, followed by imputation using the most frequent value and standardization, and concludes with GridSearchCV as the final estimator.
- Train the model using the pipeline on the training set.
- Evaluate the model on the test set (print its score).
- Get the predicted target for X_test.
- Print the best estimator found by grid_search.
Solution
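The course's own solution is not included on this page, so what follows is only a minimal sketch of one way the task steps could be put together. It assumes a small synthetic, penguin-like DataFrame as a stand-in for the course dataset: the 'species' target, the numeric column names, and all values are hypothetical. The names ct, param_grid, grid_search, and X_test come from the task; everything else (pipe, rng, the random seed) is illustrative. OneHotEncoder is used as the encoder for nominal data, and sparse_threshold=0 is an added assumption that keeps the transformer output dense for StandardScaler.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical stand-in data; the real course dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 150
species = rng.choice(["Adelie", "Chinstrap", "Gentoo"], size=n)
offset = pd.Series(species).map(
    {"Adelie": 0.0, "Chinstrap": 5.0, "Gentoo": 10.0}
).to_numpy()
df = pd.DataFrame({
    "species": species,
    "island": rng.choice(["Biscoe", "Dream", "Torgersen"], size=n),
    "sex": rng.choice(["MALE", "FEMALE"], size=n),
    "culmen_length_mm": 40 + offset + rng.normal(0, 1, n),
    "body_mass_g": 3500 + 100 * offset + rng.normal(0, 50, n),
})
# Knock out a few numeric values so the imputation step has work to do.
df.loc[rng.choice(n, size=10, replace=False), "body_mass_g"] = np.nan

# Encode the target.
X = df.drop(columns="species")
y = LabelEncoder().fit_transform(df["species"])

# Use 33% of the data for the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

# One-hot encode only 'island' and 'sex'; pass the other columns through.
ct = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["island", "sex"]),
    remainder="passthrough",
    sparse_threshold=0,  # assumption: force dense output for StandardScaler
)

# Candidate values for the number of neighbors.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 12, 15, 20, 25]}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid)

# ct -> most-frequent imputation -> standardization -> grid search.
pipe = make_pipeline(
    ct,
    SimpleImputer(strategy="most_frequent"),
    StandardScaler(),
    grid_search,
)

pipe.fit(X_train, y_train)            # train on the training set
score = pipe.score(X_test, y_test)    # accuracy on the test set
print(score)
y_pred = pipe.predict(X_test)         # predicted target for X_test
print(grid_search.best_estimator_)    # best KNeighborsClassifier found
```

Because the pipeline fits its final step in place, grid_search can be inspected directly after pipe.fit. With the real course data, only the data-loading part at the top would need to change.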
Section 4. Chapter 10