In this challenge, you will combine all the preprocessing steps covered so far **into one pipeline**. The dataset is the original `penguins.csv` file we started from.

The first step is to **remove two useless rows**. Then you will have to create a pipeline containing encoding, imputing, and scaling.
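The text above does not say which two rows are useless, so the sketch below rests on an assumption: that they are the rows with too little information, meaning more than one missing value. It uses a small made-up DataFrame in place of `penguins.csv`; the column names and the threshold of two are illustrative, not taken from the challenge.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; in the challenge, df would be loaded from penguins.csv.
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "island": ["Torgersen", "Torgersen", "Biscoe", "Biscoe"],
    "culmen_length_mm": [39.1, np.nan, 46.1, np.nan],
    "body_mass_g": [3750.0, np.nan, 4500.0, np.nan],
    "sex": ["MALE", np.nan, "FEMALE", np.nan],
})

# Count the missing values in each row and keep only the rows
# with fewer than two of them (assumed threshold).
df = df[df.isna().sum(axis=1) < 2]

print(df.shape)  # (2, 5): the two mostly empty rows are gone
```

Filtering on the per-row count of missing values avoids hard-coding row indices, which would break if the file were reordered.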

You need to encode only two columns, `'sex'` and `'island'`. Since you do not want to encode the entire `X`, you must use a `ColumnTransformer`. Afterward, apply the `SimpleImputer` and `StandardScaler` to the entire `X`.

Here is a reminder of the `make_column_transformer()` and `make_pipeline()` functions you will use.
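As a quick refresher, the sketch below shows how the two functions fit together on a tiny made-up `X` (the values and the `body_mass_g` column are placeholders, not the real dataset). It assumes `OneHotEncoder` as the encoder and the default `SimpleImputer` strategy; the challenge may expect different choices.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for X.
X = pd.DataFrame({
    "island": ["Biscoe", "Dream", "Torgersen", "Dream"],
    "sex": ["MALE", "FEMALE", "FEMALE", "MALE"],
    "body_mass_g": [3750.0, np.nan, 3250.0, 4200.0],
})

# make_column_transformer takes (transformer, columns) tuples.
# remainder="passthrough" keeps the columns that are not encoded;
# sparse_threshold=0 forces a dense array, which StandardScaler needs.
ct = make_column_transformer(
    (OneHotEncoder(), ["sex", "island"]),
    remainder="passthrough",
    sparse_threshold=0,
)

# make_pipeline chains the steps in order: encode -> impute -> scale.
pipe = make_pipeline(ct, SimpleImputer(), StandardScaler())

X_transformed = pipe.fit_transform(X)
print(X_transformed.shape)  # (4, 6): 2 sex columns + 3 island columns + 1 numeric
```

Because the `ColumnTransformer` is the first step, the imputer and scaler that follow it receive the whole encoded array, which is how they end up applied to the entire `X`.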


Challenge: Creating a Pipeline
