Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Removing Duplicates | Data Cleaning
Preprocessing Data
course content

Course Content

Preprocessing Data

Preprocessing Data

1. Data Exploration
2. Data Cleaning
3. Data Validation
4. Normalization & Standardization
5. Data Encoding

bookRemoving Duplicates

To remove the duplicate rows, simply use function drop_duplicates(). To change the current dataframe, add inplace=True.

123
new_data = data.drop_duplicates() # data is not modified # or data.drop_duplicates(inplace=True) # data is modified
copy

Task

The planets dataset is given to you. Remove the duplicates and then check the new shape of dataframe. Compare it with the original shape.

Note that dataframe may have only distinct records, in this case, the shape will remain the same.

Switch to desktopSwitch to desktop for real-world practiceContinue from where you are using one of the options below
Everything was clear?

How can we improve it?

Thanks for your feedback!

Section 2. Chapter 7
toggle bottom row

bookRemoving Duplicates

To remove the duplicate rows, simply use function drop_duplicates(). To change the current dataframe, add inplace=True.

123
new_data = data.drop_duplicates() # data is not modified # or data.drop_duplicates(inplace=True) # data is modified
copy

Task

The planets dataset is given to you. Remove the duplicates and then check the new shape of dataframe. Compare it with the original shape.

Note that dataframe may have only distinct records, in this case, the shape will remain the same.

Switch to desktopSwitch to desktop for real-world practiceContinue from where you are using one of the options below
Everything was clear?

How can we improve it?

Thanks for your feedback!

Section 2. Chapter 7
toggle bottom row

bookRemoving Duplicates

To remove the duplicate rows, simply use function drop_duplicates(). To change the current dataframe, add inplace=True.

123
new_data = data.drop_duplicates() # data is not modified # or data.drop_duplicates(inplace=True) # data is modified
copy

Task

The planets dataset is given to you. Remove the duplicates and then check the new shape of dataframe. Compare it with the original shape.

Note that dataframe may have only distinct records, in this case, the shape will remain the same.

Switch to desktopSwitch to desktop for real-world practiceContinue from where you are using one of the options below
Everything was clear?

How can we improve it?

Thanks for your feedback!

To remove the duplicate rows, simply use function drop_duplicates(). To change the current dataframe, add inplace=True.

123
new_data = data.drop_duplicates() # data is not modified # or data.drop_duplicates(inplace=True) # data is modified
copy

Task

The planets dataset is given to you. Remove the duplicates and then check the new shape of dataframe. Compare it with the original shape.

Note that dataframe may have only distinct records, in this case, the shape will remain the same.

Switch to desktopSwitch to desktop for real-world practiceContinue from where you are using one of the options below
Section 2. Chapter 7
Switch to desktopSwitch to desktop for real-world practiceContinue from where you are using one of the options below
some-alt