Learn Filling In the Missing Values | Preprocessing Data
Advanced Techniques in pandas

Filling In the Missing Values

Deleting rows with missing values is not the only way to handle them. You can also replace every NaN with a chosen value, for instance the mean of the column or zero. This is useful in many cases, and the course Learning Statistics with Python covers it in more detail.
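To make the two options concrete, here is a minimal sketch that fills NaNs once with a constant and once with each column's mean. The DataFrame `toy` and its values are hypothetical stand-ins, not the course dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not the dataset used in this chapter.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, np.nan],
    "Fare": [7.25, 71.28, np.nan, 8.05],
})

# Option 1: replace every NaN with the same constant.
filled_zero = toy.fillna(0)

# Option 2: replace NaNs column by column with that column's mean.
# toy.mean() returns a Series indexed by column name, and fillna
# uses it to look up the fill value for each column.
filled_mean = toy.fillna(toy.mean())

print(filled_zero)
print(filled_mean)
```

Both calls return new DataFrames, so `toy` itself still contains its NaNs afterwards.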

The example below fills the missing values in the 'Age' column with the median of that column:

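    import pandas as pd

    # Load the dataset, using the first column as the index
    data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/titanic_2', index_col=0)

    # Replace the missing values in 'Age' with the column's median
    data['Age'].fillna(value=data['Age'].median(), inplace=True)

    # Count the missing values that remain in 'Age'
    print(data['Age'].isna().sum())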

Explanation:

.fillna(value=data['Age'].median(), inplace=True)
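  • value=data['Age'].median() - the value argument tells the .fillna() method what to put in place of each NaN. Here the method is called on the 'Age' column, so every missing value in it is replaced with the median of the column;
  • inplace=True - modifies the existing object instead of returning a new one. Without it, .fillna() returns a filled copy and leaves the original unchanged. Recent pandas versions warn when an inplace method is called on a selected column such as data['Age'], and with copy-on-write enabled the DataFrame is not updated; assigning the result back to the column avoids this.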
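The same replacement can also be written without inplace=True, by assigning the result of .fillna() back to the column. This form behaves the same across pandas versions. A minimal sketch, assuming a small hypothetical DataFrame in place of the downloaded dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'Age' column of the dataset.
data = pd.DataFrame({"Age": [22.0, np.nan, 38.0, np.nan, 35.0]})

# median() skips NaN by default: the median of 22, 35, 38 is 35.
age_median = data["Age"].median()

# fillna() returns a new Series; assigning it back updates the DataFrame.
data["Age"] = data["Age"].fillna(age_median)

# Count the missing values that remain in 'Age'.
print(data["Age"].isna().sum())  # prints 0
```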
Task

Missing values can cause problems when analyzing data. One of the most common ways to handle them is by replacing missing values with the mean of the column.

Your task is to:

  1. Replace all NaN values in the column 'Age' with the mean of that column.

    • Use the .fillna() method with the arguments value=data['Age'].mean() and inplace=True.
  2. Calculate and print the number of remaining missing values in the 'Age' column.
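As a way to check this kind of replacement, the sketch below counts the missing values before and after filling with the mean. It runs on a hypothetical toy column rather than the task's dataset, and it assigns the result back to the column instead of passing inplace=True.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the real task uses the dataset's 'Age' column.
sample = pd.DataFrame({"Age": [20.0, np.nan, 40.0, 60.0, np.nan]})

missing_before = sample["Age"].isna().sum()
mean_before = sample["Age"].mean()  # NaNs are skipped: (20 + 40 + 60) / 3

# Fill the gaps with the mean and store the result in the column.
sample["Age"] = sample["Age"].fillna(mean_before)

missing_after = sample["Age"].isna().sum()
mean_after = sample["Age"].mean()

print(missing_before, missing_after)
print(mean_before, mean_after)
```

Filling with the mean leaves the column mean unchanged, which is one quick sanity check that the right value was used.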

Section 5. Chapter 5