Filling In the Missing Values | Preprocessing Data
Advanced Techniques in pandas
1. Getting Familiar With Indexing and Selecting Data
2. Dealing With Conditions
3. Extracting Data
4. Aggregating Data
5. Preprocessing Data

Filling In the Missing Values

Deleting missing values is not the only way to handle them. You can also replace every NaN with a chosen value, such as the mean of the column or zero. This is useful when dropping rows would discard too much data. The course Learning Statistics with Python covers this topic in more detail.
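To make the idea concrete, here is a minimal sketch on a small hypothetical Series (the values are made up for illustration and are not taken from the course dataset). It shows the two replacement choices mentioned above: zero and the mean.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the course dataset
ages = pd.Series([22.0, np.nan, 30.0, np.nan, 38.0], name="Age")

# Replace every NaN with a constant
filled_with_zero = ages.fillna(0)

# Replace every NaN with the mean of the non-missing values
# (.mean() skips NaN by default)
filled_with_mean = ages.fillna(ages.mean())

print(filled_with_zero.tolist())
print(filled_with_mean.tolist())
```

Note that .fillna() returns a new Series here, so the original ages still contains its missing values.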

Look at the example of filling missing values in the column 'Age' with the median value of this column:

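    import pandas as pd

    # Load the Titanic dataset, using the first column as the index
    data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/titanic_2', index_col=0)
    # Replace missing values in 'Age' with the median of the column
    data['Age'].fillna(value=data['Age'].median(), inplace=True)
    # Count the missing values that remain in 'Age'
    print(data['Age'].isna().sum())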

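Explanation:

  • value=data['Age'].median(): the value argument tells the .fillna() method what to put in place of each NaN. Here, .fillna() is applied to the column 'Age', and every missing value is replaced with the median of that column;
  • inplace=True: modifies the object directly instead of returning a new one. In recent pandas versions that use Copy-on-Write, calling it on a selected column such as data['Age'] may raise a warning or leave data unchanged; assigning the result back (data['Age'] = data['Age'].fillna(...)) avoids this.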
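As a hedged alternative to calling inplace=True on a selected column, the sketch below assigns the filled column back to the DataFrame. The small DataFrame is a hypothetical stand-in for the Titanic data, so the values are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic data (values are made up)
data = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
})

# .median() ignores NaN by default
median_age = data["Age"].median()

# Option 1: assign the filled column back to the DataFrame
by_assignment = data.copy()
by_assignment["Age"] = by_assignment["Age"].fillna(median_age)

# Option 2: call fillna on the DataFrame itself with a column-to-value mapping;
# inplace=True then acts on the DataFrame, not on a selected column
by_mapping = data.copy()
by_mapping.fillna({"Age": median_age}, inplace=True)

missing_after = int(by_assignment["Age"].isna().sum())
print(missing_after)
```

Both options leave no missing values in 'Age' and do not depend on how pandas treats a column selected with data['Age'].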

Task

One of the most common ways to fill missing values is to replace them with the mean of the column. Your task is to replace the NaN values in the column 'Age' with the mean of that column (using the inplace=True argument), then print the number of missing values that remain in the column 'Age'.

Section 5. Chapter 5