Pandas First Steps
Finding Null Values
DataFrames often contain missing values, represented as None or NaN. When working with DataFrames, it's essential to identify these missing values because they can distort calculations, lead to inaccurate analyses, and compromise the reliability of results.
Addressing them helps preserve data integrity and makes tasks like statistical analysis and machine learning more reliable. pandas provides two methods for detecting them.
The first of these is isna(), which returns a boolean DataFrame of the same shape as the original. A True value marks a missing value, while a False value means the value is present.
To illustrate, we'll apply this method to the animals DataFrame. Each True in the result marks a missing value at the corresponding position in animals.
import pandas as pd
import numpy as np

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
animals_data = {'animal': [np.nan, 'Dog', np.nan, 'Cat', 'Parrot', None],
                'name': ['Dolly', None, 'Erin', 'Kelly', None, 'Odie']}
animals = pd.DataFrame(animals_data)

# Find missing values: True marks a missing entry (NaN or None)
missing_values = animals.isna()
print(missing_values)
The second method is isnull(). It is an alias of isna(), so the two behave identically.
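As a quick check of that equivalence, the sketch below rebuilds the animals DataFrame from the example and compares the two results. The variable names mask_isna, mask_isnull, and same_masks are illustrative and not part of the lesson.

```python
import numpy as np
import pandas as pd

# Same data as the animals example in this lesson
animals = pd.DataFrame({
    'animal': [np.nan, 'Dog', np.nan, 'Cat', 'Parrot', None],
    'name': ['Dolly', None, 'Erin', 'Kelly', None, 'Odie'],
})

# Build the boolean mask with each method
mask_isna = animals.isna()
mask_isnull = animals.isnull()

# DataFrame.equals compares shape, dtypes, and every element
same_masks = mask_isna.equals(mask_isnull)
print(same_masks)
```

Because both calls yield the same mask, which one to use is a matter of style; sticking to one of them throughout a project keeps the code consistent.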
Your objective is to pinpoint the missing values in a given DataFrame named wine_data.
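One possible approach is sketched below. The task does not show the contents of wine_data, so the DataFrame here is a hypothetical stand-in with made-up columns (alcohol, quality). Only the isna() call carries over to the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for wine_data; the real columns and values are not shown in the task
wine_data = pd.DataFrame({
    'alcohol': [9.4, np.nan, 9.8, 10.0],
    'quality': [5, 5, None, 6],
})

# Boolean DataFrame: True wherever wine_data has a missing value
missing_values = wine_data.isna()
print(missing_values)
```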