SimpleImputer
We have learned how to identify missing values. Now it is time to decide what to do with them and how.
SimpleImputer is a class from the scikit-learn library that is used to handle missing values. Calling SimpleImputer() creates an imputer object that replaces the missing values with more reasonable ones. Its main arguments are described below.
missing_values - the placeholder that marks a value as missing. The default is NaN, but as we have already said, it can be, for example, 0.
strategy - the rule used to choose the replacement value. It can be mean (default), median, most_frequent, or constant.
fill_value - the constant used to replace the missing values when strategy="constant" is chosen.
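To make the arguments more concrete, here is a minimal sketch that applies each strategy to the same column. The toy array `X`, the variable names, and the fill value `-1.0` are made up for illustration. `fit_transform()` is used as a shorthand for calling `fit()` and then `transform()` on the same data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy column with one missing entry (NaN) at index 2.
X = np.array([[1.0], [2.0], [np.nan], [2.0], [10.0]])

# strategy="mean" (the default): NaN becomes the mean of the observed values.
X_mean = SimpleImputer(missing_values=np.nan, strategy="mean").fit_transform(X)

# strategy="median": NaN becomes the median of the observed values.
X_median = SimpleImputer(strategy="median").fit_transform(X)

# strategy="most_frequent": NaN becomes the most common observed value.
X_frequent = SimpleImputer(strategy="most_frequent").fit_transform(X)

# strategy="constant": NaN becomes fill_value (an arbitrary choice here).
X_constant = SimpleImputer(strategy="constant", fill_value=-1.0).fit_transform(X)

# missing_values=0: zeros, not NaN, are treated as missing.
X_zero = np.array([[0.0], [4.0], [6.0]])
X_zero_filled = SimpleImputer(missing_values=0, strategy="mean").fit_transform(X_zero)

print(X_mean[2, 0], X_median[2, 0], X_frequent[2, 0], X_constant[2, 0])
print(X_zero_filled[0, 0])
```

With this data the observed values are 1, 2, 2 and 10, so the mean strategy fills in 3.75 while median and most_frequent both fill in 2.0. In the last case the zero is replaced by the mean of 4 and 6.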
We will look at the fit() and transform() methods in more detail later.
Let's try to fill the gaps in your small dataset. To use SimpleImputer, follow these steps:
- Import the class.
- Create an instance of the class (imputer object).
- Specify the parameters you need. Here the missing values are represented by NaN, so replace them with the constant value 15.
- Fit the imputer on your data using the fit() function.
- Impute all missing values in your data using the transform() function.
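The steps above can be sketched as follows. The task's actual dataset is not shown here, so the array `data` is a hypothetical stand-in, and the variable names are illustrative only.

```python
import numpy as np
from sklearn.impute import SimpleImputer  # step 1: import the class

# Hypothetical small dataset; NaN marks the missing entries.
data = np.array([
    [25.0, np.nan],
    [np.nan, 7.0],
    [31.0, 9.0],
])

# Steps 2 and 3: create the imputer and specify its parameters.
# Missing values are NaN and are replaced with the constant 15.
imputer = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=15)

# Step 4: fit the imputer on the data.
imputer.fit(data)

# Step 5: impute all missing values; the result is a new NumPy array.
filled = imputer.transform(data)

print(filled)
```

Note that transform() returns a new array and leaves `data` itself unchanged, so the result has to be assigned to a variable.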