Data Preprocessing
Label Encoding of the Target Variable
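Let's go straight to the main thing: label encoding does essentially the same job as the ordinal encoder, mapping each category to an integer, but with two differences:
- The methods work with different data dimensions: LabelEncoder expects a 1D array (a single target column), while OrdinalEncoder expects a 2D array (one or more feature columns);
- The order of the categories is not important for label encoding: LabelEncoder assigns codes in sorted order of the labels and offers no way to set a custom order.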
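To make the two differences concrete, here is a minimal sketch that encodes the same values with both encoders. The 'low'/'medium'/'high' categories are hypothetical and chosen only for illustration; they are not part of the course dataset.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical data: the same values as a 1D target and as a 2D feature column
y = np.array(['medium', 'low', 'high', 'low'])
X = y.reshape(-1, 1)

# LabelEncoder takes a 1D array and orders the classes by sorting them
le = LabelEncoder()
y_encoded = le.fit_transform(y)
print(le.classes_)  # ['high' 'low' 'medium']
print(y_encoded)    # [2 1 0 1]

# OrdinalEncoder takes a 2D array and accepts an explicit category order
oe = OrdinalEncoder(categories=[['low', 'medium', 'high']])
X_encoded = oe.fit_transform(X)
print(X_encoded.ravel())  # [1. 0. 2. 0.]
```

The sorted order used by LabelEncoder puts 'high' before 'low', which is harmless for a target variable because the integers serve only as class identifiers.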
How to use this method in Python:
    from sklearn.preprocessing import LabelEncoder
    import pandas as pd

    # Simple categorical variable
    fruits = pd.Series(['apple', 'orange', 'banana', 'banana', 'apple', 'orange', 'banana'])

    # Create label encoder object
    le = LabelEncoder()

    # Fit and transform the categorical variable using label encoding
    fruits_encoded = le.fit_transform(fruits)

    # Print the encoded values
    print(fruits_encoded)
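After fitting, it is often useful to check which integer was assigned to which label and to convert the codes back. The sketch below repeats the fruit example and uses the encoder's `classes_` attribute and `inverse_transform` method; the `mapping` dictionary is a helper introduced here for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

fruits = pd.Series(['apple', 'orange', 'banana', 'banana', 'apple', 'orange', 'banana'])

le = LabelEncoder()
fruits_encoded = le.fit_transform(fruits)

# classes_ stores the learned labels; each label's position is its integer code
print(list(le.classes_))  # ['apple', 'banana', 'orange']
print(fruits_encoded)     # [0 2 1 1 0 2 1]

# Illustrative helper: label -> code lookup built from classes_
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)

# inverse_transform converts the integer codes back to the original labels
fruits_decoded = le.inverse_transform(fruits_encoded)
print(list(fruits_decoded))
```

Decoding is handy when a model predicts integer classes and the results need to be reported with the original label names.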
Task
Read the dataset 'salary_and_gender.csv' and encode the output column 'Gender' with label encoding.
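One possible approach is sketched below. Because the contents of 'salary_and_gender.csv' are not shown here, the sketch reads a small in-memory stand-in instead; the 'Salary' column and all row values are assumptions made for illustration, and only the 'Gender' column name comes from the task.

```python
import io

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the real file; the actual columns and values may differ
csv_text = "Salary,Gender\n52000,Female\n48000,Male\n61000,Male\n"

# With the real dataset this would be: pd.read_csv('salary_and_gender.csv')
df = pd.read_csv(io.StringIO(csv_text))

# Fit the encoder on the 1D target column and replace it with integer codes
le = LabelEncoder()
df['Gender'] = le.fit_transform(df['Gender'])

print(df)
print(list(le.classes_))
```

Passing `df['Gender']` (a single column, hence 1D) is what makes LabelEncoder the right fit here; a 2D selection such as `df[['Gender']]` would call for OrdinalEncoder instead.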