Converting Data Types
AI in Action
import pandas as pd
df = pd.read_csv("passengers.csv")
print(df.dtypes)
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")
Converting Columns to Another Type
You can convert a column's data type using the .astype() method:
import pandas as pd
df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)
# Convert Pclass to int
df["Pclass"] = df["Pclass"].astype(int)
# Convert Age to float
df["Age"] = df["Age"].astype(float)
print(df.dtypes)
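Note that .astype() is strict: if even one value cannot be converted, the whole call fails. The sketch below illustrates this with two small made-up Series (the names clean and dirty and their values are assumptions for illustration, not data from passengers.csv).

```python
import pandas as pd

# Hypothetical sample data, not taken from passengers.csv
clean = pd.Series(["1", "2", "3"])
dirty = pd.Series(["1", "two", "3"])

# Every value is a valid integer string, so the conversion succeeds
converted = clean.astype(int)
print(converted.dtype)

# "two" is not a valid integer, so .astype(int) raises a ValueError
try:
    dirty.astype(int)
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)
```

When a column may contain invalid entries like this, a more tolerant conversion function is usually the better choice.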
Converting Strings to Numeric or Datetime
If numbers or dates were loaded as text, you can use pd.to_numeric() and pd.to_datetime() to convert them back to the correct data type:
import pandas as pd
df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)
# Convert a text column to numeric
df["Fare"] = pd.to_numeric(df["Fare"], errors="coerce")
# Convert a text column to datetime
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")
print(df.dtypes)
The errors="coerce" argument replaces entries that cannot be converted with a missing value (NaN for numbers, NaT for datetimes) instead of raising an error.
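To see the effect of errors="coerce" in isolation, here is a minimal sketch on two small made-up Series (the names fares and dates and their values are assumptions for illustration, not data from passengers.csv). Counting the missing values afterwards is one simple way to check how many entries failed to convert.

```python
import pandas as pd

# Hypothetical sample values, not taken from passengers.csv
fares = pd.Series(["7.25", "unknown", "71.28"])
dates = pd.Series(["1912-04-10", "not a date", "1912-04-11"])

# Invalid entries become NaN (numeric) or NaT (datetime)
fares_num = pd.to_numeric(fares, errors="coerce")
dates_dt = pd.to_datetime(dates, errors="coerce")

print(fares_num)
print(dates_dt)

# Count how many entries could not be converted
print("Invalid fares:", fares_num.isna().sum())
print("Invalid dates:", dates_dt.isna().sum())
```

Because coercion silently discards invalid values, it is worth checking these counts before continuing with the analysis.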
Converting to Categorical Type
If a column contains only a few distinct values that repeat many times, you can convert it to the categorical type. This usually reduces memory use and can speed up operations such as comparisons.
import pandas as pd
df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df["Embarked"].dtype)
# Convert column to categorical
df["Embarked"] = df["Embarked"].astype("category")
print(df["Embarked"].dtype)
This is especially useful for columns like passenger class, gender, or embarkation port.
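The memory effect can be checked directly with memory_usage(deep=True). The sketch below uses a made-up column of three port codes repeated many times (the name ports and its values are assumptions for illustration, not the actual Embarked column); the exact byte counts depend on your pandas version and platform.

```python
import pandas as pd

# Hypothetical column: three port codes repeated many times
ports = pd.Series(["S", "C", "Q"] * 10_000)
ports_cat = ports.astype("category")

# deep=True includes the memory used by the string values themselves
mem_text = ports.memory_usage(deep=True)
mem_cat = ports_cat.memory_usage(deep=True)

print("As text:", mem_text, "bytes")
print("As category:", mem_cat, "bytes")

# The distinct values are stored once, as the categories
print(ports_cat.cat.categories.tolist())
```

The categorical version stores each distinct value once and keeps only small integer codes per row, which is why the saving grows with the number of repeats.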
1. Which method converts the column's data type?
2. What happens when you use errors="coerce" in pd.to_numeric()?
3. Why would you convert a column to the category type?