Converting Data Types
AI in Action
import pandas as pd
df = pd.read_csv("passengers.csv")
print(df.dtypes)
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")
Converting Columns to Another Type
You can convert a column's data type using the .astype() method:
import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert Pclass to int
df["Pclass"] = df["Pclass"].astype(int)
# Convert Age to float
df["Age"] = df["Age"].astype(float)

print(df.dtypes)
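One thing worth knowing about .astype() is that it is strict: if any value cannot be converted, the whole call raises an error. The sketch below illustrates this on a small made-up DataFrame (the toy values are assumptions for illustration, not rows from passengers.csv):

```python
import pandas as pd

# Hypothetical toy data stored as text, similar to loading with dtype=str
toy = pd.DataFrame({"Pclass": ["1", "3", "2"], "Age": ["22", "38.5", "4"]})

# Clean text converts without trouble
toy["Pclass"] = toy["Pclass"].astype(int)
toy["Age"] = toy["Age"].astype(float)
print(toy.dtypes)

# A single non-numeric entry makes .astype(int) fail for the whole column
try:
    pd.Series(["1", "two", "3"]).astype(int)
    astype_failed = False
except ValueError as exc:
    astype_failed = True
    print("Conversion failed:", exc)
```

When a column may contain such entries, a more forgiving conversion function is usually a better fit than .astype().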
Converting Strings to Numeric or Datetime
If numbers or dates were loaded as text, you can use pd.to_numeric() and pd.to_datetime() to convert them to the correct data type:
import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert a text column to numeric
df["Fare"] = pd.to_numeric(df["Fare"], errors="coerce")
# Convert a text column to datetime
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

print(df.dtypes)
The errors="coerce" argument replaces invalid entries with NaN (or NaT for datetimes) instead of raising an error.
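To see what coercion does to bad entries, here is a minimal sketch using two small made-up Series (the values are assumptions for illustration, not taken from passengers.csv). Counting the missing values afterwards is one simple way to check how many entries could not be converted:

```python
import pandas as pd

# Hypothetical raw text values; "unknown" and "not a date" are invalid
raw_fares = pd.Series(["7.25", "71.2833", "unknown"])
raw_dates = pd.Series(["1912-04-10", "not a date"])

# Invalid entries become NaN / NaT instead of raising an error
fares = pd.to_numeric(raw_fares, errors="coerce")
dates = pd.to_datetime(raw_dates, errors="coerce")

# Count how many entries failed to convert
n_bad_fares = int(fares.isna().sum())
n_bad_dates = int(dates.isna().sum())
print(fares)
print(dates)
print("Unparseable fares:", n_bad_fares, "| unparseable dates:", n_bad_dates)
```

Because coercion silently turns bad values into missing ones, it is a good habit to inspect these counts before continuing with the analysis.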
Converting to Categorical Type
If a column has only a few distinct values that repeat many times, you can convert it to the categorical type. This can save memory and speed up comparisons.
import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df["Embarked"].dtype)

# Convert column to categorical
df["Embarked"] = df["Embarked"].astype("category")

print(df["Embarked"].dtype)
This is especially useful for columns like passenger class, gender, or embarkation port.
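The memory effect can be checked directly. The following sketch compares the memory footprint of a made-up column of port codes before and after conversion (the repeated "S", "C", "Q" values are an assumption for illustration; exact byte counts depend on your pandas version):

```python
import pandas as pd

# Hypothetical column: three distinct port codes repeated many times
ports = pd.Series(["S", "C", "Q", "S", "S", "C"] * 1000)
ports_cat = ports.astype("category")

# deep=True includes the memory used by the string values themselves
as_text = ports.memory_usage(deep=True)
as_category = ports_cat.memory_usage(deep=True)

print("Text bytes:", as_text)
print("Category bytes:", as_category)
print("Categories:", list(ports_cat.cat.categories))
```

A categorical column stores each distinct value once and keeps small integer codes per row, which is why the savings grow with the number of repeats.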
1. Which method converts the column's data type?
2. What happens when you use errors="coerce" in pd.to_numeric()?
3. Why would you convert a column to the category type?