Learn Converting Data Types | Data Cleaning
Introduction to Pandas with AI

Converting Data Types

AI in Action

import pandas as pd

df = pd.read_csv("passengers.csv")

print(df.dtypes)
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

Converting Columns to Another Type

You can convert a column's data type using the .astype() method:

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert Pclass to int
df["Pclass"] = df["Pclass"].astype(int)

# Convert Age to float
df["Age"] = df["Age"].astype(float)

print(df.dtypes)
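As a minimal sketch of the same idea without downloading the dataset, the example below builds a small DataFrame of text values (the rows are invented for illustration) and converts two columns in a single call by passing a dictionary to .astype(). It also shows that .astype() raises an error when a value cannot be converted.

```python
import pandas as pd

# Hypothetical sample data; every value starts out as text
df = pd.DataFrame({
    "Pclass": ["1", "3", "2"],
    "Age": ["22.0", "38.5", "26.0"],
})

# A dict maps each column name to its target type
converted = df.astype({"Pclass": int, "Age": float})
print(converted.dtypes)

# .astype() raises an error when a value cannot be converted
try:
    pd.Series(["1", "two"]).astype(int)
except (ValueError, TypeError) as exc:
    print("Conversion failed:", exc)
```

Because .astype() stops at the first invalid value, it suits columns that are already known to be clean.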

Converting Strings to Numeric or Datetime

If numbers or dates were loaded as text, you can use pd.to_numeric() and pd.to_datetime() to safely convert them to the correct data type:

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert a text column to numeric
df["Fare"] = pd.to_numeric(df["Fare"], errors="coerce")

# Convert a text column to datetime
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

print(df.dtypes)

The errors="coerce" argument replaces invalid entries with a missing value (NaN for numbers, NaT for datetimes) instead of raising an error.
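To see the effect of errors="coerce" in isolation, here is a small sketch on two hand-made Series (the values are hypothetical, chosen so that each Series contains exactly one entry that cannot be parsed). Counting the missing values afterwards is a simple way to check how many entries were coerced.

```python
import pandas as pd

# Hypothetical text values, each Series has one unparseable entry
fares = pd.Series(["7.25", "71.28", "unknown"])
dates = pd.Series(["1912-04-10", "not a date", "1912-04-12"])

fares_num = pd.to_numeric(fares, errors="coerce")
dates_dt = pd.to_datetime(dates, errors="coerce")

print(fares_num)
print(dates_dt)

# Count how many entries were turned into missing values
print(fares_num.isna().sum())
print(dates_dt.isna().sum())
```

Comparing these counts with the number of missing values before conversion helps you notice when coercion is silently discarding data.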

Converting to Categorical Type

If a column contains only a few distinct values that repeat often, you can convert it to the categorical type. This can reduce memory usage and speed up comparisons.

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df["Embarked"].dtype)

# Convert column to categorical
df["Embarked"] = df["Embarked"].astype("category")

print(df["Embarked"].dtype)

This is especially useful for columns like passenger class, gender, or embarkation port.
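The memory claim can be checked with a rough sketch like the one below. It uses a made-up Series of three port codes repeated many times (not the course dataset) and compares memory usage before and after conversion. The exact numbers depend on your pandas version, so none are shown here.

```python
import pandas as pd

# Hypothetical column: three port codes repeated many times
ports = pd.Series(["S", "C", "Q"] * 10_000)
ports_cat = ports.astype("category")

# deep=True includes the memory used by the string values themselves
print(ports.memory_usage(deep=True))
print(ports_cat.memory_usage(deep=True))

# Each distinct value is stored once; rows hold small integer codes
print(ports_cat.cat.categories)
print(ports_cat.cat.codes.head())
```

The saving comes from storing each distinct value only once, so the benefit shrinks as the number of unique values approaches the number of rows.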

1. Which method converts the column's data type?

2. What happens when you use errors="coerce" in pd.to_numeric()?

3. Why would you convert a column to the category type?

Section 2. Chapter 4
