Learn Converting Data Types | Data Cleaning
Introduction to Pandas with AI

Converting Data Types

AI in Action

import pandas as pd

df = pd.read_csv("passengers.csv")

print(df.dtypes)
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

Converting Columns to Another Type

You can convert a column's data type using the .astype() method:

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert Pclass to int
df["Pclass"] = df["Pclass"].astype(int)

# Convert Age to float
df["Age"] = df["Age"].astype(float)

print(df.dtypes)
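To make the behavior of .astype() easier to see without downloading the dataset, here is a minimal sketch on a made-up three-row DataFrame. The values are hypothetical and are not taken from passengers.csv. It also shows that .astype() raises an error when a value cannot be converted, which is why the functions in the next section are often a safer choice for messy text.

```python
import pandas as pd

# Hypothetical sample data; every column starts out as text
df = pd.DataFrame({
    "Pclass": ["1", "3", "2"],
    "Age": ["22", "38.5", "not recorded"],
})

# Every value in Pclass is a valid integer string, so this succeeds
df["Pclass"] = df["Pclass"].astype(int)

# Age contains a non-numeric string, so .astype(float) raises ValueError
try:
    df["Age"] = df["Age"].astype(float)
    age_converted = True
except ValueError:
    age_converted = False

print(df.dtypes)
print("Age converted:", age_converted)
```

Because the failed conversion raises before the assignment happens, the Age column keeps its original text values.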

Converting Strings to Numeric or Datetime

If numbers or dates were loaded as text, you can use pd.to_numeric() and pd.to_datetime() to safely convert them to the correct data type:

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df.dtypes)

# Convert a text column to numeric
df["Fare"] = pd.to_numeric(df["Fare"], errors="coerce")

# Convert a text column to datetime
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

print(df.dtypes)

The errors="coerce" argument replaces invalid entries with NaN (or NaT for datetime columns) instead of raising an error.
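The effect of errors="coerce" is easiest to see on data that actually contains bad entries. The sketch below uses a small made-up DataFrame (the values are hypothetical, not from the course dataset) with one invalid entry in each column, then counts the missing values produced by the conversion.

```python
import pandas as pd

# Hypothetical sample data with one invalid entry per column
df = pd.DataFrame({
    "Fare": ["7.25", "71.28", "unknown"],
    "TicketDate": ["1912-04-10", "not a date", "1912-04-11"],
})

# Invalid entries become missing values instead of raising an error
df["Fare"] = pd.to_numeric(df["Fare"], errors="coerce")
df["TicketDate"] = pd.to_datetime(df["TicketDate"], errors="coerce")

# Inspect the new dtypes and count how many entries failed to convert
print(df.dtypes)
print(df.isna().sum())
```

Checking isna().sum() after a coerced conversion is a simple way to find out how many values were silently replaced.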

Converting to Categorical Type

If a column contains only a few distinct values that repeat many times, you can convert it to the categorical type. This usually reduces memory use and can speed up operations such as comparisons.

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv", dtype=str)
print(df["Embarked"].dtype)

# Convert column to categorical
df["Embarked"] = df["Embarked"].astype("category")

print(df["Embarked"].dtype)

This is especially useful for columns like passenger class, gender, or embarkation port.
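One way to check the memory claim yourself is to compare memory_usage(deep=True) before and after the conversion. This is a rough sketch on a synthetic column of three port codes repeated many times. The data is made up, and the exact byte counts depend on your pandas version and platform.

```python
import pandas as pd

# Hypothetical column: three distinct port codes repeated many times
ports = pd.Series(["S", "C", "Q"] * 10_000, dtype=object)
as_category = ports.astype("category")

# deep=True also counts the memory held by the Python string objects
object_bytes = ports.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print("object bytes:  ", object_bytes)
print("category bytes:", category_bytes)

# The distinct values are stored once; each row holds a small integer code
print(as_category.cat.categories.tolist())
```

The saving comes from storing each distinct value once and keeping only a compact integer code per row, so it shrinks as the number of distinct values grows.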

1. Which method converts the column's data type?

2. What happens when you use errors="coerce" in pd.to_numeric()?

3. Why would you convert a column to the category type?

Section 2. Chapter 4
