Changing the Data Type | Brief Introduction
Data Preprocessing

1. Brief Introduction
2. Processing Quantitative Data
3. Processing Categorical Data
4. Time Series Data Processing
5. Feature Engineering
6. Moving on to Tasks

Changing the Data Type

You already know how to convert a column from one data type to another, for example from string to number. Let's take a closer look at this small but important task.

Let's start by converting strings to datetime, which you will most often need when working with time series. pandas provides the pd.to_datetime() function for this operation.
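As a minimal sketch of the string-to-datetime conversion: the 'Date' column and its sample values below are hypothetical and were made up for illustration, and the explicit format argument is an optional choice that assumes every value follows the same pattern.

```python
import pandas as pd

# Hypothetical sample data; the 'Date' column name and values are assumptions
df = pd.DataFrame({'Date': ['2023-01-15', '2023-02-20', '2023-03-05']})

# Parse the strings into datetime values; an explicit format avoids ambiguous parsing
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')

# Once converted, the .dt accessor exposes date components
print(df['Date'].dt.month.tolist())  # [1, 2, 3]
```

After the conversion the column can be sorted chronologically or used as a time series index.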

To convert a string column to bool, call the .map() method on the column whose values you want to change and pass it a dictionary that maps each string value to True or False.
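A short sketch of the .map() approach, assuming a hypothetical 'Subscribed' column that stores 'yes' and 'no' strings; the column name and values are invented for this example, so adjust the dictionary to the values your data actually contains.

```python
import pandas as pd

# Hypothetical sample data; the column name and the 'yes'/'no' values are assumptions
df = pd.DataFrame({'Subscribed': ['yes', 'no', 'yes']})

# Map each string to its boolean counterpart
# Note: any value missing from the dictionary would become NaN
df['Subscribed'] = df['Subscribed'].map({'yes': True, 'no': False})

print(df['Subscribed'].tolist())  # [True, False, True]
```

Because unmapped values turn into NaN, it is worth checking the column's unique values before building the dictionary.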

Some conversions need a custom transformation function. For example, to turn a price column with values like "$198,800" into float, strip the currency symbol and the thousands separators before converting:

    import re

    import pandas as pd

    # Create a simple dataset
    df = pd.DataFrame(data={'Price': ['$4,122.94', '$1,002.3']})

    # Custom function to transform the data
    # x - a single value from the column
    def price_to_float(x):
        # Remove the dollar sign and thousands separators, then convert to float
        return float(re.sub(r'[$,]', '', x))

    # Apply the custom transformation to the column
    df['Price'] = df['Price'].apply(price_to_float)
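For comparison, the same cleanup can probably be written without a custom function by using the pandas string accessor. This is an alternative sketch, not the approach the lesson prescribes, and it assumes every value in the column is a string.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'Price': ['$4,122.94', '$1,002.3']})

# Strip '$' and ',' with a vectorized regex replace, then cast the result to float
df['Price'] = df['Price'].str.replace(r'[$,]', '', regex=True).astype(float)

print(df['Price'].tolist())  # [4122.94, 1002.3]
```

A custom function passed to .apply() stays the more flexible option when the cleanup logic does not fit a single regular expression.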

Task

Read the sales_data_types.csv dataset and change the data type in the Active column from str to bool.

Section 1. Chapter 5