Learn Data Type Conversion | Time Series Data Processing
Data Type Conversion

Data type conversion is the process of changing time series data from one data type to another. It is useful when a different format makes the data easier to work with, or when a calculation requires a specific data type.
For example, you might convert a string representation of a date into a datetime object so that you can perform calculations on it.

Let's look at an example of converting date data from string format to datetime format:

import pandas as pd

# Create simple dataset with date information in string format
dataset = pd.DataFrame({'PatientID': [1, 2, 3],
                        'Name': ['John', 'Sarah', 'Michael'],
                        'AdmissionDate': ['2022-03-15', '2021-11-10', '2022-02-28']})

# Convert 'AdmissionDate' column from string to datetime format
dataset['AdmissionDate'] = pd.to_datetime(dataset['AdmissionDate'], format='%Y-%m-%d')

# Print the converted data
print('Converted types:')
print(dataset.dtypes)
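
To illustrate why the conversion matters, here is a minimal sketch of calculations that only work once the column holds datetime values. The reference date and the two derived column names (DaysSinceAdmission, AdmissionMonth) are assumptions made up for this illustration and are not part of the lesson's dataset.

```python
import pandas as pd

# Same admission dates as in the example above, stored as strings
dataset = pd.DataFrame({'PatientID': [1, 2, 3],
                        'AdmissionDate': ['2022-03-15', '2021-11-10', '2022-02-28']})
dataset['AdmissionDate'] = pd.to_datetime(dataset['AdmissionDate'], format='%Y-%m-%d')

# Hypothetical reference date, chosen only for this illustration
reference_date = pd.Timestamp('2022-04-01')

# Subtracting datetimes gives timedeltas; .dt.days extracts whole days
dataset['DaysSinceAdmission'] = (reference_date - dataset['AdmissionDate']).dt.days

# The .dt accessor exposes date components such as the month
dataset['AdmissionMonth'] = dataset['AdmissionDate'].dt.month

print(dataset)
```

Neither operation is available while the dates are plain strings, which is the practical reason to convert them first.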

The format argument tells pd.to_datetime how the date strings are laid out, so you can change it to match the pattern used in your data.

We can consider different date patterns:

  • '15 Jul 2009' - '%d %b %Y' (%b matches an abbreviated month name);
  • '1-Feb-15' - '%d-%b-%y' (%y matches a two-digit year);
  • '12/08/2019' - '%d/%m/%Y'.
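
As a quick check, the sketch below parses three such date strings with pd.to_datetime. It uses %b for an abbreviated month name and %y for a two-digit year, and it assumes English month abbreviations; the variable names are chosen only for this illustration.

```python
import pandas as pd

# Each pair is (date string, format string describing its layout)
examples = [
    ('15 Jul 2009', '%d %b %Y'),   # day, abbreviated month name, 4-digit year
    ('1-Feb-15', '%d-%b-%y'),      # day, abbreviated month name, 2-digit year
    ('12/08/2019', '%d/%m/%Y'),    # day, month number, 4-digit year
]

# Parse every string with its own format
parsed = [pd.to_datetime(text, format=fmt) for text, fmt in examples]

for (text, fmt), value in zip(examples, parsed):
    print(f'{text!r} with {fmt!r} -> {value.date()}')
```

If the format string does not match the data, pd.to_datetime raises an error instead of guessing, which makes mismatched patterns easy to spot.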

Also keep in mind that processing time series data involves more than dates: you will also work with other data types (numeric, categorical, etc.).
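
A minimal sketch of converting several column types in one dataset could look like this. The small sales table below is invented for illustration (its column names and values are assumptions, not the lesson's 'sales.csv' file).

```python
import pandas as pd

# Hypothetical data in which every column starts out as strings
sales = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02', '2023-01-03'],
    'Units': ['10', '12', '7'],
    'Region': ['North', 'South', 'North'],
})

# Dates: string -> datetime
sales['Date'] = pd.to_datetime(sales['Date'], format='%Y-%m-%d')

# Numbers: string -> numeric
sales['Units'] = pd.to_numeric(sales['Units'])

# Repeated labels: string -> categorical
sales['Region'] = sales['Region'].astype('category')

print(sales.dtypes)
```

After the conversion, numeric operations such as sales['Units'].sum() work as expected, and the Region column stores each distinct label only once.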

Task

Read the 'sales.csv' dataset and convert the 'Date' column to the datetime data type.

Solution

import pandas as pd

# Read the dataset
dataset = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/sales.csv')

# Convert 'Date' column to datetime format
dataset['Date'] = pd.to_datetime(dataset['Date'], format='%Y-%m-%d')

# Print the converted data
print(dataset)

Section 4. Chapter 1