Learn Data Types | Brief Introduction
Data Preprocessing

1. Brief Introduction
2. Processing Quantitative Data
3. Processing Categorical Data
4. Time Series Data Processing
5. Feature Engineering
6. Moving on to Tasks

Data Types

The main tool we will use to manipulate data is pandas. We can start right away by loading the data:

    import pandas as pd

    # Load the penguins dataset from the course's hosted CSV file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')

    # Show the first five rows
    print(df.head())

A dataset can contain columns of several different data types, for example numeric (integers, floating-point numbers), strings (str), and datetime. To find out the data type of each column, use the .dtypes attribute:

    import pandas as pd

    # Load the penguins dataset from the course's hosted CSV file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')

    # Show the data type of every column
    print(df.dtypes)
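As a small self-contained illustration of the same attribute, the sketch below builds a hypothetical DataFrame (the column names and values are invented for this example and are not taken from penguins.csv) with an integer, a floating-point, and a datetime column. It then inspects the types for the whole frame with .dtypes and for a single column with .dtype.

```python
import pandas as pd

# Hypothetical data, invented for illustration only
df = pd.DataFrame({
    "count": [1, 2, 3],
    "mass": [3750.0, 3800.5, 3250.0],
    "measured_on": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
})

# Data types of all columns at once
print(df.dtypes)

# Data type of a single column
mass_dtype = df["mass"].dtype
print(mass_dtype)
```

Note that .dtypes (plural) belongs to the DataFrame, while .dtype (singular) belongs to a single column.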

Suppose a column holds numeric values stored as strings and you want to convert it to a numeric type. To do this, use the .astype() method.
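The original example for this step is not available here, so the following is only a minimal sketch of how such a conversion might look. It assumes a hypothetical column of numbers stored as strings; the values are invented for illustration.

```python
import pandas as pd

# Hypothetical data: numeric values stored as strings
df = pd.DataFrame({"body_mass_g": ["3750", "3800", "3250"]})
print(df["body_mass_g"].dtype)  # a string-like dtype, not a numeric one

# Convert the column to floating-point numbers
df["body_mass_g"] = df["body_mass_g"].astype(float)
print(df["body_mass_g"].dtype)
```

.astype() returns a new Series rather than modifying the column in place, which is why the result is assigned back to the column.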

Task

Read the penguins.csv dataset and change the data type in the body_mass_g column from float to int.

Don't modify the initial code; only replace the gaps ___ with the correct code.
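The task's starter code is not reproduced here, so the sketch below is one possible approach, not the official solution. It uses a small hypothetical DataFrame in place of penguins.csv and assumes the body_mass_g column has no missing values.

```python
import pandas as pd

# Hypothetical stand-in for the penguins data (values invented for illustration)
df = pd.DataFrame({
    "species": ["Adelie", "Gentoo"],
    "body_mass_g": [3750.0, 5000.0],
})

# Convert the column from float to int
df["body_mass_g"] = df["body_mass_g"].astype(int)
print(df.dtypes)
```

If the column contained missing values (NaN), .astype(int) would raise an error, so those rows would need to be dropped or filled first.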


Section 1. Chapter 1