Learn Data Types | Brief Introduction
Data Preprocessing
1. Brief Introduction
2. Processing Quantitative Data
3. Processing Categorical Data
4. Time Series Data Processing
5. Feature Engineering
6. Moving on to Tasks

Data Types

The main tool we will use to manipulate data is pandas. We can start right away by loading the data:

import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')
print(df.head())

A dataset can contain columns of several data types, for example numeric (integers, floating-point numbers), strings (str), and datetime. To find out the data type of each column, read the .dtypes attribute of the DataFrame:

import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')
print(df.dtypes)

Suppose a column holds numeric values stored as strings and you want to convert it to a numeric type. To do this, use the .astype() method, passing the target type (for example, int or float).
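A minimal sketch of such a conversion, assuming a small hypothetical DataFrame built in memory (the column name `value` and its contents are invented for illustration and do not come from penguins.csv):

```python
import pandas as pd

# Hypothetical data: numbers that were read in as strings
df = pd.DataFrame({'value': ['3750', '5000', '3800']})
print(df.dtypes)  # the column is stored as strings, not as numbers

# Convert the string column to integers
df['value'] = df['value'].astype(int)
print(df.dtypes)  # the column now has an integer dtype
```

Note that .astype() raises a ValueError if any entry cannot be parsed as a number; pd.to_numeric(..., errors='coerce') is an alternative that turns unparseable entries into NaN instead.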

Task


Read the penguins.csv dataset and change the data type in the body_mass_g column from float to int.


Solution
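One possible approach, sketched on a small in-memory stand-in rather than the real file (the values below are hypothetical; only the column name body_mass_g comes from the task). It assumes the column contains no missing values, because casting NaN to int raises an error:

```python
import pandas as pd

# Hypothetical stand-in for the body_mass_g column of penguins.csv
df = pd.DataFrame({'body_mass_g': [3750.0, 3800.0, 3250.0]})

# Cast the float column to integers
df['body_mass_g'] = df['body_mass_g'].astype(int)
print(df['body_mass_g'].dtype)
```

With the real dataset, the same .astype(int) call would follow the pd.read_csv(...) line shown earlier, provided the column has no missing entries.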

Section 1. Chapter 1
