Preparing Data Set 2/2 | Models in Scikit Learn
Introduction to Scikit Learn

Preparing Data Set 2/2

Let's finish preparing the Amsterdam house prices dataset. If you take one more look at this dataset...

You will see that, for example, the values in the price and room columns differ by orders of magnitude. It is usually better to work with data whose columns are on a comparable scale, so let's standardize it. We will do it in two ways. First, without built-in functions, using just the formula.

  1. Let's find the mean and variance of each column.

      # Calculating mean values
      print('The mean value of each column in the dataset:', dataset.mean())
      # Calculating variance values
      print('The variance of each column in the dataset:', dataset.var())

  2. Then we calculate the standardized values using the following formula: z = (x - mean) / std, where mean and std are the mean and standard deviation of the column.

      # Standardizing each column
      dataset.apply(lambda x: (x - x.mean()) / x.std(), axis=0)

  3. Or we can do it using the StandardScaler class in the following way:

      from sklearn.preprocessing import StandardScaler

      scaler = StandardScaler()
      scaler.fit(dataset)
      # Calculating mean value
      print(scaler.mean_)
      # Calculating variance value
      print(scaler.var_)
      # Standardizing the data with the fitted scaler
      scaled_data = scaler.transform(dataset)
      print(scaled_data)
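The two approaches can be compared side by side. The sketch below is only an illustration: the small `dataset` with `price` and `room` columns is made up and stands in for the real Amsterdam data. It shows one detail worth knowing: pandas' `std()` divides by n - 1 by default, while `StandardScaler` divides by n, so the manual result matches the scaler only when `ddof=0` is passed.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up stand-in for the numeric columns of the Amsterdam dataset
dataset = pd.DataFrame({
    "price": [250000.0, 400000.0, 550000.0, 800000.0],
    "room": [2.0, 3.0, 4.0, 5.0],
})

# Manual standardization with the population standard deviation (ddof=0)
manual = (dataset - dataset.mean()) / dataset.std(ddof=0)

# StandardScaler also uses the population standard deviation
scaler = StandardScaler()
scaled = scaler.fit_transform(dataset)

# True when both approaches give the same values
matches = np.allclose(manual.to_numpy(), scaled)
print("Manual (ddof=0) equals StandardScaler:", matches)

# With pandas' default ddof=1 every value is smaller by a constant factor
sample_based = (dataset - dataset.mean()) / dataset.std()
ratio = sample_based.to_numpy() / scaled
print("Ratio of default pandas result to StandardScaler result:")
print(ratio)
```

For a dataset with many rows the difference between the two conventions is small, but it explains why the formula-based output and the `StandardScaler` output do not agree exactly.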

It is time to apply all these steps to the dataset in the task. Let's start!

Task

  1. Import the libraries and load the dataset.
  2. Find and drop duplicated values.
  3. Find null values and replace them with the mean value.
  4. Delete categorical values, leaving only numerical ones.
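
The steps above could be sketched roughly as follows. This is not the course's solution: the small `dataset` here is made up so the example runs on its own, and the column names `address`, `price`, and `room` are assumptions. In the real task the data would be loaded from the course's file instead.

```python
import numpy as np
import pandas as pd

# 1. Import libraries and load the dataset.
# Made-up data; the real task loads the Amsterdam housing dataset instead.
dataset = pd.DataFrame({
    "address": ["A", "B", "B", "C", "D"],
    "price": [300000.0, 450000.0, 450000.0, np.nan, 600000.0],
    "room": [3.0, 4.0, 4.0, 2.0, np.nan],
})

# 2. Find and drop duplicated rows
n_duplicates = int(dataset.duplicated().sum())
dataset = dataset.drop_duplicates()

# 3. Find null values and replace them with the column mean
n_nulls = int(dataset.isnull().sum().sum())
dataset = dataset.fillna(dataset.mean(numeric_only=True))

# 4. Drop categorical columns, keeping only numeric ones
dataset = dataset.select_dtypes(include="number")

print("Duplicated rows found:", n_duplicates)
print("Null values found:", n_nulls)
print(dataset)
```

Dropping duplicates before computing the means keeps repeated rows from skewing the fill values.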

Section 3. Chapter 2
