Introduction to Scikit Learn

Preparing Data Set 2/2

Let's make the final preparation of the Amsterdam house prices dataset. If you take one more look at this dataset...

You will see that, for example, the values in the price and room columns differ by orders of magnitude. It is generally better to work with data that has been brought to a common range of values. Let's do that with standardization, in two ways. First, without built-in functions, just using the formula.

  1. Let's find the mean and variance values.

    # Calculating mean values
    print('The mean value of each column in the dataset:', dataset.mean())
    # Calculating variance values
    print('The variance of each column in the dataset:', dataset.var())

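  2. Then we calculate the standardized values using the following formula: z = (x - mean) / std, where mean and std are the mean and standard deviation of the column.

    # Standardizing each column: subtract the mean, divide by the standard deviation
    dataset.apply(lambda x: (x - x.mean()) / x.std(), axis=0)
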
  3. Or we can do it using the StandardScaler class in the following way:

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    scaler.fit(dataset)
    # Mean of each column
    print(scaler.mean_)
    # Variance of each column
    print(scaler.var_)
    scaled_data = scaler.transform(dataset)
    print(scaled_data)
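
The two approaches above can be compared directly. The sketch below uses a small made-up DataFrame (the values are hypothetical, not taken from the Amsterdam dataset) and assumes only numeric columns are present. It also illustrates one detail worth knowing: StandardScaler divides by the population standard deviation (ddof=0), while the pandas std() method defaults to the sample standard deviation (ddof=1), so the formula-based result differs slightly unless ddof=0 is passed.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the numeric columns of the dataset
dataset = pd.DataFrame({
    "price": [250000.0, 400000.0, 550000.0, 700000.0],
    "room": [2.0, 3.0, 4.0, 5.0],
})

# Manual standardization with the population standard deviation (ddof=0),
# which matches what StandardScaler computes
manual = (dataset - dataset.mean()) / dataset.std(ddof=0)

# The same transformation via StandardScaler
scaled = StandardScaler().fit_transform(dataset)

# Check whether both approaches agree up to floating-point error
print(np.allclose(manual.to_numpy(), scaled))

# pandas' default std() uses ddof=1 (sample standard deviation),
# so this version differs from StandardScaler by a constant factor
sample_based = (dataset - dataset.mean()) / dataset.std()
print(np.allclose(sample_based.to_numpy(), scaled))
```

With this toy data the first check prints True and the second prints False. On large datasets the difference between the two conventions is negligible, but it explains why the formula-based output and the StandardScaler output may not match exactly.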

It is time to apply all these steps to the dataset in the task. Let's start!

Task

  1. Import the libraries and load the dataset.
  2. Find and drop duplicated values.
  3. Find null values and replace them with the mean value.
  4. Delete categorical columns, leaving only numeric ones.
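
As a rough illustration of steps 2 to 4, here is a minimal sketch on a small made-up DataFrame. The column names and values are hypothetical, and the real task loads the actual dataset instead of building one by hand, so treat this as one possible approach rather than the expected solution.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real task loads the house prices dataset instead
df = pd.DataFrame({
    "price": [250000.0, 250000.0, 400000.0, np.nan, 700000.0],
    "room": [2.0, 2.0, 3.0, 4.0, np.nan],
    "address": ["A", "A", "B", "C", "D"],  # a categorical (text) column
})

# Step 2: count duplicated rows, then drop them
print("Duplicated rows:", df.duplicated().sum())
df = df.drop_duplicates()

# Step 3: count null values per column, then fill them with the column mean
# (the mean is computed for numeric columns only)
print("Null values per column:")
print(df.isnull().sum())
df = df.fillna(df.mean(numeric_only=True))

# Step 4: keep only numeric columns, dropping categorical ones
numeric_df = df.select_dtypes(include="number")
print(numeric_df)
```

Dropping duplicates before filling null values keeps repeated rows from skewing the column means used for the replacement.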
