Types Conversion | Data Validation
Preprocessing Data
course content

1. Data Exploration
2. Data Cleaning
3. Data Validation
4. Normalization & Standardization
5. Data Encoding

Types Conversion

Data in a dataset is sometimes stored in the wrong format or type. The most common cases are:

  • storing integer or float values as string variables.
  • storing date and time values as strings.
  • storing values in a form that can be converted to a more suitable one.
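The three cases above can be sketched in a few lines. This is a minimal illustration that assumes the pandas library; the DataFrame and its column names (price, count, date, weight) are made up for the example and do not come from the course dataset.

```python
import pandas as pd

# Hypothetical toy data: every column is stored as strings.
df = pd.DataFrame({
    "price": ["10.5", "20", "7.25"],
    "count": ["1", "2", "3"],
    "date": ["2021-01-15", "2021-02-01", "2021-03-20"],
    "weight": ["70 kg", "82 kg", "64 kg"],
})

# Case 1: numbers stored as strings -> numeric dtypes.
df["price"] = pd.to_numeric(df["price"])
df["count"] = df["count"].astype(int)

# Case 2: dates stored as strings -> datetime values.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Case 3: values with a redundant suffix -> strip it, then convert.
df["weight"] = df["weight"].str.replace(" kg", "", regex=False).astype(int)

# Inspect the resulting column types.
print(df.dtypes)
```

After the conversions, numeric and date operations (sums, sorting by date, comparisons) behave as expected instead of operating on text.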

Let's explore the exercise dataset, which contains information about diet, pulse, time, and kind of exercise. Here is a sample:

unnamed  id  diet     pulse  time    kind
35       12  low fat  104    30 min  walking
64       22  low fat  104    15 min  running
10       4   low fat  82     15 min  rest
18       7   no fat   87     1 min   rest
48       17  no fat   103    1 min   walking
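To see the problem concretely, the sample can be typed into a DataFrame and inspected. This sketch assumes pandas and assumes the sample columns split as unnamed (used here as the index), id, diet, pulse, time, and kind; the variable name sample is only illustrative.

```python
import pandas as pd

# The five sample rows, entered by hand; "unnamed" is used as the index.
sample = pd.DataFrame(
    {
        "id": [12, 22, 4, 7, 17],
        "diet": ["low fat", "low fat", "low fat", "no fat", "no fat"],
        "pulse": [104, 104, 82, 87, 103],
        "time": ["30 min", "15 min", "15 min", "1 min", "1 min"],
        "kind": ["walking", "running", "rest", "rest", "walking"],
    },
    index=[35, 64, 10, 18, 48],
)

# The time column is reported as text, not as a number.
print(sample.dtypes)

# Confirm that every value ends with the same unit suffix
# before deciding to strip it.
all_minutes = sample["time"].str.endswith(" min").all()
print(all_minutes)
```

Checking the suffix first is a small safeguard: if some rows used another unit, dropping the text blindly would silently mix units.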

It makes sense to modify the time column: every row gives the duration in minutes, so the unit suffix (min, sec, or hours) carries no extra information. We will strip the extra characters and keep only the numeric value, converted to int.
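One way to do this, sketched with pandas (an assumption, since the lesson text does not name a library) on the five sample values only. Two options are shown: positional slicing, which relies on the suffix always being exactly 4 characters (" min", including the space), and a regular expression, which is a swapped-in alternative that only relies on the value starting with digits.

```python
import pandas as pd

# The time values from the sample rows.
time = pd.Series(["30 min", "15 min", "15 min", "1 min", "1 min"], name="time")

# Option 1: drop the last 4 characters (" min"), then convert to int.
by_slice = time.str[:-4].astype(int)

# Option 2: capture the leading digits, then convert to int.
# This does not depend on the length of the suffix.
by_regex = time.str.extract(r"^(\d+)", expand=False).astype(int)

print(by_slice.tolist())
print(by_regex.tolist())
```

Slicing is the shortest route when the format is uniform; the regular expression is more tolerant if the suffix ever varies in length.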

Task

Apply the type conversion to the time column. Remove the last 4 characters (the suffix " min", including the leading space) and convert the rest to int. Check the sample.

Section 3. Chapter 1