Learn Types of Data | Machine Learning Concepts
ML Introduction with scikit-learn

Types of Data

Each column (feature) in a training set has a data type associated with it. These data types can be grouped into numerical, categorical, and date and/or time.

Most ML algorithms perform well only with numerical data, so categorical and datetime values need to be converted into numbers.

For date and time, features such as 'year', 'month', and similar can be extracted, depending on the task. These are already numerical values, so they can be used directly.
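As a minimal sketch of this kind of extraction, assuming the dates are stored as strings in a pandas DataFrame (the column name `purchase_date` and the sample values are hypothetical), the numeric parts can be pulled out with the `.dt` accessor:

```python
import pandas as pd

# Hypothetical data: the column name and dates are made up for illustration.
df = pd.DataFrame({"purchase_date": ["2023-01-15", "2023-12-03", "2024-09-21"]})

# Parse the strings into datetime values so the .dt accessor becomes available.
df["purchase_date"] = pd.to_datetime(df["purchase_date"])

# Each extracted part is an ordinary integer column.
df["year"] = df["purchase_date"].dt.year
df["month"] = df["purchase_date"].dt.month
df["day_of_week"] = df["purchase_date"].dt.dayofweek  # Monday=0 ... Sunday=6

print(df[["year", "month", "day_of_week"]])
```

Which parts are worth extracting depends on the task: a yearly trend may need only `year`, while seasonal behavior may need `month` or `day_of_week`.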

Categorical data is a little more challenging to deal with.

Types of Categorical Data

Categorical data is classified into two types:

  • Ordinal data is a type of categorical data in which categories follow a natural order. For example, level of education (from elementary school to Ph.D.) or ratings (from very bad to very good), etc.;

  • Nominal data is a type of categorical data that follows no natural order. For example, name, gender, country of origin, etc.

Converting ordinal and nominal data types into numerical values requires different approaches, so they must be handled separately.
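The sketch below illustrates one common pairing of approaches, which is an assumption here and not necessarily the one this course uses: scikit-learn's `OrdinalEncoder` for ordinal data and `OneHotEncoder` for nominal data. The `grade` and `color` columns and their values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical data: 'grade' is ordinal (C < B < A), 'color' is nominal.
df = pd.DataFrame({"grade": ["A", "C", "B"], "color": ["blue", "orange", "blue"]})

# Ordinal: pass the categories in their natural order so the integer codes keep it.
ordinal = OrdinalEncoder(categories=[["C", "B", "A"]])
grade_encoded = ordinal.fit_transform(df[["grade"]])

# Nominal: one binary column per category, so no order is implied.
one_hot = OneHotEncoder()
color_encoded = one_hot.fit_transform(df[["color"]]).toarray()

print(grade_encoded.ravel())   # one code per row, following the order C < B < A
print(one_hot.categories_[0])  # column order of the one-hot output
print(color_encoded)
```

Encoding a nominal feature with ordered integers would suggest an order that does not exist (for example, that orange is "greater" than blue), which is why the two types are treated differently.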

Note
Study More

There are better ways to convert dates to numerical values that are beyond the scope of this introductory course. For example, if we only use the 'month' feature, it fails to consider that the 12th month is actually closer to the 1st than to the 9th.
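To make the month example concrete, here is a small sketch of one possible alternative, a cyclical (sine/cosine) encoding. It is an illustration chosen for this note, not a method the course prescribes, and the helper names `month_to_circle` and `distance` are hypothetical.

```python
import math

def month_to_circle(month):
    """Place a month (1-12) on a unit circle so December sits next to January."""
    angle = 2 * math.pi * (month - 1) / 12
    return (math.sin(angle), math.cos(angle))

def distance(a, b):
    """Straight-line distance between two months after the cyclical encoding."""
    return math.dist(month_to_circle(a), month_to_circle(b))

# With the raw month number, 12 looks farther from 1 than from 9.
print(abs(12 - 1), abs(12 - 9))

# With the cyclical encoding, 12 is closer to 1 than to 9.
print(distance(12, 1) < distance(12, 9))
```

The cost of this encoding is that one feature becomes two (sine and cosine), since a single value cannot represent a position on a circle without ambiguity.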


Match the feature and its data type.

Price (100, 235) –
Color (blue, orange) –

Academic grades (A, B, C, and so on) –


Section 1. Chapter 4
