Learn Data Types | Data Exploration
Preprocessing Data

Course content

1. Data Exploration
2. Data Cleaning
3. Data Validation
4. Normalization & Standardization
5. Data Encoding

Data Types

Let's talk about the types of data that a DataFrame may contain.

Numerical

Numerical data is represented by integer or floating-point values. In a DataFrame, it should be stored with the int64 or float64 data type. Use data.info() to check the data type of each column.
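As a minimal sketch of that check, the snippet below builds a small toy DataFrame (the column names and values are hypothetical, not taken from the course dataset) and inspects its data types with data.info() and data.dtypes.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset
data = pd.DataFrame({
    "age": [22, 38, 26],
    "fare": [7.25, 71.28, 7.92],
    "sex": ["male", "female", "female"],
})

# Prints a summary with column names, non-null counts, and data types
data.info()

# The same data type information, returned as a Series indexed by column name
dtypes = data.dtypes
print(dtypes)
```

Here data.info() prints directly to the console, while data.dtypes returns a Series that can be used in further code.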

Note that some columns in the DataFrame may contain numerical values but be stored with a different data type, such as object or a string type. You have to convert them to int64 or float64, and we'll explore how to do that later.
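One possible way to do such a conversion is pd.to_numeric, sketched below. The price column and its values are hypothetical, and the course may use a different approach in later chapters.

```python
import pandas as pd

# Hypothetical column: numbers stored as strings with the object data type
data = pd.DataFrame({
    "price": pd.Series(["10", "12.5", "n/a"], dtype="object"),
})

# errors="coerce" turns values that cannot be parsed into NaN instead of raising
data["price"] = pd.to_numeric(data["price"], errors="coerce")

print(data["price"].dtype)
print(data["price"].isna().sum())  # number of values that could not be parsed
```

Coercing unparseable values to NaN keeps the conversion from failing on a single bad entry, at the cost of introducing missing values that need handling afterwards.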

Categorical

Categorical data has no inherent numerical representation: each value is an item from a fixed list of groups or categories. For example, the column Sex has the values Male or Female, and the column Season has the values Spring, Summer, Fall, and Winter. Such data requires special conversion and preprocessing. It typically has the data type object, bool, or str.
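To get a feel for a categorical column, it can help to list its distinct values and how often each occurs. The sketch below uses a hypothetical season column for illustration.

```python
import pandas as pd

# Hypothetical categorical column
data = pd.DataFrame({
    "season": ["Spring", "Summer", "Fall", "Winter", "Summer"],
})

# Distinct categories present in the column, sorted alphabetically
categories = sorted(data["season"].unique())

# Number of rows per category
counts = data["season"].value_counts()

print(categories)
print(counts)
```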

Fortunately, the titanic dataset already stores its numerical data as int64 and float64.

Task

Let's divide the columns into numerical and categorical ones. Create num_cols as a NumPy array containing the columns with int and float data types. Let cat_cols contain all the remaining features, that is, every column not in num_cols.
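One way to approach this split is with select_dtypes, sketched below on a hypothetical toy DataFrame rather than the actual titanic data. This is an illustration under those assumptions, not necessarily the course's reference solution.

```python
import pandas as pd

# Hypothetical toy data; the real exercise uses the titanic dataset
data = pd.DataFrame({
    "age": [22, 38, 26],
    "fare": [7.25, 71.28, 7.92],
    "sex": ["male", "female", "female"],
    "survived": [False, True, True],
})

# Names of columns stored as int64 or float64, as a NumPy array
num_cols = data.select_dtypes(include=["int64", "float64"]).columns.to_numpy()

# All remaining columns, in their original order
cat_cols = data.columns[~data.columns.isin(num_cols)].to_numpy()

print(num_cols)
print(cat_cols)
```

Note that a bool column such as survived is not selected by include=["int64", "float64"], so it ends up in cat_cols.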

Section 1. Chapter 3