Standardization | Basic Concepts of PCA
Principal Component Analysis
1. What is Principal Component Analysis
2. Basic Concepts of PCA
3. Model Building
4. Results Analysis

Standardization

Let's begin analyzing the mathematical model behind PCA.

First, we standardize the data that the algorithm will work with. Standardization rescales every continuous variable so that its mean is 0 and its standard deviation is 1.

This step matters because PCA looks for the directions of greatest variance, so it is sensitive to the scale of the variables. If, for example, one variable in the dataset ranges from 0 to 20 and another from 100 to 10,000, PCA will effectively "ignore" the variable with the small spread (0-20), and that variable will have almost no influence on the result.
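To make the scale problem concrete, here is a minimal sketch that fits PCA to unscaled data. The two synthetic features, their ranges, and the variable names are assumptions chosen to mirror the example above; they are not part of the course material.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: one feature in 0-20, another in 100-10,000.
rng = np.random.default_rng(0)
small = rng.uniform(0, 20, size=200)
large = rng.uniform(100, 10_000, size=200)
X = np.column_stack([small, large])

# Fit PCA directly on the unscaled data.
pca = PCA(n_components=2).fit(X)

# Share of total variance captured by the first component.
first_ratio = pca.explained_variance_ratio_[0]
# Weights of the first component on (small, large).
first_component = pca.components_[0]

print(first_ratio)
print(first_component)
```

On data like this, the first component should point almost entirely along the large-range feature and capture nearly all of the variance, which is the "ignoring" effect described above.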

The formula for standardization is simple: subtract the variable's mean from each value and divide the result by the variable's standard deviation.
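In symbols, for a value x of a variable with mean μ and standard deviation σ, the description above corresponds to the following (the symbol names are a notational choice, not taken from the course):

$$
z = \frac{x - \mu}{\sigma}
$$

For example, the values 2, 4, 6 have mean 4 and (population) standard deviation $\sqrt{8/3} \approx 1.633$, so they standardize to approximately -1.225, 0, and 1.225.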

The scikit-learn Python library lets us do this in one line.
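As a minimal sketch, and assuming the one-liner meant here is scikit-learn's `StandardScaler` (the course does not name the class in this text), it could look like this. The toy array `X` is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features on very different scales.
X = np.array([[2.0, 100.0],
              [10.0, 5000.0],
              [20.0, 10000.0]])

# The one-line standardization: learn each column's mean and
# standard deviation, then rescale the column with them.
X_scaled = StandardScaler().fit_transform(X)

print(X_scaled.mean(axis=0))  # close to 0 for every column
print(X_scaled.std(axis=0))   # close to 1 for every column
```

`fit_transform` works column by column, so each feature is standardized independently of the others.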

Task

Implement standardization of the array X using the NumPy functions np.mean() and np.std().
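One possible solution, as a minimal sketch: the function name `standardize` and the sample array are hypothetical, and the code assumes that X is a 2-D array with samples in rows and that no column is constant (a zero standard deviation would cause a division by zero).

```python
import numpy as np

def standardize(X):
    """Standardize each column of X to mean 0 and standard deviation 1."""
    mean = np.mean(X, axis=0)  # per-column mean
    std = np.std(X, axis=0)    # per-column (population) standard deviation
    return (X - mean) / std

# Hypothetical sample data with two features on different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

X_std = standardize(X)
print(X_std)
```

Both columns end up with the same standardized values here, because each is an evenly spaced sequence; the original difference in scale is gone.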

Section 2. Chapter 1