Seeing the Big Picture
Now we have become familiar with all stages of the PCA algorithm: data standardization, calculation of the covariance matrix, calculation of eigenvalues, eigenvectors, formation of a feature vector, and subsequent application of the results to the data.
So, as we can see, the main components are linear combinations of the original variables from the dataset. This is one of the main ideas that is important to remember. Also, as we may have noticed, we only used PCA to work with continuous data. In the following sections we will find out why.
We have seen that Python has all the tools we need to implement the Principal Component Method step by step. As we have seen in previous chapters, creating a PCA model can be done in 1 line, but here we wanted to show a little more detail behind it all.
In the following chapters, we will be able to solve the problem of data dimensionality reduction on a large dataset.
Thanks for your feedback!
Ask AI
Ask AI
Ask anything or try one of the suggested questions to begin our chat
Ask me questions about this topic
Summarize this chapter
Show real-world examples
Awesome!
Completion rate improved to 5.26
Seeing the Big Picture
Swipe to show menu
Now we have become familiar with all stages of the PCA algorithm: data standardization, calculation of the covariance matrix, calculation of eigenvalues, eigenvectors, formation of a feature vector, and subsequent application of the results to the data.
So, as we can see, the main components are linear combinations of the original variables from the dataset. This is one of the main ideas that is important to remember. Also, as we may have noticed, we only used PCA to work with continuous data. In the following sections we will find out why.
We have seen that Python has all the tools we need to implement the Principal Component Method step by step. As we have seen in previous chapters, creating a PCA model can be done in 1 line, but here we wanted to show a little more detail behind it all.
In the following chapters, we will be able to solve the problem of data dimensionality reduction on a large dataset.
Thanks for your feedback!