Explore the Linear Regression Using Python
Correlation Matrix
Let's go back to our dataset. To explore the relationships between all the columns, we can use a correlation matrix. It holds the pairwise correlation coefficients of all columns. Because the correlation of A with B equals the correlation of B with A, the matrix is symmetric. Use the dataframe.corr() method to build it and show the correlation coefficients between all variables.
Use this code to see the matrix for our wine dataset:
matrix = data.corr().round(2)
print(matrix)
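To make it clearer what corr() computes, here is a minimal sketch on a small made-up table (the toy DataFrame and its columns a, b, c are assumptions for illustration, not the wine dataset). It checks one coefficient against the Pearson formula written out by hand and confirms that the matrix is symmetric with ones on the diagonal.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration (not the wine dataset)
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.1, 5.9, 8.2, 9.8],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})

# corr() uses the Pearson coefficient by default
matrix = toy.corr()

# Pearson r for the pair (a, b), computed directly from its definition
x, y = toy["a"], toy["b"]
dx, dy = x - x.mean(), y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(matrix.round(2))
print("manual r matches corr():", np.isclose(matrix.loc["a", "b"], r_manual))
print("matrix is symmetric:", np.allclose(matrix.values, matrix.values.T))
```

Each column correlates perfectly with itself, so the diagonal is always 1; only the values above (or below) the diagonal carry information.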
To visualize this matrix, import the seaborn library and call the sns.heatmap function:
import seaborn as sns
sns.heatmap(matrix, annot=True)
We can see that alcohol is most positively correlated with proline (0.64), which means that the alcohol content tends to increase as the proline content increases. Hue is most negatively correlated with color intensity (-0.52), which means that the greater the color intensity of the wine, the lower its hue tends to be.
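Reading the extremes off a heatmap works for a small matrix, but the same answer can be obtained in code. The sketch below is one possible approach, again on made-up toy data (the DataFrame and the names mask and pairs are assumptions for illustration): it keeps only the upper triangle of the matrix, so that the diagonal and the mirrored duplicates are ignored, and then looks up the largest and smallest coefficients.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration (not the wine dataset)
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.1, 5.9, 8.2, 9.8],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})
matrix = toy.corr()

# True strictly above the diagonal, False elsewhere
mask = np.triu(np.ones(matrix.shape, dtype=bool), k=1)

# Turn the upper triangle into a Series indexed by (row, column) pairs;
# dropna() removes the masked-out cells
pairs = matrix.where(mask).stack().dropna()

print("most positive:", pairs.idxmax(), round(float(pairs.max()), 2))
print("most negative:", pairs.idxmin(), round(float(pairs.min()), 2))
```

With the wine dataset, applying the same steps to data.corr() should point to the pairs highlighted in the heatmap.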
Task
Later, we will try to predict the characteristics of a wine from the amount of flavanoids in it. Flavanoids are plant pigments, and their most prominent role is to give crops their bright colors.
- [Lines #3-4] Import the pandas and seaborn libraries.
- [Line #17] Write the code to define the correlation matrix, rounding it to two decimal places.
- [Lines #20-24] Find the columns with which flavanoids has the highest positive correlation and the strongest negative correlation. From the heatmap above, these are total_phenols (0.86) and nonflavanoid_phenols (-0.54) respectively. Assign these numbers to the variables positive_cor_value and negative_cor_value respectively (positive_cor_value = 0.86 and negative_cor_value = -0.54). Assign the names and numbers to the corresponding variables.
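The lookup described in the task can also be done programmatically instead of by eye. The following is a minimal sketch, assuming a tiny made-up table that borrows a few wine column names; the values are invented for illustration, so the coefficients it prints will not match the 0.86 and -0.54 quoted above. The variable names positive_cor_name and negative_cor_name are hypothetical.

```python
import pandas as pd

# Made-up values with wine-like column names, for illustration only
toy = pd.DataFrame({
    "flavanoids":           [0.5, 1.2, 2.0, 2.8, 3.5],
    "total_phenols":        [1.1, 1.8, 2.3, 2.9, 3.4],
    "nonflavanoid_phenols": [0.6, 0.5, 0.4, 0.3, 0.2],
    "hue":                  [1.0, 0.8, 1.1, 0.9, 1.2],
})
matrix = toy.corr().round(2)

# Correlations of flavanoids with every other column
# (drop the trivial correlation of the column with itself)
flav = matrix["flavanoids"].drop("flavanoids")

positive_cor_name = flav.idxmax()          # column with the highest positive correlation
positive_cor_value = float(flav.max())
negative_cor_name = flav.idxmin()          # column with the strongest negative correlation
negative_cor_value = float(flav.min())

print(positive_cor_name, positive_cor_value)
print(negative_cor_name, negative_cor_value)
```

Dropping the flavanoids row first matters: otherwise idxmax() would always return flavanoids itself, since every column has a correlation of 1 with itself.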