Advanced Techniques in pandas
Finding the Correlation
Finally, let's move to the last method of this section: .corr(). It helps you find relationships between the numerical columns of a dataset. Imagine that you have a dataset on houses.
Let's examine the output of data.corr() for it.
Let's go through it step by step. The column names appear both as row labels and as column labels, and each cell where a row and a column intersect holds the correlation between that pair of columns. Each value lies between -1 and 1:
- 1 means a perfect positive linear relationship (if one value increases, the other increases too);
- -1 means a perfect negative linear relationship (if one value increases, the other decreases);
- 0 means there is no linear relationship between the two values.
Values in between indicate weaker relationships: the closer the value is to 0, the weaker the linear dependence.
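To make these three cases concrete, here is a minimal sketch on a tiny made-up house dataset. The column names and numbers are invented for illustration and are not the dataset from the lesson; they are chosen so that one pair rises together, one pair moves in opposite directions, and one pair has no linear relationship.

```python
import pandas as pd

# Hypothetical house data: names and values are invented for illustration only.
data = pd.DataFrame({
    "area_sqm": [50, 70, 90, 110, 130],
    "price": [100, 140, 180, 220, 260],        # grows linearly with area
    "distance_to_center": [13, 11, 9, 7, 5],   # shrinks linearly as area grows
    "floor": [3, 1, 5, 1, 3],                  # no linear relation to area
})

# .corr() returns a square DataFrame: rows and columns are the column names.
corr_matrix = data.corr()
print(corr_matrix.round(2))

# Look up a single pair by row label and column label.
print(corr_matrix.loc["area_sqm", "price"])               # close to 1
print(corr_matrix.loc["area_sqm", "distance_to_center"])  # close to -1
print(corr_matrix.loc["area_sqm", "floor"])               # close to 0
```

Note that the diagonal of the result is always 1, because every column is perfectly correlated with itself, and the matrix is symmetric, so each pair appears twice.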
Note
If the dataset contains non-numeric columns, as the cars.csv dataset used in the task does, you should set the argument numeric_only=True to compute the correlation using only the numeric columns.
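As a small sketch of this argument, the example below builds a stand-in table with one text column. The data is made up and does not come from cars.csv, and the column names are assumptions. In recent pandas versions, calling .corr() on such a table without numeric_only=True is expected to fail on the text column, so the argument is passed explicitly.

```python
import pandas as pd

# Hypothetical stand-in for a dataset with a text column (not the real cars.csv).
cars = pd.DataFrame({
    "model": ["A", "B", "C", "D"],       # non-numeric column
    "horsepower": [90, 120, 150, 200],
    "mpg": [40, 34, 29, 22],
})

# numeric_only=True tells pandas to skip the non-numeric "model" column.
numeric_corr = cars.corr(numeric_only=True)
print(numeric_corr.round(2))
```

Only horsepower and mpg appear in the result, and in this made-up data their correlation is negative: the more powerful cars have lower mpg.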
Task
You'll end this section with a simple task: apply the .corr() method to the dataset. Then, try to analyze the numbers you get.