Explore the Linear Regression Using Python
What Is Correlation? How To Find Pearson Coefficient
In statistics, there is another concept we need to get acquainted with: correlation. Correlation is a statistical measure of how strongly two variables are related.
When we look at our data, it is sometimes easy to tell whether the relationship between the variables is weak or strong, or whether it exists at all. But how can we express the strength of that relationship as a number, so that we can judge how far to trust predictions of further values? Correlation quantifies the strength of a linear relationship. A dataset with a weak relationship has a correlation value close to 0, and a dataset with a strong relationship has a value close to 1 or -1. The maximum value of correlation is 1, and the minimum is -1. This value is called the correlation coefficient or the Pearson correlation coefficient, often denoted by the letter r (Pearson's r).
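To make the coefficient less abstract, here is a minimal sketch that computes Pearson's r straight from its textbook definition: the sum of the products of the deviations from the means, divided by the square root of the product of the summed squared deviations. The helper name pearson_r and the hours/score numbers are invented for this illustration and do not come from the course data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Hypothetical data, made up for this example
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 79]

r = pearson_r(hours, score)
print(round(r, 3))
```

The result always lies between -1 and 1, whatever units the two variables are measured in.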
The correlation value equals 1 when the straight line (the regression line discussed in the previous section) goes through every point of the dataset and has a positive slope. Similarly, the correlation value equals -1 when the straight line goes through every point of the dataset and has a negative slope.
The farther the points are from our straight line, the closer the correlation value is to 0.
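These three cases can be checked numerically. The sketch below uses synthetic data (the slopes, intercepts, noise level, and random seed are arbitrary choices for illustration) and NumPy's np.corrcoef, which returns a correlation matrix whose off-diagonal entry is r.

```python
import numpy as np

x = np.arange(10, dtype=float)

y_up = 2.0 * x + 1.0     # every point lies on a rising line
y_down = -3.0 * x + 5.0  # every point lies on a falling line

# The same rising line, with random noise pushing the points away from it
rng = np.random.default_rng(0)
y_noisy = 2.0 * x + 1.0 + rng.normal(scale=10.0, size=x.size)

r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]
r_noisy = np.corrcoef(x, y_noisy)[0, 1]

print(r_up, r_down, r_noisy)
```

The first two values come out as 1 and -1 up to floating-point rounding, while the noisy dataset gives a value strictly between them.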
Do you remember the library we worked with in the previous section? SciPy provides the helpful function stats.linregress(x, y), which returns the slope, the intercept, and also the correlation coefficient as its third returned value.
from scipy import stats

# Get the linear regression parameters (x and y are the data arrays from the previous section)
slope, intercept, r, p, std_err = stats.linregress(x, y)

# Print the correlation coefficient
print(r)
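For a version that runs on its own, here is a small sketch with made-up x and y values (they are placeholders, not the course dataset). It also shows that the result of stats.linregress can be read by attribute name, for example rvalue, as well as by tuple unpacking.

```python
from scipy import stats

# Hypothetical measurements, invented for this example
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

result = stats.linregress(x, y)

# Named attributes of the result object
print(result.slope, result.intercept)
print(result.rvalue)

# The same values through unpacking, as in the lesson's snippet
slope, intercept, r, p, std_err = result
print(r)
```

Because these points lie almost exactly on a rising line, r comes out close to 1.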
Other Python libraries also offer ways to find the Pearson coefficient, and some of them are more convenient in certain situations. We will look at some of them later.
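As a brief preview, and only as a sketch, two commonly used alternatives are scipy.stats.pearsonr and numpy.corrcoef. The data below is invented for illustration. All three approaches should agree on the value of r.

```python
import numpy as np
from scipy import stats

# Hypothetical data, made up for this comparison
x = [1, 2, 3, 4, 5, 6, 7]
y = [10, 14, 13, 18, 21, 20, 25]

r_linregress = stats.linregress(x, y).rvalue

# pearsonr returns the coefficient first, followed by a p-value
r_pearsonr = stats.pearsonr(x, y)[0]

# corrcoef returns a 2x2 correlation matrix; r is the off-diagonal entry
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_linregress, r_pearsonr, r_numpy)
```

stats.pearsonr is handy when only the coefficient and its p-value are needed, without fitting a line.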
Task
In the previous task, we found that a cat's weight depends on how much it eats. Now let's determine the correlation coefficient to find out how strongly the number of calories consumed depends on the cat's age.
- [Lines #2-3] Import matplotlib.pyplot and the SciPy library.
- [Line #10] Find the correlation coefficient by calling the function discussed above.
- [Line #18] Print the Pearson coefficient.
- [Lines #21-22] Display the plot.