What Is Correlation? How To Find Pearson Coefficient | Correlation
Explore the Linear Regression Using Python

1. What is the Linear Regression?
2. Correlation
3. Building and Training Model
4. Metrics to Evaluate the Model
5. Multivariate Linear Regression

What Is Correlation? How To Find Pearson Coefficient

Another concept from statistics that we need to get acquainted with is correlation. Correlation is a statistical measure of how strongly two variables are related.

When we look at our data, it is sometimes easy to see whether the relationship between our variables is weak or strong, or whether it exists at all. But how can we express the strength of that relationship mathematically, or check that it exists before using it to predict further values? The strength of a linear relationship can be quantified by correlation. A dataset with a weak relationship has a correlation value close to 0, and a dataset with a strong relationship has a value close to 1 or -1. The maximum value of correlation is 1, and the minimum is -1. This value is called the correlation coefficient or the Pearson correlation coefficient, often denoted by the letter r (Pearson’s r).
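To make the definition concrete, here is a minimal sketch that computes Pearson's r directly from its standard formula: the sum of products of deviations from the means, divided by the square root of the product of the sums of squared deviations. The function name pearson_r and the sample data are hypothetical and are not taken from the course.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustrative data, not taken from the course
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)
print(r)
```

Because the numerator can never exceed the denominator in absolute value, the result always stays between -1 and 1.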

The correlation value equals 1 when a straight line (the one we discussed in the previous section) passes through every point of the dataset with a positive slope. Similarly, the correlation value equals -1 when a straight line passes through every point of the dataset with a negative slope.
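The two extreme cases can be checked with a short sketch. The data below is made up for illustration: the points are generated so that they lie exactly on a line, first with a positive slope and then with a negative one. The rvalue attribute of the result returned by stats.linregress holds the correlation coefficient.

```python
from scipy import stats

x = [1, 2, 3, 4, 5]

# Every point lies exactly on y = 2x + 1 (positive slope)
y_up = [2 * xi + 1 for xi in x]
# Every point lies exactly on y = -3x + 10 (negative slope)
y_down = [-3 * xi + 10 for xi in x]

r_up = stats.linregress(x, y_up).rvalue
r_down = stats.linregress(x, y_down).rvalue

print(round(r_up, 6))
print(round(r_down, 6))
```

Up to floating-point rounding, the first value should be 1 and the second -1.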

The farther the points are from our straight line, the closer the correlation value is to 0.
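A small experiment illustrates this. The sketch below assumes synthetic data: points on the line y = 2x + 1 are pushed away from it by random noise of increasing size, and r is recomputed each time. The line, the noise levels, and the seed are arbitrary choices made for the illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.linspace(0, 10, 500)

r_by_noise = {}
for noise_std in (0.1, 5.0, 50.0):
    # Points on the line y = 2x + 1, pushed away from it by random noise
    y = 2 * x + 1 + rng.normal(0, noise_std, size=x.size)
    r_by_noise[noise_std] = stats.linregress(x, y).rvalue

for noise_std, r in r_by_noise.items():
    print(f"noise std {noise_std}: r = {r:.3f}")
```

With very little noise r stays close to 1, and it shrinks toward 0 as the points move farther from the line.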

Do you remember the library we worked with in the previous section? SciPy provides the helpful function stats.linregress(x, y), which returns the slope, the intercept, and also the correlation coefficient as its third return value.

    from scipy import stats

    # Get the linear regression parameters
    slope, intercept, r, p, std_err = stats.linregress(x, y)

    # Print the correlation coefficient
    print(r)

Other Python libraries also offer ways to find the Pearson coefficient, and they are sometimes more convenient. We will look at some of them later.
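As a preview, here is a hedged sketch of two commonly used alternatives, scipy.stats.pearsonr and numpy.corrcoef. These are not necessarily the ones the course covers later. The data is hypothetical, and the result from stats.linregress is included only for comparison.

```python
import numpy as np
from scipy import stats

# Hypothetical illustrative data, not taken from the course
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# scipy.stats.pearsonr returns the coefficient and a p-value
r_pearsonr, p_value = stats.pearsonr(x, y)

# numpy.corrcoef returns a 2x2 correlation matrix; r is the off-diagonal entry
r_corrcoef = np.corrcoef(x, y)[0, 1]

# For comparison, the value reported by linregress
r_linregress = stats.linregress(x, y).rvalue

print(r_pearsonr, r_corrcoef, r_linregress)
```

All three calls compute the same Pearson coefficient, so the printed values should agree up to floating-point rounding.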

Task

In the previous task, we found that a cat's weight depends on how much they eat. Let's determine the correlation coefficient to find out how the number of calories consumed depends on the cat's age.

  1. [Lines #2-3] Import matplotlib.pyplot and the SciPy library.
  2. [Line #10] Find the correlation coefficient by calling the function discussed above.
  3. [Line #18] Print the Pearson coefficient.
  4. [Lines #21-22] Display the plot.

