Mean Absolute Error (MAE)

A residuals plot is a good way to see whether our model performs well, but it is not the best one. Let's look at the metrics that are commonly used to evaluate a regression model. If we went through all the residuals and analyzed them one by one, we would see how far off each prediction was. But what if we have a very large dataset?

Analyzing each of tens of thousands of residuals is a tedious task. In that case, it is easier to take the sum of the residuals and divide it by the number of objects in our sample, which gives us the mean residual. However, even a value very close to zero does not mean that our model is good: the positive and negative residuals can simply cancel each other out.
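As a quick illustration (with made-up residual values), the sketch below shows a set of residuals whose mean is exactly zero even though every single prediction is off:

import numpy as np

# Hypothetical residuals (true value minus prediction) for five objects.
# None of them is zero, yet the positives and negatives cancel out.
residuals = np.array([3.0, -1.0, -2.0, 4.0, -4.0])

print(residuals.mean())        # 0.0 -- looks "perfect", but it is not
print(abs(residuals).mean())   # 2.8 -- the average size of the error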

Here the sum of the residuals divided by their number is 0, but that does not mean our model is perfect. So simply averaging the deviations of the predictions from the true values is not a good way to measure the performance of our model. We need something slightly more sophisticated. First, we need to get rid of the sign. This can be done with the absolute value: we take the sum of the absolute differences (so negative deviations become positive and can no longer cancel out the positive ones), divide it by the number of objects, and obtain the mean absolute error (MAE).

MAE = abs(residuals).mean()  # mean of the absolute values of the residuals

We can also use the mean_absolute_error() function from the scikit-learn metrics module to get the same result:

from sklearn.metrics import mean_absolute_error
print(mean_absolute_error(Y_test, y_test_predicted))
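To see that both approaches agree, here is a minimal sketch with hypothetical true and predicted values (y_true and y_pred are placeholder names, not variables from the course code):

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical true and predicted values.
y_true = np.array([10.0, 12.0, 15.0, 20.0])
y_pred = np.array([11.0, 11.0, 17.0, 18.0])

residuals = y_true - y_pred                  # [-1.0, 1.0, -2.0, 2.0]
print(abs(residuals).mean())                 # 1.5
print(mean_absolute_error(y_true, y_pred))   # 1.5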

Ideally, the mean absolute error tends to zero, which would mean that our predictions match all the true values exactly.

The general formula for this metric:
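$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

where $y_i$ is the true value for the $i$-th object, $\hat{y}_i$ is the model's prediction for it, and $n$ is the number of objects in the sample.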

Task

Let's calculate the MAE using the data from the previous task. You should find it both mathematically and using the built-in function.

  1. [Line #33] Assign the mean of the absolute values of the residuals to the variable MAE_math.
  2. [Line #36] Import mean_absolute_error for calculating the metric.
  3. [Line #37] Use mean_absolute_error() to find the MAE and assign it to the variable MAE_func.
  4. [Line #40] Print MAE_math and MAE_func using a single call to the print() function.
