Explore the Linear Regression Using Python
Mean Absolute Error (MAE)
A residuals plot is a good way to judge a model, but not the best one. Let's look at the metrics that are commonly used to evaluate a regression model. If we go through all the residuals one by one, we will see how far off each prediction was. But what if we have a very large dataset?
Analyzing each of tens of thousands of residuals by hand is impractical. In that case it is easier to sum the residuals and divide by the number of objects in the sample, which gives the mean residual over the sample. However, a value close to zero does not mean that the model is good: it can also appear when positive and negative residuals cancel each other out.
In such a case the sum of residuals divided by their number can be exactly 0, yet the model is far from perfect. So averaging the raw deviations of the predictions from the true values is not a reliable performance metric. We need something slightly more involved. First, we need to get rid of the minus sign. This can be done with the absolute value (modulus): take the sum of the absolute values of the differences, so that negative deviations become positive and can no longer cancel the positive ones, and divide it by the number of objects. The result is the mean absolute error (MAE).
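To make the cancellation effect concrete, here is a minimal sketch in plain Python. The numbers in y_true and y_pred are made up for illustration and are not the course dataset; residuals are taken as true value minus prediction, which is an assumption about the sign convention.

```python
# Hypothetical toy data, chosen so that positive and negative errors cancel.
y_true = [10, 20, 30, 40]
y_pred = [13, 17, 35, 35]

# Residual = true value minus predicted value (assumed sign convention).
residuals = [t - p for t, p in zip(y_true, y_pred)]

# Plain mean of the residuals: the signs cancel each other.
mean_residual = sum(residuals) / len(residuals)

# Mean of the absolute residuals: every error counts with a positive sign.
mae = sum(abs(r) for r in residuals) / len(residuals)

print(residuals)      # [-3, 3, -5, 5]
print(mean_residual)  # 0.0
print(mae)            # 4.0
```

Every prediction here is off by 3 or 5 units, yet the mean residual is 0. The MAE of 4 reflects the typical size of the error.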
MAE = abs(residuals).mean()
We can also use the function mean_absolute_error() from the scikit-learn metrics module to output the same result:
from sklearn.metrics import mean_absolute_error
print(mean_absolute_error(Y_test, y_test_predicted))
Ideally, the mean absolute error tends to zero, which means that our predictions hit all the true values.
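As a quick check of both points, the sketch below compares a manual MAE with scikit-learn's mean_absolute_error and shows that perfect predictions give an MAE of zero. The arrays are hypothetical toy values, and the variable names mae_manual, mae_sklearn and mae_perfect are introduced here only for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical toy values, not the course dataset.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Manual calculation: mean of the absolute differences.
mae_manual = np.abs(y_true - y_pred).mean()

# The same metric from scikit-learn (arguments: true values, then predictions).
mae_sklearn = mean_absolute_error(y_true, y_pred)

# If the predictions equal the true values, the MAE is exactly zero.
mae_perfect = mean_absolute_error(y_true, y_true)

print(mae_manual, mae_sklearn, mae_perfect)
```

The absolute differences are 0.5, 0, 1.5 and 1, so both calculations should give 0.75, while the perfect case gives 0.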
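The general formula for this metric:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
where n is the number of objects, y_i is the true value and \hat{y}_i is the predicted value for the i-th object.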
Task
Let's calculate the MAE using the data from the previous task. You should compute it manually and also with the built-in function.
- [Line #33] Assign the mean of the absolute values of the residuals to the variable MAE_math.
- [Line #36] Import mean_absolute_error for calculating metrics.
- [Line #37] Use the function mean_absolute_error() to find the MAE and assign it to the variable MAE_func.
- [Line #40] Print MAE_math and MAE_func using the function print only once.