Residuals | Metrics to Evaluate the Model
Explore the Linear Regression Using Python
course content

1. What is the Linear Regression?
2. Correlation
3. Building and Training Model
4. Metrics to Evaluate the Model
5. Multivariate Linear Regression

Residuals

If we look at the plot of flavanoids against total phenols, it becomes clear that linear regression was not an entirely appropriate choice in this case. But how do we measure how good our predictions actually are?

Some points lie on the fitted line, and others lie away from it. We can measure the distance between a point and the line along the y-axis. This distance is called the residual, or error: the difference between the observed value of the target and the predicted value. The closer the residuals are to 0, the better the model fits. Let's calculate the residuals for our model and present them as a chart.
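To make the definition concrete first, here is a minimal sketch with made-up numbers. The arrays `observed` and `predicted` are hypothetical and are not the course data; they only illustrate the subtraction.

```python
import numpy as np

# Hypothetical observed targets and model predictions (not the course data)
observed = np.array([2.0, 3.5, 1.0, 4.0])
predicted = np.array([2.5, 3.0, 1.0, 3.0])

# Residual = observed value minus predicted value
residuals = observed - predicted
print(residuals)
```

A negative residual means the prediction was too high, a positive one means it was too low, and a residual of 0 means the point lies exactly on the line.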

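    # Continues from the earlier steps: X_test, Y_test, y_test_predicted
    # and plt (matplotlib.pyplot) are assumed to be defined already
    residuals = Y_test - y_test_predicted

    # Visualize the data
    ax = plt.gca()
    ax.set_xlabel('total_phenols')
    ax.set_ylabel('residuals')
    plt.scatter(X_test, residuals)
    plt.show()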

Output: [scatter plot of the residuals against total_phenols]

Our residuals form three almost straight lines. Such a pattern is a sign that the model fits the data poorly. Ideally, the residuals should be scattered symmetrically and randomly around the horizontal axis. If the residual plot shows any pattern, linear or non-linear, the model is not capturing all of the structure in the data.
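Besides looking at the plot, one possible way to put a number on "the residuals show a pattern" is sketched below. This is not the course's method, only an illustration under stated assumptions: the data is synthetic, the helper `residual_pattern_strength` is a hypothetical name, and the check only detects curvature, by fitting a quadratic to the residuals and reporting how much of their variance it explains.

```python
import numpy as np

def residual_pattern_strength(x, y):
    """Fit a line to (x, y), then report the share of residual variance
    (between 0 and 1) that a quadratic curve in x can explain."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)

    # Fit a quadratic to the residuals; random residuals leave little to explain
    fitted = np.polyval(np.polyfit(x, residuals, 2), x)
    ss_res = np.sum((residuals - fitted) ** 2)
    ss_tot = np.sum((residuals - residuals.mean()) ** 2)
    return 1 - ss_res / ss_tot

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
noise = rng.normal(scale=0.5, size=x.size)

# Synthetic data: one truly linear relationship, one curved relationship
y_linear = 2 * x + 1 + noise
y_curved = 0.5 * x ** 2 + noise

strength_linear = residual_pattern_strength(x, y_linear)
strength_curved = residual_pattern_strength(x, y_curved)
print(strength_linear, strength_curved)
```

A value near 0 suggests the residuals look random with respect to this check, while a value near 1 suggests a straight line is missing clear curvature in the data.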

Task

Find the residuals for our previous challenge:

  1. [Line #29] Define the variable y_test_predicted as predicted data for X_test.
  2. [Line #30] Assign the difference between the variables Y_test and y_test_predicted to the variable residuals.
  3. [Line #31] Print the variable residuals.
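
The three steps could look roughly like this. The course's own model object and data-loading code are not shown in this section, so the sketch substitutes synthetic data and a least-squares line fitted with `np.polyfit`; only the names `X_test`, `Y_test`, `y_test_predicted`, and `residuals` come from the task, and everything else is an assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the course data (assumption: one feature, one target)
X = rng.uniform(1.0, 4.0, size=40)
Y = 1.2 * X - 0.5 + rng.normal(scale=0.3, size=40)

# Simple hold-out split: first 30 points for training, last 10 for testing
X_train, X_test = X[:30], X[30:]
Y_train, Y_test = Y[:30], Y[30:]

# Stand-in for the trained model: a least-squares line fitted on the training set
slope, intercept = np.polyfit(X_train, Y_train, 1)

# Step 1: predicted data for X_test
y_test_predicted = slope * X_test + intercept

# Step 2: residuals are observed minus predicted values
residuals = Y_test - y_test_predicted

# Step 3: print the residuals
print(residuals)
```

In the actual exercise, step 1 would use the model trained in the earlier chapter rather than the stand-in line here.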

Section 4. Chapter 1