Predictive Modeling with Tidymodels in R

Evaluating Regression Models


Evaluating regression models is a crucial step in predictive modeling, as it helps you understand how well your models predict continuous outcomes. The most common regression evaluation metrics are Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²). RMSE measures the average magnitude of prediction errors, penalizing larger errors more heavily. MAE calculates the average absolute difference between predicted and actual values, making it less sensitive to outliers than RMSE. R-squared represents the proportion of variance in the dependent variable explained by the model, with values closer to 1 indicating better model fit.
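The formulas behind these three metrics are easy to verify by hand. The sketch below computes them directly on a tiny made-up pair of actual/predicted vectors (the numbers are purely illustrative); note that this uses the traditional "1 − SSres/SStot" form of R², whereas yardstick's `rsq()` reports the squared correlation, which can differ for poorly fitted models.

```r
# Toy actuals and predictions (made-up values for illustration)
actual <- c(3, 5, 8)
pred   <- c(2.5, 5, 9)

errors <- actual - pred
rmse_val <- sqrt(mean(errors^2))   # squares errors first, so large misses dominate
mae_val  <- mean(abs(errors))      # averages absolute errors, so it is outlier-robust
rsq_val  <- 1 - sum(errors^2) / sum((actual - mean(actual))^2)

round(c(rmse = rmse_val, mae = mae_val, rsq = rsq_val), 3)
```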

```r
options(crayon.enabled = FALSE)
library(tidymodels)

# Assume you have a trained regression model and a split dataset
# Fit model (for demonstration, use linear regression)
lm_spec <- linear_reg() %>%
  set_engine("lm")
lm_fit <- lm_spec %>%
  fit(mpg ~ ., data = mtcars)

# Generate predictions on test data (here, using the same data for simplicity)
predictions <- predict(lm_fit, mtcars) %>%
  bind_cols(truth = mtcars$mpg)

# Calculate regression metrics
metrics <- metric_set(rmse, mae, rsq)
results <- metrics(predictions, truth = truth, estimate = .pred)
print(results)

# Visualize predictions vs. actuals
library(ggplot2)
ggplot(predictions, aes(x = truth, y = .pred)) +
  geom_point(color = "steelblue") +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
  labs(title = "Predicted vs. Actual MPG", x = "Actual MPG", y = "Predicted MPG")
```

Once you have calculated these metrics, you need to interpret the results to assess model quality. Lower RMSE and MAE values indicate more accurate predictions, while a higher R-squared value suggests that your model explains more of the outcome variability. Comparing these metrics across different models or preprocessing strategies helps you select the best approach for your data. If you notice high error values or a low R-squared, it could signal issues such as underfitting, data quality problems, or the need for additional feature engineering. Visualizing predicted versus actual values can also reveal patterns like systematic under- or over-prediction, heteroscedasticity, or outliers, all of which provide valuable diagnostic insights for further model refinement.
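Comparing metrics across candidate models can be organized with the same `metric_set()`. A hedged sketch (the `score()` helper and the choice of a full model versus a smaller `wt + hp` model are ours, purely for illustration; evaluating on the training data as below is only for brevity):

```r
library(tidymodels)

eval_metrics <- metric_set(rmse, mae, rsq)

# Two hypothetical candidates: all predictors vs. just weight and horsepower
fit_full  <- linear_reg() %>% set_engine("lm") %>% fit(mpg ~ ., data = mtcars)
fit_small <- linear_reg() %>% set_engine("lm") %>% fit(mpg ~ wt + hp, data = mtcars)

# Helper: predict, attach truth, compute metrics, and label the model
score <- function(fit, label) {
  predict(fit, mtcars) %>%
    bind_cols(truth = mtcars$mpg) %>%
    eval_metrics(truth = truth, estimate = .pred) %>%
    mutate(model = label)
}

comparison <- bind_rows(score(fit_full, "full"), score(fit_small, "wt + hp"))
print(comparison)
```

Stacking the results in one tibble makes it easy to see which specification achieves lower RMSE/MAE and higher R² before committing to a final model.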


Which metric is most appropriate if you want to minimize the impact of large outliers when evaluating a regression model?



Section 1. 5
