Predictive Modeling with Tidymodels in R
Section 1. Chapter 2
Data Preprocessing with Recipes


One of the most powerful tools for preprocessing data in a tidy modeling workflow is the recipes package. The recipes package allows you to define a sequence of preprocessing steps – such as normalization, standardization, encoding, and imputation – using a consistent, readable syntax. Each preprocessing step is added to a "recipe," which can then be applied to your data in a reproducible way. This tidy approach means you can bundle all your data preparation steps together, ensuring that transformations are performed in the correct order and can be easily reproduced or shared. Recipes are especially useful when you want to keep your preprocessing and modeling steps separate, or when you need to apply the same transformations to new data (like test or validation sets).

    options(crayon.enabled = FALSE)
    library(recipes)

    # Sample data
    data <- data.frame(
      age = c(25, 30, NA, 40),
      income = c(50000, 60000, 55000, NA),
      gender = c("male", "female", "female", "male")
    )

    # Create a recipe for normalization and missing value imputation
    rec <- recipe(~ ., data = data) %>%
      step_impute_mean(all_numeric_predictors()) %>%
      step_normalize(all_numeric_predictors())

    # Prep the recipe (estimate parameters)
    rec_prep <- prep(rec, training = data)

    # Apply the recipe to the data
    data_processed <- bake(rec_prep, new_data = data)
    print(data_processed)

When working with the recipes package, you build a recipe by chaining together a series of steps. Each step specifies a transformation or preprocessing action, such as imputing missing values or normalizing numeric variables. You start by creating a recipe object, typically using the recipe() function, and then add steps like step_impute_mean() or step_normalize() using the pipe operator (%>%). Once all steps are added, you prep the recipe with the prep() function, which estimates any required parameters (like means or standard deviations) from your training data. The prepped recipe can then be applied to any dataset using the bake() function, ensuring that the same transformations are used consistently. This workflow keeps your preprocessing steps organized, reproducible, and separate from your modeling code, making it easier to manage complex data transformations.
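
The prep-then-bake split described above matters most when the recipe is applied to data it was not estimated on. The following is a minimal sketch, assuming two small hypothetical data frames (train_data and test_data, with values invented for illustration). The parameters are estimated on the training set only and then reused on the test set.

```r
library(recipes)

# Hypothetical training and test sets (values invented for illustration)
train_data <- data.frame(
  age    = c(20, 30, 40, NA),
  income = c(40000, 50000, 60000, 50000)
)
test_data <- data.frame(
  age    = c(NA, 50),
  income = c(50000, 70000)
)

# Define the steps; nothing is estimated yet
rec <- recipe(~ ., data = train_data) %>%
  step_impute_mean(all_numeric_predictors()) %>%
  step_normalize(all_numeric_predictors())

# Estimate imputation means, then means and standard deviations,
# from the training data only
rec_prep <- prep(rec, training = train_data)

# Inspect what each step learned
tidy(rec_prep, number = 1)  # values used by step_impute_mean()
tidy(rec_prep, number = 2)  # statistics used by step_normalize()

# Apply the training-set parameters to the test set
test_processed <- bake(rec_prep, new_data = test_data)
print(test_processed)
```

In this sketch the missing age in test_data is filled with the training mean (30) and therefore bakes to 0 after normalization. The test set's own statistics are never used, which is what prevents information from leaking from test data into preprocessing.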

Task


Create a recipe that standardizes all numeric variables and encodes all categorical variables in the provided training data.

  • Load the recipes package.
  • Initialize a recipe() using the formula ~ . and the train_data.
  • Add a step to center all numeric predictors using step_center() and all_numeric_predictors().
  • Add a step to scale all numeric predictors using step_scale() and all_numeric_predictors().
  • Add a step to convert all nominal (categorical) predictors to dummy variables using step_dummy() and all_nominal_predictors().
  • Prepare the recipe by calling prep() on the training data.
  • Apply the prepared recipe to the training data with bake().

Solution
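
The page does not include the exercise's train_data or its reference solution, so the following is only one possible sketch of the task steps. The train_data defined here is a hypothetical stand-in, and the names rec, rec_prep, and train_processed are illustrative.

```r
library(recipes)

# Hypothetical stand-in for the exercise's train_data
train_data <- data.frame(
  age    = c(25, 30, 35, 40),
  income = c(50000, 60000, 55000, 65000),
  gender = factor(c("male", "female", "female", "male"))
)

# Center and scale the numeric predictors, then dummy-encode the nominal ones
rec <- recipe(~ ., data = train_data) %>%
  step_center(all_numeric_predictors()) %>%
  step_scale(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

# Estimate means, standard deviations, and factor levels from the training data
rec_prep <- prep(rec, training = train_data)

# Apply the prepared recipe to the training data
train_processed <- bake(rec_prep, new_data = train_data)
print(train_processed)
```

With this stand-in data, the numeric columns come out with mean 0 and standard deviation 1, and the gender column is replaced by a single indicator column, gender_male.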
