Learn Best Practices for Readable Pipelines | Pipes and Chaining Operations
Data Manipulation in R

Best Practices for Readable Pipelines

When writing pipelines in R, following best practices for readability is essential for both collaboration and your own future reference. Readable code helps teams quickly understand each step of a data transformation, reduces errors, and makes future updates much easier. Clear, well-structured pipelines also help you debug and maintain your code as your projects grow in complexity.

    # Load necessary library
    library(dplyr)

    # Create sample sales data
    sales_data <- data.frame(
      region = c("North", "South", "North", "West", NA),
      quantity = c(10, 5, 8, 12, 7),
      price = c(100, 120, 100, 90, 110)
    )

    # Clean and summarize sales data
    cleaned_sales <- sales_data %>%
      filter(!is.na(region)) %>%                  # Remove rows with missing region
      mutate(total_sale = quantity * price) %>%   # Calculate total sale per row
      group_by(region) %>%                        # Group by region
      summarise(total_revenue = sum(total_sale))  # Summarize total revenue per region

    library(knitr)
    kable(cleaned_sales)

Notice how this pipeline uses clear variable names such as cleaned_sales and includes comments for each step. Each data transformation is written on its own line, and the verbs are aligned for easy scanning. This formatting makes it easy for anyone reading the code to follow the logic from raw data to the final summary, and the inline comments explain the purpose of each operation.

    cleaned<-sales_data%>%filter(!is.na(region))%>%mutate(total_sale=quantity*price)%>%group_by(region)%>%summarise(total_revenue=sum(total_sale))
    kable(cleaned)

The previous code sample shows a poorly formatted pipeline. The code is compressed onto a single line, variable names are less descriptive, and there are no comments. This makes it difficult to quickly understand what the code is doing, increasing the risk of mistakes and making it harder to debug or update in the future. Common pitfalls include using unclear variable names, skipping comments, and cramming too many operations into a single line. To avoid these issues, always use descriptive names, break up long pipelines into logical steps, and document your process with comments.
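One way to apply this advice is to split the pipeline into named intermediate steps. The sketch below reuses the lesson's sample data. The intermediate names (valid_sales, sales_with_totals, revenue_by_region) are illustrative assumptions, not names from the lesson.

```r
library(dplyr)

# Same sample data as in the lesson
sales_data <- data.frame(
  region = c("North", "South", "North", "West", NA),
  quantity = c(10, 5, 8, 12, 7),
  price = c(100, 120, 100, 90, 110)
)

# Step 1 (cleaning): keep only rows with a known region
valid_sales <- sales_data %>%
  filter(!is.na(region))

# Step 2 (feature creation): compute revenue for each row
sales_with_totals <- valid_sales %>%
  mutate(total_sale = quantity * price)

# Step 3 (aggregation): total revenue per region
revenue_by_region <- sales_with_totals %>%
  group_by(region) %>%
  summarise(total_revenue = sum(total_sale))

print(revenue_by_region)
```

Splitting at logical boundaries lets you inspect each intermediate object on its own. For a short pipeline like this one, a single well-commented chain is often just as readable, so treat the split as a judgment call.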

Note

When debugging a pipeline, insert print() or glimpse() after individual steps to inspect the data's structure and values at that point. This helps you catch errors early and understand how each transformation affects your data.
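A minimal sketch of this debugging approach, again using the lesson's sample data (the result name debugged_sales is an assumption for illustration). It relies on glimpse() and print() both returning their input invisibly, so the pipeline continues past them unchanged.

```r
library(dplyr)

# Same sample data as in the lesson
sales_data <- data.frame(
  region = c("North", "South", "North", "West", NA),
  quantity = c(10, 5, 8, 12, 7),
  price = c(100, 120, 100, 90, 110)
)

debugged_sales <- sales_data %>%
  filter(!is.na(region)) %>%
  glimpse() %>%                                # Check column types and row count after filtering
  mutate(total_sale = quantity * price) %>%
  print() %>%                                  # Check the values of the new total_sale column
  group_by(region) %>%
  summarise(total_revenue = sum(total_sale))
```

Once the pipeline behaves as expected, remove the inspection calls so they do not clutter the final code or its output.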

1. What makes a pipeline readable and maintainable?

2. Why is it important to use clear variable names and comments?

3. How can you debug a long pipeline in R?
