Learn Data Selection - Advanced Techniques | Data Manipulation and Cleaning
Data Analysis with R

Data Selection - Advanced Techniques

In the previous chapter, we explored how to select single rows and columns using basic indexing. Now, we’ll go a step further and learn how to select multiple rows and columns using both base R and the dplyr package. These techniques are essential when you want to focus on specific parts of a dataset or prepare your data for further analysis.

Selecting multiple columns (Base R)

  • Use the c() function to combine multiple column positions or names;

  • This allows you to select several columns at once;

  • The result is a smaller data frame with only the specified columns.

Example using column positions:

selected_data_base <- df[, c(1, 2, 3)]
head(selected_data_base)

Example using column names:

selected_data_base <- df[, c("name", "selling_price", "transmission")]
head(selected_data_base)
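The examples above assume a data frame named df is already loaded. As a minimal, self-contained sketch, the code below builds a small stand-in data frame (the rows and values are made up for illustration; only the column names come from the examples above) and checks that selecting by position and by name gives the same result. It also shows a common surprise: selecting a single column with this syntax returns a plain vector unless drop = FALSE is passed.

```r
# Hypothetical stand-in for the lesson's df; the values are made up.
df <- data.frame(
  name = c("Car A", "Car B", "Car C"),
  selling_price = c(450000, 370000, 158000),
  transmission = c("Manual", "Manual", "Automatic"),
  year = c(2014, 2014, 2006)
)

# Select the same three columns by position and by name.
by_position <- df[, c(1, 2, 3)]
by_name <- df[, c("name", "selling_price", "transmission")]
identical(by_position, by_name)

# Selecting a single column this way drops to a plain vector...
price_vector <- df[, "selling_price"]
is.data.frame(price_vector)

# ...unless drop = FALSE is given, which keeps a one-column data frame.
price_df <- df[, "selling_price", drop = FALSE]
is.data.frame(price_df)
```

Selecting by name is usually the safer choice, because it keeps working if the column order of the dataset changes.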

Indexing single values

  • You can access a specific value using row and column numbers;

  • This is helpful for checking or debugging individual data points.

df[1, 2]  # accesses the value in row 1, column 2
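A single cell can also be addressed by column name instead of column number. The sketch below uses a small made-up data frame (the values are hypothetical) to show that both forms return the same value.

```r
# Hypothetical two-row data frame; the values are made up.
df <- data.frame(
  name = c("Car A", "Car B"),
  selling_price = c(450000, 370000)
)

# Row 1, column 2, addressed by position and then by column name.
value_by_position <- df[1, 2]
value_by_name <- df[1, "selling_price"]

# Both expressions refer to the same cell.
identical(value_by_position, value_by_name)
```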

Slicing rows

Sometimes you only want to work with the first few rows, or specific rows by position.

Selecting the first 5 rows using base R:

first_5_rows_base <- df[1:5, ]
head(first_5_rows_base)

Using dplyr:

library(dplyr)  # provides the %>% pipe and slice()
first_5_rows_dplyr <- df %>%
  slice(1:5)
head(first_5_rows_dplyr)
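Rows do not have to form a continuous range. The base R sketch below, again using a small made-up data frame (hypothetical values), picks specific rows by position with c(). It also illustrates one thing to watch for: if a data frame has fewer rows than the range requested, base R indexing fills the missing positions with NA rows, whereas head() simply stops at the last available row.

```r
# Hypothetical three-row data frame; the values are made up.
df <- data.frame(
  name = c("Car A", "Car B", "Car C"),
  selling_price = c(450000, 370000, 158000)
)

# Pick specific rows by position: rows 1 and 3.
picked_rows <- df[c(1, 3), ]

# Asking for more rows than exist pads the result with NA rows.
padded <- df[1:5, ]
nrow(padded)

# head() returns at most the requested number of rows.
capped <- head(df, 5)
nrow(capped)
```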

What does df[1:5, ] do?

Section 1. Chapter 5