Learn Data Selection - Advanced Techniques | Data Manipulation and Cleaning
Data Analysis with R

Data Selection - Advanced Techniques

In the previous chapter, we explored how to select single rows and columns using basic indexing. Now, we’ll go a step further and learn how to select multiple rows and columns using both base R and the dplyr package. These techniques are essential when you want to focus on specific parts of a dataset or prepare your data for further analysis.

Selecting multiple columns (Base R)

  • Use the c() function to combine multiple column positions or names;

  • This allows you to select several columns at once;

  • The result is a smaller data frame with only the specified columns.

Example using column positions:

selected_data_base <- df[, c(1, 2, 3)]
head(selected_data_base)

Example using column names:

selected_data_base <- df[, c("name", "selling_price", "transmission")]
head(selected_data_base)
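
As a minimal sketch of the two approaches above, the following builds a small made-up data frame and checks that selecting by position and selecting by name return the same result. The row values and the extra `year` column are invented for illustration; only the column names `name`, `selling_price`, and `transmission` come from the examples above.

```r
# Toy data frame; all values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B", "Car C"),
  selling_price = c(4500, 7200, 3100),
  transmission = c("Manual", "Automatic", "Manual"),
  year = c(2014, 2018, 2011),  # hypothetical extra column
  stringsAsFactors = FALSE
)

# Select the first three columns by position and by name
by_position <- df[, c(1, 2, 3)]
by_name <- df[, c("name", "selling_price", "transmission")]

# Both selections contain the same columns and values
print(identical(by_position, by_name))

# The result is still a data frame, now without the year column
print(is.data.frame(by_name))
print(names(by_name))
```

Selecting by name is usually the more robust choice, since it keeps working if the column order of the dataset changes.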

Indexing single values

  • You can access a specific value using row and column numbers;

  • This is helpful for checking or debugging individual data points.

df[1, 2]  # accesses the value in row 1, column 2
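
The same lookup can also be written with a column name instead of a column number. The sketch below assumes a base R data frame with invented values; the column names are borrowed from the earlier examples.

```r
# Toy data frame; values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B"),
  selling_price = c(4500, 7200),
  stringsAsFactors = FALSE
)

# Row 1, column 2 by position
value_by_position <- df[1, 2]

# Row 1 of the selling_price column by name
value_by_name <- df[1, "selling_price"]

# Both forms return the same single value
print(identical(value_by_position, value_by_name))
```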

Slicing rows

Sometimes you only want to work with the first few rows, or specific rows by position.

Select the first 5 rows using base R:

first_5_rows_base <- df[1:5, ]
head(first_5_rows_base)

Using dplyr:

library(dplyr)  # provides %>% and slice()

first_5_rows_dplyr <- df %>%
  slice(1:5)
head(first_5_rows_dplyr)
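
Selecting specific rows by position works the same way as a range: pass a vector of row numbers built with c(). The sketch below uses an invented six-row data frame and picks rows 1, 3, and 5. The dplyr call is guarded so the example still runs if the package is not installed.

```r
# Toy data frame; values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B", "Car C", "Car D", "Car E", "Car F"),
  selling_price = c(4500, 7200, 3100, 9800, 5600, 6400),
  stringsAsFactors = FALSE
)

# Base R: rows 1, 3 and 5, all columns
rows_base <- df[c(1, 3, 5), ]
print(rows_base$name)

# dplyr equivalent, run only if the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  rows_dplyr <- dplyr::slice(df, c(1, 3, 5))
  # Compare the selected values from both approaches
  print(all.equal(rows_base$name, rows_dplyr$name))
}
```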

Section 1. Chapter 5
