Learn Data Selection - Advanced Techniques | Data Manipulation and Cleaning
Data Analysis with R

Data Selection - Advanced Techniques

In the previous chapter, we explored how to select single rows and columns using basic indexing. Now, we’ll go a step further and learn how to select multiple rows and columns using both base R and the dplyr package. These techniques are essential when you want to focus on specific parts of a dataset or prepare your data for further analysis.

Selecting multiple columns (Base R)

  • Use the c() function to combine multiple column positions or names;

  • This allows you to select several columns at once;

  • The result is a smaller data frame with only the specified columns.

Example using column positions:

selected_data_base <- df[, c(1, 2, 3)]
head(selected_data_base)

Example using column names:

selected_data_base <- df[, c("name", "selling_price", "transmission")]
head(selected_data_base)
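
The two selection styles above can be tried on a small stand-in data frame. This is a minimal sketch, not the course dataset: the rows and the extra year column are made up for illustration, and only the column names name, selling_price, and transmission come from the examples above.

```r
# Toy data frame; all values (and the `year` column) are hypothetical.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C"),
  selling_price = c(450000, 370000, 158000),
  transmission  = c("Manual", "Manual", "Automatic"),
  year          = c(2014, 2014, 2006)
)

# Select the first three columns by position.
by_position <- df[, c(1, 2, 3)]

# Select the same three columns by name.
by_name <- df[, c("name", "selling_price", "transmission")]

# In this toy data both approaches pick the same columns.
same_result <- identical(by_position, by_name)

# Selecting a single column with df[, j] returns a vector;
# drop = FALSE keeps the result as a one-column data frame.
price_vector <- df[, "selling_price"]
price_frame  <- df[, "selling_price", drop = FALSE]
```

Selecting by name is usually the more robust choice, since it keeps working if the column order changes.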

Indexing single values

  • You can access a specific value using row and column numbers;

  • This is helpful for checking or debugging individual data points.

df[1, 2]  # accesses the value in row 1, column 2
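
As a small sketch of the same idea, the column can also be given by name instead of by number. The data frame below is a made-up stand-in, not the course dataset.

```r
# Toy data frame; all values are hypothetical.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C"),
  selling_price = c(450000, 370000, 158000),
  transmission  = c("Manual", "Manual", "Automatic")
)

# Row 1, column 2, addressed by position.
value_by_position <- df[1, 2]

# The same cell, addressed by column name.
value_by_name <- df[1, "selling_price"]

# The same cell again, using $ to extract the column first.
value_by_dollar <- df$selling_price[1]
```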

Slicing rows

Sometimes you only want to work with the first few rows, or specific rows by position.

Select first 5 rows

Using base R:

first_5_rows_base <- df[1:5, ]
head(first_5_rows_base)

Using dplyr:

library(dplyr)  # provides the %>% pipe and slice()

first_5_rows_dplyr <- df %>%
  slice(1:5)
head(first_5_rows_dplyr)
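
Selecting specific rows by position works the same way: pass a vector of row numbers instead of a range. The sketch below uses a made-up data frame, and the dplyr call is wrapped in a check so the snippet still runs if dplyr is not installed.

```r
# Toy data frame; all values are hypothetical.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C", "Car D", "Car E", "Car F"),
  selling_price = c(450000, 370000, 158000, 225000, 130000, 440000)
)

# A contiguous range of rows: rows 1 through 3.
first_three <- df[1:3, ]

# Non-adjacent rows: combine the positions with c().
picked_rows <- df[c(2, 5), ]

# The dplyr equivalent, run only when dplyr is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  picked_rows_dplyr <- dplyr::slice(df, c(2, 5))
}
```

Note the trailing comma in df[c(2, 5), ]: it leaves the column position empty, which means "keep all columns".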

What does df[1:5, ] do?


Section 1. Chapter 5
