Learn Data Selection - Advanced Techniques | Data Manipulation and Cleaning
Data Analysis with R

Data Selection - Advanced Techniques

In the previous chapter, we explored how to select single rows and columns using basic indexing. Now, we’ll go a step further and learn how to select multiple rows and columns using both base R and the dplyr package. These techniques are essential when you want to focus on specific parts of a dataset or prepare your data for further analysis.

Selecting multiple columns (Base R)

  • Use the c() function to combine multiple column positions or names;

  • This allows you to select several columns at once;

  • The result is a smaller data frame with only the specified columns.

Example using column positions:

selected_data_base <- df[, c(1, 2, 3)]
head(selected_data_base)

Example using column names:

selected_data_base <- df[, c("name", "selling_price", "transmission")]
head(selected_data_base)
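The two approaches above can be compared on a small, self-contained example. This is a minimal sketch: the toy data frame and its values are invented for illustration (only the column names name, selling_price, and transmission come from the examples above), and it assumes df is a plain data.frame rather than a tibble. It also shows a common surprise: selecting a single column with df[, ...] returns a vector unless drop = FALSE is passed.

```r
# Hypothetical toy data; the values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B", "Car C"),
  selling_price = c(450000, 370000, 158000),
  transmission = c("Manual", "Manual", "Automatic"),
  year = c(2014, 2014, 2006)
)

# Select the first three columns by position
by_position <- df[, c(1, 2, 3)]

# Select the same columns by name
by_name <- df[, c("name", "selling_price", "transmission")]

# Both selections contain exactly the same data
same_result <- identical(by_position, by_name)

# A single column is simplified to a vector by default ...
one_col_vector <- df[, "name"]

# ... unless drop = FALSE is used to keep a one-column data frame
one_col_df <- df[, "name", drop = FALSE]
```

Selecting by name is usually the safer choice, because the code keeps working if the column order of the dataset changes.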

Indexing single values

  • You can access a specific value using row and column numbers;

  • This is helpful for checking or debugging individual data points.

df[1, 2]  # accesses the value in row 1, column 2
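As a small sketch of the same idea, the snippet below reads one cell in three equivalent ways. The toy data frame is hypothetical, and the example assumes a plain data.frame; with a tibble, df[1, 2] returns a one-cell tibble instead of a bare value.

```r
# Hypothetical toy data; the values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B"),
  selling_price = c(450000, 370000)
)

# Row 1, column 2, both given by position
value_by_position <- df[1, 2]

# The same cell, with the column given by name
value_by_name <- df[1, "selling_price"]

# Extract the whole column first, then take its first element
value_by_dollar <- df$selling_price[1]
```

Referring to the column by name makes it clearer which value is being inspected when the code is read later.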

Slicing rows

Sometimes you only want to work with the first few rows, or specific rows by position.

Select the first 5 rows

Using base R:

first_5_rows_base <- df[1:5, ]
head(first_5_rows_base)

Using dplyr:

library(dplyr)  # provides the %>% pipe and slice()

first_5_rows_dplyr <- df %>%
  slice(1:5)
head(first_5_rows_dplyr)
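The examples above take a contiguous range, but the same syntax also accepts any vector of row positions. The following is a minimal sketch with a hypothetical toy data frame (the values are invented); it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration
df <- data.frame(
  name = c("Car A", "Car B", "Car C", "Car D", "Car E", "Car F"),
  selling_price = c(450000, 370000, 158000, 225000, 130000, 440000)
)

# Base R: rows 1, 3, and 5, keeping all columns
# (the result keeps the original row names 1, 3, 5)
rows_base <- df[c(1, 3, 5), ]

# dplyr: the same rows with slice()
rows_dplyr <- df %>%
  slice(c(1, 3, 5))

# Base R shortcut for the first n rows
first_two <- head(df, 2)
```

In base R, the comma after the row positions matters: df[c(1, 3, 5), ] selects rows, while df[c(1, 3, 5)] without the comma would select columns instead.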
What does df[1:5, ] do?

