Learn Sorting Data | Data Manipulation and Cleaning
Data Analysis with R

Sorting Data

Sorting is a fundamental operation in data analysis. It allows you to organize your dataset based on one or more variables, such as price, mileage, or year. This makes it easier to spot trends and outliers, or simply to view the data in a meaningful order.

In this chapter, we'll learn how to sort data using both base R and dplyr.

Sort by a single column (ascending order)

Using base R:

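  • Use the order() function to get the row positions that put a column's values in ascending order;

  • Use those positions as the row index of the data frame to return the dataset sorted in ascending order.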

df_sorted_price_base <- df[order(df$selling_price), ]
View(df_sorted_price_base)
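
To make the role of order() concrete, here is a minimal sketch on a small made-up data frame. The name cars_demo and its values are assumptions for illustration, not the course dataset. It shows that order() returns row positions rather than sorted values, and that indexing with those positions is what sorts the rows.

```r
# Toy data (hypothetical values, not the course dataset)
cars_demo <- data.frame(
  name = c("A", "B", "C", "D"),
  selling_price = c(450000, 120000, 300000, 80000),
  stringsAsFactors = FALSE
)

# order() returns the row positions that sort the column, not the values
idx <- order(cars_demo$selling_price)
idx
# 4 2 3 1

# Using those positions as the row index sorts the whole data frame
cars_sorted <- cars_demo[idx, ]
cars_sorted$name
# "D" "B" "C" "A"
```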

Using dplyr:

  • Use the arrange() function to sort a data frame;

  • The default order is ascending.

library(dplyr)  # provides arrange() and the %>% pipe

df_sorted_price_dplyr <- df %>%
  arrange(selling_price)
View(df_sorted_price_dplyr)
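
As a quick check that the two approaches agree, the sketch below sorts the same made-up data frame both ways and compares the resulting row order. It assumes the dplyr package is installed; cars_demo and its values are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical values, not the course dataset)
cars_demo <- data.frame(
  name = c("A", "B", "C", "D"),
  selling_price = c(450000, 120000, 300000, 80000),
  stringsAsFactors = FALSE
)

by_base  <- cars_demo[order(cars_demo$selling_price), ]
by_dplyr <- cars_demo %>% arrange(selling_price)

# Both approaches put the rows in the same order
identical(by_base$name, by_dplyr$name)
# TRUE
```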

Sort in descending order

Using base R:

Apply a negative sign (-) before a numeric column inside order() to reverse the order. The minus sign does not work on character columns.

df_sorted_price_desc <- df[order(-df$selling_price), ]
head(df_sorted_price_desc)
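
The minus sign relies on the column being numeric; negating a character column such as fuel raises an error. Below is a minimal sketch of two base R alternatives on made-up data (cars_demo and its values are assumptions): the decreasing = TRUE argument of order(), and xtfrm(), which converts a column to numbers that sort the same way.

```r
# Toy data (hypothetical values, not the course dataset)
cars_demo <- data.frame(
  fuel = c("Petrol", "Diesel", "CNG", "Diesel"),
  selling_price = c(450000, 120000, 300000, 80000),
  stringsAsFactors = FALSE
)

# decreasing = TRUE reverses the sort for any column type
fuel_desc <- cars_demo[order(cars_demo$fuel, decreasing = TRUE), ]
fuel_desc$fuel
# "Petrol" "Diesel" "Diesel" "CNG"

# xtfrm() maps the column to sortable numbers, so the minus sign works again
fuel_desc2 <- cars_demo[order(-xtfrm(cars_demo$fuel)), ]
identical(fuel_desc$fuel, fuel_desc2$fuel)
# TRUE
```

The xtfrm() form is mainly useful when a character column has to be reversed alongside other columns that stay ascending, since decreasing = TRUE applies to every sort key at once unless a vector is supplied with method = "radix".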

Using dplyr:

Use the desc() function within arrange() to sort in descending order.

sorted_price_desc_dplyr <- df %>%
  arrange(desc(selling_price))
head(sorted_price_desc_dplyr)

Sort by multiple columns

  • You can sort by more than one column to create a prioritized order;

  • For example, sort first by fuel type (alphabetically), then by selling price (high to low).

Using base R:

df_sorted <- df[order(df$fuel, -df$selling_price), ]
head(df_sorted)
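
For comparison, the same prioritized sort can be written with dplyr by listing the columns in arrange() in priority order. This is a sketch on made-up data, assuming dplyr is installed; cars_demo and df_sorted_dplyr are illustrative names.

```r
library(dplyr)

# Toy data (hypothetical values, not the course dataset)
cars_demo <- data.frame(
  fuel = c("Petrol", "Diesel", "Petrol", "Diesel"),
  selling_price = c(450000, 120000, 300000, 280000),
  stringsAsFactors = FALSE
)

# First by fuel (A to Z), then by selling price (high to low) within each fuel
df_sorted_dplyr <- cars_demo %>%
  arrange(fuel, desc(selling_price))

df_sorted_dplyr$fuel
# "Diesel" "Diesel" "Petrol" "Petrol"
df_sorted_dplyr$selling_price
# 280000 120000 450000 300000
```

Further columns can be appended to arrange() in the same way; each one only breaks ties left by the columns before it.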

Sort by year (newest to oldest)

  • Sorting by year is useful to prioritize newer vehicles;

  • Again, use descending order for this case.

Using base R:

sorted_df_base <- df[order(-df$year), ]
View(sorted_df_base)
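
A dplyr version of the newest-to-oldest sort might look like the sketch below. It assumes dplyr is installed and uses a made-up data frame; cars_demo, sorted_df_dplyr, and the values are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical values, not the course dataset)
cars_demo <- data.frame(
  name = c("A", "B", "C", "D"),
  year = c(2012, 2019, 2015, 2021),
  stringsAsFactors = FALSE
)

# desc() puts the most recent year first
sorted_df_dplyr <- cars_demo %>%
  arrange(desc(year))

sorted_df_dplyr$year
# 2021 2019 2015 2012
sorted_df_dplyr$name
# "D" "B" "C" "A"
```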