Sorting Data
Sorting is a fundamental operation in data analysis. It allows you to organize your dataset based on one or more variables—such as price, mileage, or year. This makes it easier to identify trends, outliers, or simply view the data in a meaningful order.
In this chapter, we’ll learn how to sort data using both base R and dplyr.
Sort by a single column (ascending order)
Using base R:
- Use the order() function to compute the row order for a column, then index the data frame with the result;
- This returns the dataset in ascending order.
df_sorted_price_base <- df[order(df$selling_price), ]
View(df_sorted_price_base)
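To make the role of order() more concrete, here is a minimal sketch on a small made-up data frame. The name toy_cars and its values are assumptions for illustration only; they are not the chapter's dataset.

```r
# Hypothetical toy data; the column name mirrors the chapter's selling_price
toy_cars <- data.frame(
  name = c("A", "B", "C"),
  selling_price = c(450000, 120000, 300000),
  stringsAsFactors = FALSE
)

# order() returns the row positions that put the column in ascending order
idx <- order(toy_cars$selling_price)
idx  # 2 3 1

# Indexing the rows with those positions yields the sorted data frame
toy_sorted <- toy_cars[idx, ]
toy_sorted$name  # "B" "C" "A"
```

The trailing comma in toy_cars[idx, ] matters: it selects rows and keeps every column.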
Using dplyr:
- Use the arrange() function to sort a data frame;
- The default order is ascending.
library(dplyr)  # provides arrange() and the %>% pipe
df_sorted_price_dplyr <- df %>%
  arrange(selling_price)
View(df_sorted_price_dplyr)
Sort in descending order
Using base R:
Apply a negative sign (-) before a numeric column inside order() to reverse the order.
df_sorted_price_desc <- df[order(-df$selling_price), ]
head(df_sorted_price_desc)
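Negation works here because selling_price is numeric. Applying a minus sign to a character column raises an error, so a text column needs a different approach. The sketch below shows two base R alternatives on a made-up data frame (toy_cars and its values are assumptions for illustration):

```r
# Hypothetical toy data with a character column
toy_cars <- data.frame(
  fuel = c("Diesel", "Petrol", "CNG"),
  selling_price = c(450000, 120000, 300000),
  stringsAsFactors = FALSE
)

# Option 1: reverse the whole ordering with decreasing = TRUE
by_fuel_desc <- toy_cars[order(toy_cars$fuel, decreasing = TRUE), ]

# Option 2: xtfrm() maps values to sortable numeric codes, which can be negated
by_fuel_desc2 <- toy_cars[order(-xtfrm(toy_cars$fuel)), ]

by_fuel_desc$fuel  # "Petrol" "Diesel" "CNG"
```

Option 2 is the more flexible of the two when sorting by several columns, since each column can be negated independently.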
Using dplyr:
Use the desc() function within arrange() to sort in descending order.
sorted_price_desc_dplyr <- df %>%
  arrange(desc(selling_price))
head(sorted_price_desc_dplyr)
Sort by multiple columns
- You can sort by more than one column to create a prioritized order;
- For example, sort first by fuel type (alphabetically), then by selling price (high to low).
Using base R:
df_sorted <- df[order(df$fuel, -df$selling_price), ]
head(df_sorted)
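The same prioritized sort can also be written with dplyr. The sketch below uses a made-up data frame so that it runs on its own; toy_cars and its values are assumptions, not the chapter's dataset.

```r
library(dplyr)

# Hypothetical toy data with two fuel types
toy_cars <- data.frame(
  fuel = c("Petrol", "Diesel", "Petrol", "Diesel"),
  selling_price = c(120000, 450000, 300000, 200000),
  stringsAsFactors = FALSE
)

# Sort by fuel (A to Z), then by selling_price (high to low) within each fuel type
toy_sorted <- toy_cars %>%
  arrange(fuel, desc(selling_price))

toy_sorted$selling_price  # 450000 200000 300000 120000
```

Columns are listed in priority order: later columns only break ties left by the earlier ones.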
Sort by year (newest to oldest)
- Sorting by year is useful to prioritize newer vehicles;
- Again, use descending order for this case.
Using base R:
sorted_df_base <- df[order(-df$year), ]
View(sorted_df_base)
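For comparison, a dplyr version of the newest-to-oldest sort might look like the sketch below. It uses a made-up data frame (toy_cars and its values are assumptions for illustration) so the result can be checked by eye.

```r
library(dplyr)

# Hypothetical toy data with a numeric year column
toy_cars <- data.frame(
  name = c("A", "B", "C"),
  year = c(2014, 2019, 2017),
  stringsAsFactors = FALSE
)

# desc() reverses the sort, so the newest model year comes first
toy_newest_first <- toy_cars %>%
  arrange(desc(year))

toy_newest_first$name  # "B" "C" "A"
```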