Descriptive Statistics
Understanding your data begins with descriptive statistics, which summarize the distribution, central tendency, and spread of variables. This chapter walks through basic statistical calculations and grouped summaries using base R and dplyr.
Basic Descriptive Statistics (Base R)
The most common statistical measures are:
- Mean: average value;
- Median: middle value;
- Min / Max: smallest and largest values.
mean(df$max_power, na.rm = TRUE) # Average max power
median(df$selling_price, na.rm = TRUE) # Median selling price
min(df$mileage, na.rm = TRUE) # Minimum mileage
max(df$mileage, na.rm = TRUE) # Maximum mileage
summary(df) # Quick summary of every column (min, quartiles, mean, max for numeric ones)
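The calls above assume a data frame named df with the columns max_power, selling_price, and mileage. As a minimal self-contained sketch, the block below builds a small made-up data frame (all values are invented for illustration) and also computes two measures of spread, sd() and quantile(), which the introduction mentions but the list above does not:

```r
# Hypothetical sample data; the values are invented for illustration only
df <- data.frame(
  max_power     = c(74, 103.5, 78, NA, 88.2),
  selling_price = c(450000, 370000, 158000, 225000, 130000),
  mileage       = c(23.4, 21.14, 17.7, 23.0, 16.1)
)

# Central tendency
avg_power    <- mean(df$max_power, na.rm = TRUE)        # the NA is dropped before averaging
median_price <- median(df$selling_price, na.rm = TRUE)

# Spread
mileage_rng <- range(df$mileage, na.rm = TRUE)           # c(min, max) in a single call
sd_mileage  <- sd(df$mileage, na.rm = TRUE)              # sample standard deviation
mileage_q   <- quantile(df$mileage, probs = c(0.25, 0.75), na.rm = TRUE)

print(avg_power)
print(median_price)
print(mileage_rng)
```

Without na.rm = TRUE, mean(df$max_power) would return NA here, because one value in the column is missing.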
Descriptive Statistics using dplyr
With dplyr, summarise() computes several statistics in a single pipeline, which keeps the code readable.
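library(dplyr) # Provides summarise() and the %>% pipe

df %>%
  summarise(
    avg_power = mean(max_power, na.rm = TRUE), # Average max power
    sd_power = sd(max_power, na.rm = TRUE), # Standard deviation of max power
    median_power = median(max_power, na.rm = TRUE) # Median max power
  )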
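The grouped summaries mentioned in the introduction can be sketched by adding group_by() before summarise(). The example below is only an illustration: the fuel column and every value in the data frame are assumptions, not part of the original dataset description.

```r
library(dplyr)

# Hypothetical sample data; the fuel column and all values are assumptions
df <- data.frame(
  fuel      = c("Petrol", "Diesel", "Petrol", "Diesel", "Petrol"),
  max_power = c(74, 103.5, 78, NA, 88.2)
)

by_fuel <- df %>%
  group_by(fuel) %>%                                   # one group per fuel type
  summarise(
    n_cars       = n(),                                # rows per group, NAs included
    avg_power    = mean(max_power, na.rm = TRUE),
    median_power = median(max_power, na.rm = TRUE),
    .groups      = "drop"                              # return an ungrouped result
  )

print(by_fuel)
```

The result has one row per group, so each statistic can be compared across fuel types directly.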