Learn Correlation Analysis | Basic Statistical Analysis
Data Analysis with R

Correlation Analysis

Correlation analysis is a statistical technique used to measure the strength and direction of a relationship between two numeric variables. It helps us understand how changes in one variable are associated with changes in another.

What is Correlation?

A correlation coefficient (usually represented as r) ranges between -1 and +1:

  • +1: perfect positive correlation;
  • 0: no linear correlation;
  • −1: perfect negative correlation.
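To make the three reference values concrete, here is a minimal sketch using small synthetic vectors (the numbers are invented for illustration and are not from the course dataset). It also shows that r = 0 only rules out a linear relationship: y_sym depends entirely on x, yet its correlation with x is zero.

```r
# Synthetic data, invented purely for illustration
x <- c(-2, -1, 0, 1, 2)

y_up   <- 2 * x + 3    # rises exactly linearly with x
y_down <- 10 - 4 * x   # falls exactly linearly with x
y_sym  <- x^2          # depends on x, but not linearly

r_up   <- cor(x, y_up)    # perfect positive correlation
r_down <- cor(x, y_down)  # perfect negative correlation
r_sym  <- cor(x, y_sym)   # no linear correlation

print(c(up = r_up, down = r_down, sym = r_sym))
```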

There are several correlation methods. Pearson correlation is the most commonly used for continuous numeric data, and it is the default method of R's cor() function.
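As a rough illustration of how the methods differ, the sketch below compares the three options accepted by the method argument of cor() on synthetic data (invented here, not taken from the lesson). Because y always increases with x but not along a straight line, the rank-based methods report a perfect association while Pearson does not.

```r
# Synthetic, strictly increasing but non-linear relationship
x <- 1:10
y <- x^3

pearson  <- cor(x, y)                       # method = "pearson" is the default
spearman <- cor(x, y, method = "spearman")  # rank-based
kendall  <- cor(x, y, method = "kendall")   # rank-based, counts concordant pairs

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```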

Correlation Between Two Variables

cor(df$selling_price, df$km_driven)  # Selling price vs kilometers driven
cor(df$mileage, df$max_power)        # Mileage vs power

Each call returns a single value between -1 and 1 that indicates the strength and direction of the linear relationship between the two columns.
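One thing to watch for with two-variable calls: if either vector contains a missing value, cor() returns NA by default. The sketch below uses two short hypothetical vectors (price and km are invented stand-ins, not the course columns) to show the default behaviour and the use = "complete.obs" option, which drops incomplete pairs before computing the coefficient.

```r
# Hypothetical vectors; the third price is missing
price <- c(100, 150, NA, 200, 250)
km    <- c(90, 70, 60, 40, 20)

r_default  <- cor(price, km)                        # NA: missing value propagates
r_complete <- cor(price, km, use = "complete.obs")  # uses only the 4 complete pairs

print(r_default)
print(r_complete)
```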

Correlation Matrix (Multiple Variables)

You can also examine relationships among several variables using a correlation matrix:

# Select only numeric columns
numeric_df <- df[, c("selling_price", "km_driven", "max_power", "mileage", "engine", "seats")]
# Compute correlation matrix
cor_matrix <- cor(numeric_df, use = "complete.obs")  # Ignores any rows with missing data
View(cor_matrix)

The matrix shows pairwise correlation values between all selected numeric variables. This helps in identifying which variables are strongly related.
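Reading a large matrix by eye can be tedious, so one possible approach is to find the strongest pair programmatically. The sketch below is an assumption-laden example: it uses R's built-in mtcars dataset as a stand-in for the course data frame, masks the diagonal (always 1) and the mirrored lower triangle, and then picks the pair with the largest absolute correlation.

```r
# mtcars ships with R; it stands in here for the course's df
cor_matrix <- cor(mtcars[, c("mpg", "hp", "wt", "disp")])

# Hide the diagonal and the duplicated lower triangle
upper <- cor_matrix
upper[lower.tri(upper, diag = TRUE)] <- NA

# Position of the largest absolute correlation among the remaining cells
idx <- which(abs(upper) == max(abs(upper), na.rm = TRUE), arr.ind = TRUE)
strongest <- c(rownames(upper)[idx[1, "row"]], colnames(upper)[idx[1, "col"]])

print(round(cor_matrix, 2))
print(strongest)
```

Taking the absolute value matters because a strong negative correlation is just as informative as a strong positive one.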

Summary

  • Use cor() to measure relationship strength and direction between variables;

  • Use a correlation matrix to analyze relationships between several numeric variables simultaneously;

  • Always clean and prepare your data before running correlation analysis.
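As a sketch of what the preparation step might look like, the example below builds a tiny hypothetical data frame (cars_df and its values are invented for illustration), converts a numeric column that was read in as text, and then selects the numeric columns automatically instead of listing them by name.

```r
# Hypothetical data frame; values are invented for illustration
cars_df <- data.frame(
  name          = c("A", "B", "C", "D"),
  selling_price = c(450000, 370000, 158000, 225000),
  km_driven     = c(145500, 120000, 140000, 127000),
  max_power     = c("74", "103.52", "78", "90"),  # numbers stored as text
  stringsAsFactors = FALSE
)

# Convert the text column to numeric so cor() can use it
cars_df$max_power <- as.numeric(cars_df$max_power)

# Keep only the numeric columns, then compute the matrix
numeric_cols <- sapply(cars_df, is.numeric)
numeric_df   <- cars_df[, numeric_cols]
cor_matrix   <- cor(numeric_df, use = "complete.obs")

print(round(cor_matrix, 2))
```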

A correlation coefficient of -0.9 indicates:

