Learn Correlation Analysis | Basic Statistical Analysis
Data Analysis with R

Correlation Analysis

Correlation analysis is a statistical technique used to measure the strength and direction of a relationship between two numeric variables. It helps us understand how changes in one variable are associated with changes in another.

What is Correlation?

A correlation coefficient (usually denoted r) ranges between -1 and +1:

  • +1: perfect positive correlation;
  • 0: no linear correlation;
  • -1: perfect negative correlation.
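
The three reference values can be reproduced with a few synthetic vectors. This is a minimal sketch: the vectors below are made up for illustration and are not part of the lesson's dataset. The last case also shows that a coefficient of 0 only rules out a linear relationship, since y_curve is fully determined by x.

```r
# Synthetic data, invented purely for illustration
x <- c(-2, -1, 0, 1, 2)

y_up    <- 2 * x + 3      # exact increasing linear function of x
y_down  <- -0.5 * x + 7   # exact decreasing linear function of x
y_curve <- x^2            # symmetric curve: depends on x, but not linearly

r_pos  <- cor(x, y_up)     # perfect positive correlation
r_neg  <- cor(x, y_down)   # perfect negative correlation
r_zero <- cor(x, y_curve)  # no linear correlation

print(c(positive = r_pos, negative = r_neg, none = r_zero))
```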

R's cor() function supports several correlation methods (Pearson, Spearman, and Kendall). Pearson correlation, the default, is the most commonly used for continuous numeric data.
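
As a rough illustration of how the methods differ, the sketch below passes each option to the method argument of cor(). The data are synthetic (an assumption made for this example): y always rises with x, but not along a straight line, so the rank-based methods report a perfect association while Pearson does not.

```r
# Synthetic, strictly increasing but non-linear relationship
x <- c(1, 2, 3, 4, 5)
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # linear association (the default)
spearman <- cor(x, y, method = "spearman")  # rank-based, monotonic association
kendall  <- cor(x, y, method = "kendall")   # rank-based, concordant pairs

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Pearson stays below 1 here because the points do not fall on a straight line, whereas Spearman and Kendall only look at the ordering of the values.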

Correlation Between Two Variables

cor(df$selling_price, df$km_driven)  # Selling price vs kilometers driven
cor(df$mileage, df$max_power)        # Mileage vs power

Each call to cor() returns a single value between -1 and 1: the sign gives the direction of the relationship, and the magnitude gives its strength.
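
One practical caveat is that cor() returns NA by default when either vector contains a missing value. The sketch below uses two short made-up vectors (hypothetical values, not taken from the lesson's df) to show the default behaviour and the use = "complete.obs" option, which keeps only the pairs where both values are present.

```r
# Hypothetical values with one missing entry in each vector
price <- c(450, 370, 158, 225, NA, 130)
km    <- c(145, 120, 140, 127, 120, NA)

r_default  <- cor(price, km)                        # NA: missing values propagate
r_complete <- cor(price, km, use = "complete.obs")  # uses the 4 complete pairs only

print(r_default)
print(r_complete)
```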

Correlation Matrix (Multiple Variables)

You can also examine relationships among several variables using a correlation matrix:

# Select only numeric columns
numeric_df <- df[, c("selling_price", "km_driven", "max_power", "mileage", "engine", "seats")]
# Compute correlation matrix
cor_matrix <- cor(numeric_df, use = "complete.obs")  # Ignores any rows with missing data
View(cor_matrix)

The matrix shows pairwise correlation values between all selected numeric variables. This helps in identifying which variables are strongly related.
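
One way to pull the strongly related pairs out of a correlation matrix is sketched below. Because the lesson's df is not available here, the example assumes R's built-in mtcars dataset as a stand-in, and the 0.8 cutoff is an arbitrary illustrative threshold rather than a standard rule.

```r
# mtcars ships with base R; it stands in for the lesson's df in this sketch
cor_matrix <- cor(mtcars[, c("mpg", "hp", "wt", "disp")], use = "complete.obs")

# Blank the diagonal and lower triangle so each pair appears only once
upper <- cor_matrix
upper[lower.tri(upper, diag = TRUE)] <- NA

# Reshape the matrix into one row per variable pair
pair_df <- as.data.frame(as.table(upper), stringsAsFactors = FALSE)
names(pair_df) <- c("var1", "var2", "r")
pair_df <- pair_df[!is.na(pair_df$r), ]

# Keep pairs whose absolute correlation meets the (arbitrary) cutoff
strong <- pair_df[abs(pair_df$r) >= 0.8, ]
strong <- strong[order(-abs(strong$r)), ]

print(round(cor_matrix, 2))
print(strong)
```

Printing the rounded matrix works in any R session, while View() opens a spreadsheet-style viewer and is mainly useful in interactive environments such as RStudio.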

Summary

  • Use cor() to measure relationship strength and direction between variables;

  • Use a correlation matrix to analyze relationships between several numeric variables simultaneously;

  • Always clean and prepare your data before running correlation analysis.
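
What "clean and prepare" means depends on the dataset. As a minimal sketch, the example below assumes a hypothetical data frame named raw in which max_power was read as text with a unit suffix and one entry is empty. The values, the suffix, and the helper column max_power_num are all invented for illustration.

```r
# Hypothetical raw data: max_power stored as text, one entry empty
raw <- data.frame(
  selling_price = c(450000, 370000, 158000, 225000, 130000),
  max_power     = c("74 bhp", "103.52 bhp", "78 bhp", "", "88.2 bhp"),
  stringsAsFactors = FALSE
)

# Strip everything except digits and the decimal point, then convert to numeric
# (the empty string becomes NA)
raw$max_power_num <- as.numeric(gsub("[^0-9.]", "", raw$max_power))

# Keep only rows where both variables are present
clean <- raw[complete.cases(raw[, c("selling_price", "max_power_num")]), ]

r <- cor(clean$selling_price, clean$max_power_num)
print(r)
```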

