Learn Understanding Central Tendency & Spread | Probability & Statistics
Mathematics for Data Science

Understanding Central Tendency & Spread

Understanding how data behaves is a crucial part of any data analysis. Whether you’re working with marketing numbers, medical stats, or machine learning models, being able to describe the average behavior and the spread of your data is essential.

Mean (Average)

Definition:
The mean is the sum of all values divided by the number of values. It represents the “central” or “typical” value in your dataset.

Formula:

$$\text{Mean} = \frac{\sum x_i}{n}$$

Example:
If your website had 100, 120, and 110 visitors over three days:

$$\frac{100 + 120 + 110}{3} = 110$$

Interpretation:
On average, the site received 110 visitors per day.
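
The same calculation can be sketched in a few lines of Python. The variable names `visitors` and `mean` are illustrative choices, not part of the lesson.

```python
# Daily visitor counts from the example above.
visitors = [100, 120, 110]

# Mean: sum of all values divided by the number of values.
mean = sum(visitors) / len(visitors)

print(mean)  # 110.0
```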


Variance

Definition:
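Variance is the average of the squared distances between each value and the mean, written $\mu$ in the formula below. It gives a sense of how “spread out” the data is: the larger the variance, the more the values scatter around the mean.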

Formula:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$$

Example (using the previous data):

  • Mean = 110
  • $(100 - 110)^2 = 100$
  • $(120 - 110)^2 = 100$
  • $(110 - 110)^2 = 0$

Sum = 200

$$\text{Variance} = \frac{200}{3} \approx 66.67$$

Interpretation:
The average squared distance from the mean is about 66.67.
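
As a rough sketch, the variance steps above can be reproduced in Python. The variable names are illustrative. The cross-check uses `statistics.pvariance` from the standard library, which divides by n like the formula in this lesson, whereas `statistics.variance` divides by n - 1 and would give a different result.

```python
import statistics

# Daily visitor counts from the example above.
visitors = [100, 120, 110]
mean = sum(visitors) / len(visitors)

# Squared distance of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in visitors]

# Population variance: average of the squared deviations (divide by n).
population_variance = sum(squared_deviations) / len(visitors)

print(squared_deviations)             # [100.0, 100.0, 0.0]
print(round(population_variance, 2))  # 66.67

# Cross-check against the standard library's population variance.
library_variance = statistics.pvariance(visitors)
```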

Standard Deviation

Definition:
Standard deviation is the square root of the variance. It brings the spread back to the original units of the data.

Formula:

$$\sigma = \sqrt{\sigma^2}$$

Example:
If variance is 66.67:

$$\sigma = \sqrt{66.67} \approx 8.16$$

Interpretation:
A typical day's visitor count deviates from the mean by roughly 8.16 visitors.
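
A minimal Python sketch of this step, assuming the population variance of 200/3 computed earlier (variable names are illustrative):

```python
import math

# Population variance from the three-day example (200 / 3, about 66.67).
variance = 200 / 3

# Standard deviation: square root of the variance, back in visitors.
std_dev = math.sqrt(variance)

print(round(std_dev, 2))  # 8.16
```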

Real-World Problem: Website Traffic Analysis

Problem:
A data scientist records the number of website visitors over 5 days:

120, 150, 130, 170, 140

Step 1 — Mean:

$$\frac{120 + 150 + 130 + 170 + 140}{5} = 142$$

Step 2 — Variance:

  • $(120 - 142)^2 = 484$
  • $(150 - 142)^2 = 64$
  • $(130 - 142)^2 = 144$
  • $(170 - 142)^2 = 784$
  • $(140 - 142)^2 = 4$

$$\text{Variance} = \frac{484 + 64 + 144 + 784 + 4}{5} = \frac{1480}{5} = 296$$

Step 3 — Standard Deviation:

$$\sigma = \sqrt{296} \approx 17.2$$

Conclusion:

  • Mean = 142 visitors per day
  • Variance = 296
  • Standard Deviation ≈ 17.2 visitors

Daily traffic typically differs from the 142-visitor average by about 17.2 visitors.
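
The three steps can also be checked with Python's standard `statistics` module. This is one possible sketch rather than the lesson's own method. It uses the population functions `pvariance` and `pstdev` because the worked example divides by n.

```python
import statistics

# Visitors recorded over the five days in the problem.
visitors = [120, 150, 130, 170, 140]

mean = statistics.mean(visitors)           # Step 1: mean
variance = statistics.pvariance(visitors)  # Step 2: population variance
std_dev = statistics.pstdev(visitors)      # Step 3: population standard deviation

print(f"{mean:.1f} {variance:.1f} {std_dev:.1f}")  # 142.0 296.0 17.2
```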

Quiz: Test Your Knowledge

1. What does the mean represent in a dataset?

2. Which formula correctly represents the mean?

3. What does a high variance indicate?

4. Which unit does variance use?
A) Same as data
B) No unit
C) Squared units of data ✅
D) Logarithmic scale

5. What is the relationship between variance and standard deviation?

6. In the dataset [4, 8, 12], what is the mean?
A) 6
B) 8 ✅
C) 12
D) 10

7. Which formula represents variance?
A) —
B) $\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$ ✅
C) —
D) —

8. What is the purpose of squaring the differences in variance?

9. If the variance is 25, what is the standard deviation?
A) 5 ✅
B) 25
C) 2.5
D) 125

10. Why is standard deviation often preferred over variance in interpretation?
A) It’s easier to compute
B) It’s in the original units ✅
C) It gives smaller numbers
D) It avoids using the mean
