Learn Useful Properties of the Gaussian Distribution | Additional Statements From The Probability Theory

The Gaussian distribution (also called the normal distribution) is one of the most important distributions in probability theory and statistics. This chapter looks at several of its useful properties and shows why it is so widely applied in practice.

Physical meaning of the Gaussian distribution

The Gaussian distribution can describe a random variable that results from many different factors adding up.

For example, when weighing something, various factors like temperature, pressure, and measurement errors affect the result. Individually, these factors don't matter much, but together they have a significant impact. This is explained further in the chapter on the Central Limit Theorem.
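The idea that many small factors add up to something Gaussian can be checked with a short simulation. The sketch below is only an illustration under assumed numbers: the 50 factors, their uniform distribution on [-0.5, 0.5], and the variable names are hypothetical choices, not part of the weighing example itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: every measurement is disturbed by 50 small,
# independent factors, each uniformly distributed on [-0.5, 0.5].
n_factors = 50
n_measurements = 50_000
factors = rng.uniform(-0.5, 0.5, size=(n_measurements, n_factors))

# The total error of one measurement is the sum of its factors.
total_error = factors.sum(axis=1)

# A single U(-0.5, 0.5) factor has variance 1/12, so the sum of
# n_factors independent factors has variance n_factors / 12.
std_theory = np.sqrt(n_factors / 12)

# For a Gaussian variable, about 68% of values lie within one standard
# deviation of the mean; the summed error should behave similarly.
within_one_sigma = np.mean(np.abs(total_error) <= std_theory)

print(f"empirical std: {total_error.std():.3f}, theoretical std: {std_theory:.3f}")
print(f"share within one std: {within_one_sigma:.3f}")
```

None of the individual factors is Gaussian, yet the share of summed errors within one standard deviation should come out close to the Gaussian value of about 0.68.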

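From here on, Gaussian quantities are denoted with the standard notation: a scalar Gaussian variable with mean μ and variance σ² is written X ~ N(μ, σ²), and a Gaussian vector with mean vector μ and covariance matrix Σ is written X ~ N(μ, Σ).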

Linear transformations of Gaussian vectors

The Gaussian distribution is preserved under linear transformations: if we apply a linear transformation to a Gaussian vector, the output is also a Gaussian vector, but with different parameters. Specifically, if X is Gaussian with mean vector μ and covariance matrix Σ, then Y = AX + b is Gaussian with mean Aμ + b and covariance AΣAᵀ.
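As a rough numerical check of this property, the sketch below draws samples of a two-dimensional Gaussian vector, applies a transformation Y = AX + b, and compares the sample mean and covariance of Y with the values Aμ + b and AΣAᵀ that the theory predicts. The specific `mu`, `Sigma`, `A`, and `b` are arbitrary values chosen for illustration. The check confirms the new parameters only; it does not by itself prove that Y is Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters of the original Gaussian vector X.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Illustrative linear transformation Y = A X + b.
A = np.array([[1.0, 2.0],
              [0.5, -1.0]])
b = np.array([3.0, 0.0])

# Each row of X is one sample, so the transformation is applied row-wise.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T + b

# Parameters predicted by the theory.
mean_theory = A @ mu + b
cov_theory = A @ Sigma @ A.T

# Parameters estimated from the transformed samples.
mean_empirical = Y.mean(axis=0)
cov_empirical = np.cov(Y, rowvar=False)

print("theoretical mean:", mean_theory)
print("empirical mean:  ", mean_empirical.round(3))
print("theoretical covariance:\n", cov_theory)
print("empirical covariance:\n", cov_empirical.round(3))
```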

Uncorrelated Gaussian variables are independent

We know that correlation captures only linear dependence between variables, so variables can be dependent yet uncorrelated. For jointly Gaussian variables, however, zero correlation does imply independence, which is a very useful property of the Gaussian distribution. Joint Gaussianity is essential here: two variables that are each Gaussian on their own, but not jointly Gaussian, can still be uncorrelated and dependent.
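The following sketch illustrates the difference with two simulated pairs. The construction is an assumed toy example: `u = x + z` and `v = x - z` form a jointly Gaussian pair with zero covariance, while `y = sign * x` (a random sign flip) is Gaussian on its own but not jointly Gaussian with `x`. Correlation between squared values is used here as a simple, informal probe for dependence that ordinary correlation misses.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.standard_normal(n)
z = rng.standard_normal(n)

# Pair 1: jointly Gaussian and uncorrelated.
# Cov(u, v) = Var(x) - Var(z) = 0, so u and v are independent.
u = x + z
v = x - z

# Pair 2: y is N(0, 1) on its own, but (x, y) is not jointly Gaussian.
# x and y are uncorrelated, yet |y| always equals |x|.
sign = rng.choice([-1.0, 1.0], size=n)
y = sign * x


def corr(a, b):
    """Pearson correlation coefficient of two 1-D samples."""
    return np.corrcoef(a, b)[0, 1]


corr_uv = corr(u, v)
corr_uv_sq = corr(u**2, v**2)   # near 0: no hidden dependence
corr_xy = corr(x, y)
corr_xy_sq = corr(x**2, y**2)   # equals 1: strong hidden dependence

print(f"jointly Gaussian pair: corr = {corr_uv:.3f}, corr of squares = {corr_uv_sq:.3f}")
print(f"sign-flip pair:        corr = {corr_xy:.3f}, corr of squares = {corr_xy_sq:.3f}")
```

Both pairs show a correlation close to zero, but only the sign-flip pair has perfectly correlated squares, which reveals that its components are dependent.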

3-sigma rule

The 3-sigma rule, also known as the empirical rule or the 68-95-99.7 rule, is a statistical guideline that states that for a normal distribution:

Approximately 68% of the data falls within one standard deviation (σ) of the mean (μ);
Approximately 95% of the data falls within two standard deviations (2σ) of the mean (μ);
Approximately 99.7% of the data falls within three standard deviations (3σ) of the mean (μ).

This rule can be very useful for detecting outliers in data that follows a Gaussian distribution.
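The percentages in the rule can be reproduced from the Gaussian CDF, and the same 3σ threshold gives a simple outlier check. The sketch below assumes a hypothetical data set: 1,000 values drawn from a Gaussian with mean 10 and standard deviation 2, with three anomalous values injected by hand. It is an illustration, not a general-purpose outlier detector.

```python
import numpy as np
from scipy.stats import norm

# Probability that a Gaussian value lies within k standard deviations
# of the mean, computed from the standard normal CDF.
coverages = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}
for k, p in coverages.items():
    print(f"within {k} sigma: {p:.4f}")

# Hypothetical data: Gaussian measurements plus three injected anomalies.
rng = np.random.default_rng(3)
data = rng.normal(loc=10.0, scale=2.0, size=1_000)
data[:3] = [25.0, -6.0, 30.0]

# Estimate the mean and standard deviation from the data itself.
mu_hat = data.mean()
sigma_hat = data.std(ddof=1)

# Flag every value that lies more than 3 standard deviations from the mean.
is_outlier = np.abs(data - mu_hat) > 3 * sigma_hat
outliers = data[is_outlier]

print(f"flagged {is_outlier.sum()} of {data.size} values as outliers")
```

Because the mean and standard deviation are estimated from the same data, extreme values inflate `sigma_hat` slightly; with only a few anomalies in a large sample, as assumed here, the effect is small.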


import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# Evaluate the normal PDF on a grid spanning 4 standard deviations around the mean
mu = 0   # mean
sigma = 1   # standard deviation
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 100)
y = norm.pdf(x, mu, sigma)

# Plot the PDF of the normal distribution
plt.plot(x, y, label='PDF')

# Shade the area within 1, 2, and 3 standard deviations of the mean
plt.fill_between(x, 0, y, where=(x >= mu-sigma) & (x <= mu+sigma), alpha=0.3, label='68%')
plt.fill_between(x, 0, y, where=(x >= mu-2*sigma) & (x <= mu+2*sigma), alpha=0.3, label='95%')
plt.fill_between(x, 0, y, where=(x >= mu-3*sigma) & (x <= mu+3*sigma), alpha=0.3, label='99.7%')

# Add a legend and labels
plt.legend()
plt.xlabel('X')
plt.ylabel('PDF')
plt.title('3-Sigma Rule for a Gaussian Distribution')

# Show the plot
plt.show()
