Confidence Intervals for Population Parameters

In the previous chapters we considered how to estimate population parameters and how to check the quality of those estimates. But those were point estimates: we simply determined a single plausible value of the parameter from the data we have. There is another approach: we can construct an interval that, with a given probability, covers the true value of the parameter. This interval is called a confidence interval.
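A standard way to write this down is sketched below. The notation (sample $X_1, \dots, X_n$, parameter $\theta$, endpoints $L$ and $U$, confidence level $\gamma$) is an assumption made here for illustration and may differ from the course's own definition.

```latex
\[
P\bigl( L(X_1, \dots, X_n) \le \theta \le U(X_1, \dots, X_n) \bigr) = \gamma,
\qquad 0 < \gamma < 1 .
\]
```

Here $L$ and $U$ are functions of the sample only, so the endpoints are random while $\theta$ is a fixed unknown constant.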

The principle of constructing confidence intervals is similar to that of constructing point estimates. We again take a function of the sample. Then we use the distribution law of this function to build an interval. A rigorous mathematical explanation of this process is fairly involved, so we will not go into it in more detail.

Note

It's worth noting that there is another type of interval estimate for population parameters, called the credible interval, which is constructed using Bayes' theorem. The two kinds of interval are interpreted differently:

The confidence interval is essentially an interval with random endpoints that, with a certain probability, covers the true constant value of the parameter;

In contrast, the credible interval is a constant interval where the random value of the desired parameter falls with a certain probability.
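The "random endpoints" reading of a confidence interval can be checked by simulation. The sketch below repeatedly draws Gaussian samples with a fixed true mean, builds the Student-based interval for the mean described later in this chapter, and counts how often the interval covers that mean. The population parameters, sample size, and number of experiments are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical population parameters and experiment settings
true_mean, true_std = 5.0, 2.0
n, n_experiments, conf_level = 30, 2000, 0.95

# Two-sided Student quantile with n - 1 degrees of freedom
t_value = stats.t.ppf((1 + conf_level) / 2, n - 1)

# Draw all experiments at once: one row per experiment
samples = rng.normal(true_mean, true_std, size=(n_experiments, n))
means = samples.mean(axis=1)
std_errors = samples.std(axis=1, ddof=1) / np.sqrt(n)

# The endpoints differ from experiment to experiment; the true mean does not
lower = means - t_value * std_errors
upper = means + t_value * std_errors

covered = (lower <= true_mean) & (true_mean <= upper)
coverage = covered.mean()
print(f'Share of intervals covering the true mean: {coverage:.3f}')
```

The printed share should be close to the chosen confidence level, which is what "covers the true value with a certain probability" means in the frequentist sense.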

Confidence interval for Gaussian distribution expectation parameter

Let's look at how to build a confidence interval for the expectation (mean) of a Gaussian distribution. We will consider two situations: the variance is known, and the variance is unknown.

The first case is a confidence interval for the Gaussian expectation when the variance is known. To build this interval we use the sample and the PPF (percent point function, that is, the quantile function) of the Gaussian distribution.
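As a minimal sketch of the known-variance case, the snippet below builds the interval with `stats.norm.ppf`. The data are synthetic, and the value `sigma = 2.0` is an assumption chosen for illustration; neither comes from the course dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

sigma = 2.0          # assumed known population standard deviation
conf_level = 0.95

# Synthetic Gaussian sample (illustrative only)
data = rng.normal(loc=10.0, scale=sigma, size=50)

# Two-sided quantile of the standard Gaussian distribution
z_value = stats.norm.ppf((1 + conf_level) / 2)

mean = data.mean()
margin = z_value * sigma / np.sqrt(len(data))
lower_bound, upper_bound = mean - margin, mean + margin

print(f'{conf_level:.0%} confidence interval for mean value is: ({lower_bound:.2f}, {upper_bound:.2f})')
```

Note that the known `sigma` is used directly; the sample variance is never computed in this case.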

The second case is a confidence interval for the Gaussian expectation when the variance is unknown, so the adjusted sample variance is used in place of the known variance. To build this interval we use the PPF of Student's t-distribution with n - 1 degrees of freedom.
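Written out, the two intervals take the following standard form. The symbols are assumptions introduced here: $\bar{X}$ is the sample mean, $\sigma^2$ the known variance, $S^2$ the adjusted sample variance, $\gamma$ the confidence level, and $z_p$ and $t_{p,\,n-1}$ the PPF values of the standard Gaussian distribution and of Student's distribution with $n-1$ degrees of freedom.

```latex
\[
\text{known } \sigma^2:\qquad
\bar{X} - z_{\frac{1+\gamma}{2}} \frac{\sigma}{\sqrt{n}}
\;\le\; \mu \;\le\;
\bar{X} + z_{\frac{1+\gamma}{2}} \frac{\sigma}{\sqrt{n}}
\]
\[
\text{unknown } \sigma^2:\qquad
\bar{X} - t_{\frac{1+\gamma}{2},\, n-1} \frac{S}{\sqrt{n}}
\;\le\; \mu \;\le\;
\bar{X} + t_{\frac{1+\gamma}{2},\, n-1} \frac{S}{\sqrt{n}},
\qquad
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \bigl( X_i - \bar{X} \bigr)^2
\]
```

The quantile level $\frac{1+\gamma}{2}$ splits the remaining probability $1-\gamma$ equally between the two tails.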

Confidence interval with Python

Let's now look at how to build a confidence interval for the mean of Gaussian samples in Python. We will use several confidence levels and compare the resulting intervals.


import numpy as np
from scipy import stats
import pandas as pd

# Load the dataset
samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/gaussian_samples.csv', names=['Value'])
data = np.array(samples)

# Sample size and degrees of freedom
n = len(data)
df = n - 1

# The sample mean and adjusted sample variance do not depend on the
# confidence level, so compute them once
mean = np.mean(data)
adjusted_var = np.var(data, ddof=1)
std_error = np.sqrt(adjusted_var / n)

# Build confidence intervals with different confidence levels
for conf_level in [0.9, 0.95, 0.99]:
    # Two-sided t-value for the given confidence level and degrees of freedom
    t_value = stats.t.ppf((1 + conf_level) / 2, df)

    # Lower and upper bounds of the confidence interval
    lower_bound = mean - t_value * std_error
    upper_bound = mean + t_value * std_error

    # Print the result
    print(f'{conf_level:.0%} confidence interval for mean value is: ({lower_bound:.2f}, {upper_bound:.2f})')

We see that the higher the confidence level, the wider the interval we get. This is quite logical, since the wider the interval, the higher the probability that this interval covers the real value of the mean.
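The same kind of interval can also be obtained from SciPy directly, which gives a quick cross-check of the manual formula and of how the width grows. The sketch below uses a synthetic sample as a stand-in for the course dataset (an assumption, so it runs without network access), and passes the confidence level positionally to `stats.t.interval`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic Gaussian sample (stand-in for the dataset, illustrative only)
data = rng.normal(loc=0.0, scale=1.0, size=100)

n = len(data)
mean = data.mean()
std_error = stats.sem(data)   # standard error of the mean, ddof=1 by default

widths = []
for conf_level in [0.9, 0.95, 0.99]:
    # Interval of Student's distribution shifted to the mean and scaled by the standard error
    lower, upper = stats.t.interval(conf_level, n - 1, loc=mean, scale=std_error)
    widths.append(upper - lower)
    print(f'{conf_level:.0%}: ({lower:.3f}, {upper:.3f}), width = {upper - lower:.3f}')
```

Each width equals twice the t-value times the standard error, so it increases with the confidence level because the t-value does.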
