Method of Moments. Maximum Likelihood Estimation | Estimation of Population Parameters

Let's look in more detail at what the parameters of a general population are and how we can estimate them.

Method of moments

We use our samples and apply a specific function to them to estimate the parameter we're interested in. However, we can't just pick any function; we need to find the right one that gives us the most accurate estimate.

In mathematical statistics, there are two common methods for this. The first is called the method of moments, which we've discussed before. This method relies on the fact that certain characteristics of a random variable, like its mean or variance, are directly related to the parameters we want to estimate.
For instance, the Gaussian distribution is completely determined by its mean and variance. So, by calculating the mean and variance from our samples, we can estimate the parameters of the distribution.
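As a sketch of this idea in symbols (the notation $m_1$, $m_2$ for the sample moments is introduced here for illustration and does not come from the course text), matching the first two moments of a Gaussian to the sample moments gives:

```latex
m_1 = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
m_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2, \qquad
\hat{\mu} = m_1, \qquad
\hat{\sigma}^2 = m_2 - m_1^2
```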

Example

Assume that we have an exponential general population and want to estimate the lambda parameter of this distribution. We can do it as follows:

```python
import pandas as pd

samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/expon_samples.csv', names=['Value'])

# Calculate the mean value over the samples
estim_mean = samples['Value'].mean()

# The samples come from an exponential distribution with parameter lambda.
# The mean of an exponentially distributed variable equals 1/lambda,
# so the method-of-moments estimate of lambda is simply 1/estim_mean.
print('Method-of-moments estimate of lambda:', 1 / estim_mean)
```

If we need to estimate more than one parameter, the mean alone is not enough: we also need the variance or higher-order moments. Let's consider an example of estimating the parameters of the Gaussian distribution:

```python
import pandas as pd

samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/gaussian_samples.csv', names=['Value'])

# Estimate the mean and standard deviation using the method of moments
mu = samples['Value'].mean()
# ddof=0 gives the second central sample moment (pandas defaults to ddof=1)
sigma = samples['Value'].std(ddof=0)

print('Estimated mean:', mu)
print('Estimated standard deviation:', sigma)
```
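The same moment-matching idea carries over to other two-parameter families. The sketch below is an illustration only: it uses synthetic gamma-distributed data rather than a course dataset, and the names `true_shape`, `true_scale`, `shape_mm`, and `scale_mm` are made up for this example. It relies on the facts that a gamma variable has mean `shape * scale` and variance `shape * scale**2`.

```python
import numpy as np

# Synthetic gamma samples; true_shape and true_scale are illustrative values
rng = np.random.default_rng(42)
true_shape, true_scale = 2.0, 3.0
data = rng.gamma(shape=true_shape, scale=true_scale, size=100_000)

# Sample moments (ddof=0 gives the plain second central moment)
mean = data.mean()
var = data.var(ddof=0)

# Solve mean = shape * scale and var = shape * scale**2 for the parameters
shape_mm = mean**2 / var
scale_mm = var / mean

print('Estimated shape:', shape_mm)
print('Estimated scale:', scale_mm)
```

With a large sample, both estimates should land close to the values used to generate the data.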

Maximum likelihood estimation

The method of moments is simple to interpret; however, the estimates it produces do not always have the properties we want (we will talk about the properties of estimates in the following chapters). That is why we will consider another method: maximum likelihood estimation.
The maximum likelihood method is based on maximizing the likelihood function. This function is constructed as the joint density (or, for discrete data, the joint probability mass function) of the vector consisting of all our samples, viewed as a function of the unknown parameters. Let's look at the image below:

Since the samples come from the same general population independently, we can combine their distributions into one joint distribution by multiplying the densities of the individual samples. This gives us the likelihood function, which we then maximize to find the best parameters.
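In symbols, this construction can be sketched as follows, assuming independent samples $x_1, \dots, x_n$ with density (or probability mass function) $f(x; \theta)$. The notation is chosen for this sketch and is not taken from the course text:

```latex
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta), \qquad
\hat{\theta}_{\text{ML}} = \arg\max_{\theta} L(\theta)
```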

In simpler terms, we're trying to find the parameters that make our observed samples most likely to occur.

Working directly with the likelihood function can be complex, so it's often easier to use the negative log-likelihood. Taking the logarithm turns the product of probabilities into a sum, simplifying the calculations. Plus, maximizing the likelihood is the same as minimizing the negative log-likelihood.
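One practical reason for preferring the logarithm is numerical: a product of many small densities can underflow to zero in floating-point arithmetic, while the sum of their logarithms stays finite. The sketch below illustrates this on synthetic standard-normal data, which is an assumption of this example and not a course dataset.

```python
import numpy as np
from scipy.stats import norm

# Synthetic data (an assumption of this sketch, not the course dataset)
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=2000)

# Likelihood as a product of densities: every factor is below 0.4,
# so the product of 2000 factors underflows to 0.0 in float64
likelihood = np.prod(norm.pdf(data, loc=0.0, scale=1.0))

# Negative log-likelihood as a sum of log-densities stays finite
neg_log_likelihood = -np.sum(norm.logpdf(data, loc=0.0, scale=1.0))

print('Likelihood (product):', likelihood)
print('Negative log-likelihood (sum):', neg_log_likelihood)
```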

Example

Let's use maximum likelihood to estimate the parameters of a Gaussian distribution:

```python
import pandas as pd
from scipy.stats import norm

# Load the samples
samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/gaussian_samples.csv', names=['Value'])

# Estimate the parameters using maximum likelihood
mu_ml, sigma_ml = norm.fit(samples['Value'])

print('Maximum likelihood estimates:')
print('mu = ', mu_ml)
print('sigma = ', sigma_ml)
```

In the code above, we use the .fit() method of the norm object to get the maximum likelihood estimates of the parameters. You can apply this method to any continuous distribution available in scipy.stats.

Note

In some cases, the estimate using the method of moments and the maximum likelihood estimate may coincide.
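The Gaussian distribution is one such case: its maximum likelihood estimates are the sample mean and the standard deviation computed with `ddof=0`, which are exactly the method-of-moments estimates. A minimal check on synthetic data (the data and variable names here are assumptions of this sketch):

```python
import numpy as np
from scipy.stats import norm

# Synthetic Gaussian samples (illustrative values, not the course dataset)
rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=1000)

# Method of moments: sample mean and second central moment (ddof=0)
mu_mm = data.mean()
sigma_mm = data.std(ddof=0)

# Maximum likelihood via scipy
mu_ml, sigma_ml = norm.fit(data)

print('Means coincide:', np.isclose(mu_mm, mu_ml))
print('Standard deviations coincide:', np.isclose(sigma_mm, sigma_ml))
```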

Some distributions do not have a .fit() method; in scipy.stats this is the case for discrete distributions such as poisson. For them, we have to construct the likelihood function manually and run the optimization ourselves. Let's look at an example:

```python
import numpy as np
import pandas as pd
from scipy.stats import poisson
from scipy.optimize import minimize

samples = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/Advanced+Probability+course+media/pois_samples.csv', names=['Value'])
data = samples['Value'].to_numpy()

# Define the negative log-likelihood function for a Poisson distribution
def poisson_neg_log_likelihood(params, data):
    lam = params[0]
    # Negated sum of the logarithms of the Poisson PMF over all samples
    return -np.sum(poisson.logpmf(data, lam))

# Maximize the likelihood by minimizing the negative log-likelihood
initial_guess = [5]  # starting value for the lambda parameter
# The bound keeps lambda positive during the search
result = minimize(poisson_neg_log_likelihood, initial_guess, args=(data,), bounds=[(1e-9, None)])
estimate_lambda = result.x[0]

# Print the estimated value of lambda
print('Estimated value of lambda:', estimate_lambda)
```

In the code above, we manually built the negative log-likelihood function using the .logpmf() method, which computes the logarithm of the PMF at each sample point.
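For the Poisson distribution the maximum likelihood estimate of lambda also has a closed form, the sample mean, which gives a convenient sanity check for a numerical optimizer. The sketch below runs that check on synthetic data; the data, the search interval, and the use of `minimize_scalar` instead of `minimize` are choices made for this illustration.

```python
import numpy as np
from scipy.stats import poisson
from scipy.optimize import minimize_scalar

# Synthetic Poisson samples (illustrative, not the course dataset)
rng = np.random.default_rng(7)
data = rng.poisson(lam=4.0, size=1000)

def neg_log_likelihood(lam, data):
    # Negated sum of log-PMF values over all samples
    return -np.sum(poisson.logpmf(data, lam))

# One-dimensional bounded search over an assumed interval for lambda
result = minimize_scalar(
    neg_log_likelihood,
    bounds=(1e-6, 50.0),
    args=(data,),
    method='bounded',
)

lambda_numeric = result.x
lambda_closed_form = data.mean()

print('Numerical estimate:', lambda_numeric)
print('Sample mean:', lambda_closed_form)
```

The two printed values should agree up to the optimizer's tolerance.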
