Distinguishing Probability and Likelihood
Understanding the difference between probability and likelihood is essential for anyone working in machine learning. Although these terms are often used interchangeably in everyday conversation, they play distinct roles in modeling and inference. In machine learning, you frequently use probability to describe how likely an outcome is, given a fixed model or parameter. For example, you might ask: Given this coin is fair, what is the probability it lands heads up? Here, the model (the fairness of the coin) is fixed, and you are considering the chance of possible outcomes.
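To see this "fixed parameters, varying outcomes" direction in code, here is a minimal sketch (it assumes scipy is available, which the rest of this lesson does not otherwise require): the parameter θ = 0.5 is held fixed, and the probability of every possible number of heads in five tosses is computed.

```python
from scipy.stats import binom

# Fixed model: a fair coin, so theta = 0.5
theta = 0.5
n_tosses = 5

# Probability of each possible outcome, given the fixed parameter
for k in range(n_tosses + 1):
    print(f"P({k} heads | theta={theta}) = {binom.pmf(k, n_tosses, theta):.4f}")
```

Note that these probabilities sum to 1, as a probability distribution over outcomes must.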
In contrast, likelihood reverses this perspective. Likelihood measures how plausible a particular set of parameters is, given the observed data. In other words, you ask: Given that I observed three heads in five coin tosses, how likely is it that the coin has a certain probability of landing heads? Here, the data is fixed, and you are interested in varying the parameters of your model to see which best explain the observed outcomes.
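To see the reversed perspective, here is a small sketch for the example above: the data (3 heads in 5 tosses) is held fixed, while a few candidate values of θ (chosen purely for illustration) are compared.

```python
# Fixed data: 3 heads observed in 5 tosses; now vary the parameter
k, n = 3, 5

for theta in (0.3, 0.5, 0.6, 0.7):
    # Likelihood up to the binomial coefficient, which does not depend on theta
    L = theta**k * (1 - theta)**(n - k)
    print(f"L(theta={theta} | data) ∝ {L:.5f}")
```

The largest value occurs at θ = 0.6 = 3/5, the sample proportion, previewing the maximum likelihood idea discussed below.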
This distinction is crucial in machine learning because you often need to estimate parameters of a model based on observed data. Probability helps you make predictions, while likelihood helps you fit your model to data.
- Probability is the function P(data | parameters), describing the chance of observing the data given known parameters.
- Likelihood is the function L(parameters | data), expressing how plausible different parameter values are, given the observed data.
Mathematically, both are computed from the same expression; the difference is which quantity is held fixed and which is allowed to vary.
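Concretely, for the coin example with n tosses and k observed heads, both readings share the binomial formula:

$$
P(k \mid \theta) \;=\; \binom{n}{k}\,\theta^{k}(1-\theta)^{\,n-k} \;=\; L(\theta \mid k)
$$

Read as a function of k with θ fixed, it is a probability distribution that sums to 1; read as a function of θ with k fixed, it is a likelihood, which need not integrate to 1.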
```python
import numpy as np
import matplotlib.pyplot as plt

# Observed data: 7 successes (heads) out of 10 trials
n_trials = 10
n_successes = 7

# Range of possible probability values for the Bernoulli parameter (theta)
theta = np.linspace(0, 1, 100)

# Likelihood of theta, up to the constant binomial coefficient C(n, k),
# which does not depend on theta:
# L(theta) ∝ theta^k * (1 - theta)^(n - k)
likelihood = theta**n_successes * (1 - theta)**(n_trials - n_successes)

# Normalize for plotting (optional, for visual clarity)
likelihood /= np.max(likelihood)

plt.figure(figsize=(8, 4))
plt.plot(theta, likelihood, label="Likelihood of θ")
plt.xlabel("Bernoulli parameter θ (probability of heads)")
plt.ylabel("Likelihood (normalized)")
plt.title("Likelihood Function for 7 Heads in 10 Tosses")
plt.legend()
plt.grid(True)
plt.show()
```
The likelihood function you just visualized shows how the plausibility of different values for the parameter θ (the probability of heads) changes, given the observed data. In machine learning, this function is central to parameter estimation: you typically choose the parameter value that maximizes the likelihood, a process called maximum likelihood estimation (MLE). By examining the likelihood curve, you can see which parameter values best explain your data, guiding you to fit your models more effectively and make better predictions on new data.
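As a short sketch of MLE in this setting (using the same 7-heads-in-10 data as above): for a binomial model the maximizer has the closed form k/n, and a simple grid search over the log-likelihood confirms it.

```python
import numpy as np

n_trials, n_successes = 10, 7

# Analytical MLE for a Bernoulli/binomial parameter: the sample proportion k/n
theta_mle = n_successes / n_trials
print(f"Closed-form MLE: {theta_mle:.2f}")

# Numerical check: maximize the log-likelihood over a grid of candidate values
# (the grid avoids 0 and 1 so that log(0) never occurs)
theta = np.linspace(0.001, 0.999, 999)
log_likelihood = n_successes * np.log(theta) + (n_trials - n_successes) * np.log(1 - theta)
print(f"Grid-search MLE: {theta[np.argmax(log_likelihood)]:.2f}")
```

Working with the log-likelihood does not change the maximizer, but it avoids the numerical underflow that products of many small probabilities would cause on larger datasets.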