Multinomial Likelihood and the Softmax Loss Function
The Multinomial likelihood is central to modeling problems where each observation belongs to one of several discrete classes. In multi-class classification, each observation takes exactly one outcome among many possible categories. The Multinomial distribution with a single trial (often called the categorical distribution) generalizes the Bernoulli distribution to more than two classes, assigning a probability to each possible class label. The likelihood of a single observation, given predicted class probabilities, is simply the probability assigned to the actual observed class. For a dataset of independent observations, the total likelihood is the product of the predicted probabilities for the observed classes across all samples. For numerical stability and computational convenience, you typically work with the log-likelihood, which sums the log-probabilities of the observed classes. This log-likelihood forms the basis for training many multi-class classifiers, including logistic regression and neural networks.
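As a small illustration of this sum-of-log-probabilities idea, the sketch below computes the total likelihood and log-likelihood for a handful of observations, using made-up predicted probabilities and class labels (these numbers are purely illustrative, not from a trained model). The plotting example that follows then shows how the log-likelihood behaves as the predicted probability for the true class changes.

import numpy as np

# Toy predicted class probabilities for 4 independent observations (3 classes);
# each row sums to 1 -- illustrative values, not outputs of a real model
probs = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.30, 0.30, 0.40],
    [0.25, 0.50, 0.25],
])

# Observed class label for each observation
y = np.array([0, 1, 2, 1])

# Probability assigned to the observed class of each sample
p_observed = probs[np.arange(len(y)), y]

# Likelihood: product of those probabilities across all samples
likelihood = np.prod(p_observed)

# Log-likelihood: sum of the log-probabilities (numerically safer)
log_likelihood = np.sum(np.log(p_observed))

print(f"Likelihood:     {likelihood:.6f}")
print(f"Log-likelihood: {log_likelihood:.4f}")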
import numpy as np
import matplotlib.pyplot as plt

# Suppose there are 3 classes: 0, 1, 2
num_classes = 3

# Simulate predicted probabilities for class 0, varying from 0.01 to 0.98
p0 = np.linspace(0.01, 0.98, 100)

# The rest of the probability mass is split equally between class 1 and 2
p_rest = (1 - p0) / 2
probs = np.vstack([p0, p_rest, p_rest]).T

# Assume the true class is 0 for this example
true_class = 0

# Compute log-likelihood for each set of predicted probabilities
log_likelihood = np.log(probs[:, true_class])

plt.figure(figsize=(7, 4))
plt.plot(p0, log_likelihood, label="Log-Likelihood (True class 0)")
plt.xlabel("Predicted Probability for True Class (class 0)")
plt.ylabel("Log-Likelihood")
plt.title("Log-Likelihood vs. Predicted Probability for True Class")
plt.grid(True)
plt.legend()
plt.show()
The plot above shows how the log-likelihood changes as you vary the predicted probability for the true class, holding the probabilities for the other classes equal. The log-likelihood is highest when the model assigns a probability near 1 to the true class and decreases rapidly as the probability drops. This relationship underpins the softmax loss, also known as the categorical cross-entropy loss. In multi-class classification, the softmax function is used to convert raw model outputs into normalized class probabilities. The loss function then compares these probabilities to the actual observed class using the negative log-likelihood. Minimizing this loss is equivalent to maximizing the Multinomial likelihood of the observed data. This connection is fundamental to modern machine learning algorithms for multi-class problems, ensuring that models are directly optimized to assign high probability to the correct class labels.
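To make this connection concrete, here is a minimal NumPy sketch that applies the softmax function to raw model outputs and then computes the categorical cross-entropy, i.e. the average negative log-likelihood of the observed classes. The logits and labels are made-up values for illustration, and the softmax is implemented directly rather than taken from a specific library.

import numpy as np

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Toy raw model outputs (logits) for 4 observations and 3 classes
logits = np.array([
    [ 2.0, 0.5, -1.0],
    [ 0.1, 1.5,  0.3],
    [-0.5, 0.0,  2.2],
    [ 1.0, 1.0,  1.0],
])

# Observed class labels
y = np.array([0, 1, 2, 0])

# Convert raw outputs into normalized class probabilities
probs = softmax(logits)

# Categorical cross-entropy: mean negative log-probability of the true class
loss = -np.mean(np.log(probs[np.arange(len(y)), y]))

print("Softmax probabilities:")
print(probs)
print(f"Cross-entropy loss: {loss:.4f}")

Increasing the logit of the correct class for any row pushes its softmax probability toward 1 and drives the loss down, which is exactly the maximum-likelihood behavior described above.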