The Multinomial Distribution | Bernoulli and Multinomial Distributions
Probability Distributions for Machine Learning

The Multinomial Distribution

When you face a machine learning problem with more than two possible outcomes, such as classifying emails as "spam", "work", or "personal", you need a probability model that can handle multiple categories. The Multinomial distribution is designed for just this scenario: it describes the probabilities of counts for each category in a fixed number of independent trials, where each trial results in exactly one of several possible outcomes. In a classification context, the Multinomial distribution models the likelihood of observing a particular combination of class assignments across multiple samples, making it fundamental to multi-class classification problems.

Note
Definition

The Multinomial probability mass function (PMF) gives the probability of observing a specific count for each category in a sequence of $n$ independent trials, where each trial can result in one of $k$ categories with probabilities $p_1, p_2, \dots, p_k$ (where each $p_i \geq 0$ and $\sum_i p_i = 1$):

$$P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

where:

  • $n$: total number of trials;
  • $k$: number of possible outcome categories;
  • $x_i$: number of times category $i$ occurs ($\sum_i x_i = n$);
  • $p_i$: probability of category $i$ in a single trial ($\sum_i p_i = 1$).
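To make the formula concrete, here is a small standard-library sketch that evaluates the PMF directly from the definition above (the helper name multinomial_pmf and the example counts are our own choices, not part of any library):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Evaluate the Multinomial PMF for the given category counts."""
    n = sum(counts)
    # Multinomial coefficient n! / (x_1! x_2! ... x_k!)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)
    # Multiply by p_1^{x_1} p_2^{x_2} ... p_k^{x_k}
    return coef * prod(p ** x for p, x in zip(probs, counts))

# Probability of seeing counts (3, 1, 1) in n = 5 trials
# with category probabilities (0.5, 0.3, 0.2):
print(multinomial_pmf([3, 1, 1], [0.5, 0.3, 0.2]))  # 20 * 0.125 * 0.3 * 0.2 = 0.15
```

Working the example by hand: the multinomial coefficient is $5!/(3!\,1!\,1!) = 20$, and $0.5^3 \cdot 0.3 \cdot 0.2 = 0.0075$, giving $20 \times 0.0075 = 0.15$.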
import numpy as np
import matplotlib.pyplot as plt

# Suppose you have a 3-class classification problem (e.g., "cat", "dog", "rabbit")
categories = ["cat", "dog", "rabbit"]
probabilities = [0.5, 0.3, 0.2]  # Probability for each class
n_samples = 100  # Number of trials (e.g., 100 data points)

# Simulate one experiment: how many times does each class appear in 100 samples?
counts = np.random.multinomial(n_samples, probabilities)
print("Sampled counts:", dict(zip(categories, counts)))
# Repeat the simulation to see variability
n_experiments = 1000
all_counts = np.random.multinomial(n_samples, probabilities, size=n_experiments)

# Plot a histogram of counts for each class
fig, ax = plt.subplots()
for idx, label in enumerate(categories):
    ax.hist(all_counts[:, idx], bins=30, alpha=0.6, label=label)
ax.set_xlabel("Count in 100 samples")
ax.set_ylabel("Frequency")
ax.legend()
ax.set_title("Distribution of class counts over 1000 experiments")
plt.show()

When you run this simulation, you see that the number of times each class appears in 100 samples varies from experiment to experiment, but the average matches the underlying probabilities. This mirrors what happens in a multi-class classifier, such as a softmax classifier, where the predicted probabilities for each class sum to one and represent the expected frequencies of each category. The Multinomial distribution provides the mathematical foundation for modeling these outcomes and for evaluating how well your classifier's predictions align with observed data.
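You can check the "average matches the underlying probabilities" claim numerically: the expected count for class $i$ is $n p_i$ and its variance is $n p_i (1 - p_i)$. Below is a sketch using NumPy's Generator API (the seed 0 and the 10,000 repetitions are arbitrary choices for reproducibility):

```python
import numpy as np

probabilities = np.array([0.5, 0.3, 0.2])
n_samples = 100

# Repeat the 100-trial experiment many times
rng = np.random.default_rng(0)
all_counts = rng.multinomial(n_samples, probabilities, size=10_000)

# Theoretical moments of the Multinomial distribution
expected = n_samples * probabilities                      # n * p_i
variance = n_samples * probabilities * (1 - probabilities)  # n * p_i * (1 - p_i)

print("Empirical means:     ", all_counts.mean(axis=0))
print("Theoretical means:   ", expected)
print("Empirical variances: ", all_counts.var(axis=0))
print("Theoretical variances:", variance)
```

With enough repetitions, the empirical means land very close to 50, 30, and 20, and the empirical variances approach 25, 21, and 16, matching the formulas above.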


Which of the following statements about the Multinomial distribution and its application in multi-class classification are correct?

Select the correct answer


Section 3, Chapter 3
