The Bernoulli Distribution
The Bernoulli distribution is one of the simplest yet most fundamental probability distributions in machine learning. It models binary events—situations where there are only two possible outcomes, such as success/failure, yes/no, or 1/0. In the context of machine learning, the Bernoulli distribution is crucial for binary classification tasks, where you need to predict whether an instance belongs to one of two classes. For example, determining whether an email is spam or not spam, or whether a tumor is malignant or benign, can be naturally represented by a Bernoulli random variable.
A Bernoulli random variable takes the value 1 with probability p (the probability of "success") and 0 with probability 1−p (the probability of "failure"). The probability mass function is given by:
P(X = x) = p^x (1 − p)^(1 − x), where x ∈ {0, 1}.

This means that if you know the probability of success, you can describe the entire distribution. The Bernoulli distribution forms the building block for more complex models, such as the binomial and multinomial distributions, and is at the heart of models like logistic regression.
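As a quick sanity check, the PMF above can be evaluated directly in Python. This is a minimal sketch (the helper `bernoulli_pmf` is illustrative, not part of any library):

```python
def bernoulli_pmf(x, p):
    """Bernoulli PMF: P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)

p = 0.3
print(bernoulli_pmf(1, p))  # probability of success: 0.3
print(bernoulli_pmf(0, p))  # probability of failure: 1 - 0.3 = 0.7
```

Plugging in x = 1 returns p itself, and x = 0 returns 1 − p, so the two probabilities always sum to 1.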
import numpy as np
import matplotlib.pyplot as plt

# Define different probabilities for the Bernoulli distribution
probabilities = [0.2, 0.5, 0.8]
num_samples = 1000

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

for ax, p in zip(axes, probabilities):
    samples = np.random.binomial(n=1, p=p, size=num_samples)
    counts = np.bincount(samples, minlength=2)
    ax.bar([0, 1], counts, tick_label=['0', '1'], color=['#1f77b4', '#ff7f0e'])
    ax.set_title(f'p = {p}')
    ax.set_xlabel('Outcome')
    ax.set_ylabel('Count')
    ax.set_ylim(0, num_samples)

plt.tight_layout()
plt.show()
When you simulate samples from a Bernoulli distribution with different probability parameters, you observe that the proportion of 1s and 0s shifts according to the value of p. For example, with p=0.2, you expect around 20% of the outcomes to be 1 (success) and 80% to be 0 (failure). With p=0.5, the outcomes are roughly balanced, and with p=0.8, most outcomes are 1. This directly illustrates how the Bernoulli distribution models the likelihood of a binary event, and how adjusting the probability parameter allows you to fit the distribution to the characteristics of your data. In machine learning, this property enables you to model the probability that a given input belongs to the positive class, which is essential for tasks like logistic regression.
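The reverse direction also holds: given binary data, the sample mean recovers the parameter, since the maximum-likelihood estimate of p for a Bernoulli distribution is simply the fraction of 1s. A small sketch (the seed and sample size are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
p_true = 0.8

# Draw 10,000 Bernoulli(0.8) samples; n=1 makes binomial equivalent to Bernoulli
samples = rng.binomial(n=1, p=p_true, size=10_000)

# The sample mean (fraction of 1s) is the maximum-likelihood estimate of p
p_hat = samples.mean()
print(f"estimated p = {p_hat:.3f}")
```

With 10,000 samples the estimate lands very close to 0.8, and it tightens further as the sample size grows. This is exactly what a logistic regression model exploits: it learns a p for each input from the observed 0/1 labels.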