Bernoulli Likelihood and the Cross-Entropy Loss Function
The Bernoulli distribution models binary outcomes, such as success/failure or yes/no, using a single parameter: the probability of success, often denoted p. In machine learning, binary classification tasks typically assume that each label y (where y is 0 or 1) is drawn from a Bernoulli distribution, with the model assigning a predicted probability p to the positive class. The likelihood of observing a true label y given a predicted probability p is written as:
$$L(p; y) = p^{y}(1 - p)^{1 - y}$$

This likelihood reflects how probable the observed outcome is under the model's prediction. However, when training models, you typically maximize the log-likelihood, which for a single observation becomes:
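As a quick illustration, the likelihood can be computed directly with NumPy. The function name bernoulli_likelihood below is just an illustrative choice, not part of any library.

import numpy as np

def bernoulli_likelihood(p, y):
    # Likelihood of observing label y (0 or 1) given predicted probability p
    return p**y * (1 - p)**(1 - y)

# A confident, correct prediction yields a high likelihood...
print(bernoulli_likelihood(0.9, 1))   # 0.9
# ...while the same prediction for the opposite label yields a low one
print(bernoulli_likelihood(0.9, 0))   # approximately 0.1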
$$\log L(p; y) = y \log(p) + (1 - y) \log(1 - p)$$

This log-likelihood is fundamental to binary classifiers such as logistic regression. Its negative, the negative log-likelihood, is widely known as the cross-entropy loss in the machine learning literature. For a batch of data, the average negative log-likelihood (the cross-entropy loss) is:
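To make the link to logistic regression concrete, here is a minimal sketch in which the predicted probability p comes from a sigmoid applied to a linear score; the weights, inputs, and bias are made-up values for illustration.

import numpy as np

def sigmoid(z):
    # Maps a real-valued score to a probability in (0, 1)
    return 1 / (1 + np.exp(-z))

def log_likelihood(p, y):
    # Bernoulli log-likelihood of label y under predicted probability p
    return y * np.log(p) + (1 - y) * np.log(1 - p)

# Hypothetical logistic-regression score for one example: w.x + b
w = np.array([0.8, -0.5])
x = np.array([1.2, 0.4])
b = 0.1
p = sigmoid(w @ x + b)

print(p)                      # predicted probability of the positive class
print(log_likelihood(p, 1))   # log-likelihood if the true label is 1
print(-log_likelihood(p, 1))  # negative log-likelihood = cross-entropy for this example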
$$\text{Cross-entropy} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$$

This loss function penalizes confident but wrong predictions much more heavily than less confident ones, making it a natural fit for probabilistic binary classification.
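A sketch of the batch version in NumPy is shown below; clipping the probabilities away from exactly 0 and 1 is a common safeguard against log(0), and the labels and predictions here are made-up values for illustration.

import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    # Average negative log-likelihood over a batch of binary labels
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1, 0, 1, 1, 0])
p_pred = np.array([0.9, 0.2, 0.7, 0.4, 0.1])

print(binary_cross_entropy(y_true, p_pred))  # approximately 0.341

If scikit-learn is available, sklearn.metrics.log_loss(y_true, p_pred) computes the same quantity and can serve as a cross-check. The script below then visualizes the log-likelihood and cross-entropy as functions of p for both possible labels.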
import numpy as np
import matplotlib.pyplot as plt

# Range of predicted probabilities (avoid exactly 0 and 1 so log() stays finite)
p = np.linspace(0.001, 0.999, 200)

# True label y = 1
log_likelihood_y1 = np.log(p)
cross_entropy_y1 = -np.log(p)

# True label y = 0
log_likelihood_y0 = np.log(1 - p)
cross_entropy_y0 = -np.log(1 - p)

plt.figure(figsize=(10, 6))
plt.plot(p, log_likelihood_y1, label='Log-Likelihood (y=1)', color='blue')
plt.plot(p, cross_entropy_y1, '--', label='Cross-Entropy (y=1)', color='blue', alpha=0.5)
plt.plot(p, log_likelihood_y0, label='Log-Likelihood (y=0)', color='red')
plt.plot(p, cross_entropy_y0, '--', label='Cross-Entropy (y=0)', color='red', alpha=0.5)
plt.xlabel('Predicted Probability $p$')
plt.ylabel('Value')
plt.title('Bernoulli Log-Likelihood and Cross-Entropy Loss')
plt.legend()
plt.grid(True)
plt.show()
The plot above illustrates how the log-likelihood and cross-entropy loss behave as the predicted probability p varies, for both possible true labels. Notice that the log-likelihood reaches its maximum when the predicted probability matches the true label (either 0 or 1), and drops off rapidly as the prediction becomes less accurate. The cross-entropy loss, being the negative log-likelihood, is minimized when predictions are accurate and grows quickly for wrong, confident predictions. This property makes cross-entropy a natural loss function for Bernoulli models: it directly reflects the probability assigned to the true outcome and strongly discourages overconfident errors. As a result, optimizing cross-entropy encourages models to produce well-calibrated probabilities, which is essential for robust binary classification.
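To see this asymmetry in penalties numerically, the short sketch below compares the loss for a mildly wrong prediction with that for a confidently wrong one; the probabilities are arbitrary illustrative values.

import numpy as np

def cross_entropy(p, y):
    # Per-example cross-entropy (negative Bernoulli log-likelihood)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# True label is 0 in both cases
print(cross_entropy(0.6, 0))   # mildly wrong: -log(0.4), about 0.92
print(cross_entropy(0.99, 0))  # confidently wrong: -log(0.01), about 4.61

Moving from 60% to 99% confidence in the wrong class increases the loss roughly fivefold, which is why gradient-based training pushes predicted probabilities toward honest, calibrated values.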