Learn Cumulative Distribution Functions and Probability Density Functions | Additional Statements From The Probability Theory

Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) is a function that describes the cumulative probability of a random variable taking on a value less than or equal to a given value.

Mathematically, the CDF of a random variable X, denoted F(x), is defined as:

F(x) = P(X ≤ x), the probability that X is less than or equal to x.

The CDF is a convenient way to describe continuous random variables.
In the example below, we create a normally distributed random variable and plot its CDF using the .cdf() method.


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Generate a random variable following a normal distribution
mu = 0  # mean
sigma = 1  # standard deviation
x = np.linspace(-5, 5, 100)  # x values
rv = norm(loc=mu, scale=sigma)  # create a normal distribution with given mean and standard deviation

# Compute the CDF for the random variable
cdf = rv.cdf(x)

# Plot the CDF
plt.plot(x, cdf, label='CDF')
plt.xlabel('X')
plt.ylabel('CDF')
plt.title('CDF of a Standard Normal Distribution')
plt.legend()
plt.show()
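The definition F(x) = P(X ≤ x) can also be checked numerically. The sketch below is only an illustration: the helper `empirical_cdf`, the sample size, and the seed are assumptions made for this example and are not part of the lesson's code. It compares the fraction of simulated values that fall at or below x with the value returned by norm.cdf(x).

```python
import numpy as np
from scipy.stats import norm

# Seed and sample size are arbitrary choices made for reproducibility
rng = np.random.default_rng(seed=0)
samples = rng.standard_normal(100_000)  # draws from the standard normal distribution


def empirical_cdf(data, x):
    """Hypothetical helper: fraction of observations in `data` that are <= x."""
    return np.mean(data <= x)


# Compare the simulated fraction with the theoretical CDF at a few points
for x in (-1.0, 0.0, 1.5):
    print(f"x = {x:+.1f}: empirical = {empirical_cdf(samples, x):.4f}, "
          f"norm.cdf = {norm.cdf(x):.4f}")
```

With a large sample the two columns should agree to roughly two decimal places, which is what the definition of the CDF predicts.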

Using the CDF, we can determine the probability that a random variable falls within any interval of interest. Assume that X is a continuous random variable and F(x) is its CDF.
To determine the probability that X belongs to the interval [a, b], we can use the following formula:

P(X ∈ [a, b]) = F(b) - F(a).


from scipy.stats import norm

# Create a frozen standard normal distribution
mu = 0  # mean
sigma = 1  # standard deviation
rv = norm(loc=mu, scale=sigma)

# Calculate probabilities for different ranges
print('Normally distributed variable belongs to [-1, 1] with probability:', round(rv.cdf(1) - rv.cdf(-1), 3))
print('Normally distributed variable belongs to [-2, 2] with probability:', round(rv.cdf(2) - rv.cdf(-2), 3))
print('Normally distributed variable belongs to [-3, 3] with probability:', round(rv.cdf(3) - rv.cdf(-3), 3))
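The same formula works for any continuous distribution that exposes a .cdf() method. The sketch below wraps it in a small helper; the name `interval_probability` and the exponential distribution with scale 2 are assumptions chosen purely for illustration. It also uses .sf(), the survival function 1 - F(x), to get the probability of exceeding a value.

```python
from scipy.stats import expon, norm


def interval_probability(dist, a, b):
    """Hypothetical helper: P(a <= X <= b) for a continuous distribution."""
    if a > b:
        raise ValueError("a must not exceed b")
    return dist.cdf(b) - dist.cdf(a)


standard_normal = norm(loc=0, scale=1)
exponential = expon(scale=2.0)  # illustrative choice: exponential with mean 2

p_normal = interval_probability(standard_normal, -1.96, 1.96)
p_expon = interval_probability(exponential, 0.0, 2.0)
upper_tail = standard_normal.sf(1.96)  # P(X > 1.96) = 1 - F(1.96)

print('Standard normal in [-1.96, 1.96]:', round(p_normal, 3))
print('Exponential (mean 2) in [0, 2]:', round(p_expon, 3))
print('Standard normal above 1.96:', round(upper_tail, 3))
```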

Percent Point Function (PPF)

The Percent Point Function (PPF), also known as the quantile function, is the inverse of the cumulative distribution function (CDF). Given a probability p, it returns the value x such that P(X ≤ x) = p. In Python, it is available through the .ppf() method:


from scipy.stats import norm

# Define probabilities
probabilities = [0.1, 0.5, 0.85]

# For each probability p, find the value x such that P(X <= x) = p
for p in probabilities:
    # The percent point function is the inverse of the CDF; round for readability
    value = round(norm.ppf(p), 3)
    print('Normally distributed variable is less than', value, 'with probability', p)
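Because the PPF is the inverse of the CDF, applying .cdf() to the output of .ppf() should return the original probabilities. The following sketch checks this round trip and, as one possible use, computes the bounds of the central 95% interval of the standard normal distribution. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm

probabilities = np.array([0.1, 0.5, 0.85])

# Round trip: probability -> quantile -> probability
quantiles = norm.ppf(probabilities)
recovered = norm.cdf(quantiles)
print('Round trip recovers the probabilities:', np.allclose(recovered, probabilities))

# Bounds that leave 2.5% of the probability in each tail
lower, upper = norm.ppf([0.025, 0.975])
print(f'Central 95% interval: [{lower:.3f}, {upper:.3f}]')
```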

Probability Density Function (PDF)

The Probability Density Function (PDF) describes the relative likelihood of a continuous random variable taking values near a given point. Its interpretation is similar to that of the probability mass function (PMF), but it applies to continuous rather than discrete random variables.

The PDF defines the shape of the probability distribution of a continuous random variable.

Consider the following example, where the PDF is calculated using the .pdf() method.


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Generate x values for plotting
x = np.linspace(-3, 3, 100)

# Calculate the probability density function (PDF) values for the standard normal distribution
pdf_values = norm.pdf(x, loc=0, scale=1)

# Plot the PDF
plt.plot(x, pdf_values, label='PDF')  # Plot PDF values against x values
plt.xlabel('X')  # Label for x-axis
plt.ylabel('PDF')  # Label for y-axis
plt.title('PDF of a Standard Normal Distribution')  # Title of the plot
plt.legend()  # Show legend
plt.show()  # Display the plot

The PDF value at a point is a probability density rather than a probability: for a continuous variable, the probability of any single exact value is zero. Higher PDF values indicate that the variable is more likely to fall near that point, while lower values indicate that it is less likely.
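One consequence worth seeing in code is that a PDF value is not itself a probability and can be greater than 1. The sketch below uses a normal distribution with a standard deviation of 0.1, a value assumed only for illustration, to show a density above 1 at the mean while the probability over a wide interval still does not exceed 1.

```python
from scipy.stats import norm

# Illustrative choice: a narrow normal distribution (standard deviation 0.1)
narrow = norm(loc=0, scale=0.1)

# Density at the mean: 1 / (0.1 * sqrt(2 * pi)), which is greater than 1
density_at_mean = narrow.pdf(0)

# Probability of falling in [-1, 1], i.e. within 10 standard deviations of the mean
total_probability = narrow.cdf(1) - narrow.cdf(-1)

print('PDF value at the mean:', round(density_at_mean, 3))
print('Probability of [-1, 1]:', round(total_probability, 3))
```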

To determine the probability of a continuous variable falling within a specific range, we proceed much as with the PMF, where we would sum the probabilities of all values in that range. However, since a continuous variable can take infinitely many values within any range, we replace the sum with the area under the PDF curve over that range, which equals F(b) - F(a) for the interval [a, b].
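The area under the PDF curve can be approximated by numerical integration. The sketch below uses scipy.integrate.quad, which is one possible choice and is not used elsewhere in this lesson, to integrate the standard normal PDF over [-1, 1] and compares the result with the difference of CDF values over the same interval.

```python
from scipy.integrate import quad
from scipy.stats import norm

a, b = -1.0, 1.0  # interval chosen for illustration

# quad returns the integral estimate and an estimate of the absolute error
area, abs_error = quad(norm.pdf, a, b)

# The same probability obtained from the CDF
cdf_difference = norm.cdf(b) - norm.cdf(a)

print('Area under the PDF on [-1, 1]:', round(area, 3))
print('CDF difference on [-1, 1]:', round(cdf_difference, 3))
```

Both numbers describe the same probability, so they should agree up to the integration error.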

Section 1. Chapter 3
