Probability Theory
Normal Distribution
Hi there! It is the right time to move on to more complex distributions: continuous ones!
What is it?
A continuous distribution has infinitely many possible outcomes: the variable can take any value within a range. We therefore cannot list every outcome with its probability in a table, and the probability of any single exact value is zero. Instead, such a distribution is described by a curve (its density), and probabilities are calculated for intervals of values.
Let's start with the most widely used and most interesting one: the normal distribution!
To work with this distribution, import the norm object from scipy.stats. We can then apply functions such as sf and cdf to it, but not pmf. The function with the corresponding meaning for continuous distributions is pdf.
Examples:
- Animal sizes.
- People's heights.
- Birth weights.
To understand the key characteristics, it is better to first look at the graph.
Distribution of emperor penguin heights in meters.
Key characteristics:
The graph is bell-shaped and symmetric around the mean, and it has thin tails.
Graph explanation:
You probably remember the mean and the standard deviation. Here the mean equals 1.2 meters and the standard deviation (std) equals 0.3 meters. The brightest yellow rectangle has mean - std as its left border and mean + std as its right border. The important point is that the values between these borders amount to 68.3% of all values. The number 68.3% can be called the confidence level, and the range it covers the confidence interval.
The values between mean + 2 * std and mean - 2 * std amount to 95.4% of all values.
The values between mean + 3 * std and mean - 3 * std amount to 99.7% of all values.
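The three percentages can be checked numerically. The following is a minimal sketch, assuming the mean of 1.2 and standard deviation of 0.3 from the graph; the helper name share_within is made up for this illustration and is not part of scipy.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights from the graph, in meters


def share_within(k, loc=mean, scale=std):
    """Share of values lying within k standard deviations of the mean."""
    upper = norm.cdf(loc + k * scale, loc, scale)  # P(X <= mean + k * std)
    lower = norm.cdf(loc - k * scale, loc, scale)  # P(X <= mean - k * std)
    return upper - lower


for k in (1, 2, 3):
    print(f"within {k} std: {share_within(k):.1%}")
```

The result does not depend on the particular mean and standard deviation: any normal distribution gives the same three shares.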
Confidence interval:
In our case, with a mean of 1.2 and a standard deviation of 0.3, we can say that:
- With 68.3% confidence, a randomly chosen emperor penguin's height is between 1.2 - 0.3 meters and 1.2 + 0.3 meters, that is, between 0.9 and 1.5 meters.
- With 95.4% confidence, a randomly chosen emperor penguin's height is between 1.2 - 2 * 0.3 meters and 1.2 + 2 * 0.3 meters, that is, between 0.6 and 1.8 meters.
- With 99.7% confidence, a randomly chosen emperor penguin's height is between 1.2 - 3 * 0.3 meters and 1.2 + 3 * 0.3 meters, that is, between 0.3 and 2.1 meters.
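The same borders can be obtained from scipy rather than by hand. This is a rough sketch, assuming the penguin parameters above; it uses norm.interval, which returns the two endpoints of the central range that contains a given share of the distribution. The dictionary name intervals is chosen only for this example.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights, in meters

intervals = {}
for k in (1, 2, 3):
    # Share of a normal distribution within k standard deviations of the mean.
    coverage = norm.cdf(k) - norm.cdf(-k)
    # Endpoints of the central range that contains this share of values.
    low, high = norm.interval(coverage, loc=mean, scale=std)
    intervals[k] = (low, high)
    print(f"{coverage:.1%}: from {low:.2f} to {high:.2f} meters")
```

Passing the coverage positionally keeps the call independent of the keyword name of the first argument, which has differed between scipy versions.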
Let's recall some functions, but for the normal distribution (they are a little bit different):
- For generating a random sample: norm.rvs(loc, scale, size).
- For calculating the probability density at the value x: norm.pdf(x, loc, scale). For a continuous distribution this is not the probability of getting exactly x, which is zero.
- For calculating the probability of getting a value greater than x: norm.sf(x, loc, scale).
- For calculating the probability of getting a value less than or equal to x: norm.cdf(x, loc, scale).
Parameters:
- loc is the mean of the distribution.
- scale is the standard deviation of the distribution.
- size is the number of values to generate.
- x is the value at which the function is evaluated.
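To see these functions in action, here is a small sketch that reuses the penguin parameters (mean 1.2, standard deviation 0.3). The random_state argument is an addition made here only so that the sample is reproducible, and the variable names are illustrative.

```python
from scipy.stats import norm

mean, std = 1.2, 0.3  # penguin heights, in meters

# Five random heights; random_state fixes the seed so reruns give the same sample.
sample = norm.rvs(loc=mean, scale=std, size=5, random_state=0)

# Density at the mean. It is a height of the curve, not a probability,
# which is why it can be greater than 1.
density = norm.pdf(1.2, loc=mean, scale=std)

# Probability that a penguin is taller than 1.5 meters.
p_taller = norm.sf(1.5, loc=mean, scale=std)

# Probability that a penguin is 1.5 meters tall or shorter.
p_shorter = norm.cdf(1.5, loc=mean, scale=std)

print(sample)
print(f"density at 1.2: {density:.3f}")
print(f"P(height > 1.5):  {p_taller:.3f}")
print(f"P(height <= 1.5): {p_shorter:.3f}")
```

Because sf and cdf describe complementary events, their results for the same x always add up to 1.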
Task
Build a random distribution of cat weights! Follow the algorithm:
- Import the norm object from scipy.stats.
- Import matplotlib.pyplot with the plt alias.
- Import seaborn with the sns alias.
- Generate a random normal distribution with the attributes:
  - Mean equals 4.2.
  - Standard deviation equals 1.
- Create a histplot with these parameters:
  - The dist variable as the data attribute.
  - True as the kde attribute.
- Output the graph.