Ultimate Visualization with Python
Pie Chart
A pie chart is a graph that uses a circle divided into slices (segments) to represent the numerical proportion (percentage distribution) of nominal data. Here is an example of a pie chart:
This chart represents the percentage distribution of the population by region. Looks pretty neat, doesn’t it?
Note
Despite being neat, pie charts should mostly be avoided, since they distort the view of the data: a category with many instances will seem even bigger, and a category with few instances will seem even smaller.
Now we’ll discuss how to create such a chart in matplotlib.
Pie Chart with Labels
The function we will use is pie() from the pyplot module. Its first and only required parameter is our data (called x).
Another important parameter is labels, which specifies the labels of the segments (it should be a sequence of strings).
Let’s first have a look at the data for our example:
    import pandas as pd

    url = 'https://codefinity-content-media-v2.s3.eu-west-1.amazonaws.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/population.csv'
    population_df = pd.read_csv(url)
    print(population_df)

This DataFrame contains the population of each region. Now to our example:
    import matplotlib.pyplot as plt
    import pandas as pd

    population_df = pd.read_csv('https://codefinity-content-media-v2.s3.eu-west-1.amazonaws.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/population.csv')

    # Creating a pie chart and setting the labels for each region
    plt.pie(population_df['Population'], labels=population_df['Region'])
    plt.show()
We called the pie() function, passing the Series with population data as the x parameter and the Series with region names as the labels for the segments.
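To make the link between x and the size of each segment concrete, here is a minimal sketch that does not need the CSV file. It assumes a small made-up DataFrame (sample_df and its values are hypothetical, not the course data) and the non-interactive Agg backend so that it runs without a display. Since pie() sizes each wedge by its value's share of the total, a region holding 40% of the total should get a wedge of about 144 of the 360 degrees.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for population.csv; the values are made up for illustration
sample_df = pd.DataFrame({
    "Region": ["North", "South", "East", "West"],
    "Population": [40, 30, 20, 10],
})

plt.pie(sample_df["Population"], labels=sample_df["Region"])

# pie() draws one wedge per value; the wedge angle reflects the value's share of the total
wedges = plt.gca().patches
angles = [round(float(w.theta2 - w.theta1), 1) for w in wedges]
print(dict(zip(sample_df["Region"], angles)))
```

In an interactive session, calling plt.show() after plt.pie() would display the chart, as in the example above.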
Adding Percents
The chart looks alright, but there is still something missing: we don’t know the exact percentage of each individual region. Fortunately, there is an autopct parameter, which specifies the format of the percentage labels on the wedges (these labels are placed inside the wedges).
You can pass either a format string or a function; here we’ll focus on the format string. Let’s now modify our example:
    import matplotlib.pyplot as plt
    import pandas as pd

    population_df = pd.read_csv('https://codefinity-content-media-v2.s3.eu-west-1.amazonaws.com/courses/47339f29-4722-4e72-a0d4-6112c70ff738/population.csv')

    # Setting the label for each region and its percentage
    plt.pie(population_df['Population'], labels=population_df['Region'], autopct='%1.1f%%')
    plt.show()
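The lesson focuses on the format string, but since autopct also accepts a function, a brief sketch of that option may help. The function receives the wedge's percentage (a number between 0 and 100) and returns the text to display. The names regions, populations, and percent_and_count, as well as the values, are hypothetical and chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data, not the course dataset
regions = ["North", "South", "East", "West"]
populations = [40, 30, 20, 10]
total = sum(populations)

def percent_and_count(pct):
    """Take a wedge's percentage (0-100) and return the label text for it."""
    count = round(pct * total / 100)  # convert the percentage back to an absolute value
    return f"{pct:.1f}%\n({count})"

# Passing the function instead of a format string
plt.pie(populations, labels=regions, autopct=percent_and_count)
```

A function is mainly useful when the label needs something a format string cannot express, such as the absolute value next to the percentage.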
Format String
Here we passed the following string: %1.1f%%.
f indicates that the value should be treated as a floating-point number (d indicates an integer), and .1 indicates that there should be exactly one digit after the decimal point. The 1 before the dot is the minimum field width.
The first percent sign marks the start of the format specifier, and the doubled percent sign at the end (%%) means that the number should be followed by a literal percent sign (%).
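The format string can be tried outside of matplotlib with Python's % operator, which follows the same printf-style rules. The sketch below uses a hypothetical helper, format_percent, and made-up values to show how a few format strings render a percentage.

```python
def format_percent(fmt, value):
    """Apply a printf-style format string to a percentage value (hypothetical helper)."""
    return fmt % value

# The format string from the lesson: one digit after the decimal point
print(format_percent('%1.1f%%', 25.0))     # 25.0%
print(format_percent('%1.1f%%', 33.3333))  # 33.3%

# Two digits after the decimal point
print(format_percent('%.2f%%', 33.3333))   # 33.33%

# d formats the value as an integer, dropping the fractional part
print(format_percent('%d%%', 33.3333))     # 33%
```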
If you want to explore more parameters of the pie() function, see its documentation.
Task
- Use the correct function to create a pie chart.
- Use incomes as the data for the pie chart (the first argument).
- Set the labels to names via the second argument (labels).
- Set the format of the percentage to a floating-point number with one digit after the decimal point via the third argument (autopct).