Data Manipulation using pandas
Histograms
Let's move on to the first visualization steps. By now you already know how to clean, prepare, and aggregate data for further analysis. We'll start with histograms.
What is a histogram? A histogram is a graph that shows how often numerical values occur, usually grouped into numerical intervals. To build a histogram in pandas, apply the .hist() method to the selected data. For instance, let's build a histogram for the 'totinch' column.
Note that you don't need to use the print() function to output the plot.
# Importing the library
import pandas as pd

# Reading the file
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data4.csv')

# Histogram for the totinch column values
df.totinch.hist()
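To see what the bars of a histogram represent, it can help to compute the frequencies directly. The sketch below is only an illustration: the `values` series is made up (it is not the course dataset), and it uses `numpy.histogram`, which is the function matplotlib relies on for binning when pandas draws the plot.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, made up for illustration (not the course dataset)
values = pd.Series([1, 2, 2, 3, 5, 6, 8, 9, 9, 10])

# Split the value range into 3 equal-width intervals and count the values in each
counts, edges = np.histogram(values, bins=3)

# Print each interval with the number of values that fall into it
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count} values")
```

Each count corresponds to the height of one rectangle in the plot, and the counts always add up to the number of values in the column.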
As parameters, you can set color (the color of the rectangles, such as 'r', 'g', 'b', etc.) or bins (the number of intervals into which the data is divided). Let's make the rectangles red and set the number of intervals to 50.
# Importing the library
import pandas as pd

# Reading the file
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data4.csv')

# Red histogram with 50 intervals for the totinch column values
df.totinch.hist(color='r', bins=50)
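The effect of bins can also be checked numerically. The following is a minimal sketch under the same assumptions as before: the `values` series and the `results` dictionary are hypothetical names introduced for illustration, and `numpy.histogram` stands in for the binning that happens behind the plot.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, made up for illustration (not the course dataset)
values = pd.Series([1, 2, 2, 3, 5, 6, 8, 9, 9, 10])

# Compute the counts and interval edges for two different numbers of intervals
results = {bins: np.histogram(values, bins=bins) for bins in (2, 5)}

for bins, (counts, edges) in results.items():
    width = edges[1] - edges[0]  # all intervals have the same width
    print(f"bins={bins}: interval width {width:.1f}, counts {counts.tolist()}")
```

A larger bins value gives narrower intervals and a more detailed picture of the distribution, while the total number of counted values stays the same.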