Advanced Techniques in pandas
Dealing With Pivot Tables
pandas has an analog of the .groupby() method that leads to the same result: the pd.pivot_table() function. Which one to use is up to you. Let's learn it through an example. Using pd.pivot_table(), we will calculate the mean of the column 'Length' for rows that share the same value in the column 'Flight':
import pandas as pd

data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/plane', index_col = 0)

# The code using .groupby()
data_flights_1 = data[['Length', 'Flight']].groupby('Flight').mean()

# The same code using .groupby()
data_flights_2 = data[['Length', 'Flight']].groupby('Flight').agg('mean')

# The same code using .pivot_table()
data_flights_3 = pd.pivot_table(data, values = 'Length', index = 'Flight', aggfunc = 'mean')

print(data_flights_1.head())
Explanation:
pd.pivot_table() - the function that creates pivot tables;
data - the data frame that we use;
values = 'Length' - to the argument values, we assign the column or columns to which the calculation (average, maximum, etc.) is applied within each group. If you want to aggregate several columns, put them in a list; the order doesn't affect the computed values;
index = 'Flight' - index is the argument to which you assign the name of the column or columns that you want to group by. If you want to group by several columns, put them in a list; the order is crucial, like in the .groupby() method;
aggfunc = 'mean' - the counterpart of .agg() used with .groupby(). aggfunc accepts the same kinds of values as .agg(): a single function, several functions in a list, or a dictionary (curly brackets) that specifies functions for different columns.
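The options described above can be tried out on a small table. The following is a minimal sketch, not the course's dataset: the data frame is made up for illustration, and the 'Airline' and 'Delay' columns are hypothetical names that do not come from the original file. It shows grouping by several columns, passing several functions in a list, and assigning functions to columns with a dictionary.

```python
import pandas as pd

# Made-up data for illustration only; 'Airline' and 'Delay' are
# hypothetical columns, 'Flight' and 'Length' mirror the lesson's names.
data = pd.DataFrame({
    'Airline': ['AA', 'AA', 'AA', 'BB', 'BB', 'BB'],
    'Flight':  [1, 1, 2, 1, 2, 2],
    'Length':  [100, 140, 200, 90, 210, 250],
    'Delay':   [0, 1, 1, 0, 0, 1],
})

# Several columns in index: rows are grouped by 'Airline' first, then by 'Flight'.
# Swapping the list order would change the nesting of the resulting index.
by_two = pd.pivot_table(data, values='Length',
                        index=['Airline', 'Flight'], aggfunc='mean')

# The same grouping written with .groupby(), for comparison.
by_two_groupby = data.groupby(['Airline', 'Flight'])[['Length']].mean()

# Several functions in a list: each function is applied to 'Length'.
several_funcs = pd.pivot_table(data, values='Length',
                               index='Flight', aggfunc=['mean', 'max'])

# A dictionary: a different function for each column.
per_column = pd.pivot_table(data, values=['Length', 'Delay'], index='Flight',
                            aggfunc={'Length': 'mean', 'Delay': 'sum'})

print(by_two)
print(by_two_groupby)
print(several_funcs)
print(per_column)
```

Printing by_two next to by_two_groupby is a quick way to see that both approaches produce the same means for each ('Airline', 'Flight') pair.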