Challenge: Group by Period?
In earlier courses and chapters, you grouped observations by the values of some columns. Can we do the same with time-series data? For example, can we summarize the data for each week present in the dataset? It sounds like a complicated task.
Actually, pandas can handle that too. The .resample method groups observations by different time periods. Let's look at its signature.
df.resample(rule, axis = 0, closed = None, label = None, convention = 'start', kind = None, loffset = None, base = None, on = None, level = None, origin = 'start_day', offset = None)
The most important, and the only required, argument is rule: the offset string or object representing the target conversion. Put simply, it is the period by which we want to divide the data. A list of offset aliases used for resampling is given in the table below the task.
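To make the role of rule more concrete, here is a minimal sketch on made-up data. The toy dataframe and its value column are assumptions for illustration only and are not part of the course dataset. It passes the weekly alias W as rule and sums the values that fall into each week.

```python
import pandas as pd

# Hypothetical toy data: one observation per day for 14 days,
# starting on Monday 2023-01-02 (not the course dataset).
idx = pd.date_range("2023-01-02", periods=14, freq="D")
toy = pd.DataFrame({"value": list(range(14))}, index=idx)

# rule="W" splits the rows into weekly bins (by default, weeks ending on Sunday);
# .sum() then aggregates the rows inside each bin.
weekly_sum = toy.resample("W").sum()
print(weekly_sum)
```

Like groupby, resample on its own only defines the bins. An aggregation such as .sum(), .mean() or .size() has to follow to produce a result.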
Task
- Set the pickup_datetime column of the df dataframe as the index of df.
- Calculate the number of trips in each month available in the dataset.
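One possible way to approach the two steps is sketched below. It is not the course's official solution. The small df built here is a hypothetical stand-in for the real dataset: only the pickup_datetime column name comes from the task, and the fare column and all values are invented. The sketch uses the MS (month start) alias as an assumption, because it is accepted by both older and newer pandas versions. The M alias from the table gives the same counts, labeled by month end instead.

```python
import pandas as pd

# Hypothetical stand-in for the course's df: only the pickup_datetime
# column name comes from the task; the fare column and values are made up.
df = pd.DataFrame({
    "pickup_datetime": pd.to_datetime([
        "2016-01-05 08:00", "2016-01-20 17:30",
        "2016-02-03 09:15",
        "2016-03-10 12:00", "2016-03-28 23:45",
    ]),
    "fare": [7.5, 12.0, 9.0, 5.5, 20.0],
})

# Step 1: make pickup_datetime the index, so resample can bin by time.
df = df.set_index("pickup_datetime")

# Step 2: count the rows (trips) in each monthly bin.
# "MS" labels each bin by the first day of the month.
trips_per_month = df.resample("MS").size()
print(trips_per_month)
```

Here .size() counts rows per bin regardless of missing values. Using .count() instead would count non-null values per column.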
Alias | Meaning |
---|---|
B | Business day frequency |
C | Custom business day frequency |
D | Calendar day frequency |
W | Weekly frequency |
M | Month end frequency |
Q | Quarter end frequency |
There are many more aliases available. You can read about them in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases (Offset aliases). Note that pandas 2.2 and later deprecate the M and Q aliases in favor of ME and QE.