Challenge: Group by Period?
Previously, in other courses and chapters, you grouped observations by the values of some columns. But can we do the same with time-series data? For example, can we summarize the data for each week present in the dataset? It sounds like a complicated task.
Actually, pandas can handle this too. The .resample method groups time-series data by different periods. Let's look at its signature.
df.resample(rule, axis = 0, closed = None, label = None, convention = 'start', kind = None, loffset = None, base = None, on = None, level = None, origin = 'start_day', offset = None)
The most important argument, and the only required one, is rule: the offset string or object representing the target conversion. Put simply, it is the period by which we want to divide our data. There is a list of offset aliases used for resampling; you can find some of them in the table below the task.
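To make the role of rule concrete, here is a minimal sketch on made-up data. The sales dataframe and its amount column are hypothetical and are not part of the course dataset; the only thing being illustrated is passing an offset alias ("W", weekly) as rule.

```python
import pandas as pd

# Hypothetical data: one observation per day for four full weeks,
# starting on Monday, 2024-01-01.
idx = pd.date_range("2024-01-01", periods=28, freq="D")
sales = pd.DataFrame({"amount": [1] * len(idx)}, index=idx)

# rule="W" splits the rows into weekly bins; by default each bin is
# labelled with the Sunday that ends the week.
weekly = sales.resample("W").sum()
print(weekly)
```

Like groupby, resample on its own only defines the groups; an aggregation such as .sum(), .mean(), or .size() is still needed to produce a result.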
Task
- Set the pickup_datetime column of the df dataframe as the index of df.
- Calculate the number of trips in each month available in the dataset.
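One possible way to approach the two steps is sketched below. This is not the official solution: the dataframe is a small hypothetical stand-in for the course dataset (only the pickup_datetime column name comes from the task), and the sketch assumes one row per trip. It uses the "MS" (month start) alias, which is not in the table below, because the month-end alias is spelled "M" in older pandas releases and "ME" from pandas 2.2 onward, while "MS" works in both.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset: only the column name
# pickup_datetime comes from the task; all values are made up.
df = pd.DataFrame({
    "pickup_datetime": pd.to_datetime([
        "2016-01-05 08:15", "2016-01-20 17:40",
        "2016-02-03 09:00", "2016-02-14 22:10", "2016-02-28 06:30",
        "2016-03-11 13:45",
    ]),
    "passenger_count": [1, 2, 1, 3, 1, 2],  # hypothetical extra column
})

# Step 1: make pickup_datetime the index of df.
df = df.set_index("pickup_datetime")

# Step 2: count the rows (trips) that fall into each calendar month.
# "MS" labels every monthly bin with the first day of that month.
trips_per_month = df.resample("MS").size()
print(trips_per_month)
```

Using .size() counts rows regardless of missing values; .count() would instead count the non-null entries of each column separately.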
Alias | Meaning |
---|---|
B | Business day frequency |
C | Custom business day frequency |
D | Calendar day frequency |
W | Weekly frequency |
M | Month end frequency |
Q | Quarter end frequency |
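There are many more aliases available. You can read about them in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases (Offset aliases). Note that starting with pandas 2.2 the month-end and quarter-end aliases are spelled ME and QE, and the older M and Q spellings are deprecated.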
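As a small illustration of how the choice of alias changes the result, the sketch below resamples the same made-up series with two aliases from the table, "D" and "W". The timestamps are hypothetical and chosen only to keep the output short.

```python
import pandas as pd

# Hypothetical events: two on Monday 2024-01-01, one on Wednesday
# 2024-01-03, and one on the following Monday, 2024-01-08.
stamps = pd.to_datetime([
    "2024-01-01 09:00", "2024-01-01 18:00",
    "2024-01-03 12:00", "2024-01-08 07:30",
])
events = pd.Series(1, index=stamps)

# "D": one bin per calendar day; days without events get a sum of 0.
daily = events.resample("D").sum()

# "W": one bin per week, labelled by the Sunday that ends it.
weekly = events.resample("W").sum()

print(daily)
print(weekly)
```

The daily result covers every day between the first and last timestamp, including empty ones, which is one practical difference between resample and a plain groupby on a date column.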