Challenge: Group by Period?
In previous courses and chapters, you grouped observations by the values of one or more columns. Can we do the same with time-series data? For example, can we summarize the data for each week present in the dataset? It sounds like a complicated task.
Actually, pandas can handle that too. The .resample method groups observations by time period. Let's look at its signature.
df.resample(rule, axis=0, closed=None, label=None, convention='start', kind=None, loffset=None, base=None, on=None, level=None, origin='start_day', offset=None)
The most important argument, and the only required one, is rule: the offset string or object representing the target conversion. Put simply, it is the period by which we want to split the data. The period is specified with offset aliases; the most common ones are listed in the table below the task.
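To make the role of rule concrete, here is a minimal sketch on made-up data. The dataframe, its trips column, and the dates are assumptions chosen for illustration and are not part of the course dataset.

```python
import pandas as pd

# Hypothetical data: one trip per day for two full weeks,
# from Monday 2023-01-02 to Sunday 2023-01-15.
index = pd.date_range("2023-01-02", periods=14, freq="D")
df = pd.DataFrame({"trips": [1] * 14}, index=index)

# rule="W" splits the rows into weekly bins; sum() aggregates each bin.
weekly = df.resample("W").sum()
print(weekly)
```

With the default weekly rule, each bin ends on a Sunday and is labelled with that date, so this data produces two rows (2023-01-08 and 2023-01-15) with 7 trips each.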
Task
- Set the pickup_datetime column of the df dataframe as the index of df.
- Calculate the number of trips for each month available in the dataset.
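One possible way to approach the task is sketched below. It uses a small made-up dataframe in place of the course dataset: the fare column and every value are hypothetical. The rule is passed as a pd.offsets.MonthEnd() object, which corresponds to the month-end alias M from the table (newer pandas releases spell that alias ME), so the sketch should not depend on which spelling your pandas version accepts.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset.
df = pd.DataFrame({
    "pickup_datetime": pd.to_datetime([
        "2023-01-05 08:15", "2023-01-20 17:40",
        "2023-02-03 09:05",
        "2023-03-10 12:30", "2023-03-11 23:10", "2023-03-28 07:45",
    ]),
    "fare": [12.5, 8.0, 15.2, 7.3, 22.1, 9.9],  # made-up values
})

# Step 1: use the pickup time as the index so resample can bin on it.
df = df.set_index("pickup_datetime")

# Step 2: count the rows (trips) that fall into each month-end bin.
monthly_trips = df.resample(pd.offsets.MonthEnd()).size()
print(monthly_trips)
```

Here size() counts rows per bin regardless of missing values; count() would instead count non-null values per column.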
Alias | Meaning |
---|---|
B | Business day frequency |
C | Custom business day frequency |
D | Calendar day frequency |
W | Weekly frequency |
M | Month end frequency |
Q | Quarter end frequency |
There are many more aliases available. You can read about them in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases (Offset aliases)
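As a rough illustration of how the choice of alias changes the grouping, the sketch below resamples the same made-up daily series with three aliases from the table. The series and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical series: one trip per calendar day for 14 days,
# starting on Monday 2023-01-02.
index = pd.date_range("2023-01-02", periods=14, freq="D")
trips = pd.Series([1] * 14, index=index)

# The same data split by calendar day, by week, and by business day.
for alias in ["D", "W", "B"]:
    resampled = trips.resample(alias).sum()
    print(f"{alias}: {len(resampled)} periods, {resampled.sum()} trips in total")
```

The total number of trips stays the same in every case; only the number of periods the trips are divided into changes.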