Challenge: Group by Period?
In earlier courses and chapters, you grouped observations by the values of one or more columns. Can we do the same with time-series data? For example, can we summarize the data for each week present in the dataset? It sounds like a complicated task.
Actually, pandas can handle this too. The .resample method groups data by different time periods. Let's look at its signature.
df.resample(rule, axis = 0, closed = None, label = None, convention = 'start', kind = None, loffset = None, base = None, on = None, level = None, origin = 'start_day', offset = None)
The most important argument, and the only required one, is rule: the offset string or object representing the target conversion. Put simply, it is the period we want to split our data by. The optional parameters vary between pandas versions, so check the documentation for the version you use. There is a list of offset aliases that can be passed as rule; you can find some of them in the table below the task.
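To make the role of rule more concrete, here is a minimal sketch on made-up data. The dataframe daily and its trips column are hypothetical and are not part of the course dataset. It groups 14 daily values into weekly bins with the W alias and sums each bin.

```python
import pandas as pd

# Hypothetical data: one value per day for 14 days, starting on Monday 2021-01-04.
idx = pd.date_range("2021-01-04", periods=14, freq="D")
daily = pd.DataFrame({"trips": range(1, 15)}, index=idx)

# rule="W" groups rows into weekly bins; by default each bin ends on a Sunday.
weekly = daily.resample("W").sum()
print(weekly)
```

Like groupby, resample only defines the groups. An aggregation such as sum, mean, or size has to follow it to produce a result.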
Task
- Set the pickup_datetime column of the df dataframe as the index of df.
- Calculate the number of trips in each month available in the dataset.
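One possible way to approach the two steps above is sketched below. The course dataset is not shown here, so the dataframe is a small synthetic stand-in, and the fare column is an assumption made only for illustration. The sketch uses the MS alias (month start), which is not in the table: recent pandas releases deprecate M in favor of ME, while MS behaves the same across versions and differs only in how each bin is labeled.

```python
import pandas as pd

# Synthetic stand-in for the course dataset (the real df is not shown here).
df = pd.DataFrame({
    "pickup_datetime": pd.to_datetime([
        "2016-01-05 08:00", "2016-01-20 17:30",
        "2016-02-02 09:15", "2016-02-14 12:00", "2016-02-28 23:45",
        "2016-03-10 07:05",
    ]),
    "fare": [10.0, 7.5, 12.0, 5.0, 9.0, 11.0],  # hypothetical column
})

# Step 1: make pickup_datetime the index so that resample can bin on it.
df = df.set_index("pickup_datetime")

# Step 2: count the rows (trips) in each monthly bin.
# "MS" labels each bin with the first day of the month.
trips_per_month = df.resample("MS").size()
print(trips_per_month)
```

Using size counts every row in a bin, whereas count would report non-missing values per column.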
| Alias | Meaning |
|---|---|
| B | Business day frequency |
| C | Custom business day frequency |
| D | Calendar day frequency |
| W | Weekly frequency |
| M | Month end frequency |
| Q | Quarter end frequency |
There are many more aliases available. You can read about them in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases (Offset aliases)