Challenge: Fixing the Issues
In the last chapter you saw that only two rides with negative durations had different minute values in the two columns. But if you pay attention to the seconds, you will notice that in those cases one timestamp falls at the very end of a minute and the other at the very start of the next (59 and 00 seconds, respectively). This means that all the inconsistencies can be interpreted as misuse of the 12-hour and 24-hour formats.
Now that we have found the real cause of the issue, we can fix it! As a reminder, one way to replace values in a DataFrame based on a condition is the .where method:
df['col_name'] = df['col_name'].where(~(condition), other=values_to_replace)
With this approach, every value in col_name is replaced with values_to_replace wherever (condition) is True, and all other values are kept unchanged.
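To make the behaviour concrete, here is a minimal sketch on a toy DataFrame. The data and the replacement value 0 are made up for illustration and are not part of the course dataset. Note that .where keeps values where its first argument is True, which is why the condition is negated with ~.

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({"col_name": [5, -3, 8, -1]})

# Condition marking the values we want to replace
condition = df["col_name"] < 0

# .where keeps values where its argument is True, so we pass the
# negated condition and supply the replacement through `other`
df["col_name"] = df["col_name"].where(~condition, other=0)

print(df["col_name"].tolist())  # [5, 0, 8, 0]
```

Assigning the result back to the column, rather than relying on inplace=True, avoids modifying a temporary copy of the column.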
Task
- For all trips with a negative duration, add 12 hours to the dropoff_datetime column.
- Calculate the duration column again.
- Print the first 5 rows of the updated df.
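The steps above could be sketched roughly as follows. This is only one possible approach under stated assumptions: the two sample rows are invented, the pickup_datetime column name is assumed (the text only names dropoff_datetime and duration), and duration is assumed to be stored as a Timedelta equal to dropoff minus pickup.

```python
import pandas as pd

# Invented sample rows; `pickup_datetime` is an assumed column name
df = pd.DataFrame({
    "pickup_datetime": pd.to_datetime(
        ["2016-01-01 11:50:00", "2016-01-01 09:00:00"]),
    "dropoff_datetime": pd.to_datetime(
        ["2016-01-01 00:10:00", "2016-01-01 09:30:00"]),
})
# Assumption: duration is dropoff minus pickup, stored as a Timedelta
df["duration"] = df["dropoff_datetime"] - df["pickup_datetime"]

# Step 1: add 12 hours to dropoff_datetime for trips with negative duration
negative = df["duration"] < pd.Timedelta(0)
df["dropoff_datetime"] = df["dropoff_datetime"].where(
    ~negative, other=df["dropoff_datetime"] + pd.Timedelta(hours=12)
)

# Step 2: calculate the duration column again
df["duration"] = df["dropoff_datetime"] - df["pickup_datetime"]

# Step 3: print the first 5 rows of the updated df
print(df.head())
```

If duration is stored as a number (for example, seconds) in your dataset, the comparison and the recalculation would need to be adapted accordingly.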