Section 1. Chapter 5
Challenge: Filtering Large Datasets
Imagine you are tasked with analyzing a massive CSV file containing millions of records—too large to load into memory all at once. Your goal is to extract only those rows where a specific column's value exceeds a given threshold, saving the filtered results to a new file. This scenario is common in large-scale data analysis, where efficient, memory-friendly processing is essential.
Task
Implement a function that processes a large CSV file in chunks and writes only the rows where the specified column's value is greater than the given threshold to a new file.
- Read the input CSV file in chunks of size `chunk_size`.
- For each chunk, filter rows where the column specified by `column` is greater than `threshold`.
- Write all filtered rows to the output CSV file, including the header row.
- If no rows match the condition, write only the header to the output file.
Solution
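The page does not show a reference solution, so what follows is only one possible sketch of the task above. It uses Python's standard-library `csv` module and `itertools.islice` to pull `chunk_size` rows at a time. The function name `filter_large_csv`, the `input_path` and `output_path` parameters, the default chunk size, and the returned row count are assumptions made for illustration. The sketch also assumes the input has a header row, well-formed rows, and numeric values in the target column.

```python
import csv
from itertools import islice


def filter_large_csv(input_path, output_path, column, threshold, chunk_size=10_000):
    """Copy rows whose `column` value is greater than `threshold`.

    Reads at most `chunk_size` rows into memory at a time and returns the
    number of data rows written (the header is not counted).
    Hypothetical signature: only chunk_size, column and threshold come
    from the task statement.
    """
    written = 0
    with open(input_path, newline="") as src, \
         open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src)  # assumes the first line is a header
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        # Write the header up front so it is present even if nothing matches.
        writer.writeheader()
        while True:
            # Pull the next chunk; an empty list means the file is exhausted.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            # Assumes every value in `column` parses as a number.
            matches = [row for row in chunk if float(row[column]) > threshold]
            writer.writerows(matches)
            written += len(matches)
    return written
```

A call such as `filter_large_csv("input.csv", "filtered.csv", "value", 10, chunk_size=50_000)` (file and column names are placeholders) would keep memory use bounded by the chunk size rather than the file size. If pandas is available, `pandas.read_csv` with its `chunksize` argument yields DataFrame chunks and could serve the same role.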