Large Data Handling
Section 1. Chapter 5

Challenge: Filtering Large Datasets


Imagine you are tasked with analyzing a massive CSV file containing millions of records, too large to load into memory all at once. Your goal is to extract only the rows where a specific column's value exceeds a given threshold and save them to a new file. This scenario is common in large-scale data analysis, where the file must be processed piece by piece so that memory use stays bounded.

Task


Implement a function that processes a large CSV file in chunks and writes only the rows where the specified column's value is greater than the given threshold to a new file.

  • Read the input CSV file in chunks of size chunk_size.
  • For each chunk, filter rows where the column specified by column is greater than threshold.
  • Write all filtered rows to the output CSV file, preceded by a single header row.
  • If no rows match the condition, write only the header to the output file.

Solution
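
The requirements above can be sketched as follows. This is one possible approach, not necessarily the course's reference solution: it uses only Python's standard csv module rather than a data-frame library, the function name filter_large_csv and the default chunk_size are hypothetical, and it assumes the input file has a header row and that every value in the chosen column parses as a number.

```python
import csv
from itertools import islice


def filter_large_csv(input_path, output_path, column, threshold, chunk_size=10_000):
    """Copy rows whose `column` value exceeds `threshold`; return the number written.

    Assumes (not stated in the task): the input has a header row and every
    value in `column` can be converted with float().
    """
    written = 0
    with open(input_path, newline="") as src, open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()  # written exactly once, even if no row matches

        while True:
            # Pull at most chunk_size rows into memory at a time.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            # Keep only rows strictly greater than the threshold.
            matches = [row for row in chunk if float(row[column]) > threshold]
            writer.writerows(matches)
            written += len(matches)
    return written
```

Because the header is written before any chunk is read, the "no matching rows" case needs no special handling: the output file simply contains the header and nothing else. A call such as filter_large_csv("input.csv", "filtered.csv", "value", 100, chunk_size=50_000) (file and column names are placeholders) would keep only the rows whose value column is greater than 100.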
