Challenge: Deduplicate a Customer List | Deduplication Strategies
Data Cleaning Techniques in Python

Challenge: Deduplicate a Customer List

Task

You are given a list of customer records that contains duplicate entries. Each customer is represented as a dictionary with two fields:

  • name — the customer's full name;
  • email — the email address provided by the customer.

Your goal is to remove duplicate records using a simple matching rule.

Follow these steps:

  1. Two records are considered duplicates if their email fields match exactly.
  2. Create an empty dictionary named unique_customers, where keys are email addresses and values are customer dictionaries.
  3. Loop through the input list customers and add only the first occurrence of each email to unique_customers.
  4. Store the deduplicated list in a new variable named deduplicated_list, which should contain only the unique customer dictionaries (values of unique_customers).

Make sure both unique_customers and deduplicated_list are declared and contain the correct deduplicated data.

Solution
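
The steps in the task can be sketched as follows. This is one possible approach, not necessarily the reference solution for the challenge. The sample customers list is hypothetical data made up for illustration, since the exercise supplies its own input. The sketch relies on Python dictionaries preserving insertion order (guaranteed since Python 3.7), so the result keeps the order in which each email first appears.

```python
# Hypothetical sample input; the exercise provides its own `customers` list.
customers = [
    {"name": "Alice Smith", "email": "alice@example.com"},
    {"name": "Bob Jones", "email": "bob@example.com"},
    {"name": "Alice S.", "email": "alice@example.com"},
    {"name": "Carol White", "email": "carol@example.com"},
    {"name": "Bob J.", "email": "bob@example.com"},
]

# Keys are email addresses, values are the customer dictionaries.
unique_customers = {}

for customer in customers:
    email = customer["email"]
    # Exact match only: no lowercasing or whitespace trimming is applied.
    if email not in unique_customers:
        # Keep only the first record seen for this email.
        unique_customers[email] = customer

# The deduplicated records, in order of first appearance.
deduplicated_list = list(unique_customers.values())

print(deduplicated_list)
```

Because the rule is an exact match, addresses that differ only in letter case or surrounding whitespace would be treated as different customers. Normalizing emails before comparison is a separate design decision that the task does not ask for.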

Section 2. Chapter 3