Working with Duplicates

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv")

# Check which rows are duplicates
print(df.duplicated())

# Count duplicate rows
print(df.duplicated().sum())

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv")

# Count rows whose "Ticket" value repeats an earlier row
print(df.duplicated(subset=["Ticket"]).sum())
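
To make the difference between full-row and subset checks easier to see, here is a minimal sketch on a tiny in-memory table. The `Name` and `Ticket` values are made up for illustration and are not taken from `passengers.csv`.

```python
import pandas as pd

# Hypothetical toy data standing in for the passengers dataset
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Ann", "Cid"],
    "Ticket": ["A1", "B2", "A1", "B2"],
})

# Full-row comparison: a row is flagged only if every column matches an earlier row
full_mask = df.duplicated()
print(full_mask.tolist())    # [False, False, True, False]

# Subset comparison: only the "Ticket" column is compared
ticket_mask = df.duplicated(subset=["Ticket"])
print(ticket_mask.tolist())  # [False, False, True, True]
print(ticket_mask.sum())     # 2
```

By default the first occurrence is not flagged, so the count is the number of repeats rather than the number of rows that share a value.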

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv")

# Remove duplicate rows
print(df.drop_duplicates())

# Remove duplicates based only on values in a subset
print(df.drop_duplicates(subset=["Ticket"]))
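
As a rough illustration of what `drop_duplicates()` returns, the sketch below uses a small made-up table (not the course dataset). It also shows the `keep` parameter, which controls which occurrence survives. The method returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical toy data standing in for the passengers dataset
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Ann", "Cid"],
    "Ticket": ["A1", "B2", "A1", "B2"],
})

# Default: keep the first occurrence of each fully identical row
deduped = df.drop_duplicates()
print(len(deduped))  # 3

# keep="last": for each ticket, keep the last row that carries it
by_ticket_last = df.drop_duplicates(subset=["Ticket"], keep="last")
print(by_ticket_last["Name"].tolist())  # ['Ann', 'Cid']

# keep=False: drop every row whose ticket appears more than once
no_repeats = df.drop_duplicates(subset=["Ticket"], keep=False)
print(len(no_repeats))  # 0

# The original DataFrame still has all four rows
print(len(df))  # 4
```

To keep the result, assign it to a variable, as done here with `deduped`.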

import pandas as pd

df = pd.read_csv("https://staging-content-media-cdn.codefinity.com/courses/64641555-cae4-4cd0-8d29-807aeb6bc0c4/datasets/passengers.csv")

# Count unique values for each column
print(df.nunique())

# Count unique values for a single column
print(df["Embarked"].nunique())
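
One detail worth knowing when reading these counts: `nunique()` skips missing values by default. The sketch below uses made-up data (it assumes nothing about which values `passengers.csv` actually contains) and shows how `dropna=False` changes the result.

```python
import pandas as pd

# Hypothetical toy data; None stands for a missing value
df = pd.DataFrame({
    "Embarked": ["S", "C", None, "S"],
    "Ticket": ["A1", "B2", "A1", "C3"],
})

# Per-column counts; the missing value is not counted
per_column = df.nunique()
print(per_column["Embarked"])  # 2
print(per_column["Ticket"])    # 3

# Count the missing value as its own category
embarked_with_nan = df["Embarked"].nunique(dropna=False)
print(embarked_with_nan)       # 3
```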