Cohort Analysis with Python
Section 1. Chapter 6

Challenge: Mapping the Lifecycle of a SpaceStream Explorer

You are now tasked with calculating advanced retention metrics for SpaceStream, an intergalactic holovision service. As the Lead Data Analyst, you will analyze a cohort of 5 users over three months, tracking who stays loyal and who drifts away. Your goal is to compute three critical metrics for each month: Retention Rate, Churn Rate, and Survival Rate.

Begin by examining the provided dataset, where each user is marked as active (1) or inactive (0) for each month. The columns month_0, month_1, and month_2 represent activity across three consecutive months. Your solution will require you to use pandas to process this dataset and extract the necessary metrics for each month.

import pandas as pd

data = {
    "user_id": [1, 2, 3, 4, 5],
    "month_0": [1, 1, 1, 1, 1],  # Everyone starts active
    "month_1": [1, 0, 1, 0, 1],  # 3 users active
    "month_2": [1, 0, 0, 0, 0],  # 1 user active
}
df = pd.DataFrame(data)

# Calculating retention rate: fraction of original cohort active in each month
cohort_size = len(df)
retention_rate = [df[f"month_{i}"].sum() / cohort_size for i in range(3)]

# Calculating churn rate: 1 - retention rate
churn_rate = [1 - r for r in retention_rate]

# Calculating survival rate: fraction of users still active in ALL months up to i
survival_rate = []
for i in range(3):
    still_active = df[[f"month_{j}" for j in range(i + 1)]].all(axis=1).sum()
    survival_rate.append(still_active / cohort_size)

print("retention_rate:", retention_rate)
print("churn_rate:", churn_rate)
print("survival_rate:", survival_rate)

This code calculates the three metrics for each month. The retention rate is the fraction of the original cohort that is active in a given month. The churn rate is one minus the retention rate, that is, the fraction of the cohort that is not active in that month. The survival rate counts only users who have been continuously active from month_0 through the current month, so a user needs a 1 in every month so far. In this dataset no user returns after going inactive, so the survival rate equals the retention rate in every month.
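Retention and survival only diverge when a user comes back after a lapse. The sketch below illustrates this with a hypothetical four-user cohort that is not part of the lesson data: user 2 is inactive in month_1 and active again in month_2. It uses `cummin` along each row as one possible way to express "active in every month so far"; the lesson's own code uses `all(axis=1)` instead.

```python
import pandas as pd

# Hypothetical cohort (an assumption for illustration, not the lesson dataset):
# user 2 lapses in month_1 and returns in month_2.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "month_0": [1, 1, 1, 1],
    "month_1": [1, 0, 1, 0],
    "month_2": [1, 1, 0, 0],
})
month_cols = ["month_0", "month_1", "month_2"]
cohort_size = len(df)

# Retention looks at each month in isolation.
retention_rate = [float(df[col].sum() / cohort_size) for col in month_cols]

# Survival requires activity in every month so far. A running minimum along
# each row drops a user's flag to 0 from their first inactive month onward.
continuously_active = df[month_cols].cummin(axis=1)
survival_rate = [
    float(continuously_active[col].sum() / cohort_size) for col in month_cols
]

print("retention_rate:", retention_rate)
print("survival_rate:", survival_rate)
```

In month_2 the returning user counts toward retention (2 of 4 users active) but not toward survival (only user 1 has been active in all three months).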

Task

Write a Python function called calculate_cohort_metrics(df) that takes in a DataFrame with the same structure as above and returns three lists: retention_rate, churn_rate, and survival_rate for each month. Your function should:

  • Accept a DataFrame where each row is a user and each column after user_id is a month (e.g., month_0, month_1, ...).
  • Calculate retention rate for each month as the fraction of cohort users active in that month.
  • Calculate churn rate for each month as one minus the retention rate.
  • Calculate survival rate for each month as the fraction of users who were active in all months up to and including that month.
  • Return the three lists in the order: retention_rate, churn_rate, survival_rate.

Solution
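One possible way to meet the requirements above is sketched here. It is a minimal sketch rather than the course's reference solution, and it assumes that every column other than user_id is a 0/1 month flag and that the month columns are already in chronological order.

```python
import pandas as pd


def calculate_cohort_metrics(df):
    """Return (retention_rate, churn_rate, survival_rate), one value per month."""
    # Assumption: every column except user_id is a month flag (1 = active,
    # 0 = inactive) and the columns appear in chronological order.
    month_cols = [col for col in df.columns if col != "user_id"]
    cohort_size = len(df)
    activity = df[month_cols].astype(bool)

    # Fraction of the original cohort active in each month.
    retention_rate = [float(activity[col].sum() / cohort_size) for col in month_cols]

    # Churn is defined in this lesson as one minus retention.
    churn_rate = [1 - r for r in retention_rate]

    # Fraction of users active in every month up to and including month i.
    survival_rate = []
    for i in range(len(month_cols)):
        still_active = activity[month_cols[: i + 1]].all(axis=1).sum()
        survival_rate.append(float(still_active / cohort_size))

    return retention_rate, churn_rate, survival_rate


# Usage with the dataset from the lesson.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "month_0": [1, 1, 1, 1, 1],
    "month_1": [1, 0, 1, 0, 1],
    "month_2": [1, 0, 0, 0, 0],
})
retention_rate, churn_rate, survival_rate = calculate_cohort_metrics(df)
print("retention_rate:", retention_rate)
print("churn_rate:", churn_rate)
print("survival_rate:", survival_rate)
```

Deriving the month columns from `df.columns` instead of hard-coding `range(3)` lets the same function handle cohorts tracked over any number of months.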
