Section 1. Chapter 9
Challenge: Analyzing Sales Data with Spark SQL
Task
You are given a flights dataset as a list of rows. Load it into a DataFrame, register it as a temporary view, and answer the following using spark.sql(). Store results in the specified variables:
- Find the top 3 routes (unique AirportFrom + AirportTo pairs) by average Length – store as a list of tuples [(origin, destination, avg_length), ...] in top_routes_by_length;
- For each airline, find the flight with the longest Length using a window function with row_number() – store as a DataFrame in longest_flight_per_airline with columns Airline, Flight, Length;
- Count how many delayed flights (Delay == 1) per DayOfWeek – store as a list of tuples [(day_of_week, count), ...] sorted by DayOfWeek ascending in delays_by_dow.
Print all results.
Solution
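The course's own solution is not reproduced here. As a minimal sketch of one possible approach, the three queries can be written in plain SQL and checked locally. Because a Spark session may not be available, this sketch swaps in Python's built-in sqlite3 module as a stand-in engine; it is not Spark itself. The sample rows, the table name flights, and the aliases avg_length, rn, and delayed_count are assumptions made for illustration. In PySpark, the same query strings would be passed to spark.sql() after registering the DataFrame with createOrReplaceTempView("flights").

```python
import sqlite3

# Hypothetical sample rows (NOT the exercise's data).
# Columns: Airline, Flight, AirportFrom, AirportTo, DayOfWeek, Length, Delay
rows = [
    ("AA", 101, "JFK", "LAX", 1, 360, 1),
    ("AA", 102, "JFK", "LAX", 2, 350, 0),
    ("AA", 103, "ORD", "DFW", 1, 140, 1),
    ("DL", 201, "ATL", "SEA", 3, 310, 0),
    ("DL", 202, "ATL", "MIA", 3, 110, 1),
    ("UA", 301, "SFO", "EWR", 2, 330, 1),
    ("UA", 302, "SFO", "DEN", 5, 150, 0),
]

# Stand-in for: df = spark.createDataFrame(rows, schema)
#               df.createOrReplaceTempView("flights")
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (Airline TEXT, Flight INTEGER, AirportFrom TEXT, "
    "AirportTo TEXT, DayOfWeek INTEGER, Length INTEGER, Delay INTEGER)"
)
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# 1. Top 3 routes by average flight length.
TOP_ROUTES_SQL = """
    SELECT AirportFrom, AirportTo, AVG(Length) AS avg_length
    FROM flights
    GROUP BY AirportFrom, AirportTo
    ORDER BY avg_length DESC
    LIMIT 3
"""

# 2. Longest flight per airline: number each airline's flights from
#    longest to shortest, then keep only the first row of each partition.
LONGEST_SQL = """
    SELECT Airline, Flight, Length
    FROM (
        SELECT Airline, Flight, Length,
               ROW_NUMBER() OVER (PARTITION BY Airline ORDER BY Length DESC) AS rn
        FROM flights
    ) AS ranked
    WHERE rn = 1
    ORDER BY Airline
"""

# 3. Delayed flights per day of week, ascending by day.
DELAYS_SQL = """
    SELECT DayOfWeek, COUNT(*) AS delayed_count
    FROM flights
    WHERE Delay = 1
    GROUP BY DayOfWeek
    ORDER BY DayOfWeek
"""

# Stand-in for spark.sql(query); sqlite3 returns a list of plain tuples.
top_routes_by_length = conn.execute(TOP_ROUTES_SQL).fetchall()
longest_flight_per_airline = conn.execute(LONGEST_SQL).fetchall()
delays_by_dow = conn.execute(DELAYS_SQL).fetchall()

print(top_routes_by_length)
print(longest_flight_per_airline)
print(delays_by_dow)
```

In Spark, calling collect() on the first and third results returns Row objects, which can be converted with tuple(row) to get the requested lists of tuples. The second result would be kept as a DataFrame rather than collected, since the task asks for one.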