Section 1. Chapter 5
Challenge: Predicting Flight Delays
Task
You are given a flights dataset as a list of rows. Load it into a DataFrame using createDataFrame and train a binary classification model to predict whether a flight is delayed (Delay == 1). Complete all steps and store results in the specified variables:
- Fill nulls in Delay and Length with 0;
- Add a LABEL column: 1.0 if Delay == 1, otherwise 0.0;
- Add IS_WEEKEND: 1 if DayOfWeek >= 6, otherwise 0;
- Split into train (80%) and test (20%) with seed=42;
- Build a Pipeline with StringIndexer on Airline, VectorAssembler on ["Length", "Time", "IS_WEEKEND", "AIRLINE_IDX"], and RandomForestClassifier with numTrees=10, maxDepth=3, seed=42;
- Fit the pipeline and generate predictions on the test set, and store them in predictions;
- Compute AUC-ROC and store it in auc_roc (rounded to 4 decimal places);
- Compute accuracy and store it in accuracy (rounded to 4 decimal places).
Print both metrics.
Solution