Classification with Python
Challenge: Comparing Models
Now you'll compare the models we've covered using a single dataset, the breast cancer dataset. The target variable is the 'diagnosis' column, where 1 represents malignant and 0 represents benign cases.
You will apply GridSearchCV to each model to find the best parameters. In this task, you'll use recall as the scoring metric because minimizing false negatives is crucial: a false negative here is a malignant case predicted as benign. To have GridSearchCV select the best parameters based on recall, set scoring='recall'.
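As a small illustration of why recall tracks false negatives, the sketch below computes it on a handful of made-up labels (the values in y_true and y_pred are hypothetical, not taken from the course dataset). Recall is TP / (TP + FN), and scikit-learn's 'recall' scorer treats label 1 as the positive class by default, which matches the malignant encoding described above.

```python
from sklearn.metrics import recall_score

# Hypothetical labels for illustration only: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]

# Count true positives and false negatives by hand.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
manual_recall = tp / (tp + fn)

# recall_score uses pos_label=1 by default for binary targets.
recall = recall_score(y_true, y_pred)

print(manual_recall, recall)
```

The false positive in the last position does not affect recall at all; only the two missed malignant cases lower it.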
You are given a breast cancer dataset stored as a DataFrame in the df variable.
- Create a dictionary for GridSearchCV to iterate through the values [3, 5, 7, 12] for n_neighbors, and store it in the knn_params variable.
- Create a dictionary for GridSearchCV to iterate through the values [0.1, 1, 10] for C, and store it in the lr_params variable.
- Create a dictionary for GridSearchCV to iterate through the values [2, 4, 6, 10] for max_depth and [1, 2, 4, 7] for min_samples_leaf, and store it in the dt_params variable.
- Create a dictionary for GridSearchCV to iterate through the values [2, 4, 6] for max_depth and [20, 50, 100] for n_estimators, and store it in the rf_params variable.
- Initialize and train a GridSearchCV object for each of the models, and store the trained models in the respective variables: knn_grid, lr_grid, dt_grid, and rf_grid.
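The steps above could be sketched roughly as follows. This is one possible approach under several assumptions, not the course's reference solution. The course's df is not available here, so scikit-learn's built-in load_breast_cancer data stands in for it; that dataset encodes 0 as malignant and 1 as benign, so the labels are flipped to match the encoding described above. Scaling the features up front, max_iter=1000, random_state=42, and the default 5-fold cross-validation are also choices made for this sketch.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the course's df (assumption): sklearn's copy of the dataset,
# with labels flipped so that 1 = malignant and 0 = benign.
df = load_breast_cancer(as_frame=True).frame.rename(columns={"target": "diagnosis"})
df["diagnosis"] = 1 - df["diagnosis"]

# Scaling up front keeps the sketch short (assumption); a Pipeline would
# avoid sharing fold statistics between training and validation splits.
X = StandardScaler().fit_transform(df.drop(columns="diagnosis"))
y = df["diagnosis"]

# Parameter grids taken from the task description.
knn_params = {"n_neighbors": [3, 5, 7, 12]}
lr_params = {"C": [0.1, 1, 10]}
dt_params = {"max_depth": [2, 4, 6, 10], "min_samples_leaf": [1, 2, 4, 7]}
rf_params = {"max_depth": [2, 4, 6], "n_estimators": [20, 50, 100]}

# One grid search per model, each selecting parameters by recall.
knn_grid = GridSearchCV(KNeighborsClassifier(), knn_params, scoring="recall").fit(X, y)
lr_grid = GridSearchCV(LogisticRegression(max_iter=1000), lr_params, scoring="recall").fit(X, y)
dt_grid = GridSearchCV(DecisionTreeClassifier(random_state=42), dt_params, scoring="recall").fit(X, y)
rf_grid = GridSearchCV(RandomForestClassifier(random_state=42), rf_params, scoring="recall").fit(X, y)

# Compare the best cross-validated recall of each model.
for name, grid in [("kNN", knn_grid), ("LogReg", lr_grid), ("Tree", dt_grid), ("Forest", rf_grid)]:
    print(f"{name}: best params {grid.best_params_}, mean CV recall {grid.best_score_:.3f}")
```

After fitting, best_params_ holds the winning combination for each model and best_score_ holds its mean cross-validated recall, so the four searches can be compared directly.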