Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lære Challenge: Choosing the Best K Value | k-NN Classifier
Classification with Python

Stryg for at vise menuen

book
Challenge: Choosing the Best K Value

As shown in the previous chapters, the model's predictions can vary depending on the value of k (the number of neighbors). When building a k-NN model, it's important to choose the k value that gives the best performance.

A common approach is to use cross-validation to evaluate model performance. You can run a loop and calculate cross-validation scores for a range of k values, then select the one with the highest score. This is the most widely used method.

To perform this, sklearn offers a convenient tool: the GridSearchCV class.

The param_grid parameter takes a dictionary where the keys are parameter names and the values are lists of options to try. For example, to test values from 1 to 99 for n_neighbors, you can write:

python

Calling the .fit(X, y) method on the GridSearchCV object will search through the parameter grid to find the best parameters and then re-train the model on the entire dataset using those best parameters.

You can access the best score using the .best_score_ attribute and make predictions with the optimized model using the .predict() method. Similarly, you can retrieve the best model itself using the .best_estimator_ attribute.

Opgave

Swipe to start coding

You are given the Star Wars ratings dataset stored as a DataFrame in the df variable.

  • Initialize param_grid as a dictionary containing the n_neighbors parameter with the values [3, 9, 18, 27].
  • Create a GridSearchCV object using param_grid with 4-fold cross-validation, train it, and store it in the grid_search variable.
  • Retrieve the best model from grid_search and store it in the best_model variable.
  • Retrieve the score of the best model and store it in the best_score variable.

Løsning

Switch to desktopSkift til skrivebord for at øve i den virkelige verdenFortsæt der, hvor du er, med en af nedenstående muligheder
Var alt klart?

Hvordan kan vi forbedre det?

Tak for dine kommentarer!

Sektion 1. Kapitel 7
single

single

Spørg AI

expand

Spørg AI

ChatGPT

Spørg om hvad som helst eller prøv et af de foreslåede spørgsmål for at starte vores chat

close

Awesome!

Completion rate improved to 4.17

book
Challenge: Choosing the Best K Value

As shown in the previous chapters, the model's predictions can vary depending on the value of k (the number of neighbors). When building a k-NN model, it's important to choose the k value that gives the best performance.

A common approach is to use cross-validation to evaluate model performance. You can run a loop and calculate cross-validation scores for a range of k values, then select the one with the highest score. This is the most widely used method.

To perform this, sklearn offers a convenient tool: the GridSearchCV class.

The param_grid parameter takes a dictionary where the keys are parameter names and the values are lists of options to try. For example, to test values from 1 to 99 for n_neighbors, you can write:

python

Calling the .fit(X, y) method on the GridSearchCV object will search through the parameter grid to find the best parameters and then re-train the model on the entire dataset using those best parameters.

You can access the best score using the .best_score_ attribute and make predictions with the optimized model using the .predict() method. Similarly, you can retrieve the best model itself using the .best_estimator_ attribute.

Opgave

Swipe to start coding

You are given the Star Wars ratings dataset stored as a DataFrame in the df variable.

  • Initialize param_grid as a dictionary containing the n_neighbors parameter with the values [3, 9, 18, 27].
  • Create a GridSearchCV object using param_grid with 4-fold cross-validation, train it, and store it in the grid_search variable.
  • Retrieve the best model from grid_search and store it in the best_model variable.
  • Retrieve the score of the best model and store it in the best_score variable.

Løsning

Switch to desktopSkift til skrivebord for at øve i den virkelige verdenFortsæt der, hvor du er, med en af nedenstående muligheder
Var alt klart?

Hvordan kan vi forbedre det?

Tak for dine kommentarer!

close

Awesome!

Completion rate improved to 4.17

Stryg for at vise menuen

some-alt