What is k-NN
Let's start our classification adventure with the simplest task β binary classification. Suppose we want to classify sweets as cookies/not cookies based on a single feature: their weight.
A simple way to predict the class of a new instance is to look at its closest neighbor. In our example, we must find a sweet that weighs most similarly to the new instance.
That is the idea behind k-Nearest Neighbors(k-NN) - we just look at the neighbors. The k-NN algorithm assumes that similar things exist in close proximity. In other words, similar things are near each other. k in the k-NN stands for the number of neighbors we consider when making a prediction.
In the example above, we considered only 1 neighbor, so it was 1-Nearest Neighbor. But usually, k is set to a bigger number, as looking only at one neighbor can be unreliable:
If k (number of neighbors) is greater than one, we choose the most frequent class in the neighborhood as a prediction. Here is an example of predicting two new instances with k=3:
As you can see, changing the k may cause different predictions.
Thanks for your feedback!
Ask AI
Ask AI
Ask anything or try one of the suggested questions to begin our chat
Ask me questions about this topic
Summarize this chapter
Show real-world examples
Awesome!
Completion rate improved to 4.17
What is k-NN
Swipe to show menu
Let's start our classification adventure with the simplest task β binary classification. Suppose we want to classify sweets as cookies/not cookies based on a single feature: their weight.
A simple way to predict the class of a new instance is to look at its closest neighbor. In our example, we must find a sweet that weighs most similarly to the new instance.
That is the idea behind k-Nearest Neighbors(k-NN) - we just look at the neighbors. The k-NN algorithm assumes that similar things exist in close proximity. In other words, similar things are near each other. k in the k-NN stands for the number of neighbors we consider when making a prediction.
In the example above, we considered only 1 neighbor, so it was 1-Nearest Neighbor. But usually, k is set to a bigger number, as looking only at one neighbor can be unreliable:
If k (number of neighbors) is greater than one, we choose the most frequent class in the neighborhood as a prediction. Here is an example of predicting two new instances with k=3:
As you can see, changing the k may cause different predictions.
Thanks for your feedback!