Lesson 7 of 15
K-Nearest Neighbours (k-NN)
K-Nearest Neighbours is one of the simplest machine learning algorithms. To classify a new point, it:
- Computes the distance from the query point to every training point
- Selects the k closest training points
- Returns the most common label among those neighbours
Algorithm
for each training point:
    compute Euclidean distance to query
sort training points by distance
take the k smallest
return the majority label among them
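The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference solution: the name knn_sketch is hypothetical, inputs are assumed to be plain lists of numeric feature lists, and ties are resolved toward the smaller label as this lesson's tie-breaking rule requires.

```python
import math
from collections import Counter

def knn_sketch(X_train, y_train, x_query, k):
    """Hypothetical helper: classify x_query by majority vote of its k nearest points."""
    # Step 1: Euclidean distance from the query to every training point.
    distances = []
    for x, label in zip(X_train, y_train):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_query)))
        distances.append((d, label))

    # Step 2: sort by distance only (sorted() is stable, so equal distances keep input order).
    distances.sort(key=lambda pair: pair[0])

    # Step 3: keep the labels of the k closest points.
    nearest_labels = [label for _, label in distances[:k]]

    # Step 4: majority vote; among equally common labels, return the smallest.
    counts = Counter(nearest_labels)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)
```

Sorting all n distances costs O(n log n) per query. For large training sets a partial selection such as heapq.nsmallest would avoid the full sort, but the plain sort mirrors the pseudocode most directly.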
Tie-breaking
When two labels are equally common among the neighbours, return the smaller label value.
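One way to apply this rule is to count the labels, find the highest count, and take the minimum over all labels that reach it. The sketch below uses a hypothetical helper named majority_label and contrasts it with Counter.most_common, which orders tied labels by first appearance rather than by value.

```python
from collections import Counter

def majority_label(labels):
    """Hypothetical helper: most common label, smallest label wins a tie."""
    counts = Counter(labels)
    best = max(counts.values())
    # Collect every label that reaches the top count, then pick the smallest.
    return min(label for label, count in counts.items() if count == best)

neighbour_labels = [2, 1, 1, 2]  # labels 1 and 2 both appear twice

print(majority_label(neighbour_labels))                # 1
print(Counter(neighbour_labels).most_common(1)[0][0])  # 2 (first seen among the tied labels)
```

Relying on most_common(1) alone would therefore return a label that depends on neighbour order, not on label value.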
Properties
- Non-parametric: assumes no fixed functional form for the data; the training set itself acts as the model
- Lazy learner: no training phase; stores the entire training set and does all computation at prediction time
- k is a hyperparameter: small k = low bias, high variance; large k = high bias, low variance
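The effect of k can be seen on a toy example. The sketch below uses made-up one-dimensional data with one deliberately noisy label (the point at 3) and a hypothetical helper named predict. It is only an illustration of the trade-off, not a tuning procedure.

```python
import math
from collections import Counter

def predict(X_train, y_train, x_query, k):
    """Hypothetical helper: k-NN vote with ties broken toward the smaller label."""
    # Indices of training points ordered by Euclidean distance to the query.
    order = sorted(
        range(len(X_train)),
        key=lambda i: math.sqrt(sum((a - b) ** 2 for a, b in zip(X_train[i], x_query))),
    )
    counts = Counter(y_train[i] for i in order[:k])
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)

# Made-up data: points 1..4 mostly belong to class 0, points 10..12 to class 1.
X_train = [[1], [2], [3], [4], [10], [11], [12]]
y_train = [0, 0, 1, 0, 1, 1, 1]  # the label at x = 3 is noise
query = [3.1]

for k in (1, 3, 7):
    print(k, predict(X_train, y_train, query, k))
# 1 1  (follows the single noisy neighbour: high variance)
# 3 0  (local majority outvotes the noise)
# 7 1  (every point votes, so the global majority wins: high bias)
```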
Your Task
Implement knn_classify(X_train, y_train, x_query, k) that returns the most common label among the k nearest neighbours of x_query. Use Euclidean distance. Break ties by returning the lowest label value.