Why is it necessary to normalize in KNN?

knn

#1

I am currently studying the KNN algorithm and I want to know why it is necessary to normalize all the variables in KNN.


#2

@sid100158,

KNN classifies a sample by measuring its distance to other samples, and those distances depend on the measurement units and scale of each feature. For example, let's say we are applying KNN to a data set with 3 features: the first ranges from 1 to 10, the second from 1 to 20, and the last from 1 to 1000. In this case the nearest neighbours will be decided almost entirely by the last feature, because differences within 1-10 and 1-20 are tiny compared with differences within 1-1000. To avoid the misclassification this can cause, we should normalize the feature variables so that each one contributes on a comparable scale.
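To make this concrete, here is a minimal sketch in plain Python. The sample points and the helper names `euclidean` and `min_max_scale` are made up for illustration, and min-max scaling is just one common way to normalize. The feature ranges are the ones from the example above.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max_scale(rows, lows, highs):
    """Rescale every feature to [0, 1] using its known min and max."""
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical samples; features range over 1-10, 1-20 and 1-1000.
query = [2.0, 3.0, 500.0]
a = [9.0, 18.0, 510.0]  # far from query on features 1 and 2, close on feature 3
b = [2.0, 4.0, 560.0]   # close on features 1 and 2, farther on feature 3

# Raw distances: feature 3 dominates, so `a` looks like the nearer neighbour.
raw_a = euclidean(query, a)
raw_b = euclidean(query, b)
print(raw_a < raw_b)

# Assumed feature bounds, taken from the ranges in the example.
lows = [1.0, 1.0, 1.0]
highs = [10.0, 20.0, 1000.0]
q_s, a_s, b_s = min_max_scale([query, a, b], lows, highs)

# Scaled distances: all features count equally, so `b` becomes the nearer one.
scaled_a = euclidean(q_s, a_s)
scaled_b = euclidean(q_s, b_s)
print(scaled_b < scaled_a)
```

Before scaling, `a` comes out as the nearer neighbour only because it is close on the 1-1000 feature. After scaling, `b` is nearer because it is close on two of the three features.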

For any algorithm where distance plays a vital role in prediction or classification, we should normalize the variables. We do the same before PCA, which is also sensitive to the scale of the variables.

Hope this helps!

Regards,
Imran