In the KNN algorithm, several distance metrics can be used. Euclidean distance is the default and the most common choice. So my question is: what is the advantage of using Manhattan distance over Euclidean distance, and in which scenarios is Manhattan preferable?

# When Manhattan distance is preferred over Euclidean distance

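Manhattan distance (L1 norm) is often preferred over Euclidean distance (L2 norm) for high-dimensional data. Because it sums absolute differences instead of squaring them, a large difference in a single feature dominates the result less, so Manhattan tends to give more robust results.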

Sourav

Hi @purnima82

Euclidean distance measures the straight-line distance between two points, in any direction. (In KNN with k=5, it is used to find the 5 nearest neighbours.) Most real-life data can vary in any direction, which is why Euclidean distance is the one mostly used.

Manhattan distance measures distance only along the horizontal and vertical axes. (A real-life example is a chessboard, where you move horizontally or vertically but not in an arbitrary direction.)

Consider two data points (x1,y1) and (x2,y2).

Then Euclidean distance = sqrt(sqr(x2-x1)+sqr(y2-y1))

Manhattan distance = |x2-x1| + |y2-y1|
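
As a rough illustration of the two formulas, here is a minimal Python sketch. The function names `euclidean` and `manhattan` and the sample points are my own choices for the example, not part of any library. Note that the Manhattan version takes the absolute value of each difference, so the result does not depend on the order of the two points.

```python
import math

def euclidean(p, q):
    """Straight-line (L2) distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Axis-aligned (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical sample points, chosen only for illustration.
p1, p2 = (1, 2), (4, 6)

print(euclidean(p1, p2))  # 5.0  (sqrt(3^2 + 4^2))
print(manhattan(p1, p2))  # 7    (3 + 4)
```

Both functions work for any number of dimensions, as long as the two points have the same length.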

Also, if the dataset has discrete or binary attributes, Manhattan distance is preferred over Euclidean distance.
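
One way to see why Manhattan suits binary attributes: for 0/1 vectors, the Manhattan distance is simply the number of attributes on which two rows disagree. The small sketch below checks this on two made-up rows (the vectors `a` and `b` are hypothetical, and `manhattan` is a helper defined here, not a library function).

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(p, q))

# Two hypothetical rows with binary (0/1) attributes.
a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]

# Count the positions where the two rows differ.
mismatches = sum(x != y for x, y in zip(a, b))

print(manhattan(a, b), mismatches)  # 2 2
```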

Where it says:

“Then Euclidean distance = sqrt(sqr(x2-x1)+sqr(y2-y1))”

Consider this instead:

“Then Euclidean distance = sqrt((x2-x1)^2+(y2-y1)^2)”