Data Scaling in the kNN Algorithm

Dear Aishwarya,

I am learning how to apply kNN in Python by following your post, "A Practical Introduction to K-Nearest Neighbors Algorithm for Regression (with Python code)".

In step 5, "Preprocessing – Scaling the features", the code is:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
x_train_scaled = scaler.fit_transform(x_train)
x_train = pd.DataFrame(x_train_scaled)

x_test_scaled = scaler.fit_transform(x_test)
x_test = pd.DataFrame(x_test_scaled)

An answer on Stack Overflow scales the data differently:

std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
X_test_std = std_scale.transform(X_test)

As I understand it, the author of that answer first fits the scaler on the training set only, then uses that same fitted scaler to transform both the training and the test data.

In your approach, by contrast, fit_transform is called separately on the training data and on the test data, so the scaler is re-fitted on the test set. Could you please clarify the difference between your approach and theirs?
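To make the difference I am asking about concrete, here is a minimal sketch. The tiny arrays are made up by me for illustration and come from neither post, and I use MinMaxScaler for both approaches so that only the fit/transform pattern differs.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data for illustration only (an assumption, not from either post).
x_train = np.array([[0.0], [5.0], [10.0]])
x_test = np.array([[2.0], [4.0]])

# Pattern from the Stack Overflow answer: fit on the training set only,
# then reuse the learned min/max to transform the test set.
scaler_a = MinMaxScaler(feature_range=(0, 1)).fit(x_train)
test_a = scaler_a.transform(x_test)

# Pattern from the article: fit_transform on the test set,
# which re-learns min/max from the test data itself.
scaler_b = MinMaxScaler(feature_range=(0, 1))
scaler_b.fit_transform(x_train)
test_b = scaler_b.fit_transform(x_test)

print(test_a.ravel())  # test values on the training scale: 0.2 and 0.4
print(test_b.ravel())  # test values on their own scale: 0.0 and 1.0
```

If I read this correctly, the same test point (2.0) ends up as 0.2 in the first pattern and 0.0 in the second, so in the second pattern the training and test features are no longer on the same scale when kNN computes distances.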

Many thanks,

© Copyright 2013-2019 Analytics Vidhya