How to clean the test set?



If we have performed data cleaning on the training set and want to apply the same cleaning to the test set before running our trained classifier:

Do we have to clean it using the same cleaning functions we used on the training set, or is there a more direct method in pandas/sklearn?

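It is generally better to join both datasets and pre-process them together, since pre-processing them separately may lead to inconsistencies (for example, different columns after encoding a categorical feature). For convenience, you can add an extra column that records which dataset each observation came from, so the two can be split apart again afterwards.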
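As a rough sketch of this approach in pandas: the toy data, the column names (`age`, `city`, `label`, `dataset`) and the `clean` helper below are all made up for illustration, so adapt them to your own cleaning steps.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
train = pd.DataFrame({
    "age": [25, 32, 40],
    "city": [" Paris", "london", "Paris "],
    "label": [1, 0, 1],
})
test = pd.DataFrame({
    "age": [31, 28],
    "city": ["LONDON ", "Berlin"],
})

def clean(df):
    """Hypothetical cleaning function: normalise text, then one-hot encode."""
    out = df.copy()
    out["city"] = out["city"].str.strip().str.lower()
    return pd.get_dummies(out, columns=["city"])

# Keep the target aside so only feature columns are combined.
y_train = train["label"]
train_features = train.drop(columns="label")

# Tag every row with its origin, then stack the two sets.
combined = pd.concat(
    [train_features.assign(dataset="train"), test.assign(dataset="test")],
    ignore_index=True,
)

cleaned = clean(combined)

# Split back into the two sets using the marker column.
is_train = cleaned["dataset"] == "train"
train_clean = cleaned[is_train].drop(columns="dataset").reset_index(drop=True)
test_clean = cleaned[~is_train].drop(columns="dataset").reset_index(drop=True)
```

Because the encoding is done once on the combined frame, `train_clean` and `test_clean` end up with identical columns, even though "berlin" appears only in the test rows. One caveat: a step that learns values from the data (such as a median for imputation or a scaler) is usually better fit on the training rows only and then applied to the test rows, so that test information does not influence the trained model.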

