I have always wondered about the following question, though it might seem trivial.
In every machine learning project we do the following:
- We either have separate train and test data, or we split a single file into train and test sets. In both cases we apply cleaning > EDA > preprocessing > feature engineering, etc. to both train and test, then build the model on the train data and evaluate it on the test data.
My question: we have applied all of the EDA and feature engineering to the test data, but once deployed, the model has to work on real-time raw data that has not been preprocessed.
So how do we check the efficiency of our model in that setting, and how should we assess it?
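
To make the workflow concrete, here is a minimal sketch of what I mean, in plain Python with no libraries. Everything in it is made up for illustration: the toy data, the helper names (`fit_preprocessor`, `preprocess`, `fit_centroids`, `predict`), and the nearest-centroid model standing in for a real one. It assumes the preprocessing parameters are learned from the train data only and then reused unchanged on the test data and on a raw incoming row.

```python
import random
import statistics

def fit_preprocessor(rows):
    """Learn per-column mean and std from the training rows only (None = missing)."""
    means, stds = [], []
    for col in zip(*rows):
        mean = statistics.fmean(v for v in col if v is not None)
        filled = [mean if v is None else v for v in col]
        means.append(mean)
        stds.append(statistics.pstdev(filled) or 1.0)  # guard against zero spread
    return means, stds

def preprocess(rows, params):
    """Impute missing values with the learned mean, then standardize."""
    means, stds = params
    return [[((m if v is None else v) - m) / s
             for v, m, s in zip(row, means, stds)] for row in rows]

def fit_centroids(X, y):
    """Toy model: one centroid per class."""
    centroids = {}
    for label in set(y):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*members)]
    return centroids

def predict(X, centroids):
    """Assign each row to the class with the nearest centroid."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist(x, centroids[lab])) for x in X]

def make_row(rng, label):
    """Hypothetical raw record: two numeric features, occasionally one missing."""
    row = [rng.gauss(5.0 * label, 1.0), rng.gauss(50.0 * label, 10.0)]
    if rng.random() < 0.1:
        row[rng.randrange(2)] = None
    return row

rng = random.Random(0)
labels = [i % 2 for i in range(200)]
rows = [make_row(rng, lab) for lab in labels]

# Split one "file" into train and test (classes are already interleaved).
train_rows, test_rows = rows[:160], rows[160:]
train_y, test_y = labels[:160], labels[160:]

# Preprocessing parameters come from train; the same transform is applied to test.
params = fit_preprocessor(train_rows)
centroids = fit_centroids(preprocess(train_rows, params), train_y)

test_preds = predict(preprocess(test_rows, params), centroids)
accuracy = sum(p == t for p, t in zip(test_preds, test_y)) / len(test_y)
print("test accuracy:", accuracy)

# "Deployment": a raw, unpreprocessed row goes through the same transform first.
raw_row = [4.8, None]
raw_pred = predict(preprocess([raw_row], params), centroids)[0]
print("prediction for raw row:", raw_pred)
```

In this sketch the test score is only meaningful for deployment if the raw row passes through exactly the same `preprocess` step with the same `params`; whether that is how it should be done in practice is the part I am unsure about.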