I have the following questions:
- Imputation of the test dataset: some suggest replacing NA with -1. What is the logic behind this? Does it work for almost all data?
- Imbalanced classes: should categorical variables with highly imbalanced classes be ignored completely before training, on the basis of entropy (they add no new information)?
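
To make both points concrete, here is a minimal sketch of what I mean, using only the Python standard library. The toy data and the helper names `fill_missing` and `shannon_entropy` are made up for illustration; they are not from any particular library, and I am representing NA as `None`.

```python
import math
from collections import Counter


def fill_missing(values, sentinel=-1):
    """Replace missing entries (None) with a sentinel value such as -1."""
    return [sentinel if v is None else v for v in values]


def shannon_entropy(values):
    """Shannon entropy, in bits, of the category frequencies in a column."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy column with missing values (invented data).
age = [23, None, 41, None, 35]
age_filled = fill_missing(age)

# Two toy categorical columns: one balanced, one highly imbalanced.
balanced = ["a", "b"] * 50
imbalanced = ["a"] * 99 + ["b"]

print(age_filled)
print(shannon_entropy(balanced))
print(shannon_entropy(imbalanced))
```

The imbalanced column has an entropy close to zero, while the balanced one has an entropy of 1 bit. My second question is whether that low entropy alone is enough reason to drop such a column.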