How to deal with variables with too many levels?

r

#1

Hi
I want to predict the dress problem with logistic regression in R, but when I predict on the test data set I get an error saying that there is a new level in a particular variable.
How can I solve this?
Thanks


#2

@hossein_mortazavy - If there are too many levels in the test data, you can combine the less frequent levels into a single new level, as long as those levels are not very important for predicting the result of the model.
If there is a level in the test data that is not present in the training data, you can use this method to make the levels the same in both test and train.
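
As a rough sketch of both ideas in base R: rare levels are merged into one "Other" level using counts from the training data only, and any level that appears only in the test data is mapped to the same "Other" level so that both factors share identical levels. The toy data, the column names `style` and `recommended`, the threshold of 2, and the helper name `lump_levels` are all made up for illustration and are not from the original problem.

```r
# Hypothetical toy data; `style` and `recommended` are assumed column names.
train <- data.frame(
  style = factor(c("casual", "casual", "casual", "party", "party",
                   "vintage", "work")),
  recommended = c(1, 0, 1, 0, 1, 0, 1)
)
# "bohemian" never occurs in the training data.
test <- data.frame(style = factor(c("casual", "party", "bohemian", "work")))

# Hypothetical helper: keep only the levels listed in `keep` and map every
# other value (rare or unseen) to a single catch-all level.
lump_levels <- function(x, keep, other = "Other") {
  x <- as.character(x)
  x[!(x %in% keep)] <- other
  factor(x, levels = c(keep, other))
}

# Decide which levels to keep from the training data only.
counts <- table(train$style)
keep <- names(counts)[counts >= 2]  # threshold of 2 is an arbitrary choice

# Apply the same mapping to both sets so the factor levels match exactly.
train$style <- lump_levels(train$style, keep)
test$style  <- lump_levels(test$style, keep)

# Logistic regression; predict() no longer meets an unknown level.
fit  <- glm(recommended ~ style, data = train, family = binomial)
pred <- predict(fit, newdata = test, type = "response")
```

The choice of which levels to keep is made on the training set alone, so the test set does not influence the model. Whether lumping is acceptable depends on how informative the rare levels are, which is worth checking before merging them.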

Hope this helps!

Regards,
Hinduja


#3

Dear @hinduja1234
Sorry for the delay, and thanks for the response.

Do you think this is the best way? I think deleting some levels is the simplest approach, but the problem needs more sophisticated methods, like the ones used for unbalanced problems.
Does anyone have any other ideas?