How to train a model when the train and test sets contain different levels for a variable

data_wrangling

#1

Hello,

While working on a classification problem, I came across a curious case:
A column has some levels in the train data, but the test data has some extra levels for that column, and this causes an error.

How to deal with these cases?
The test data doesn’t contain the response variable, so I can’t simply combine the two datasets and train the model on the result.
Can someone please help me with this?


#2

Pagal,

You can add the response variable to the test dataset and set it to NA, as shown below:

      test$response <- NA

This will allow you to combine the two datasets.
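
To make the idea concrete, here is a minimal sketch of the combine-and-split approach on toy data. The column names (`city`, `response`, `is_train`) and the values are hypothetical, chosen only for illustration. It relies on the fact that `rbind()` on data frames expands factor levels as necessary, so after splitting again both pieces share the same level set.

```r
# Toy data; column names and values are hypothetical.
train <- data.frame(
  city     = factor(c("A", "B", "A", "B")),
  response = c(1, 0, 1, 0)
)
test <- data.frame(
  city = factor(c("A", "C"))  # level "C" never appears in train
)

# Add the missing response column so both frames have the same columns.
test$response <- NA

# Mark where each row came from so the frames can be split again later.
train$is_train <- TRUE
test$is_train  <- FALSE

# rbind() merges the factor levels of `city` across the two frames.
combined <- rbind(train, test)

# Split back; row subsetting keeps the full level set on each piece.
train2 <- combined[combined$is_train, ]
test2  <- combined[!combined$is_train, ]

levels(train2$city)
levels(test2$city)
```

Both `levels()` calls now return the same set of levels. Note that this only aligns the factor definitions: the model still sees no training rows for the extra levels, so predictions for those test rows should be treated with caution.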