I ran into this issue while creating dummy variables for categorical columns. Let's say the train set has 2 categorical columns (A and B).
‘A’ has 3 distinct categories: A1, A2, A3.
‘B’ has 2 distinct categories: B1, B2.
I dummified them and got 5 binary columns in the train dataset (3 for A and 2 for B).
The test data has the same columns, but with a different number of categories. Let's say
‘A’ has A1, A2, A3, A4 as categories.
‘B’ has only B1 as a category.
After dummifying, the test dataframe will have a different set of columns than the train dataframe.
So how do I predict on the test dataset if the columns differ after the category treatment (dummifying)?
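
A minimal reproduction of the mismatch, assuming pandas and `pd.get_dummies` are used for the dummifying step (the frames and values below are made up to match the description above, and the names `train`, `test`, `only_in_train`, and `only_in_test` are just for illustration):

```python
import pandas as pd

# Hypothetical train set: A has A1..A3, B has B1 and B2.
train = pd.DataFrame({
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B1"],
})

# Hypothetical test set: A gains a new category A4, B only contains B1.
test = pd.DataFrame({
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B1", "B1", "B1"],
})

# Dummify each frame independently, as described in the question.
train_dummies = pd.get_dummies(train)
test_dummies = pd.get_dummies(test)

print(list(train_dummies.columns))  # ['A_A1', 'A_A2', 'A_A3', 'B_B1', 'B_B2']
print(list(test_dummies.columns))   # ['A_A1', 'A_A2', 'A_A3', 'A_A4', 'B_B1']

# Columns that exist on one side only.
only_in_train = sorted(set(train_dummies.columns) - set(test_dummies.columns))
only_in_test = sorted(set(test_dummies.columns) - set(train_dummies.columns))
print(only_in_train)  # ['B_B2']
print(only_in_test)   # ['A_A4']
```

So a model fitted on the train columns expects `B_B2`, which the test frame lacks, and the test frame carries `A_A4`, which the model has never seen.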