Hi,

Scenario:

I have a categorical variable x1 with four levels (a, b, c and d). I ran a few tests and found that one of those levels (say d) does not contribute much towards predicting the target variable.

Question:

Is it possible to include only the first three categories of x1 (a, b and c) when predicting my target variable?

I read that linear regression uses reference coding, so removing one level from a predictor variable would change the coding of the other levels. Here’s a link to the article:

Is this also the case with tree-based methods?
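
To make the coding issue concrete, here is a minimal sketch of what I understand reference coding to look like. It assumes pandas, and the toy data and the names `full` and `reduced` are made up for illustration. With level a as the reference, dropping the dummy column for d leaves rows with level d coded as all zeros, so they look the same as rows with level a:

```python
import pandas as pd

# Hypothetical toy data: x1 has the four levels from the scenario above.
df = pd.DataFrame({"x1": ["a", "b", "c", "d", "a", "d"]})

# Reference (dummy) coding: drop_first=True drops the first level ("a"),
# which then acts as the reference category.
full = pd.get_dummies(df["x1"], prefix="x1", drop_first=True, dtype=int)

# Keep only the dummies for b and c. Rows with level d are now all zeros,
# exactly like rows with level a, so d is folded into the reference group.
reduced = full.drop(columns=["x1_d"])

print(full)
print(reduced)
```

If this is right, then "removing" d does not leave a, b and c untouched: it changes what the reference group means, which is the part I am unsure about.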

Thank you for the answer!!