How to deal with categorical variables with many levels (e.g. post code), apart from techniques such as dummy coding and one-hot encoding?
There are several ways to deal with high-cardinality categorical variables:
- Delete rare categories from the data.
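- Convert the categories to frequencies, i.e. replace each category with the number of times it occurs (frequency encoding).
- Convert each category to its mean target response, i.e. the average value of the target for that category (mean or target encoding).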
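
The three options above can be sketched as follows. This is only a minimal illustration in plain Python: the `postcodes` and `target` lists are made-up toy data, and the `MIN_COUNT` threshold for what counts as "rare" is an assumption you would tune for your own data. With pandas, the same ideas are usually written with `value_counts()` and `groupby(...).mean()`.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one high-cardinality feature and a binary target.
postcodes = ["A1", "A1", "A1", "B2", "B2", "C3", "D4"]
target = [1, 0, 1, 0, 0, 1, 1]

MIN_COUNT = 2  # assumed threshold: categories seen fewer times are "rare"

# Count how often each category occurs.
counts = Counter(postcodes)

# 1) Rare-category removal: keep only rows whose category is frequent enough.
kept_rows = [(c, y) for c, y in zip(postcodes, target) if counts[c] >= MIN_COUNT]

# 2) Frequency encoding: replace each category with its count.
freq_encoded = [counts[c] for c in postcodes]

# 3) Mean (target) encoding: replace each category with the mean target value.
target_sums = defaultdict(float)
for c, y in zip(postcodes, target):
    target_sums[c] += y
category_mean = {c: target_sums[c] / counts[c] for c in counts}
mean_encoded = [category_mean[c] for c in postcodes]

print(kept_rows)
print(freq_encoded)
print(mean_encoded)
```

Note that the mean encoding here is computed on the full data for simplicity. In a real model you would normally compute the category means on the training data only (or within cross-validation folds), otherwise the encoded feature leaks the target.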