Should we group a large number of categories into broader groups before predicting?




I am working on a problem that requires me to make predictions into a large number of categories (~50). Would it be good practice to group the categories into broader ones, so that I first have to predict into only 5 or 6 categories, or should I predict directly into the full set of categories?
Which approach is better?



Hi Aditya,

If you have enough data points in every category, then go for the granular approach. If not, try to club the sparse categories into super categories.
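One way to check this is to count the examples per category and merge only the sparse ones. Here is a minimal sketch in plain Python. The threshold `min_count`, the toy labels, and the `supergroup_of` mapping are all assumptions for illustration, so pick values that suit your own data.

```python
from collections import Counter


def find_sparse_categories(labels, min_count):
    """Return the set of categories with fewer than min_count examples."""
    counts = Counter(labels)
    return {cat for cat, n in counts.items() if n < min_count}


def group_sparse(labels, min_count, supergroup_of):
    """Replace sparse categories with their super category; keep the rest as-is.

    Sparse categories missing from supergroup_of fall back to "other".
    """
    sparse = find_sparse_categories(labels, min_count)
    return [supergroup_of.get(y, "other") if y in sparse else y for y in labels]


# Toy data (made up): "cat" and "dog" are well represented, the others are not.
labels = ["cat"] * 40 + ["dog"] * 35 + ["lynx"] * 6 + ["wolf"] * 5

# Hypothetical mapping from sparse category to super category.
supergroup_of = {"lynx": "wild", "wolf": "wild"}

grouped = group_sparse(labels, min_count=10, supergroup_of=supergroup_of)
print(Counter(grouped))
```

With this toy data, "lynx" and "wolf" fall below the threshold and are merged into a single "wild" group, while "cat" and "dog" stay granular.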

Hope this helps.

Aayush Agrawal