Should we group a large number of categories into broader groups before predicting?

data_wrangling
analytics

#1

Hello,

I am working on a problem that requires me to predict into a large number of categories (~50). Would it be good practice to group the categories into broader ones, so that I first predict into only 5 or 6 groups, or should I predict directly into all of the available categories?
Which approach is better?

Thanks!


#2

Hi Aditya,

If you have enough data points in every category, then go for the granular approach; if not, then try to club the sparse categories into super categories.
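A rough sketch of how that check could look in Python, in case it helps. The threshold of 30 samples, the toy labels, and the `super_category_of` mapping are all made up for illustration; what counts as "enough" depends on your data and model.

```python
from collections import Counter

def club_sparse_categories(labels, super_category_of, min_count=30):
    """Keep well-populated categories; map sparse ones to a super category.

    min_count and super_category_of are illustrative assumptions,
    not values taken from the original problem.
    """
    counts = Counter(labels)
    return [
        lab if counts[lab] >= min_count else super_category_of.get(lab, "other")
        for lab in labels
    ]

# Toy data: two well-populated categories and two sparse ones.
labels = ["a"] * 50 + ["b"] * 40 + ["c"] * 3 + ["d"] * 2

# Hypothetical mapping from sparse categories to a broader group.
super_category_of = {"c": "cd_group", "d": "cd_group"}

grouped = club_sparse_categories(labels, super_category_of, min_count=30)
print(Counter(grouped))
```

Looking at the per-category counts first tells you whether grouping is needed at all: if every category clears the threshold, the labels come back unchanged and you can model all of them directly.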

Hope this helps.

Regards,
Aayush Agrawal