Since this is a practice problem, I'm interested to know what features other people are thinking of.
Here are the features I have already tried in addition to the existing columns. Together they give me an RMSE of about 0.29:
- Category Counts at user level
- Product Counts at user level
- Number of categories a product belongs to (1, 2, or 3)
- Total counts at category level
- Percent of a category's products bought by the user (products at user-category level over total products in the category)
- Percent of the user's products that fall in a category (products at user-category level over all products at user level)
- Products purchased at different occurrences
- Rank order of the IDs within each category based on their Purchase amount
- Fluctuation of the Purchase amount for an ID within each category (range, etc.)
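
For anyone who wants to try a few of these, here is a rough pandas sketch of how some of the features above could be built. The column names (`User_ID`, `Product_ID`, `Category`, `Purchase`), the toy rows, and the `add_features` helper are all assumptions for illustration, not the actual dataset schema or my exact code.

```python
import pandas as pd

# Toy data; column names and values are assumed for illustration only.
df = pd.DataFrame({
    "User_ID":    [1, 1, 1, 2, 2, 3],
    "Product_ID": ["P1", "P2", "P3", "P1", "P3", "P2"],
    "Category":   ["A", "A", "B", "A", "B", "A"],
    "Purchase":   [100, 150, 80, 120, 60, 200],
})

def add_features(df):
    """Hypothetical helper: append a few group-level features as new columns."""
    out = df.copy()

    # Product and category counts at user level (distinct values per user).
    by_user = out.groupby("User_ID")
    out["user_product_count"] = by_user["Product_ID"].transform("nunique")
    out["user_category_count"] = by_user["Category"].transform("nunique")

    # Total row count at category level.
    out["category_count"] = out.groupby("Category")["Product_ID"].transform("count")

    # Distinct products per (user, category) pair, and per category overall.
    user_cat_products = out.groupby(["User_ID", "Category"])["Product_ID"].transform("nunique")
    cat_products = out.groupby("Category")["Product_ID"].transform("nunique")

    # Share of the category's products bought by this user.
    out["pct_of_category_products"] = user_cat_products / cat_products
    # Share of this user's products that fall in this category.
    out["pct_of_user_products"] = user_cat_products / out["user_product_count"]

    # Rank of each row within its category by Purchase (1 = highest).
    # If Purchase is the prediction target, compute Purchase-based features
    # on training rows only to avoid leakage.
    out["purchase_rank_in_category"] = (
        out.groupby("Category")["Purchase"].rank(ascending=False, method="dense")
    )

    # Range of Purchase for a user within a category.
    uc_purchase = out.groupby(["User_ID", "Category"])["Purchase"]
    out["purchase_range"] = uc_purchase.transform("max") - uc_purchase.transform("min")

    return out

features = add_features(df)
print(features)
```

Using `transform` keeps every feature aligned with the original rows, so the new columns can go straight into the model without a separate merge step.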
What other ideas have you guys tried that resulted in a lower RMSE?