The training and test data have the format below, with no outcome variable:
user_id: A hash that uniquely identifies the user.
activity_date: The date of the activity.
activity_type: The type of activity, such as click-through, purchase, email open, or form submit.
I am trying to build a model that predicts which user_ids will make a purchase in the future,
and to score the test data from most likely to least likely to purchase.
data looks something like this,
(1) Describe which activity types are most useful in predicting
which users will purchase in the future.
(2) Get the 1000 user_ids that are most likely to convert.
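Since there is no outcome column, one framing I have been considering (an assumption on my part, not something I have validated) is to pick a cutoff date, count each user's activities by type before that date, and label a user 1 if they have a purchase on or after it. Below is a minimal sketch of that idea. The rows, the cutoff, and the name `build_training_table` are all made up for illustration and are not from the real data.

```python
from collections import Counter, defaultdict
from datetime import date

# Made-up rows for illustration only: (user_id, activity_date, activity_type)
events = [
    ("u1", date(2020, 1, 5), "email_open"),
    ("u1", date(2020, 1, 9), "click_through"),
    ("u1", date(2020, 2, 3), "purchase"),
    ("u2", date(2020, 1, 7), "email_open"),
    ("u2", date(2020, 2, 10), "form_submit"),
    ("u3", date(2020, 1, 20), "click_through"),
]


def build_training_table(events, cutoff):
    """Hypothetical helper: derive features and a label from the raw log.

    Features: per-user counts of each activity type before the cutoff.
    Label: 1 if the user has a purchase on or after the cutoff, else 0.
    """
    features = defaultdict(Counter)
    purchasers = set()
    for user_id, activity_date, activity_type in events:
        if activity_date < cutoff:
            features[user_id][activity_type] += 1
        elif activity_type == "purchase":
            purchasers.add(user_id)
    # Only users with some activity before the cutoff get a row.
    return {u: (dict(c), int(u in purchasers)) for u, c in features.items()}


# The cutoff date is an arbitrary choice for this sketch.
table = build_training_table(events, cutoff=date(2020, 2, 1))
for user_id, (counts, label) in sorted(table.items()):
    print(user_id, counts, label)
```

If this framing is reasonable, the derived label would be binary, and the per-user scores from whatever model is fit on these counts could be sorted to pick the top 1000.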
I am confused about whether this is a classification problem, a regression problem, or both.
Any thoughts or input on how I can get started?