Potential Customer for Targeted Marketing - Machine Learning Models



Hi community,

I have a question on a customer classification prediction problem (correct me if I am wrong in terms of the problem type). The problem is as follows:

  • The marketing team wants to optimise the campaign so that it targets only potential customers (to save cost and time)
  • Dataset contains all customers who are eligible for the product in the campaign (A), including demographics, and other customer attributes
  • A test campaign has been run on a random subset of customers from A (let’s call this B)
  • In the end, the dataset records who signed up from both B and C (C is the customers who were not part of B; any sign-ups in C happened without targeting)
  • Objective is to identify who should be targeted and who should be left alone (either because they would sign up without being targeted or they would not sign up even if they were targeted)

My understanding is as follows:

  • My training data should be all the customers who have signed up to the product offer
  • From this, I can then derive the features that are relevant
  • Finally, my model should classify which customers should be targeted and which should be left alone

My questions:

  • Is this the right approach? Or is it not a classification problem by nature?
  • When selecting a model, we should do some exploratory analysis first to understand what’s in the data, right? How does that relate to selecting which machine learning model to use?



Hi @cruisybd,

That’s an interesting problem. First of all, from what I understand, you have 3 datasets: A, B and C (B is a subset of A). So you have the features and target variable for B and C, while some rows of A do not have the target.

The idea is to identify potential customers (target variable 1 and 0)

Your training data can include both people who signed up and those who did not. That way, your model will learn who the potential customers are and also who they are not.

Yes, that is correct.

This is certainly a classification problem, since your target variable is 1 or 0.

The choice of model depends on your dataset.

  • Check how large your training data is. For very large training data, LightGBM works well.
  • If your training data has too many categorical variables and creating dummies is very difficult, you can go for CatBoost.
  • For an imbalanced training set, you will have to choose the algorithm accordingly (GBM or XGB would be preferred).
  • Or use trial and error: simply fit logistic regression, random forest and GBM, and see which works better.
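
The trial-and-error option in the last bullet could look roughly like the sketch below. It assumes scikit-learn is installed and uses a synthetic dataset from `make_classification` as a stand-in for the real customer table (the 10% sign-up rate is an assumption, not something from this thread). ROC AUC is used as the score because accuracy can look good on an imbalanced target even when the model is useless.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the customer table (assumption: ~10% sign-up rate).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# The three model families mentioned above.
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gbm": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated ROC AUC for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {score:.3f}")
```

Which model comes out on top here says nothing about the real data; the point is only the comparison loop.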


Hi @AishwaryaSingh

Sorry, I meant it’s one dataset (the whole population is referred to as A); B and C are just groups that are subsets of A. I named them in my explanation to make it easier, but apparently it’s confusing, apologies for that. So basically A = B + C. Therefore, all the rows have the target (1 or 0).

Because the idea is to optimise the campaign (especially from a cost and conversion rate perspective), this is what I have in the current dataset:

  • customers who have been tested with the promotional campaign (B, which is part of A)
    • some of them took the offer
    • some of them did not take the offer
  • remaining customers who have not been tested with the promotional campaign (C, which is the remaining part of A)
    • some of them actually bought the product, even without the promotional offer
    • some of them did not buy the product, either because they hadn’t seen it, didn’t know about it, or actually didn’t want to buy it

This means that I need to find those customers who:

  • are likely to take the offer when targeted by the campaign
  • will buy the product anyway, without being targeted (cost saving here)
  • will not buy the product even if they are targeted

Or should I be looking at the problem only as: who should be targeted and who shouldn’t be? The ones who are not targeted will then include both customers who will buy anyway and those who will not buy even if they are targeted.


Hi @cruisybd,

Basically B and C have target variables. Use these as your training dataset. The remaining part of A should be your test dataset.

Now what you can do is train a model on C and predict on the test set (1 - buy anyway; 0 - don’t buy). This should give you the people who will buy the product without being targeted. Then, for the remaining people (who got 0), fit a model on B and predict values for them (1 - will buy after the campaign; 0 - will not buy even after the campaign).

Although this method is more optimized, it is also more complicated. A plain buy-or-not classifier would be the simpler option. You can make your choice here.
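
A minimal sketch of the two-stage idea described above, assuming scikit-learn and NumPy are available. Everything about the data is made up for illustration: `make_group` is a hypothetical helper that generates two numeric features and a 0/1 sign-up label, and logistic regression is just a placeholder for whatever model you end up choosing.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)


def make_group(n, shift):
    """Hypothetical data: two numeric features and a 0/1 sign-up label."""
    X = rng.normal(size=(n, 2))
    prob = 1.0 / (1.0 + np.exp(-(X[:, 0] + shift)))
    y = (rng.random(n) < prob).astype(int)
    return X, y


X_c, y_c = make_group(500, -1.0)   # C: customers not targeted
X_b, y_b = make_group(500, 0.0)    # B: customers in the test campaign
X_new = rng.normal(size=(200, 2))  # customers to score (no label yet)

# Stage 1: a model fitted on C predicts who buys without the campaign.
model_c = LogisticRegression().fit(X_c, y_c)
buys_anyway = model_c.predict(X_new) == 1

# Stage 2: a model fitted on B scores only the customers stage 1 marked 0.
model_b = LogisticRegression().fit(X_b, y_b)
target = np.zeros(len(X_new), dtype=bool)
if (~buys_anyway).any():
    target[~buys_anyway] = model_b.predict(X_new[~buys_anyway]) == 1

print("predicted to buy anyway:", int(buys_anyway.sum()))
print("predicted to buy only if targeted:", int(target.sum()))
print("predicted not to buy either way:", int((~buys_anyway & ~target).sum()))
```

By construction, nobody predicted to buy anyway ends up in the `target` group, which is the cost saving the original question is after.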


Hi @AishwaryaSingh ,

I like that idea, however, I need to clarify some things with you. Also, let me give an example of the dataset (the whole dataset is A and A = B + C).

| age | gender | tested_campaign | sign_up | group                |
| 50  | male   | yes             | no      | B (campaign tested)  |
| 35  | male   | yes             | yes     | B (campaign tested)  |
| 40  | male   | yes             | no      | B (campaign tested)  |
| 20  | female | no              | yes     | C (remainder of A)   |
| 53  | female | no              | no      | C (remainder of A)   |
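
To make the B/C split concrete, here is a small sketch in plain Python that groups the five example rows by `tested_campaign` and computes the sign-up rate in each group. Five rows are far too few to conclude anything; the variable names are only for illustration.

```python
from collections import defaultdict

# The five example rows: (age, gender, tested_campaign, sign_up).
rows = [
    (50, "male", "yes", "no"),
    (35, "male", "yes", "yes"),
    (40, "male", "yes", "no"),
    (20, "female", "no", "yes"),
    (53, "female", "no", "no"),
]

# group -> [number of sign-ups, number of customers]
counts = defaultdict(lambda: [0, 0])
for _age, _gender, tested, signed in rows:
    group = "B" if tested == "yes" else "C"
    counts[group][0] += signed == "yes"
    counts[group][1] += 1

# Sign-up rate per group: B is campaign tested, C is the remainder of A.
rates = {group: signups / total for group, (signups, total) in counts.items()}

for group in sorted(rates):
    print(f"{group}: sign-up rate = {rates[group]:.2f}")
```

On the real dataset, comparing these two rates is a quick first check of whether the campaign changes sign-up behaviour at all.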

When you say the remaining part of A, do you mean: split C into train and test set? Then after the model (model_1) is trained on train_C_dataset, I should predict on test_C_dataset to get people who will buy the product without being targeted (1’s) and the people who are unlikely to buy the product (0’s). Is this right?

Is there then another model to build using B? So, split B into train_B_dataset and test_B_dataset, build a model (model_2) on train_B_dataset and test it on test_B_dataset. Once happy with the performance, apply it to the remaining people who got 0’s in test_C_dataset_predicted above?

So, in summary, there are 2 models to build, one from C (model_1) and one from B (model_2). First apply model_1 to test_C_dataset and, from the prediction test_C_dataset_predicted, get the customers who are predicted 0 for their target response. Finally, apply model_2 to these customers.

Is this the right understanding? This is something new, I have never come across 2 x model, 2 x fitting steps like this. It sounds awesome!



Sorry for the delay in reply.

I mean, the people from dataset C with the target variable 0. Since they got 0, that implies they did not buy the product without the campaign.

Also, I have a question. If all rows in your dataset have the target variable, which is your test dataset?

My idea was: we have a training set (B+C) and a test set. Fit on C and predict on the test set. For the rows which got a 0 value, fit on B and predict on these rows.


Hi @AishwaryaSingh,

My test set will have to be the split of C (e.g. 80% for training, 20% for testing).

Now that you asked that, this wouldn’t work, would it?


Hi @cruisybd,

I started similar research in June 2018 and have already submitted the initial phase of it to the ICREST conference. I am just curious to know how far you got with this project.