I am a newbie in data science and am doing churn prediction as part of my academic work. Here is how I think my predictor should work:
- I have 9 months of transactions, from Nov to June
- I take 3 months of historical transactions (Nov, Dec, Jan) and predict for the 4th month (Feb)
- The problem is that my data is highly imbalanced: only 1% of customers churn in a given month. So I have set up a model evaluation procedure in which I evaluate a classification model against 6 different imbalance-handling techniques (oversampling, undersampling and hybrid)
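To make the imbalance concrete, here is a minimal sketch of one of the technique families mentioned above (random oversampling), written in plain Python. The function name `random_oversample`, the toy data, and the fixed seed are assumptions for illustration only; this is not the actual setup. Resampling like this would normally be applied to the training months only, so that the month being predicted keeps its natural 1% churn rate.

```python
import random

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are the same size."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is treated as the minority.
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        # Nothing to duplicate: return the data unchanged.
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = majority + minority + extra
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy training set with a 1% churn rate (10 churners out of 1000 customers).
labels = [1] * 10 + [0] * 990
rows = [[i] for i in range(1000)]

X_res, y_res = random_oversample(rows, labels)
print(len(y_res), sum(y_res))
```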
After one evaluation cycle I would like to repeat this procedure for the subsequent data sets, e.g. use Dec, Jan, Feb data to predict March, and so forth.
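The rolling procedure I have in mind can be sketched roughly as follows. The month list is an assumption based on "Nov to June", and `rolling_windows` is a hypothetical helper name, not something from a library.

```python
# Assumed month order, taken from "Nov to June".
months = ["Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

def rolling_windows(months, history=3):
    """Yield (training_months, target_month) pairs, sliding forward one month at a time."""
    for start in range(len(months) - history):
        yield months[start:start + history], months[start + history]

windows = list(rolling_windows(months))
for train_months, target_month in windows:
    print(train_months, "->", target_month)
```

Each pair would correspond to one evaluation cycle: fit on the three training months, then predict the target month.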
I was thinking of reusing the trained model whenever I do the subsequent analyses.
My question is:
- Should I save the model after it has been trained and has made predictions on the imbalanced data?
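By "save" I mean something like the sketch below, which uses `pickle` from the Python standard library. The dictionary is only a hypothetical stand-in for a fitted classifier, and the file name and field names are assumptions for illustration. The idea is to store the training window and the imbalance technique alongside the model so that a later cycle knows what it is reusing.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted classifier plus the context needed to reuse it.
model_bundle = {
    "trained_on": ["Nov", "Dec", "Jan"],
    "predicted_month": "Feb",
    "sampler": "oversampling",  # which imbalance technique was used in this cycle
    "model": {"weights": [0.4, -1.2], "bias": 0.1},  # placeholder for the real fitted object
}

# Hypothetical file name, written to a temporary directory for this example.
path = os.path.join(tempfile.mkdtemp(), "churn_model_feb.pkl")

# Save after the Feb cycle.
with open(path, "wb") as f:
    pickle.dump(model_bundle, f)

# Load it again when starting the next cycle (Dec, Jan, Feb -> Mar).
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored["trained_on"], restored["sampler"])
```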