I am working on a loan default prediction problem for financial risk assessment. I would like to know a good approach to using the SMOTE function to handle an imbalanced dataset, which originally has a 6% default rate.
I used the following code to apply SMOTE:
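# Minority oversampling using SMOTE
# (assumption: SMOTE() here is the one from the DMwR package, judging by the perc.over / perc.under arguments)
library(DMwR)
training_sub <- as.data.frame(training_sub)
training_new <- SMOTE(SeriousDlqin2yrs ~ ., training_sub, perc.over = 200, perc.under = 100)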
The SMOTE'd data comes out balanced (50% class 0, 50% class 1), and the number of records also changes.
However, when I train a logistic regression model on this data, sensitivity improves but accuracy drops.
Is there a way to increase the accuracy of the model?
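To make the trade-off I am describing concrete, here is a minimal sketch in base R. The counts are made up for illustration (they are not my actual results), and the `metrics` helper and the 10,000-loan test set are hypothetical. It only shows how, at a 6% default rate, a model that never predicts default can score higher accuracy than a model that actually catches defaults.

```r
# Hypothetical test set: 10,000 loans at the 6% default rate from the question.
n_default <- 600
n_good    <- 9400

# Illustrative helper: metrics from confusion-matrix counts,
# treating "default" as the positive class.
metrics <- function(tp, fn, tn, fp) {
  c(accuracy    = (tp + tn) / (tp + fn + tn + fp),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

# Case A: a model that predicts "no default" for every loan.
baseline <- metrics(tp = 0, fn = n_default, tn = n_good, fp = 0)

# Case B: made-up counts for a model that catches 70% of defaults
# while wrongly flagging 20% of the good loans.
rebalanced <- metrics(tp = 420, fn = 180, tn = 7520, fp = 1880)

# Side-by-side comparison of the two cases.
print(round(rbind(baseline, rebalanced), 3))
```

In this made-up case, accuracy falls from 0.94 to 0.794 while sensitivity rises from 0 to 0.7, which is the same pattern I see after SMOTE.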