Using the SMOTE function to handle an imbalanced data set



I am working on a loan default prediction problem for financial risk assessment. I would like to know a good approach to using the SMOTE function on an imbalanced dataset whose original default rate is 6%.

I used the following code to apply SMOTE:

Minority Oversampling using SMOTE

training_sub <-
training_new <- SMOTE(SeriousDlqin2yrs~., training_sub, perc.over = 200, perc.under = 100)

The SMOTEd data is balanced 50/50 (50% class 0, 50% class 1), and the number of records changes as well.
When I fit a logistic regression model on this data, sensitivity improves but accuracy drops.
Is there a way to increase the accuracy of the model?


The problem is not in your SMOTE code.

I would not use class balancing as the first step for improving accuracy in your modeling process.
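
To see why balancing tends to trade accuracy for sensitivity at a 6% default rate, here is a small arithmetic sketch. The counts are made up for illustration (10,000 loans is an assumption, only the 6% rate comes from the question): a model that always predicts "no default" already scores 94% accuracy while catching no defaulters, so any model that starts flagging defaults can easily lose accuracy.

```r
# Hypothetical illustration: 10,000 loans at the 6% default rate
# mentioned in the question. No real data is used here.
n <- 10000
actual <- c(rep(1, 600), rep(0, 9400))

# A trivial "model" that always predicts non-default (class 0)
pred_majority <- rep(0, n)

# Accuracy: share of correct predictions
accuracy <- mean(pred_majority == actual)

# Sensitivity: share of actual defaulters that were predicted as defaulters
sensitivity <- sum(pred_majority == 1 & actual == 1) / sum(actual == 1)

accuracy     # 0.94
sensitivity  # 0
```

So accuracy on its own says little here, and it may be worth comparing models on sensitivity and specificity together rather than on accuracy alone.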
There is a lot you can do with the variables first: binning, creating new features, handling categorical levels, and so on.
You can also consider interactions between variables.
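
As a rough sketch of what binning, a derived feature, and an interaction could look like in base R: the data below is simulated, and `loans`, `age`, `monthly_income`, `age_bin` and `log_income` are hypothetical names I made up; only `SeriousDlqin2yrs` comes from your code. The bin edges are arbitrary, so treat this as a starting point rather than a recipe.

```r
set.seed(42)

# Simulated stand-in data (hypothetical columns, not the real dataset)
n <- 2000
loans <- data.frame(
  age            = sample(21:80, n, replace = TRUE),
  monthly_income = round(rlnorm(n, meanlog = 8.5, sdlog = 0.6))
)

# Simulate a rare binary outcome whose probability falls with age
p <- plogis(-1.5 - 0.03 * loans$age)
loans$SeriousDlqin2yrs <- rbinom(n, 1, p)

# Binning: turn a numeric variable into ordered categories
loans$age_bin <- cut(
  loans$age,
  breaks = c(20, 30, 45, 60, 80),
  labels = c("21-30", "31-45", "46-60", "61-80")
)

# New feature: log-transform a skewed variable
loans$log_income <- log1p(loans$monthly_income)

# Logistic regression with main effects plus their interaction
fit <- glm(
  SeriousDlqin2yrs ~ age_bin * log_income,
  data   = loans,
  family = binomial
)

summary(fit)
```

The `age_bin * log_income` term expands to both main effects plus their interaction, so you can check in the summary whether the interaction adds anything before you turn to resampling.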

Have you already tried working on your model in this way?