So far, I have implemented four models: logistic regression, decision trees, random forest, and conditional inference trees.
Accuracy was measured on the TRAINING data, using confusion matrices.
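For reference, here is a minimal sketch of how accuracy is read off a confusion matrix: the correctly classified cases sit on the diagonal, and accuracy is their share of all cases. The counts below are made up purely for illustration and are not my results; the function name `accuracy` is also just a placeholder.

```python
def accuracy(confusion):
    """Return accuracy for a square confusion matrix of counts.

    Rows are actual classes, columns are predicted classes, so the
    diagonal holds the correctly classified cases.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 2x2 matrix (assumed counts, not taken from the real data):
#               predicted N   predicted Y
# actual N           50            30
# actual Y           10           110
example = [[50, 30], [10, 110]]

# (50 + 110) correct out of 200 cases
print(accuracy(example))
```

Since this is computed on the training data, it can look better than the score on unseen data.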
Feature engineering used so far:
- Self Employed: converted to a factor, missing values imputed as No
- Gender: converted to a factor, missing values imputed as Male
- Married: converted to a factor, missing values imputed as Yes
- Dependents: converted to a factor, missing values imputed as 0
- Term: converted to a factor, missing values imputed as 360, 350 changed to 360, 6 changed to 60
- Credit history: converted to a factor, missing values imputed as 1
- Loan amount: fitted using rpart
- New variable: TotalIncome
- New variable: Alpha (loan amount / total income)
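The steps in the list above can be sketched roughly as follows. This is a plain-Python illustration rather than my actual code: the records, the field names (`ApplicantIncome`, `CoapplicantIncome`, `LoanAmount` and so on), and the helper `engineer` are all assumptions, and TotalIncome is assumed to be applicant income plus co-applicant income. The rpart step for loan amount is not reproduced here.

```python
# Hypothetical applicant records; field names and values are assumptions
# made for illustration, not rows from the real data set.
applicants = [
    {"Gender": "Male", "Married": None, "Term": 350,
     "ApplicantIncome": 5000, "CoapplicantIncome": 1500, "LoanAmount": 130},
    {"Gender": None, "Married": "No", "Term": 6,
     "ApplicantIncome": 3000, "CoapplicantIncome": 0, "LoanAmount": 66},
    {"Gender": "Female", "Married": "Yes", "Term": None,
     "ApplicantIncome": 4000, "CoapplicantIncome": 1000, "LoanAmount": 100},
]

# Values used to fill in missing entries, as listed in the post.
DEFAULTS = {
    "Self_Employed": "No",
    "Gender": "Male",
    "Married": "Yes",
    "Dependents": "0",
    "Term": 360,
    "Credit_History": 1,
}

# Term values treated as data-entry errors: 350 -> 360, 6 -> 60.
TERM_FIXES = {350: 360, 6: 60}


def engineer(row):
    """Return a copy of row with imputed values and the two new variables."""
    out = dict(row)
    # Fill any missing (absent or None) field with its default.
    for field, default in DEFAULTS.items():
        if out.get(field) is None:
            out[field] = default
    # Correct the suspicious Term values.
    out["Term"] = TERM_FIXES.get(out["Term"], out["Term"])
    # Assumption: TotalIncome = applicant income + co-applicant income.
    out["TotalIncome"] = out["ApplicantIncome"] + out["CoapplicantIncome"]
    # Alpha = loan amount / total income; left as None if income is zero.
    total = out["TotalIncome"]
    out["Alpha"] = out["LoanAmount"] / total if total else None
    return out


prepared = [engineer(r) for r in applicants]
for row in prepared:
    print(row["Gender"], row["Married"], row["Term"], row["TotalIncome"], row["Alpha"])
```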
Decision trees and random forest give me the best score and rank.
I am stuck at a score of 0.784722222222222 and rank 492.
How can I improve?
Please help out!