Ideas to improve score

loan_prediction
r
machine_learning
hackathon

#1

So far I have implemented 4 models: logistic regression, decision trees, random forest and conditional inference trees.
The following is my accuracy on the TRAINING data, computed from confusion matrices:
image
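
For reference, a minimal sketch of how an accuracy figure comes out of a confusion matrix, written in plain Python rather than R so it runs without any packages. The counts are made up for illustration and are not the results from the image above. Keep in mind that accuracy measured on the training data is usually optimistic compared with the leaderboard score.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical counts, NOT the poster's results.
# Rows are actual classes (N, Y), columns are predicted classes (N, Y).
example = [[40, 10],
           [5, 45]]

acc = accuracy_from_confusion(example)
print(acc)
```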

Feature engineering used so far:

  1. Self Employed: converted to a factor, missing values imputed as No
  2. Gender: converted to a factor, missing values imputed as Male
  3. Married: converted to a factor, missing values imputed as Yes
  4. Dependents: converted to a factor, missing values imputed as 0
  5. Term: converted to a factor, missing values imputed as 360, 350 changed to 360, 6 changed to 60
  6. Credit history: converted to a factor, missing values imputed as 1
  7. Loan amount: fitted using rpart
  8. New variable: TotalIncome
  9. New variable: Alpha (loan amount / total income)
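
The steps above can be sketched roughly as follows, again in plain Python for illustration. The column names (`Self_Employed`, `Loan_Amount_Term`, `ApplicantIncome`, `CoapplicantIncome`, and so on) are assumptions, and so is the definition of TotalIncome as applicant income plus co-applicant income, since the post does not spell it out. The rpart step for the loan amount is not reproduced here.

```python
# Default values used for missing entries, as listed in the post.
# The column names are assumed, not taken from the post.
DEFAULTS = {
    "Self_Employed": "No",
    "Gender": "Male",
    "Married": "Yes",
    "Dependents": "0",
    "Credit_History": 1,
}

def engineer(row):
    """Return a copy of one applicant record with imputations and new features."""
    out = dict(row)

    # Fill missing categorical values with the defaults above.
    for column, default in DEFAULTS.items():
        if out.get(column) is None:
            out[column] = default

    # Term: missing -> 360, 350 -> 360, 6 -> 60.
    term = out.get("Loan_Amount_Term")
    if term is None or term == 350:
        term = 360
    elif term == 6:
        term = 60
    out["Loan_Amount_Term"] = term

    # Assumption: TotalIncome = applicant income + co-applicant income.
    out["TotalIncome"] = out["ApplicantIncome"] + out["CoapplicantIncome"]

    # Alpha = loan amount / total income; left missing if it cannot be computed.
    loan = out.get("LoanAmount")
    if loan is not None and out["TotalIncome"] > 0:
        out["Alpha"] = loan / out["TotalIncome"]
    else:
        out["Alpha"] = None
    return out

# Hypothetical record, not from the competition data.
sample = {
    "Gender": None, "Married": "No", "Dependents": None,
    "Self_Employed": None, "Credit_History": None,
    "Loan_Amount_Term": 6, "ApplicantIncome": 4000,
    "CoapplicantIncome": 1000, "LoanAmount": 100,
}
result = engineer(sample)
print(result)
```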

Decision trees and random forest give me the best score and rank.
I am stuck at a score of 0.784722222222222 and rank 492.
How can I improve?
Please help out!


#2
  1. Try a few ensemble methods
  2. Create new features (if required)
  3. Try some other algorithms, such as XGBoost
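
To illustrate the first suggestion, here is a minimal sketch of one simple ensemble method, majority voting, in plain Python. The prediction lists are made up, and the variable names `logit`, `tree` and `forest` are only placeholders for the outputs of whatever models you have already trained. With an odd number of models and two classes there are no ties to worry about.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine several models' predictions by taking the most common label per row.

    predictions_per_model: list of equal-length prediction lists, one per model.
    """
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted

# Hypothetical predictions from three models on four applicants.
logit  = ["Y", "N", "Y", "Y"]
tree   = ["Y", "Y", "N", "Y"]
forest = ["N", "N", "N", "Y"]

combined = majority_vote([logit, tree, forest])
print(combined)
```

Voting tends to help most when the individual models make different mistakes, so combining very similar models may not change the score much.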

#3

I just read a bit about what ensemble methods do, from Saurav's post (https://www.analyticsvidhya.com/blog/2017/02/introduction-to-ensembling-along-with-implementation-in-r/).
Thanks, Mathan, for the idea.

I have created a couple of new features, but I am not sure what more I can do there…