Using Random Forest to select variables in a consumer lending portfolio



Hey Guys,

Many of the questions posted about random forests and other classification algorithms focus on technique, but I personally feel they lose an important ingredient: the complexity of the product or business itself.

I am currently working with a data set from a vehicle financing loan portfolio. It has a lot of variables: categorical (qualitative), quantitative, and dichotomous. I want to select the variables that are most relevant for predicting a customer's probability of default, so I need a way to choose them. I would be glad if some of you could point me to resources that give worked examples and explain how to interpret the statistics that come out of a random forest.

Please let me know if you have any relevant R code that would help me fill the gaps.



@Sillypatterns - You can refer to this link for selecting variables.

Also, you can use the variable importance table that Random Forest produces to select variables.
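
The table mentioned above is presumably the variable importance output of the randomForest package. Here is a minimal sketch of how to get it in R. The data is simulated and every column name (loan_to_value, tenure_months, vehicle_type, salaried, noise) is made up for illustration, so treat it as a starting point rather than a recipe for the real portfolio.

```r
# Sketch only: synthetic data stands in for the real loan portfolio.
library(randomForest)

set.seed(42)
n <- 500

# Hypothetical columns; replace with the portfolio's own variables.
loans <- data.frame(
  loan_to_value = runif(n, 0.3, 1.2),                                # quantitative
  tenure_months = sample(c(12, 24, 36, 48, 60), n, replace = TRUE),  # quantitative
  vehicle_type  = factor(sample(c("car", "truck", "two_wheeler"),
                                n, replace = TRUE)),                 # categorical
  salaried      = factor(sample(c("no", "yes"), n, replace = TRUE)), # dichotomous
  noise         = rnorm(n)                                           # unrelated to default
)

# Simulated default flag, driven only by loan_to_value and salaried.
p_default <- plogis(-3 + 3 * loans$loan_to_value - 1 * (loans$salaried == "yes"))
loans$default <- factor(ifelse(runif(n) < p_default, "yes", "no"))

# The response must be a factor to get a classification forest.
# importance = TRUE also computes the permutation-based measure.
fit <- randomForest(default ~ ., data = loans, ntree = 500, importance = TRUE)

# One row per predictor; sort by the permutation-based column.
imp <- importance(fit)
print(imp[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE), ])
```

On reading the table: MeanDecreaseAccuracy is how much the out-of-bag accuracy drops when a variable's values are shuffled, and MeanDecreaseGini is the total reduction in node impurity from splits on that variable. Higher means more useful in both cases. The Gini measure is known to lean towards continuous variables and factors with many levels, so with mixed variable types the permutation column is usually the safer one to rank by. varImpPlot(fit) draws the same numbers as a dot chart.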

Hope this helps!



Thanks for this, bro.
I have come across this link: http://rstatistics.net/linear-regression-advanced-modelling-algorithm-example-with-r/
I believe it even handles categorical and continuous variables separately.

It also uses correlation and chi-squared tests to screen the variables before passing them to the random forest.
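
To make that screening idea concrete, here is a rough sketch in base R. It is not taken from the linked page; the data is simulated and the column names (loan_amount, emi, vehicle_type, salaried) are hypothetical. Continuous predictors are checked against each other with pairwise correlation, and categorical predictors are tested against the default flag with a chi-squared test.

```r
# Sketch only: synthetic data and hypothetical column names.
set.seed(1)
n <- 400
loans <- data.frame(
  loan_amount  = rnorm(n, 20000, 5000),
  vehicle_type = factor(sample(c("car", "truck", "two_wheeler"), n, replace = TRUE)),
  salaried     = factor(sample(c("no", "yes"), n, replace = TRUE)),
  default      = factor(sample(c("no", "yes"), n, replace = TRUE, prob = c(0.8, 0.2)))
)
# emi is built to be almost collinear with loan_amount.
loans$emi <- loans$loan_amount / 36 + rnorm(n, 0, 20)

# Split predictors by type; the response is excluded from the categorical set.
is_num   <- sapply(loans, is.numeric)
num_vars <- names(loans)[is_num]
cat_vars <- setdiff(names(loans)[!is_num], "default")

# Continuous predictors: pairwise Pearson correlation to spot near-duplicates.
cor_mat <- cor(loans[, num_vars])
print(round(cor_mat, 2))

# Categorical predictors: chi-squared test of independence against default.
chi_p <- sapply(cat_vars, function(v) {
  chisq.test(table(loans[[v]], loans$default))$p.value
})
print(round(chi_p, 3))
```

What counts as "too correlated" or "not significant" is a judgement call and no cutoff is implied here. One thing worth knowing: a random forest can still predict well with correlated predictors, but the importance tends to get shared between them, so dropping one of a near-duplicate pair such as loan_amount and emi makes the importance table easier to read.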

Could you please go through it and let me know whether you think this approach is OK?