Feature Selection Chi-Square



I am trying to do feature selection for Loan_Prediction 2.

I am running a chi-square test in R, and I select any categorical feature whose p-value is less than 0.05.

Please let me know whether I need to change this approach.


@Debanjan_Banerjee - In my opinion you should carry on with this approach and check whether it improves the model's accuracy on the test data set. That will tell you whether the approach is actually helping.
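A rough sketch of that check in base R, in case it is useful. The data here is synthetic and the column names (credit_history, gender, approved) are made up for illustration, not taken from the actual Loan_Prediction data; logistic regression via glm is also just an assumed stand-in for whatever model you are using.

```r
# Synthetic stand-in data; all column names are hypothetical.
set.seed(1)
n <- 600
loans <- data.frame(
  credit_history = factor(sample(c("good", "bad"), n, replace = TRUE, prob = c(0.8, 0.2))),
  gender         = factor(sample(c("F", "M"), n, replace = TRUE))
)
# In this toy data, approval depends on credit_history only.
loans$approved <- as.integer(runif(n) < ifelse(loans$credit_history == "good", 0.8, 0.2))

# Hold out 30% of the rows as a test set.
train_idx <- sample(n, size = round(0.7 * n))
train <- loans[train_idx, ]
test  <- loans[-train_idx, ]

# Fit a logistic regression on the training rows and
# return the share of correct predictions on the test rows.
test_accuracy <- function(formula) {
  fit  <- glm(formula, data = train, family = binomial)
  pred <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)
  mean(pred == test$approved)
}

acc_all      <- test_accuracy(approved ~ credit_history + gender)  # all features
acc_selected <- test_accuracy(approved ~ credit_history)           # selected features only
```

Comparing acc_all and acc_selected on the held-out rows shows whether dropping the unselected features costs any accuracy.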

Hinduja

Well, Hinduja, I am stuck at an accuracy of around 0.79.

Can you please suggest any other way to improve the accuracy?


There are lots of methods you can use to increase the accuracy, for example:
1- Bagging
2- Parameter tuning

But the best method is usually feature engineering: it can improve the accuracy of the model, and it makes the model easier to explain without resorting to a more complex model.

May I know how to run a chi-square test in R? Is it different for categorical and continuous variables? Please share some example code. Thanks.


Hi @himansu979

First, the chi-square test is for categorical variables; it is based on the contingency table and its margins. You could have a quick look at Wikipedia.
In R you have chisq.test as part of base R. You can also use the function CMHtest from the package vcdExtra; check its attributes. CMHtest will give you a better view because it reports its results as a table.
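Here is a minimal example with chisq.test, assuming a data frame with one categorical target and a few categorical predictors. The data is synthetic and the column names (credit_history, gender, loan_status) are made up for illustration. A continuous variable would first have to be binned into categories (for example with cut) before it can go into a contingency table.

```r
# Synthetic stand-in for a loan data set; all column names are hypothetical.
set.seed(42)
n <- 500
credit_history <- factor(sample(c("good", "bad"), n, replace = TRUE, prob = c(0.8, 0.2)))
gender         <- factor(sample(c("F", "M"), n, replace = TRUE))
# In this toy data, approval depends on credit_history only.
p_approve   <- ifelse(credit_history == "good", 0.8, 0.2)
loan_status <- factor(ifelse(runif(n) < p_approve, "Y", "N"))
loans <- data.frame(credit_history, gender, loan_status)

# Chi-square test of independence for a single predictor:
# build the contingency table, then pass it to chisq.test.
tab <- table(loans$credit_history, loans$loan_status)
res <- chisq.test(tab)
res$p.value

# Same test for every categorical predictor, keeping those with p < 0.05.
predictors <- c("credit_history", "gender")
p_values <- sapply(predictors, function(v) {
  chisq.test(table(loans[[v]], loans$loan_status))$p.value
})
selected <- names(p_values)[p_values < 0.05]
selected
```

The 0.05 cutoff is the one used earlier in this thread; it is a convention, not something chisq.test applies for you.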

Hope it helps.