In the AV loan prediction problem there is a continuous variable ‘CoapplicantIncome’. Out of 614 customers in total, around 250 have a co-applicant income of zero, which is about 40% of the data. What can I do with this variable?
One thing you can try is creating a new feature where applicants whose co-applicant income is zero are marked as 0 and applicants whose co-applicant income is greater than zero are marked as 1.
Maybe this binary variable can be a useful feature.
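A minimal sketch of that flag in pandas, assuming the data is loaded into a DataFrame with a ‘CoapplicantIncome’ column. The function name, the new column name ‘HasCoapplicantIncome’, and the sample values are made up for illustration.

```python
import pandas as pd


def add_coapplicant_flag(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 0/1 flag for non-zero co-applicant income."""
    out = df.copy()
    # 1 when co-applicant income is greater than zero, otherwise 0.
    # Missing values compare as False, so they also end up as 0.
    out["HasCoapplicantIncome"] = (out["CoapplicantIncome"] > 0).astype(int)
    return out


# Toy values for illustration only, not taken from the real dataset.
sample = pd.DataFrame({"CoapplicantIncome": [0.0, 1508.0, 0.0, 2358.0]})
flagged = add_coapplicant_flag(sample)
print(flagged)
```

You can keep the original continuous column next to the flag and let the model decide which one is more useful.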
I have created a simple random forest model for the loan prediction problem for which you can find below the list of fields in order of importance. Some of the possible use cases for the Co-applicant income are as follows:
Total income can be calculated as (applicant income + co-applicant income).
The ratio between total income and loan amount can be used as a feature.
Both total income and the income-to-loan ratio have a high importance in affecting the model outcome.
Co-Applicant income in itself has a high level of importance in predicting the loan status.
As the total income capability of the applicant (applicant income + co-applicant income) increases, the proportion of approved loans also increases.
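The total income and income-to-loan ratio features described above could be built roughly like this. This is only a sketch: the column names ‘ApplicantIncome’ and ‘LoanAmount’ are assumed (only ‘CoapplicantIncome’ is named in this thread), the new column names are made up, and the sample rows are toy values.

```python
import numpy as np
import pandas as pd


def add_income_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with total income and income-to-loan ratio added."""
    out = df.copy()
    # Total income = applicant income + co-applicant income.
    out["TotalIncome"] = out["ApplicantIncome"] + out["CoapplicantIncome"]
    # Treat a zero loan amount as missing to avoid dividing by zero.
    loan = out["LoanAmount"].replace(0, np.nan)
    # Ratio between total income and loan amount (NaN where the loan amount is missing).
    out["IncomeToLoanRatio"] = out["TotalIncome"] / loan
    return out


# Toy values for illustration only, not taken from the real dataset.
sample = pd.DataFrame(
    {
        "ApplicantIncome": [5000, 3000, 4000],
        "CoapplicantIncome": [0, 1500, 2000],
        "LoanAmount": [125.0, 90.0, np.nan],
    }
)
features = add_income_features(sample)
print(features)
```

Rows with a missing loan amount get a missing ratio here, so they would still need imputing before being passed to a model that cannot handle NaN.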
Hope this was of some help.
Hello Shaurya and all, I looked at the data and observed that in many observations ‘CoapplicantIncome’ is specified even though the marital status of the (primary) applicant is ‘No’. How should this data be interpreted? Do we have to assume that the applicant is applying for the loan along with a brother or another family member? I will be happy to hear your opinions and ideas. Thanks.