I am new to analytics. I have tried using logistic regression for the first time, on the Loan Prediction problem.
When I try to predict the Loan_Status variable, the output I get from logistic regression is a probability, which is continuous.
I thought I would be getting categorical output like 0 and 1. What am I missing here? Is there anything more I have to do?
mylogit <- glm(formula = Loan_Status ~ ., data = train, family = binomial)
test$Loan_Status <- predict(mylogit, newdata = test, type = "response")
Any help would be appreciated.
Thanks in advance.
Logistic regression outputs the probability of a particular event happening. The model passes a linear combination of the predictors through the logistic function, 1 / (1 + exp(-z)), so the result always lies between 0 and 1. To get a 0 or 1, you have to apply a threshold value (say 0.5). You can use the following code to output the result as 0 or 1:
test$Loan_Status = as.numeric(test$Loan_Status> 0.5)
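For reference, this is the standard form of the quantity that predict() returns with type = "response". The symbols \beta_0, ..., \beta_k (fitted coefficients), x_1, ..., x_k (predictors) and t (the threshold, 0.5 above) are notation introduced here, not names from the original post:

```latex
\hat{p} = P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}},
\qquad
\hat{y} =
\begin{cases}
1 & \text{if } \hat{p} > t \\
0 & \text{otherwise}
\end{cases}
```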
Choosing the threshold value, on the other hand, is somewhat tricky; over time you will get the hang of it. For now you can try these two methods:
- Trial and error.
- Plotting the ROC curve: a more sophisticated way of finding the threshold. You can go through this link.
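A minimal base-R sketch of the ROC approach, assuming you have true 0/1 labels and predicted probabilities for the same rows (for example on the training data or a held-out part of it, since the test set has no known Loan_Status). The vectors `actual` and `prob` are simulated stand-ins, and picking the threshold that maximizes TPR minus FPR (Youden's J) is just one common rule, not the only one:

```r
# Simulated stand-ins (assumption): `actual` = true 0/1 labels,
# `prob` = predicted probabilities, e.g. from predict(..., type = "response").
set.seed(42)
actual <- rbinom(200, 1, 0.6)
prob <- plogis(rnorm(200, mean = ifelse(actual == 1, 1, -1)))

# Candidate thresholds to evaluate.
thresholds <- seq(0.01, 0.99, by = 0.01)

# True positive rate and false positive rate at each threshold.
tpr <- sapply(thresholds, function(t) sum(prob > t & actual == 1) / sum(actual == 1))
fpr <- sapply(thresholds, function(t) sum(prob > t & actual == 0) / sum(actual == 0))

# Draw the ROC curve when running interactively.
if (interactive()) {
  plot(fpr, tpr, type = "l",
       xlab = "False positive rate", ylab = "True positive rate")
  abline(0, 1, lty = 2)  # reference line for a random classifier
}

# Youden's J: the threshold where TPR exceeds FPR by the largest margin.
youden <- tpr - fpr
best_threshold <- thresholds[which.max(youden)]
best_threshold
```

With real data, replace `actual` and `prob` with the known labels and the model's predicted probabilities for those rows, then apply `best_threshold` in place of 0.5.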
Hope this helps.
Thanks, it helped a lot.
A logistic regression model will give you probabilities.
As the evaluation metric is accuracy, you should ideally compute the accuracy at each candidate threshold, using rows where the actual Loan_Status is known, and then pick the threshold for which accuracy is highest. That could be 0.5 or anything else.
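The threshold search described above could be sketched like this in base R. The vectors `actual` and `prob` are simulated placeholders for the known labels and the predicted probabilities on labelled data, and the 0.05 step size is an arbitrary choice:

```r
# Simulated stand-ins (assumption): `actual` = true 0/1 labels,
# `prob` = predicted probabilities for the same rows.
set.seed(1)
actual <- rbinom(300, 1, 0.7)
prob <- plogis(rnorm(300, mean = ifelse(actual == 1, 1, -0.5)))

# Accuracy of the 0/1 predictions at each candidate threshold.
thresholds <- seq(0.05, 0.95, by = 0.05)
accuracy <- sapply(thresholds, function(t) mean(as.numeric(prob > t) == actual))

# Table of results and the threshold with the highest accuracy.
results <- data.frame(threshold = thresholds, accuracy = accuracy)
best <- thresholds[which.max(accuracy)]
print(results)
best
```

A threshold tuned on the same rows the model was fitted on can look better than it really is, so a held-out validation split is the safer place to run this.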