How to check accuracy of Logistic Regression model



After doing some feature engineering, I applied a logistic regression model to my training set (after removing the loan ID column):

fit = glm(as.factor(Loan_Status) ~ ., data = train, family = binomial)

Now that I have the model, I want to check its accuracy.
So I created a duplicate of the training set, named it check, and applied the model to it:

check$Loan_Status = 0
Prediction = predict(fit, check, type = "response")

Now there are some problems:

  1. The model doesn't seem to get applied to the check dataset. There are no errors, but the value of Loan_Status does not change.
  2. How do I check accuracy? I wanted to use a confusion matrix:
    confusionMatrix(train$Loan_Status, check$Loan_Status)
    but I got an error: the data cannot have more levels than the reference

Can anybody help me out?


Solved it! :smiley:
How do I mark it as solved?


After a logistic regression, the predict function (with type = "response") returns probabilities, not class labels. Hence, you need to convert each probability into a binary (or multinomial) class, for example by applying a cutoff.

Then you can compare this variable with your target variable.
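
Here is a minimal sketch of those two steps in R. It uses simulated data as a stand-in for the loan data: the predictor columns (Income, Credit_History), the Y/N labels, and the 0.5 cutoff are all assumptions for illustration, not taken from the original dataset. Note that predict() only returns a vector; it never writes into the data frame, so the result has to be stored and converted explicitly.

```r
# Simulated stand-in for the loan data; every column except Loan_Status is hypothetical
set.seed(42)
n <- 200
train <- data.frame(
  Income = rnorm(n, mean = 5000, sd = 1500),
  Credit_History = rbinom(n, 1, 0.8)
)

# Generate the outcome from the predictors through a logistic link
lin <- -2 + 0.0002 * train$Income + 2 * train$Credit_History
train$Loan_Status <- factor(ifelse(runif(n) < plogis(lin), "Y", "N"),
                            levels = c("N", "Y"))

fit <- glm(Loan_Status ~ ., data = train, family = binomial)

# predict() returns probabilities of the second factor level ("Y");
# it does not modify the data frame passed as newdata
prob <- predict(fit, newdata = train, type = "response")

# Convert probabilities to class labels with an assumed 0.5 cutoff,
# reusing the factor levels of the target so both sides match
pred <- factor(ifelse(prob > 0.5, "Y", "N"),
               levels = levels(train$Loan_Status))

# Cross-tabulate predicted against actual classes
conf_tab <- table(Predicted = pred, Actual = train$Loan_Status)
print(conf_tab)
```

Building pred with the same levels as the target is likely what avoids the "more levels than the reference" error, since confusionMatrix from the caret package expects two factors whose levels agree.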

Validation techniques:

  1. Classification table: accuracy = sum(diagonal) / total sum

  2. Hosmer-Lemeshow (HL) test

  3. ROC curve / AUC value

  4. Concordance
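
As a rough illustration of the classification-table accuracy and the AUC / concordance measures listed above, here is a small base R sketch. The actual and prob vectors are made-up values, and the 0.5 cutoff is an assumption. The AUC is computed with the rank (Mann-Whitney) formula, which equals the share of concordant positive/negative pairs when there are no tied probabilities. The HL test is not shown because it usually needs an extra package.

```r
# Hypothetical observed classes (1 = approved) and predicted probabilities
actual <- c(1, 0, 1, 1, 0, 1, 0, 1)
prob   <- c(0.9, 0.3, 0.6, 0.4, 0.2, 0.8, 0.7, 0.55)

# Classification table with an assumed 0.5 cutoff
pred <- ifelse(prob > 0.5, 1, 0)
conf_tab <- table(Predicted = factor(pred, levels = c(0, 1)),
                  Actual    = factor(actual, levels = c(0, 1)))

# Accuracy = sum of the diagonal / total count
accuracy <- sum(diag(conf_tab)) / sum(conf_tab)

# AUC via the rank formula: fraction of (positive, negative) pairs
# in which the positive case received the higher probability
n1 <- sum(actual == 1)
n0 <- sum(actual == 0)
r  <- rank(prob)
auc <- (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(conf_tab)
print(accuracy)
print(auc)
```

Unlike accuracy, the AUC does not depend on the chosen cutoff, so it is often reported alongside the classification table.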


Hey j,akhil.j,
Can you suggest any document dataset that contains duplicates? I want to test my work at a large scale.