Model Validation Techniques for R (Logistic Regression)


Hi all,
In R, which techniques can be used to validate a logistic regression model? Are the ones below sufficient?

  1. Kappa value
  2. Concordance/Discordance/Tied
  3. Confusion Matrix
  4. Lift & Gain Chart
  5. ROC & Area under the Curve

Kindly post the R command and its interpretation if you are posting something beyond the list above.

The following are sufficient to validate the logistic regression model:

  1. Concordance/Discordance
  2. Lift chart
  3. AUC (Area under the Curve)
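
Since the original question asked for R commands, here is a minimal base R sketch of these three measures. The data are simulated and the names (d, x1, x2, fit) are placeholders, not anything from this thread; swap in your own data frame and formula. It relies on the fact that AUC equals the share of concordant pairs plus half the share of tied pairs.

```r
# Simulated data (hypothetical; replace with your own data frame)
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
d  <- data.frame(y, x1, x2)

fit <- glm(y ~ x1 + x2, data = d, family = binomial)
p   <- fitted(fit)                       # predicted probabilities

# Compare every event (y = 1) with every non-event (y = 0)
diffs      <- outer(p[d$y == 1], p[d$y == 0], "-")
concordant <- mean(diffs > 0)            # event scored higher than non-event
discordant <- mean(diffs < 0)            # event scored lower than non-event
tied       <- mean(diffs == 0)

# AUC = concordant share + half of the tied share
auc <- concordant + 0.5 * tied

# Lift in the top decile: event rate among the 10% highest scores
# divided by the overall event rate
top             <- order(p, decreasing = TRUE)[seq_len(ceiling(0.1 * n))]
lift_top_decile <- mean(d$y[top]) / mean(d$y)

print(c(concordant = concordant, discordant = discordant, tied = tied,
        auc = auc, lift_top_decile = lift_top_decile))
```

Interpretation: a higher concordant share and an AUC closer to 1 mean the model ranks events above non-events more often; a top-decile lift above 1 means the highest-scored 10% contain more events than a random 10% would.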


You would want to run model diagnostics to check the validity of the model. That would mean:
1- Testing for linearity of the model
2- No apparent trend in the residual distribution w.r.t. the predictors
3- Testing for influential observations

It could be more detailed, but this is as much as I know as of now.
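
A rough base R sketch of the three checks above, on simulated data. The variable names and the 4/n cutoff for Cook's distance are assumptions chosen for illustration (4/n is a common rule of thumb, not a hard rule), and the x * log(x) term is a Box-Tidwell style check that only works for positive predictors.

```r
# Simulated data (hypothetical; x is kept positive so log(x) is defined)
set.seed(1)
n <- 400
x <- runif(n, 0.5, 5)
z <- rnorm(n)
y <- rbinom(n, 1, plogis(-2 + 0.8 * x + 0.5 * z))
d <- data.frame(y, x, z)

fit <- glm(y ~ x + z, data = d, family = binomial)

# 1. Linearity on the log-odds scale: add an x * log(x) term.
#    A small p-value for that term hints at non-linearity in x.
d$x_logx <- d$x * log(d$x)
fit_bt   <- glm(y ~ x + z + x_logx, data = d, family = binomial)
bt_p     <- coef(summary(fit_bt))["x_logx", "Pr(>|z|)"]

# 2. Residuals against a predictor: smooth them and look for a trend.
#    To inspect visually: plot(d$x, res); lines(sm, col = "red")
res <- residuals(fit, type = "deviance")
sm  <- lowess(d$x, res)

# 3. Influential observations via Cook's distance
cd          <- cooks.distance(fit)
influential <- which(cd > 4 / n)

print(bt_p)
print(head(sort(cd, decreasing = TRUE)))
```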



Thanks for your reply, but isn't this true for linear regression? Logistic regression always assumes a non-linear relationship between the IVs and the DV.


True, the relationship between the DV and the IVs is not linear, but the full model must be linear on the log-odds scale.


I assume these are not validation techniques for a logistic model but performance measures… correct me if I am wrong.


Could you please shed some light on points 1, 2 and 3 that you mentioned, and explain how they can be used to validate a model? This will help me understand some new things.
I always choose a model based on 1) and 2) (from the topic description) and then use 3), 4) and 5) to validate the model once it is applied to the test data set.


Let me give a detailed explanation of model validation techniques for logistic regression.
Why do we require model validation?
Ans: Model validation is essential to identify whether the model we have built is good or not: how well the model fits the data, which predictors are most important, and how accurate the model is.

To identify whether the logistic regression model is good or not, we can use the likelihood ratio test, a pseudo R^2, the Hosmer-Lemeshow test, etc.
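
As an illustration, the likelihood ratio test and McFadden's pseudo R^2 can be computed in base R as sketched below. The data and names are simulated placeholders, and McFadden's version is only one of several pseudo R^2 definitions. The Hosmer-Lemeshow test is not shown here; it is available as hoslem.test in the ResourceSelection package if you have it installed.

```r
# Simulated data (hypothetical; replace with your own data frame)
set.seed(7)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.3 + 1.0 * x1 + 0.7 * x2))
d  <- data.frame(y, x1, x2)

fit  <- glm(y ~ x1 + x2, data = d, family = binomial)
null <- glm(y ~ 1, data = d, family = binomial)   # intercept-only model

# Likelihood ratio test: drop in deviance from the null to the full model
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same comparison through anova()
print(anova(null, fit, test = "Chisq"))

# McFadden's pseudo R^2 = 1 - logLik(full) / logLik(null)
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))

print(c(lr_stat = lr_stat, lr_df = lr_df, lr_p = lr_p, mcfadden = mcfadden))
```

Interpretation: a small lr_p says the predictors together improve on the intercept-only model; McFadden's value lies between 0 and 1, with larger values meaning a better fit relative to the null model.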

To check how accurate the model is, we can use the confusion matrix and the ROC curve (i.e. these are performance check methods).
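
A small base R sketch of a confusion matrix on a hold-out set, with accuracy, sensitivity, specificity and Cohen's kappa derived from it. The data are simulated, the 400/200 split and the 0.5 cutoff are assumptions for illustration, and kappa_hat is just a local name chosen to avoid masking base R's kappa() function.

```r
# Simulated data (hypothetical; replace with your own data frame)
set.seed(11)
n  <- 600
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.2 + 1.1 * x1 - 0.9 * x2))
d  <- data.frame(y, x1, x2)

train <- d[1:400, ]
test  <- d[401:600, ]

fit  <- glm(y ~ x1 + x2, data = train, family = binomial)
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob >= 0.5)          # 0.5 cutoff is an assumption

# Rows are predictions, columns are actual outcomes
cm <- table(Predicted = factor(pred, levels = 0:1),
            Actual    = factor(test$y, levels = 0:1))
print(cm)

accuracy    <- sum(diag(cm)) / sum(cm)
sensitivity <- cm["1", "1"] / sum(cm[, "1"])   # true positive rate
specificity <- cm["0", "0"] / sum(cm[, "0"])   # true negative rate

# Cohen's kappa: agreement beyond what chance alone would give
expected  <- sum(rowSums(cm) * colSums(cm)) / sum(cm)^2
kappa_hat <- (accuracy - expected) / (1 - expected)

print(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, kappa = kappa_hat))
```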

To check variable importance, we can use the varImp function from the caret package.
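
If caret is not installed, a rough base R stand-in is to rank predictors by the absolute value of their z statistic. As far as I know this is also what varImp reports for glm fits, but treat that as an assumption and check the caret documentation. The data below are simulated, with x3 added as pure noise.

```r
# Simulated data (hypothetical); x3 has no effect on y
set.seed(3)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- rbinom(n, 1, plogis(1.5 * x1 + 0.5 * x2))
d  <- data.frame(y, x1, x2, x3)

fit <- glm(y ~ x1 + x2 + x3, data = d, family = binomial)

# Absolute z statistics of the coefficients, intercept dropped
z          <- coef(summary(fit))[-1, "z value"]
importance <- sort(abs(z), decreasing = TRUE)
print(importance)

# With caret installed, the packaged version would be: caret::varImp(fit)
```

Note that this ranking depends on the scale and correlation of the predictors, so it is a quick screen rather than a definitive measure.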

For validation of the model, we can use k-fold cross-validation to check whether the model performs well on different data sets.
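
A minimal k-fold cross-validation sketch in base R, using simulated data. The choice of k = 5, the 0.5 cutoff, and accuracy as the metric are all assumptions for illustration; any of the measures discussed in this thread could be computed per fold instead.

```r
# Simulated data (hypothetical; replace with your own data frame)
set.seed(123)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
d  <- data.frame(y, x1, x2)

k     <- 5
folds <- sample(rep(seq_len(k), length.out = n))   # random fold labels 1..k

acc <- numeric(k)
for (i in seq_len(k)) {
  train <- d[folds != i, ]               # fit on k - 1 folds
  test  <- d[folds == i, ]               # evaluate on the held-out fold
  fit   <- glm(y ~ x1 + x2, data = train, family = binomial)
  prob  <- predict(fit, newdata = test, type = "response")
  acc[i] <- mean((prob >= 0.5) == (test$y == 1))
}

cv_accuracy <- mean(acc)
print(round(acc, 3))
print(cv_accuracy)
```

Interpretation: if the per-fold values in acc are close to each other and to the accuracy on the training data, the model is performing consistently across different subsets; a large spread suggests it is unstable or overfitting.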