What to do when the Hosmer-Lemeshow test fails during logistic regression?



I am very new to modelling techniques. I have a couple of doubts:

If I perform a Hosmer-Lemeshow test to assess the goodness of fit of the model and the test fails, meaning I get a really low p-value, what can I do to make sure that the Hosmer-Lemeshow test doesn't fail?


I am also very new to analytics, but I'll try to answer your question. There are a few things you could try to improve your p-value:

  1. Change your selection of numerical variables. Try to use relevant variables and check their significance. Later you can adjust them according to your business problem.

  2. Bucket your continuous variables into 3-4 bins (the exact number depends on the business).

  3. Create dummy variables to replace the categorical variables.

I hope this helps. All of these steps come after you have done exploratory data analysis and data preparation.
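As a rough illustration of steps 2 and 3, here is a minimal sketch using pandas. The data frame and the column names (`age`, `region`) are made up for the example, and quantile-based bins are only one possible choice; business-defined cut points could be used with `pd.cut` instead.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
df = pd.DataFrame({
    "age": [22, 35, 47, 58, 63, 29, 41, 70],
    "region": ["north", "south", "east", "north",
               "west", "south", "east", "west"],
})

# Step 2: bucket a continuous variable into 4 quantile-based bins.
df["age_bin"] = pd.qcut(df["age"], q=4, labels=["q1", "q2", "q3", "q4"])

# Step 3: replace categorical columns with dummy (indicator) variables.
# drop_first=True drops one level per variable so the dummies are not
# perfectly collinear with the intercept of the logistic model.
X = pd.get_dummies(df[["age_bin", "region"]], drop_first=True)

print(X.columns.tolist())
```

The resulting `X` can then be passed to whatever logistic regression routine you are using.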

Hi @vijay1987,

Adding to what Neel has said.

The Hosmer-Lemeshow goodness-of-fit test has a known weakness: for larger data sets (more than about 1000 observations) it is highly likely to fail, because with that many observations even small deviations between predicted and observed rates become statistically significant. I have read many references that suggest the same thing. So it's OK if your model doesn't pass the H-L test when your data set has more than 1000 observations, as H-L is mainly useful when the data set is small.
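To make it concrete what the test is measuring, here is a minimal sketch of the usual form of the statistic, assuming NumPy and SciPy are available. The function name `hosmer_lemeshow` and the simulated data are my own for illustration and are not from any particular package. Observations are sorted by predicted probability, split into groups (10 by default), and the observed and expected event counts in each group are compared with a chi-square statistic on `n_groups - 2` degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2


def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Return (statistic, p_value) for a Hosmer-Lemeshow style test.

    Illustrative helper, not a library function.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)

    # Sort by predicted probability and split into near-equal groups.
    order = np.argsort(y_prob)
    groups = np.array_split(order, n_groups)

    stat = 0.0
    for idx in groups:
        n = len(idx)
        observed = y_true[idx].sum()   # observed events in the group
        expected = y_prob[idx].sum()   # expected events in the group
        mean_p = expected / n          # mean predicted probability
        stat += (observed - expected) ** 2 / (n * mean_p * (1 - mean_p))

    # A low p-value means observed and predicted rates disagree.
    p_value = chi2.sf(stat, df=n_groups - 2)
    return stat, p_value


# Simulated example: outcomes drawn from the same probabilities that
# are passed in as predictions, so the "model" is correctly specified.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))
y = rng.binomial(1, p_true)

stat, p_value = hosmer_lemeshow(y, p_true)
print(f"H-L statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

One way to see the sample-size issue for yourself is to pass in slightly miscalibrated probabilities and rerun this with increasing numbers of observations.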

Hope this helps.

Aayush Agrawal


Thanks a lot for the reply.
Will transforming my variable also help?

Please correct me if I am wrong:
The Hosmer-Lemeshow test is failing because my independent variable is not able to correctly predict my “y”, right?