Hi @Aarshay,

Just to chime in here with a quote from Hosmer and Lemeshow's Applied Logistic Regression: "when the outcome variable is dichotomous (e.g. male or female):

- The conditional mean of the regression equation must be formulated to be bounded between 0 and 1 (the ratio scale you mentioned).
- The binomial, not the normal, distribution describes the distribution of the errors.
- The principles that guide an analysis using linear regression will also guide us in logistic regression."

The way that logistic regression satisfies the first condition is pretty cool (…or at least I think so). It takes the linear regression model, Y = I + B*X, whose right-hand side can range anywhere between negative and positive infinity, and exponentiates it. The parameters I and B can still range between negative and positive infinity, but the result, e^(I + B*X), is always positive (because, as you know, when you exponentiate something, the result is always positive). This positive quantity is the odds of the outcome. It then turns the odds into a fraction, p = e^(I + B*X) / (1 + e^(I + B*X)), which is constrained to lie between 0 and 1 (which works great as a probability estimate).
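
Here is a minimal Python sketch of those two steps (exponentiate, then turn the odds into a fraction). The function names and the example values for I and B are just made up for illustration, and it assumes the inputs are small enough that `math.exp` does not overflow:

```python
import math

def linear_predictor(x, intercept, slope):
    # Y = I + B*X: can be any real number, negative or positive.
    return intercept + slope * x

def odds(x, intercept, slope):
    # Exponentiating makes the result strictly positive.
    return math.exp(linear_predictor(x, intercept, slope))

def probability(x, intercept, slope):
    # odds / (1 + odds) always lands strictly between 0 and 1.
    o = odds(x, intercept, slope)
    return o / (1.0 + o)

# Hypothetical parameters, chosen only to show the behaviour.
for x in (-10, -1, 0, 1, 10):
    print(x, linear_predictor(x, 0.0, 1.0), probability(x, 0.0, 1.0))
```

No matter how far the linear part swings in either direction, the last column stays between 0 and 1.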

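The second condition makes sense with a dichotomous outcome since, for a given X, the error can only take two values: 1 - p if the outcome is 1, or -p if the outcome is 0, where p is the predicted probability (ya either got it, or ya didn’t). So the binomial distribution, rather than the normal, fits the task better for dichotomous classification.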
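
To make the "only two values" point concrete, here is a small sketch. The helper name `possible_errors` and the value p = 0.7 are my own choices for illustration, not something from the book:

```python
def possible_errors(p):
    # For a 0/1 outcome with P(outcome = 1) = p, the error (outcome - p)
    # can only be 1 - p (with probability p) or -p (with probability 1 - p).
    return [(1.0 - p, p), (-p, 1.0 - p)]

p = 0.7  # hypothetical predicted probability
errors = possible_errors(p)

# Probability-weighted mean and variance of the error.
mean = sum(e * w for e, w in errors)
variance = sum((e - mean) ** 2 * w for e, w in errors)

print(errors)
print(mean, variance, p * (1.0 - p))
```

The errors average out to zero and their variance works out to p * (1 - p), which is the binomial variance for a single trial rather than a constant variance like the normal model assumes.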

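For the third condition, both linear and logistic regression can be seen as using maximum likelihood to choose parameters. In linear regression with normally distributed errors, maximizing the likelihood is equivalent to minimizing the sum of squared errors, so least squares does the job. Logistic regression instead maximizes the binomial likelihood directly, which has no closed-form solution and is usually done iteratively.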
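
As a rough sketch of what "maximizing the binomial likelihood" can look like, here is a tiny fit using plain gradient ascent. The data, step size, and iteration count are all made up for illustration, and real software typically uses a faster iterative scheme than this one:

```python
import math

# Toy data, invented for this example (not perfectly separable).
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]

def predict(x, intercept, slope):
    # Logistic model: p = 1 / (1 + e^-(I + B*X)).
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def log_likelihood(intercept, slope):
    # Sum of log P(observed outcome) under the binomial (Bernoulli) model.
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(x, intercept, slope)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit(steps=5000, learning_rate=0.05):
    intercept, slope = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the log-likelihood with respect to I and B.
        residuals = [y - predict(x, intercept, slope) for x, y in zip(xs, ys)]
        intercept += learning_rate * sum(residuals)
        slope += learning_rate * sum(r * x for r, x in zip(residuals, xs))
    return intercept, slope

intercept, slope = fit()
print(intercept, slope)
print(log_likelihood(0.0, 0.0), log_likelihood(intercept, slope))
```

The fitted parameters give a higher log-likelihood than the starting guess of I = 0, B = 0, which is the whole idea: pick the parameters under which the observed 0s and 1s are most probable.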

I hope this helps and that you check out Applied Logistic Regression by Hosmer and Lemeshow. I really got a lot out of the book.

Thanks!

Nathan