Is there a derivation for the log loss function (where does this
function come from)?
How does gradient descent work in logistic regression?
In linear regression our model was y = b0 + b1x, and what we did
there was randomly select a starting value for b1 and then keep
updating it until we reached the minimum MSE, i.e. the global minimum.
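
For reference, the linear regression procedure described above can be sketched roughly as follows. This is only a minimal illustration under assumptions that are not part of the question: the function name fit_linear_gd, the learning rate, the step count, and the toy data are all made up, and both b0 and b1 are started from random values.

```python
import numpy as np

def fit_linear_gd(x, y, lr=0.1, n_steps=5000, seed=0):
    """Fit y = b0 + b1*x by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    b0, b1 = rng.normal(size=2)           # random starting values (assumption)
    for _ in range(n_steps):
        err = (b0 + b1 * x) - y           # prediction minus target
        grad_b0 = 2.0 * err.mean()        # d(MSE)/d(b0)
        grad_b1 = 2.0 * (err * x).mean()  # d(MSE)/d(b1)
        b0 -= lr * grad_b0                # step against the gradient
        b1 -= lr * grad_b1
    return b0, b1

# Hypothetical noise-free toy data generated from y = 1 + 2x
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x
b0, b1 = fit_linear_gd(x, y)
print(b0, b1)
```

Because MSE for a linear model is convex, the starting values only affect how many steps are needed, not where the procedure ends up.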
Here, in logistic regression, our model is y = sigmoid(b0 + b1x). I
just want to know how gradient descent works here. In linear
regression we used a random starting value for b1; what will that
random value be in logistic regression?
After reading many articles I also learned that logistic regression
uses MLE (maximum likelihood estimation). I couldn't figure out where
exactly MLE is used in this algorithm.
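
To make the question concrete, here is a rough sketch of how the pieces usually fit together, under stated assumptions. If each label is treated as a Bernoulli outcome with probability p = sigmoid(b0 + b1x), the negative average log-likelihood is the log loss, so minimizing log loss by gradient descent is one way of carrying out the MLE. The names sigmoid, log_loss, and fit_logistic_gd, the learning rate, the zero starting values, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Negative average Bernoulli log-likelihood of labels y given probabilities p."""
    p = np.clip(p, eps, 1.0 - eps)        # avoid log(0)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fit_logistic_gd(x, y, lr=0.5, n_steps=2000):
    """Fit p = sigmoid(b0 + b1*x) by gradient descent on the log loss."""
    b0, b1 = 0.0, 0.0                     # starting values (assumption; small random values also work)
    history = []
    for _ in range(n_steps):
        p = sigmoid(b0 + b1 * x)
        history.append(log_loss(y, p))    # loss before this update
        err = p - y                       # gradient of log loss w.r.t. the linear term
        b0 -= lr * err.mean()             # d(log loss)/d(b0)
        b1 -= lr * (err * x).mean()       # d(log loss)/d(b1)
    return b0, b1, history

# Hypothetical synthetic data: labels drawn with probability sigmoid(-0.5 + 2x)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (rng.random(200) < sigmoid(-0.5 + 2.0 * x)).astype(float)
b0, b1, history = fit_logistic_gd(x, y)
print(b0, b1, history[0], history[-1])
```

The update has the same form as in the linear regression case, with the sigmoid output in place of the linear prediction. Since log loss is convex in b0 and b1, the choice of starting values again does not change the minimum that gradient descent approaches.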