I am trying to implement gradient descent without regularization on the Titanic dataset, following Andrew Ng's machine learning course.
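The cost function given in the course for logistic regression is:
$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$
(summed over all $m$ records and divided by $m$)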
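For reference, the cost can be sketched in NumPy as below. This is a minimal sketch, not the course's code: the function name `cost`, the tolerance `eps`, and the choice to divide the sum by the number of records (the course's 1/m convention) are assumptions. Clipping the probabilities into `[eps, 1 - eps]` is one common way to keep the logarithm finite.

```python
import numpy as np

def cost(h, y, eps=1e-12):
    """Mean cross-entropy cost for predicted probabilities h and 0/1 labels y."""
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    # Clip so that log(0) is never evaluated; eps is an assumed tolerance.
    h = np.clip(h, eps, 1.0 - eps)
    return float(np.mean(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h)))

# Hypothetical predictions and labels, for illustration only.
print(cost([0.9, 0.2, 0.6], [1, 0, 1]))
```

With this sign convention the cost is never negative, and it approaches zero as the predictions approach the labels.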
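The gradient descent update is:
$$\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
(for every feature $j$, where $m$ is the number of records)
with the hypothesis
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$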
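The hypothesis and the update rule above can be written in vectorized form. The following is a minimal sketch under stated assumptions: `X` holds one record per row with shape `(m, n)`, `y` and `theta` are 1-D arrays, and the names `sigmoid` and `gradient_step` as well as the tiny example data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # h(x) = 1 / (1 + e^(-z)); scipy.special.expit computes this same function.
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(theta, X, y, alpha):
    """One batch update of all theta_j at once; X is (m, n), y is (m,), theta is (n,)."""
    m = X.shape[0]
    h = sigmoid(X @ theta)        # predicted probability for every record
    grad = X.T @ (h - y) / m      # (1/m) * sum over records of (h - y) * x_j
    return theta - alpha * grad

# Made-up data: three records, an intercept column plus one feature.
X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.zeros(2)
theta = gradient_step(theta, X, y, alpha=0.1)
print(theta)
```

Keeping records in rows means the sum over records is a single matrix product, so no transposes are needed inside the loop.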
Here is my implementation in Python 3:

```python
# Imports assumed from the names used below
import numpy as np
from scipy.special import expit

iteration = 1000
thetanew = np.random.randint(0,10,size=(titanicTrain_gradient_X().shape,1))
theta = thetanew
errorLog = np.empty(iteration)
alpha = 0.0001
epsilon = 0.1
for i in range(iteration):
    thetaX = np.dot(theta.transpose(),titanicTrain_gradient_X())
    eThetaX = expit(-thetaX)
    denom = np.add(1,eThetaX)
    hx = np.divide(1,denom)
    diff = np.subtract(hx,titanicTrain_gradient_Y.transpose())
    derivative = np.dot(diff,titanicTrain_gradient_X().transpose())
    theta = np.subtract(thetanew, np.multiply((alpha/titanicTrain_gradient_X().shape),derivative.transpose()))
    thetaXNew = np.dot(theta.transpose(),titanicTrain_gradient_X())
    eThetaXNew = expit(-thetaXNew)
    denomNew = np.add(1,eThetaXNew)
    hxNew = np.divide(1,denomNew)
    #To avoid divide by zero error in log
    hxNewLogSafe = np.subtract(hxNew,epsilon)
    # print (hxNew)
    costerror = np.divide(np.add(np.multiply(titanicTrain_gradient_Y.transpose(),np.log(hxNewLogSafe)),
                                 np.multiply(np.subtract(1,titanicTrain_gradient_Y.transpose()),np.log(np.subtract(1,hxNewLogSafe)))),
                          titanicTrain_gradient_Y.shape)
    #print (costerror)
    errorLog[i] = costerror.sum()
    thetanew = theta
```
However, when I plot the cost against the iteration number, the curve is not consistent across runs of this code. Sometimes the error increases with each iteration and sometimes it decreases. Below are some of the plots:
Can anyone suggest what is going wrong here? The cost is supposed to decrease with every iteration. I tried different values of alpha (0.1, 0.01, 0.001, 0.0001, etc.), but it made no difference.
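The expectation that the cost falls on every iteration can be checked on data where the setup is fully controlled. The following is a minimal sketch, not a fix for the code above: it assumes synthetic data in place of the Titanic features, a fixed seed so that repeated runs start from the same theta and produce the same curve, and hypothetical names (`mean_cost`, `run_gradient_descent`, `true_theta`). Without a seed, a randomly drawn starting theta differs between runs, so the curves will differ too.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_cost(theta, X, y, eps=1e-12):
    # Cross-entropy averaged over records, with clipping to keep log() finite.
    h = np.clip(sigmoid(X @ theta), eps, 1.0 - eps)
    return float(np.mean(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h)))

def run_gradient_descent(X, y, alpha, iterations, seed=0):
    rng = np.random.default_rng(seed)                # fixed seed: same start every run
    theta = rng.normal(scale=0.01, size=X.shape[1])  # small random initial theta
    costs = np.empty(iterations)
    for i in range(iterations):
        h = sigmoid(X @ theta)
        theta = theta - alpha * (X.T @ (h - y)) / X.shape[0]
        costs[i] = mean_cost(theta, X, y)            # cost after the update
    return theta, costs

# Synthetic stand-in data: intercept column plus two standard-normal features.
rng = np.random.default_rng(42)
m = 200
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
true_theta = np.array([-0.5, 1.0, -2.0])             # made up for illustration
y = (rng.random(m) < sigmoid(X @ true_theta)).astype(float)

theta, costs = run_gradient_descent(X, y, alpha=0.1, iterations=200)
print("cost went from", costs[0], "to", costs[-1])
print("non-increasing:", bool(np.all(np.diff(costs) <= 0)))
```

Running the same check on the real features, with the same seed, gives a reproducible curve to compare against when changing alpha.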