Epochs in Neural Networks



What are epochs in a neural network? As per my understanding, if I set epochs = 1, the model reads each image once. Is that correct? If I set epochs to 10, the model will look at each image 10 times, but how is that useful? It would be great if someone could simplify this.


Hi @AarushiS,

The number of epochs is how many times your full training set is passed through the model while training. With epochs = 1, the model will only look at each input once. As you increase the number of epochs, the weights keep being refined on the same data over repeated passes, which generally produces better results.

Using a very large number of epochs is not preferred either, as your model might overfit the training images and then perform poorly on validation or test images. So we must find the optimum number of epochs to keep the model stable.
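To make the idea concrete, here is a minimal sketch of an epoch loop (a single-weight model on made-up data, not any particular library): each epoch is one full pass over every training sample, and the weight gets refined a little more on each pass.

```python
import numpy as np

# Toy data: learn y = 2x with one weight. One "epoch" = one full pass
# over the training set; more epochs = more passes over the same data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=20)
y = 2.0 * X

w = 0.0           # single weight, arbitrary starting value
lr = 0.1          # learning rate
n_epochs = 10     # each epoch shows every sample to the model once

for epoch in range(n_epochs):
    for x_i, y_i in zip(X, y):        # one pass over all samples
        pred = w * x_i                 # forward pass
        grad = 2 * (pred - y_i) * x_i  # gradient of squared error
        w -= lr * grad                 # weight update
    # after each epoch, w has moved closer to the true value 2.0

print(round(w, 3))
```

With one epoch the weight only moves part of the way; repeated epochs let it converge.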


So how do we decide the number of epochs? A higher number would overfit, and a very low number would not be enough to train the model.


For a classification task, suppose that instead of random weight initialization in a neural network, we set all the weights to zero. What will happen?


Hi @AarushiS,

You can increase the number of epochs and use early stopping. Then, if there is no improvement in the model’s performance, training will automatically stop, so the model will not overfit.

For more detail, refer to the Early Stopping section of the article below:
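As a sketch of the idea in plain Python: stop once the validation loss has not improved for `patience` consecutive epochs. (The loss values below are made up for illustration; in Keras the same behaviour comes from the built-in `EarlyStopping` callback.)

```python
# Minimal early-stopping sketch: stop training when the validation loss
# has not improved for `patience` consecutive epochs.

def train_with_early_stopping(val_losses, patience=3):
    """val_losses: validation loss per epoch (precomputed here for the
    sketch; in real training you would compute it after each epoch)."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0           # improvement: reset the counter
        else:
            wait += 1          # no improvement this epoch
            if wait >= patience:
                return epoch   # stop early
    return len(val_losses) - 1  # ran all the way to the final epoch

# Validation loss improves, then worsens -> training stops early.
losses = [0.9, 0.7, 0.5, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50]
stopped_at = train_with_early_stopping(losses, patience=3)
print(stopped_at)  # stops at epoch 6, three epochs after the best (epoch 3)
```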


Hi @kevinf,

Even if you set all the weights to zero at the start, during backpropagation the model will learn new weights, which will reduce the error and improve the model’s performance.


Can we say that in this case all the neurons will end up recognizing the same thing?
If that is true, how could we justify it?


Hi @kevinf,

You might not get exactly the same weights, but your final result will be almost the same regardless of the initialized weights.


One epoch means one forward and one backward propagation combined:
i.e., computing predictions and the loss in forward propagation, and then refining the weights through backward propagation.
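As a rough sketch (a single neuron on made-up data), one such forward-plus-backward step looks like this:

```python
import numpy as np

# One training step on a single-neuron model, split into the two phases
# described above: forward propagation (compute the loss) and backward
# propagation (compute gradients and refine the weights).
X = np.array([0.5, -0.3, 0.8])
y = np.array([1.0, -0.6, 1.6])   # targets from y = 2x

w, b = 0.1, 0.0                  # initial weight and bias
lr = 0.5

# --- forward propagation: predictions and loss ---
pred = w * X + b
loss_before = np.mean((pred - y) ** 2)

# --- backward propagation: gradients of the loss w.r.t. w and b ---
grad_w = np.mean(2 * (pred - y) * X)
grad_b = np.mean(2 * (pred - y))
w -= lr * grad_w
b -= lr * grad_b

# One more forward pass shows the loss has decreased after the update.
loss_after = np.mean((w * X + b - y) ** 2)
print(loss_before > loss_after)  # True
```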


An epoch is simply one cycle through the neural network. A NN has two phases, a forward stage and a backward stage. Initially you should set the weights to random values and the biases to zero, and after every epoch you get new values for the weights and biases. The importance of epochs is to minimize the model’s cost function: in every epoch the NN does its layer-by-layer calculation, the activation functions perform their complex calculations, and at the end the output is compared with the target result. After that, backpropagation starts, in which the old weight and bias values are updated based on the computed gradients. So if you increase the number of epochs, there is a chance of getting a better result, but an optimized result does not depend only on the epochs: it also depends on the optimizer (Adam, gradient descent), the learning rate, the activation function, and all the other hyperparameters. If you run into overfitting, please use regularization (dropout, L2), which helps increase test accuracy. If you want to learn NNs properly, you should start with the Deep Learning course by Andrew Ng on Coursera.


Thank you @PulkitS @erskumars

@jaideepgami56, thank you for the suggestion. Could you give me a rough idea of how long it will take an individual to complete the DL course by Andrew Ng?


There are no rules for deciding how many epochs will work best. Generally, the more complex your network architecture and the more complex your data (e.g., large images), the more epochs will be needed.
You will have to tune this: start with, say, 20 epochs, run your model, and measure the training and cross-validation loss per epoch. Plot these to check whether your model is overfitting yet. If not, increase the epochs, and so on.
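A sketch of that tuning procedure with made-up per-epoch loss values: training loss keeps falling, but the validation loss bottoms out and then rises, and the epoch at the minimum is a sensible place to stop.

```python
# Per-epoch losses from a hypothetical run (values made up for
# illustration; in practice you would record these during training).
train_loss = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18, 0.13, 0.10, 0.08, 0.06]
val_loss   = [1.1, 0.8, 0.6, 0.45, 0.40, 0.38, 0.39, 0.42, 0.47, 0.53]

# Training loss keeps falling, but validation loss turns upward:
# everything past the validation minimum is the overfitting region.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
print(best_epoch + 1)  # epoch 6 (1-indexed) is where to stop
```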


Yes, the DL specialization contains five courses with a proper description of how to build neural networks and deep learning models, along with some interesting assignments that help you learn the concepts. The last course is about different sequence models, which help you build prediction models.


After setting all the weights to zero, you will get a “symmetric” system, i.e. all the neurons will become identical. Therefore, all of them will learn the same thing. That will significantly degrade the performance of your classifier.
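The symmetry can be seen directly in a tiny NumPy sketch (a made-up two-layer network with sigmoid hidden units on toy data): because every hidden neuron starts identical and receives the identical gradient, the hidden weight columns stay identical no matter how long you train.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))    # 8 samples, 3 features (toy data)
y = rng.normal(size=(8, 1))

W1 = np.zeros((3, 4))          # hidden layer: 4 neurons, all-zero init
W2 = np.zeros((4, 1))          # output layer, all-zero init
lr = 0.1

for _ in range(50):
    h = sigmoid(X @ W1)                 # forward: hidden activations
    pred = h @ W2
    err = pred - y
    dW2 = h.T @ err / len(X)            # backward: output gradients
    dh = err @ W2.T * h * (1 - h)       # gradient flowing into hidden
    dW1 = X.T @ dh / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2

# Every hidden neuron (column of W1) ends up with identical weights:
# the network behaves like it has only one hidden neuron.
print(np.allclose(W1, W1[:, :1]))  # True
```

With random initialization the columns of `W1` diverge and the neurons learn different features, which is exactly why the symmetry-breaking random start matters.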


Thanks @jaideepgami56 @mjbhobe