I was studying the mathematics behind linear regression and I came across this:

**If we denote the variable we are trying to predict as Y and our covariates as X, we may assume that there is a relationship relating one to the other, such as Y = f(X) + ϵ, where the error term ϵ is normally distributed with a mean of zero: ϵ ∼ N(0, σ_ϵ).**
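To see what this model looks like in practice, here is a small simulation sketch (my own illustration; the choice f(x) = 2x + 1 and the noise scale are arbitrary, not from the text). It draws mean-zero Gaussian errors and adds them to f(X):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary f chosen for illustration only
def f(x):
    return 2 * x + 1

n = 1000
sigma_eps = 0.5                       # noise standard deviation (arbitrary)
X = rng.uniform(0, 10, size=n)
eps = rng.normal(0, sigma_eps, size=n)  # mean-zero normal errors, eps ~ N(0, sigma_eps)
Y = f(X) + eps

# With mean-zero errors, the sample average of eps is close to 0,
# so on average Y equals f(X)
print(abs(eps.mean()))
```

Because E[ϵ] = 0, we have E[Y | X] = f(X), i.e. f(X) describes the average of Y at each X and the errors cancel out on average.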

Why is the error term assumed to have mean zero?

Is it just a mathematical convenience?

And what happens if the error is not normally distributed, or has a nonzero mean?

Any help would be greatly appreciated.

Thanks

Neeraj