In PCA, the second principal component is chosen to capture as much of the remaining variability in the data as possible, subject to the constraint that it is orthogonal to the first principal component.

Why is a 90-degree angle (orthogonality) required here?

I would like to understand the math behind this, so any help would be appreciated!
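
To make the condition concrete, here is a minimal numerical sketch in plain Python. Everything in it is an assumption for illustration: the 2-D data is made up, and the helper names (`covariance_2d`, `top_eigenpair`) are hypothetical. It finds the first component as the dominant eigenvector of the sample covariance matrix by power iteration, subtracts the variance that component explains (deflation), and then finds the direction of greatest remaining variance. This is just one way to compute the components, not necessarily what any particular PCA library does.

```python
import math
import random


def covariance_2d(points):
    """Sample covariance matrix [[sxx, sxy], [sxy, syy]] of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def matvec(c, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [c[0][0] * v[0] + c[0][1] * v[1],
            c[1][0] * v[0] + c[1][1] * v[1]]


def top_eigenpair(c, iters=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of c."""
    v = [1.0, 0.0]  # assumed start; must not be orthogonal to the answer
    for _ in range(iters):
        w = matvec(c, v)
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    w = matvec(c, v)
    return v[0] * w[0] + v[1] * w[1], v  # Rayleigh quotient, eigenvector


# Made-up correlated data: y depends on x plus noise.
random.seed(0)
points = []
for _ in range(500):
    x = random.gauss(0.0, 3.0)
    points.append((x, 0.5 * x + random.gauss(0.0, 1.0)))

cov = covariance_2d(points)

# First component: direction of maximum variance.
lam1, v1 = top_eigenpair(cov)

# Remove the variance explained by v1, then look for the direction
# of maximum remaining variance.
deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(2)]
            for i in range(2)]
lam2, v2 = top_eigenpair(deflated)

dot = v1[0] * v2[0] + v1[1] * v2[1]
print("variance along PC1:", lam1)
print("variance along PC2:", lam2)
print("dot product of PC1 and PC2:", dot)
```

In this sketch the orthogonality is never imposed directly, yet the printed dot product should come out at essentially zero. That follows from the covariance matrix being symmetric: eigenvectors of a symmetric matrix that belong to different eigenvalues are always orthogonal.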