Assumption of same mean in Pooled Variance




While studying statistics, I read about pooled variance, and the definition says: “Pooled variance is a method for estimating variance given several different samples taken in different circumstances where the mean may vary between samples but the true variance is assumed to remain the same”. I wanted to know why we assume the variance remains the same.

Also, what is the difference between the standard error computed for two samples with and without pooled variance?

where Sp = pooled variance.


Hi @adityashrm21

The assumption is meant for the case where we have small sample subsets. Let's say you have two data sets and you need to find the standard deviation of the combined sample. We have the basic formula that you mentioned.

However, if the samples are very small, we might not get an accurate picture of the combined population. What we then assume is that the true variances of these small data sets are the same, because they are part of the same larger data set.

To show this mathematically, this is what we want to achieve:
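
One common way to write this, assuming two samples of sizes n_1 and n_2 with sample variances s_1^2 and s_2^2 (this notation is mine, not from the question), is a single estimate of the shared variance that weights each sample variance by its degrees of freedom:

```latex
S_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

The larger sample gets more weight, and when n_1 = n_2 this reduces to the plain average of the two sample variances.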

Since the assumption is that the SD is the same in both groups, the SE follows accordingly.
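
To make the comparison concrete, here is a minimal Python sketch of both standard errors for the difference of two sample means. Without pooling, SE = sqrt(s1^2/n1 + s2^2/n2); with pooling, SE = Sp * sqrt(1/n1 + 1/n2), where Sp^2 is the pooled variance. The function names and the sample values are made up for illustration.

```python
import math
from statistics import variance  # sample variance (divides by n - 1)


def pooled_variance(a, b):
    """Weighted average of the two sample variances, weighted by degrees of freedom."""
    n1, n2 = len(a), len(b)
    return ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)


def se_unpooled(a, b):
    """SE of the difference in means, with each sample keeping its own variance."""
    return math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def se_pooled(a, b):
    """SE of the difference in means, assuming both samples share one true variance."""
    return math.sqrt(pooled_variance(a, b) * (1 / len(a) + 1 / len(b)))


# Hypothetical small samples, for illustration only
sample_a = [4.1, 5.0, 5.9, 4.6]
sample_b = [6.2, 7.1, 6.8, 7.5, 6.4, 7.0]

print("SE without pooling:", se_unpooled(sample_a, sample_b))
print("SE with pooling:   ", se_pooled(sample_a, sample_b))
```

When the two samples have the same size, the two formulas give exactly the same SE. They only differ when the sample sizes differ, and the gap grows as the sample variances move apart.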

Let me know if this helped.