Central Limit Theorem: Practical Applications

data_science

#1

Hi All,

Please help me understand where and how we use the central limit theorem in practical applications.

Regards,
Tony


#2

We use statistics because it’s usually not practical to collect all of the data from an entire population. That’s where the central limit theorem comes in[1].

According to the central limit theorem, as the sample size n gets larger, the distribution of the sample mean more closely approximates a normal distribution, regardless of the shape of the population from which the sample was drawn (provided the population has a finite variance). As a general rule of thumb, the approximation is considered adequate when n >= 30, although strongly skewed populations may need larger samples. If the population itself is normally distributed, the sampling distribution of the mean is normal for any sample size.
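To see this for yourself, here is a minimal simulation sketch. It assumes an exponential population (strongly right-skewed) purely for illustration; the helper names `sample_means` and `skewness`, the seed, and the trial count are my own choices, not part of the theorem. The skewness of the sample means should shrink toward 0 (the value for a normal distribution) as n grows.

```python
import random
from statistics import mean, pstdev


def sample_means(n, trials, rng):
    """Return `trials` sample means, each computed from n exponential draws."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(trials)]


def skewness(values):
    """Population skewness: the mean of the cubed standardized values."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(42)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 30)}

for n, skew in skew_by_n.items():
    print(f"n={n:>2}: skewness of sample means = {skew:.2f}")
```

With n = 1 you are just looking at the raw skewed population; by n = 30 the histogram of sample means should already look close to a bell curve.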

As the sample size increases, the sample means cluster more tightly around the population mean. The standard deviation of the sample means (the standard error) equals the population standard deviation divided by SQRT(n), so it decreases as n grows.
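A quick way to check the shrinking spread is to compare the observed standard deviation of many sample means with sigma / SQRT(n). The sketch below assumes a Uniform(0, 1) population, whose standard deviation is SQRT(1/12); the population, seed, and trial count are illustrative assumptions only.

```python
import random
from math import sqrt
from statistics import pstdev

rng = random.Random(7)        # fixed seed so the run is repeatable
population_sd = sqrt(1 / 12)  # standard deviation of Uniform(0, 1)
trials = 4000

results = {}
for n in (4, 16, 64):
    # Draw `trials` samples of size n and keep each sample's mean.
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]
    observed = pstdev(means)
    expected = population_sd / sqrt(n)
    results[n] = (observed, expected)
    print(f"n={n:>2}: observed SE = {observed:.4f}, sigma/sqrt(n) = {expected:.4f}")
```

Each time n is multiplied by 4, the standard error should roughly halve.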

Assume that the systolic blood pressure of 30-year-old males is normally distributed, with an average of 122 mmHg and a standard deviation of 10 mmHg.

A random sample of 16 men from this age group is selected. Calculate the probability that the average blood pressure of the sample will be greater than 125 mmHg.

The population is normally distributed, so the sample means are also normally distributed for any sample size. First, calculate the standard error of the mean: 10 / SQRT(16) = 2.5. Next, calculate the z-score for the sample mean, X Bar = 125.

Take (125 - 122) / 2.5 = 1.2, so z = 1.20. Now look up z = 1.20 in a standard normal (z) reference table.

According to the table, the area between the mean and z = 1.20 is 0.3849. The area under the curve on each side of the mean is 0.5, so the probability that the average blood pressure of the sample will be greater than 125 mmHg is 0.5 - 0.3849 = 0.1151, or 11.51%.
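If you would rather not use a printed table, the same calculation can be sketched in Python with the standard library's `statistics.NormalDist` (available in Python 3.8+). The variable names here are my own; the numbers are the ones from the example above.

```python
from math import sqrt
from statistics import NormalDist

mu = 122.0         # population mean (mmHg)
sigma = 10.0       # population standard deviation (mmHg)
n = 16             # sample size
threshold = 125.0  # sample mean we are asking about (mmHg)

standard_error = sigma / sqrt(n)       # 10 / 4 = 2.5
z = (threshold - mu) / standard_error  # (125 - 122) / 2.5 = 1.2

# P(sample mean > 125) is the upper tail of the standard normal beyond z.
p_greater = 1.0 - NormalDist().cdf(z)

print(f"standard error = {standard_error}")
print(f"z = {z:.2f}")
print(f"P(X Bar > {threshold}) = {p_greater:.4f}")
```

The result agrees with the table lookup to four decimal places (about 0.1151).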

I hope this helped as a practical application.

Source:

  1. https://tinyurl.com/mknjfdf

#3

Hi ItsMeNotU,

Thanks a ton for your valuable inputs.

Do we apply this theory to algorithms for regression and classification problems?

Regards,
Tony


#4

Real quick: If you are asking these questions for a class paper, don’t call it a theory; it’s a Theorem. My professor would take points off the paper for that error.

Yes. Note that the CLT is about sample means, so it applies to data from any population with finite variance, not only to data that is Gaussian. You can run your classification algorithm, then split the results by ranking and apply the Central Limit Theorem (CLT) to each group. As for regression: regression predicts yhat, but you can segment your data along the different dimensions and then apply the CLT within each segment.
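As one possible illustration (a sketch under my own assumptions, not the only way to do it): after running a classifier, record 1 for each correct prediction and 0 for each wrong one, split the records into segments, and use the CLT to put an approximate 95% confidence interval around each segment's mean accuracy. The segment names, accuracy rates, and the helper `mean_confidence_interval` below are all hypothetical, and the outcomes are simulated rather than real results.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Approximate CI for the mean, using the CLT (normal approximation)."""
    n = len(values)
    m = mean(values)
    standard_error = stdev(values) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * standard_error, m + z * standard_error


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, simulated outcomes: 1 = correct prediction, 0 = wrong prediction.
segments = {
    "segment_a": [1 if rng.random() < 0.9 else 0 for _ in range(500)],
    "segment_b": [1 if rng.random() < 0.7 else 0 for _ in range(500)],
}

intervals = {name: mean_confidence_interval(v) for name, v in segments.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: accuracy = {mean(segments[name]):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The individual 0/1 outcomes are nowhere near Gaussian, but with a few hundred rows per segment the mean of each segment is approximately normal, which is what makes the interval reasonable.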