Central Limit Theorem
The central limit theorem relies on the concept of a sampling distribution, which is
the probability distribution of a statistic for a large number of samples taken from a
population.
Suppose that you draw a random sample from a population and calculate
a statistic for the sample, such as the mean.
Now you draw another random sample of the same size, and again calculate
the mean.
You repeat this process many times, and end up with a large number of means,
one for each sample.
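The repeated-sampling procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the exponential population, the sample size of 50, the 2,000 repetitions, and the helper name `sample_means` are all assumptions chosen for the example.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Draw with replacement so the observations in a sample are independent.
        sample = rng.choices(population, k=sample_size)
        means.append(statistics.fmean(sample))
    return means

# Hypothetical, strongly right-skewed population (exponential, mean near 2).
pop_rng = random.Random(42)
population = [pop_rng.expovariate(0.5) for _ in range(10_000)]

means = sample_means(population, sample_size=50, num_samples=2_000)
print("number of sample means:", len(means))
print("mean of sample means:  ", round(statistics.fmean(means), 3))
print("population mean:       ", round(statistics.fmean(population), 3))
```

The list `means` is an empirical stand-in for the sampling distribution of the mean: one value per sample, centered close to the population mean.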
The central limit theorem says that the sampling distribution of the mean will be
approximately normal, as long as the sample size is large enough. Regardless of
whether the population has a normal, Poisson, binomial, or any other distribution with
a finite variance, the sampling distribution of the mean approaches a normal
distribution as the sample size grows.
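We can describe the sampling distribution of the mean using this notation:
$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
Where:
$\bar{X}$ is the sampling distribution of the sample means
$\sim$ means "follows the distribution"
$N$ is the normal distribution, written here with its mean and standard deviation
$\mu$ is the mean of the population
$\sigma$ is the standard deviation of the population
$n$ is the sample size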
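The sample size (n) affects the sampling distribution of the mean in two ways: it
determines how close the distribution is to normal, and how large its standard
deviation is.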
When the sample size is small, the sampling distribution of the mean is sometimes
non-normal. That’s because the normal approximation from the central limit theorem
is only reliable when the sample size is “sufficiently large.” By convention, a sample
size of 30 or more is treated as sufficiently large. This is a rule of thumb rather
than a strict cutoff, and strongly skewed populations may need larger samples.
When n < 30, the central limit theorem can’t be relied on. The sampling
distribution tends to resemble the population’s distribution, so it is only
safe to treat it as normal if the population itself is normal.
When n ≥ 30, the central limit theorem generally applies. The sampling
distribution will approximately follow a normal distribution.
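One way to see the effect of sample size on normality is to measure the skewness of the sample means, which is 0 for a perfectly symmetric distribution such as the normal. The sketch below is illustrative only: it assumes an exponential population with rate 1, and the sample sizes (5, 30, 200), the 5,000 repetitions, and the helper names are choices made for this example.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

def skew_of_sample_means(sample_size, num_samples=5000, seed=7):
    """Simulate sample means from a skewed population and return their skewness."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]
    return skewness(means)

skew_by_n = {n: skew_of_sample_means(n) for n in (5, 30, 200)}
for n, skew in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {skew:.2f}")
```

For this population the skewness of the sample means should shrink toward 0 as n grows, which is the sampling distribution becoming more symmetric and closer to normal. At n = 30 some skew is still visible, which is why the threshold is best read as a guideline.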
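The sample size also affects the standard deviation of the sampling distribution
(often called the standard error), which shrinks in proportion to the square root
of n.
When n is low, the standard deviation of the sampling distribution is high.
There’s a lot of spread in the samples’ means because they aren’t precise
estimates of the population’s mean.
When n is high, the standard deviation of the sampling distribution is low.
There’s not much spread in the samples’ means because they’re precise
estimates of the population’s mean.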
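The relationship between n and the spread of the sample means can be checked numerically. The following sketch assumes a uniform population on the interval [0, 1], whose standard deviation is the square root of 1/12, and compares the observed standard deviation of simulated sample means with the population standard deviation divided by the square root of n. The sample sizes, repetition count, and helper name are assumptions made for illustration.

```python
import math
import random
import statistics

# Standard deviation of a uniform(0, 1) population.
SIGMA = math.sqrt(1 / 12)

def sd_of_sample_means(sample_size, num_samples=4000, seed=1):
    """Simulate sample means and return their standard deviation."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.uniform(0, 1) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)

# Map each sample size to (observed spread, spread predicted by sigma / sqrt(n)).
results = {
    n: (sd_of_sample_means(n), SIGMA / math.sqrt(n))
    for n in (10, 40, 160)
}
for n, (observed, predicted) in results.items():
    print(f"n = {n:>3}: observed = {observed:.4f}, sigma/sqrt(n) = {predicted:.4f}")
```

Each fourfold increase in n should roughly halve the spread of the sample means, so quadrupling the sample size doubles the precision of the estimate.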
The central limit theorem applies under the following conditions:
1. The sample size is sufficiently large. This condition is usually met if the
sample size is n ≥ 30.
2. The samples are independent and identically distributed (i.i.d.) random
variables. This condition is usually met if the sampling is random.