Central Limit Theorem


Point Estimate
Interval Estimate
Confidence Level
Confidence Interval
stats.norm.interval()


Smitesh Tamboli
Population
The population refers to the entire group of items, individuals, or events that are
the subject of observation or study.
Example: Examining the average income of all households in a city, the population
would consist of every household in the city.
Population Mean
The average value of the entire population. It is denoted as $\mu$:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$$

where $X_i$: individual item in the population.
N: population size

Population Standard Deviation
Measures how much the values are spread out across the entire population. It is denoted as $\sigma$:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2}$$

Sample
A sample is a subset of the population that we collect and analyze to make inferences
about the entire population.

Sample Mean
The average value of the sample. It is denoted as $\bar{x}$:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where $x_i$: individual item in the sample.
n: sample size

Sample Standard Deviation S

$$S = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Measures how much the values are spread out in the sample.
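The definitions above can be sketched in a few lines of NumPy. The income figures are made-up values for illustration, and the fixed subset stands in for a random sample. The only detail worth noting is the `ddof` argument of `std`: `ddof=0` divides by N (population), while `ddof=1` divides by n - 1 (sample).

```python
import numpy as np

# Hypothetical population of 8 household incomes (in thousands); illustrative only.
population = np.array([42, 55, 38, 61, 47, 50, 66, 41], dtype=float)

mu = population.mean()           # population mean: sum(X_i) / N
sigma = population.std(ddof=0)   # population standard deviation: divides by N

# A fixed subset standing in for a random sample (assumption for illustration).
sample = population[:4]

x_bar = sample.mean()            # sample mean: sum(x_i) / n
s = sample.std(ddof=1)           # sample standard deviation: divides by n - 1

print(f"population: mean={mu:.2f}, std={sigma:.2f}")
print(f"sample:     mean={x_bar:.2f}, std={s:.2f}")
```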

Inferential Statistics
Inferential statistics is a branch of statistics that involves making inferences or
predictions about a population based on sample data.

It helps to draw conclusions about a population parameter using information
obtained from a sample.

Central Limit Theorem
Let S1, S2,...,Sk be samples of size n, each made up of independent and identically
distributed draws from a population with mean $\mu$ and standard deviation $\sigma$.
Let $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ be the sample means of the samples S1, S2,...,Sk.
According to the CLT, the distribution of $\bar{x}$ is approximately normal with mean $\mu$
and standard deviation $\sigma / \sqrt{n}$ for large values of n.
The central limit theorem states that the distribution of sample means approximates
a normal distribution as the sample size gets larger, regardless of the population’s
distribution.
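A small simulation can make the statement concrete. The sketch below assumes a deliberately non-normal population (exponential with mean 2 and standard deviation 2) and arbitrary choices of n = 50 and k = 10,000; none of these values come from the text. It checks that the sample means centre on the population mean and that their spread is close to $\sigma / \sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed population: exponential with scale 2, so mean = 2 and std = 2 (skewed, not normal).
pop_mean, pop_std = 2.0, 2.0

n = 50        # size of each sample
k = 10_000    # number of samples S1, ..., Sk

# Each row is one sample of size n; take the mean of every row.
samples = rng.exponential(scale=2.0, size=(k, n))
sample_means = samples.mean(axis=1)

print("mean of sample means:", sample_means.mean())
print("std of sample means: ", sample_means.std(ddof=1))
print("sigma / sqrt(n):     ", pop_std / np.sqrt(n))
```

Plotting a histogram of `sample_means` would show the familiar bell shape even though the underlying population is skewed.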

[Figure: samples S1, S2, ..., Sk are drawn from a population with mean $\mu$ and standard deviation $\sigma$, and a sample mean is computed for each. The distribution of the sample means is approximately normal with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.]

Example: Consider an e-commerce platform that collects customer reviews of a particular
product each day. Let's say it collects n reviews each day, computes the average rating
for that day, and repeats this process for multiple days. As per the CLT, when n is large
enough, these daily sample means are approximately normally distributed, and the pattern
becomes easier to see as more days are collected.

With this, we can infer the probability that the true average rating falls within a
certain range based on the properties of the normal distribution.

Estimation of Population Parameters
Estimation is a process used for making inferences about population parameters
based on the sample.

Point Estimate:
A point estimate is a single value calculated from a sample. The sample mean and
sample standard deviation are point estimates of the population mean and standard deviation.

A point estimate provides a specific numerical value intended to represent the true
value of the parameter.

Example: Suppose we are interested in estimating the average height of all students
in a school. We collect data from a random sample of 50 students and calculate the
sample mean height that is 165 centimeters. The sample mean 165cm serves as a
point estimate of the population (or school) mean height.

Interval Estimate:
The accuracy of the point estimate of population parameters is very difficult to
establish, hence we prefer interval estimate over point estimate.

An interval estimate of a population parameter, such as the mean or standard deviation,
is a range of values within which the true parameter value is likely to lie
with a certain probability.

For example, we can say that the average height of all students in a school lies
between 150 cm and 175 cm.

The interval estimate may or may not contain the true parameter values. Thus, we
associate a confidence or probability with the interval estimate that predicts the
probability of finding the true parameter value in the interval.

For example, we may say with 95% confidence that the average height of all the
students lies between 150 cm and 175 cm. However, 95% confidence implies that there
is a 5% chance that the interval does not contain the actual population mean.

Confidence Level:
The confidence level is the proportion of times that the interval estimate, constructed
from repeated random samples, will contain the true population parameter.

It is the probability that the interval estimate will contain the true population
parameter. It is usually written as $(1 - \alpha) \times 100\%$ on the interval estimate of the
population parameter.

When $\alpha$ = 0.05, 95% is the confidence level and 0.95 is the probability that the
interval estimate will contain the population parameter.

$\alpha$ is called the significance level. It represents the probability that the interval
estimate does not contain the true population parameter.
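The "proportion of times" reading of the confidence level can be checked by simulation. The sketch below assumes a normal population of heights with mean 165 cm and standard deviation 10 cm, samples of size 100, and 2,000 repetitions (all illustrative choices). It builds a 95% interval from each sample and counts how often the interval contains the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Assumed population parameters and simulation sizes; illustrative only.
true_mean, true_std = 165.0, 10.0
n, trials = 100, 2000

hits = 0
for _ in range(trials):
    sample = rng.normal(true_mean, true_std, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)            # standard error from the sample
    low, high = stats.norm.interval(0.95, loc=sample.mean(), scale=se)
    hits += int(low <= true_mean <= high)           # did this interval capture the true mean?

coverage = hits / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land near 0.95; the remaining share of intervals, roughly $\alpha$, miss the true mean.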

Confidence Interval:
A confidence interval is the interval estimate of the population parameter, computed
from a sample at a specified confidence level.

[Figure: normal curve in which the central 95% of the area (the confidence level) spans the confidence interval, with 2.5% of the area (half of the significance level) in each tail.]

Example: We want to estimate the average time customers spend on the e-commerce
platform per visit.
Sample data: [12,15,14,10,20,22,18,25,15,19]
We want to calculate a 95% confidence interval for the average time spent on the
platform.

Steps to Calculate Confidence Interval:
Calculate the sample mean $\bar{x}$ and the sample standard deviation $S$
Determine the Standard Error
Standard Error: $SE = S / \sqrt{n}$
Confidence Interval Calculation

stats.norm.interval(0.95, loc=mean, scale=standard_error)
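Putting the steps together for the sample data given in the example, a minimal runnable sketch might look like the following. The variable names are illustrative. The t-based interval at the end is an addition, not part of the original steps; it is included only for comparison because the sample has just 10 observations.

```python
import numpy as np
from scipy import stats

# Sample data from the example: time spent on the platform per visit.
data = np.array([12, 15, 14, 10, 20, 22, 18, 25, 15, 19], dtype=float)
n = len(data)

# Step 1: sample mean and sample standard deviation (ddof=1 divides by n - 1).
mean = data.mean()
std = data.std(ddof=1)

# Step 2: standard error of the mean.
standard_error = std / np.sqrt(n)

# Step 3: 95% confidence interval using the normal distribution.
lower, upper = stats.norm.interval(0.95, loc=mean, scale=standard_error)
print(f"mean = {mean:.2f}, SE = {standard_error:.3f}")
print(f"95% CI (normal): ({lower:.2f}, {upper:.2f})")   # roughly (14.12, 19.88)

# For comparison only (not in the original steps): a t-based interval with n - 1
# degrees of freedom, which is wider for a small sample like this one.
t_lower, t_upper = stats.t.interval(0.95, n - 1, loc=mean, scale=standard_error)
print(f"95% CI (t, df={n - 1}): ({t_lower:.2f}, {t_upper:.2f})")
```

With these ten observations the sample mean is 17.0, and the normal-based interval runs from about 14.12 to 19.88.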
