Central Limit Theorem
Point Estimate
Interval Estimate
Confidence Level
Confidence Interval
stats.norm.interval()
Smitesh Tamboli
Population
The population refers to the entire group of items, individuals, or events that are
the subject of observation or study.
Example: Examining the average income of all households in a city, the population
would consist of every household in the city.
Population Mean
The average value of the entire population. It is denoted as $\mu$.
Sample
A sample is a subset of the population that we collect and analyze to make inferences
about the entire population.
Sample Mean
The average value of the sample:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
where $x_i$: individual item in the sample.
$n$: sample size
Inferential Statistics
Inferential statistics is a branch of statistics that involves making inferences or
predictions about a population based on sample data.
Central Limit Theorem
Let $S_1, S_2, \ldots, S_k$ be samples of size $n$ whose observations are drawn independently
from the same population with mean $\mu$ and standard deviation $\sigma$. Let
$\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ be the sample means of the samples
$S_1, S_2, \ldots, S_k$. According to the CLT, the distribution of the sample means
$\bar{X}_i$ approximately follows a normal distribution with mean $\mu$ and standard
deviation $\sigma / \sqrt{n}$ for large values of $n$.
The central limit theorem states that the distribution of sample means approximates
a normal distribution as the sample size gets larger, regardless of the population’s
distribution.
[Figure: population, labeled with its mean and standard deviation]
With this, we can infer the probability that the true average rating falls within a
certain range based on the properties of the normal distribution.
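The statement of the theorem can be checked with a small simulation. The following is a minimal sketch, not part of the original slides: the exponential population, the sample size `n = 50`, and the number of samples `k = 5000` are all assumed values chosen for illustration. It draws many samples from a clearly non-normal population and compares the spread of the sample means with the predicted $\sigma / \sqrt{n}$.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

n = 50     # size of each sample (assumed value)
k = 5000   # number of samples drawn (assumed value)

# Assumed population: exponential with rate 1, which is strongly skewed.
# Its mean and standard deviation are both 1.
mu, sigma = 1.0, 1.0

# Draw k samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(k)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = sigma / n ** 0.5  # sigma / sqrt(n) from the CLT

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (predicted {predicted_sd:.3f})")
```

Even though the individual observations are skewed, the sample means should cluster around the population mean with a spread close to the predicted value; increasing `n` tightens that spread further.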
Estimation of Population Parameters
Estimation is a process used for making inferences about population parameters
based on the sample.
Point Estimate:
A point estimate is a single, specific value calculated from a sample. The sample mean
and sample standard deviation are point estimates of the population mean and standard
deviation.
A point estimate provides a specific numerical value intended to represent the true
value of the parameter.
Example: Suppose we are interested in estimating the average height of all students
in a school. We collect data from a random sample of 50 students and calculate the
sample mean height, which is 165 centimeters. The sample mean of 165 cm serves as a
point estimate of the population (or school) mean height.
Interval Estimate:
The accuracy of a point estimate of a population parameter is very difficult to
establish, because a single value says nothing about how far it may be from the true
value. Hence we prefer an interval estimate over a point estimate.
For example, we can say that the average height of all students in a school lies
between 150 cm and 175 cm.
The interval estimate may or may not contain the true parameter value. Thus, we
attach a confidence level to the interval estimate, which expresses the probability
of finding the true parameter value in the interval.
For example, we may say with 95% confidence that the average height of all the
students lies between 150 cm and 175 cm. However, 95% confidence implies that there
is a 5% chance that the interval does not contain the actual population mean.
Confidence Level:
The confidence level is the proportion of times that the interval estimate, constructed
from repeated random samples, will contain the true population parameter.
It is the probability that the interval estimate will contain the true population
parameter. It is usually written as $(1 - \alpha) \times 100\%$ on the interval estimate
of the population parameter.
When $\alpha = 0.05$, 95% is the confidence level and 0.95 is the probability that the
interval estimate will contain the population parameter.
$\alpha$ is called the significance level. It represents the probability that the
interval estimate does not contain the true population parameter.
Confidence Interval:
Confidence interval is the interval estimate of the population parameter estimated
from a sample using a specified confidence level.
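For a population mean, one common form of this interval is the normal-based one, which matches the `stats.norm.interval()` call listed in the outline. The expression below is a sketch under that assumption; the symbols $z_{\alpha/2}$ (the standard normal critical value) and $s$ (the sample standard deviation) are notation introduced here rather than taken from the slides.

```latex
\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}},
\qquad z_{0.025} \approx 1.96 \ \text{for a 95\% confidence level}
```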
[Figure: normal curve showing the central 95% confidence level, with a 2.5% significance level in each tail; the confidence interval spans the central region.]
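The repeated-sampling reading of the confidence level can be illustrated with a simulation. This is a minimal sketch under stated assumptions: the population values (`true_mean = 165.0`, `true_sd = 10.0`), the sample size, and the number of trials are hypothetical, and the interval is built as sample mean plus or minus a normal critical value times the standard error. It counts how often such intervals contain the true mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

true_mean, true_sd = 165.0, 10.0  # hypothetical population parameters
n = 50          # sample size (assumed value)
trials = 2000   # number of repeated samples (assumed value)
alpha = 0.05    # significance level, giving a 95% confidence level

# Two-sided critical value of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)

hits = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, true_sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Does this sample's interval contain the true population mean?
    if mean - z * standard_error <= true_mean <= mean + z * standard_error:
        hits += 1

coverage = hits / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land near 0.95. Any single interval either contains the true mean or does not; the confidence level describes how the procedure behaves across many samples.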
Example: We want to estimate the average time customers spend on an e-commerce
platform per visit.
Sample data: [12,15,14,10,20,22,18,25,15,19]
We want to calculate a 95% confidence interval for the average time spent on the
platform.
Steps to Calculate Confidence Interval:
Calculate the sample mean $\bar{x}$ and the sample standard deviation $s$
Determine the Standard Error
Standard Error: $SE = s / \sqrt{n}$
Confidence Interval Calculation
from scipy import stats
stats.norm.interval(0.95, loc=mean, scale=standard_error)
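Putting the steps together for the sample data above gives the sketch below. It assumes the sample standard deviation uses the n - 1 denominator, which the slides do not specify, and the variable names `times`, `std_dev`, `lower`, and `upper` are chosen here for illustration.

```python
import statistics
from scipy import stats

# Sample data from the example: time spent per visit.
times = [12, 15, 14, 10, 20, 22, 18, 25, 15, 19]

n = len(times)
mean = statistics.fmean(times)

# Assumption: sample standard deviation with the n - 1 denominator.
std_dev = statistics.stdev(times)

# Standard error of the mean: s / sqrt(n).
standard_error = std_dev / n ** 0.5

# 95% confidence interval based on the normal distribution.
lower, upper = stats.norm.interval(0.95, loc=mean, scale=standard_error)

print(f"mean = {mean:.2f}, standard error = {standard_error:.3f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

For this data the sample mean is 17.0 and the interval comes out at roughly 14.12 to 19.88.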