Commit ad5f7a3

clarification with z-test and t-test
1 parent 7f898c1 commit ad5f7a3

1 file changed

+19
-0
lines changed

Notes.md

Lines changed: 19 additions & 0 deletions
@@ -58,6 +58,8 @@
5858
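Central Limit Theorem (CLT): the mean (or sum) of many independent random samples tends to follow a normal distribution, whatever the distribution of the population (as long as its variance is finite).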
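A quick simulation can make the CLT concrete. This is only a sketch using Python's standard library; the Uniform(0, 1) population, the sample size of 30, and the helper name `sample_means` are assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(trials)]

means = sample_means(n=30, trials=2000)

# Uniform(0, 1) has mean 0.5 and sd sqrt(1/12). The CLT predicts that the
# sample means cluster around 0.5 with sd close to sqrt(1/12) / sqrt(30),
# even though the population itself is flat, not bell-shaped.
print(statistics.fmean(means), statistics.stdev(means))
```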
5959
Tests and hypotheses: take a random sample of size n from the population. Because the sample mean is approximately normally distributed, the sample's mean, standard deviation, SE, etc. let us estimate the population parameters.
6060

61+
https://www.khanacademy.org/math/probability/statistics-inferential/hypothesis-testing/v/z-statistics-vs--t-statistics
62+
6163
If the observed value is normally distributed, then under the null hypothesis, the z-statistic has a standard normal distribution
6264
So the further the z-statistic is from zero, the less likely such an extreme value is under the standard normal distribution, and the stronger the evidence that the null hypothesis is false.
6365
The P-value of the test is the chance of getting a test statistic as extreme as, or more extreme than, the one we observed (the z-value), if the null hypothesis is true.
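The p-value definition can be sketched with `statistics.NormalDist` from the Python standard library. The function name `two_sided_p_value` is hypothetical, and a two-sided test is assumed; a one-sided test would use only one tail.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Chance of a standard normal value at least as far from 0 as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(two_sided_p_value(1.96))  # close to 0.05
print(two_sided_p_value(3.0))   # much smaller: stronger evidence against the null
```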
@@ -67,6 +69,19 @@
6769
About 95% of the population lies within mean ± 2 sd, which is why ± 2 sd corresponds to a 95% confidence level.
6870
SE = sigma / sqrt(n); the 95% margin of error is about 2*SE.
6971

72+
The CLT says we can infer population parameters from a sample, with some probability. That probability is the confidence level.
73+
74+
So we draw a sample and compute x-bar (the sample mean) and s (the sample standard deviation).
75+
The population mean u can then be inferred through a z-test (sample size > 30, with s standing in for sigma) or a t-test (smaller samples).
76+
77+
Z = (x-bar - u) / (s / sqrt(n))
78+
79+
z is in (−1, 1) with probability approximately 0.68
80+
z is in (−2, 2) with probability approximately 0.95
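The z formula and the two probabilities above can be checked numerically. This is a minimal sketch with the standard library; `z_statistic` and `prob_within` are hypothetical helper names, and the numbers passed to `z_statistic` are made up for illustration.

```python
import math
from statistics import NormalDist

def z_statistic(x_bar, mu, s, n):
    """z = (x-bar - u) / (s / sqrt(n)): distance from u in standard errors."""
    return (x_bar - mu) / (s / math.sqrt(n))

def prob_within(k):
    """Probability that a standard normal value falls in (-k, k)."""
    std = NormalDist()
    return std.cdf(k) - std.cdf(-k)

print(prob_within(1))  # about 0.68
print(prob_within(2))  # about 0.95

# Illustrative numbers only: sample mean 70, hypothesized u = 68, s = 5, n = 25.
print(z_statistic(70, 68, 5, 25))
```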
81+
82+
So if you widen z to ±2, you can assert with high probability (95%) that u lies within x-bar ± 2*SE. The price is that the range may be coarse.
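The trade-off between confidence and width can be seen by computing the interval for several confidence levels. This sketch assumes the normal (z) approximation; `mean_confidence_interval` is a hypothetical helper and the sample values are invented for illustration.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Interval x-bar +/- z* * s / sqrt(n), using the normal approximation."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Illustrative sample: x-bar = 68, s = 5, n = 100.
print(mean_confidence_interval(68, 5, 100, confidence=0.68))  # narrower, less certain
print(mean_confidence_interval(68, 5, 100, confidence=0.95))  # wider, more certain
```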
83+
84+
7085
Ex: U.S. lifespans have mean 68 and sd (sigma) 5, so about 95% of people live to between 58 and 78.
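The lifespan example can be verified directly, assuming lifespans are normally distributed (an assumption of the example, not a claim about real data):

```python
from statistics import NormalDist

lifespan = NormalDist(mu=68, sigma=5)

# Range mean +/- 2 sd.
low, high = 68 - 2 * 5, 68 + 2 * 5

# Share of the population between 58 and 78 under the normal assumption.
share = lifespan.cdf(high) - lifespan.cdf(low)
print(low, high, share)  # share is roughly 0.95
```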
7186

7287
z= ( observed − expected ) / standard_error
@@ -122,6 +137,9 @@
122137
A P-value of less than 5% is conventionally called statistically significant: the difference is unlikely to be explained by chance alone, so we reject the null hypothesis.
123138
A P-value of more than 5% is not statistically significant: the difference could plausibly be explained by chance. We don't have enough evidence to reject the null hypothesis.
124139

140+
Comparing p-values from t and z
141+
One may be tempted to think that the confidence interval based on the t statistic is always wider than the one based on the z statistic, since t* > z* always holds. However, the standard error for the t interval depends on s, which varies from sample to sample and can sometimes be small enough to offset the difference.
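A small numeric sketch of that offsetting effect follows. The critical values are the standard table values for 95% confidence (z* = 1.960, and t* = 2.262 for 9 degrees of freedom); the values of sigma and s are hypothetical, picked only to show both outcomes.

```python
import math

Z_STAR_95 = 1.960      # standard normal critical value, 95% two-sided
T_STAR_95_DF9 = 2.262  # Student t critical value, 95% two-sided, df = 9 (n = 10)

def half_width(critical, sd, n):
    """Half-width of a confidence interval: critical value times SE."""
    return critical * sd / math.sqrt(n)

n = 10
sigma = 5.0    # hypothetical known population sd, used by the z interval
s_small = 4.0  # hypothetical sample sd that happens to come out low

z_half = half_width(Z_STAR_95, sigma, n)
t_half_same_sd = half_width(T_STAR_95_DF9, sigma, n)     # s equal to sigma: t is wider
t_half_small_s = half_width(T_STAR_95_DF9, s_small, n)   # small s: t is narrower

print(z_half, t_half_same_sd, t_half_small_s)
```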
142+
125143
## Confidence Level: how close the sample estimate is to the population value. z = (sample - population) / SE stays near 0 with the stated confidence, e.g. Prob(-1 < (p - xp)/SE < 1) = 0.68
126144
Claim: 42 of 100 people surveyed like it, with a margin of error of 9%. This is a 95% CI.
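As a rough cross-check, the margin of error for a proportion can be computed with the usual normal approximation, z* * sqrt(p(1-p)/n). This is a sketch under that assumption; `proportion_margin` is a hypothetical helper, and the survey behind the claim may have used a different method or rounding.

```python
import math

def proportion_margin(successes, n, z_star=1.96):
    """Margin of error for a sample proportion (normal approximation)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return z_star * se

margin = proportion_margin(42, 100)
print(margin)  # a little under 0.10, the same order as the quoted 9%
```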
127145
1. The sample mean approximates the true mean, given the sample size n, the population std sigma, and the CI.
@@ -136,6 +154,7 @@
136154
no_belt = c(65963, 4000, 2642, 303)
137155
chisq.test(data.frame(yes_belt, no_belt))  # R names cannot contain "-"; assumes the yes-belt vector is likewise named yes_belt
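What a chi-square test of independence computes can be sketched by hand: compare each observed count with the count expected if rows and columns were independent. The function `chi_square_statistic` and the toy table are hypothetical and are not the seat-belt data; the sketch returns only the statistic and degrees of freedom, not a p-value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Toy 3x2 table, for illustration only.
stat, dof = chi_square_statistic([[10, 20], [20, 10], [15, 15]])
print(stat, dof)
```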
138156

157+
139158
## ANOVA
140159
F-distribution: the distribution of the ratio of two variances, i.e. the ratio of two chi-square variables, each divided by its degrees of freedom. Used in ANOVA.
141160
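Analysis of variance: a hypothesis test that compares group means. With two samples it is equivalent to the pooled two-sample t-test (F = t^2) and gives a p-value for rejecting the null hypothesis.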
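The link between ANOVA and the t-test for two samples can be checked numerically: the one-way F statistic should equal the square of the pooled two-sample t statistic. This is a minimal sketch; `one_way_f`, `pooled_t`, and the two small samples are hypothetical, chosen only for illustration.

```python
import math
from statistics import fmean

def one_way_f(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    ss = sum((x - fmean(a)) ** 2 for x in a) + sum((x - fmean(b)) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (fmean(a) - fmean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Toy samples, for illustration only.
a = [4.0, 5.0, 6.0, 7.0]
b = [6.0, 7.0, 8.0, 9.0]
print(one_way_f([a, b]), pooled_t(a, b) ** 2)  # the two values agree
```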

0 commit comments
