Statistics in Detail


https://www.statlect.com/fundamentals-of-statistics/

MarinStatsLectures-R Programming & Statistics YouTube Channel
https://www.youtube.com/user/marinstatlectures/playlists

https://www.youtube.com/playlist?list=PLqzoL9-eJTNBZDG8jaNuhap1C9q6VHyVa

StatQuest YouTube Channel

jamal505 YouTube Channel (Arabic)


https://www.youtube.com/user/jamal191719/playlists

Brandon Foltz
https://www.youtube.com/user/BCFoltz/playlists

Set theory: Union and Intersection - Mathematics - Probability and Statistics - TU Delft
https://www.youtube.com/watch?v=envnkifm9lU&list=PL3sV3oXFujMguuuKpSyr2nK4lx37FiDMD

Hypothesis Testing

Power And Effect Size

Central Tendency

Z-Scores and Z-tests

Variability & Z-test

Central Limit Theorem and Sampling Distribution

Estimating population parameters

One Sample t-test


Independent Samples T-test

Paired t-tests

One-way ANOVA

Post Hoc Tests

Confidence interval

Mixed ANOVA

Correlation

Regression

Chi-square

Wilcoxon One-Sample Signed Ranks Test

Wilcoxon Test
01 (Sec. 6.1 - 6.3.1) Distributions derived from Normal Distribution
02 (Sec. 6.3.2) Sampling from Normal Distribution
03 (Sec. 8.1 - 8.4) Estimation of Parameters and Fitting of Probability Distributions
MATLAB code:
04 (Sec. 8.3) The Method of Maximum Likelihood
Theta is unknown: I know the form of the distribution (its PDF), but I do not know its parameters.
At 49:00 min
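As a minimal sketch of this setting, assume the data come from a normal distribution whose mean and standard deviation are unknown. The normal model, the sample values, and the function names below are illustrative assumptions, not taken from the lecture. Under that model the likelihood is maximized by the sample mean and by the standard deviation computed with n in the denominator.

```python
import math

def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of independent normal observations for given parameters."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -n * math.log(sigma * math.sqrt(2 * math.pi)) - ss / (2 * sigma ** 2)

def normal_mle(data):
    """Closed-form maximum likelihood estimates (mu_hat, sigma_hat)."""
    n = len(data)
    mu_hat = sum(data) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat

sample = [4.1, 5.3, 4.8, 6.0, 5.2]  # made-up observations
mu_hat, sigma_hat = normal_mle(sample)
best = normal_log_likelihood(sample, mu_hat, sigma_hat)

# Nudging either parameter away from the estimate lowers the log-likelihood.
print(best >= normal_log_likelihood(sample, mu_hat + 0.1, sigma_hat))
print(best >= normal_log_likelihood(sample, mu_hat, sigma_hat * 1.1))
```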
Descriptive Statistics (Udacity)

Lesson 1:

Intro to Research Methods

How did BBC measure memory?


25 Causal inference
Double-blind experiments
Lesson 2: Intro to Research Methods

Problem Set Numbering


You may notice as you go through the problem set "PS 1a: Intro to
Research Methods" and other problem sets that the problem numbers
in the progress bar are not always consecutive.

For lessons with two problem sets, if one problem set appears to be
"missing" problems, those problems will appear in the other problem
set for the lesson. For example, PS 1a skips from Problem 3 to 6
because Problems 4 and 5 are in PS 1b.

Problems with similar numbers usually cover similar topics. For
example, Problems 1-3 from PS 1a and 4-5 from PS 1b all cover
terminology for discussing populations and samples.

Remember, a population is the entire group of everyone (or everything)
you're interested in, while a sample is a smaller group selected from
the population. See this page for more information.
A construct is a variable that is not directly observable or
measurable. How might you measure someone's personality?
Operational definition

The description of a construct that we settle on, and that lets us
measure that construct in the real world, is called its operational
definition.

Below are some construct and operational definition pairs that can
help you understand the concepts through examples.
A variable is a value that may change or differ between individuals
in an experiment. The moon's circumference will always have the same
value, so it is called a constant.
A variable is a value that may change or differ between individuals
in an experiment. The number of seconds in a minute does not change,
so it is called a constant.

Keep trying! At least one of your answers is more like a scientific
fact than a hypothesis as described in this lesson.

Extraneous lurking variables


Sample statistic
Population parameter
Sampling errors
Remember, the sampling error is the difference between the
population parameter and sample statistic. What is the sampling
error in this example?
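A small sketch of the calculation, using made-up scores because the numbers from the original example are not reproduced in these notes. The variable names and data are hypothetical. The sign depends on the order of subtraction, and often only the size of the gap matters.

```python
from statistics import mean

population = [72, 65, 80, 91, 58, 77, 69, 84]  # hypothetical scores for everyone
sample = [72, 80, 58, 84]                      # hypothetical subset actually measured

population_parameter = mean(population)  # mu, the population mean
sample_statistic = mean(sample)          # x-bar, the sample mean

# Sampling error: how far the sample statistic lands from the parameter.
sampling_error = sample_statistic - population_parameter
print(population_parameter, sample_statistic, sampling_error)
```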
We know there is a correlation, or association, between playing
violent video games and aggression in adolescent males. However, we
don't know if this is because naturally aggressive males are more
likely to play violent video games, because playing violent video
games causes adolescent males to become more aggressive, or because
of some other explanation.
Remember, a construct is a variable that is not directly observable
or measurable. Do you think aggression is directly measurable on its
own, or would we need an operational definition for it?
Non-response bias is a problem in many surveys, and you should be
especially concerned about it if your survey has a particularly low
response rate.
In general when conducting a survey, it's always possible the people
who completed your survey did not respond accurately.

In an experimental study, unlike an observational study, it is
valid to make causal conclusions since randomization minimizes
the effect of lurking variables.

What conclusions can you draw from the results of this
experimental study?

Good job!

Note that while the operational definition of quality of sleep
is the 10-point scale, the way we actually measure success
comes from the difference in quality between groups.
Lesson 3 PS 1b: Additional Practice (Optional)

A variable is a characteristic that describes individual data points.
In this example, the change in weight per student is a variable.
Taking the average gives one number that describes the group of
students. What would you call such a number?
A constant is something in your experiment that does not change.
For example, if you test people's memory, and you make sure that
everyone takes the test at the same time of day, then time of day is
a constant in your experiment. The average change in weight isn't
constant here because it could have had a different value if the
results of the experiment had been different.
Good answer.

If you consider Dr. Friedman's value in the context of all
freshmen at all universities, her value is a sample statistic.
However, if she is concerned only about the information from
her university, then it can instead be considered a population
parameter since she has information from every freshman.
A construct is a variable that is not directly observable or
measurable. Would there be any difficulty in measuring someone's age
in years?
You got it right!

The researcher found a correlation between listening to
classical music and performance in school, then mistakenly
concluded that listening to classical music caused better
performance in school. Correlation does not imply causation!
This is because lurking variables can result in a correlation
between two variables that are not causally related.
This wasn't a controlled experiment, so you can't conclude that
any trend you see is the result of one variable causing another.
You got it right! With such a small dataset and a weak trend, it's
hard to draw conclusions from this data.
Predictor variable is also acceptable, though that term tends to be
used when we are talking about a factor that we do not manipulate,
but may help our predictions.
33 participants were exposed to bright light therapy, so if there
were 100 total participants, the proportion would be 0.33. However,
there were only 95 total participants.
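The arithmetic behind this feedback, as a short sketch. The variable names are illustrative.

```python
exposed = 33             # participants who received bright light therapy
total_participants = 95  # everyone in the study, not 100

proportion = exposed / total_participants
print(round(proportion, 3))  # 0.347
```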
Remember, the independent variable is the variable that is different
between the control and experiment group, and the dependent variable
is the variable that you measure to determine the success of the
experiment.
Lesson 4: Visualizing Data
Good job! We could get a rough answer, but to get an exact answer,
we'd need to know how many of the students in the tallest bin are
older than 20, and how many are younger than 20.
Lesson 5:

Good job! 105 - 15 = 90. This is 10 bins of size 9 between 15 and
105.
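The same bin-size calculation as a sketch, which also lists the resulting bin edges. The variable names are illustrative.

```python
low, high, n_bins = 15, 105, 10

bin_size = (high - low) / n_bins  # (105 - 15) / 10
# Eleven edges delimit ten equal-width bins.
edges = [low + i * bin_size for i in range(n_bins + 1)]

print(bin_size)
print(edges)
```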
Good job! This one is tricky. Remember that each bin should be the
same size for every bar in the histogram.
Good job! As we make the bin size bigger, more values will fall
inside that bin.

Histograms should have a numerical x-axis. If the x-axis is
categorical, the graph is called a bar graph instead, and there is
usually some space between each bar to indicate that the x-axis is
not numerical.
Lesson 6:
Remember, "n", or the sample size, should be the total number of
people listed. In this case, that means you'll need to sum up all the
frequencies.
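A minimal sketch of that step with a made-up frequency table. The category labels and counts are hypothetical and are not the ones from the problem.

```python
# Hypothetical frequency table: category -> number of people
frequencies = {"A": 12, "B": 7, "C": 21, "D": 10}

n = sum(frequencies.values())  # sample size = sum of all frequencies

# Relative frequency of each category as a proportion of n
relative = {label: count / n for label, count in frequencies.items()}

print(n)
print(relative)
```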
Note that since 1975 falls in the middle of the 1971-1980 bin, any
answer between 9.26% and 26% could be the possible true percentage.
Lesson 7: Google Spreadsheet
Create a spreadsheet in Google Drive

Lesson 8: Central Tendency

Mode, median, and average (mean)
Important:
Although there is one value for which there is a maximum frequency,
in terms of describing the 'modality' of a distribution, we want to
consider the number of local peaks. Are there any other local peaks
in this distribution?
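One rough way to count local peaks in a list of bar heights is sketched below. It is a simplification under stated assumptions: a bar counts as a peak only if it is strictly taller than its neighbours, so flat-topped peaks are missed, and the example frequencies are made up.

```python
def count_local_peaks(frequencies):
    """Count bars strictly taller than each neighbouring bar."""
    peaks = 0
    last = len(frequencies) - 1
    for i, height in enumerate(frequencies):
        # End bars have only one neighbour; treat the missing side as lower.
        left = frequencies[i - 1] if i > 0 else float("-inf")
        right = frequencies[i + 1] if i < last else float("-inf")
        if height > left and height > right:
            peaks += 1
    return peaks

unimodal = [1, 3, 6, 3, 1]
bimodal = [2, 7, 3, 1, 5, 9, 4]  # global maximum is 9, but 7 is also a local peak

print(count_local_peaks(unimodal), count_local_peaks(bimodal))
```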
Lesson 9:
https://github.com/Anwarvic/Intro-to-Descriptive-Statistics--Udacity/tree/master/Lesson%2009
Lesson 10:
Very important
If you add up the frequencies, you'll see that the first three bars
have less than half of the values, but the first four bars have more
than half. That means the median should be somewhere in the middle of
the fourth bar, greater than 25 and less than 30.
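The cumulative-count reasoning can be sketched as follows. The bin edges and frequencies are hypothetical, chosen only so that the fourth bar spans 25 to 30 as in the feedback above. If the cumulative count ever equals exactly half, the median sits on a bin boundary, a case this sketch does not treat specially.

```python
def median_bin(edges, frequencies):
    """Return (low, high) of the bin where the cumulative count first exceeds n/2."""
    half = sum(frequencies) / 2
    cumulative = 0
    for i, count in enumerate(frequencies):
        cumulative += count
        if cumulative > half:
            return edges[i], edges[i + 1]
    raise ValueError("histogram has no observations")

edges = [10, 15, 20, 25, 30, 35, 40]  # hypothetical bin edges
frequencies = [3, 6, 9, 12, 7, 3]     # hypothetical bar heights, n = 40

# First three bars hold 18 values (< 20); first four hold 30 (> 20).
print(median_bin(edges, frequencies))
```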
Lesson 11: Variability
The answer for the mode is incorrect. (Calculate the mode using the
bin size shown here - that is, the mode should be the salary range
for the tallest bar.)
Range is the difference between the maximum value and
the minimum value observed.
Q1:
The first quartile is the point where 25% of the distribution is below that
point, and 75% of the data is above that point.
The IQR is largely unaffected by outliers.
The median is always between Q1 and Q3, but remember that the mean is
sensitive to outliers. Watch the solution video for an example.
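A sketch of these measures on made-up data with one extreme value. Quartile conventions differ between textbooks and tools. This version takes Q1 and Q3 as the medians of the lower and upper halves, which is an assumption about the convention, and the salary figures are hypothetical.

```python
from statistics import mean, median

def spread_summary(values):
    """Range, quartiles, and IQR using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # values below the median position
    upper = data[(n + 1) // 2 :]  # values above the median position
    q1, q3 = median(lower), median(upper)
    return {
        "range": data[-1] - data[0],
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "iqr": q3 - q1,
    }

salaries = [30, 32, 35, 38, 40, 41, 45, 300]  # hypothetical, with one outlier
summary = spread_summary(salaries)
average = mean(salaries)

print(summary)
print(average)  # pulled above Q3 by the single outlier
```

With this data the median stays between Q1 and Q3, while the mean ends up outside that interval.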
3 options to measure variability
Lesson 12: Variability
Full question on Google Spreadsheet to calculate everything

That's the right proportion!

Thanks for thinking about whether this is what you would
expect. We'll be going into more detail on this question later
in the course.

That doesn't look right; double check your work.

Bessel's Correction means calculating the sample variance as SS
/ (n-1) instead of SS / n. Then the sample standard deviation
is the square root of the sample variance. (SS here stands for
"sum of squares", or the total squared deviation.)
Lesson 13: Standardizing
To proportions
First try finding how many fewer Facebook friends Katie has than the
mean. Then, divide that number by the standard deviation to figure out
how many standard deviations away from the mean Katie is.
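The two steps amount to a z-score calculation, sketched here with made-up numbers. The friend count, mean, and standard deviation are hypothetical, since the values from the problem are not reproduced in these notes.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical numbers: Katie has 100 friends, the mean is 190, the SD is 36.
z = z_score(value=100, mean=190, sd=36)
print(z)  # negative because Katie is below the mean

# With a smaller SD, the same gap is more standard deviations away.
z_tighter = z_score(value=100, mean=190, sd=18)
print(z_tighter)
```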
Z-Score
Lesson 14: Standardizing
The smaller the standard deviation is, the more standard deviations
above the mean your score will be.
Lesson 15:
Lesson 16: Normal Distribution
0.16
Round your z-score to two decimal places and use the exact value(s) shown
in the z-table.
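A z-table lookup can be approximated in code with the standard normal CDF. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8+). The helper name is illustrative, and a printed table rounds the result to four decimals.

```python
from statistics import NormalDist

def proportion_below(z):
    """Area under the standard normal curve to the left of z, rounded as for a z-table."""
    z_rounded = round(z, 2)  # z-tables are indexed to two decimal places
    return NormalDist().cdf(z_rounded)

print(round(proportion_below(0), 4))     # 0.5
print(round(proportion_below(1.0), 4))   # 0.8413
print(round(proportion_below(-1.0), 4))  # 0.1587
```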
Lesson 17: Normal Distributions
Very important Questions
Lesson 18:
The "population" means all possible outcomes when rolling the tetrahedral
die.
Central limit theorem:
Very important
Thanks for thinking about what you expect! The sampling distribution
will actually be skinnier for n=5. Keep watching to find out why.
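To see why, the sketch below enumerates every equally likely outcome of n rolls of a tetrahedral die (faces 1 to 4) and measures the spread of the resulting sample means. The comparison size n=2 is an assumption made for illustration. The spread matches the population standard deviation divided by the square root of n.

```python
import math
from itertools import product
from statistics import pstdev

faces = [1, 2, 3, 4]  # the population: all outcomes of one roll

def sampling_distribution_sd(n):
    """SD of the sample mean over all 4**n equally likely sequences of n rolls."""
    means = [sum(rolls) / n for rolls in product(faces, repeat=n)]
    return pstdev(means)

sigma = pstdev(faces)  # population standard deviation
sd_n2 = sampling_distribution_sd(2)
sd_n5 = sampling_distribution_sd(5)

print(sigma, sd_n2, sd_n5)
print(sigma / math.sqrt(2), sigma / math.sqrt(5))  # sigma / sqrt(n) for comparison
```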
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
Revision video 25 <> mean about CLT
High level question : analysis
Lesson 19: Sampling distribution Questions
Good job! We'll describe the location of the sample mean by
calculating how many standard errors it is away from the center of
the sampling distribution. That will give us a z-score for our sample
mean.
For two samples of the same size, will the z-score be closer to 0 if
the mean is closer to or farther from μ?
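A sketch of the z-score for a sample mean, using the standard error σ/√n as the yardstick. The population values and sample means are hypothetical.

```python
import math

def z_for_sample_mean(sample_mean, mu, sigma, n):
    """Standard errors between a sample mean and the population mean mu."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

# Hypothetical population: mu = 100, sigma = 15; both samples have n = 25.
z_close = z_for_sample_mean(sample_mean=103, mu=100, sigma=15, n=25)
z_far = z_for_sample_mean(sample_mean=109, mu=100, sigma=15, n=25)

print(z_close, z_far)  # the mean nearer to mu gives the z-score nearer to 0
```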
Final Project >> Python
https://github.com/karimkmafifi/Intro-to-Descriptive-Statistics-Udacity-Final-Project

Inferential Statistics (Udacity)


Analyticsvidhya

https://www.analyticsvidhya.com/blog/tag/descriptive-statistics/

Central limit theorem
