RMDS 2023 Q1 Q&Alecture Unit 19 20 PM_canvas


Q&A LECTURE - UNIT 19, 20

Resit Test 1 – Tomorrow


• Remindo test: including a scientific calculator
• Chrome books of the UT
• 40 MC questions
• Therm 2
• 13:45-15:45 hrs.

• Test 1: Unit 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 23


Test 2 & R test – Friday 3 Nov 2023 – 13:35-16:45 hrs
• Remindo tests
• Chrome books of the UT
• Test 2:
• 40 MC questions
• Unit 13, 24, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22
• including a scientific calculator
• R-test: similar to R practice test:

• NH205: 65 persons: surnames starting with A - G
• NH207: 60 persons: surnames starting with H - M
• NH209: 75 persons: surnames starting with N - Z
What else today..
• Unit 19 - Sampling:
Probability vs. non-probability sampling
“Could you explain the last question of the assignment of unit 19? (The researcher adapts his plans and decides to focus on the wide variety of first year psychology students at the University of Twente...)”

• Unit 20 – Certainty about the mean:
Population, sample, sampling distribution (of the mean)
→ assignment Q7 (start with Q5)
[Figure: from the theoretical research question to the data. The variables (v1, v2) follow from I. Conceptualization, II. Operationalization and III. Measurement; the units (unit1 to unit4) are drawn from all units by sampling.]
Sampling

• Before sampling, you need a clear definition of the population of interest (target population)
→ Units of Analysis in your RQ

• A population refers to all members of a certain group of people/companies/countries
Probability sampling

Selected randomly from the population → individuals have an equal or known chance to participate in the study.
• Goal: your sample represents the (target) population well (external validity)
• A sampling frame (a list of all units in the population) is necessary
Types of probability sampling

1. Simple random sampling (SRS)
2. Stratified random sampling
3. Cluster random sampling
4. Multi-stage random sampling
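
The first two types can be sketched in a few lines of Python. This is only an illustration under assumptions: the sampling frame, the `prior_education` strata and their labels are made up, and the 10% sampling fraction is an arbitrary choice.

```python
import random

# Hypothetical sampling frame: one record per unit in the target population.
# The stratum labels below are invented for illustration only.
frame_rng = random.Random(1)
frame = [
    {"id": i, "prior_education": frame_rng.choice(["A", "B", "C"])}
    for i in range(200)
]

def simple_random_sample(frame, n, rng):
    """SRS: every unit in the frame has the same chance of being selected."""
    return rng.sample(frame, n)

def stratified_random_sample(frame, fraction, rng):
    """Stratified: draw an SRS of the same fraction within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit["prior_education"], []).append(unit)
    sample = []
    for units in strata.values():
        k = max(1, round(fraction * len(units)))  # at least one unit per stratum
        sample.extend(rng.sample(units, k))
    return sample

rng = random.Random(42)
srs = simple_random_sample(frame, 20, rng)
stratified = stratified_random_sample(frame, 0.10, rng)
print(len(srs), len(stratified))
```

In the stratified version every subgroup is guaranteed to appear in the sample, which an SRS does not guarantee for small subgroups.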
Non-probability sampling

Selected non-randomly: individuals have an unequal or unknown chance to participate in the study.

• No sampling frame available
• Sample is not representative of the population
• Goal: Inductive reasoning (explorative empirical research)
Types of non-probability sampling

Selected non-randomly: individuals have an unequal or unknown chance to participate in the study.

1. Convenience
2. Purposive
3. Systematic
4. Quota (purposive, with fixed final size)
5. Snowball (social network of respondents)
Today..
Could you explain the last question of unit 19?
The researcher adapts his plans and decides to focus on the wide variety of first year psychology students at the University of Twente. Because of their diverse educational careers so far, the quality of the psychology study might be improved by offering a more personalized study program in BA-1.
The researcher wants to explore whether the present program fits the diverse personal needs of the BA-1 students. He wants to cover a wide variety of BA-1 students by defining subgroups.
• Which type of sampling method would you advise the researcher to use?
• Describe the sampling method in more detail.
Today..
• Unit 19 - Sampling:
Probability vs. non-probability sampling

• Unit 20 – Certainty about the mean:
Population, sample, sampling distribution (of the mean)
→ assignment Q5 & Q7
Sample distribution vs. sampling distribution
Which statement(s) is/are correct?
A. A sample distribution displays which values of a variable you have obtained after drawing one sample of a given size from a population.
B. A sampling distribution displays the values of a statistic (e.g. mean, SD, variance) obtained by repeatedly drawing samples of a given size from a population.

1. A and B are correct
2. Only A is correct
3. Only B is correct
4. A and B are not correct
5. I don’t know
Q5: Average height in the population?
• website http://onlinestatbook.com/stat_sim/sampling_dist/

• Population: Height is normally distributed, ranges 0-32
[Figure: the four graphs of the simulation]
• 1st graph: population distribution
• 2nd graph: sample distribution = distribution of height (after SRS), n=5
• 3rd graph: sampling distribution of the mean height of 5 persons
• 4th graph: sampling distribution of “none” → change to the sample “mean” of 25 persons
Q5: Average height in the population?
• website http://onlinestatbook.com/stat_sim/sampling_dist/

• Population: Height is normally distributed, ranges 0-32

1. Session: “animated” (one by one)
• → 3rd graph: draw 1 sample of n=2 persons and determine their mean height
• → 4th graph: draw 1 sample of n=25 persons and determine their mean height
Q5: Height of a population
• website http://onlinestatbook.com/stat_sim/sampling_dist/

• Height is normally distributed, range 0-32

Q5a. Session: “10,000”
• → 3rd graph: draw 10,000 samples of n=2 persons and determine their mean height
• → 4th graph: draw 10,000 samples of n=25 persons and determine their mean height
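
The same experiment can be reproduced offline with a short Python sketch. The population parameters are an assumption (a normal distribution with mean 16 and SD 5, chosen to fit the 0-32 range mentioned above); the website's exact settings may differ, and the truncation to 0-32 is ignored here.

```python
import random
import statistics

# Assumed population: height ~ Normal(mean=16, SD=5). These values are an
# assumption for illustration, not taken from the slides.
POP_MEAN, POP_SD = 16.0, 5.0
rng = random.Random(2023)

def sampling_distribution_of_mean(n, repetitions=10_000):
    """Draw `repetitions` samples of size n and return the list of sample means."""
    return [
        statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(n)])
        for _ in range(repetitions)
    ]

means_n2 = sampling_distribution_of_mean(2)    # corresponds to the 3rd graph
means_n25 = sampling_distribution_of_mean(25)  # corresponds to the 4th graph

for label, means in (("n=2", means_n2), ("n=25", means_n25)):
    print(label,
          "mean of sample means:", round(statistics.fmean(means), 2),
          "SD of sample means:", round(statistics.stdev(means), 2))
```

Both lists of sample means should centre on the population mean, while the spread for n=25 should be clearly smaller than for n=2.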

Q5b:
Compare the mean and SD of both sampling distributions of the (sample) mean. Are they different?

• The means are almost equal
• SD for n=25 < SD for n=2
• “With a sample size of 25 you are more confident about the value of the population mean than with a sample size of 2” (the range of the graph for n=2 is larger)
Q7/Q5c: What is the formula for the SD of the sampling distribution of the (sample) mean?

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$ = SEM = Standard Error of the Mean

Here $\sigma$ is the standard deviation of the population (!) and $n$ is the sample size.


Q5d: Compute the SD of the sampling distribution of the (sample) mean. What is the SEM?
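
A minimal worked computation, assuming a population SD of 5 (an assumption for illustration; use the SD shown for the population on the website) and the two sample sizes from Q5a:

```python
import math

def standard_error_of_mean(population_sd, n):
    """SEM = population SD divided by the square root of the sample size."""
    return population_sd / math.sqrt(n)

POP_SD = 5.0  # assumed population SD, not given in the slides

sem_n2 = standard_error_of_mean(POP_SD, 2)
sem_n25 = standard_error_of_mean(POP_SD, 25)
print(f"SEM for n=2:  {sem_n2:.2f}")   # 3.54
print(f"SEM for n=25: {sem_n25:.2f}")  # 1.00
```

Under this assumption the SEM shrinks from about 3.54 to 1.00 when the sample size grows from 2 to 25, which matches the narrower 4th graph.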
Why should you memorize this SEM formula?

→ Inference
Why should you memorize this SEM formula?
The common situation in research ...
• RQ: What is the average height of all UT students?
• The distribution of height in the population is unknown
→ $\mu$ and $\sigma$ are unknown
• You just have one sample, randomly drawn from the population (SRS)
• $\bar{x}$ and $s$ can be calculated easily.

$s$ = standard deviation of a sample
VARIANCE / STANDARD DEVIATION
→ “Variance is the average of the squared differences from the mean”
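
The definition can be followed step by step in Python. The five heights are hypothetical. Note one detail the quote leaves out: for a sample, the sum of squared differences is usually divided by n - 1 rather than n, which is what `statistics.stdev` does.

```python
import math
import statistics

heights = [1.62, 1.75, 1.80, 1.68, 1.71]  # hypothetical sample, in metres

n = len(heights)
mean = sum(heights) / n
squared_diffs = [(x - mean) ** 2 for x in heights]

# "Average of the squared differences": divide by n (population formula).
population_style_variance = sum(squared_diffs) / n
# Sample variance: divide by n - 1 (the usual estimator for one sample).
sample_variance = sum(squared_diffs) / (n - 1)
s = math.sqrt(sample_variance)

print("mean:", round(mean, 3))
print("s (by hand):", round(s, 4))
print("s (statistics.stdev):", round(statistics.stdev(heights), 4))
```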
Why should you memorize this SEM formula?
Information from one random sample (n > 30) can be used to say what the average height of UT students probably is (inference).
------------------------------------------------------------------------------------------------------------
Central Limit Theorem:
“If you draw many random samples from the population that are large enough (n > 30) and calculate the sample mean each time, the sampling distribution of the sample means is approximately normal, irrespective of the shape of the population distribution.”


Why should you memorize this SEM formula?
Information from one random sample (n > 30) can be used to say what the average height of UT students probably is (inference).
------------------------------------------------------------------------------------------------------------
Central Limit Theorem:
“If you draw many random samples from the population that are large enough (n > 30) and calculate the sample mean each time, the sampling distribution of the sample means is approximately normal, irrespective of the shape of the population distribution.”

→ The shape of a normal distribution is well known and very useful.
NORMAL DISTRIBUTION

Q7c: Empirical rule (68-95-99.7 rule)

0.954 (or 95.4%)

http://onlinestatbook.com/2/calculators/normal.html
Why should you memorize this SEM formula?
Central Limit Theorem:
1. “The shape of the sampling distribution is normal”.
• -1 SD to +1 SD: 68% of the values
• -2 SD to +2 SD: 95.4% of the values
• -1.96 SD to +1.96 SD: 95.0% of the values

• We can use these characteristics to express our uncertainty about the mean of the population with only one sample!
→ “95% Confidence interval of the mean”
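
These percentages can be checked with Python's standard library instead of the online calculator. The helper name `share_within` is chosen here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(z):
    """Share of a normal distribution between -z SD and +z SD."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1, 2, 1.96, 3):
    print(f"-{z} SD to +{z} SD: {share_within(z):.1%}")
```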
Why should you memorize this SEM formula?
Information from one random sample (n > 30) can be used to say what the average height of UT students probably is (inference).
-----------------------------------------------------------------------------------------------------------
Central Limit Theorem:
1. The shape of the sampling distribution is normal;
and
2. The mean of the sampling distribution almost equals the population mean.

→ We are close to answering our RQ, even though we don’t know the distribution of height in the UT population.
Why should you memorize this SEM formula?
Information from one random sample (n > 30) can be used to say what the average height of UT students probably is (inference).
-----------------------------------------------------------------------------------------------------------
Central Limit Theorem:
1. The shape of the sampling distribution is normal;
2. The mean of the sampling distribution almost equals the population mean;
and
3. The SD of the sampling distribution (of the sample means) = SEM = $\sigma / \sqrt{n}$

We don’t know $\sigma$, but our best estimate is $s$.


Why should you memorize this SEM formula?
You can use information from one random sample (n > 30) to say what the average height of UT students probably is (inference).
----------------------------------------------------------------------------------------------------

• 95% Confidence Interval of the mean = sample mean ± 1.96 * SEM, where the SEM is estimated by $s / \sqrt{n}$
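
As a sketch of this calculation on one sample: the data below are simulated (36 hypothetical heights in metres, drawn from an assumed normal population), so the resulting numbers only illustrate the procedure.

```python
import math
import random
import statistics

def confidence_interval_95(sample):
    """95% CI of the mean: sample mean ± 1.96 * SEM, with SEM = s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)      # sample SD as best estimate of sigma
    sem = s / math.sqrt(n)
    margin = 1.96 * sem
    return mean - margin, mean + margin

# Hypothetical sample of n = 36 heights; the generating values are assumptions.
rng = random.Random(7)
sample = [rng.gauss(1.71, 0.10) for _ in range(36)]

lower, upper = confidence_interval_95(sample)
print(f"mean = {statistics.fmean(sample):.3f} m, 95% CI: [{lower:.3f}, {upper:.3f}]")
```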
Why should you memorize this SEM formula?
You can use information from one random sample (n > 30) to say what the average height of UT students probably is (inference).
------------------------------------------------------------------------------------------------------------

“The average height of UT students is 1.71 m (95% CI: [1.65, 1.77])”

Interpret the 95% CI of the mean:

• “The 95% of all sample means closest to the actual population mean are between 1.65 and 1.77 m”.
• “With repeated (random) sampling, the interval 1.65 to 1.77 m holds the actual population mean 95% of the time”.
• “I am 95% confident that the interval 1.65 to 1.77 m holds the actual population mean.”
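
The repeated-sampling idea behind these statements can be explored with a simulation. This is a sketch under assumptions: the population (normal, mean 1.71 m, SD 0.10 m) and the sample size of 36 are hypothetical. Each repetition draws a new sample and builds its own interval, and the code counts how often the interval contains the true mean.

```python
import math
import random
import statistics

TRUE_MEAN = 1.71   # hypothetical population mean in metres
TRUE_SD = 0.10     # hypothetical population SD in metres
N = 36
REPETITIONS = 5_000

rng = random.Random(20)
hits = 0
for _ in range(REPETITIONS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(N)
    lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
    if lower <= TRUE_MEAN <= upper:
        hits += 1

coverage = hits / REPETITIONS
print(f"share of intervals that contain the true mean: {coverage:.3f}")
```

Each sample produces a slightly different interval; it is the procedure that captures the population mean in roughly 95% of the repetitions.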
QUESTIONS?
GOOD LUCK AT THE RESIT TEST 1!