
4.1.STS.Handout.Key


*All solutions and teacher notes in blue*

AP Statistics Handout Key: Lesson 4.1


Topics: sample vs. population, biased sampling, simple random samples (SRS)

Lesson 4.1 Guided Notes


Sample vs. Population
1) In your own words, define the following terms: population, census, and sample.

Population: The entire group of individuals you want to study.

Census: When you measure all individuals in a population.

Sample: When you measure a subset of individuals in a population. Due to limited resources, we often
use samples to make inferences about populations.

Graphic (labels: Population, Sample) from the National Center for Education Statistics

Biased Sampling

SW Tennessee Community College: The homepage of their website boasts: “Our overall graduation
placement rate is 98.5%...”¹ This means that 98.5% of graduates find jobs after leaving college.

2) Identify the sample they used to find that 98.5% number. Do you believe that this sample is
representative of the population of all students who attended the school? Why or why not?
Sample: Students who graduated.
No - Students who graduated may have more resources and motivation than their peers who failed to
graduate. So, graduates are not representative of the population of all students who attended this
school.

At SW Tennessee Community College: only 27% of incoming students transfer or graduate within 8 years
of entering. (Among full-time, first-time degree or certificate-seeking students who entered in
2010/2011. Source: IPEDS (2020))

3) The 98.5% statistic is probably misleading. Why?


The students who graduated have the resources they need to get a job. However, most students who
attended this school didn’t graduate. These students likely had a harder time finding work. So, the
percent of all students (who attended the school) that found work is probably lower than 98.5%.
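
One way to see why the graduates-only figure can mislead is to write the job rate for all students as
a weighted average of graduates and non-graduates. The sketch below is only an illustration. It treats
the 27% IPEDS figure quoted above as the share of students who finish (an assumption, since that
figure also counts transfers), and the job rates tried for non-graduates are invented values, not data.

```python
# Hypothetical illustration: overall job rate as a weighted average of two groups.
GRAD_SHARE = 0.27      # assumption: share who finish (the IPEDS figure also counts transfers)
GRAD_JOB_RATE = 0.985  # placement rate the college reports for its graduates


def overall_job_rate(nongrad_job_rate):
    """Job rate for ALL students, given a guessed job rate for non-graduates."""
    return GRAD_SHARE * GRAD_JOB_RATE + (1 - GRAD_SHARE) * nongrad_job_rate


# The non-graduate rates below are invented for illustration, not real data.
for guess in (0.50, 0.70, 0.90):
    print(f"non-grad rate {guess:.0%} -> overall rate {overall_job_rate(guess):.1%}")
```

Unless non-graduates find work at the same 98.5% rate as graduates, the overall rate comes out below
98.5%, which is the point of the answer above.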

Bias: a study flaw that leads to unrepresentative and/or inaccurate estimates.


Undercoverage: When part of the population has a reduced chance of being included in a sample.
• Leads to bias.
• Example: excluding the students who didn’t graduate.

¹ Accessed 6/9/2020

Material adapted from Skew The Script (skewthescript.org)



Rogers State University (Oklahoma): In a recent report,² the University found that about 75% of
graduates were pursuing another degree or had found full-time employment by their final semester.
The same report shows that the response rate to the University’s questions/surveys was only 20%.

Nonresponse: When individuals chosen for a sample don’t respond.


• Leads to bias if these individuals differ from respondents.

When writing about sampling bias…
1. Identify the population and the sample
2. Explain how the sampled individuals might differ from the general population
3. Explain how this leads to an over or underestimate.

4) How could bias in the sampling method have affected the graduate study/employment rate estimate
from Rogers State University?
Graduates who didn’t find post-grad employment may be ashamed, making them less likely to respond to
the survey. Therefore, this sampling method may include a lower proportion of unemployed graduates
than in the full population of graduates. This produces an overestimate of the true percentage of all
graduates who are actually starting full-time work.
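
To get a feel for how much nonresponse could matter here, one can bound the rate for all graduates.
The sketch below assumes the 75% figure describes the 20% who responded, then looks at the two
extremes for the 80% who did not respond. The function name is hypothetical.

```python
def nonresponse_bounds(response_rate, rate_among_respondents):
    """Lowest and highest possible rate for the full group.

    Lowest: no nonrespondent has the outcome. Highest: every nonrespondent does.
    """
    known = response_rate * rate_among_respondents
    lowest = known
    highest = known + (1 - response_rate)
    return lowest, highest


low, high = nonresponse_bounds(response_rate=0.20, rate_among_respondents=0.75)
print(f"Rate for all graduates could be anywhere from {low:.0%} to {high:.0%}")
```

Under these assumptions the true rate could be as low as 15% or as high as 95%. The answer above
argues that it likely falls below the reported 75%.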

Types of Sampling Bias

Important: These categories of bias can overlap. On an FRQ, if you’re unsure, don’t try to use one of
these vocab terms. Instead, just describe the bias, how it arises, and whether it leads to an under or
overestimate.

• Undercoverage bias

• Nonresponse bias

• Voluntary response bias: Occurs when a sample is composed of volunteers, who may differ from
individuals who don’t choose to volunteer.
Ex: You want to study heart rate during exercise. You recruit volunteers to run a mile and then
measure their pulse. The few insane people in our society who actually like to run are the ones
who volunteer, so they’re healthier on average than the population → bias

• Question wording bias: When survey questions are confusing or leading.
Ex: “Which show do you prefer: Diners, Drive-Ins and Dives, hosted by the incredibly talented,
funny, and interminable mayor of Flavortown / chef Guy Fieri, or Iron Chef, hosted by the boring
Alton Brown?”

• Self-reported response bias: When individuals inaccurately report their own traits.
Ex: I report being able to bench-press 350 lbs.

² Employment and Continuing Education for Graduating Students 2017-2019 AY 3-Year Aggregation
(downloaded 6/9/2020 from https://www.rsu.edu/about/accountability-academics/student-outcomes/)


Simple Random Samples (SRS)

In order to avoid bias, you must randomly sample.

Simple Random Sample (SRS): a sampling method in which every possible group of n individuals in the
population (where n is the sample size) has an equal chance of being selected.
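
A tiny example can make “every possible group” concrete. The sketch below uses a made-up population of
four students (the names are invented) and lists every possible group of size 2. Under an SRS, each
of these groups is equally likely to end up as the sample.

```python
from itertools import combinations

# Hypothetical population of 4 individuals (names invented for illustration).
population = ["Ana", "Ben", "Cy", "Dee"]
n = 2  # sample size

# Every possible group of n individuals from the population.
groups = list(combinations(population, n))

# Under an SRS, each group has the same chance of being the sample.
chance = 1 / len(groups)

for group in groups:
    print(group, f"chance = {chance:.3f}")
```

There are 6 possible groups here, so each has a 1-in-6 chance of being chosen. This is a stronger
requirement than each individual having an equal chance of selection.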

Example: COVID-19 and Sampling


When COVID-19 spread to NYC, the city only provided tests to people
who showed symptoms. Some infected people don’t show symptoms.
So, the sampling method led to an underestimate of the number of
people infected. Instead, they could have randomly sampled the NYC
population, tested those who were sampled, and gotten an unbiased
estimate of the number of people infected.

5) Describe how you would implement a simple random sample (SRS) of 1,000 NYC residents to test for
COVID.

Assign every individual in NYC an integer from 1 to N (where N is the population size of NYC). Use a
random number generator to obtain 1,000 integers from 1 to N, skipping repeats. Administer the COVID
test to the 1,000 individuals whose numbers were selected.

When describing how to perform an SRS:
1. Assign each individual in the population a number 1 – N (population size).
2. Use a random number generator to obtain n (sample size) numbers, skipping repeats.
3. Sample the individuals whose numbers were generated
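
The three steps above can be sketched in Python. This is an illustration rather than a required
method: the population size N below is a placeholder (the handout does not give NYC’s population),
and the function name is hypothetical.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct labels from 1..population_size, skipping repeats."""
    if sample_size > population_size:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < sample_size:
        label = rng.randint(1, population_size)  # step 2: random integer from 1 to N, inclusive
        chosen.add(label)                        # a set ignores duplicates, so repeats are skipped
    return sorted(chosen)


N = 8_000_000  # placeholder population size, NOT an official figure
selected = simple_random_sample(N, 1_000, seed=1)  # step 3: test the residents with these labels
print(len(selected), "residents selected; first few labels:", selected[:5])
```

Step 1 (assigning each resident a number) happens outside the code, for example by numbering a list
of residents.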

Lesson 4.1 Discussion

Recommended discussion norms: skewthescript.org/discussion-norms

Discussion Question: During World War II, a statistician by the name of Abraham Wald was asked to
help the British air force decide where to put extra armor on their planes. They gave him charts of the
bullet holes in planes that were wounded in fighting but made it back safely to England. An example is
shown below, with each dot representing places hit by bullets.

Using the chart, on what part of a new plane would you recommend they put extra armor? Choose from
the options below and give a statistical reason for your choice.
Options: A) Nose B) Wings C) Body D) Engine E) Tail

Image courtesy of Professor Joseph Blitzstein (i.e. the best stats prof in the country).
See his “Harvard Thinks Big” talk on this problem: https://youtu.be/dzFf3r1yph8

Best answer: It’s important to note the sampling method here – the planes pictured here survived the
shots fired at them. So, it looks like they can take damage on the wings, body, tail, etc. and still
make it back safely. The planes we don’t see are the ones that crashed: the crashed planes may have
been hit on the nose, since none of the planes that made it back safely (pictured here) have hits on
the nose. This was Abraham Wald’s key insight during the war.


Lesson 4.1 Practice

1) Identify the population and the sample in each study.


a) A teacher posts a poll to a website that only the students in his class have access to. Twenty
students respond to the poll.
The population is all students in this teacher’s class. The sample is the 20 students who
responded to the poll.

b) In the produce section of a grocery store there are 18 bags of red grapes. A shopper selects 1
grape from three of the bags to taste test.
The population would be all red grapes at this grocery store. The sample would be the 3 grapes
tasted.

2) A teacher posts a poll to his class website. He asks, “Would you prefer to have the quiz on Friday or
Monday?” Out of his 32 students, 8 responded to the poll. 62.5% of the respondents indicated they
would prefer to have the quiz on Friday. How could bias have impacted the estimate of 62.5%?
The sample of 8 students who answered the survey may be more prepared to take the quiz on Friday
perhaps because they check the website every day. Students who don’t check the website every day
may be less likely to answer the survey and may also prefer to take the quiz on Monday. The value of
62.5% could be an overestimate of the proportion of all 32 students who would prefer to have the quiz
on Friday.

3) A vegetable gardener is trying to determine the average number of pea pods produced by all 24 of
their pea pod plants. The plants are growing all around the perimeter of a rectangular garden. The
gardener selects 5 plants along one side of the garden, counts the number of pea pods on each plant,
and finds the mean of these values.

a) How could sampling bias have impacted the sample mean number of pea pods from the 5 plants
chosen?
If the 5 plants chosen were on the sunniest side of the garden, for example, then their mean
number of pods could be higher than the true mean number of pods for all 24 plants.

b) Describe how the gardener could select a simple random sample of 5 plants.
Assign every plant a number from 1 to 24. Use a random number generator to generate 5
numbers between 1 and 24 (skipping repeats). Include the corresponding plant for each number
in the sample and measure each plant’s number of pea pods.


4) Suppose we want to estimate the proportion of cars in a parking lot that have a manual transmission
rather than an automatic transmission. If the lot has 80 cars, describe how you would implement a
simple random sample of 18 cars.
Assign every car a number from 1 to 80. Use a random number generator to generate 18 distinct
numbers between 1 and 80 (skipping repeats). Find the corresponding car for each random number, and
note whether it has a manual or automatic transmission.
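
Python’s standard library can handle the “skipping repeats” step directly, since `random.sample`
draws without replacement. In the sketch below, the transmission data are invented purely so the code
runs. In practice you would walk to each selected car and check it.

```python
import random

rng = random.Random(4)  # fixed seed so the sketch is repeatable (any seed works)

# Step 1: label the cars 1 to 80.
car_labels = range(1, 81)

# Step 2: draw 18 distinct labels (sampling without replacement, so no repeats).
sampled_labels = rng.sample(car_labels, 18)

# Invented data for illustration only: pretend the cars labeled 1-10 are manual.
transmission = {label: ("manual" if label <= 10 else "automatic") for label in car_labels}

# Step 3: check each sampled car and compute the sample proportion of manuals.
manual_count = sum(transmission[label] == "manual" for label in sampled_labels)
p_hat = manual_count / len(sampled_labels)
print(f"Sample proportion of manual cars: {p_hat:.2f}")
```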

Questions 5-9: Select the type of sampling bias present in each scenario from the list below. Justify your
choice.
A) Undercoverage Bias D) Question Wording Bias
B) Nonresponse Bias E) Self-Reported Response Bias
C) Voluntary Response Bias

5) A teacher is curious about her students’ opinion on a recent project they completed. During class, she
asks for volunteers to join a focus group to share feedback after class. The teacher uses this feedback to
infer how all her students feel about the project.
C. This is Voluntary Response Bias because the students who end up giving their opinion on the project
have volunteered to stay after class, and these volunteers may feel more strongly about their opinions
than those who chose not to stay after class.

6) A local fire department wants to survey the residents of the town they serve about whether taxes
should be raised to pay for a new fire truck. The question they posed reads, “As it stands, if a fire breaks
out in your home, we may not be able to reach you in time to save your home. A new fire truck would
give us a much better chance. Are you in favor of a new truck for the fire department?”
D. This is Question Wording Bias because the question is leading residents to say yes to a new fire truck
by mentioning the scary possibility of the current truck not getting to their house in time to fight a fire.

7) The principal is looking to get a representative sample of all students at the high school to gauge their
opinion on a new mascot. She takes a simple random sample of students sitting in the cafeteria at lunch.
Note: At this school, seniors are allowed to leave campus at lunch.
A. The principal could have Undercoverage Bias because seniors who leave campus at lunch are not able
to be part of the sample. So, seniors have a lower probability of being included from the outset.

8) A middle school is considering a “no-homework” policy, but first administrators want to know if
students are spending an exorbitant amount of time on homework each night. A random sample of
middle school students is asked how much time they spend on homework each night on average.
E. This is Self-Reported Response Bias. Middle school students may not be able to accurately report the
average amount of time they spend on homework each night, or they might deliberately overstate it.

9) The owner of a coffee shop is hoping to survey the employees on their opinions of the coffee made at
the shop. A simple random sample of 10 employees is selected, and an anonymous survey is emailed to
each of them. The owner receives 4 responses.
B. This could be Nonresponse Bias because only 4 of the 10 sampled employees responded to the
survey. The opinions of respondents might differ from the opinions of non-respondents.


Further Practice

Teachers: We recommend providing additional practice exercises from your AP Stats textbook or from
prior AP Stats exams. The following textbook sections and AP exam questions are aligned to the content
covered in this lesson.

• The Practice of Statistics (AP Edition), 4th-6th editions: section 4.1
• Stats: Modeling the World (AP Edition), 4th/5th editions: ch 11, 3rd edition: ch 12
• Statistics: Learning from Data (AP Edition), 2nd edition: section 1.3
• Advanced High School Statistics, sections 1.3-1.4
• AP Exam Free Response Questions (FRQs): 2014 Q4, 2013 Q2 (parts a & b)

Handout Key by statistics student Greyson Zuniga
