Homework 3 - Fall 2024


BMED2400: Introduction to Bioengineering Statistics Fall 2024

HOMEWORK #3
List Collaborators¹:

Basic information
Topics covered: Sampling, Random Variables, Conditional Probability, and Bayes Theorem
General instructions:
• Please begin each problem on a new page – it makes life easier on graders.
(The keyboard shortcut in Word for a new page is CTRL+Enter)
• You may, and are encouraged to, use information from:
o Static internet resources
o Lecture slides and videos
o The textbook
o Calculator
o Excel / spreadsheet software
o Classmates, Peers, & Family members
o Your instructors
• You may not use:
o People you don’t know (e.g., Chegg)
o Students who have previously taken BMED2400 (except TAs)

Submission:
Please submit via Gradescope.
Please make sure you submit to Homework #3

Grading:
You will be graded using the following scheme for each problem:
• 0% for little effort or failure to demonstrate meaningful comprehension of the relevant
content
• 50% for incomplete work that demonstrates meaningful comprehension of the relevant
content
• 100% for complete work (possibly with errors) that demonstrates meaningful
comprehension of the relevant content

¹ Please list anyone you worked with on the homework. This uses the honor system. Remember that I encourage you
to work together but that the work you turn in should be your own work.

Conceptual Questions (0.4 points total)


C1) In your own words, what is a false positive in a medical test? Give one example of a
situation where a false positive test result could happen.
a) A false positive in a medical test occurs when, because of the test's inaccuracy, a
condition that the patient does not have is reported as present. A good example would
be a COVID test that comes back positive for a person who is not merely
asymptomatic but does not have COVID at all.
C2) Put yourself in the shoes of an Oncologist. Explain to a male patient how conditional
probability affects your interpretation of the results of his estrogen test relative to one run
on a female patient. You must mention the terms mean and variance.
a) When viewing the results of an estrogen test, the oncologist has to put the values
in perspective. The interpretation is conditional on whether the patient is male or
female: depending on the person's gender, both the mean estrogen level and its
variance differ. Males have a lower mean estrogen level than females, and a lower
variance as well.
C3) Write out Bayes formula/theorem and explain what each of the terms represents.
a) P(H|E) = P(E|H) P(H) / P(E). P(H|E) is the posterior probability, P(E|H) is the
likelihood, P(H) is the prior probability, and P(E) is the marginal probability of the
evidence. In a testing context, it tells you the chance that you truly have the condition
given that you tested positive.
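As a rough numerical illustration of the formula, the sketch below computes P(H|E) for a positive test result. The function name and the prevalence, sensitivity, and specificity values are hypothetical, chosen only for illustration; they do not come from the course data.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' theorem.

    prevalence  = P(H), the prior probability of having the condition
    sensitivity = P(E|H), the probability of testing positive when the condition is present
    specificity = P(not E | not H), so 1 - specificity is the false positive rate
    """
    # P(E): total probability of a positive result (true positives + false positives).
    p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    return sensitivity * prevalence / p_positive


if __name__ == "__main__":
    # Hypothetical inputs: 1% prevalence, 95% sensitivity, 90% specificity.
    p = posterior_given_positive(0.01, 0.95, 0.90)
    print(f"P(condition | positive) = {p:.3f}")
```

With these assumed inputs the posterior is only about 8.8%, because false positives from the large healthy group outnumber true positives from the small affected group.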
C4) Describe the difference between sensitivity and specificity. Identify and explain one (1)
situation where the design of a medical test should prioritize sensitivity. Identify and
explain one (1) situation where the design of a medical test should prioritize specificity.
a) Sensitivity is how well the test identifies true positives (high sensitivity
minimizes false negatives). Specificity is how well the test identifies true negatives
(high specificity minimizes false positives). A test should prioritize specificity when a
positive result triggers further care or screening that costs a lot of money.
C5) What are two assumptions that are implicitly made when someone calculates and reports
probability values in a research paper from data they collected?
a) You assume that the conditions and circumstances of the experiments were controlled.
You also assume that the data were processed and filtered correctly.

Data Processing and Analysis (0.4 points each)


D1) Your task is to validate that Bayes’ Theorem WORKS. Therefore, you will use the
theorem to check the conditional probabilities of five surgeries in our data set. For this
problem, we are going to focus our efforts on two surgical specialties – Board Game
Orthopedics (Spare Ribs, Funny Bone, and Wish Bone) and Board Game GI Surgeries
(Butterflies in Stomach and Bread Basket).
a) Setup: Describe how you can use the data that we collected to show that Bayes
Theorem is mathematically correct. List the information (i.e., the probabilities) that
you will need to add to the three blank columns in the table below.
i) We can calculate the true conditional probability, adjusted to account for error, by
using Bayes' Theorem. The additional probabilities I would need are the probability of
each procedure given that it is a success, the probability of each procedure, and the
probability of overall success.
b) Calculation: Fill out the table below. For the Measured column, you should list
information directly from the data set. For the Calculated column, you should list the
result from a calculation using Bayes Theorem. For the blank columns, fill in any
other information you need to determine the calculated probability.

                        P(Success|Procedure)     Calculation Inputs
Procedure               Measured   Calculated    P(Procedure|Success)   P(Procedure)   P(Success)

Butterflies in Stomach  77%        78%           9.3%                   6.0%           50.3%
Bread Basket            57%        59%           11%                    9.4%           50.3%
Spare Ribs              41%        43%           9.3%                   11%            50.3%
Funny Bone              50%        50%           8.0%                   8.1%           50.3%
Wish Bone               21%        21%           4.0%                   9.4%           50.3%
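The Calculated column can be reproduced from the three input columns using P(Success|Procedure) = P(Procedure|Success) P(Success) / P(Procedure). The sketch below is one way to do that check; the variable and function names are illustrative, and the inputs are the rounded percentages from the table, so the results agree with the table only to about the nearest percent.

```python
# Inputs copied from the table, expressed as fractions:
# (measured P(Success|Procedure), P(Procedure|Success), P(Procedure))
TABLE = {
    "Butterflies in Stomach": (0.77, 0.093, 0.060),
    "Bread Basket": (0.57, 0.11, 0.094),
    "Spare Ribs": (0.41, 0.093, 0.11),
    "Funny Bone": (0.50, 0.080, 0.081),
    "Wish Bone": (0.21, 0.040, 0.094),
}
P_SUCCESS = 0.503  # overall P(Success) from the table


def success_given_procedure(p_proc_given_success, p_proc, p_success=P_SUCCESS):
    """Bayes' theorem: P(Success|Procedure)."""
    return p_proc_given_success * p_success / p_proc


calculated = {
    name: success_given_procedure(p_proc_given_success, p_proc)
    for name, (_, p_proc_given_success, p_proc) in TABLE.items()
}

if __name__ == "__main__":
    for name, (measured, _, _) in TABLE.items():
        print(f"{name:24s} measured={measured:.0%} calculated={calculated[name]:.0%}")
```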

c) Interpretation: Explain what your results mean (both the conditional probabilities
and the comparison of the measured and calculated columns) as if to another
engineer. Put yourself in the shoes of a director of surgery in a hospital: pick one
procedure to focus on improving in your hospital and describe WHY you chose it.
i) The measured probabilities represent the results of the experiment, including any false
data, while the calculated probabilities account for the errors related to that false data and
give us a more accurate probability. I would improve Spare Ribs: its calculated probability was
a full 2 percentage points off the measured probability, just like Bread Basket, but more Spare
Ribs surgeries were performed and it has a lower success rate.
D2) Do textbook problem 3.26, but change the conformance probabilities to 0.95, 0.93, and
0.98 for machines 1, 2, and 3, respectively.
a) Setup: Read problem 3.26. Identify two concepts/rules that are useful for this
problem and list any equations you will use.
i) It tells you two integral pieces of information: 1) the conformance probability
of each machine and 2) the share of production assigned to each machine.
b) Calculation: Calculate results to parts (a) and (b) from the textbook.
i) a) 94.6% b) 30.1%
c) Interpretation: Your boss wants you to improve the average quality of the factory
these machines are in. You have to redistribute the production among the three
machines to improve quality, but you have limited options for distribution because of
throughput. No machine may be below 20% or above 50%. How would you distribute
the production among the three types of machines to maximize the output of the
factory? Defend your reasoning using math and show the amount of improvement in
producing conforming parts that your solution would create.
i) I would reassign production so that machine 3 makes 50%, machine 1 makes 30%, and
machine 2 makes 20%. This would give a new conformance probability of
0.50(0.98) + 0.30(0.95) + 0.20(0.93) = 96.1%. This allocation maximizes the probability
because the machine with the highest conformance makes the most and the one with the
lowest conformance makes the least.
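The redistribution can be checked with the law of total probability, P(conforming) = sum over machines of P(machine) P(conforming | machine). The sketch below is a brute-force check over whole-percent allocations, assuming only the conformance probabilities and the 20% to 50% limits stated in the problem; the function names are illustrative.

```python
# Conformance probabilities for machines 1, 2, and 3 (from the problem statement).
CONFORMANCE = {1: 0.95, 2: 0.93, 3: 0.98}


def overall_conformance(shares_pct):
    """Law of total probability; shares_pct maps machine -> share in whole percent."""
    return sum(shares_pct[m] * CONFORMANCE[m] for m in CONFORMANCE) / 100


def best_allocation(lo=20, hi=50):
    """Try every whole-percent split with each machine between lo and hi percent."""
    best_shares, best_score = None, -1.0
    for s1 in range(lo, hi + 1):
        for s2 in range(lo, hi + 1):
            s3 = 100 - s1 - s2
            if not lo <= s3 <= hi:
                continue
            shares = {1: s1, 2: s2, 3: s3}
            score = overall_conformance(shares)
            if score > best_score:
                best_shares, best_score = shares, score
    return best_shares, best_score


if __name__ == "__main__":
    shares, score = best_allocation()
    print(shares, f"{score:.3f}")
```

Under these constraints the search returns 30% for machine 1, 20% for machine 2, and 50% for machine 3, with an overall conformance of 0.961.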
D3) You have four (4) six-sided dice. They are each weighted differently.
Pick a software coding environment of your choice (e.g., R or MATLAB), or utilize Excel.
Develop code or an appropriate spreadsheet setup to do the following:
• Randomly weight four dice.
• Randomly select one of the four dice.
• Roll that die 50 times. After each roll, update the probability of having selected each
die according to Bayes’ Theorem.
a) Setup: Provide the code used or include a table of the excel formulas as they appear
in your spreadsheet (use CTRL+` to switch between viewing formulas and results).
i) To decide which number to weight: =RANDBETWEEN(1,6)
ii) To decide which die to roll: =RANDBETWEEN(1,4) (press F9 to recalculate)
iii) RandomNumberValue: =RANDBETWEEN(1,10)
iv) Weighted die roll: =IF(RandomNumberValue > 6, WeightedNumber,
RandomNumberValue)
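For readers working outside Excel, the spreadsheet logic above can be sketched in Python. This is a translation under the assumption that the draw is uniform on 1 to 10 and that draws above 6 are mapped to the weighted side; the function names are hypothetical. Enumerating the ten equally likely draws shows that this scheme gives the weighted side a 50% chance and every other side 10%.

```python
import random


def weighted_roll(weighted_side, rng=random):
    """Mimic =IF(RandomNumberValue > 6, WeightedNumber, RandomNumberValue)."""
    draw = rng.randint(1, 10)  # equivalent of RANDBETWEEN(1,10)
    return weighted_side if draw > 6 else draw


def side_probabilities(weighted_side):
    """Exact side probabilities, found by enumerating the ten equally likely draws."""
    counts = {side: 0 for side in range(1, 7)}
    for draw in range(1, 11):
        side = weighted_side if draw > 6 else draw
        counts[side] += 1
    return {side: count / 10 for side, count in counts.items()}


if __name__ == "__main__":
    print(side_probabilities(3))
    print([weighted_roll(3) for _ in range(10)])
```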
b) Calculation: Fill in the following tables:

Side Weights

         Side
Die #    1      2      3      4      5      6

1        10%    10%    10%    10%    50%    10%
2        10%    10%    50%    10%    10%    10%
3        50%    10%    10%    10%    10%    10%
4        10%    10%    10%    10%    10%    50%

Die Selection Probabilities

          Die
Roll #    1       2       3       4

0         0.25    0.25    0.25    0.25
5         0       0.60    0.40    0
10        0       0.50    0.20    0
15        0.20    0.40    0.13    0
20        0.20    0.45    0.15    0
25        0.16    0.36    0.20    0.04
30        0.17    0.37    0.20    0.03
35        0.17    0.34    0.17    0.06
40        0.18    0.38    0.15    0.05
45        0.18    0.38    0.16    0.07
50        0.16    0.38    0.14    0.06

c) Interpretation: Discuss which die you believe you selected and the likelihood you
are incorrect, if that is even possible. Discuss why the probability changed as you
rolled the die. Qualitatively, how would you expect your values to change if you instead
had six (6) dice? What if you had only been told the result of every other roll?
i) I believe die 2 was selected, as its weighted side, 3, showed up the most. If I am
incorrect, it would be due to random error, and I believe that if we rolled it more times, the
true choice would become apparent. With 6 dice, the values would change depending on the
weights of the dice. If I had been told only every other result, I believe I would still arrive at
the same conclusion, but it might have taken more rolls.
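The roll-by-roll update described in this problem can be sketched as follows. This is a minimal illustration rather than the spreadsheet actually used: it assumes the side weights from the Side Weights table, a uniform 0.25 prior over the four dice, and a seeded random generator; the function names are hypothetical. After each roll, every die's probability is multiplied by the weight that die assigns to the observed side, and the four values are renormalized so that they sum to 1.

```python
import random

# Side weights per die (sides 1..6), taken from the Side Weights table.
WEIGHTS = {
    1: [0.1, 0.1, 0.1, 0.1, 0.5, 0.1],
    2: [0.1, 0.1, 0.5, 0.1, 0.1, 0.1],
    3: [0.5, 0.1, 0.1, 0.1, 0.1, 0.1],
    4: [0.1, 0.1, 0.1, 0.1, 0.1, 0.5],
}


def update(posterior, side):
    """One Bayes update: P(die | roll) is proportional to P(roll | die) * P(die)."""
    unnormalized = {die: p * WEIGHTS[die][side - 1] for die, p in posterior.items()}
    total = sum(unnormalized.values())  # P(roll), the marginal probability
    return {die: value / total for die, value in unnormalized.items()}


def simulate(true_die, n_rolls=50, seed=0):
    """Roll the chosen die n_rolls times and record the posterior after each roll."""
    rng = random.Random(seed)
    posterior = {die: 0.25 for die in WEIGHTS}  # uniform prior at roll 0
    history = [posterior]
    for _ in range(n_rolls):
        side = rng.choices(range(1, 7), weights=WEIGHTS[true_die])[0]
        posterior = update(posterior, side)
        history.append(posterior)
    return history


if __name__ == "__main__":
    history = simulate(true_die=2)
    for roll in range(0, 51, 5):
        row = "  ".join(f"{history[roll][die]:.2f}" for die in sorted(WEIGHTS))
        print(f"{roll:2d}  {row}")
```

Because every side has a nonzero weight on every die, no die's probability reaches exactly 0 under this update, and each row sums to 1.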

Scientific Literacy and Reflection (0.4 points total)


LR1) Please read the research article linked below and answer the following questions. These
questions can be answered using the article; however, if you decide to use outside
resources, please make sure you cite the source.
Article:
Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases:
Biases in judgments reveal some heuristics of thinking under uncertainty. Science,
185(4157), 1124-1131.
Brief background:
This paper is generally referred to as a seminal work in the study of human thinking,
especially related to biases, uncertainty, and economics. It reports on three studies
conducted by Amos Tversky and Daniel Kahneman (who you may know from such books
as Thinking, Fast and Slow). The experiments look at how people of various levels
of expertise answer statistical questions in ways that rely on ‘biases’ about statistics. The
results describe the biases, which the authors call heuristics.
a) Name and describe the three main heuristics the article reports.
i) Representativeness - employed when people are asked to judge the probability that an
object or event A belongs to class or process B.
ii) Availability of instances or scenarios - employed when people are asked to assess
the frequency of a class or the plausibility of a particular development.
iii) Adjustment from an anchor - employed in numerical prediction when a relevant
value is available.
b) On page 1125 (page 3 of the PDF), the authors use an example question about
hospitals. What is the right answer to their question and why?
i) The answer is the smaller hospital. Both hospitals should average about 50% boys, but
the smaller hospital has fewer babies born per day, so its daily percentage swings more
widely around 50%. As a result, over a year the smaller hospital records more days on which
more than 60% of the babies born are boys.
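The effect of sample size in the hospital question can be checked with an exact binomial calculation. The sketch below assumes the figures from the article's example (about 45 births per day in the larger hospital and about 15 in the smaller one) and that each baby is a boy with probability 0.5; the function name is illustrative.

```python
from math import comb


def prob_more_than_60pct_boys(births_per_day, p_boy=0.5):
    """Exact binomial probability that more than 60% of a day's births are boys."""
    n = births_per_day
    return sum(
        comb(n, k) * p_boy**k * (1 - p_boy) ** (n - k)
        for k in range(n + 1)
        if 10 * k > 6 * n  # integer form of k / n > 0.6, avoids float rounding
    )


if __name__ == "__main__":
    p_small = prob_more_than_60pct_boys(15)
    p_large = prob_more_than_60pct_boys(45)
    print(f"smaller hospital: {p_small:.3f} per day")
    print(f"larger hospital:  {p_large:.3f} per day")
```

Under these assumptions the smaller hospital has roughly a 15% chance of such a day, which is higher than the larger hospital's chance.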
c) Why do the authors think people use heuristics?
i) People use heuristics when they rely on their own intuitions to account for some aspect
of the experiment: instead of striving for internal consistency, they strive for compatibility
between their prior knowledge, their experience, and the data from the experiment.
d) What are the implications the authors see their results having for researchers?
i) These heuristics lead to systematic and predictable errors. Accounting for these
could improve judgements and decisions.
e) Think of one time you have personally relied on one of these heuristics. What were
the pros and cons of using it?
i) Pro – I gained data that was seemingly correct: it aligned with the knowledge
I had, and it seemed to make sense with the logic that my brain set forth.
ii) Con – It wasn’t actual data but what I thought the data would be.
