The Analysis of Biological Data Practice Problem Answers
$\mu_d$ < 58.2. (c) If we had
data from 100 different samples and we calculated the 99% confidence interval for the mean
difference between no smoking day and a normal day each time, we expect that 99 of these
intervals would include the true mean
difference. (d) We can test whether the difference is different from 0 by using the paired t-test: t
= (25 - 0) / 10.22 = 2.45, for 9 df. t > 2.26, the critical value for α(2) = 0.05, so we conclude that
the accident rate does differ between no smoking day and a normal day (and is higher on no
smoking day). P < 0.05.
3. (a) The change in mean length is -5.81 mm, the standard deviation is 19.5 mm, and the sample size is
64. The SE is 19.5 / √64 = 2.44. The confidence interval for the difference is the mean
difference ± 2.44 × t, where $t_{0.05(2),63}$ = 2.0, so the confidence interval is:
-10.69 mm < $\mu_d$ < -0.935 mm. This assumes that the distribution of change in length is a normal
distribution and that the iguanas are a random sample from the population. (b) The 95% confidence
interval for the SD of the change in length is derived using the formula for the confidence interval
of the variance in chapter 11. The variance of the difference is 380.25. We need the df (63) and the
appropriate χ² values for 0.025 and 0.975, which are 86.83 and 42.95. Then, the lower confidence limit
is (380.25 × 63) / 86.83 = 275.9 mm², and the upper limit is (380.25 × 63) / 42.95 = 557.8 mm². The
standard deviation 95% confidence interval is found by taking the square root of the variance
confidence interval, or 16.6 mm < σ < 23.6 mm. (c) No change in length implies a difference of zero,
which is excluded from the confidence interval in part (a). This implies that a hypothesis test using
these data would reject the null hypothesis of no difference. (d) t = -2.38, for 63 df. We reject
the null hypothesis of no mean difference with P < 0.05, since |t| > 2.0, the cutoff for α(2) = 0.05
for 63 df.
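As a check on the arithmetic in answer 3, the following minimal Python sketch recomputes the standard error, both confidence intervals, and the t statistic from the summary values quoted above. The critical values (t = 2.0, and 86.83 and 42.95 for the chi-square quantiles) are copied from the answer rather than looked up; the variable names are illustrative only.

```python
import math

# Summary statistics quoted in the answer
mean_d, sd, n = -5.81, 19.5, 64
t_crit = 2.0                           # approximate two-tailed 5% critical t for 63 df (from the text)
chi2_upper, chi2_lower = 86.83, 42.95  # chi-square quantiles for 63 df (from the text)

# Standard error and confidence interval for the mean difference
se = sd / math.sqrt(n)
ci_mean = (mean_d - t_crit * se, mean_d + t_crit * se)

# Confidence interval for the variance, then the standard deviation
df = n - 1
variance = sd ** 2
ci_var = (df * variance / chi2_upper, df * variance / chi2_lower)
ci_sd = tuple(math.sqrt(v) for v in ci_var)

# Paired t statistic against a null mean difference of zero
t_stat = mean_d / se

print(ci_mean, ci_var, ci_sd, t_stat)
```

Because zero lies outside `ci_mean`, the interval and the t-test lead to the same conclusion.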
4. (a) To calculate the 95% confidence intervals for the mean number of dung beetles captured, we
must assume that the number captured is normally distributed. So, for the dung-addition
treatment, the confidence interval is 4.8 ± (2.26 × 1.03): 2.47 < μ < 7.13. For the control, the
confidence interval is 0.51 ± (2.26 × 0.28), or -0.13 < μ < 1.15. (b) The two-sample t-test cannot
be used to test for differences in the means, since the standard deviations are more than threefold
different between the two groups. Instead, Welch's approximate t-test is appropriate.
5. (a) Because these are not paired samples, we will analyze the difference of the means, not the mean
of the differences. The monogamous flies had a mean testes size of 0.8475 mm², the polyandrous
0.95 mm², for a difference of 0.1025 mm². The 95% confidence interval for this difference
requires finding $SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 (1/n_1 + 1/n_2)}$, where $n_1 = n_2 = 4$, and
$s_p^2 = (df_1 s_1^2 + df_2 s_2^2) / (df_1 + df_2)$, where $df_1 = df_2 = 3$, and $s_1^2$ and $s_2^2$
are 0.0010 and 0.0011 respectively. Then, $SE_{\bar{Y}_1 - \bar{Y}_2}$ = 0.023. The confidence interval
is the difference plus or minus the standard error times $t_{0.05(2),6}$ = 2.45, so
0.1025 ± 2.45 × 0.023, or 0.046 to 0.159. (b) The null hypothesis is that there is
no difference between the monogamous and polyandrous treatments in testes size, so
$(\mu_1 - \mu_2)_0 = 0$. t = the difference in means, 0.1025, over the SE of the difference, 0.023.
t = 4.48, for 6 df. $t_{crit}$ = 3.71 for P = 0.01, so P < 0.01. The mean testes sizes are
significantly different.
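The pooled-variance calculation in answer 5 can be sketched in a few lines of Python. This is only an illustration using the rounded variances quoted above (0.0010 and 0.0011), so the t statistic comes out near 4.47 rather than exactly the 4.48 reported, which presumably used unrounded values. The critical value 2.45 is copied from the answer.

```python
import math

# Group summaries quoted in the answer (testes size, mm^2)
mean_mono, mean_poly = 0.8475, 0.95
var_mono, var_poly = 0.0010, 0.0011   # rounded sample variances from the text
n1 = n2 = 4
df1, df2 = n1 - 1, n2 - 1

# Pooled sample variance and standard error of the difference between means
sp2 = (df1 * var_mono + df2 * var_poly) / (df1 + df2)
se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))

# 95% confidence interval and two-sample t statistic
diff = mean_poly - mean_mono
t_crit = 2.45                          # two-tailed 5% critical t for 6 df (from the text)
ci = (diff - t_crit * se_diff, diff + t_crit * se_diff)
t_stat = diff / se_diff

print(round(se_diff, 3), ci, round(t_stat, 2))
```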
6. (a) On average, 33% more of the male bodies were covered if they emitted pheromones. $s_p$ = 26.5%;
$SE_{\bar{Y}_1 - \bar{Y}_2}$ = 6.02%. $df_1$ = 48 and $df_2$ = 31, so df = 79. $t_{0.05(2),79}$ = 1.99,
so the confidence interval is: 21% < $\mu_1 - \mu_2$ < 45%. (b) Using a two-sample t-test, we will
assume that the percent coverage is normally distributed, that each snake is independent, and that the
standard deviations are not different (they are not more than threefold different). The null
hypothesis is that there is no difference between the males emitting pheromones and those not, so
$(\mu_1 - \mu_2)_0 = 0$. t = 0.33 / 0.0602 = 5.47 > 3.9, the critical value for α(2) = 0.0002 for
79 df, so we can reject the null hypothesis, with P < 0.0002.
7. (a) The PLFA levels between the control and addition plots before cicada death were not
significantly different (two sample t-test: t = -0.19, df = 42, P > 0.10). (b) After the cicada
addition, the PLFA levels differed significantly: t = 2.89, df = 42, P < 0.01. (c) No: we are
interested in the effect of the cicada addition, so we need to look at whether the change in PLFA
levels differed between the treatments. The correct comparison is the mean change in the control
plots compared to the mean change in the addition plots. It is possible to have a situation in
which test b was significant and test a was not, but the effect of cicada addition was not
significant. For instance, the control plots may have been non-significantly lower in PLFA prior
to cicada addition, and significantly lower after cicada addition, but the difference might not be
significant.
8. As described, this test assumes that the eight "open water" samples were independent of the eight
"near shore" samples, as it uses a two-sample t-test (and so would have 14 df). Differences in
growth rate could be due to differences between lakes, so the two samples within each lake are
not independent. The paired t-test would better reflect this (and would have 7 df).
9. (a) Since we assume that the distributions are normal, we can use the F-test to compare the
variances. The ratio of the variances (the larger variance is always in the numerator) is
F = (0.1582)² / (0.0642)² = 0.025 / 0.004 = 6.07. The degrees of freedom are 32 and 19.
$F_{0.05(1),32,19}$ = 2.05 (between 2.07 for 30 df in the numerator and 2.03 for 40 df), so we reject
the null hypothesis that the variances are equal. (b) The variances are not equal, but the difference
in the standard deviations is not greater than threefold. Therefore a two-sample t-test could be
used: t = 0.28, df = 51, P > 0.05. Alternatively, you could use Welch's approximate t-test rather than
the two-sample t-test: t = 0.13, with 46 df, which is not significant.
10. (a) Remember to multiply each value of flower length by the number of flowers in that category
when calculating the mean and variance (the zeros are dropped in the equations below):

$$\bar{Y} = \frac{4(55) + 10(58) + 41(61) + 75(64) + 40(67) + 3(70)}{4 + 10 + 41 + 75 + 40 + 3} = 63.5$$

$$s^2 = \frac{4(55 - 63.5)^2 + 10(58 - 63.5)^2 + 41(61 - 63.5)^2 + 75(64 - 63.5)^2 + 40(67 - 63.5)^2 + 3(70 - 63.5)^2}{(4 + 10 + 41 + 75 + 40 + 3) - 1} = 8.6$$

(b) For the simplest test, we will assume that distributions are normal. Then we can use the F-
test to compare the variances. 42.4 / 8.6 = 4.93; 443, 172 df. Looking this up (roughly) we find
that the critical value is 1.21 (for 1000,200) and 1.31 (for 200,100) for α(1) = 0.05, so we
conclude that the variance for the $F_2$ is significantly greater than the variance for the $F_1$.
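The frequency-weighted mean and variance in answer 10(a) can be reproduced with a short Python sketch. The lengths and counts are taken from the equations above; the variable names are illustrative, and the F ratio uses the F2 variance of 42.4 quoted in part (b).

```python
# Flower length categories and the number of flowers in each (zero-count classes dropped)
lengths = [55, 58, 61, 64, 67, 70]
counts = [4, 10, 41, 75, 40, 3]

n = sum(counts)

# Weighted mean: each length counted once per flower in that category
mean = sum(c * x for c, x in zip(counts, lengths)) / n

# Sample variance: weighted squared deviations divided by n - 1
variance = sum(c * (x - mean) ** 2 for c, x in zip(counts, lengths)) / (n - 1)

# Variance ratio for part (b), larger variance in the numerator
f_ratio = 42.4 / variance

print(n, round(mean, 1), round(variance, 1), round(f_ratio, 2))
```

Using the unrounded mean and variance gives an F ratio close to 4.92, slightly below the 4.93 obtained from the rounded variance 8.6; the conclusion is unaffected.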
11. Paired t-test: mean difference is 22.3, which is significantly different from 0 (t = 4.5,
df = 7, P < 0.01).
12. (a) We would use Welch's approximate t-test for this comparison since the standard deviations
differ between Pteronotus and the vampires by more than threefold. (b) The average strength
was higher for Pteronotus, which is in the opposite direction to that predicted by the model.
13. (a) We can use a paired t-test in this case, as the body temperature measurements are taken on
the same individuals as the brain temperature measurements. To do this, we calculate the
difference in temperature between brains and bodies for each ostrich, find the mean difference
(0.648 degrees C) and the standard error of the difference (0.116 degrees C). $\mu_{d0}$ = 0: the null
hypothesis is that the brain temperature does not differ from the body temperature. t = 0.648 /
0.116 = 5.6, with 5 df, which is greater than the critical value for α(2) = 0.01, 4.03, so P < 0.01.
We reject the null hypothesis of no difference between brain and body temperature. (b) While
our test is significant, the deviation is the opposite of that predicted from observations of
mammals in similar environments: brains are hotter than bodies in ostriches, not cooler.
14. (a) B. (b) B. (c) From B, we can still be fairly confident that the groups are different, but we
need to mentally double the size of the error bars to make this determination. (d) With sample
sizes of 100, the standard errors will be one tenth as great as the standard deviations. Graphs A,
B, and C will be significantly different.
Chapter 13
1. (A) The points fall on a curve, not a line, so the distribution is unlikely to be normal. (B) There is
some curvature to this distribution, more than would be expected for a normal distribution. (C)
This is very close to the straight line distribution with the points more densely clustered at the
center that you would expect from a normal distribution. (D) This quantile plot has two distinct
curves. The middle of the plot has relatively few points, not the highest density that you would
expect with a normal distribution.
2. (a) I. No, this is a uniform distribution, not a normal distribution.
II. No, this plot is bimodal, not normal.
III. No, this plot is right skewed. It looks log-normal rather than normal.
IV. Yes, this is a normal distribution.
(b) I. The sign test would be best for these data, as they are unlikely to be transformed into
anything resembling normal.
II. The sign test would probably be best for this distribution as well. Bimodal distributions are
tough to transform into anything else.
III. These data could probably be tested by a one-sample t-test after transformation (probably a
log transformation, as it is right-skewed).
IV. These data could be tested by a one-sample t-test as they are.
(c) I: B (this explains why there is such a constant density of points in the quantile plots).
II: D. (The two peaks correspond to the two curves in the quantile plots; these are also the areas
of the highest density in the quantile plots).
III. A. (The density is highest on the left, with a few points on the right scattered over a large
range on the x-axis).
IV. C. (Sparse points at the extremes of the range; points falling along a straight line as expected
for a normal distribution).
3. (a) mean 2.75; 95% CI: -3.28 < $\mu_{\log[x]}$ < 8.78. (b) mean 1.86; 95% CI: -0.02 < $\mu_{\log[x]}$ < 3.74.
(c) Not possible: cannot use ln transformation on negative values. (d) mean 4.23; 95% CI:
-2.04 < $\mu_{\log[x]}$ < 10.5. (e) Here, we must use Y' = ln(Y + 1). Mean: 0.98; 95% CI:
-0.23 < $\mu_{\log[x]}$ < 2.2.
4. After applying the arcsine to the square-root of the raw data, the mean is 1.39 and the SD is 0.09.
5. The simplest analysis is a sign test. We will score each lobster as "+" if it is pointing more
towards home than away (so -90 to 90) and "-" if pointing more away than towards home (more
than 90 or less than -90). Lobsters exactly at 90 or -90 will not be scored. Of the 14 scored
lobsters, 11 were "+" and 3 were "-".

$$P = 2 \sum_{x=11}^{14} \binom{14}{x} 0.5^{x} \, 0.5^{14-x} = 0.057$$

Using the usual standard of α = 0.05, we cannot reject the null hypothesis with these data.
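The two-tailed sign-test P-value in answer 5 is a doubled binomial tail, which can be sketched in Python as follows. The helper name `sign_test_p` is hypothetical, introduced only for this illustration.

```python
from math import comb

def sign_test_p(n_plus, n_minus):
    """Two-tailed sign test P-value under a null probability of 0.5 per sign."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # Probability of a result at least as extreme as k in one tail
    tail = sum(comb(n, x) for x in range(k, n + 1)) / 2 ** n
    # Double for a two-tailed test, capped at 1
    return min(1.0, 2 * tail)

p_lobsters = sign_test_p(11, 3)
print(round(p_lobsters, 3))
```

The same function applies to the other sign tests in this chapter: 22 "+" against 6 "-" gives about 0.0037, and 5 against 0 gives 0.0625.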
6. Mann-Whitney U test: $U_1$ = 39, $U_2$ = 6, so U = 39. The critical value for $n_1$ = 5 and $n_2$ = 9 is 38.
U is larger, so we reject the null hypothesis that there is no difference in the time until
reproduction due to accidental death or infanticide.
7. A non-parametric approach to this problem is the Mann-Whitney U test, noting that there will be
many ties, so our test will be less powerful than it might be. We assign a rank of 3 to the zeros
(the average of 1-5), a rank of 11.5 to the 1's (average of 6-17), a rank of 19.5 to the 2's (average
of 18-21), and a rank of 23 to the 3's (average of 22-24). Then we sum the ranks for the blind
group: $R_1$ = 140.5. We then calculate $U_1 = n_1 n_2 + n_1(n_1 + 1)/2 - R_1$ = 81.5.
$U_2 = n_1 n_2 - U_1$ = 144 - 81.5 = 62.5. U = 81.5. The critical value for $n_1$ = 12 and $n_2$ = 12
is 107. U is smaller, so we do
not reject the null hypothesis that there is no difference in the number of gestures used by the
blind vs. sighted humans.
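The tied ranks and U statistics in answer 7 can be checked with the sketch below. The tie counts (5 zeros, 12 ones, 4 twos, 3 threes) are inferred from the rank ranges given in the answer, and the rank sum 140.5 is taken from the text because the raw group data are not reproduced here. The helper `midranks` is a hypothetical name.

```python
def midranks(tie_counts):
    """Average rank shared by each block of tied values, in sorted order."""
    ranks, start = [], 1
    for count in tie_counts:
        end = start + count - 1
        ranks.append((start + end) / 2)  # mean of ranks start..end
        start = end + 1
    return ranks

# Number of observations equal to 0, 1, 2 and 3 gestures (inferred from the rank ranges)
ranks = midranks([5, 12, 4, 3])

n1 = n2 = 12
r1 = 140.5                               # rank sum for the blind group (from the text)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 - u1
u = max(u1, u2)                          # test statistic compared with the critical value

print(ranks, u1, u2, u)
```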
8. (a) Benton: mean 0.67, variance 0.075; Warrenton: mean 0.24, variance 0.007. The standard
deviations are more than threefold different, so a two-sample t-test is inappropriate. (b) (1) Try a
log transformation, followed by a two-sample t-test. (2) Welch's t-test on the non-transformed data,
which does not require equal variances. (3) Mann-Whitney U test. (c) A log transformation
might be appropriate because the population with the larger mean has the larger variance. The
two-sample t-test on the transformed data rejects the null hypothesis: t = 0.961 / 0.249 = 3.87,
with 10 df. P < 0.01. (d) For Benton county, -0.94 < $\mu_{\log[x]}$ < -0.04. For Warrenton county,
-1.84 < $\mu_{\log[x]}$ < -1.06. (e) The back-transformed confidence interval for Benton county is 0.39
to 0.96; for Warrenton county the interval is 0.16 to 0.35. If we computed such intervals from many
samples, 95% of the confidence intervals from these samples would contain the true mean.
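Part (e) of answer 8 back-transforms the limits of the log-scale intervals from part (d). A minimal sketch, assuming the natural log was used (which matches the numbers reported):

```python
import math

# 95% confidence limits on the natural-log scale, from part (d)
log_ci = {
    "Benton": (-0.94, -0.04),
    "Warrenton": (-1.84, -1.06),
}

# Back-transform each limit with the exponential function
back_transformed = {
    county: (math.exp(lower), math.exp(upper))
    for county, (lower, upper) in log_ci.items()
}

for county, (lower, upper) in back_transformed.items():
    print(county, round(lower, 2), round(upper, 2))
```

Only the limits are back-transformed; the interval is no longer symmetric around its centre on the original scale.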
9. Two-sample t-tests and confidence intervals are robust to violations of equal standard deviations
so long as the sample sizes of the two groups are roughly equal and the standard deviations are
within 3 times of one another.
10. (A) two-sample t-test (B) These distributions are skewed right, so a log-transformation could be
tried. (C) This one is tough: neither distribution is normal, but they deviate in different ways, so
that a transformation that would improve the fit for one would not help with the other. Also, the
standard deviations will be very different. The differences in shape make the rank-sum test
inappropriate. A randomization test (chapter 19) is best. (D) two-sample t-test. (E) two-sample
t-test.
11. (a) The null hypothesis is that the spermatids of males and hermaphrodites do not differ in
their mean size, while the alternative hypothesis is that they do differ in size. (b) U = 35910,
$n_1$ = 211, $n_2$ = 700. Use the normal approximation:

$$Z = \frac{2U - n_1 n_2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 3}} = -11.32$$

This is highly
significant (P < 0.00002). (c) It is not clear whether the variables are distributed in the same
way for the hermaphrodites and the males: if not, the Mann-Whitney U test may be inaccurate.
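The normal approximation in answer 11(b) is a one-line calculation; the sketch below evaluates it with the values quoted in the answer. The function name `mann_whitney_z` is hypothetical.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic (no tie correction)."""
    return (2 * u - n1 * n2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 3)

z = mann_whitney_z(35910, 211, 700)
print(round(z, 2))
```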
12. Mann-Whitney U tests are insensitive to outliers and may be used appropriately in this case.
13. The differences are not normally distributed, but skewed right. We need a sign test. There are 6
"-" and 22 "+", where a "+" means more species in the dioecious group.

$$P = 2 \sum_{x=22}^{28} \binom{28}{x} 0.5^{x} \, 0.5^{28-x} = 0.0037,$$

so we reject the null hypothesis.
14. First, we transform the data by taking the natural log of each weight. The mean ln-transformed
weights are -1.37 mg for females and -1.76 mg for males. Since the transformed weights are
normally distributed, we can use the two-sample t-test. t = 3.51, with df = 18. The critical value
for α(2) = 0.01 for 18 df is 2.88, so P < 0.01, and we reject the null hypothesis. Female
mosquitoes weigh more than males.
15. (a) No, the differences are probably not normally distributed: two values less than 500 and three
values above 24,000 would not be expected from a normal distribution. (b) We can use a sign test: we
have five pairs where there are more species feeding on angiosperms than on gymnosperms. The
probability of five out of five in the binomial distribution, assuming equal probabilities of either
outcome, is 0.5^5, or 0.031. However, for a two-tailed test, we must double this, so P = 0.062.
We are unable to reject the null hypothesis with the usual significance level. (c) The number of
species that feed on angiosperms was often two orders of magnitude greater, and all pairs had a
higher number of species in the angiosperm group. It is impossible to get a P-value under 5%
with only 5 data points in a sign test, even with all the data in a consistent direction.
Chapter 14
1. (a) Reduce sampling error. (b) Reduce bias. (c) Reduce sampling error. (d) Reduce bias. (e)
Reduce bias.
2. (a) [answers will vary] T, H, H, H, H, T, T, H (b) [answers will vary] no (c) 1 - Pr[exactly 4
heads and 4 tails in 8 tosses] = $1 - \binom{8}{4} 0.5^{8}$ = 0.727. (d) [answers will vary] Assign a
random number between 0 and 1 to each unit. Assign the first treatment to the units with the 4 smallest
random numbers and the second treatment to the remaining units.
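Parts (c) and (d) of answer 2 can be illustrated together: the first lines compute the chance that coin tossing leaves the two treatments unbalanced, and the function sketches the random-number procedure that guarantees balance. The function name, unit labels, and treatment labels "A" and "B" are hypothetical.

```python
import random
from math import comb

# (c) Probability that 8 coin tosses do NOT give exactly 4 heads and 4 tails
p_unbalanced = 1 - comb(8, 4) * 0.5 ** 8

def assign_balanced(units, rng):
    """(d) Give each unit a random number; the smallest half get treatment A, the rest B."""
    draws = {unit: rng.random() for unit in units}
    ordered = sorted(units, key=draws.get)
    half = len(units) // 2
    return {unit: ("A" if i < half else "B") for i, unit in enumerate(ordered)}

assignment = assign_balanced(list(range(1, 9)), random.Random())
print(round(p_unbalanced, 3), assignment)
```

Which units receive which treatment changes from run to run, but the split is always 4 and 4.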
3. Use a randomized block design, where each block is a position along the moisture gradient. Place
three plots in each block, one for each of the three fertilizer treatments (call the fertilizers A, B,
and C). Within each block, randomly assign the three fertilizers to plots. The figure below
illustrates the design for 6 blocks.
4. The researchers planned their sample size assuming a significance level of 0.05 and an 80%
probability of rejecting a false null hypothesis (for a specified difference between treatment
means).
5. Observational study. The treatments, presence and absence of brook trout, were not randomly
assigned to the units (streams): the trout were already present in the streams prior to the study.
Potential confounding factors (water temperature, stream depth, food supply) might differ
between streams with and without brook trout, and randomization was not used to break their
association with the treatment variable.
6. (a) Decrease bias (reduces effects of confounding variables); decrease sampling error (by
grouping similar units into pairs) (b) No effect on bias; reduce sampling error. (c) Decrease bias
(corrects for effects of age, a possible confounding variable); no effect on sampling error. (d) No
effect on bias; reduce sampling error.
7. (a) No. In an experimental study the experimenter assigns two or more treatments to subjects.
Here, there was only one treatment. (b) Have two treatments: the salt infusion and a placebo
control (e.g., distilled water infusions). Assign treatments randomly to a sample of severe
pneumonia patients. Ensure equal numbers of patients in each treatment. Keep patients unaware
of which treatment they are receiving. A clinician unaware of which patient received which
treatment should record their subsequent condition.
8. (a) Observational study: cancer treatments were not assigned randomly to subjects. (b) Yes: it
compares marijuana use of cancer patients and non-cancer (control) patients. (c) Reducing bias.
Age and sex might be confounding variables, affecting both marijuana use and cancer incidence.
Using only subjects similar in age and sex reduces the effect of these confounding variables on
the association between marijuana use and cancer. (d) First, it is not wise to accept the null
hypothesis because the study might not have had sufficient power to detect an effect. The 95%
confidence interval for the odds ratio ranged from 0.6 to 1.3, which still includes the possibility
of a moderate effect. Second, observational studies such as this one cannot decide causation
because of confounding variables. Perhaps marijuana use does increase cancer risk, but
marijuana users have lifestyle differences that diminish cancer risk, offsetting any marijuana
effect.
9. (a) The stings were applied to two volunteers, which means that the reactions to the 40 stings
were probably not independent. Treating the sample size as 40 is a case of pseudo-replication.
(b) More conservatively and appropriately, this study should be treated as a paired design with
two samples. The mean difference in swelling would be found for each subject, and then the
average of the two subjects would be tested to see if it was significantly different from zero. (c)
Since each subject adds just one data point, there is no need to inflict 20 stings on each subject.
Fewer stings per subject would be less cruel, and more subjects would allow more replicates and
so more power.
10. Extreme doses increase power and so enhance the probability of detecting an effect. If an
effect is detected, then studies of the effects of more realistic doses would be the next step.
11. (a) 13 plots per treatment. The "uncertainty" is 0.4/2 = 0.2, so n = 8 (0.25 / 0.2)² = 12.5, round up
to 13. (b) 50 plots per treatment. The "uncertainty" is 0.2/2 = 0.1, so n = 8 (0.25 / 0.1)² = 50. (c)
A greater total sample size would be needed if the design were not balanced. For a given total
sample size, the expected width of a confidence interval for the difference between two means
increases as the design is more imbalanced (because the precision of the treatment mean having
the lower sample size is greatly reduced). Achieving the same confidence interval width as a
balanced design will therefore require a greater total sample size. (d) Because environmental
differences between the normal-corn plot and the Bt-corn plot would be confounding variables.
In effect, such a design would lack replication because plants in the same plot are not
independent.
12. 16 plots per treatment. n = 16 (0.25 / 0.25)² = 16.
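The sample-size rules of thumb used in answers 11 and 12 can be written as one small function. This sketch simply mirrors the arithmetic above: the multipliers 8 (for a target confidence-interval precision) and 16 (for the power calculation) are taken from the answers, and the function and parameter names are hypothetical.

```python
import math

def plots_per_treatment(sd, target, multiplier):
    """Rule-of-thumb sample size per group: multiplier * (sd / target)^2, rounded up."""
    n = multiplier * (sd / target) ** 2
    # Round away floating-point noise before taking the ceiling
    return math.ceil(round(n, 9))

n_11a = plots_per_treatment(0.25, 0.4 / 2, multiplier=8)   # uncertainty = half the interval width
n_11b = plots_per_treatment(0.25, 0.2 / 2, multiplier=8)
n_12 = plots_per_treatment(0.25, 0.25, multiplier=16)

print(n_11a, n_11b, n_12)
```

Halving the target uncertainty roughly quadruples the required number of plots, as the step from 13 to 50 shows.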
Chapter 15
1. (a) Population isolation. (b) Experimental study: treatments were assigned to plants by the
experimenters. (c) We assume that the variable has a normal distribution with equal variances in
every population. (We also assume random samples.)
Treatment     Mean    95% CI
Isolated       9.3     5.3 < μ < 13.2
Medium        14.5    11.5 < μ < 17.5
Long          10.8     8.0 < μ < 13.5
Continuous    12.8     8.2 < μ < 17.3
(d) (answers may vary). In the figure below, open circles are the data (offset where needed to
minimize overlap). Means are filled circles. Vertical lines indicate 95% confidence intervals.
2. (a) $H_0$: the mean persistence times of the four isolation treatments are equal
($\mu_1 = \mu_2 = \mu_3 = \mu_4$).
$H_A$: at least one of the means $\mu_i$ is different.
(b)
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups (treatments)         63.188        3       21.063        3.996     0.035
Error                       63.250       12        5.271
Total                      126.438       15
(c) F-distribution with 3 and 12 degrees of freedom. (d) The probability of obtaining an F-ratio
statistic as large or larger than the value observed when the null hypothesis is true. (e) The total
sum of squares is the sum of the squared deviations between each observation and the grand
mean. The error sum of squares is the sum of the squared deviations between each observation
and its group mean. The group sum of squares is the sum of the squared deviations between the
group mean for each individual and the grand mean. (f) Use R², the ratio of the group sum of
squares to the total sum of squares. (g) R² = 63.188 / 126.438 = 0.50.
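The mean squares, F-ratio, and R² in the ANOVA table for answer 2 follow directly from the sums of squares and degrees of freedom. A minimal sketch, with a hypothetical helper name:

```python
def anova_summary(ss_groups, ss_error, df_groups, df_error):
    """Mean squares, F-ratio and R-squared for a single-factor ANOVA table."""
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return {
        "ms_groups": ms_groups,
        "ms_error": ms_error,
        "f_ratio": ms_groups / ms_error,
        "r_squared": ss_groups / (ss_groups + ss_error),
    }

# Sums of squares and df from the persistence-time table
table = anova_summary(63.188, 63.250, 3, 12)
print({name: round(value, 3) for name, value in table.items()})
```

The same function reproduces the other single-factor ANOVA tables in this chapter from their sums of squares. The P-value is not computed here because it requires the F distribution.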
3. (a) $H_0$: The mean of group i equals the mean of group j. $H_A$: The mean of group i does not equal
the mean of group j, for all pairs of means i and j. (b) These are unplanned comparisons, because
they are intended to search for differences among all pairs of means. Planned comparisons must
be fewer in number and identified as crucial in advance of gathering and analyzing the data. (c)
Failure to reject a null hypothesis that the difference between a given pair of means is zero does
not imply that the means are equal, because power is not necessarily high, especially when the
differences are small. If the means of the Medium and Isolated treatments differ from one
another, then one or both of them must differ from the means from the other two groups, but we
don't know which.
(d)
(e) The critical value for a t-test has a Type 1 error rate of 0.05 when comparing two means. The
critical value for the Tukey comparison of all pairs of means is larger so that the probability of
making at least one Type 1 error in all the comparisons is only 0.05. (Note that degrees of
freedom are 12 in either case because we are using the error mean square to calculate the
standard error of the difference between means).
4. (a) Transformations or a nonparametric Kruskal-Wallis test. (b) The transformation should be
attempted first, because this would yield the more powerful test.
5. (a) $H_0$: The mean number of shoots is equal in the three treatments ($\mu_1 = \mu_2 = \mu_3$).
$H_A$: At least one of the treatment means $\mu_i$ is different.
(b)
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups (treatments)       2952.808        2      1476.40         5.32     0.011
Error                     8049.067       29       277.55
Total                    11001.875       31
The critical value is $F_{0.05(1),2,29}$ = 3.33. Since F > 3.33, P < 0.05, reject $H_0$. Conclude that
there are differences between treatments in mean shoot number. (c) The variable is normally
distributed with equal variance in the three treatment populations. (d) Fixed-effects. The levels
chosen were set by the researcher and are repeatable; they were not randomly sampled from a
population of treatments.
6. (a) $\bar{Y}_1 - \bar{Y}_2$ = -0.004 - (-0.195) = 0.191. $MS_{error}$ = 0.0345, df = 42, SE = 0.0679,
$t_{0.05(2),42}$ = 2.01, 0.054 < $\mu_1 - \mu_2$ < 0.328. (b) Yes, because the study was designed mainly
to compare PLP1 gene expression in persons with schizophrenia to that of control individuals. It was
a single focused comparison, not a broad search for differences between groups. (c) The expression
measurements are normally distributed in the populations with equal variances. (We also assume random
samples.)
7. (a) $H_0$: Mean PLP1 gene expression is equal in the three groups ($\mu_1 = \mu_2 = \mu_3$).
$H_A$: At least one of the group means $\mu_i$ is different.
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups                      0.5403        2       0.2701         7.82     0.0013
Error                       1.4502       42       0.0345
Total                       1.9905       44
The critical value is $F_{0.05(1),2,42}$ = 3.22. Since F > 3.22, P < 0.05, reject $H_0$. Conclude that
mean PLP1 expression differs among groups. (b) The expression measurements are normally
distributed in the populations with equal standard deviations. (c) Fixed-effects ANOVA: we are
comparing predetermined and repeatable treatment groups, not a random selection of groups in a
population. (d) Use R² to describe the fraction of the variance explained by group differences:
R² = 0.27. (e) Use the Tukey-Kramer method.
8. (a) Random effect. The sites were randomly chosen; the goal was to determine whether sites
varied in general in Spain, not just whether these four sites were different. (b) Because this is a
random effect ANOVA, the hypotheses should be stated in terms of the population of sites,
from which we have a sample.
$H_0$: Mean carotenoid plasma concentration in vultures is the same among sites in Spain.
$H_A$: Mean carotenoid plasma concentration in vultures differs among sites in Spain.
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups (Sites)              712.25        3       237.42        30.44     < 0.0001
Error                      1388.26      178         7.80
Total                      2100.52      181
The critical value is $F_{0.05(1),3,178}$ = 2.66. Since F > 2.66, P < 0.05, reject $H_0$. Conclude that sites
vary in the mean carotenoid plasma concentration. (c) We assume that carotenoid plasma
concentration has a normal distribution in every population with equal variances. We also
assume that sites were randomly chosen and that site means have a normal distribution. (We also
assume random samples.)
9. (a) $H_0$: Mean testes weight is the same in the four treatments ($\mu_1 = \mu_2 = \mu_3 = \mu_4$).
$H_A$: At least one of the treatment means $\mu_i$ is different.
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups (treatment)         19057.5        3       6352.5         3.32     0.025
Error                     130050.0       68       1912.5
Total                     149107.5       71
The critical value is $F_{0.05(1),3,68}$ = 2.74. Since F > 2.74, P < 0.05, reject $H_0$. Conclude that
mean testes weight varies among treatments. (b) Testes weight is normally distributed in the four
populations with equal variance. (We also assume random samples.) (c) Yes, this was an
experimental study because the treatments were assigned (randomly) to the subjects (mice) by
the researchers. (d) R² = 19057.5 / 149107.5 = 0.13.
10. Need to use a χ² contingency test (Chapter 9). Observed and expected frequencies (in
parentheses) are shown in the contingency table below.
$H_0$: The proportion of impregnated females is the same in the four treatments.
$H_A$: The proportion of impregnated females is not the same in the four treatments.
                    Number of females
Treatment        impregnated    not impregnated    Row sum
oil              14 (12.75)        4 (5.25)           18
THC              13 (12.75)        5 (5.25)           18
cannabinol       11 (12.75)        7 (5.25)           18
cannabidiol      13 (12.75)        5 (5.25)           18
Column sum       51               21                  72
χ² = 1.28, df = 3. The critical value is $\chi^2_{0.05(1),3}$ = 7.81. Since χ² is not greater than
7.81, do not reject $H_0$ (P = 0.73). We cannot reject the null hypothesis that the proportion of
females impregnated is equal between treatments.
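The expected frequencies and the χ² statistic in answer 10 can be recomputed from the observed counts alone. The sketch below uses only the observed table above; each expected count is row total × column total / grand total.

```python
# Observed counts: (impregnated, not impregnated) for each treatment
observed = {
    "oil": (14, 4),
    "THC": (13, 5),
    "cannabinol": (11, 7),
    "cannabidiol": (13, 5),
}

row_totals = {name: sum(row) for name, row in observed.items()}
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

# Expected count under independence for every cell
expected = {
    name: tuple(row_totals[name] * col / grand_total for col in col_totals)
    for name in observed
}

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi2 = sum(
    (obs - exp) ** 2 / exp
    for name in observed
    for obs, exp in zip(observed[name], expected[name])
)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(expected["oil"], round(chi2, 2), df)
```

Each pair of expected counts sums to the row total of 18, which is a quick consistency check on the table.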
11. (a) Random-effects: the males were chosen at random from the population. A given male is not a
specific, repeatable treatment. The goal is to generalize to the population of males (the
hypothesis statements should reflect this). (b) $H_0$: Mean condition of offspring is the same for all
males in the population. $H_A$: Mean condition of offspring differs between males in the
population.
Source of variation    Sum of squares    df    Mean squares    F-ratio    P
Groups (males)              9.9401       11       0.9036         4.63     0.0008
Error                       4.682        24       0.1951
Total                      14.622        35
The critical value is $F_{0.05(1),11,24}$ = 2.21 (between 2.18 and 2.25). Since F > 2.21, P < 0.05,
reject $H_0$.
Conclude that mean offspring condition differs among males in the beetle population.
(c) $s_A^2$ = (0.904 - 0.195) / 3 = 0.236. Repeatability = 0.236 / (0.236 + 0.195) = 0.548.
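The variance component and repeatability in answer 11(c) come straight from the two mean squares in the table. A minimal sketch, assuming 3 offspring per male (36 observations across 12 males, consistent with the degrees of freedom):

```python
# Mean squares from the random-effects ANOVA table
ms_groups = 0.9036
ms_error = 0.1951
n_per_group = 3            # offspring per male, as used in the answer

# Estimated variance among males (the among-group variance component)
s2_among = (ms_groups - ms_error) / n_per_group

# Repeatability: fraction of total variance that is among groups
repeatability = s2_among / (s2_among + ms_error)

print(round(s2_among, 3), round(repeatability, 3))
```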
Chapter 16
1. r = -0.5; r = -0.8; r = 0.5; r = 0.
2. (a) 95%: 0.22 < ρ < 0.63. 99%: 0.14 < ρ < 0.68. (b) 95%: -0.95 < ρ < -0.76. 99%: -0.96 < ρ <
-0.70. (c) 95%: -0.17 < ρ < 0.25. 99%: -0.24 < ρ < 0.31.
3. (a) Scatter plot:
(b) The relationship is linear, positive, and strong. (c) r = 0.93. SE = 0.13. (d) The standard error
is the standard deviation of the sampling distribution of r. (e) 0.72 < ρ < 0.98
4. (a) No change to the correlation coefficient (r = 0.93). Adding a constant to one of the variables
does not alter the correlation coefficient. (b) No change to the correlation coefficient (r = 0.93).
Dividing one of the variables by a constant does not alter the correlation coefficient.
5. We can use a paired t-test (Chapter 12).
$H_0$: Mean arrival date of male and female partners is the same ($\mu_d$ = 0)
$H_A$: Mean arrival date of male and female partners is different ($\mu_d$ ≠ 0)
$\bar{d}$ = 0.3 days (males are slightly earlier on average), SE = 1.667, t = 0.18, df = 9, P = 0.86.
$t_{0.05(2),9}$ = 2.26; since observed t is less than $t_{0.05(2),9}$, P > 0.05, do not reject $H_0$.
Conclude that we cannot reject the null hypothesis of equal mean arrival dates of males and females.
Assume a normal distribution of differences between arrival dates of males and females.
6. When there is measurement error in one or both of the variables X and Y.
7. A narrower range of values for inbreeding coefficient should lower the correlation with the
number of surviving pups compared with a wider range of inbreeding coefficients.
8. (a) Use the Spearman rank correlation:
$H_0$: The population rank correlation is zero ($\rho_S$ = 0)
$H_A$: The population rank correlation is not zero ($\rho_S$ ≠ 0)
$r_S$ = 0.66. P = 0.17.
$r_{S(0.05(2),7)}$ = 0.821. Since $r_S$ is not greater than or equal to $r_{S(0.05(2),7)}$, P > 0.05,
do not reject $H_0$.
Conclude that we cannot reject the null hypothesis of zero correlation.
(b) Assume random sample and a linear relationship between the ranks of the two variables.
9. Sampling error in the estimates of earwig density and proportion males with forceps means that
true density and proportion on an island are measured with error. Measurement error will tend to
decrease the estimated correlation. Therefore, the actual correlation is expected to be higher on
average than the estimated correlation.
10. (a) There is a negative linear relationship between telomere length and chronicity, but it is not
strong. (b) -0.43. (c) -0.66 < ρ < -0.13. (d) It is the range of most plausible values for the
parameter ρ. If you were to repeatedly and randomly sample individuals from the same
population and compute the 95% confidence interval each time, 19 out of 20 of the intervals are
expected to include the population correlation ρ. (e) Assume random sampling, and that the two
variables have a bivariate normal distribution in the population. (f) (Answers may vary) The
scatter plot suggests that the relationship between telomere length and chronicity might be mildly
non-linear, which would violate the assumption of bivariate normality.
11. (a)
(b) r = 0.82 (c) H0: There is no correlation between second language proficiency and grey matter
density (ρ = 0). HA: There is a correlation between second language proficiency and grey matter
density (ρ ≠ 0).
r = 0.82, SE = 0.13, t = 6.37, df = 20, P = 0.000003.
t_0.05(2),20 = 2.09. Since t is greater than t_0.05(2),20, P < 0.05. Reject H0.
Conclude that second language proficiency and grey matter density are correlated.
(d) Random sampling and a bivariate normal distribution of grey matter density and language
proficiency in the population. (e) No, because there appear to be two outlying observations,
which violates the assumption of bivariate normality. (f) No, correlation alone does not imply
causation. Perhaps individuals with high grey matter densities are able to achieve a high
proficiency in a second language. An experiment would be necessary to test whether proficiency
affects grey matter.
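The standard error and t statistic in problem 11(c) can be checked from r and the degrees of freedom. A minimal sketch, assuming n = 22 (inferred from df = 20) and using the rounded value r = 0.82; because r is rounded, the t computed here comes out slightly different from the reported 6.37:

```python
import math

def correlation_t(r, n):
    """Return (SE, t, df) for a test of zero correlation.

    SE_r = sqrt((1 - r^2) / (n - 2)) and t = r / SE_r, with n - 2 df.
    """
    df = n - 2
    se = math.sqrt((1 - r ** 2) / df)
    return se, r / se, df

# Assumption: n = 22 individuals, inferred from the reported df = 20.
se, t, df = correlation_t(0.82, 22)

t_crit = 2.09  # two-tailed 5% critical value of t for 20 df, from the answer
print(f"SE = {se:.2f}, t = {t:.2f}, df = {df}, reject H0: {abs(t) >= t_crit}")
```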
Chapter 17
1. (a) y = 0.5x + 1 (b) y = x − 1 (c) y = −0.5x + 2 (d) y = x − 5
2. (a)
The percent infant mortality increases approximately linearly with the log of the home range
size. (b) Mortality = 16.37 + 10.26(log home range) (c) H0: Home range size does not predict
infant mortality (β = 0). HA: Home range size predicts infant mortality (β ≠ 0).
b = 10.26, SE = 2.69, t = 3.81, df = 18, P = 0.0013, t_0.05(2),18 = 2.10. Since t > 2.10, P < 0.05.
Reject H0. Conclude that home range size predicts infant mortality. (d) Mortality = 17.51 +
6.60(log home range). The slope is much lower with the polar bear removed (6.60 instead of
10.26, a reduction of more than a third).
3. (a) The least squares regression line is the one that minimizes the sum of squared differences
between the predicted Y values on the regression line for each X and the observed Y values. (b)
Residuals are the differences between the predicted Y value on the estimated regression line and
the observed Y values. (c) The most conspicuous problem is that the variance of the residuals
increases with increasing progesterone concentration, violating the assumption that the variance
of Y is the same at all values of X. There are no conspicuous departures from the other two
assumptions, linearity and a normal distribution of Y values at every X.
4. (a)
(b) H0: Rate of heat loss does not change with body leanness (β = 0). HA: Rate of heat loss
changes with body leanness (β ≠ 0).
b = 0.0190, SE = 0.0023, t = 8.29, df = 10, P = 0.000009, t_0.05(2),10 = 2.23. Since t > 2.23, P <
0.05. Reject H0. Conclude that rate of heat loss increases with body leanness. (c) 0.0139 < β <
0.0241 (d) For every X there is a normal distribution of Y values, of which we have a random
sample; the relationship between leanness and heat loss is linear; the variance of Y is the same
for all values of X.
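The confidence interval in problem 4(c) is b ± t × SE. A minimal sketch, assuming a two-tailed 5% critical value of 2.228 for 10 df (a standard table value, not quoted in the answer), which reproduces the reported limits:

```python
def slope_confidence_interval(b, se, t_crit):
    """Confidence interval for a regression slope: b +/- t_crit * SE."""
    margin = t_crit * se
    return b - margin, b + margin

b = 0.0190        # estimated slope from the answer
se = 0.0023       # standard error of the slope from the answer
t_crit = 2.228    # assumption: t critical value for alpha(2) = 0.05, df = 10

lower, upper = slope_confidence_interval(b, se, t_crit)
print(f"{lower:.4f} < beta < {upper:.4f}")

# The same quantities give the test statistic. With the rounded inputs it
# differs slightly from the reported t = 8.29.
t_obs = b / se
```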
5. Caution is warranted because the prediction is based on extrapolation, which is risky because the
relationship between winning time and year might not be linear beyond the range of the existing
data.
6. (d)
7. (a) The arcsine square root transformation is a good bet for data that are proportions.
(b) Y = 0.416 + 7.10 X , where X is genetic distance and Y is arcsine square root transformed
proportion sterile.
(c) 5.67 < β < 8.52
8. (a) Such a complicated curve is unwarranted by the data. It would do a poor job of predicting
new observations because it does not describe a general trend. A curve fit should be as simple as
possible. (b) First try to transform the data to make the relationship linear. If that fails, consider
non-linear regression.
9. (a) b = 11.68, SE = 4.85. (b) −4.08 < β < 27.43 (c) The range of most plausible values for the
parameter. In 99% of random samples, the confidence interval will bracket the population value
for the slope. (d) Measurement error in the X variable (bite force) will tend to lead to an
underestimation of the population slope. (e) Measurement error in the Y variable (territory area)
will not bias the estimate of slope (though it will increase its uncertainty).
10. (a)
Source of variation   Sum of squares   df   Mean squares   F-ratio
Regression                  194.3739    1       194.3739    5.8012
Residual                    301.5516    9        33.5057
Total                       495.9255   10
(b) H0: The slope of the regression of territory area on bite force is zero (β = 0). HA: The slope of
the regression of territory area on bite force is not zero (β ≠ 0).
F = 5.8012, df = 1,9, P = 0.039. F_0.05(1),1,9 = 5.12. Since F > 5.12, P < 0.05. Reject H0. Conclude
that the slope is not zero (it is positive).
(c) That the relationship between X and Y is linear; for each X there is a normal distribution of Y
values in the population, of which we have a random sample; the variance of Y is the same at all
values of X. (d) The variance of the residuals. (e) R² = 0.392. It measures the fraction of the
variation in Y that is explained by X.
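The mean squares, F-ratio, and R² in problem 10 all follow from the sums of squares and degrees of freedom. A minimal sketch using the values in the table (variable names are illustrative):

```python
# Sums of squares and degrees of freedom from the ANOVA table in part (a).
ss_regression, df_regression = 194.3739, 1
ss_residual, df_residual = 301.5516, 9

# Total sum of squares and total df are the sums of the two components.
ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# Mean square = sum of squares / df; F = MS regression / MS residual.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

# R^2 is the fraction of the total variation explained by the regression.
r_squared = ss_regression / ss_total

print(f"MS residual = {ms_residual:.4f}, F = {f_ratio:.4f}, R^2 = {r_squared:.3f}")
```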
11. (a) Perform the analysis with and without the outlier included in the data set to determine
whether it has an influence on the outcome. If it has a big influence, then it is probably wise to
leave it out and limit predictions to the range of X values between 0 and about 200 (and urge
them to obtain more data at the higher X values.) (b) Confidence bands give the confidence
interval for the predicted mean time since death for a given hypoxanthine concentration. (c)
Confidence bands. (d) The prediction interval, because it measures uncertainty when predicting
the time of death of a single individual.
12. (a)
(b) The assumptions of equal variance of residuals, and of a normal distribution of Y values at
each X, are not met because of the presence of an outlier. (c) A transformation of the data might
improve matters, but success is doubtful. Alternatively, use nonparametric correlation to measure
association between the two variables.
Chapter 18
1. (a) HOURS = CONSTANT + MUTANT.
HOURS is hours of resting. CONSTANT is the grand mean. MUTANT is the effect of each line
of mutant flies. (b) HOURS = CONSTANT (c) Long horizontal line indicates the grand mean,
which is the predicted value for the mean of each group under the null hypothesis. The short
horizontal lines give the group means, which are the predicted values under the full general
linear model including the MUTANT term.
(d) F.
2. (a) Plasma corticosterone concentration (b) Chick age group and disturbance regime (c) That the
two explanatory variables interact to affect corticosterone concentrations. (d) Observational. The
penguins were not assigned by the researcher to age groups or disturbance regimes. (e) Yes.
Every combination of the two factors (age group and disturbance regime) is included in the
design.
3. (a) CORTICOSTERONE = CONSTANT + AGE + DISTURBANCE + AGE*DISTURBANCE.
CORTICOSTERONE is the corticosterone concentration, CONSTANT is the grand mean
corticosterone concentration, AGE is the age group of the penguin, DISTURBANCE indicates
whether the penguin lived in the undisturbed or tourist-disturbed area, and
AGE*DISTURBANCE is the interaction between age and disturbance. (b) H0: Penguin age
group has no effect on mean corticosterone concentration. H0: Disturbance regime has no effect
on mean corticosterone concentration. H0: There is no interaction between penguin age group
and disturbance regime.
(c) Penguins were randomly sampled from each area and each age class. Corticosterone
concentrations are normally distributed within age class and disturbance regime. The variance of
the corticosterone concentration is the same for each combination of age and disturbance.
4. Three of the following: a) To investigate the effect of more than one variable with the same
experimental design; b) To include the effects of blocking in an experimental design; c) To
investigate interactions between the effects of different factors; d) To control for the effects of a
confounding variable by including it as a covariate.
5. In these plots, different symbols refer to different groups of factor B.
6. (a) MEMORY = CONSTANT + LESION. MEMORY is the spatial memory score of a rat,
CONSTANT is a constant indicating the value of MEMORY when LESION is zero (Y-
intercept), and LESION is the extent of the lesion. (b) MEMORY = CONSTANT (c) The line
having negative slope in the following graph represents the predicted values from the full
general linear model. The flat (horizontal) line represents the predicted values under the null
model.
7. (a)
(b) EXPRESSION = CONSTANT + WORKERTYPE + BLOCK.
EXPRESSION is the for gene expression, CONSTANT is the grand mean of for gene
expression, WORKERTYPE is the worker type, and BLOCK identifies which colony the bee
comes from. (c) EXPRESSION = CONSTANT + BLOCK (d) Fixed effect, because the types
are repeatable and of direct interest. The worker types in the analysis are not randomly sampled
from a population of worker types. (e) Blocking variables allow extraneous variation caused by
the block variable to be accounted for in the analysis and eliminated. When a block variable is
included, the error variance is smaller, making it easier to detect real effects of the factor.
Chapter 19
1. (a) For example: Singleton: 3.5, 3.5, 2.6, 4.4 Twin: 3.4, 4.2, 3.4 (b) Singleton: 3.5, 4.2, 4.4, 3.4
Twin: 2.7, 2.6, 1.7 (c) In a bootstrap replicate, the data for each group is a sample of the
data from that same group. Sampling is with replacement (some points might not occur and
others might occur more than once). In a randomization replicate, the data for each group is
sampled from all the data without regard to original group. Sampling is without replacement
(every data point occurs exactly once). (d) No.
2. (a) Yes: by chance the randomization might reassign data to the correct groups. (b) Yes. (c) No:
sample sizes of groups must be the same in each randomization as in the data. (d) No: each
data point can occur only once in a randomization. (e) No: 3.8 is not in the data. (f) No: each
data point can occur only once in a randomization.
3. (a) Yes: by chance bootstrap samples might contain the same observations as the data. (b) No:
the bootstrap sample for a group can contain only data from that group. (c) No: sample sizes
of groups must be the same in each bootstrap replicate as in the data. (d) Yes. (e) No: 3.8 is not in
the data. (f) Yes.
4. 10.20 < < 10.40
5. B is the real data. In A, there is little or no relationship between the two variables, whereas B
shows a strong relationship. Since randomization tends to break up associations, it is more likely
that B is the data and A is a randomization.
6. (a) A comes from randomization and B comes from the bootstrap. (b) Approximately 10 units.
The range from X=10 to X=50 should span about 4 standard deviations of the distribution, so 1
standard deviation should be about 10 units.
7. (a) The null distribution for the ratio. (b) H0: Mean ratio of range size equals that expected of a
randomly broken stick. HA: Mean ratio of range size differs from that expected of a randomly
broken stick. P is approximately 2 × (42/10,000) = 0.0084. Since P < 0.05, reject H0. Conclude
that mean ratio in birds exceeds that expected from a randomly broken stick.
8. (a) 98. (b) 100. (c) Giant pandas are most closely related to bears.
9. (a) For example:
Group A            Group B
7.8  4.5  2.1      12.4   8.9   8.9
7.8  7.8  4.5      12.4  12.4  12.4
4.5  2.1  7.8       8.9   8.9  12.4
7.8  2.1  4.5      10.8  12.4  10.8
7.8  2.1  4.5      12.4   8.9  10.8
7.8  4.5  7.8       8.9  10.8  10.8
4.5  4.5  7.8      10.8  12.4   8.9
2.1  7.8  4.5       8.9   8.9  10.8
4.5  7.8  4.5      10.8  10.8  12.4
2.1  7.8  7.8       8.9  12.4  12.4
(b) Difference in median (B minus A) for these 10 bootstrap replicates:
3.0, 4.4, 4.4, 4.4, 4.6, 4.6, 6.3, 6.3, 6.3, 6.3
4.4 < difference between population medians < 6.3
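The differences in part (b) can be reproduced from the replicates in part (a). A minimal sketch; the original samples are an assumption, inferred from the values that appear in the replicates, and the helper names are illustrative:

```python
import random
from statistics import median

# Assumption: original data, inferred from the values seen in the replicates.
group_a = [2.1, 4.5, 7.8]
group_b = [8.9, 10.8, 12.4]

def bootstrap_replicate(sample, rng):
    """Resample one group with replacement, keeping its sample size."""
    return rng.choices(sample, k=len(sample))

def median_difference(a, b):
    """Difference in medians (B minus A), rounded to one decimal place."""
    return round(median(b) - median(a), 1)

# The ten bootstrap replicates listed in part (a), as (Group A, Group B).
replicates = [
    ([7.8, 4.5, 2.1], [12.4, 8.9, 8.9]),
    ([7.8, 7.8, 4.5], [12.4, 12.4, 12.4]),
    ([4.5, 2.1, 7.8], [8.9, 8.9, 12.4]),
    ([7.8, 2.1, 4.5], [10.8, 12.4, 10.8]),
    ([7.8, 2.1, 4.5], [12.4, 8.9, 10.8]),
    ([7.8, 4.5, 7.8], [8.9, 10.8, 10.8]),
    ([4.5, 4.5, 7.8], [10.8, 12.4, 8.9]),
    ([2.1, 7.8, 4.5], [8.9, 8.9, 10.8]),
    ([4.5, 7.8, 4.5], [10.8, 10.8, 12.4]),
    ([2.1, 7.8, 7.8], [8.9, 12.4, 12.4]),
]

differences = sorted(median_difference(a, b) for a, b in replicates)
print(differences)

# Drawing a fresh replicate: each group is resampled only from itself.
rng = random.Random(1)
new_a = bootstrap_replicate(group_a, rng)
new_b = bootstrap_replicate(group_b, rng)
```

Ten replicates are far too few for a reliable interval; in practice thousands would be drawn.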
10. (a) For example:
Group A               Group B
 8.9   2.1   4.5      10.8   7.8  12.4
 4.5   8.9   2.1       7.8  10.8  12.4
 4.5  12.4   2.1       7.8   8.9  10.8
 4.5   8.9   2.1      10.8  12.4   7.8
10.8   4.5   2.1       8.9  12.4   7.8
 4.5  10.8   2.1      12.4   8.9   7.8
 8.9  12.4   4.5       2.1   7.8  10.8
12.4  10.8   2.1       7.8   4.5   8.9
 2.1  10.8   7.8       8.9  12.4   4.5
 7.8   4.5  10.8      12.4   2.1   8.9
(b) H0: The two groups have the same median. HA: The two groups do not have the same median.
Test statistic: observed difference between medians (B minus A) = 6.3
Null distribution: −3.0, −1.1, 1.1, 1.1, 4.4, 4.4, 4.4, 6.3, 6.3, 6.3
3 out of 10 outcomes are greater than or equal to the observed value, 6.3.
P is approximately 2 × 3/10 = 0.6. Since P > 0.05, do not reject H0.
Conclude that the null hypothesis of no difference between medians is not rejected.
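The null distribution and approximate P-value in problem 10(b) can be recomputed from the randomizations in part (a). A minimal sketch; the pooled data are an assumption inferred from the values in the table, and the helper names are illustrative:

```python
import random
from statistics import median

# Assumption: the six pooled observations, inferred from the table in part (a).
pooled = [2.1, 4.5, 7.8, 8.9, 10.8, 12.4]

def randomization_replicate(values, n_a, rng):
    """Shuffle all values (no replacement) and split into two groups."""
    shuffled = rng.sample(values, k=len(values))
    return shuffled[:n_a], shuffled[n_a:]

# The ten randomizations listed in part (a), as (Group A, Group B).
randomizations = [
    ([8.9, 2.1, 4.5], [10.8, 7.8, 12.4]),
    ([4.5, 8.9, 2.1], [7.8, 10.8, 12.4]),
    ([4.5, 12.4, 2.1], [7.8, 8.9, 10.8]),
    ([4.5, 8.9, 2.1], [10.8, 12.4, 7.8]),
    ([10.8, 4.5, 2.1], [8.9, 12.4, 7.8]),
    ([4.5, 10.8, 2.1], [12.4, 8.9, 7.8]),
    ([8.9, 12.4, 4.5], [2.1, 7.8, 10.8]),
    ([12.4, 10.8, 2.1], [7.8, 4.5, 8.9]),
    ([2.1, 10.8, 7.8], [8.9, 12.4, 4.5]),
    ([7.8, 4.5, 10.8], [12.4, 2.1, 8.9]),
]

observed = 6.3  # observed difference between medians (B minus A)

# Null distribution: difference in medians for each randomization.
null = sorted(round(median(b) - median(a), 1) for a, b in randomizations)

# Two-tailed P: twice the fraction of outcomes at least as large as observed.
n_extreme = sum(d >= observed for d in null)
p_value = min(1.0, 2 * n_extreme / len(null))
print(null, n_extreme, p_value)

# Drawing a fresh randomization keeps the group sizes and uses each value once.
new_a, new_b = randomization_replicate(pooled, 3, random.Random(1))
```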
11. Yes.
12. Yes.
13. 10.4 seconds.
14. Welch's approximate t-test.
15. (a) A randomization test. The variable shell volume has been kept fixed and values of the
second variable are all included but in random order. (b) A bootstrap estimate. Each row of the
data set is a different individual sampled from the original sample of individuals. Sampling is
with replacement because the same individual sometimes occurs more than once. (c) The linear
correlation coefficient, r.
Chapter 20
1. (a) Binomial distribution with n = 47. (b) $L[p \mid 12 \text{ heterozygotes}] = \binom{47}{12} p^{12} (1 - p)^{35}$. It measures
the probability of getting 12 heterozygotes (the data) given that the true value of the proportion
of heterozygotes is p. (c) $\ln L[p] = \ln\binom{47}{12} + 12 \ln[p] + 35 \ln[1 - p]$. (d) −7.90
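The log-likelihood in problem 1(c) is easy to evaluate numerically. A minimal sketch; the value of p used for part (d) is not stated in the answer, so p = 0.5 is an assumption, chosen because it gives a log-likelihood of about −7.90:

```python
import math

def binomial_log_likelihood(p, successes=12, n=47):
    """ln L[p] = ln C(n, k) + k ln(p) + (n - k) ln(1 - p)."""
    return (math.log(math.comb(n, successes))
            + successes * math.log(p)
            + (n - successes) * math.log(1 - p))

# Assumption: part (d) evaluates the log-likelihood at p = 0.5.
p_assumed = 0.5
log_lik = binomial_log_likelihood(p_assumed)
print(f"ln L[{p_assumed}] = {log_lik:.2f}")

# The log-likelihood is highest at the sample proportion, 12/47.
p_hat = 12 / 47
log_lik_max = binomial_log_likelihood(p_hat)
```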
2. (a) 0.26 (b) 0.14 < p < 0.40
3. (a) The log of the probability of the DNA data given different possible times since the split
between Neanderthals and modern humans. (b) 710 thousand years (c) 470 < time since split <
1000 (in thousands of years) (d) This interval is like a 95% confidence interval. It describes the
most plausible values of the time since the split between humans and Neanderthals.
4. (a) −4400 (b) −4396.5 (c) G = 2(−4396.5 − (−4400)) = 7. (d) χ² with df = 1. (e) χ²_0.05,1 = 3.84.
Since G is greater than 3.84, P < 0.05 (exact P = 0.008). Reject H0. Conclude that heterogeneity
is not zero.
5. (a) Pr[Yes] = 1/2 + s/2 = (1 + s)/2
(b) $L[s \mid 113 \text{ yeses}] = \binom{185}{113} \left(\frac{1 + s}{2}\right)^{113} \left(\frac{1 - s}{2}\right)^{72}$
(c) $\ln L[s \mid 113 \text{ yeses}] = \ln\binom{185}{113} + 113 \ln[(1 + s)/2] + 72 \ln[(1 - s)/2]$
(d) −7.39
6. (a) The maximum likelihood estimate is s = 0.22. (b) 0.07 < s < 0.36. (c) H0: The fraction of
thieves s is zero. HA: The fraction of thieves s is not zero.
G = 2(−2.81 − (−7.39)) = 9.16. χ²_0.05,1 = 3.84. Since G > 3.84, P < 0.05 (exact P = 0.002). Reject
H0. Conclude that there are thieves among us.
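The numbers in problems 5(d) and 6 can be reproduced from the log-likelihood in 5(c). A minimal sketch, assuming 113 yeses out of 185 responses as in the formula; function names are illustrative. For 1 df, the χ² tail probability equals erfc(sqrt(G/2)), so no statistics library is needed:

```python
import math

YESES, N = 113, 185

def log_likelihood(s, yeses=YESES, n=N):
    """ln L[s] with Pr[Yes] = (1 + s)/2 and Pr[No] = (1 - s)/2."""
    return (math.log(math.comb(n, yeses))
            + yeses * math.log((1 + s) / 2)
            + (n - yeses) * math.log((1 - s) / 2))

# Maximum likelihood estimate: set (1 + s)/2 equal to the observed proportion.
s_hat = 2 * YESES / N - 1

log_lik_null = log_likelihood(0.0)    # log-likelihood under H0: s = 0
log_lik_max = log_likelihood(s_hat)   # log-likelihood at the estimate

# Log-likelihood ratio statistic and its P-value (chi-square, 1 df).
g_stat = 2 * (log_lik_max - log_lik_null)
p_value = math.erfc(math.sqrt(g_stat / 2))

print(f"s = {s_hat:.2f}, ln L[0] = {log_lik_null:.2f}, "
      f"ln L[s_hat] = {log_lik_max:.2f}, G = {g_stat:.2f}, P = {p_value:.3f}")
```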
7. (a)
(b) 0.60 (c) 121.653 (d) The data seem to fit a geometric distribution very well.
(e) χ² goodness-of-fit test
8. (a)
(b) 0.58. (c) 0.36 < , < 0.86
Chapter 21
1. Results are seen as more interesting or exciting; results are more likely to be accepted for
publication; results are more likely to be accepted in a better journal.
2. (a) Yes: average heritability declines with increasing sample size. (b) Small-sample studies
yielding low heritability estimates are unlikely to be submitted for publication (researchers) or
accepted for publication (editors).
3. The value of including this type of study is low because we need to assume that success in tennis
is a measure of aggression. Including it might be an act of desperation resulting from a shortage
of human studies that directly measure aggression.
4.
5. (Answers will vary).
6. The smaller effect size with larger studies suggests that there is a publication bias affecting the
estimates: small studies yielding low effect sizes are less likely to appear in the published
literature.
7. Even if H0 is true, some studies might reject it by chance. If these are the only studies available
for review (because of publication bias), then meta-analysis would conclude that H0 is false.