Homework 6 (Answer)


Homework 6

Part I
1. Assume that you have a sample of n1 = 8, with a sample mean X̄1 = 42 and a sample
standard deviation S1 = 4, and you have an independent sample of n2 = 15 from another
population with a sample mean X̄2 = 34 and a sample standard deviation S2 = 5.
a. What is the value of the pooled-variance tSTAT test statistic for testing H0: μ1 = μ2?
b. In finding the critical value, how many degrees of freedom are there?
c. Using the level of significance α = 0.01, what is the critical value for a one-tail test of the
hypothesis H0: μ1 ≤ μ2 against the alternative, H1: μ1 > μ2?
d. What is your statistical decision?
Answer
(a)
$$S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{(n_1-1)+(n_2-1)} = \frac{(7)(4^2) + (14)(5^2)}{7+14} = 22$$
$$t_{STAT} = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = \frac{(42-34)-0}{\sqrt{22\left(\frac{1}{8}+\frac{1}{15}\right)}} = 3.8959$$
(b) d.f. = (n1 - 1) + (n2 - 1) = 7 + 14 = 21
(c) Decision rule: d.f. = 21. If tSTAT > 2.5177, reject H0.
(d) Decision: Since tSTAT = 3.8959 is greater than the critical value of 2.5177, reject H0. There is
enough evidence to conclude that the first population mean is larger than the second
population mean.
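
The arithmetic in (a) and (b) can be checked with a few lines of Python. This is a minimal sketch working from the summary statistics given in the problem; the function name `pooled_t` is made up for illustration and is not part of PHStat or any library.

```python
from math import sqrt

def pooled_t(n1, mean1, s1, n2, mean2, s2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: the two sample variances weighted by their d.f.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference between the two sample means
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / se
    return sp2, t_stat, df

# Summary statistics from Problem 1
sp2, t_stat, df = pooled_t(8, 42, 4, 15, 34, 5)
print(round(sp2, 4), round(t_stat, 4), df)
```

The critical value 2.5177 still has to come from a t table or a statistics package, since the Python standard library has no t distribution.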

2. A bank with a branch located in a commercial district of a city has the business objective
of developing an improved process for serving customers during the noon-to-1 p.m. lunch
period. Management decides to first study the waiting time, defined as the number of
minutes that elapse from when the customer enters the line until he or she reaches the
teller window. Data are collected from a random sample of 15 customers and stored in
Bank1.xls.
Suppose that another branch, located in a residential area, is also concerned with improving
the process of serving customers in the noon-to-1 p.m. lunch period. Data are collected
from a random sample of 15 customers and stored in Bank2.xls.
a. Assuming that the population variances from both banks are equal, is there evidence of a
difference in the mean waiting time between the two branches? (Use α =0.05).
b. Determine the p-value in (a) and interpret its meaning.
c. In addition to equal variances, what other assumption is necessary in (a)?
d. Construct and interpret a 95% confidence interval estimate of the difference between the
population means in the two branches.
Answer
(a) H0: μ1 = μ2 Mean waiting times of Bank 1 and Bank 2 are the same.
H1: μ1 ≠ μ2 Mean waiting times of Bank 1 and Bank 2 are different.

PHStat output:
t Test for Differences in Two Means
Data
Hypothesized Difference 0
Level of Significance 0.05
Population 1 Sample
Sample Size 15
Sample Mean 4.286667
Sample Standard Deviation 1.637985
Population 2 Sample
Sample Size 15
Sample Mean 7.114667
Sample Standard Deviation 2.082189
Intermediate Calculations
Population 1 Sample Degrees of Freedom 14
Population 2 Sample Degrees of Freedom 14
Total Degrees of Freedom 28
Pooled Variance 3.509254
Difference in Sample Means -2.828
t-Test Statistic -4.13431
Two-Tailed Test
Lower Critical Value -2.04841
Upper Critical Value 2.048409
p-Value 0.000293
Reject the null hypothesis
Since the p-value of 0.000293 is less than the 5% level of significance, reject the null
hypothesis. There is enough evidence to conclude that the mean waiting time is
different in the two banks.
(b) p-value = 0.000293. The probability of obtaining a sample that will yield a t test
statistic more extreme than –4.13431 is 0.000293 if, in fact, the mean waiting times
of Bank 1 and Bank 2 are the same.
(c) We need to assume that the two populations are normally distributed.
1 1 1 1
X 1  X 2   t S p2      4.2867  7.1147   2.0484 3.5093   
(d)  n1 n2   15 15 
4.2292  1  2  1.4268
You are 95% confident that the difference in mean waiting time between Bank 1 and
Bank 2 is between 4.2292 and 1.4268 minutes.
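
The interval in (d) can be reproduced from the summary statistics in the PHStat output. This is a sketch only: `pooled_ci` is a hypothetical helper, and the critical value 2.048409 is copied from the output above, not computed.

```python
from math import sqrt

def pooled_ci(n1, mean1, s1, n2, mean2, s2, t_crit):
    """Confidence interval for mu1 - mu2 under the equal-variance assumption."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Bank 1 vs. Bank 2; t_crit is the two-tail 5% value for 28 d.f. from the output
ci_low, ci_high = pooled_ci(15, 4.286667, 1.637985, 15, 7.114667, 2.082189,
                            t_crit=2.048409)
print(round(ci_low, 4), round(ci_high, 4))
```

Both endpoints are negative, which is consistent with rejecting H0 in (a): zero is not in the interval.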

3. Nine experts rated two brands of Colombian coffee in a taste-testing experiment. A rating
on a 7-point scale (1 = extremely unpleasing, 7 = extremely pleasing) is given for each of four
characteristics: taste, aroma, richness, and acidity. The data are stored in Coffee.xls.
a. At the 0.05 level of significance, is there evidence of a difference in the mean ratings
between the two brands?
b. What assumption is necessary about the population distribution in order to perform this
test?
c. Determine the p-value in (a) and interpret its meaning.
d. Construct and interpret a 95% confidence interval estimate of the difference in the mean
ratings between the two brands.
Answer
(a) Define the difference in summated rating as the rating on brand A minus the rating on brand B.
H0: μD = 0 vs. H1: μD ≠ 0
Test statistic:
$$t_{STAT} = \frac{\bar{D}-\mu_D}{S_D/\sqrt{n}} = -3.2772$$
p-value = 0.0112

Decision: Since the p-value = 0.0112 < 0.05, reject H0. There is enough evidence of a difference in
the mean summated ratings between the two brands.

(b) You must assume that the distribution of the differences between the two ratings is
approximately normal.

(c) p-value is 0.0112. The probability of obtaining a mean difference in ratings that gives rise to
a test statistic that deviates from 0 by 3.2772 or more in either direction is 0.0112 if
there is no difference in the mean summated ratings between the two brands.

(d)
$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}} = -1.5556 \pm 2.3060\,\frac{1.4240}{\sqrt{9}}$$
$$-2.6501 \le \mu_D \le -0.4610$$
You are 95% confident that the mean difference in summated ratings between brand A and brand B
is somewhere between -2.6501 and -0.4610.
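
The paired-t statistic in (a) and the interval in (d) follow from the same three numbers: the mean difference, its standard deviation, and the sample size. A small sketch, assuming only those summary values (the raw ratings in Coffee.xls are not reproduced here) and a critical value of 2.3060 for 8 degrees of freedom taken from the answer above. The name `paired_t_summary` is illustrative.

```python
from math import sqrt

def paired_t_summary(d_bar, s_d, n, t_crit, mu_d=0.0):
    """Return the paired t statistic and the confidence interval for mu_D."""
    se = s_d / sqrt(n)              # standard error of the mean difference
    t_stat = (d_bar - mu_d) / se
    margin = t_crit * se
    return t_stat, (d_bar - margin, d_bar + margin)

# Coffee ratings: D-bar = -1.5556, S_D = 1.4240, n = 9 experts
t_stat, (low, high) = paired_t_summary(-1.5556, 1.4240, 9, t_crit=2.3060)
print(round(t_stat, 4), round(low, 4), round(high, 4))
```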

4. What motivates employees? The Great Place to Work Institute evaluated nonfinancial
factors both globally and in the United States. The results, which indicate the importance
rating of each factor, are stored in Motivation.xls.
a. At the 0.05 level of significance, is there evidence of a difference in the mean rating
between global and U.S. employees?
b. What assumption is necessary about the population distribution in order to perform this
test?
c. Use a graphical method to evaluate the validity of the assumption in (b).
Answer
(a) Define the difference to be the global rating minus the U.S. rating.
H0: μD = 0 vs. H1: μD ≠ 0
Excel output:
Paired t Test

Data
Hypothesized Mean Diff. 0
Level of significance 0.05

Intermediate Calculations
Sample Size 13
DBar -0.1538
degrees of freedom 12
SD 8.0503
Standard Error 2.2328
t Test Statistic -0.0689

Two-Tail Test
Lower Critical Value -2.1788
Upper Critical Value 2.1788
p-Value 0.9462
Do not reject the null hypothesis
Test statistic:
$$t_{STAT} = \frac{\bar{D}-\mu_D}{S_D/\sqrt{n}} = -0.0689$$
p-value = 0.9462
Decision: Since the p-value = 0.9462 > 0.05, do not reject H0. There is not enough evidence of a
difference in the mean rating between global and U.S. employees.
(b) The differences are assumed to be normally distributed.
(c) [Figures: boxplot of Diff (axis from -20 to 10); normal probability plot of Diff against Z value]

Neither the boxplot nor the normal probability plot indicates a severe departure
from normality.

5. Do males and females differ in the amount of time they talk on the phone and the
number of text messages they send? A study reported that women spent a mean of 818
minutes per month talking as compared to 716 minutes per month for men. Suppose that
the sample sizes were 100 each for women and men and that the standard deviation for
women was 125 minutes per month as compared to 100 minutes per month for men.
a. Using a 0.01 level of significance, is there evidence of a difference in the variance of the
amount of time spent talking between women and men?
b. To test for a difference in the mean talking time of women and men, is it most
appropriate to use the pooled-variance t test or the separate-variance t test? Use the most
appropriate test to determine if there is a difference in the amount of time spent talking
between women and men.

The article also reported that women sent a mean of 716 text messages per month
compared to 555 per month for men. Suppose that the standard deviation for women was
150 text messages per month compared to 125 text messages per month for men.
c. Using a 0.01 level of significance, is there evidence of a difference in the variance of the
number of text messages sent per month by women and men?
d. Based on the results of (c), use the most appropriate test to determine, at the 0.01 level
of significance, whether there is evidence of a difference in the mean number of text
messages sent per month by women and men.
Answer
(a) H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.

Decision rule: If FSTAT > 1.6854, reject H0.

Test statistic:
$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{125^2}{100^2} = 1.5625$$
Decision: Since FSTAT = 1.5625 is less than the upper critical bound of 1.6854, do not reject H0. There is
not enough evidence of a difference in the variances of the amount of time spent
talking between women and men.
(b) Because (a) finds no evidence of a difference in the variances, it is more appropriate to use the pooled-variance t test.
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
Hypothesized Difference 0
Level of Significance 0.01
Population 1 Sample
Sample Size 100
Sample Mean 818
Sample Standard Deviation 125
Population 2 Sample
Sample Size 100
Sample Mean 716
Sample Standard Deviation 100

Intermediate Calculations
Population 1 Sample Degrees of Freedom 99
Population 2 Sample Degrees of Freedom 99
Total Degrees of Freedom 198
Pooled Variance 12812.5
Standard Error 16.0078
Difference in Sample Means 102
t Test Statistic 6.3719

Two-Tail Test
Lower Critical Value -2.6009
Upper Critical Value 2.6009
p-Value 0.0000
Reject the null hypothesis
H0: μ1 = μ2 H1: μ1 ≠ μ2 Population 1 = women, Population 2 = men
Decision rule: If |tSTAT| > 2.6009, reject H0.
Test statistic:
$$t_{STAT} = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = 6.3719$$
Decision: Since tSTAT = 6.3719 is larger than the upper critical bound of 2.6009, reject
H0. There is enough evidence of a difference in the mean amount of time spent
talking between women and men.
Part II

1. Let n1 = 100, X1 = 45, n2 = 50, and X2 = 25.


a. At the 0.01 level of significance, is there evidence of a significant difference between the
two population proportions?
b. Construct a 99% confidence interval estimate for the difference between the two
population proportions.
Answer
(a)
$$p_1 = \frac{X_1}{n_1} = \frac{45}{100} = 0.45, \quad p_2 = \frac{X_2}{n_2} = \frac{25}{50} = 0.50, \quad \bar{p} = \frac{X_1+X_2}{n_1+n_2} = \frac{45+25}{100+50} = 0.467$$
H0: π1 = π2 H1: π1 ≠ π2
Decision rule: If ZSTAT < -2.58 or ZSTAT > 2.58, reject H0.
$$Z_{STAT} = \frac{(p_1-p_2)-(\pi_1-\pi_2)}{\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = \frac{(0.45-0.50)-0}{\sqrt{0.467(1-0.467)\left(\frac{1}{100}+\frac{1}{50}\right)}} = -0.58$$
Decision: Since ZSTAT = -0.58 is between the critical bounds of ±2.58, do not reject H0.
There is insufficient evidence to conclude that the population proportions differ for
group 1 and group 2.
(b)
$$(p_1-p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}} = -0.05 \pm 2.5758\sqrt{\frac{0.45(0.55)}{100}+\frac{0.5(0.5)}{50}}$$
$$-0.2727 \le \pi_1-\pi_2 \le 0.1727$$

2. Do social recommendations increase ad effectiveness? A study of online video viewers
compared viewers who arrived at an advertising video for a particular brand by following a
social media recommendation link to viewers who arrived at the same video by web
browsing. Data were collected on whether the viewer could correctly recall the brand being
advertised after seeing the video. The results were:

                    Correctly Recalled the Brand
Arrival Method      Yes      No
Recommendation      407      150
Browsing            193       91

a. Set up the null and alternative hypotheses to try to determine whether brand recall is
higher following a social media recommendation than with only web browsing.
b. Conduct the hypothesis test defined in (a), using the 0.05 level of significance.
c. Does the result of your test in (b) make it appropriate to claim that brand recall is higher
following a social media recommendation than by web browsing?
Answer
(a) H0: π1 ≤ π2 H1: π1 > π2
Population 1 = social media recommendation, 2 = only web browsing

(b) PHStat output:


Z Test for Differences in Two Proportions

Data
Hypothesized Difference 0
Level of Significance 0.05
Group 1
Number of Items of Interest 407
Sample Size 557
Group 2
Number of Items of Interest 193
Sample Size 284

Intermediate Calculations
Group 1 Proportion 0.73070018
Group 2 Proportion 0.679577465
Difference in Two Proportions 0.051122715
Average Proportion 0.713436385
Z Test Statistic 1.550652654

Upper-Tail Test
Upper Critical Value 1.6449
p-Value 0.0605
Do not reject the null hypothesis

Decision rule: If ZSTAT > 1.6449, reject H0.

Test statistic:
$$Z_{STAT} = \frac{(p_1-p_2)-(\pi_1-\pi_2)}{\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = 1.5507$$
p-value = 0.0605
Decision: Since p-value > 0.05, do not reject H0. There is not sufficient evidence to
conclude that brand recall is higher following a social media recommendation than
with only web browsing.
(c) No, the result in (b) does not make it appropriate to claim that brand recall is higher
following a social media recommendation than by web browsing.
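
The Z statistic and upper-tail p-value in (b) can be recomputed from the four counts in the table. The sketch below is not the PHStat routine; `two_proportion_z` and its `alternative` argument are illustrative names.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled Z test for H0: pi1 = pi2 (or pi1 <= pi2 / pi1 >= pi2)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled estimate of the proportion
    z = (p1 - p2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    upper_tail = 1 - NormalDist().cdf(z)
    if alternative == "greater":            # H1: pi1 > pi2
        p_value = upper_tail
    elif alternative == "less":             # H1: pi1 < pi2
        p_value = 1 - upper_tail
    else:                                   # H1: pi1 != pi2
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value

# Recommendation: 407 of 557 recalled; Browsing: 193 of 284 recalled
z_stat, p_value = two_proportion_z(407, 557, 193, 284, alternative="greater")
print(round(z_stat, 4), round(p_value, 4))
```

Called with the default two-sided alternative and the counts from Problem 1 (45 of 100 and 25 of 50), the same function gives a Z statistic of about -0.58.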

3. Determine the upper-tail critical value of F in each of the following one-tail tests.
a. α =0.05, n1=16, n2 = 21
b. α =0.01, n1=16, n2 = 21
Answer
(a) α = 0.05, n1 = 16 (numerator d.f. = 15), n2 = 21 (denominator d.f. = 20), F0.05 = 2.20
(b) α = 0.01, n1 = 16 (numerator d.f. = 15), n2 = 21 (denominator d.f. = 20), F0.01 = 3.09

4. A bank with a branch located in a commercial district of a city has the business objective
of developing an improved process for serving customers during the noon-to-1 p.m. lunch
period. Management decides to first study the waiting time, defined as the number of
minutes that elapse from when the customer enters the line until he or she reaches the
teller window. Data are collected from a random sample of 15 customers and stored in
Bank1.xls.
Suppose that another branch, located in a residential area, is also concerned with improving
the process of serving customers in the noon-to-1 p.m. lunch period. Data are collected
from a random sample of 15 customers and stored in Bank2.xls.
a. Is there evidence of a difference in the variability of the waiting time between the two
branches? (Use α =0.05).
b. Determine the p-value in (a) and interpret its meaning.
c. What assumption about the population distribution of each bank is necessary in (a)? Is the
assumption valid for these data?
d. Based on the results of (a), is it appropriate to use the pooled-variance t test to compare
the means of the two branches?
Answer
(a) H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
Decision rule: If FSTAT > 2.9786, reject H0.
Test statistic (the larger sample variance, here Bank 2, goes in the numerator):
$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{2.0822^2}{1.6380^2} = 1.6159$$
Decision: Since FSTAT = 1.6159 is below the upper critical bound of Fα/2 = 2.9786, do
not reject H0. There is not enough evidence to conclude that the two population
variances are different.
(b) p-value ≈ 0.38 (two-tail, with 14 and 14 degrees of freedom). The probability of obtaining a
sample that yields a test statistic more extreme than 1.6159 is about 0.38 if the null hypothesis
that there is no difference in the two population variances is true.
(c) The test assumes that the two populations are both normally distributed.
[Figures: box-and-whisker plot of waiting time for Bank 1 and Bank 2 (axis from 0 to 12); normal probability plot of waiting time (Bank 2) against Z value]

Statistic              Waiting Time (Bank 1)   Waiting Time (Bank 2)
Mean                   4.286666667             7.114666667
Standard Error         0.422925938             0.537618972
Median                 4.5                     6.68
Mode                   #N/A                    #N/A
Standard Deviation     1.637985115             2.082189324
Sample Variance        2.682995238             4.335512381
Kurtosis               0.832925217             -1.056273871
Skewness               -0.832946775            0.072493057
Range                  6.08                    6.67
Minimum                0.38                    3.82
Maximum                6.46                    10.49
Sum                    64.3                    106.72
Count                  15                      15
Interquartile range    2.35                    3.09
1.33 * std dev         2.178520203             2.769311801
Range                  6.08                    6.67
6 * std dev            9.827910692             12.49313594

Both the normal probability plots and the boxplots suggest that the
waiting times for both branches do not appear to be normally distributed. Waiting
times for Bank 1 appear to be skewed to the left while the waiting times for Bank 2
are slightly skewed to the right. Hence, the F test for the difference in variances,
which is sensitive to departure from the normality assumption, should not be used to
test the equality of two variances. From the boxplots and the summary statistics, the
two samples appear to have about the same amount of dispersion. Since the pooled-
variance t test is robust to departure from the normality assumption, it can be used to
test for the difference in means.
(d) Based on the results of (a), it is appropriate to use the pooled-variance t-test to compare the
means of the two branches.
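
The F statistic in (a) and a two-tail p-value for (b) can be checked without a statistics package. The sketch below relies on a known identity that holds when both degrees of freedom are even (here 14 and 14): the upper-tail F probability equals a binomial tail sum. This is a stand-in chosen to keep the example dependency-free, not the method PHStat uses; a package routine such as `scipy.stats.f.sf` would be the usual choice. The function name `f_upper_tail` is hypothetical.

```python
from math import comb

def f_upper_tail(f_stat, df1, df2):
    """P(F > f_stat) for an F distribution; valid only for even df1 and df2."""
    if df1 % 2 or df2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = df2 // 2, df1 // 2
    x = df2 / (df2 + df1 * f_stat)
    m = a + b - 1
    # Regularized incomplete beta I_x(a, b) written as a binomial tail sum
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))

# Sample variances from the summary table (Bank 1, Bank 2)
var_bank1, var_bank2 = 2.682995238, 4.335512381
f_stat = max(var_bank1, var_bank2) / min(var_bank1, var_bank2)
upper = f_upper_tail(f_stat, 14, 14)
p_two_tail = 2 * min(upper, 1 - upper)
print(round(f_stat, 4), round(p_two_tail, 2))
```

Doubling the smaller tail area is one common convention for a two-tail F test; other conventions exist, so treat the p-value as an approximation to whatever a given package reports.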
5. Do men and women differ in the number of online friends that they have? A study of
3,011 people reported that men had a mean of 180 friends and women had a mean of 140
friends. Suppose that the study consisted of 1,511 men and 1,500 women and that the
standard deviation of the number of friends was 130 for men and 120 for women. Assume a
level of significance of 0.05.
a. Is there evidence of a difference in the variances of the number of online friends between
men and women?
b. Is there evidence of a difference in the mean number of online friends between men and
women?
c. Construct and interpret a 95% confidence interval estimate for the difference in the mean
number of online friends between men and women.

Answer
Population 1 = men, 2 = women
(a) H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
F Test for Differences in Two Variances

Data
Level of Significance 0.05
Larger-Variance Sample
Sample Size 1511
Sample Variance 16900
Smaller-Variance Sample
Sample Size 1500
Sample Variance 14400

Intermediate Calculations
F Test Statistic 1.1736
Population 1 Sample Degrees of Freedom 1510
Population 2 Sample Degrees of Freedom 1499

Two-Tail Test
Upper Critical Value 1.1064
p-Value 0.0019
Reject the null hypothesis
Since the p-value = 0.0019 is lower than the 5% level of significance, reject H0.
There is enough evidence of a difference in the variances of the number of online
friends between men and women. Hence, a separate-variance t test is appropriate.
(b) H0: μ1 = μ2 H1: μ1 ≠ μ2
Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)
Data
Hypothesized Difference 0
Level of Significance 0.05
Population 1 Sample
Sample Size 1511
Sample Mean 180
Sample Standard Deviation 130.0000
Population 2 Sample
Sample Size 1500
Sample Mean 140
Sample Standard Deviation 120.0000

Intermediate Calculations
Numerator of Degrees of Freedom 432.0015
Denominator of Degrees of Freedom 0.1443
Total Degrees of Freedom 2993.2295
Degrees of Freedom 2993
Standard Error 4.5590
Difference in Sample Means 40
Separate-Variance t Test Statistic 8.7738

Two-Tail Test
Lower Critical Value -1.9608
Upper Critical Value 1.9608
p-Value 0.0000
Reject the null hypothesis

Decision: Since the p-value is virtually zero, reject H0. There is enough evidence of a
difference in the mean number of online friends between men and women.
1 1  1 1 
X 1  X 2   t /2 S p2      180-140   1.9608 15654.57   
(c)  n1 n2   1511 1500 

31.0583  1   2  48.9417
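
The intermediate values in the separate-variance output for (b), namely the standard error, the t statistic, and the fractional degrees of freedom, can be reproduced from the summary statistics. A sketch under that assumption; `separate_variance_t` is an illustrative name, and the degrees-of-freedom formula is the usual Satterthwaite approximation.

```python
from math import sqrt

def separate_variance_t(n1, mean1, s1, n2, mean2, s2):
    """Return (t statistic, approximate degrees of freedom, standard error)."""
    v1, v2 = s1**2 / n1, s2**2 / n2         # variance of each sample mean
    se = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    # Satterthwaite approximation; statistical output usually truncates it
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df, se

# Men: n = 1511, mean 180, s = 130; women: n = 1500, mean 140, s = 120
t_stat, df, se = separate_variance_t(1511, 180, 130, 1500, 140, 120)
print(round(t_stat, 4), round(df, 2), round(se, 4))
```

With samples this large the pooled and separate-variance standard errors are nearly identical, which is why the pooled-variance interval in (c) is so close to what a separate-variance interval would give.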
