S1 Bivariate Data
PhysicsAndMathsTutor.com
1 The scatter diagrams below illustrate three sets of bivariate data, A, B and C.
State, with an explanation in each case, which of the three sets of data has
(i) the largest,
(ii) the smallest,
value of the product moment correlation coefficient. [4]
(Q1, Jan 2005)
2 The back-to-back stem-and-leaf diagram below shows the number of hours of television watched per
week by each of 15 boys and 15 girls.
Boys | | Girls
8 7 7 6 6 4 4 3 | 0 | 0 5 5 6 6 7 7 8 8 9
2 2 0 | 1 | 0 0 4
6 5 4 | 2 | 2 7
5 | 3 |
Key: 4 | 2 | 2 means a boy who watched 24 hours and a girl who watched 22 hours of television per week.
(i) Find the median and the quartiles of the results for the boys. [3]
(ii) Give a reason why the median might be preferred to the mean in using an average to compare the
two data sets. [1]
(iii) State one advantage, and one disadvantage, of using stem-and-leaf diagrams rather than box-and-
whisker plots to represent the data. [2]
2
3 Two commentators gave ratings out of 100 for seven sports personalities. The ratings are shown in
the table below.
Personality A B C D E F G
Commentator I 73 76 78 65 86 82 91
Commentator II 77 78 79 80 86 89 95
(i) Calculate Spearman’s rank correlation coefficient for these ratings. [5]
(ii) State what your answer tells you about the ratings given by the two commentators. [1]
(Q3, Jan 2005)
4 The table below shows the probability distribution of the random variable X.
x −2 −1 0 1 2
P(X = x) 1/4 1/5 k 2/5 1/10
5 On average 1 in 20 members of the population of this country has a particular DNA feature. Members
of the population are selected at random until one is found who has this feature.
3
9 Five observations of bivariate data produce the following results, denoted as (xi, yi) for i = 1, 2, 3, 4, 5.
(13, 2.7) (13, 4.0) (18, 2.8) (23, 3.3) (23, 2.2)
(i) Show that the regression line of y on x has gradient −0.06, and find its equation in the form
y = a + bx. [4]
(ii) The regression line is used to estimate the value of y corresponding to x = 20, but the value x = 20
is accurate only to the nearest whole number. Calculate the difference between the largest and
the smallest values that the estimated value of y could take. [3]
The numbers e1 , e2 , e3 , e4 , e5 are defined by
ei = a + bxi − yi for i = 1, 2, 3, 4, 5.
(iii) The values of e1 , e2 and e3 are 0.6, −0.7 and 0.2 respectively. Calculate the values of e4 and e5 .
[2]
(iv) Calculate the value of e1² + e2² + e3² + e4² + e5² and explain the relevance of this quantity to the
regression line found in part (i). [2]
(v) Find the mean and the variance of e1 , e2 , e3 , e4 , e5 . [4]
(Q9, Jan 2005)
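The calculation behind parts (i) and (iii) of the question above can be sketched as follows. This is a minimal illustration, assuming the usual formulas b = Sxy / Sxx and a = ȳ − b·x̄; the function name `regression_y_on_x` is hypothetical and not part of the paper.

```python
# Least squares regression line of y on x for the five observations in the question.
xs = [13, 13, 18, 23, 23]
ys = [2.7, 4.0, 2.8, 3.3, 2.2]


def regression_y_on_x(xs, ys):
    """Return (a, b) for the line y = a + b*x, using b = Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


a, b = regression_y_on_x(xs, ys)

# e_i = a + b*x_i - y_i, exactly as defined in the question.
residuals = [a + b * x - y for x, y in zip(xs, ys)]

print("gradient b =", round(b, 2))
print("intercept a =", round(a, 2))
print("e values =", [round(e, 1) for e in residuals])
```

The first three e values printed agree with those quoted in part (iii), which is a useful check on the line.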
4732/Jan05
4
1 (i) Calculate the value of Spearman’s rank correlation coefficient between the two sets of rankings,
A and B, shown in Table 1. [4]
A 1 2 3 4 5
B 4 1 3 2 5
Table 1
(ii) The value of Spearman’s rank correlation coefficient between the set of rankings B and a third
set of rankings, C, is known to be −1. Copy and complete Table 2 showing the set of rankings C.
[2]
B 4 1 3 2 5
C
Table 2
(Q1, June 2005)
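As an illustration of the formula rs = 1 − 6Σd² / (n(n² − 1)) used in this question, here is a short sketch. The helper name `spearman_from_ranks` is an assumption for illustration, and the formula in this form assumes there are no tied ranks.

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """Spearman's coefficient from two lists of ranks (no ties assumed)."""
    n = len(ranks_1)
    sum_d_squared = sum((p - q) ** 2 for p, q in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


rank_a = [1, 2, 3, 4, 5]
rank_b = [4, 1, 3, 2, 5]
rs_ab = spearman_from_ranks(rank_a, rank_b)

# A coefficient of -1 means one ranking is the exact reverse of the other,
# so with 5 items each rank c satisfies c = 6 - b.
rank_c = [6 - b for b in rank_b]
rs_bc = spearman_from_ranks(rank_b, rank_c)

print(round(rs_ab, 3), rs_bc, rank_c)
```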
2 The probability that a certain sample of radioactive material emits an alpha-particle in one unit of time
is 0.14. In one unit of time no more than one alpha-particle can be emitted. The number of units of
time up to and including the first in which an alpha-particle is emitted is denoted by T .
3 In a supermarket the proportion of shoppers who buy washing powder is denoted by p. 16 shoppers
5
4 The table shows the latitude, x (in degrees correct to 3 significant figures), and the average rainfall y
(in cm correct to 3 significant figures) of five European cities.
City x y
Berlin 52.5 58.2
Bucharest 44.4 58.7
Moscow 55.8 53.3
St Petersburg 60.0 47.8
Warsaw 52.3 56.6
(ii) The values of y in the table were in fact obtained from measurements in inches and converted into
centimetres by multiplying by 2.54. State what effect it would have had on the value of the product
moment correlation coefficient if it had been calculated using inches instead of centimetres. [1]
(iii) It is required to estimate the annual rainfall at Bergen, where x = 60.4. Calculate the equation
of an appropriate line of regression, giving your answer in simplified form, and use it to find the
required estimate. [5]
(Q4, June 2005)
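Part (ii) turns on the fact that the product moment correlation coefficient is unchanged by a change of units. The sketch below checks this numerically on the table above; the function name `pmcc` is hypothetical, and the inch values are simply the tabulated centimetre values divided by 2.54.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


latitude = [52.5, 44.4, 55.8, 60.0, 52.3]
rain_cm = [58.2, 58.7, 53.3, 47.8, 56.6]
rain_in = [y / 2.54 for y in rain_cm]  # same data expressed in inches

r_cm = pmcc(latitude, rain_cm)
r_in = pmcc(latitude, rain_in)
print(round(r_cm, 3), round(r_in, 3))
```

Dividing every y value by a positive constant scales Sxy and the square root of Syy by the same factor, so the ratio is unaffected.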
6
1 Some observations of bivariate data were made and the equations of the two regression lines were
found to be as follows.
y on x : y = −0.6x + 13.0
x on y : x = −1.6y + 21.0
(i) State, with a reason, whether the correlation between x and y is negative or positive. [1]
(ii) Neither variable is controlled. Calculate an estimate of the value of x when y = 7.0. [2]
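A brief numerical sketch for this question, under two standard assumptions: the product of the two regression gradients equals r², and when neither variable is controlled, x is estimated from the x on y line. Variable names are illustrative only.

```python
from math import sqrt

b_y_on_x = -0.6  # gradient of y = -0.6x + 13.0
b_x_on_y = -1.6  # gradient of x = -1.6y + 21.0

# Both gradients are negative, so r takes the negative square root.
r = -sqrt(b_y_on_x * b_x_on_y)

# Estimating x from y with neither variable controlled: use x on y.
y_value = 7.0
x_estimate = -1.6 * y_value + 21.0

print(round(r, 3), round(x_estimate, 1))
```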
3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a
row.
(i) How many different arrangements of the letters are possible? [3]
(ii) In how many of these arrangements are all three Ds together? [2]
The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
7
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission
earned, in thousands of pounds, by each of seven sales agents in 2005.
Agent A B C D E F G
Distance travelled 18 15 12 14 16 24 13
Commission earned 18 45 19 24 27 22 23
(i) (a) Calculate Spearman’s rank correlation coefficient, rs , for these data. [5]
(b) Comment briefly on your value of rs with reference to this context. [1]
(c) After these data were collected, agent A found that he had made a mistake. He had actually
travelled 19 000 miles in 2005. State, with a reason, but without further calculation, whether
the value of Spearman’s rank correlation coefficient will increase, decrease or stay the same.
[2]
The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented
no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, y, together with
the data for distance travelled, x, are illustrated in the scatter diagram below.
(ii) For this scatter diagram, what can you say about the value of
(a) Spearman’s rank correlation coefficient, [1]
(b) the product moment correlation coefficient? [1]
(Q6, June 2006)
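For part (i) of this question the raw values must be ranked before the Spearman formula is applied. The following is a minimal sketch, assuming rank 1 goes to the smallest value in each row and that there are no tied values (true for this table); `ranks` and `spearman` are hypothetical helper names.

```python
def ranks(values):
    """Rank 1 = smallest value. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation coefficient for untied data."""
    rank_x, rank_y = ranks(xs), ranks(ys)
    n = len(xs)
    sum_d_squared = sum((p - q) ** 2 for p, q in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


distance = [18, 15, 12, 14, 16, 24, 13]      # agents A to G
commission = [18, 45, 19, 24, 27, 22, 23]
rs = spearman(distance, commission)

# Part (i)(c): agent A's distance becomes 19 rather than 18.
distance_corrected = [19] + distance[1:]
rs_corrected = spearman(distance_corrected, commission)

print(round(rs, 4), round(rs_corrected, 4))
```

Comparing `ranks(distance)` with `ranks(distance_corrected)` shows directly whether the correction alters any rank.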
8
2 The table contains data concerning five households selected at random from a certain town.
(ii) Give a reason why it would not be sensible to use your answer to draw a conclusion about all the
households in the town. [1]
(Q2, Jan 2007)
(b) the mean greater than the median? [1]
(ii) Give a reason why none of these diagrams could represent a geometric distribution. [1]
(iii) Which one of these diagrams could not represent a binomial distribution? Explain your answer
briefly. [2]
3 The digits 1, 2, 3, 4 and 5 are arranged in random order, to form a five-digit number.
(i) How many different five-digit numbers can be formed? [1]
(ii) Find the probability that the five-digit number is
(a) odd, [2]
(b) less than 23 000. [3]
9
5 A chemical solution was gradually heated. At five-minute intervals the time, x minutes, and the
temperature, y °C, were noted.
x 0 5 10 15 20 25 30 35
y 0.8 3.0 6.8 10.9 15.6 19.6 23.4 26.7
(ii) Use your equation to estimate the temperature after 12 minutes. [2]
(iii) It is given that the value of the product moment correlation coefficient is close to +1. Comment
on the reliability of using your equation to estimate y when
(a) x = 17,
© OCR 2007 4732/01 Jan07
10
2 Two judges each placed skaters from five countries in rank order.
Position 1st 2nd 3rd 4th 5th
Judge 1 UK France Russia Poland Canada
Judge 2 Russia Canada France UK Poland
Calculate Spearman’s rank correlation coefficient, rs , for the two judges’ rankings. [5]
(Q2, June 2007)
3 (i) How many different teams of 7 people can be chosen, without regard to order, from a squad
of 15? [2]
(ii) The squad consists of 6 forwards and 9 defenders. How many different teams containing
3 forwards and 4 defenders can be chosen? [2]
(ii) P(T > 4), [2]
(iii) E(T). [1]
11
3 A sample of bivariate data was taken and the results were summarised as follows.
(ii) The ranks of the data were found. One student calculated Spearman’s rank correlation coefficient
rs, and found that rs = 0.7. Another student calculated the product moment coefficient, R, of
these ranks. State which one of the following statements is true, and explain your answer briefly.
(A) R = 0.855
(B) R = 0.7
(C) It is impossible to give the value of R without carrying out a calculation using the original
data. [3]
(iii) All the values of x are now multiplied by a scaling factor of 2. State the new values of r and rs. [2]
(Q3, Jan 2008)
0 0.7
1 0.2
2 0.1
The spinner is spun twice. The total of the two numbers on which it lands is denoted by X.
(i) Show that P(X = 2) = 0.18. [2]
The probability distribution of X is given in the table.
x 0 1 2 3 4
P(X = x) 0.49 0.28 0.18 0.04 0.01
(ii) Calculate E(X) and Var(X). [5]
4 A supermarket has a large stock of eggs. 40% of the stock are from a firm called Eggzact. 12% of the
stock are brown eggs from Eggzact.
An egg is chosen at random from the stock. Calculate the probability that
(i) this egg is brown, given that it is from Eggzact, [2]
(ii) this egg is from Eggzact and is not brown. [2]
12
2 The table shows the age, x years, and the mean diameter, y cm, of the trunk of each of seven randomly
selected trees of a certain species.
Age (x years) 11 12 20 28 35 45 51
Mean trunk diameter (y cm) 12.2 16.0 26.4 39.2 39.6 51.3 60.6
(i) (a) Use an appropriate formula to show that the gradient of the regression line of y on x is 1.13,
correct to 2 decimal places. [2]
(b) Find the equation of the regression line of y on x. [2]
It is given that the value of the product moment correlation coefficient for the data in the table is 0.988,
correct to 3 decimal places.
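The gradient in part (i)(a) of the trees question can be checked from the raw sums. This is a sketch only, assuming the formula b = Sxy / Sxx with Sxx = Σx² − (Σx)²/n and Sxy = Σxy − ΣxΣy/n; variable names are illustrative.

```python
ages = [11, 12, 20, 28, 35, 45, 51]
diameters = [12.2, 16.0, 26.4, 39.2, 39.6, 51.3, 60.6]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(diameters)
sum_xx = sum(x * x for x in ages)
sum_xy = sum(x * y for x, y in zip(ages, diameters))

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                  # gradient of y on x
a = sum_y / n - b * sum_x / n    # line passes through the mean point

print(round(b, 2), round(a, 2))
```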
13
9 It is thought that the pH value of sand (a measure of the sand’s acidity) may affect the extent to which
a particular species of plant will grow in that sand. A botanist wished to determine whether there was
any correlation between the pH value of the sand on certain sand dunes, and the amount of each of two
plant species growing there. She chose random sections of equal area on each of eight sand dunes and
measured the pH values. She then measured the area within each section that was covered by each of
the two species. The results were as follows.
Dune A B C D E F G H
pH value, x 8.5 8.5 9.5 8.5 6.5 7.5 8.5 9.0
(i) Give a reason why it might be appropriate to calculate the equation of the regression line of y on
x rather than x on y in this situation. [1]
(ii) Calculate the equation of the regression line of y on x for species P, in the form y = a + bx, giving
the values of a and b correct to 3 significant figures. [4]
(iii) Estimate the value of y for species P on sand where the pH value is 7.0. [2]
The values of the product moment correlation coefficient between x and y for species P and Q are
rP = 0.828 and rQ = 0.0302.
(iv) Describe the relationship between the area covered by species Q and the pH value. [1]
(v) State, with a reason, whether the regression line of y on x for species P will provide a reliable
estimate of the value of y when the pH value is
(a) 8, [1]
(b) 4. [1]
(vi) Assume that the equation of the regression line of y on x for species Q is also known. State, with
a reason, whether this line will provide a reliable estimate of the value of y when the pH value
is 8. [1]
(Q9, Jan 2008)
14
4 Three tutors each marked the coursework of five students. The marks are given in the table.
Student A B C D E
Tutor 1 73 67 60 48 39
Tutor 2 62 50 61 76 65
Tutor 3 42 50 63 54 71
(i) Calculate Spearman’s rank correlation coefficient, rs , between the marks for tutors 1 and 2. [5]
(ii) The values of rs for the other pairs of tutors are as follows.
Tutors 1 and 3: rs = −0.9
Tutors 2 and 3: rs = 0.3
State which two tutors differ most widely in their judgements. Give your reason. [2]
(Q4, Jan 2009)
1 20% of packets of a certain kind of cereal contain a free gift. Jane buys one packet a week for 8 weeks.
The number of free gifts that Jane receives is denoted by X. Assuming that Jane’s 8 packets can be
regarded as a random sample, find
(i) P(X = 3), [3]
(ii) P(X ≥ 3), [2]
(iii) E(X). [2]
15
2 Two judges placed 7 dancers in rank order. Both judges placed dancers A and B in the first two places,
but in opposite orders. The judges agreed about the ranks for all the other 5 dancers. Calculate the
value of Spearman’s rank correlation coefficient. [4]
(Q2, June 2009)
5 A washing-up bowl contains 6 spoons, 5 forks and 3 knives. Three of these 14 items are removed at
random, without replacement. Find the probability that
(i) all three items are of different kinds, [3]
(ii) all three items are of the same kind. [3]
© OCR 2009 4732 Jan09
16
6 (a) A student calculated the values of the product moment correlation coefficient, r, and Spearman’s
rank correlation coefficient, rs, for two sets of bivariate data, A and B. His results are given
below.
A: r = 0.9 and rs = 1
B: r = 1 and rs = 0.9
With the aid of a diagram where appropriate, explain why the student’s results for A could both
be correct but his results for B cannot both be correct. [3]
(b) An old research paper has been partially destroyed. The surviving part of the paper contains the
following incomplete information about some bivariate data from an experiment.
The mean of x is 4.5. The
The equation of the regression line of y on x is y = 2.4x + 3.7.
The equation of the regression line of x on y is x = 0.40y –
Calculate the missing constant at the end of the equation of the second regression line. [4]
(Q6, Jan 2010)
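Part (b) relies on the fact that both regression lines pass through the mean point (x̄, ȳ). A short sketch of that reasoning follows; the name `missing_constant` is hypothetical and stands for the unknown c in x = 0.40y − c.

```python
x_bar = 4.5

# The y on x line passes through (x_bar, y_bar), which gives y_bar.
y_bar = 2.4 * x_bar + 3.7

# The x on y line x = 0.40*y - c also passes through (x_bar, y_bar).
missing_constant = 0.40 * y_bar - x_bar

print(round(y_bar, 1), round(missing_constant, 1))
```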
7 The table shows the numbers of male and female members of a vintage car club who own either a
Jaguar or a Bentley. No member owns both makes of car.
17
3 In an agricultural experiment, the relationship between the amount of water supplied, x units, and the
yield, y units, was investigated. Six values of x were chosen and for each value of x the corresponding
value of y was measured. The results are shown in the table.
x 1 2 3 4 5 6
y 3 6 8 8 11 10
These results, together with the regression line of y on x, are plotted on the graph.
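[Graph: the six points and the regression line of y on x, with vertical dotted lines joining each point to the line. Axes: x from 0 to 7, y from 0 to 12.]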
(i) Give a reason why the regression line of x on y is not suitable in this context. [1]
(ii) Explain the significance, for the regression line of y on x, of the distances shown by the vertical
dotted lines in the diagram. [2]
(iii) Calculate the value of the product moment correlation coefficient, r . [3]
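Parts (ii) and (iii) of the water and yield question can be explored numerically. The sketch below, with illustrative names only, computes r and also the sum of squared vertical distances from the points to a line, which is the quantity the y on x regression line minimises.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5, 6]
ys = [3, 6, 8, 8, 11, 10]
n = len(xs)

s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

r = s_xy / sqrt(s_xx * s_yy)

b = s_xy / s_xx
a = sum(ys) / n - b * sum(xs) / n


def sum_squared_vertical_distances(intercept, gradient):
    """Sum of squares of the vertical distances from each point to the line."""
    return sum((y - (intercept + gradient * x)) ** 2 for x, y in zip(xs, ys))


best = sum_squared_vertical_distances(a, b)
shifted = sum_squared_vertical_distances(a + 0.1, b)   # line moved up slightly
tilted = sum_squared_vertical_distances(a, b + 0.05)   # line made steeper

print(round(r, 3), best < shifted, best < tilted)
```

Any other line tried in place of the regression line gives a larger total, which is what "least squares" refers to.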
18
3 The heights, h m, and weights, m kg, of five men were measured. The results are plotted on the
diagram.
[Scatter diagram: m (from 70 to 80) plotted against h (from 1.5 to 2.2).]
(i) Use the summarised data to calculate the value of the product moment correlation coefficient, r.
[3]
(iii) It was decided to re-calculate the value of r after converting the heights to feet and the masses to
pounds. State what effect, if any, this will have on the value of r. [1]
(iv) One of the men had height 1.63 m and mass 78.4 kg. The data for this man were removed and the
value of r was re-calculated using the original data for the remaining four men. State in general
terms what effect, if any, this will have on the value of r. [1]
(Q3, Jan 2010)
4 A certain four-sided die is biased. The score, X , on each throw is a random variable with probability
distribution as shown in the table. Throws of the die are independent.
x 0 1 2 3
P(X = x) 1/2 1/4 1/8 1/8
(ii) Find the probability that there are not more than 4 throws on which the score is 1. [2]
(iii) Find the probability that there are exactly 4 throws on which the score is 2. [3]
19
2 Three skaters, A, B and C, are placed in rank order by four judges. Judge P ranks skater A in 1st
place, skater B in 2nd place and skater C in 3rd place.
(i) Without carrying out any calculation, state the value of Spearman’s rank correlation coefficient
for the following ranks. Give a reason for your answer. [1]
Skater A B C
Judge P 1 2 3
Judge Q 3 2 1
(ii) Calculate the value of Spearman’s rank correlation coefficient for the following ranks. [3]
Skater A B C
Judge P 1 2 3
Judge R 3 1 2
(iii) Judge S ranks the skaters at random. Find the probability that the value of Spearman’s rank
correlation coefficient between the ranks of judge P and judge S is 1. [3]
(Q2, June 2010)
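The three parts of the skaters question can be checked by enumeration. This is a sketch assuming that "at random" means each of the 3! possible orderings for judge S is equally likely; the helper name `spearman_from_ranks` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations


def spearman_from_ranks(ranks_1, ranks_2):
    """Exact Spearman coefficient from two rank lists (no ties assumed)."""
    n = len(ranks_1)
    sum_d_squared = sum((p - q) ** 2 for p, q in zip(ranks_1, ranks_2))
    return 1 - Fraction(6 * sum_d_squared, n * (n * n - 1))


judge_p = (1, 2, 3)
rs_pq = spearman_from_ranks(judge_p, (3, 2, 1))  # judge Q
rs_pr = spearman_from_ranks(judge_p, (3, 1, 2))  # judge R

# Judge S: list every possible ranking and count those giving rs = 1.
all_rankings = list(permutations(judge_p))
agreeing = [s for s in all_rankings if spearman_from_ranks(judge_p, s) == 1]
probability = Fraction(len(agreeing), len(all_rankings))

print(rs_pq, rs_pr, probability)
```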
© OCR 2010 4732 Jun10
20
3 (i) Some values, (x, y), of a bivariate distribution are plotted on a scatter diagram and a regression
line is to be drawn. Explain how to decide whether the regression line of y on x or the regression
line of x on y is appropriate. [2]
(ii) In an experiment the temperature, x °C, of a rod was gradually increased from 0 °C, and the
extension, y mm, was measured nine times at 50 °C intervals. The results are summarised below.
(a) Show that the gradient of the regression line of y on x is 0.008 and find the equation of this
line. [4]
(b) Use your equation to estimate the temperature when the extension is 2.5 mm. [1]
(c) Use your equation to estimate the extension for a temperature of −50 °C. [1]
(d) Comment on the meaning and the reliability of your estimate in part (c). [2]
(Q3, June 2010)
4 (i) The random variable W has the distribution B(10, 1/3). Find
(a) P(W ≤ 2), [1]
(b) P(W = 2). [2]
1 2 3 3
Two of the cards are chosen at random, without replacement. The random variable X denotes the sum
A C E J T
A B G L M
N P Q R Z
From these cards, 3 white cards and 4 grey cards are selected at random without regard to order.
(a) How many selections of seven cards are possible? [3]
(b) Find the probability that the seven cards include exactly one card showing the letter A. [4]
7 The probability distribution of a discrete random variable, X, is shown below.
x 0 2
P(X = x) a 1−a
21
3 A firm wishes to assess whether there is a linear relationship between the annual amount spent on
advertising, £x thousand, and the annual profit, £y thousand. A summary of the figures for 12 years
is as follows.
n = 12 Σx = 86.6 Σy = 943.8 Σx² = 658.76 Σy² = 83 663.00 Σxy = 7351.12
(i) Calculate the product moment correlation coefficient, showing that it is greater than 0.9. [3]
(ii) Comment briefly on this value in this context. [1]
(iii) A manager claims that this result shows that spending more money on advertising in the future
will result in greater profits. Make two criticisms of this claim. [2]
(iv) Calculate the equation of the regression line of y on x. [4]
(v) Estimate the annual profit during a year when £7400 was spent on advertising. [2]
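The advertising question works entirely from summary totals. A minimal sketch of parts (i), (iv) and (v) follows, assuming the totals n = 12, Σx = 86.6, Σy = 943.8, Σx² = 658.76, Σy² = 83 663.00 and Σxy = 7351.12 as given, and converting £7400 to x = 7.4 because x is in thousands of pounds. Variable names are illustrative.

```python
from math import sqrt

n = 12
sum_x, sum_y = 86.6, 943.8
sum_xx, sum_yy, sum_xy = 658.76, 83663.00, 7351.12

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)      # part (i)

b = s_xy / s_xx                   # part (iv): gradient of y on x
a = sum_y / n - b * sum_x / n     # intercept, from the mean point

profit_estimate = a + b * 7.4     # part (v): profit in thousands of pounds

print(round(r, 3), round(a, 1), round(b, 1), round(profit_estimate, 1))
```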
(a) P(X ≤ 2), [1]
(b) P(X = 2). [2]
(iii) Find the probability that, in the next 7 weeks, there are exactly 3 weeks in which Jan receives
exactly 2 free gifts. [3]
OCR is committed to seeking permission to reproduce all third-party content that it uses in its assessment materials. OCR has attempted to identify and contact all copyright holders
whose work is used in this paper. To avoid the issue of disclosure of answer-related information to candidates, all copyright acknowledgements are reproduced in the OCR Copyright
Acknowledgements Booklet. This is produced for each series of examinations and is freely available to download from our public website (www.ocr.org.uk) after the live examination series.
If OCR has unwittingly failed to correctly acknowledge or clear any third-party content in this assessment material, OCR will be happy to correct its mistake at the earliest possible opportunity.
For queries or further information please contact the Copyright Team, First Floor, 9 Hills Road, Cambridge CB2 1GE.
OCR is part of the Cambridge Assessment Group; Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department
of the University of Cambridge.
© OCR 2011 4732 Jan11
23
1 Five salesmen from a certain firm were selected at random for a survey. For each salesman, the annual
income, x thousand pounds, and the distance driven last year, y thousand miles, were recorded. The
results were summarised as follows.
n = 5 Σx = 251 Σx² = 14 323 Σy = 65 Σy² = 855 Σxy = 3247
(i) (a) Show that the product moment correlation coefficient, r, between x and y is −0.122, correct
(ii) Another salesman from the firm is selected at random. His annual income is known to be
£52 000, but the distance that he drove last year is unknown. In order to estimate this distance, a
regression line based on the above data is used. Comment on the reliability of such an estimate. [2]
(Q1, June 2011)
24
2 The orders in which 4 contestants, P, Q, R and S, were placed in two competitions are shown in the
table.
Position 1st 2nd 3rd 4th
Competition 1 Q R S P
Competition 2 Q P R S
Calculate Spearman’s rank correlation coefficient between these two orders. [5]
(Q2, June 2011)
3 (i) A random variable, X, has the distribution B(12, 0.85). Find
(a) P(X > 10), [2]
(b) P(X = 10), [2]
(c) Var(X). [2]
(ii) A random variable, Y, has the distribution B(2, 1/4). Two independent values of Y are found. Find
the probability that the sum of these two values is 1. [4]
25
7 The diagram shows the results of an experiment involving some bivariate data. The least squares
regression line of y on x for these results is also shown.
[Diagram: four points marked + and the regression line of y on x, on x and y axes each running from O to 5.]
(ii) Use the diagram to explain what is meant by ‘least squares’. [2]
(iii) State, with a reason, the value of Spearman’s rank correlation coefficient for these data. [2]
(iv) What can be said about the value of the product moment correlation coefficient for these data?
[1]
(Q7, June 2011)
8 Ann, Bill, Chris and Dipak play a game with a fair cubical die. Starting with Ann they take turns,
in alphabetical order, to throw the die. This process is repeated as many times as necessary until a
player throws a 6. When this happens, the game stops and this player is the winner.
(b) This regression line is used to estimate the percentage sand content at depths of 25 cm and
100 cm. Comment on the reliability of each of these estimates. You are not asked to find the
estimates. [3]
(Q2, Jan 2012)
3 A random variable X has the distribution B(13, 0.12).
(i) Find P(X < 2). [3]
Two independent values of X are found.
(ii) Find the probability that exactly one of these values is equal to 2. [3]
4 (a) The table gives the heights and masses of 5 people.
Person A B C D E
Height (m) 1.72 1.63 1.77 1.68 1.74
Mass (kg) 75 62 64 60 70
Calculate Spearman’s rank correlation coefficient. [5]
(b) In an art competition the value of Spearman’s rank correlation coefficient, rs, calculated from two
judges’ rankings was 0.75. A late entry for the competition was received and both judges ranked this
entry lower than all the others. By considering the formula for rs, explain whether the new value of rs
will be less than 0.75, equal to 0.75, or greater than 0.75. [3]
(Q4, Jan 2012)
y 5 4.2 3.7
Red 3.2 3.5 3.0
6
(i) Calculate the product moment correlation coefficient r between x and y. [4]
(iii) Over a longer period the value of r is −0.8. The mayor says, “This shows that having more tourists
causes sales at Sackton Stores to decrease.” Give a reason why this statement is not correct. [1]
(Q1, June 2012)
(ii) Find the probability that exactly one of the two discs is red. [3]
All the discs are replaced in the bag. Tony now takes three discs from the bag at random without replacement.
(iii) Given that the first disc Tony takes is red, find the probability that the third disc Tony takes is also red. [2]
2 The masses, x kg, of 50 bags of flour were measured and the results were summarised as follows.
n = 50 Σ(x − 1.5) = 1.4 Σ(x − 1.5)² = 0.05
Calculate the mean and standard deviation of the masses of these bags of flour. [6]
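The flour question uses coded totals. The sketch below assumes the summary reads n = 50, Σ(x − 1.5) = 1.4 and Σ(x − 1.5)² = 0.05, and it assumes the divisor-n form of the variance; variable names are illustrative.

```python
from math import sqrt

n = 50
shift = 1.5                 # the constant subtracted from every mass
sum_coded = 1.4             # sum of (x - 1.5)
sum_coded_squared = 0.05    # sum of (x - 1.5)^2

mean_coded = sum_coded / n
mean = shift + mean_coded   # undo the shift for the mean

# Subtracting a constant does not change the spread.
variance = sum_coded_squared / n - mean_coded ** 2
standard_deviation = sqrt(variance)

print(round(mean, 3), round(standard_deviation, 4))
```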
3 The test marks of 14 students are displayed in a stem-and-leaf diagram, as shown below.
0 |
1 | 2 6
2 | 1 3 5
3 | w x 4 8 y z
4 | 6 7 7
Key: 1 | 6 means 16 marks
(i) Find the lower quartile. [1]
(ii) Given that the median is 32, find the values of w and x. [2]
(iii) Find the possible values of the upper quartile. [2]
(iv) State one advantage of a stem-and-leaf diagram over a box-and-whisker plot. [1]
29
5 (i) Write down the value of Spearman’s rank correlation coefficient, rs, for the following sets of ranks.
(a)
Judge A ranks 1 2 3 4
Judge B ranks 1 2 3 4
[1]
(b)
Judge A ranks 1 2 3 4
Judge C ranks 4 3 2 1
[1]
(ii) Calculate the value of rs for the following ranks.
Judge D ranks 2 4 1 3
[3]
(iii) For each of parts (i)(a), (i)(b) and (ii), describe in everyday terms the relationship between the two
judges’ opinions. [3]
(Q5, June 2012)
6 A six-sided die is biased so that the probability of scoring 6 is 0.1 and the probabilities of scoring 1, 2, 3, 4,
and 5 are all equal. In a game at a fête, contestants pay £3 to roll this die. If the score is 6 they receive £10
back. If the score is 5 they receive £5 back. Otherwise they receive no money back. Find the organiser’s
expected profit for 100 rolls of the die. [5]
30
3 The Gross Domestic Product per Capita (GDP), x dollars, and the Infant Mortality Rate per thousand (IMR),
y, of 6 African countries were recorded and summarised as follows.
(i) Calculate the equation of the regression line of y on x for these 6 countries. [4]
The original data were plotted on a scatter diagram and the regression line of y on x was drawn, as shown
below.
[Scatter diagram with the regression line of y on x. Axes: x from 800 to 1600, y from 60 to 100.]
(ii) The GDP for another country, Tanzania, is 1300 dollars. Use the regression line in the diagram to
estimate the IMR of Tanzania. [1]
(iii) The GDP for Nigeria is 2400 dollars. Give two reasons why the regression line is unlikely to give a
reliable estimate for the IMR for Nigeria. [2]
(iv) The actual value of the IMR for Tanzania is 96. The data for Tanzania (x = 1300, y = 96) is now included
with the original 6 countries. Calculate the value of the product moment correlation coefficient, r, for
all 7 countries. [4]
(v) The IMR is now redefined as the infant mortality rate per hundred instead of per thousand, and the
value of r is recalculated for all 7 countries. Without calculation state what effect, if any, this would
have on the value of r found in part (iv). [1]
(Q3, Jan 2013)
4 (i) How many different 3-digit numbers can be formed using the digits 1, 2 and 3 when
(ii) How many different 4-digit numbers can be formed using the digits 1, 2 and 3 when each digit may be
included at most twice? [5]
31
7 (i) Two judges rank n competitors, where n is an even number. Judge 2 reverses each consecutive pair of
ranks given by Judge 1, as shown.
Given that the value of Spearman’s coefficient of rank correlation is 63/65, find n. [4]
(ii) An experiment produced some data from a bivariate distribution. The product moment correlation
coefficient is denoted by r, and Spearman’s rank correlation coefficient is denoted by rs.
r = 1 & rs = 1
r ≠ 1 & rs ≠ 1
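Part (i) of the question above can be checked by a direct search. The sketch assumes that Judge 1 gives ranks 1, 2, 3, 4, ... and Judge 2 gives 2, 1, 4, 3, ... (the table referred to by "as shown" is not reproduced here), and it reads the given coefficient as 63/65. The function name `rs_pairwise_swap` is hypothetical.

```python
from fractions import Fraction


def rs_pairwise_swap(n):
    """Exact Spearman coefficient when each consecutive pair of ranks is swapped."""
    judge_1 = list(range(1, n + 1))
    judge_2 = []
    for k in range(0, n, 2):
        judge_2 += [judge_1[k + 1], judge_1[k]]  # swap ranks within the pair
    sum_d_squared = sum((p - q) ** 2 for p, q in zip(judge_1, judge_2))
    return 1 - Fraction(6 * sum_d_squared, n * (n * n - 1))


target = Fraction(63, 65)

# Search the even values of n for the one that gives the stated coefficient.
n_found = next(n for n in range(2, 101, 2) if rs_pairwise_swap(n) == target)
print(n_found)
```

Every difference d is +1 or −1, so Σd² = n and the formula reduces to rs = 1 − 6/(n² − 1), which can be solved by hand as well.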
(a) the first time she succeeds is on her 5th attempt, [2]
(b) the first time she succeeds is after her 5th attempt, [2]
(c) the second time she succeeds is before her 4th attempt. [4]
Jill also makes repeated attempts to hit the target. Each attempt of either Jill or Sandra is independent. Each
time that Jill attempts to hit the target, the probability that she succeeds is 0.2. Sandra and Jill take turns
attempting to hit the target, with Sandra going first.
(ii) Find the probability that the first person to hit the target is Sandra, on her