S1 Bivariate Data

Uploaded by

www.aleckmpanda
OCR Maths S1

Topic Questions from Papers

Bivariate Data

PhysicsAndMathsTutor.com

1 The scatter diagrams below illustrate three sets of bivariate data, A, B and C.

State, with an explanation in each case, which of the three sets of data has
(i) the largest,
(ii) the smallest,
value of the product moment correlation coefficient. [4]
(Q1, Jan 2005)
2 The back-to-back stem-and-leaf diagram below shows the number of hours of television watched per
week by each of 15 boys and 15 girls.

Boys | | Girls
8 7 7 6 6 4 4 3 | 0 | 0 5 5 6 6 7 7 8 8 9
2 2 0 | 1 | 0 0 4
6 5 4 | 2 | 2 7
5 | 3 |

Key: 4 | 2 | 2 means a boy who watched 24 hours and a girl who watched 22 hours of television per week.

(i) Find the median and the quartiles of the results for the boys. [3]
(ii) Give a reason why the median might be preferred to the mean in using an average to compare the
two data sets. [1]
(iii) State one advantage, and one disadvantage, of using stem-and-leaf diagrams rather than
box-and-whisker plots to represent the data. [2]

2
3 Two commentators gave ratings out of 100 for seven sports personalities. The ratings are shown in
the table below.

Personality A B C D E F G
Commentator I 73 76 78 65 86 82 91
Commentator II 77 78 79 80 86 89 95

(i) Calculate Spearman’s rank correlation coefficient for these ratings. [5]
(ii) State what your answer tells you about the ratings given by the two commentators. [1]
(Q3, Jan 2005)

4 The table below shows the probability distribution of the random variable X.

x −2 −1 0 1 2
P(X = x) 1/4 1/5 k 2/5 1/10

(i) Find the value of the constant k. [2]

(ii) Calculate the values of E(X) and Var(X). [5]

5 On average 1 in 20 members of the population of this country has a particular DNA feature. Members
of the population are selected at random until one is found who has this feature.

(i) Find the probability that the first person to have this feature is
(a) the sixth person selected, [3]
(b) not among the first 10 people selected. [3]
[3]
(iii) the questions on geometric distributions and on binomial distributions are separated by at least 2
other questions. [4]

3
9 Five observations of bivariate data produce the following results, denoted as (xᵢ, yᵢ) for i = 1, 2, 3, 4, 5.

(13, 2.7) (13, 4.0) (18, 2.8) (23, 3.3) (23, 2.2)

[Σx = 90, Σy = 15.0, Σx² = 1720, Σy² = 46.86, Σxy = 264.0.]

(i) Show that the regression line of y on x has gradient −0.06, and find its equation in the form
y = a + bx. [4]
(ii) The regression line is used to estimate the value of y corresponding to x = 20, but the value x = 20
is accurate only to the nearest whole number. Calculate the difference between the largest and
the smallest values that the estimated value of y could take. [3]
The numbers e₁, e₂, e₃, e₄, e₅ are defined by
eᵢ = a + bxᵢ − yᵢ for i = 1, 2, 3, 4, 5.

(iii) The values of e₁, e₂ and e₃ are 0.6, −0.7 and 0.2 respectively. Calculate the values of e₄ and e₅.
[2]
(iv) Calculate the value of e₁² + e₂² + e₃² + e₄² + e₅² and explain the relevance of this quantity to the
regression line found in part (i). [2]
(v) Find the mean and the variance of e₁, e₂, e₃, e₄, e₅. [4]
(Q9, Jan 2005)
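
The gradient in part (i) can be checked from the summary totals alone. The sketch below is only an illustration, not an official mark scheme: the function name `regression_from_sums` and its parameter names are made up for this example, and it assumes the usual least squares formulae b = Sxy / Sxx and a = ȳ − b x̄.

```python
def regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for the least squares line y = a + b*x of y on x.

    Hypothetical helper: works from summary totals rather than raw data.
    """
    s_xx = sum_x2 - sum_x ** 2 / n        # Sxx = sum of x^2 - (sum of x)^2 / n
    s_xy = sum_xy - sum_x * sum_y / n     # Sxy = sum of xy - (sum of x)(sum of y) / n
    b = s_xy / s_xx                       # gradient
    a = sum_y / n - b * sum_x / n         # line passes through (mean x, mean y)
    return a, b


# Summary totals quoted in the question.
a, b = regression_from_sums(n=5, sum_x=90, sum_y=15.0, sum_x2=1720, sum_xy=264.0)

# Residuals e_i = a + b*x_i - y_i for the first three observations,
# which the question states are 0.6, -0.7 and 0.2.
points = [(13, 2.7), (13, 4.0), (18, 2.8)]
residuals = [a + b * x - y for x, y in points]
print(round(b, 2), round(a, 2), [round(e, 1) for e in residuals])
```

Working from the totals keeps the arithmetic identical to what a candidate would do by hand, which makes it easy to compare each intermediate value.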
4732/Jan05

4
1 (i) Calculate the value of Spearman’s rank correlation coefficient between the two sets of rankings,
A and B, shown in Table 1. [4]

A 1 2 3 4 5
B 4 1 3 2 5
Table 1

(ii) The value of Spearman’s rank correlation coefficient between the set of rankings B and a third
set of rankings, C, is known to be −1. Copy and complete Table 2 showing the set of rankings C.
[2]

B 4 1 3 2 5
C
Table 2
(Q1, June 2005)
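
A short sketch of the calculation in this question, assuming no tied ranks so that the formula rs = 1 − 6Σd² / (n(n² − 1)) applies. The helper name `spearman_from_ranks` is hypothetical. For part (ii) the sketch uses the fact that a coefficient of −1 corresponds to exactly reversed rankings, so each rank in C is taken as n + 1 minus the rank in B.

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """Spearman's coefficient for two untied rankings of the same items."""
    n = len(ranks_1)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


A = [1, 2, 3, 4, 5]
B = [4, 1, 3, 2, 5]
rs_ab = spearman_from_ranks(A, B)      # differences are -3, 1, 0, 2, 0

# Reverse the ranking B: rank k becomes n + 1 - k.
C = [len(B) + 1 - rank for rank in B]
rs_bc = spearman_from_ranks(B, C)

print(round(rs_ab, 3), C, rs_bc)
```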

2 The probability that a certain sample of radioactive material emits an alpha-particle in one unit of time
is 0.14. In one unit of time no more than one alpha-particle can be emitted. The number of units of
time up to and including the first in which an alpha-particle is emitted is denoted by T .

(i) Find the value of


(a) P(T = 5), [3]
(b) P(T < 8). [3]

(ii) State the value of E(T ). [2]

3 In a supermarket the proportion of shoppers who buy washing powder is denoted by p. 16 shoppers

5
4 The table shows the latitude, x (in degrees correct to 3 significant figures), and the average rainfall y
(in cm correct to 3 significant figures) of five European cities.

City x y
Berlin 52.5 58.2
Bucharest 44.4 58.7
Moscow 55.8 53.3
St Petersburg 60.0 47.8
Warsaw 52.3 56.6

[n = 5, Σx = 265.0, Σy = 274.6, Σx² = 14 176.54, Σy² = 15 162.22, Σxy = 14 464.10.]

(i) Calculate the product moment correlation coefficient. [3]

(ii) The values of y in the table were in fact obtained from measurements in inches and converted into
centimetres by multiplying by 2.54. State what effect it would have had on the value of the product
moment correlation coefficient if it had been calculated using inches instead of centimetres. [1]

(iii) It is required to estimate the annual rainfall at Bergen, where x = 60.4. Calculate the equation
of an appropriate line of regression, giving your answer in simplified form, and use it to find the
required estimate. [5]
(Q4, June 2005)
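
Part (ii) turns on the fact that the product moment correlation coefficient is unaffected by a change of units. The sketch below illustrates this with the table values; the function name `pmcc` and the variable names are invented for the example, and the inch values are simply the centimetre values divided by 2.54, which is an assumption about how the original measurements relate to the table.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


latitude = [52.5, 44.4, 55.8, 60.0, 52.3]      # x, degrees
rain_cm = [58.2, 58.7, 53.3, 47.8, 56.6]       # y, centimetres
rain_in = [y / 2.54 for y in rain_cm]          # same rainfall in inches

r_cm = pmcc(latitude, rain_cm)
r_in = pmcc(latitude, rain_in)
print(round(r_cm, 3), round(r_in, 3))          # the two values agree
```

Dividing every y value by a positive constant scales Sxy by that constant and Syy by its square, so the factor cancels in r.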

6
1 Some observations of bivariate data were made and the equations of the two regression lines were
found to be as follows.
y on x : y = −0.6x + 13.0
x on y : x = −1.6y + 21.0

(i) State, with a reason, whether the correlation between x and y is negative or positive. [1]

(ii) Neither variable is controlled. Calculate an estimate of the value of x when y = 7.0. [2]

(iii) Find the values of x̄ and ȳ. [3]


(Q1, June 2006)
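
Both regression lines pass through the mean point, so the means can be found by solving the two equations simultaneously. The sketch below is one way to do this; the function name `mean_point` and its parameters are hypothetical, with the lines written as y = a1 + b1·x and x = a2 + b2·y.

```python
def mean_point(a1, b1, a2, b2):
    """Intersection of y = a1 + b1*x (y on x) and x = a2 + b2*y (x on y).

    Substituting the first equation into the second gives
    x * (1 - b1*b2) = a2 + b2*a1.
    """
    x = (a2 + b2 * a1) / (1 - b1 * b2)
    y = a1 + b1 * x
    return x, y


# y on x: y = -0.6x + 13.0    x on y: x = -1.6y + 21.0
x_bar, y_bar = mean_point(a1=13.0, b1=-0.6, a2=21.0, b2=-1.6)
print(round(x_bar, 2), round(y_bar, 2))
```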
2 A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it
is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from
the bag.

Find the probability that


(i) the second disc is black, given that the first disc was black, [1]
(ii) the second disc is black, [3]
(iii) the two discs are of different colours. [3]

3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a
row.

(i) How many different arrangements of the letters are possible? [3]
(ii) In how many of these arrangements are all three Ds together? [2]

The 7 cards are now shuffled and 2 cards are selected at random, without replacement.

7
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission
earned, in thousands of pounds, by each of seven sales agents in 2005.

Agent A B C D E F G
Distance travelled 18 15 12 14 16 24 13
Commission earned 18 45 19 24 27 22 23

(i) (a) Calculate Spearman’s rank correlation coefficient, rs , for these data. [5]
(b) Comment briefly on your value of rs with reference to this context. [1]
(c) After these data were collected, agent A found that he had made a mistake. He had actually
travelled 19 000 miles in 2005. State, with a reason, but without further calculation, whether
the value of Spearman’s rank correlation coefficient will increase, decrease or stay the same.
[2]

The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented
no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, y, together with
the data for distance travelled, x, are illustrated in the scatter diagram below.

(ii) For this scatter diagram, what can you say about the value of
(a) Spearman’s rank correlation coefficient, [1]
(b) the product moment correlation coefficient? [1]
(Q6, June 2006)

(ii) Find E(X). [2]

8
2 The table contains data concerning five households selected at random from a certain town.

Number of people in the household 2 3 3 5 7
Number of cars belonging to people in the household 1 1 3 2 4

(i) Calculate the product moment correlation coefficient, r, for the data in the table. [5]

(ii) Give a reason why it would not be sensible to use your answer to draw a conclusion about all the
households in the town. [1]
(Q2, Jan 2007)

(i) For which one or more of these variables is
(a) the mean equal to the median, [1]
(b) the mean greater than the median? [1]

(ii) Give a reason why none of these diagrams could represent a geometric distribution. [1]

(iii) Which one of these diagrams could not represent a binomial distribution? Explain your answer
briefly. [2]

3 The digits 1, 2, 3, 4 and 5 are arranged in random order, to form a five-digit number.

(i) How many different five-digit numbers can be formed? [1]

(ii) Find the probability that the five-digit number is
(a) odd, [2]
(b) less than 23 000. [3]

9
5 A chemical solution was gradually heated. At five-minute intervals the time, x minutes, and the
temperature, y °C, were noted.

x 0 5 10 15 20 25 30 35
y 0.8 3.0 6.8 10.9 15.6 19.6 23.4 26.7

[n = 8, Σx = 140, Σy = 106.8, Σx² = 3500, Σy² = 2062.66, Σxy = 2685.0.]

(i) Calculate the equation of the regression line of y on x. [4]

(ii) Use your equation to estimate the temperature after 12 minutes. [2]

(iii) It is given that the value of the product moment correlation coefficient is close to +1. Comment
on the reliability of using your equation to estimate y when
(a) x = 17,
(b) x = 57. [2]
(Q5, Jan 2007)

© OCR 2007 4732/01 Jan07

1 The table shows the probability distribution for a random variable X.

x 0 1 2 3
P(X = x) 0.1 0.2 0.3 0.4

Calculate E(X) and Var(X). [5]

10
2 Two judges each placed skaters from five countries in rank order.

Position 1st 2nd 3rd 4th 5th
Judge 1 UK France Russia Poland Canada
Judge 2 Russia Canada France UK Poland

Calculate Spearman’s rank correlation coefficient, rs , for the two judges’ rankings. [5]
(Q2, June 2007)
3 (i) How many different teams of 7 people can be chosen, without regard to order, from a squad
of 15? [2]

(ii) The squad consists of 6 forwards and 9 defenders. How many different teams containing
3 forwards and 4 defenders can be chosen? [2]
(ii) P(T > 4), [2]
(iii) E(T). [1]

11
3 A sample of bivariate data was taken and the results were summarised as follows.

n = 5 Σx = 24 Σx² = 130 Σy = 39 Σy² = 361 Σxy = 212

(i) Show that the value of the product moment correlation coefficient r is 0.855, correct to 3 significant
figures. [2]

(ii) The ranks of the data were found. One student calculated Spearman’s rank correlation coefficient
rs, and found that rs = 0.7. Another student calculated the product moment coefficient, R, of
these ranks. State which one of the following statements is true, and explain your answer briefly.
(A) R = 0.855
(B) R = 0.7
(C) It is impossible to give the value of R without carrying out a calculation using the original
data. [3]

(iii) All the values of x are now multiplied by a scaling factor of 2. State the new values of r and rs.
[2]
(Q3, Jan 2008)

4 A supermarket has a large stock of eggs. 40% of the stock are from a firm called Eggzact. 12% of the
stock are brown eggs from Eggzact.

An egg is chosen at random from the stock. Calculate the probability that
(i) this egg is brown, given that it is from Eggzact, [2]
(ii) this egg is from Eggzact and is not brown. [2]

© OCR 2008 4732/01 Jan08

1 Each time a certain triangular spinner is spun, it lands on one of the numbers 0, 1 and 2 with
probabilities as shown in the table.

Number Probability
0 0.7
1 0.2
2 0.1

The spinner is spun twice. The total of the two numbers on which it lands is denoted by X.

(i) Show that P(X = 2) = 0.18. [2]

The probability distribution of X is given in the table.

x 0 1 2 3 4
P(X = x) 0.49 0.28 0.18 0.04 0.01

(ii) Calculate E(X) and Var(X). [5]

12
2 The table shows the age, x years, and the mean diameter, y cm, of the trunk of each of seven randomly
selected trees of a certain species.

Age (x years) 11 12 20 28 35 45 51
Mean trunk diameter (y cm) 12.2 16.0 26.4 39.2 39.6 51.3 60.6

[n = 7, Σx = 202, Σy = 245.3, Σx² = 7300, Σy² = 10 510.65, Σxy = 8736.9.]

(i) (a) Use an appropriate formula to show that the gradient of the regression line of y on x is 1.13,
correct to 2 decimal places. [2]
(b) Find the equation of the regression line of y on x. [2]

(ii) Use your equation to estimate the mean trunk diameter of a tree of this species with age
(a) 30 years, [1]
(b) 100 years. [1]

It is given that the value of the product moment correlation coefficient for the data in the table is 0.988,
correct to 3 decimal places.

(iii) Comment on the reliability of each of your two estimates. [2]


(Q2, Jan 2009)

© OCR 2009 4732 Jan09



13
9 It is thought that the pH value of sand (a measure of the sand’s acidity) may affect the extent to which
a particular species of plant will grow in that sand. A botanist wished to determine whether there was
any correlation between the pH value of the sand on certain sand dunes, and the amount of each of two
plant species growing there. She chose random sections of equal area on each of eight sand dunes and
measured the pH values. She then measured the area within each section that was covered by each of
the two species. The results were as follows.

Dune A B C D E F G H
pH value, x 8.5 8.5 9.5 8.5 6.5 7.5 8.5 9.0
Area covered, y cm², Species P 150 150 575 330 45 15 340 330
Area covered, y cm², Species Q 170 15 80 230 75 25 0 0

The results for species P can be summarised by

n = 8, Σx = 66.5, Σx² = 558.75, Σy = 1935, Σy² = 711 275, Σxy = 17 082.5.

(i) Give a reason why it might be appropriate to calculate the equation of the regression line of y on
x rather than x on y in this situation. [1]

(ii) Calculate the equation of the regression line of y on x for species P, in the form y = a + bx, giving
the values of a and b correct to 3 significant figures. [4]

(iii) Estimate the value of y for species P on sand where the pH value is 7.0. [2]

The values of the product moment correlation coefficient between x and y for species P and Q are
rP = 0.828 and rQ = 0.0302.

(iv) Describe the relationship between the area covered by species Q and the pH value. [1]

(v) State, with a reason, whether the regression line of y on x for species P will provide a reliable
estimate of the value of y when the pH value is
(a) 8, [1]
(b) 4. [1]

(vi) Assume that the equation of the regression line of y on x for species Q is also known. State, with
a reason, whether this line will provide a reliable estimate of the value of y when the pH value
is 8. [1]
(Q9, Jan 2008)

© OCR 2008 4732/01 Jan08


woodpecker. [1]
(iii) Calculate the probability that she sees a woodpecker on exactly 2 days in the first 15 days. [3]

14
4 Three tutors each marked the coursework of five students. The marks are given in the table.

Student A B C D E
Tutor 1 73 67 60 48 39
Tutor 2 62 50 61 76 65
Tutor 3 42 50 63 54 71

(i) Calculate Spearman’s rank correlation coefficient, rs, between the marks for tutors 1 and 2. [5]

(ii) The values of rs for the other pairs of tutors are as follows.
Tutors 1 and 3: rs = −0.9
Tutors 2 and 3: rs = 0.3
State which two tutors differ most widely in their judgements. Give your reason. [2]
(Q4, Jan 2009)

1 20% of packets of a certain kind of cereal contain a free gift. Jane buys one packet a week for 8 weeks.
The number of free gifts that Jane receives is denoted by X. Assuming that Jane’s 8 packets can be
regarded as a random sample, find
(i) P(X = 3), [3]
(ii) P(X ≥ 3), [2]
(iii) E(X). [2]

15
2 Two judges placed 7 dancers in rank order. Both judges placed dancers A and B in the first two places,
but in opposite orders. The judges agreed about the ranks for all the other 5 dancers. Calculate the
value of Spearman’s rank correlation coefficient. [4]
(Q2, June 2009)

5 A washing-up bowl contains 6 spoons, 5 forks and 3 knives. Three of these 14 items are removed at
random, without replacement. Find the probability that
(i) all three items are of different kinds, [3]
(ii) all three items are of the same kind. [3]

16
6 (a) A student calculated the values of the product moment correlation coefficient, r, and Spearman’s
rank correlation coefficient, rs, for two sets of bivariate data, A and B. His results are given
below.
A: r = 0.9 and rs = 1
B: r = 1 and rs = 0.9
With the aid of a diagram where appropriate, explain why the student’s results for A could both
be correct but his results for B cannot both be correct. [3]

(b) An old research paper has been partially destroyed. The surviving part of the paper contains the
following incomplete information about some bivariate data from an experiment.

The mean of x is 4.5. The
The equation of the regression line of y on x is y = 2.4x + 3.7.
The equation of the regression line of x on y is x = 0.40y –

Calculate the missing constant at the end of the equation of the second regression line. [4]
(Q6, Jan 2010)

7 The table shows the numbers of male and female members of a vintage car club who own either a
Jaguar or a Bentley. No member owns both makes of car.

17
3 In an agricultural experiment, the relationship between the amount of water supplied, x units, and the
yield, y units, was investigated. Six values of x were chosen and for each value of x the corresponding
value of y was measured. The results are shown in the table.

x 1 2 3 4 5 6
y 3 6 8 8 11 10

These results, together with the regression line of y on x, are plotted on the graph.

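[Graph: the six data points and the regression line of y on x, with vertical dotted lines joining the
points to the line; x axis from 0 to 7, y axis from 0 to 12.]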

(i) Give a reason why the regression line of x on y is not suitable in this context. [1]

(ii) Explain the significance, for the regression line of y on x, of the distances shown by the vertical
dotted lines in the diagram. [2]

(iii) Calculate the value of the product moment correlation coefficient, r . [3]

(iv) Comment on your value of r in relation to the diagram. [2]


(Q3, June 2009)

© OCR 2009 4732 Jun09


18
3 The heights, h m, and weights, m kg, of five men were measured. The results are plotted on the
diagram.

[Scatter diagram: m on the vertical axis (70 to 80) against h on the horizontal axis (1.5 to 2.2).]

The results are summarised as follows.


n = 5 Σh = 9.02 Σm = 377.7 Σh² = 16.382 Σm² = 28 558.67 Σhm = 681.612

(i) Use the summarised data to calculate the value of the product moment correlation coefficient, r.
[3]

(ii) Comment on your value of r in relation to the diagram. [2]

(iii) It was decided to re-calculate the value of r after converting the heights to feet and the masses to
pounds. State what effect, if any, this will have on the value of r. [1]

(iv) One of the men had height 1.63 m and mass 78.4 kg. The data for this man were removed and the
value of r was re-calculated using the original data for the remaining four men. State in general
terms what effect, if any, this will have on the value of r. [1]
(Q3, Jan 2010)

4 A certain four-sided die is biased. The score, X , on each throw is a random variable with probability
distribution as shown in the table. Throws of the die are independent.

x 0 1 2 3
P(X = x) 1/2 1/4 1/8 1/8

(i) Calculate E(X ) and Var(X ). [5]

The die is thrown 10 times.

(ii) Find the probability that there are not more than 4 throws on which the score is 1. [2]

(iii) Find the probability that there are exactly 4 throws on which the score is 2. [3]

© OCR 2010 4732 Jan10 Turn over


(v) In a German examination the marks of the same students had an interquartile range of 16 marks.
What does this result indicate about the performance of the students in the German examination
as compared with the French examination? [3]

19
2 Three skaters, A, B and C, are placed in rank order by four judges. Judge P ranks skater A in 1st
place, skater B in 2nd place and skater C in 3rd place.

(i) Without carrying out any calculation, state the value of Spearman’s rank correlation coefficient
for the following ranks. Give a reason for your answer. [1]

Skater A B C
Judge P 1 2 3
Judge Q 3 2 1

(ii) Calculate the value of Spearman’s rank correlation coefficient for the following ranks. [3]

Skater A B C
Judge P 1 2 3
Judge R 3 1 2

(iii) Judge S ranks the skaters at random. Find the probability that the value of Spearman’s rank
correlation coefficient between the ranks of judge P and judge S is 1. [3]
(Q2, June 2010)
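
Part (iii) can be checked by listing every possible ranking judge S could give. The sketch below is an illustration under two stated assumptions: all six rankings are equally likely and there are no tied ranks. The helper name `spearman_from_ranks` is hypothetical, and exact fractions are used so that the comparison with 1 is not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations


def spearman_from_ranks(ranks_1, ranks_2):
    """Spearman's coefficient for two untied rankings, as an exact fraction."""
    n = len(ranks_1)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(ranks_1, ranks_2))
    return 1 - Fraction(6 * sum_d2, n * (n ** 2 - 1))


judge_p = (1, 2, 3)
judge_r = (3, 1, 2)
rs_pr = spearman_from_ranks(judge_p, judge_r)      # part (ii)

# Every ranking judge S could give, assumed equally likely.
all_rankings = list(permutations(judge_p))
perfect = [s for s in all_rankings if spearman_from_ranks(judge_p, s) == 1]
probability = Fraction(len(perfect), len(all_rankings))

print(rs_pr, probability)
```

Only the ranking identical to judge P's gives Σd² = 0, which is why a single outcome out of the six qualifies.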
© OCR 2010 4732 Jun10

20
3 (i) Some values, (x, y), of a bivariate distribution are plotted on a scatter diagram and a regression
line is to be drawn. Explain how to decide whether the regression line of y on x or the regression
line of x on y is appropriate. [2]

(ii) In an experiment the temperature, x °C, of a rod was gradually increased from 0 °C, and the
extension, y mm, was measured nine times at 50 °C intervals. The results are summarised below.

n = 9 Σx = 1800 Σy = 14.4 Σx² = 510 000 Σy² = 32.6416 Σxy = 4080

(a) Show that the gradient of the regression line of y on x is 0.008 and find the equation of this
line. [4]
(b) Use your equation to estimate the temperature when the extension is 2.5 mm. [1]
(c) Use your equation to estimate the extension for a temperature of −50 °C. [1]
(d) Comment on the meaning and the reliability of your estimate in part (c). [2]
(Q3, June 2010)
4 (i) The random variable W has the distribution B(10, 1/3). Find
(a) P(W ≤ 2), [1]
(b) P(W = 2). [2]

(ii) The random variable X has the distribution B(15, 0.22).


(a) Find P(X = 4). [2]
(b) Find E(X ) and Var(X ). [3]

5 Each of four cards has a number printed on it as shown.

1 2 3 3

Two of the cards are chosen at random, without replacement. The random variable X denotes the sum

[Diagram of lettered cards: A C E J T, A B G L M, N P Q R Z]

From these cards, 3 white cards and 4 grey cards are selected at random without regard to order.
(a) How many selections of seven cards are possible? [3]
(b) Find the probability that the seven cards include exactly one card showing the letter A. [4]

7 The probability distribution of a discrete random variable, X, is shown below.

x 0 2
P(X = x) a 1−a

(i) Find E(X) in terms of a. [2]

(ii) Show that Var(X) = 4a(1 − a). [3]

21
3 A firm wishes to assess whether there is a linear relationship between the annual amount spent on
advertising, £x thousand, and the annual profit, £y thousand. A summary of the figures for 12 years
is as follows.

n = 12 Σx = 86.6 Σy = 943.8 Σx² = 658.76 Σy² = 83 663.00 Σxy = 7351.12

(i) Calculate the product moment correlation coefficient, showing that it is greater than 0.9. [3]

(ii) Comment briefly on this value in this context. [1]

(iii) A manager claims that this result shows that spending more money on advertising in the future
will result in greater profits. Make two criticisms of this claim. [2]

(iv) Calculate the equation of the regression line of y on x. [4]

(v) Estimate the annual profit during a year when £7400 was spent on advertising. [2]
(Q3, Jan 2011)


4 Jenny and Omar are each allowed two attempts at a high jump.

(i) The probability that Jenny will succeed on her first attempt is 0.6. If she fails on her first attempt,
the probability that she will succeed on her second attempt is 0.7. Calculate the probability that
Jenny will succeed. [3]

(ii) The probability that Omar will succeed on his first attempt is p. If he fails on his first attempt, the
probability that he will succeed on his second attempt is also p. The probability that he succeeds
is 0.51. Find p. [4]

5 30% of packets of Natural Crunch Crisps contain a free gift. Jan buys 5 packets each week.

(i) The number of free gifts that Jan receives in a week is denoted by X. Name a suitable
probability distribution with which to model X, giving the value(s) of any parameter(s). State
any assumption(s) necessary for the distribution to be a valid model. [4]

Assume now that your model is valid.

(ii) Find
(a) P(X ≤ 2), [1]
(b) P(X = 2). [2]

(iii) Find the probability that, in the next 7 weeks, there are exactly 3 weeks in which Jan receives
exactly 2 free gifts. [3]

[Questions 6, 7 and 8 are printed overleaf.]

© OCR 2011 4732 Jan11 Turn over

22
8 Five dogs, A, B, C, D and E, took part in three races. The order in which they finished the first race
was ABCDE.

(i) Spearman’s rank correlation coefficient between the orders for the 5 dogs in the first two races
was found to be −1. Write down the order in which the dogs finished the second race. [1]

(ii) Spearman’s rank correlation coefficient between the orders for the 5 dogs in the first race and the
third race was found to be 0.9.
(a) Show that, in the usual notation (as in the List of Formulae), Σd² = 2. [2]
(b) Hence or otherwise find a possible order in which the dogs could have finished the third
race. [2]
(Q8, Jan 2011)

Copyright Information
OCR is committed to seeking permission to reproduce all third-party content that it uses in its assessment materials. OCR has attempted to identify and contact all copyright holders
whose work is used in this paper. To avoid the issue of disclosure of answer-related information to candidates, all copyright acknowledgements are reproduced in the OCR Copyright
Acknowledgements Booklet. This is produced for each series of examinations and is freely available to download from our public website (www.ocr.org.uk) after the live examination series.
If OCR has unwittingly failed to correctly acknowledge or clear any third-party content in this assessment material, OCR will be happy to correct its mistake at the earliest possible opportunity.
For queries or further information please contact the Copyright Team, First Floor, 9 Hills Road, Cambridge CB2 1GE.
OCR is part of the Cambridge Assessment Group; Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department
of the University of Cambridge.

23
1 Five salesmen from a certain firm were selected at random for a survey. For each salesman, the annual
income, x thousand pounds, and the distance driven last year, y thousand miles, were recorded. The
results were summarised as follows.

n = 5 Σx = 251 Σx² = 14 323 Σy = 65 Σy² = 855 Σxy = 3247

(i) (a) Show that the product moment correlation coefficient, r, between x and y is −0.122, correct
to 3 significant figures. [3]
(b) State what this value of r shows about the relationship between annual income and distance
driven last year for these five salesmen. [1]
(c) It was decided to recalculate r with the distances measured in kilometres instead of miles.
State what effect, if any, this would have on the value of r. [1]

(ii) Another salesman from the firm is selected at random. His annual income is known to be
£52 000, but the distance that he drove last year is unknown. In order to estimate this distance, a
regression line based on the above data is used. Comment on the reliability of such an estimate.
[2]
(Q1, June 2011)

24
2 The orders in which 4 contestants, P, Q, R and S, were placed in two competitions are shown in the
table.
Position 1st 2nd 3rd 4th
Competition 1 Q R S P
Competition 2 Q P R S

Calculate Spearman’s rank correlation coefficient between these two orders. [5]
(Q2, June 2011)
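
When the data are given as finishing orders rather than ranks, the first step is to turn each order into a rank for every contestant. The sketch below shows that step, assuming no ties; the helper name `ranks_from_order` is hypothetical.

```python
def ranks_from_order(order):
    """Map each contestant to their position (1st = 1) in a finishing order."""
    return {name: position for position, name in enumerate(order, start=1)}


competition_1 = ["Q", "R", "S", "P"]
competition_2 = ["Q", "P", "R", "S"]

ranks_1 = ranks_from_order(competition_1)
ranks_2 = ranks_from_order(competition_2)

# Differences in rank are taken contestant by contestant, not position by position.
sum_d2 = sum((ranks_1[name] - ranks_2[name]) ** 2 for name in ranks_1)
n = len(ranks_1)
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(rs, 3))
```

Keying the ranks by contestant avoids the common slip of subtracting the two rows of the table as printed, which pairs positions rather than contestants.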
3 (i) A random variable, X, has the distribution B(12, 0.85). Find
(a) P(X > 10), [2]
(b) P(X = 10), [2]
(c) Var(X). [2]

(ii) A random variable, Y, has the distribution B(2, 1/4). Two independent values of Y are found. Find
the probability that the sum of these two values is 1. [4]

© OCR 2011 4732 Jun11

25
7 The diagram shows the results of an experiment involving some bivariate data. The least squares
regression line of y on x for these results is also shown.

[Diagram: data points marked + and the least squares regression line of y on x; both axes run from O to 5.]


(i) Given that the least squares regression line of y on x is used for an estimation, state which of x
or y is treated as the independent variable. [1]

(ii) Use the diagram to explain what is meant by ‘least squares’. [2]

(iii) State, with a reason, the value of Spearman’s rank correlation coefficient for these data. [2]

(iv) What can be said about the value of the product moment correlation coefficient for these data?
[1]
(Q7, June 2011)
8 Ann, Bill, Chris and Dipak play a game with a fair cubical die. Starting with Ann they take turns,
in alphabetical order, to throw the die. This process is repeated as many times as necessary until a
player throws a 6. When this happens, the game stops and this player is the winner.

Find the probability that


(i) Chris wins on his first throw, [1]
(ii) Dipak wins on his second throw, [3]
(iii) Ann gets a third throw, [2]
(iv) Bill throws the die exactly three times. [4]
(ii) Find E(X). [2]

26
2 In an experiment, the percentage sand content, y, of soil in a given region was measured at nine different
depths, x cm, taken at intervals of 6 cm from 0 cm to 48 cm. The results are summarised below.

n = 9   Σx = 216   Σx² = 7344   Σy = 512.4   Σy² = 30 595   Σxy = 10 674

(i) State, with a reason, which variable is the independent variable. [1]
(ii) Calculate the product moment correlation coefficient between x and y. [3]
(iii) (a) Calculate the equation of the appropriate regression line. [3]
(b) This regression line is used to estimate the percentage sand content at depths of 25 cm and
100 cm. Comment on the reliability of each of these estimates. You are not asked to find the
estimates. [3]
(Q2, Jan 2012)
3 A random variable X has the distribution B(13, 0.12).

(i) Find P(X < 2). [3]

Two independent values of X are found.

(ii) Find the probability that exactly one of these values is equal to 2. [3]

27
4 (a) The table gives the heights and masses of 5 people.

Person        A     B     C     D     E
Height (m)    1.72  1.63  1.77  1.68  1.74
Mass (kg)     75    62    64    60    70

Calculate Spearman’s rank correlation coefficient. [5]
(b) In an art competition the value of Spearman’s rank correlation coefficient, rs, calculated from two
judges’ rankings was 0.75. A late entry for the competition was received and both judges ranked this
entry lower than all the others. By considering the formula for rs, explain whether the new value of rs
will be less than 0.75, equal to 0.75, or greater than 0.75. [3]
(Q4, Jan 2012)

© OCR 2012 4732 Jan12

28
1 For each of the last five years the number of tourists, x thousands, visiting Sackton, and the average weekly
sales, £y thousands, in Sackton Stores were noted. The table shows the results.

Year   2007   2008   2009   2010   2011
x      250    270    264    290    292
y      4.2    3.7    3.2    3.5    3.0

(i) Calculate the product moment correlation coefficient r between x and y. [4]

(ii) It is required to estimate the average weekly sales at Sackton Stores in a year when the number of
tourists is 280 000. Calculate the equation of an appropriate regression line, and use it to find this
estimate. [4]

(iii) Over a longer period the value of r is −0.8. The mayor says, “This shows that having more tourists
causes sales at Sackton Stores to decrease.” Give a reason why this statement is not correct. [1]
(Q1, June 2012)
2 The masses, x kg, of 50 bags of flour were measured and the results were summarised as follows.

n = 50   Σ(x − 1.5) = 1.4   Σ(x − 1.5)² = 0.05

Calculate the mean and standard deviation of the masses of these bags of flour. [6]

3 The test marks of 14 students are displayed in a stem-and-leaf diagram, as shown below.

0 |
1 | 2 6
2 | 1 3 5
3 | w x 4 8 y z
4 | 6 7 7          Key: 1 | 6 means 16 marks

(i) Find the lower quartile. [1]
(ii) Given that the median is 32, find the values of w and x. [2]
(iii) Find the possible values of the upper quartile. [2]
(iv) State one advantage of a stem-and-leaf diagram over a box-and-whisker plot. [1]
(v) State one advantage of a box-and-whisker plot over a stem-and-leaf diagram. [1]

4 A bag contains 5 red discs and 1 black disc. Tina takes two discs from the bag at random without replacement.

(i) The diagram shows part of a tree diagram to illustrate this situation.

[Tree diagram: First disc branches Red (5/6) and Black (1/6); Second disc branches to be completed]

Complete the tree diagram in your Answer Book showing all the probabilities. [2]

(ii) Find the probability that exactly one of the two discs is red. [3]

All the discs are replaced in the bag. Tony now takes three discs from the bag at random without replacement.

(iii) Given that the first disc Tony takes is red, find the probability that the third disc Tony takes is also red. [2]

29
5 (i) Write down the value of Spearman’s rank correlation coefficient, rs, for the following sets of ranks.

(a)
Judge A ranks   1 2 3 4
Judge B ranks   1 2 3 4
[1]
(b)
Judge A ranks   1 2 3 4
Judge C ranks   4 3 2 1
[1]

(ii) Calculate the value of rs for the following ranks.

Judge A ranks   1 2 3 4
Judge D ranks   2 4 1 3
[3]

(iii) For each of parts (i)(a), (i)(b) and (ii), describe in everyday terms the relationship between the two
judges’ opinions. [3]
(Q5, June 2012)
6 A six-sided die is biased so that the probability of scoring 6 is 0.1 and the probabilities of scoring 1, 2, 3, 4,
and 5 are all equal. In a game at a fête, contestants pay £3 to roll this die. If the score is 6 they receive £10
back. If the score is 5 they receive £5 back. Otherwise they receive no money back. Find the organiser’s
expected profit for 100 rolls of the die. [5]

© OCR 2012 4732 Jun12


3
30 The Gross Domestic Product per Capita (GDP), x dollars, and the Infant Mortality Rate per thousand (IMR),
y, of 6 African countries were recorded and summarised as follows.

n = 6   Σx = 7000   Σx² = 8 700 000   Σy = 456   Σy² = 36 262   Σxy = 509 900

(i) Calculate the equation of the regression line of y on x for these 6 countries. [4]

The original data were plotted on a scatter diagram and the regression line of y on x was drawn, as shown
below.

[Scatter diagram: x-axis marked from 800 to 1600, y-axis marked from 60 to 100, with the regression line of y on x drawn]

(ii) The GDP for another country, Tanzania, is 1300 dollars. Use the regression line in the diagram to
estimate the IMR of Tanzania. [1]

(iii) The GDP for Nigeria is 2400 dollars. Give two reasons why the regression line is unlikely to give a
reliable estimate for the IMR for Nigeria. [2]

(iv) The actual value of the IMR for Tanzania is 96. The data for Tanzania (x = 1300, y = 96) is now included
with the original 6 countries. Calculate the value of the product moment correlation coefficient, r, for
all 7 countries. [4]

(v) The IMR is now redefined as the infant mortality rate per hundred instead of per thousand, and the
value of r is recalculated for all 7 countries. Without calculation state what effect, if any, this would
have on the value of r found in part (iv). [1]
(Q3, Jan 2013)
4 (i) How many different 3-digit numbers can be formed using the digits 1, 2 and 3 when

(a) no repetitions are allowed, [1]

(b) any repetitions are allowed, [2]

(c) each digit may be included at most twice? [2]

(ii) How many different 4-digit numbers can be formed using the digits 1, 2 and 3 when each digit may be
included at most twice? [5]

© OCR 2013 4732/01 Jan13 Turn over



7
31 (i) Two judges rank n competitors, where n is an even number. Judge 2 reverses each consecutive pair of
ranks given by Judge 1, as shown.

Competitor C1 C2 C3 C4 C5 C6 ....... Cn−1 Cn

Judge 1 rank 1 2 3 4 5 6 ....... n−1 n

Judge 2 rank 2 1 4 3 6 5 ....... n n−1

Given that the value of Spearman’s coefficient of rank correlation is 63/65, find n. [4]

(ii) An experiment produced some data from a bivariate distribution. The product moment correlation
coefficient is denoted by r, and Spearman’s rank correlation coefficient is denoted by rs.

(a) Explain whether the statement

r = 1 ⇒ rs = 1

is true or false. [1]

(b) Use a diagram to explain whether the statement

r ≠ 1 ⇒ rs ≠ 1

is true or false. [2]


(Q7, Jan 2013)
8 Sandra makes repeated, independent attempts to hit a target. On each attempt, the probability that she
succeeds is 0.1.

(i) Find the probability that

(a) the first time she succeeds is on her 5th attempt, [2]

(b) the first time she succeeds is after her 5th attempt, [2]

(c) the second time she succeeds is before her 4th attempt. [4]

Jill also makes repeated attempts to hit the target. Each attempt of either Jill or Sandra is independent. Each
time that Jill attempts to hit the target, the probability that she succeeds is 0.2. Sandra and Jill take turns
attempting to hit the target, with Sandra going first.

(ii) Find the probability that the first person to hit the target is Sandra, on her

(a) 2nd attempt, [2]

(b) 10th attempt. [3]
