Example Stats Questions

6PAM1052/6PAM1054 Statistics Example Questions

Dan Smith, 24/10/2022

This document contains example questions of the types that you are likely to encounter in the
forthcoming statistics quiz on this module. They are provided to help you prepare for the quiz. Note
that the quiz draws on many different versions of each question, which may appear similar, so the quiz
that you see is unique to you.

1) Read in the data from example.csv, which is under the Statistics Training unit on the Canvas module.
a. What is the range of the data in Column 2?
b. What is the interquartile range of the data in Column 2?
c. What is the standard deviation of the data contained in Column 1?
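
As a rough sketch of how the quantities in question 1 might be computed, the snippet below uses only Python's standard library. The file contents and the column names column1 and column2 are invented stand-ins, since the real layout of example.csv is not shown here. Quartile conventions also differ between tools; the "inclusive" method is used on the assumption that you want the same linear interpolation that numpy.percentile applies by default.

```python
import csv
import io
import statistics

# Invented stand-in for example.csv (the real file and its header are assumptions).
csv_text = """column1,column2
10.0,1.0
12.0,2.0
14.0,3.0
16.0,4.0
18.0,5.0
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
col1 = [float(r["column1"]) for r in rows]
col2 = [float(r["column2"]) for r in rows]

# Range: largest value minus smallest value.
data_range = max(col2) - min(col2)

# Interquartile range: upper quartile minus lower quartile.
q1, _median, q3 = statistics.quantiles(col2, n=4, method="inclusive")
iqr = q3 - q1

# Sample (N - 1) and population (N) standard deviations; state which one you use.
std_sample = statistics.stdev(col1)
std_population = statistics.pstdev(col1)

print(f"range = {data_range}, IQR = {iqr}")
print(f"sample std = {std_sample:.4f}, population std = {std_population:.4f}")
```

To read the real file, open("example.csv", newline="") would replace the io.StringIO object, with the column names adjusted to whatever the header actually contains.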

2) Using the data in example.csv, what are the mean and standard error on the mean of the data in each
column?
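
A minimal sketch of the standard error on the mean, taken here to be the sample standard deviation divided by the square root of the number of points. The values are invented for illustration and the helper name standard_error is hypothetical.

```python
import math
import statistics


def standard_error(values):
    """Standard error on the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))


column = [10.0, 12.0, 14.0, 16.0, 18.0]  # invented values, not from example.csv
mean = statistics.fmean(column)
sem = standard_error(column)
print(f"mean = {mean:.3f} +/- {sem:.3f}")
```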

3) Now let’s assume that Column 1 in example.csv contains some measurements, and Column 2 contains
the uncertainties associated with those measurements. We can use the uncertainties in Column 2
to derive weights for each data point ($w_i = 1/\sigma_i^2$), and calculate:
a. The weighted mean value for this sample
b. The standard error on the weighted mean
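
For reference, the usual inverse-variance expressions are sketched below. They assume the measurements are independent and that each quoted uncertainty is a Gaussian standard deviation; the symbol $\bar{x}_w$ for the weighted mean is notation introduced here, not taken from the module notes.

```latex
w_i = \frac{1}{\sigma_i^2}, \qquad
\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i}, \qquad
\sigma_{\bar{x}_w} = \frac{1}{\sqrt{\sum_i w_i}}
```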

4) The table below shows the number of meals sold in a restaurant on each day of the week:

Day        Mon  Tue  Wed  Thu  Fri  Sat  Sun  Total
N(meals)    35   27   27   38   78   95   87    387

a. Calculate the number of meals that would be expected per day under the null hypothesis that
there is no dependence of the number of meals sold on the day of the week.
b. Use the scipy.stats.chisquare function to determine the $\chi^2$ statistic for this situation, and the
p-value.
c. What does the p-value that you obtain tell you about the null hypothesis?
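
One way the calculation in question 4 might be set up is sketched below, assuming SciPy is installed. It relies on the documented default of scipy.stats.chisquare, which treats all categories as equally likely when no expected frequencies are passed.

```python
from scipy.stats import chisquare

observed = [35, 27, 27, 38, 78, 95, 87]  # meals sold Mon..Sun, from the table

# Null hypothesis: sales do not depend on the day, so every day expects total / 7.
expected_per_day = sum(observed) / len(observed)

# With f_exp omitted, chisquare assumes equal expected frequencies,
# which is exactly this null hypothesis (7 categories, so 6 degrees of freedom).
statistic, p_value = chisquare(observed)

print(f"expected per day = {expected_per_day:.2f}")
print(f"chi-squared = {statistic:.2f}, p-value = {p_value:.3g}")
```

A very small p-value means counts this uneven would be very unlikely if the null hypothesis were true; it does not say which days are responsible.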

5) Jonathan measures the height of a tree, $x$, three times using different techniques and the results he
gets (in metres) are as follows: $x_1 = 17.5 \pm 0.3$, $x_2 = 18.1 \pm 0.4$ and $x_3 = 17 \pm 2$.
Calculate the mean and weighted mean estimates for the height of the tree. Also calculate the error
on the mean and on the weighted mean using appropriate techniques.
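
The sketch below works through question 5 with the standard library. Which error estimate counts as "appropriate" for the unweighted mean is a judgment call, so two candidates are computed: one from the scatter of the three values and one from propagating the quoted uncertainties. Both choices, and all variable names, are assumptions made for illustration.

```python
import math
import statistics

heights = [17.5, 18.1, 17.0]   # measurements in metres
sigmas = [0.3, 0.4, 2.0]       # quoted uncertainties in metres

# Unweighted mean, with two candidate error estimates.
mean = statistics.fmean(heights)
err_from_scatter = statistics.stdev(heights) / math.sqrt(len(heights))
err_propagated = math.sqrt(sum(s ** 2 for s in sigmas)) / len(heights)

# Inverse-variance weighted mean and its standard error.
weights = [1.0 / s ** 2 for s in sigmas]
weighted_mean = sum(w * x for w, x in zip(weights, heights)) / sum(weights)
weighted_err = 1.0 / math.sqrt(sum(weights))

print(f"mean = {mean:.3f} m (scatter {err_from_scatter:.3f}, propagated {err_propagated:.3f})")
print(f"weighted mean = {weighted_mean:.3f} +/- {weighted_err:.3f} m")
```

The imprecise third measurement receives a very small weight, so it barely shifts the weighted mean, whereas it dominates the propagated error on the unweighted mean.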

6) The volume of a square-faced rectangular box with sides of length $x$ and $y$ can be shown to be
$V = x^2 y$. I measure the two faces and find that $x = 12.0 \pm 0.2$ cm and $y = 31.0 \pm 0.3$ cm. Use the
method of error propagation to determine the Volume, the associated uncertainty, and the fractional
uncertainty on the estimate of the volume of the box resulting from these measurements.
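
A minimal sketch of the propagation for a product of powers, assuming the uncertainties on the two lengths are independent so that the fractional contributions add in quadrature. The factor of 2 comes from the length of the square face entering the volume as a square.

```python
import math

x, sigma_x = 12.0, 0.2   # cm
y, sigma_y = 31.0, 0.3   # cm

volume = x ** 2 * y  # cm^3

# For V = x^2 * y with independent errors:
# (sigma_V / V)^2 = (2 * sigma_x / x)^2 + (sigma_y / y)^2
fractional = math.sqrt((2 * sigma_x / x) ** 2 + (sigma_y / y) ** 2)
sigma_volume = volume * fractional

print(f"V = {volume:.0f} +/- {sigma_volume:.0f} cm^3")
print(f"fractional uncertainty = {fractional:.4f}")
```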

7) The manufacturer of chocolate eggs claims that the amount of chocolate used when making an egg is
drawn from a normal distribution with a mean of 220g and standard deviation of 5g. Jeff buys a
chocolate egg from this manufacturer and finds that his egg weighs 197g. On the basis of this
purchase, what are you able to say about the manufacturer’s claims?
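
One way to quantify question 7 is to ask how many standard deviations the egg lies from the claimed mean, and how probable such a light egg would be if the claim were true. The sketch below uses statistics.NormalDist from the standard library; whether a one-sided or two-sided probability is wanted is left open, so both are shown.

```python
from statistics import NormalDist

claimed = NormalDist(mu=220.0, sigma=5.0)  # the manufacturer's stated distribution
observed = 197.0                            # mass of Jeff's egg in grams

# Number of standard deviations between the observation and the claimed mean.
z = (observed - claimed.mean) / claimed.stdev

# Probability of an egg at least this light, and of a deviation this large either way.
p_lower = claimed.cdf(observed)
p_two_sided = 2 * p_lower

print(f"z = {z:.1f}")
print(f"P(mass <= 197 g) = {p_lower:.2e}, two-sided = {p_two_sided:.2e}")
```

A single egg this far from the mean is hard to reconcile with the claim, although one purchase cannot show whether the mean, the spread, or the assumption of normality is at fault.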

8) Jeremy is comparing the lengths of bees observed near two of his beehives. He uses a Kolmogorov-
Smirnov test to compare the distributions and finds a p-value of 0.93. What does this tell him about
the bees near the two hives?

9) Brian – a statistics nerd – is very worried about the traffic outside his house as he owns two pet cats.
His partner wishes for them (two people and two cats) to move to a new house where Brian is
concerned that the traffic levels (and therefore safety of his cats) might be different. He installs traffic
monitoring equipment in the road outside both houses – the equipment records the length of time
between each passing vehicle in each location. To compare the two sets of data he uses a
Kolmogorov-Smirnov test and finds a p-value of 0.00016. What does this tell him about the traffic
outside the two houses?
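
To illustrate how p-values like those in the two Kolmogorov-Smirnov questions above arise, the sketch below runs scipy.stats.ks_2samp on invented, deterministic samples (assuming SciPy is installed). The numbers are not the bees or the traffic data; they only show a near-identical pair and a clearly different pair.

```python
from scipy.stats import ks_2samp

# Invented, deterministic "time between vehicles" samples in seconds.
street_a = [0.1 * i for i in range(1, 101)]
street_b_similar = [t + 0.05 for t in street_a]   # almost the same distribution
street_b_busier = [t / 3 for t in street_a]        # gaps three times shorter

similar = ks_2samp(street_a, street_b_similar)
different = ks_2samp(street_a, street_b_busier)

print(f"similar:   D = {similar.statistic:.2f}, p = {similar.pvalue:.3g}")
print(f"different: D = {different.statistic:.2f}, p = {different.pvalue:.3g}")
```

A large p-value means the test found no evidence that the two samples come from different distributions, which is weaker than showing they are the same. A very small p-value means a difference this large would be very unlikely if both samples shared one distribution.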

10) In the practical exercises, you looked at fitting a linear model to some data linking body weight and
brain weight. Repeat that regression calculation using the least-squares method, but ignoring the
highest body-weight data point (which is the one with the largest error bar).
a. How does the result that you get using least-squares in this scenario compare to the result
you obtained using the $\chi^2$ fitting method?
b. Why is doing this kind of data-manipulation a bad idea?

11) Winston is comparing two data sets and finds a Pearson correlation coefficient of -0.98. What does
this tell him about the correlation between the two sets of data?

12) An experiment is conducted and the results are found to be significant at the $N\sigma$ level, where $N$ is
equal to:

a. 1
b. 2
c. 3
d. 4
e. 5
f. 6
g. 7
h. 8
i. 9
j. 10

How would you describe the outcome of the experiment in each of these cases?
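
As an aid to question 12, the sketch below converts each significance level into the two-sided probability of a Gaussian fluctuation at least that large. The two-sided convention and the function name two_sided_p are assumptions; some fields quote one-sided values, which are half as large.

```python
import math


def two_sided_p(n_sigma):
    """Probability that a Gaussian variable lies more than n_sigma from its mean."""
    return math.erfc(n_sigma / math.sqrt(2))


for n in range(1, 11):
    print(f"{n:2d} sigma: p = {two_sided_p(n):.3g}")
```

How a given level is described depends on the field. In particle physics, for example, roughly 3 sigma is conventionally called evidence and 5 sigma a discovery.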
