ErrorAnalysis
ERROR ANALYSIS
Lab Report
Names:
Section:
Date:
Purpose
To understand terminology and concepts used in measurements.
To apply these concepts in a simple experimental situation.
To become familiar with the use of Mathematica.
Introduction
For a scientist to arrive at a valid conclusion when testing a theory or hypothesis, it is necessary to understand the
underlying concepts of measurement errors. In engineering and manufacturing, these same concepts are
important in designing and making quality products.
In statistics, an error (or residual) is not a “mistake” but rather a difference between a computed, estimated,
or measured value and the accepted true, specified, or theoretically correct value. In science and engineering
in general an error is defined as a difference between the desired and actual performance or behavior of a
system or object.
Readings
Books:
P. R. Bevington, “Data Reduction and Error Analysis for the Physical Sciences”, McGraw-Hill. Chapter 1 and 2.3
John R. Taylor, “An Introduction to Error Analysis”, University Science Books. Chapters 1 to 5.
The above chapters are available through the ‘Files/Books’ folder on your Canvas.
$$\hat{\mu} = \frac{\sum_{i=1}^{N} x_i}{N}$$
This formula says: Add the N measurements x1, x2, …, xN. Now divide by N to get the average value of x,
namely μ̂.
An estimate of the uncertainty of a single measurement, xi , is given by the standard deviation σ defined as
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N-1}}.$$
If we write xi ± σ for a single measurement, this implies that, with high probability, the true or expected
value of x lies between xi - σ and xi + σ.
If we have reason to believe that x follows a Gaussian (normal) distribution, this probability is about
68%. If needed, one can use xi ± 2σ for a probability of about 95%.
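The two definitions above can be checked with a short Mathematica sketch. The five values in x are hypothetical numbers chosen only for illustration, not data from this lab, and the names x, nN, mu and sigma are our own.

```mathematica
(* Hypothetical sample of N = 5 measurements (illustration only) *)
x = {0.21, 0.25, 0.19, 0.23, 0.22};
nN = Length[x];

(* Mean: add the N measurements and divide by N *)
mu = Total[x]/nN;

(* Standard deviation with the N - 1 denominator *)
sigma = Sqrt[Total[(x - mu)^2]/(nN - 1)];

(* Each pair should agree: hand formula vs. built-in function *)
{mu, Mean[x]}
{sigma, StandardDeviation[x]}
```

The built-in StandardDeviation also uses the N - 1 denominator, so both pairs should match.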
An estimate of the uncertainty in the measurement of the mean μ is given by the standard error of the mean
σ̄,
$$\bar{\sigma} = \frac{\sigma}{\sqrt{N}} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N(N-1)}}.$$
Thus, we can write the mean as μ ± σ̄, which means that there is a 68% probability that the mean value of x,
namely μ, lies between μ - σ̄ and μ + σ̄.
Again, to be safer, one can consider μ ± 2σ̄, which means μ lies in the range μ - 2σ̄ to μ + 2σ̄
with 95% probability.
The hat we used above means that we are estimating the mean μ with the average of a sample. We did not use
hat notation for σ above, because in this lab we will not be concerned with the error in estimating the
standard deviation.
Usually it is clear from the context if we are talking about the mean μ itself or the estimation. So one might
not use the hat sign at all.
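As a rough illustration of the standard error of the mean, the sketch below divides the standard deviation by the square root of N and prints the 68% and 95% ranges for the mean. The sample values and the name sigmaBar are assumptions made for this example only.

```mathematica
(* Hypothetical sample (illustration only) *)
x = {0.21, 0.25, 0.19, 0.23, 0.22};
nN = Length[x];
mu = Mean[x];
sigma = StandardDeviation[x];

(* Standard error of the mean: sigma / Sqrt[N] *)
sigmaBar = sigma/Sqrt[nN];

(* Ranges for the mean: about 68% and about 95% *)
{mu - sigmaBar, mu + sigmaBar}
{mu - 2 sigmaBar, mu + 2 sigmaBar}
```

For this sample the standard error comes out close to 0.01, noticeably smaller than the standard deviation of a single measurement.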
A histogram of the measurement distribution displays how many times a measurement lies in a given interval. The
intervals are called “bins”. The appearance and usefulness of a histogram are strongly influenced by the
bin size. A large bin size is desirable because it makes the number of counts per bin (n) large, so that the
fluctuations in n from bin to bin are relatively small. On the other hand, the bin size should be
small enough not to obscure the shape of the variation of n. It is an art to pick the bin size that best
displays your experimental results. Note that as the number of measurements increases, the histogram
becomes smoother.
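The trade-off in bin size can be seen by plotting the same data with three bin widths. This is only a sketch: the data are simulated with RandomVariate (not measured), and the mean 0.25 and spread 0.03 are arbitrary assumptions.

```mathematica
(* Simulated data: 80 values from an assumed normal distribution *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[0.25, 0.03], 80];

(* Same data, three bin widths: too narrow, moderate, too wide *)
GraphicsRow[{
  Histogram[data, {0.002}, PlotLabel -> "bin width 0.002"],
  Histogram[data, {0.02}, PlotLabel -> "bin width 0.02"],
  Histogram[data, {0.1}, PlotLabel -> "bin width 0.1"]}]
```

The narrow bins should show large bin-to-bin fluctuations, while the wide bins hide the shape of the distribution.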
A Gaussian (normal) distribution function is usually used to approximate a histogram of measurements that contain only
random errors. Not every random variable produces a Gaussian-shaped histogram, but the Gaussian
distribution is usually a good approximation.
Example 1.
As an example of how you would use these concepts, suppose the students in a class are randomly assigned
to one of three groups, each with N students. The first and second groups are taught some new concept using one
teaching technique, while the third group is taught the same concept using a different technique. The three
groups are then given the same exam on the material. We’ll call the average exam scores for the three groups
μ1, μ2, and μ3, and the standard errors of the means σ̄1, σ̄2, and σ̄3.
We would expect the difference μ1 - μ2 to be “small” since the two groups were taught the same way and we
would like to attribute the difference to just random measuring errors. On the other hand, in order to say that
the two different teaching methods produce different learning results we need to be able to say that μ1 -μ3
is “large”.
But small or large compared to what? The answer is small or large compared to the error in determining μ1 - μ2
(or μ1 - μ3), which we’ll call Δ12 (or Δ13).
You calculate these errors from the standard errors of the means as follows:
$\Delta_{12} = \sqrt{\bar{\sigma}_1^2 + \bar{\sigma}_2^2}$ and $\Delta_{13} = \sqrt{\bar{\sigma}_1^2 + \bar{\sigma}_3^2}$
(see propagation of errors). If groups 1 and 2 are not different, then there is about a 68% probability
that |μ1 - μ2| will be less than or equal to Δ12, about a 95% probability that it will be less than or equal to 2Δ12, and about a 99.7%
probability that it will be less than or equal to 3Δ12. The scientific convention is to say two measurements are significantly
(or statistically) different if they differ by three standard deviations or more. Thus we say that groups 1
and 2 are not significantly different provided |μ1 - μ2| < 3Δ12. Likewise, groups 1 and 3 are significantly different
if |μ1 - μ3| ⩾ 3Δ13.
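The comparison in Example 1 can be sketched in Mathematica as follows. All numbers are invented exam scores for illustration, and the helper names delta and significantQ are hypothetical, not built-in functions.

```mathematica
(* Hypothetical group means and standard errors of the mean *)
mu1 = 74.0; sb1 = 1.5;
mu2 = 76.0; sb2 = 2.0;
mu3 = 85.0; sb3 = 2.0;

(* Error of a difference: add the standard errors in quadrature *)
delta[sa_, sb_] := Sqrt[sa^2 + sb^2];

(* Three-standard-deviation convention for a significant difference *)
significantQ[ma_, mb_, sa_, sb_] := Abs[ma - mb] >= 3 delta[sa, sb];

delta[sb1, sb2]                    (* 2.5 *)
significantQ[mu1, mu2, sb1, sb2]   (* False: 2 < 7.5 *)
significantQ[mu1, mu3, sb1, sb3]   (* True: 11 >= 7.5 *)
```

With these made-up numbers, groups 1 and 2 are not significantly different, while groups 1 and 3 are.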
Data
The Data section includes clear and organized documentation of the observations and measurements that
were made during the lab. The Data section may include a table of measurements organized in rows and
columns with the column headings indicating the quantities being measured. Data sections may include
diagrams of an experimental set-up with observations recorded on the diagram. The use of sentences and
lengthy paragraphs is not necessary. Elaborate discussions are discouraged. But clearly labeled and
documented findings are essential. These findings become the evidence that allows you to draw a conclusion
related to the question described by the Purpose.
The Data section will often include calculated data. Work should be shown for each type of calculation which
is performed. If the same type of calculation is repeatedly performed, the work only needs to be shown once.
This work should be clear and labeled.
The Data section often includes a graph. The axes of the graph should be clearly labeled. When the graph is a
representation of collected data and we are expecting a linear relation, you will often be asked to determine
the slope, y-intercept, regression coefficient, and equation. The equation is often written in slope-intercept
form for a linear graph. The equation should include symbols for the variables being plotted, not the
traditional y and x often used in math class.
Significant figures
Be sure to read the section on significant figures and quote all results accordingly. In general, the uncertainty is
quoted to only one significant figure unless that figure is a 1, in which case it is sometimes quoted to two (e.g., 0.3,
not 0.33, but 0.1 or 0.13 would be acceptable). The value should be quoted to the same precision as the error,
or at most one digit more. For example, give 2.5 ± 0.4, not 2.543 ± 0.4 and not 2.5 ± 0.007, although 2.54 ± 0.1 or 2.54
± 0.14 would be acceptable.
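One possible way to apply this rule in Mathematica is sketched below. It rounds the uncertainty to one significant figure and the value to the same decimal place; it does not handle the special case of a leading 1. The numbers and the names value, err and step are assumptions for illustration.

```mathematica
(* Hypothetical result before rounding *)
value = 2.543; err = 0.38;

(* Decimal place of the leading digit of the uncertainty *)
step = 10^Floor[Log10[err]];

(* Round both numbers to that decimal place *)
errRounded = Round[err, step];
valueRounded = Round[value, step];

N[{valueRounded, errRounded}]   (* {2.5, 0.4} *)
```

This reproduces the 2.5 ± 0.4 form quoted above.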
Remember that the equal sign in programming is usually used for assignment.
You can also use the documentation whenever you need it. For example, this is a list of Mathematica commands
we will use in this lab:
ReadList, Histogram, TableForm, Plot, Join, Mean, StandardDeviation
Procedure
In this experiment, you and your partner will each take two sets of reaction-time data and compare the
two sets to investigate whether there is a significant difference between them. For each set, calculate
the average reaction time and the standard deviation, make a histogram of the data distribution,
and overlay it with a normal (Gaussian) distribution. Then compare one set of your data with that of
your partner, and answer the questions in the Question sections below.
1. Make two sets of 40 reaction time measurements. To measure your reaction time, you will use the
program stopwatch.exe. Select “Reaction Time”, and click the mouse or hit the return key once you see the
little guy sticking his tongue out at you, saying “Hit Me!”. Repeat this 40 times for each set. Copy-paste the
data into Notepad and save it in the files “YourName_RT_1” and “YourName_RT_2”.
2. Import these data into Mathematica using the command ReadList["Insert/filepath/filename"]
from a file or {InputString[]} from a dialog box, as shown in the example below.
3. Calculate your average reaction times, t1 , t2 , and the standard deviations, σ1 , σ2 . Present the data in a
table as it is shown in the example below.
4. Calculate the standard deviation of the mean.
5. Compare your reaction times for the two runs.
6. Present ti with corresponding errors. Note that only the significant figures should be presented.
7. The lab partner should repeat all the above to obtain another two sets of 40 measurements. Calculate
t3 , t4 , and the standard deviations, σ3 , σ4 of the new data for the partner.
8. Compare reaction times for you and your partner.
9. Combine the two sets of data 1 and 2 to give a data set of 80 measurements. Make a histogram of the
data and overlay it with a normal distribution, as shown in the example below.
10. Answer the questions below.
Dialog:
Step 1.
Import the data created by stopwatch into the Mathematica list variables ti (i = 1, ..., 4) using the command
ReadList["Insert/filepath/filename"] from a file or {InputString[]} from a dialog box.
To run a code cell, press Shift+Enter, i.e. hold down the Shift key and press the Return key (aka Enter key).
Also, copy the code cells into new cells. Code pasted inside a text cell will not be evaluated.
t1 = ReadList["/Users/Girsh/Documents/275_f14/stopwatch1.txt"]
t2 = ReadList["/Users/Girsh/Documents/275_f14/stopwatch2.txt"]
(*t1={InputString[]} *)
Step 2.
n1 = Length[t1]
mu1 = Sum[t1[[i]], {i, 1, n1}]/n1
Step 3.
Calculate the deviation from the average, dev1 = (t1 - μ1), and its square, dev1p2 = (t1 - μ1)^2, and present the results in a
table.
dev1 = t1 - mu1;
dev1p2 = dev1 ^ 2;
TableForm[Transpose[{t1, dev1, dev1p2}], TableAlignments -> Center,
 TableHeadings -> {None, {"t1i", "t1i - μ1", "(t1i - μ1)^2"}}]
Step 4.
Calculate standard deviation, “sigma1”, and standard deviation of the mean, “sigma1bar”.
sigma1 = Sqrt[Sum[dev1p2[[i]], {i, 1, n1}]/(n1 - 1)]
sigma1bar = Sqrt[Sum[dev1p2[[i]], {i, 1, n1}]/(n1*(n1 - 1))]
Question 1:
Show all calculations below. Do this question for only one set of data, but specify which data set you are using.
Answer 1:
Step 5.
Repeat steps 1-4 for your second set of measurements, “t2”. Calculate the average “mu2”, the deviation from the
average “dev2”, the square of the deviation “dev2p2”, the standard deviation “sigma2”, and the standard deviation
of the mean “sigma2bar”. You can copy the code and adjust it only where needed. There is no need to answer
Question 1 again.
Step 6.
Now create a table of six columns (t1,dev1,dev1p2,t2,dev2,dev2p2) for both sets of your measurements:
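A six-column table of this kind might be built as in the sketch below. The short lists t1demo and t2demo are placeholders with invented values so the example runs on its own; in the lab you would use your own t1 and t2 and the quantities computed in the earlier steps.

```mathematica
(* Placeholder data (invented values, three per set) *)
t1demo = {0.21, 0.25, 0.19};
t2demo = {0.22, 0.24, 0.23};

(* Deviations from the mean and their squares for each set *)
d1 = t1demo - Mean[t1demo]; d1sq = d1^2;
d2 = t2demo - Mean[t2demo]; d2sq = d2^2;

(* Transpose turns the six lists into six columns *)
TableForm[Transpose[{t1demo, d1, d1sq, t2demo, d2, d2sq}],
 TableAlignments -> Center,
 TableHeadings -> {None, {"t1", "dev1", "dev1p2", "t2", "dev2", "dev2p2"}}]
```

Transpose requires all six lists to have the same length, which holds when both sets contain the same number of measurements.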
Step 7.
Create a table of mean times and standard deviations for your two sets of measurements:
μ1 ± σ1 =
μ2 ± σ2 =
Compare your reaction times for the two runs. [Here you can find an example of a meaningful comparison of
two sets of data.]
$\mu_1 - \mu_2 =$
$\Delta_{12} = \sqrt{\bar{\sigma}_1^2 + \bar{\sigma}_2^2} =$
$R_{12} = \frac{\mu_1 - \mu_2}{\sqrt{\bar{\sigma}_1^2 + \bar{\sigma}_2^2}} =$
Step 8.
Combine the two sets of data t1 and t2 into a single data set t12 of 80 measurements.
Using the function Histogram[t12,nbar], make a histogram of the data. nbar specifies the number of bins.
Experiment with different values of nbar to produce a meaningful histogram for this data.
Question 2:
Using Drawing Tools [CTRL-d], indicate the value of μ and the ranges μ±σ, μ±2σ, etc., on the histogram. Determine
the percent of the measurements falling within these ranges.
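As a cross-check on counting by eye, the percentages could also be computed directly. The sketch below uses simulated data in place of your measurements; the names t12demo and percentWithin are hypothetical, and the distribution parameters are arbitrary assumptions.

```mathematica
(* Simulated stand-ins for two sets of 40 measurements *)
SeedRandom[3];
t1demo = RandomVariate[NormalDistribution[0.25, 0.03], 40];
t2demo = RandomVariate[NormalDistribution[0.25, 0.03], 40];

(* Join concatenates the two lists into one list of 80 values *)
t12demo = Join[t1demo, t2demo];
mu = Mean[t12demo];
sigma = StandardDeviation[t12demo];

(* Percent of values within k standard deviations of the mean *)
percentWithin[k_] :=
  100. Count[t12demo, v_ /; Abs[v - mu] <= k sigma]/Length[t12demo];

Table[{k, percentWithin[k]}, {k, 1, 3}]
```

For data that are close to Gaussian, the three percentages should come out near 68, 95 and 99.7.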
Answer 2:
Step 9.
$$P_G(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$
n12 = Length[t12]
σ12 = StandardDeviation[t12]
μ12 = Mean[t12]
P[x_, μ_, σ_] := 1/Sqrt[2*π*σ^2]*Exp[-(x - μ)^2/(2*σ^2)]
Plot[P[x1, μ12, σ12], {x1, μ12 - 3.5 σ12, μ12 + 3.5 σ12}, PlotRange -> All,
 Frame -> True, FrameLabel -> {"Reaction Time, Sec.", "Probability Density"}]
Step 10.
Using the Mathematica function Show, plot the histogram and the normal distribution of the data
together.
Notice that the argument "ProbabilityDensity" in Histogram[t12,nbar,"ProbabilityDensity"] plots the
normalized histogram for t12.
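A minimal sketch of such an overlay is shown below. It uses simulated data and the built-in PDF and NormalDistribution functions for the Gaussian curve, which is one possible choice rather than the only way to define it. The name t12demo and the choice of 10 bins are assumptions.

```mathematica
(* Simulated stand-in for the combined 80 measurements *)
SeedRandom[4];
t12demo = RandomVariate[NormalDistribution[0.25, 0.03], 80];
mu = Mean[t12demo];
sigma = StandardDeviation[t12demo];

(* Normalized histogram and Gaussian curve on the same axes *)
Show[
 Histogram[t12demo, 10, "ProbabilityDensity"],
 Plot[PDF[NormalDistribution[mu, sigma], x],
  {x, mu - 3.5 sigma, mu + 3.5 sigma}],
 Frame -> True,
 FrameLabel -> {"Reaction Time, Sec.", "Probability Density"}]
```

Because both plots use probability density on the vertical axis, the curve and the bars share the same scale.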
Step 11.
Repeat steps 1-4 for the set of measurements “t3” of your lab partner. Calculate the average “mu3”, the standard
deviation “sigma3”, and the standard deviation of the mean “sigma3bar”.
Write your results below and compare the reaction times for you and your partner:
μ3 =
σ3 =
σ̄3 =
$R_{13} = \frac{\mu_1 - \mu_3}{\sqrt{\bar{\sigma}_1^2 + \bar{\sigma}_3^2}} =$
Question 3:
Write a short paragraph on the conclusions you can draw from R12 and R13. [Here you can find an example
of a meaningful comparison of two sets of data.]
Answer 3.
Question 4:
Compare the variability of the individual reaction time measurements for you and your partner. Who has
the “steadier nerves” (i.e., the smallest standard deviation)?
Answer 4.