MTH-262 Statistics and Probability Theory
Lecture 03
Outline of Today’s Lecture
• In this lecture we will discuss the following topics:
• Data Presentation: Frequency distribution.
• Measures of Central Tendency: Arithmetic Mean.
• Getting Started with R
Data Presentation: Frequency Distribution
How to Construct a Frequency Distribution
1: Sort the data in ascending order.
2: Calculate the range of data.
3: Decide on the number of intervals in the frequency distribution.
4: Determine the intervals.
5: Tally and count the observations under each interval.
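The five steps above can be sketched in code. This is a minimal illustration in Python (an assumption; the lecture itself introduces R). The function name `frequency_distribution`, the choice of three intervals, and the equal-width, whole-number interval rule are all hypothetical choices made for the example. The data are the nine X values from the worked example later in this lecture.

```python
import math

def frequency_distribution(data, num_intervals):
    """Return ((lower, upper), count) pairs for equal-width intervals."""
    values = sorted(data)                      # step 1: sort ascending
    low, high = values[0], values[-1]
    data_range = high - low                    # step 2: range
    # steps 3-4: the caller picks the number of intervals; the width is
    # rounded up to a whole number (an assumed convention, at least 1)
    width = max(1, math.ceil(data_range / num_intervals))
    counts = [0] * num_intervals
    for x in values:                           # step 5: tally
        # each interval is [lower, upper); the clamp puts a value that
        # lands exactly on the final upper bound into the last interval
        index = min(int((x - low) // width), num_intervals - 1)
        counts[index] += 1
    intervals = [(low + i * width, low + (i + 1) * width)
                 for i in range(num_intervals)]
    return list(zip(intervals, counts))

x = [16, 17, 10, 13, 20, 18, 13, 14, 18]
table = frequency_distribution(x, 3)
for (lower, upper), count in table:
    print(f"{lower} to under {upper}: {count}")
```

With three intervals the range of 10 gives a width of 4, so the classes start at 10, 14, and 18. A different number of intervals gives a different but equally valid table.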
Mean
For a variable X, the mean of the observations for a sample is called a sample mean and is denoted \bar{X}. Symbolically,
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
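As a quick check of the formula, the sample mean can be computed directly as the sum of the observations divided by their count. The sketch below is in Python (an assumption; the lecture itself introduces R), and `sample_mean` is a hypothetical helper name. The data are the nine X values from the worked example later in this lecture.

```python
def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)

x = [16, 17, 10, 13, 20, 18, 13, 14, 18]
x_bar = sample_mean(x)
print(round(x_bar, 3))  # 15.444
```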
Properties of Arithmetic Mean
i) The sum of the deviations of the observations from their mean is always zero.
\[ \sum_{i=1}^{n} (X_i - \bar{X}) = 0 \]
ii) The sum of squared deviations of the values from their mean is always a minimum
(compared with deviations from any other value, A).
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 < \sum_{i=1}^{n} (X_i - A)^2, \quad A \neq \bar{X} \]
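Deviations from the mean (Mean = 15.444): Sum of Errors = 0.004 (zero up to rounding of the mean), Sum of Squared Errors = 80.222
Deviations from another value A (Mean = 15.444): Sum of Errors = 13, Sum of Squared Errors = 99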
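Properties i and ii can be checked numerically. The sketch below is in Python (an assumption; the lecture itself introduces R) and uses the nine X values from the worked example later in this lecture. The helper names are hypothetical. The comparison value A = 14 is also an assumption, chosen because it reproduces the slide's figures of 13 and 99.

```python
def sum_of_deviations(values, center):
    """Sum of (value - center) over all values."""
    return sum(v - center for v in values)

def sum_of_squared_deviations(values, center):
    """Sum of (value - center)^2 over all values."""
    return sum((v - center) ** 2 for v in values)

x = [16, 17, 10, 13, 20, 18, 13, 14, 18]
x_bar = sum(x) / len(x)

# Property i: deviations from the mean cancel (up to floating-point error).
print(abs(sum_of_deviations(x, x_bar)) < 1e-9)

# Property ii: squared deviations are smallest around the mean.
print(round(sum_of_squared_deviations(x, x_bar), 3))  # 80.222
print(sum_of_deviations(x, 14))                       # 13
print(sum_of_squared_deviations(x, 14))               # 99
```

The slide's 0.004 for the first sum comes from using the mean rounded to 15.444; with the unrounded mean the sum is zero.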
Properties of Arithmetic Mean
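iii) A change of origin or scale changes the mean in the same way.
\[ Y = a + bX \;\Rightarrow\; \bar{Y} = a + b\bar{X} \]
iv) The combined mean of k groups with sizes n_1, ..., n_k and means \bar{X}_1, ..., \bar{X}_k is:
\[ \bar{X}_c = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \cdots + n_k \bar{X}_k}{n_1 + n_2 + \cdots + n_k} \]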
  X    Y=2+3X    X1    X2    X3
 16      50      16    15    12
 17      53      17    16    16
 10      32      10    14    20
 13      41      13    13    18
 20      62      20    12    12
 18      56      18    13    12
 13      41      13    18    15
 14      44      14    18    16
 18      56      18    20    20
Mean(X) = 15.444
Mean(Y) = 2+3(15.444) = 48.333
Mean(X1) = 15.444    Mean(X2) = 15.444    Mean(X3) = 15.667
Combined Mean = 15.518
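The figures in this example can be reproduced with a short script. The sketch below is in Python (an assumption; the lecture itself introduces R), and the helper names `mean` and `combined_mean` are hypothetical. It checks property iii with Y = 2 + 3X and property iv with the three groups X1, X2, and X3.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def combined_mean(groups):
    """Weight each group mean by its group size, then divide by the total size."""
    total_n = sum(len(g) for g in groups)
    weighted_sum = sum(len(g) * mean(g) for g in groups)
    return weighted_sum / total_n

x = [16, 17, 10, 13, 20, 18, 13, 14, 18]
y = [2 + 3 * v for v in x]            # change of origin (a = 2) and scale (b = 3)

x1 = [16, 17, 10, 13, 20, 18, 13, 14, 18]
x2 = [15, 16, 14, 13, 12, 13, 18, 18, 20]
x3 = [12, 16, 20, 18, 12, 12, 15, 16, 20]

print(round(mean(y), 3))              # 48.333
print(round(2 + 3 * mean(x), 3))      # 48.333
print(round(mean(x3), 3))             # 15.667
print(combined_mean([x1, x2, x3]))    # 419/27, about 15.5185
```

The unrounded combined mean is 419/27, about 15.5185. The slide's 15.518 is what results from combining the group means after rounding them to three decimals.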
Mathematical Averages
Mean (Arithmetic Mean for Grouped Data)
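When the data are presented as classes and their frequencies, a slightly different formula is used to find the
average: each class midpoint X is weighted by its class frequency f.
\[ \bar{X} = \frac{\sum fX}{\sum f} \]
Example: Sum(f) = 40 and Sum(f*X) = 132.5, so Mean = 132.5/40 = 3.3125.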
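The grouped-data calculation can be sketched as follows. The code is in Python (an assumption; the lecture itself introduces R). The class intervals and frequencies are hypothetical values made up for illustration; they are not the lecture's table, which is not reproduced here. `grouped_mean` is a hypothetical helper name.

```python
def grouped_mean(classes):
    """classes: list of ((lower, upper), frequency) pairs.

    Each class is represented by its midpoint, weighted by its frequency.
    """
    total_f = sum(f for _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for (lower, upper), f in classes)
    return total_fx / total_f

# Hypothetical grouped data: class limits and frequencies.
classes = [((0, 10), 5), ((10, 20), 10), ((20, 30), 5)]
result = grouped_mean(classes)
print(result)  # 15.0
```

Because every observation in a class is replaced by the class midpoint, the grouped mean is an approximation to the mean of the raw data.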
Mathematical Averages
Trimmed Mean
A trimmed mean is computed by “trimming away” a certain percent
of both the largest and the smallest set of values. For example, the 10% trimmed
mean is found by eliminating the largest 10% and smallest 10% and computing the
average of the remaining values.
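A trimmed mean can be sketched as below. The code is in Python (an assumption; the lecture itself introduces R). The function name `trimmed_mean`, the sample data, and the rule of rounding the number of trimmed values down are all assumptions made for the example.

```python
def trimmed_mean(values, proportion):
    """Drop the given proportion of values from each end, then average the rest."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    # number of values removed from each end, rounded down (assumed convention)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value.
scores = [2, 4, 5, 6, 7, 8, 9, 10, 11, 98]
print(sum(scores) / len(scores))   # 16.0
print(trimmed_mean(scores, 0.10))  # 7.5
```

With ten values, a 10% trim removes one value from each end, so the extreme value 98 no longer pulls the average upward.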
Case is Important
• The case of the letters in a variable name is important. There is a distinction
between x and X, or mydata and myData. This is the case with everyday language,
so shouldn’t be surprising, but isn’t always true when using computers.
Getting Started
Functions
• The R language is comprised of numerous built-in functions, providing a rich set
of actions. Several of these functions are for the familiar mathematical
operations:
Getting Started
The Workspace
• After interacting with R one typically has created several objects and perhaps functions. Without
doing anything special, R will maintain these objects in a global Workspace. When R searches for an
object at the command line, this is the first place on its path that it will look.
Getting Started
Data Sets
• Many packages include accompanying data sets. The UsingR package has several
that we will see utilized in the text. This package also calls in, among others, the
HistData package that provides data sets from the history of statistics and data
visualization.
Any Questions?