Topic 4
Introduction to Statistics
Content
PART I
BASIC OCTAVE COMMANDS
PART II
STATISTICS
PART III
CONFIDENCE INTERVALS
PART I
BASIC OCTAVE COMMANDS
One-Dimensional Arrays (Vectors)
x = [0 0.25 0.5 0.75 1]
x = 0 0.2500 0.5000 0.7500 1.0000
x = 0:0.25:1
x = 0 0.2500 0.5000 0.7500 1.0000
dizi = 1:7
dizi = 1 2 3 4 5 6 7
dizi = -5:2:5
dizi = -5 -3 -1 1 3 5
v = [1 2 3] % row vector
v = 1 2 3
x = [1 2 3];
y = [5 6 7];
x .* y
ans = 5 12 21
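The `.*` operator multiplies element by element. As a small sketch (the names `p`, `s` and `d` are illustrative and not from the slides), summing the element-wise products gives the same number as the row-times-column product `x * y'`:

```octave
x = [1 2 3];
y = [5 6 7];
p = x .* y;    % element-wise product: [5 12 21]
s = sum(p);    % sum of the element-wise products
d = x * y';    % row vector times column vector (dot product)
```

Both `s` and `d` equal 38 here.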
Two-Dimensional Arrays (Matrices)
A = [1 1 1; 2 2 2]
A =
   1   1   1
   2   2   2
A = [1 1 1
     2 2 2]
A =
   1   1   1
   2   2   2
B = A'
B =
   1   2
   1   2
   1   2
Visualizing and Analysing Data
To visualize data
plot(x,y) X-Y graph
hist(x) Histogram
pie(x) Pie chart
...
To analyze data
mean(x) Average value
std(x) Standard deviation
max(x) Maximum value
min(x) Minimum value
...
Example 1
Exam Scores of 20 students:
55 42 65 68 64 72 75 58 87 89
77 66 91 39 44 57 69 75 68 81
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
Example 1: Basic Analysis
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
mean(x)
ans = 67.100
std(x)
ans = 14.853
length(x)
ans = 20
std(x) / sqrt(length(x))
ans = 3.3213
max(x)
ans = 91
min(x)
ans = 39
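As a cross-check, the values returned by `mean` and `std` can be reproduced from their definitions. The sketch below assumes that `std` uses the sample normalization 1/(n-1), which is the Octave default; the names `n`, `m`, `s` and `se` are chosen here for illustration.

```octave
x = [55 42 65 68 64 72 75 58 87 89 ...
     77 66 91 39 44 57 69 75 68 81];
n  = length(x);                        % number of scores
m  = sum(x) / n;                       % mean from its definition
s  = sqrt(sum((x - m).^2) / (n - 1));  % sample standard deviation
se = s / sqrt(n);                      % standard error of the mean
```

The quantity `se` is what the expression `std(x) / sqrt(length(x))` computes: the uncertainty of the mean rather than the spread of the individual scores.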
Example 1: plot command
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
plot(x)
Example 1: hist command
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
hist(x)
Example 1: hist command
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
hist(x,20)   % second argument: number of bins
Example 1: hist command
x=[55 42 65 68 64 72 75 58 87 89 ...
77 66 91 39 44 57 69 75 68 81];
hist(x, [5:10:95])   % second argument: bin centers
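`hist` can also return the bin counts instead of drawing them. A minimal sketch, assuming the two-output form `[counts, centers] = hist(...)`; the output names are illustrative:

```octave
x = [55 42 65 68 64 72 75 58 87 89 ...
     77 66 91 39 44 57 69 75 68 81];
[counts, centers] = hist(x, 5:10:95);  % no figure is drawn when outputs are requested
disp([centers(:) counts(:)]);          % one row per bin: center, count
```

The counts add up to the number of scores, which is a quick way to confirm that no value fell outside the bins.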
PART II
STATISTICS
Statistics
Wikipedia says:
http://en.wikipedia.org/wiki/Statistics
Data Analysis
Data analysis is a very broad subject covering many techniques and types
of data. In this lecture we will study some basic calculations that are
commonly performed on sampled data.
For bi-variate data (two variables) the correlation coefficient (ρ) is a measure
of the linear dependence between one variable and the other.
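$$\rho = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\sigma_x \sigma_y}$$
where
$$\overline{xy} = \frac{1}{n}\sum_{i=1}^{n} x_i y_i, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
$$\sigma_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \sigma_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2}$$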
The correlation coefficient satisfies $-1 \le \rho \le 1$:
$\rho = 0$ if there is no correlation
$\rho = \pm 1$ if X and Y are fully correlated
Example: Using the following data, calculate the correlation
coefficient of X and Y.
X = { 17, 23, 27, 31, 51, 61}
Y = { 9, 14, 16, 23, 45, 49}
Answer (rho = 0.9)
We can also solve the problem using Octave.
X = [ 17 23 27 31 51 61 ];
Y = [ 9 14 16 23 45 49 ];
plot(X,Y,'*')
rho=(mean(X.*Y)-mean(X)*mean(Y)) / (std(X)*std(Y))
rho = 0.82658
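Note that the expression above divides a covariance normalized by 1/n by standard deviations normalized by 1/(n-1), so its result is smaller than Pearson's coefficient by the factor (n-1)/n. As a hedged sketch, `std(X, 1)` (the second argument 1 selects the 1/n normalization) makes the two normalizations agree; the names `cov_xy`, `rho_mixed`, `rho_n` and `ratio` are chosen here:

```octave
X = [ 17 23 27 31 51 61 ];
Y = [ 9 14 16 23 45 49 ];
n = length(X);
cov_xy    = mean(X.*Y) - mean(X)*mean(Y);    % covariance with 1/n normalization
rho_mixed = cov_xy / (std(X)*std(Y));        % expression used above
rho_n     = cov_xy / (std(X, 1)*std(Y, 1));  % consistent 1/n normalization
ratio     = rho_mixed / rho_n;               % equals (n-1)/n
```

With the consistent normalization the coefficient is about 0.992, and `ratio` equals 5/6 for these six points.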
Median and Mode
The median is the number in the middle and
the mode is the most frequent number in a data set.
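Octave provides `median` and `mode` for these two quantities. A short sketch using the exam scores from Example 1 (the names `med` and `mo` are illustrative; when several values are equally frequent, `mode` returns the smallest of them):

```octave
x = [55 42 65 68 64 72 75 58 87 89 ...
     77 66 91 39 44 57 69 75 68 81];
med = median(x);  % middle of the sorted data (mean of the two middle values for even n)
mo  = mode(x);    % most frequent value
```

In this data set both 68 and 75 occur twice, so `mode` reports 68.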
PART III
CONFIDENCE INTERVALS
Population & Sample
In statistics it is very important to distinguish
between population and sample.
If the mean is measured using the whole population then this
would be the population mean and is represented by μ.
This variation is assumed to be normally distributed
around the desired values: μ = 250 g and σ = 2.5 g.
The sample mean and sample standard deviation:
x = [247.1, 250.0, 250.1, 249.8, 246.7, ...
254.4, 249.2, 249.4, 247.0, 247.0, ...
245.0, 253.3, 251.2, 250.7, 250.6, ...
247.3, 248.5, 248.0, 243.6, 250.2];
mean(x)
ans = 248.9550
std(x)
ans = 2.6015
Is the result consistent with μ = 250 g and σ = 2.5 g?
Is the machine calibrated adequately?
Confidence Interval (Turkish: Güvenlik Aralığı)
In statistics, a confidence interval (CI)
is a type of interval estimate of a population parameter
and is used to indicate the reliability of an estimate.
http://en.wikipedia.org/wiki/Confidence_level
Confidence Levels are defined as follows:
CL      ±σ
-----   ------
0.800   1.28σ
0.900   1.65σ
0.950   1.96σ
0.990   2.58σ
0.999   3.29σ
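The multipliers in the table follow from the Gaussian integral: a fraction CL of the area lies within ±zσ when erf(z/√2) = CL. A minimal sketch that inverts this relation with `erfinv` (the names `CL` and `z` are illustrative):

```octave
CL = [0.800 0.900 0.950 0.990 0.999];
z  = sqrt(2) * erfinv(CL);                  % half-width of the interval in units of sigma
printf("%5.3f  %6.4f sigma\n", [CL; z]);    % one line per confidence level
```

The computed values agree with the table to within rounding.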
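CL = area under the curve
68%:   $\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-x^2/2}\,dx = 0.6827$
95%:   $\frac{1}{\sqrt{2\pi}} \int_{-2}^{2} e^{-x^2/2}\,dx = 0.9545$
99.7%: $\frac{1}{\sqrt{2\pi}} \int_{-3}^{3} e^{-x^2/2}\,dx = 0.9973$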
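These areas can be checked numerically, since the integral of the standard Gaussian from -k to k equals erf(k/√2). A small sketch (the names `k` and `area` are illustrative):

```octave
k = [1 2 3];
area = erf(k / sqrt(2));   % area within ±k sigma of the mean
```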
Let us return to our example.
To get an impression of the expectation μ, it is sufficient
to give an estimate.
$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{20 - 1}} = 2.6015$ g
$E = \frac{\sigma}{\sqrt{n}} = \frac{2.6015}{\sqrt{20}} = 0.5817$ g
With a 95% confidence level, the population mean lies in the interval:
$\bar{x} - 2E \le \mu \le \bar{x} + 2E$
$247.7916 \le \mu \le 250.1184$
That is, the true mean lies somewhere in
[247.7916, 250.1184] with about 95% confidence.
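The interval can be reproduced in Octave. The sketch below uses the factor 2 from the slides (±2E covers about 95% of a Gaussian); the names `E`, `lo` and `hi` are chosen here.

```octave
x = [247.1, 250.0, 250.1, 249.8, 246.7, ...
     254.4, 249.2, 249.4, 247.0, 247.0, ...
     245.0, 253.3, 251.2, 250.7, 250.6, ...
     247.3, 248.5, 248.0, 243.6, 250.2];
n  = length(x);
E  = std(x) / sqrt(n);   % standard error of the mean
lo = mean(x) - 2*E;      % lower limit of the interval
hi = mean(x) + 2*E;      % upper limit of the interval
printf("[%.4f, %.4f]\n", lo, hi);
```

Since μ = 250 g lies inside this interval, the sample mean is consistent with the desired value at this confidence level.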
Binomial Distribution:
(a) $P_{binom}(k) = \binom{n}{k} p^k (1-p)^{n-k}$
$\mu = np = 30 \times 0.5 = 15$
$\sigma = \sqrt{np(1-p)} = \sqrt{30 \times 0.5 \times (1 - 0.5)} = 2.74$
A result ±3σ away from the true mean → there is a signal for something
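The binomial probabilities for n = 30 and p = 0.5 can be tabulated directly. A sketch using `nchoosek` for the binomial coefficient (the variable names are illustrative):

```octave
n = 30;  p = 0.5;
k = 0:n;
C = arrayfun(@(j) nchoosek(n, j), k);  % binomial coefficients for k = 0..n
P = C .* p.^k .* (1 - p).^(n - k);     % P(k) for k = 0..n
mu    = n * p;                         % expected value
sigma = sqrt(n * p * (1 - p));         % standard deviation
```

The probabilities sum to 1, and `sum(k .* P)` reproduces `mu`, which is a useful sanity check on the formula.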
Important Functions for
Measurement & Calibration
1. Gaussian Function
$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp[-(x-\mu)^2 / 2\sigma^2]$
Mean: µ
Std.Dev: σ
2. Rectangular Distribution
Mean: M
Std.Dev: $a/\sqrt{3}$
3. Triangular Distribution
Mean: M
Std.Dev: $a/\sqrt{6}$
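The standard deviations a/√3 and a/√6 follow from integrating x² over each density. A numerical sketch for a = 2.5 g, assuming a is the half-width of the distribution and taking both densities centred on zero (so M = 0); the density expressions are written out here and are not given in the slides:

```octave
a = 2.5;
x = linspace(-a, a, 20001);
f_rect = ones(size(x)) / (2*a);           % rectangular density on [-a, a]
f_tri  = (a - abs(x)) / a^2;              % triangular density on [-a, a]
s_rect = sqrt(trapz(x, x.^2 .* f_rect));  % close to a/sqrt(3)
s_tri  = sqrt(trapz(x, x.^2 .* f_tri));   % close to a/sqrt(6)
```

For a = 2.5 g this gives approximately 1.4434 g and 1.0206 g.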
Questions
1. The execution times of a computer program, in seconds, are
given by:
T = {358,353,357,358,362,364,358,361,360,355}
Calculate the mean, median and mode of the data.
2. In a hospital, the mean values of the weights of 250 babies
as a function of month are obtained*. Determine
the correlation coefficient between:
(a) month and boy
(b) month and girl
(c) boy and girl
        Weight (kg)
Month   Boy    Girl
-----   ----   ----
0       3.4    3.2
1       4.4    4.1
2       5.5    5.0
3       6.4    5.7
4       7.1    6.4
5       7.7    6.9
6       8.3    7.5
9       9.4    8.6
12      10.2   9.4
3. An industrial refrigerator is used to cool food in a processing factory.
An experiment is performed to test the effect of the wind speed of the
refrigerator on the temperature in the refrigerator. The results are given
in the table.
(a) Plot the data
(b) Calculate the correlation coefficient
(c) Comment on the result
W (m/s)   T (°C)
-------   ------
3.8       1.69
8.4       1.34
7.3       1.45
3.9       1.75
1.7       1.87
9.6       1.05
4.5       1.60
6.4       1.45
0.4       2.02
8.7       1.15
8.8       1.15
0.7       1.99
4. Repeat the example using the rectangular distribution
function with a = 2.5 g.
Answer:
Mean Sigma Interval with 95% CL
-------- ------ --------------------
Gaussian 248.9550 2.6015 [247.7916, 250.1184]
Rectangular 248.9550 1.4434 [246.0682, 251.8418]
Triangular 248.9550 1.0206 [247.9344, 249.9756]