Chapter 14
Linear Regression
– Least-squares regression
– Interpolation
• Variance:

  sy² = Σ(yi − ȳ)² / (n − 1) = [Σyi² − (Σyi)²/n] / (n − 1)

• Coefficient of variation:

  c.v. = (sy / ȳ) × 100%
Statistics Review
• Example — given the 24 readings in the table below, find the
  mean, median, variance, standard deviation, and c.v.

Table:
  6.495 6.565 6.625 6.435 6.635 6.395 6.655 6.655
  6.665 6.595 6.515 6.715 6.625 6.485 6.775 6.605
  6.755 6.505 6.615 6.555 6.575 6.715 6.555 6.685
• Solution ☞

  ȳ = Σyi / n = 158.4 / 24 = 6.6

  median = (6.605 + 6.615) / 2 = 6.61

  sy² = Σ(yi − ȳ)² / (n − 1) = 0.217 / 23 = 0.009435
      = [Σyi² − (Σyi)²/n] / (n − 1) = (1045.657 − 158.4²/24) / 23 = 0.009435

  sy = √(sy²) = 0.097133

  c.v. = (sy / ȳ) × 100% = (0.097133 / 6.6) × 100% = 1.47%
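These values can be double-checked with a short script (pure Python here as a stand-in for MATLAB; both variance formulas should agree):

```python
import math

y = [6.495, 6.565, 6.625, 6.435, 6.635, 6.395, 6.655, 6.655,
     6.665, 6.595, 6.515, 6.715, 6.625, 6.485, 6.775, 6.605,
     6.755, 6.505, 6.615, 6.555, 6.575, 6.715, 6.555, 6.685]
n = len(y)

ybar = sum(y) / n                                    # ≈ 6.6
ys = sorted(y)
median = (ys[n // 2 - 1] + ys[n // 2]) / 2           # (6.605 + 6.615)/2 = 6.61

var_def = sum((yi - ybar) ** 2 for yi in y) / (n - 1)           # definition form
var_alt = (sum(yi ** 2 for yi in y) - sum(y) ** 2 / n) / (n - 1)  # shortcut form
sy = math.sqrt(var_def)                              # ≈ 0.097133
cv = sy / ybar * 100                                 # ≈ 1.47 (%)
```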
Normal Distribution

• About 68% of readings fall in [ȳ − sy, ȳ + sy]
• About 95% of readings fall in [ȳ − 2sy, ȳ + 2sy] = [6.405734, 6.794266]
Descriptive Statistics in MATLAB
• MATLAB has several built-in commands to
compute and display descriptive statistics.
Assuming some column vector s:
– mean(s), median(s), mode(s)
• Calculate the mean, median, and mode of s. mode is a part of
the statistics toolbox.
– min(s), max(s)
• Calculate the minimum and maximum value in s.
– var(s), std(s)
• Calculate the variance and standard deviation of s.
• Note: if a matrix is given, the statistics will be
returned for each column.
Histograms in MATLAB
• [n, x] = hist(s, x)
– Determine the number of elements in each bin of data in
s. x is a vector containing the center values of the bins.
• [n, x] = hist(s, m)
– Determine the number of elements in each bin of data in
s using m bins. x will contain the centers of the bins.
The default case is m=10
• hist(s, x) or hist(s, m) or hist(s)
– With no output arguments, hist will actually produce a
histogram.
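As a rough sketch of what the counting step does (the function name hist_counts is invented here, and Python stands in for MATLAB):

```python
# Sketch of hist(s, m): counts per bin plus bin centers.
# hist_counts is an invented name, not the MATLAB function.
def hist_counts(s, m=10):
    lo, hi = min(s), max(s)
    width = (hi - lo) / m
    centers = [lo + (i + 0.5) * width for i in range(m)]
    counts = [0] * m
    for v in s:
        # clamp so the maximum value lands in the last bin
        i = min(int((v - lo) / width), m - 1)
        counts[i] += 1
    return counts, centers

readings = [6.495, 6.565, 6.625, 6.435, 6.635, 6.395, 6.655, 6.655,
            6.665, 6.595, 6.515, 6.715, 6.625, 6.485, 6.775, 6.605,
            6.755, 6.505, 6.615, 6.555, 6.575, 6.715, 6.555, 6.685]
counts, centers = hist_counts(readings)   # every reading lands in some bin
```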
Histogram Example
Linear Least-Squares Regression
y = a0 + a1 x + e = f ( x) + e
e = y − (a0 + a1 x) = y − f ( x)
Linear Least-Squares Regression
• Linear least-squares regression is a method to
determine the “best” coefficients in a linear model
for a given data set.
• “Best” for least-squares regression means
minimizing the sum of the squares of the estimate
residuals. For a straight line model, this gives:
  Sr = Σi=1..n ei² = Σi=1..n (yi − a0 − a1xi)²

Minimizing Sr with respect to a0 and a1 gives:

  a1 = [nΣxiyi − ΣxiΣyi] / [nΣxi² − (Σxi)²]
  a0 = ȳ − a1x̄
Example

Fit a straight line to the force–velocity data:

  i    xi (v, m/s)    yi (F, N)    (xi)²     xiyi
  1        10             25         100       250
  2        20             70         400      1400
  3        30            380         900     11400
  4        40            550        1600     22000
  5        50            610        2500     30500
  6        60           1220        3600     73200
  7        70            830        4900     58100
  8        80           1450        6400    116000
  Σ       360           5135       20400    312850

  a1 = [nΣxiyi − ΣxiΣyi] / [nΣxi² − (Σxi)²]
     = [8(312850) − (360)(5135)] / [8(20400) − (360)²] = 19.47024

  a0 = ȳ − a1x̄ = 641.875 − 19.47024(45) = −234.2857
  St = Σ(yi − ȳ)²,    r² = (St − Sr) / St

• r² represents the percentage of the original uncertainty
explained by the model.
• For a perfect fit, Sr = 0 and r² = 1.
• If r² = 0, there is no improvement over simply picking the
mean.
• If r² < 0, the model is worse than simply picking the mean!
Example (continued)

  Fest = −234.2857 + 19.47024 v

  i    xi    yi    a0 + a1xi    (yi − ȳ)²    (yi − a0 − a1xi)²
  1    10    25      −39.58       380535             4171
  ⋮

  St = Σ(yi − ȳ)² = 1808297
  Sr = Σ(yi − a0 − a1xi)² = 216118
  r² = (St − Sr) / St = (1808297 − 216118) / 1808297 = 0.8805
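A quick pure-Python check of these totals (Python as a stand-in for MATLAB; a0 and a1 are the coefficients fitted above):

```python
v = [10, 20, 30, 40, 50, 60, 70, 80]
F = [25, 70, 380, 550, 610, 1220, 830, 1450]
a0, a1 = -234.2857, 19.47024       # line fitted on the previous slide

Fbar = sum(F) / len(F)             # 641.875
St = sum((Fi - Fbar) ** 2 for Fi in F)                        # ≈ 1808297
Sr = sum((Fi - a0 - a1 * vi) ** 2 for vi, Fi in zip(v, F))    # ≈ 216118
r2 = (St - Sr) / St                # ≈ 0.8805
```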
Linearization of Nonlinear
Relationships

• One option for finding the coefficients for a
nonlinear fit is to linearize it. For the three
common models, this may involve taking
logarithms or inversion:

  Model                       Nonlinear               Linearized
  exponential                 y = α1·e^(β1x)          ln y = ln α1 + β1x
  power                       y = α2·x^β2             log y = log α2 + β2 log x
  saturation-growth-rate      y = α3·x / (β3 + x)     1/y = 1/α3 + (β3/α3)(1/x)
Transformation Examples

Table:
  v, m/s   10   20   30   40   50   60   70   80
  F, N     25   70  380  550  610 1220  830 1450

• Solution ☞

  Fit the power model y = α2·x^β2 via its linearized form
  log y = log α2 + β2 log x, using Xi = log xi and Yi = log yi:

  X̄ = ΣXi / n = 12.606 / 8 = 1.5757
  Ȳ = ΣYi / n = 20.515 / 8 = 2.5644

  a1 = [nΣXiYi − ΣXiΣYi] / [nΣXi² − (ΣXi)²]
     = [8(33.622) − (12.606)(20.515)] / [8(20.516) − (12.606)²] = 1.9842

  a0 = Ȳ − a1X̄ = 2.5644 − 1.9842(1.5757) = −0.5620

  log y = −0.5620 + 1.9842 log x  ⇒  F = 10^(−0.562)·v^1.9842 = 0.274·v^1.9842
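The same log-transform fit can be reproduced in a few lines (a pure-Python sketch; base-10 logs as in the slide):

```python
import math

v = [10, 20, 30, 40, 50, 60, 70, 80]
F = [25, 70, 380, 550, 610, 1220, 830, 1450]
n = len(v)

# straight-line fit in log10 space: log F = a0 + a1 * log v
X = [math.log10(vi) for vi in v]
Y = [math.log10(Fi) for Fi in F]
a1 = (n * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)) / \
     (n * sum(x * x for x in X) - sum(X) ** 2)   # ≈ 1.9842
a0 = sum(Y) / n - a1 * sum(X) / n                # ≈ -0.5620
alpha = 10 ** a0                                 # ≈ 0.274, so F ≈ 0.274 * v**1.9842
```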
Transformation Examples
Table
v, m/s 10 20 30 40 50 60 70 80
F, N 25 70 380 550 610 1220 830 1450
• Solution ☞

  y = α2·x^β2  →  log y = log α2 + β2 log x

  [Figure: data points and regression line F = 0.274·v^1.9842;
   F (N) vs. v (m/s)]
Transformation Examples
Table
v, m/s 10 20 30 40 50 60 70 80
F, N 25 70 380 550 610 1220 830 1450
• Solution ☞

  [Figure, left: data and linear fit F = 19.470·v − 234.286]
  [Figure, right: data and power fit F = 0.274·v^1.9842]
  (both: F (N) vs. v (m/s))
Linear Regression Program
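A straight-line least-squares fit can be packaged as a small routine; below is a pure-Python sketch (the function name linregr is an assumption, not from the slides):

```python
def linregr(x, y):
    """Least-squares straight line y = a0 + a1*x; returns (a1, a0, r2).

    linregr is an invented name for this sketch.
    """
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a0 = sy / n - a1 * sx / n
    st = sum((yi - sy / n) ** 2 for yi in y)
    sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
    r2 = (st - sr) / st
    return a1, a0, r2

# force-velocity data from the earlier example
a1, a0, r2 = linregr([10, 20, 30, 40, 50, 60, 70, 80],
                     [25, 70, 380, 550, 610, 1220, 830, 1450])
# a1 ≈ 19.47024, a0 ≈ -234.2857, r2 ≈ 0.8805
```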
MATLAB Functions
• MATLAB has a built-in function polyfit that fits a
least-squares nth order polynomial to data:
– p = polyfit(x, y, n)
• x: independent data
• y: dependent data
• n: order of polynomial to fit
• p: coefficients of polynomial
f(x) = p1·x^n + p2·x^(n−1) + … + pn·x + p(n+1)
• MATLAB’s polyval command can be used to
compute a value using the coefficients.
– y = polyval(p, x)
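polyval applies the coefficients with nested multiplication; the same evaluation in pure Python (horner is a hypothetical helper, shown on the first-order fit from the force–velocity example):

```python
def horner(p, x):
    """Evaluate p[0]*x**(n-1) + ... + p[-1], matching polyval's coefficient order.

    horner is an invented helper name, not a MATLAB function.
    """
    result = 0.0
    for coeff in p:
        result = result * x + coeff
    return result

# first-order fit from the force-velocity example: F(v) = 19.47024*v - 234.2857
F10 = horner([19.47024, -234.2857], 10)   # ≈ -39.58
```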