18-660: Numerical Methods For Engineering Design and Optimization


18-660: Numerical Methods for Engineering Design and Optimization


Xin Li
Department of ECE
Carnegie Mellon University
Pittsburgh, PA 15213

Slide 1
Overview
• Linear regression
• Ordinary least-squares regression
• Minimax optimization
• Design of experiments

Slide 2
Linear Regression
• Linear regression (also referred to as response surface modeling) is widely used for many engineering problems
  – We do not know the analytical form of f(x)
  – But we can generate a set of sampling points for f(x)
  – Fit an approximate function for f(x) from these sampling points

  f(x) ≈ α1⋅b1(x) + α2⋅b2(x) + ⋯
        (α1, α2, …: model coefficients; b1(x), b2(x), …: basis functions)

  f(x) is approximated as the linear combination of multiple basis functions.

  [Figure: sampling points of f(x) plotted against x]

Slide 3
Linear Regression
• Major steps of linear regression
  – Select a model template (e.g., polynomial function)
  – Generate a number of sampling points
  – Compute performance values at these sampling points
  – Create a set of linear equations to solve for the model coefficients

• A simple example
  – f(x) = exp(x), x ∈ [-1, 1]
  – We will use this simple example to show how we can generally build a regression model from sampling data

Slide 4
Linear Regression Example
• Step 1: select a model template

  f(x) ≈ bx + c

• Step 2: generate a number of sampling points

  Samples    1        2        3        4        5
  x         -1       -0.5      0        0.5      1

• Step 3: compute performance values at these sampling points

  Samples    1        2        3        4        5
  f(x)       0.3679   0.6065   1.0000   1.6487   2.7183

Slide 5
Linear Regression Example
• Step 4: create linear equations for the model coefficients

  f(x) ≈ bx + c

  Samples    1        2        3        4        5
  x         -1       -0.5      0        0.5      1
  f(x)       0.3679   0.6065   1.0000   1.6487   2.7183

  [ -1     1 ]         [ 0.3679 ]
  [ -0.5   1 ]   [b]   [ 0.6065 ]
  [  0     1 ] ⋅ [c] = [ 1.0000 ]    ← i-th row: i-th sampling point
  [  0.5   1 ]         [ 1.6487 ]
  [  1     1 ]         [ 2.7183 ]
   (x values, 1)        (f(x) values)

Slide 6
Linear Regression Example
• Step 5: solve the over-determined linear equations
  – # of equations is greater than # of coefficients – over-determined
  – No exact solution exists to satisfy all equations, but we can find the least-squares solution:

  A⋅α = B

  min_α ‖A⋅α − B‖₂²    ← Ordinary least-squares (OLS) regression

• For a vector ε ∈ R^M, the squared norm ‖ε‖₂² is defined as:

  ‖ε‖₂² = ∑_{i=1}^{M} εi²

Slide 7
Linear Regression Example

  A⋅α = B

  A⋅α − B = [ε1  ε2  ε3  ⋯  εM]ᵀ    ← εi: error at the i-th sampling point (i-th row of A)

  min_α ‖A⋅α − B‖₂² = min_α ∑_{i=1}^{M} εi(α)²

Slide 8
Linear Regression Example

   
   
   
  A⋅α = B,  where A is an M × N matrix: M samples (rows), N coefficients (columns), M > N

• There are several possible ways to solve over-determined linear equations for linear regression
  – We will explain these algorithms in detail in future lectures
  – For now, you can simply use “α = A\B” in MATLAB (a Python equivalent is sketched below)
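
For readers without MATLAB, here is a minimal Python/NumPy sketch of the same least-squares solve (an illustrative equivalent of “α = A\B”, not part of the original slides):

import numpy as np

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])    # sampling points
B = np.exp(x)                                 # f(x) values at the samples
A = np.column_stack([x, np.ones_like(x)])     # columns: x and 1, for the model f(x) ≈ b*x + c

# Least-squares solution of the over-determined system A*alpha = B
alpha, residuals, rank, sv = np.linalg.lstsq(A, B, rcond=None)
print(alpha)   # ≈ [1.1486, 1.2683], i.e., b and c on Slide 10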

Slide 9
Linear Regression Example
• Step 5: solve the over-determined linear equations

  [ -1     1 ]         [ 0.3679 ]
  [ -0.5   1 ]   [b]   [ 0.6065 ]
  [  0     1 ] ⋅ [c] = [ 1.0000 ]
  [  0.5   1 ]         [ 1.6487 ]
  [  1     1 ]         [ 2.7183 ]

  Solution: b = 1.1486, c = 1.2683

  [Figure: linear model vs. exp(x) over x ∈ [-1, 1]; the linear model results in large error]

Slide 10
Quadratic Model Example
• What if we build a quadratic model for y = exp(x)?

• Select a model template

  f(x) ≈ ax² + bx + c

• Generate a number of sampling points

  Samples    1        2        3        4        5
  x         -1       -0.5      0        0.5      1

• Compute performance values at these sampling points

  Samples    1        2        3        4        5
  f(x)       0.3679   0.6065   1.0000   1.6487   2.7183

Slide 11
Quadratic Model Example
• Create a set of linear equations to solve for the model coefficients

  f(x) ≈ ax² + bx + c

  Samples    1        2        3        4        5
  x         -1       -0.5      0        0.5      1
  f(x)       0.3679   0.6065   1.0000   1.6487   2.7183

    x²     x     1
  [ 1     -1     1 ]         [ 0.3679 ]
  [ 0.25  -0.5   1 ]   [a]   [ 0.6065 ]
  [ 0      0     1 ] ⋅ [b] = [ 1.0000 ]
  [ 0.25   0.5   1 ]   [c]   [ 1.6487 ]
  [ 1      1     1 ]         [ 2.7183 ]
                              (f(x) values)

Slide 12
Quadratic Model Example
• Build a quadratic model for y = exp(x)

  [ 1     -1     1 ]         [ 0.3679 ]
  [ 0.25  -0.5   1 ]   [a]   [ 0.6065 ]
  [ 0      0     1 ] ⋅ [b] = [ 1.0000 ]
  [ 0.25   0.5   1 ]   [c]   [ 1.6487 ]
  [ 1      1     1 ]         [ 2.7183 ]

  Solution: a = 0.5477, b = 1.1486, c = 0.9944

  [Figure: quadratic model vs. exp(x) over x ∈ [-1, 1]; the quadratic model results in much better accuracy in this example]
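
As a quick cross-check, numpy.polyfit performs the same least-squares fit for polynomial models (an illustrative sketch, not part of the original slides):

import numpy as np

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
coeffs = np.polyfit(x, np.exp(x), deg=2)   # least-squares quadratic fit
print(coeffs)   # ≈ [0.5477, 1.1486, 0.9944], i.e., a, b, c above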

Slide 13
Linear Model vs. Quadratic Model
Linear RSM:      exp(x) ≈ 1.1486x + 1.2683
Quadratic RSM:   exp(x) ≈ 0.5477x² + 1.1486x + 0.9944

• The regression model is different from a direct Taylor expansion
  – E.g., the constant terms of the linear and quadratic models differ – each is selected to minimize the least-squares error, not to match exp(0) = 1 as a Taylor expansion would

  [Figure: linear model for exp(x); the least-squares fit tracks exp(x) over [-1, 1] more closely than the direct Taylor expansion]

Slide 14
Minimax Optimization
• We can also solve over-determined linear equations to satisfy other optimality criteria (i.e., not ordinary least-squares)

  A⋅α = B

  min_α max_i |A(i,:)⋅α − Bi|    ← Minimize the maximal absolute error
                                   (A(i,:): i-th row of A; the i-th residual is the error at the i-th sampling point)

Slide 15
Minimax Optimization
• Other optimality criteria can be similarly formulated

  A⋅α = B

  min_α max_i |A(i,:)⋅α − Bi| / |Bi|    ← Minimize the maximal relative error
                                          (the numerator is the error at the i-th sampling point)

These formulations are minimax optimization problems.

Slide 16
Minimax Optimization
• General minimax problems are difficult to solve
  – The cost function does not have a continuous derivative

  [Figure: ε(α) = max(ε1(α), ε2(α)) is non-smooth – it has a kink where the two error curves ε1(α) and ε2(α) cross]

Slide 17
Minimax Optimization
• However, our minimax problem for regression modeling can be re-formulated into a special form

• Consider the example of absolute error minimization:

  min_α max_i |A(i,:)⋅α − Bi|

  Introduce a slack variable t:

  min_{α,t}  t                      ← Cost function

  S.T.  |A(1,:)⋅α − B1| ≤ t
        |A(2,:)⋅α − B2| ≤ t         ← Constraints
         ⋯                            (S.T. = subject to)
        |A(M,:)⋅α − BM| ≤ t

Slide 18
Minimax Optimization

min t min t
α ,t α ,t
 A(1, :) ⋅ α − B1 ≤ t  − t ≤ A(1, :) ⋅ α − B1 ≤ t
  − t ≤ A(2, :) ⋅ α − B ≤ t
 A(2, :) ⋅ α − B2 ≤ t  2
S.T.  S.T. 
   
 A(M , :) ⋅ α − BM ≤ t − t ≤ A(M , :) ⋅ α − BM ≤ t

 Re-written as a linear programming (LP) problem


 Both cost function and constraints are linear
 No closed-form solution exists for LP
 Can be numerically solved by an efficient (i.e., low complexity)
and robust (i.e., global convergence) algorithm
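
A minimal sketch of this LP using SciPy’s linprog (an illustrative solver choice; the slides do not prescribe one). The decision variables are stacked as z = [α; t]:

import numpy as np
from scipy.optimize import linprog

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
B = np.exp(x)                                 # f(x) values at the samples
A = np.column_stack([x, np.ones_like(x)])     # linear model f(x) ≈ b*x + c
M, N = A.shape

c = np.zeros(N + 1)
c[-1] = 1.0                                   # cost: minimize t
ones = np.ones((M, 1))
A_ub = np.vstack([np.hstack([A, -ones]),      #   A(i,:)⋅α − Bi ≤ t
                  np.hstack([-A, -ones])])    # −(A(i,:)⋅α − Bi) ≤ t
b_ub = np.concatenate([B, -B])
bounds = [(None, None)] * N + [(0, None)]     # coefficients free, t ≥ 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
alpha, t = res.x[:N], res.x[-1]
print(alpha, t)   # minimax coefficients and the achieved maximal absolute error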

Slide 19
Design of Experiments (DOE)
• We already know the basics of linear regression

• Open problem:
  – How can we select a few samples to achieve good accuracy?

• A bad linear model example: f(x1, x2) = a⋅x1 + b⋅x2 + c

  Sampling points for the linear model:
  (x1 = −1, x2 = 0, f1)
  (x1 =  0, x2 = 0, f2)
  (x1 =  1, x2 = 0, f3)

  [Figure: three sampling points on the x1 axis at x1 = −1, 0, 1]

Slide 20
Design of Experiments (DOE)
• Linear model example (continued)

  f(x1, x2) = a⋅x1 + b⋅x2 + c

  Sampling points:
  (x1 = −1, x2 = 0, f1)
  (x1 =  0, x2 = 0, f2)
  (x1 =  1, x2 = 0, f3)

     x1   x2   1
  [ -1    0    1 ]   [a]   [ f1 ]
  [  0    0    1 ] ⋅ [b] = [ f2 ]
  [  1    0    1 ]   [c]   [ f3 ]

  Singular matrix (cannot solve for the coefficient b)
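
A short check of this singularity (an illustrative sketch, not part of the original slides):

import numpy as np

# Design matrix for the three samples above; columns: x1, x2, 1
A = np.array([[-1.0, 0.0, 1.0],
              [ 0.0, 0.0, 1.0],
              [ 1.0, 0.0, 1.0]])
print(np.linalg.matrix_rank(A))   # 2 < 3: the x2 column is all zeros, so b is not solvable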

Slide 21
Design of Experiments (DOE)
• Linear model example (continued)
  – No variation is applied to x2 (all samples lie on the x1 axis)
  – Fix: add additional sampling points that vary x2

  [Figure: original samples along the x1 axis; additional off-axis samples added to vary x2]

Slide 22
Design of Experiments (DOE)
• A bad quadratic model example: f(x1, x2) = a11⋅x1² + a12⋅x1⋅x2 + a22⋅x2² + b1⋅x1 + b2⋅x2 + c

  Sampling points for the quadratic model:
  (x1 =  0, x2 =  0, f1)
  (x1 =  0, x2 = −1, f2)
  (x1 =  0, x2 =  1, f3)
  (x1 = −1, x2 =  0, f4)
  (x1 =  1, x2 =  0, f5)

  [Figure: five sampling points arranged in a cross on the x1 and x2 axes]

Slide 23
Design of Experiments (DOE)
• Quadratic model example (continued)

  f(x1, x2) = a11⋅x1² + a12⋅x1⋅x2 + a22⋅x2² + b1⋅x1 + b2⋅x2 + c

  Sampling points:
  (x1 = 0, x2 = 0, f1), (x1 = 0, x2 = −1, f2), (x1 = 0, x2 = 1, f3),
  (x1 = −1, x2 = 0, f4), (x1 = 1, x2 = 0, f5)

     x1²  x1x2  x2²  x1   x2   1
  [ 0     0    0    0    0    1 ]   [a11]   [ f1 ]
  [ 0     0    1    0   -1    1 ]   [a12]   [ f2 ]
  [ 0     0    1    0    1    1 ] ⋅ [a22] = [ f3 ]
  [ 1     0    0   -1    0    1 ]   [b1 ]   [ f4 ]
  [ 1     0    0    1    0    1 ]   [b2 ]   [ f5 ]
                                    [c  ]

  Singular matrix (cannot solve for the coefficient a12)
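
The same rank check applies here (an illustrative sketch):

import numpy as np

# Design matrix for the five samples above; columns: x1^2, x1*x2, x2^2, x1, x2, 1
A = np.array([[0.0, 0.0, 0.0,  0.0,  0.0, 1.0],
              [0.0, 0.0, 1.0,  0.0, -1.0, 1.0],
              [0.0, 0.0, 1.0,  0.0,  1.0, 1.0],
              [1.0, 0.0, 0.0, -1.0,  0.0, 1.0],
              [1.0, 0.0, 0.0,  1.0,  0.0, 1.0]])
print(np.linalg.matrix_rank(A))   # 5 < 6: the x1*x2 column is all zeros, so a12 is not solvable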
Slide 24
Design of Experiments (DOE)
• Quadratic model example (continued)
  – The cross-product term x1⋅x2 cannot be captured: x1⋅x2 = 0 at every sampling point
  – Fix: add additional sampling points for which x1⋅x2 ≠ 0 (see the sketch below)

  [Figure: the original cross-shaped samples, plus additional samples added to excite the x1⋅x2 term]
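
For instance, adding the four corner points (±1, ±1) – one common choice, assumed here purely for illustration (it yields a 3 × 3 factorial design) – makes the quadratic design matrix full rank:

import numpy as np

def design_row(x1, x2):
    # Row of the quadratic design matrix: [x1^2, x1*x2, x2^2, x1, x2, 1]
    return [x1 * x1, x1 * x2, x2 * x2, x1, x2, 1.0]

cross = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]   # original samples
corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]       # added samples
A = np.array([design_row(x1, x2) for x1, x2 in cross + corners])
print(np.linalg.matrix_rank(A))   # 6: all six coefficients are now solvable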

Slide 25
Design of Experiments (DOE)
• Design of experiments (DOE) is a research area that studies how to optimally select sampling points for modeling

• Given a model template (e.g., a linear or quadratic function), optimize the sampling points for a certain optimality criterion
  – E.g., maximize modeling accuracy

• Numerical optimization may be required to find the optimal sampling scheme

D. Montgomery, Design and Analysis of Experiments, John Wiley & Sons, 2004

Slide 26
Summary
• Linear regression
• Ordinary least-squares regression
• Minimax optimization
• Design of experiments

Slide 27
