Chapter 7 - Unconstrained Minimization Methods
SoICT
Hanoi University of Science and Technology
Introduction
Recall some concepts from calculus
R n = {x = (x1 , x2 , · · · , xn )T : xi ∈ R, i = 1, 2, · · · , n}
u + v = (u1 + v1 , u2 + v2 , · · · , un + vn )
R^n together with the operations just defined forms a linear space. Elements of R^n are sometimes called points.
With the scalar product ⟨u, v⟩ = Σ_{i=1}^{n} u_i v_i, R^n becomes an n-dimensional Euclidean space.
The length (norm) of a vector u ∈ R^n is the number
||u|| = ⟨u, u⟩^{1/2} = ( Σ_{i=1}^{n} u_i² )^{1/2}
The distance between two points u, v ∈ R^n is
ρ(u, v) = ||u − v|| = ( Σ_{i=1}^{n} (u_i − v_i)² )^{1/2}
n-dimensional Euclidean space
Differential of the multivariable function
Recall some concepts from calculus
Consider the general minimization problem
f(x) → min, x ∈ X
Note
Obviously, if the function f has a global minimum on X, then every global minimum point is also a local minimum point of f on X.
Some examples
f(x) = (x − 1)² has a global minimum at x* = 1 with f(x*) = 0.
f(x) = e^x + e^{−x} − 3x²: the optimal value is f* = −7.02. The global minimum is attained at the two points x = ±2.84, both of which are also local minimum points.
f(x) = e^{−x} has the lower bound 0, but it is not attained. There is neither a local minimum nor a global minimum.
f(x) = −x + e^{−x}: the objective function is not bounded below and has neither a local minimum nor a global minimum.
f(x) = e^x + e^{−x} − 3x² + x: the problem has local minima at x1 = −2.9226 and x2 = 2.7418, of which x1 is the global minimum point. The optimal value of this function is −9.9040.
Extrema of Multivariable Functions
Unconstrained nonlinear programming
Definition
Consider the unconstrained nonlinear programming problem
f(x) → min, x ∈ R^n (1)
Theorems
Theorem 1 (Necessary condition for optimality): A necessary condition for x^0 to be a local minimum is
∇f(x^0) = 0 (2)
Condition (2) is called the stationarity condition, and a point x^0 satisfying (2) is called a stationary point. Therefore, solving problem (1) can be reduced to solving the system of equations (2).
Some examples
f (x) = x 2 − 3x − 1: equation f ′ (x) = 2x − 3 = 0 has the only solution x 0 = 3/2 that is a
local minimum and at the same time a global minimum.
f(x1, x2) = x1² − x2² − 2x1x2 + x1: the equation
∇f(x) = ( 2x1 − 2x2 + 1 , −2x1 − 2x2 )^T = 0
has the only solution x^0 = (−1/4, 1/4). However, x^0 is not an optimal solution of the problem min{f(x) : x ∈ R²}, because f(−1/4, 1/4) = −1/8 > −1 = f(0, 1).
Theorem (cont.)
Theorem 2 (Sufficient condition for optimality): Suppose f is twice continuously differentiable. A stationary point x^0 is a local minimum if the matrix f''(x^0) is positive definite.
To check whether a matrix is positive definite or positive semidefinite, the following Sylvester-type criterion can be used.
Sylvester's criterion: A symmetric matrix A = (a_ij)_{n×n} is positive semidefinite if and only if all of its principal minors are non-negative:
△_{i1,i2,···,ik} = det ( a_{i1,i1} a_{i1,i2} ··· a_{i1,ik} ; a_{i2,i1} a_{i2,i2} ··· a_{i2,ik} ; ··· ; a_{ik,i1} a_{ik,i2} ··· a_{ik,ik} ) ≥ 0
for all 1 ≤ i1 < i2 < ··· < ik ≤ n and k = 1, 2, ··· , n.
(A is positive definite if and only if its leading principal minors △_1, △_{1,2}, ··· , △_{1,2,···,n} are all strictly positive.)
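As a quick numerical illustration (the matrix A below is an arbitrary example, not taken from the slides), the positive-definiteness test via leading principal minors can be coded in MATLAB as:
A = [2 1 0; 1 3 1; 0 1 2];        % example symmetric matrix
isposdef = true;
for k = 1:size(A,1)
    if det(A(1:k,1:k)) <= 0       % leading principal minor of order k
        isposdef = false;
    end
end
disp(isposdef)                     % prints 1: all leading minors are positive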
Some examples
Consider f(x1, x2) = e^{x1² + x2²}: the equation
∇f(x) = ( 2x1 e^{x1² + x2²} , 2x2 e^{x1² + x2²} )^T = 0
has the only solution x^0 = (0, 0). Because
f''(0, 0) = ( 2 0 ; 0 2 )
is positive definite (its leading principal minors 2 and 4 are both positive), x^0 is a local minimum and at the same time the optimal solution of the problem.
Single-variable minimization methods
Unimodal function
A unimodal function is a function that has only one minimum (or maximum) point on the interval under consideration.
Definition: If x* is the single minimum point of f(x) on the interval a ≤ x ≤ b, then f(x) is unimodal on this interval if and only if, for any two points x1 and x2,
x* ≤ x1 ≤ x2 implies that f(x*) ≤ f(x1) ≤ f(x2)
x* ≥ x1 ≥ x2 implies that f(x*) ≤ f(x1) ≤ f(x2)
Note
The search methods introduced below all assume that the function is unimodal.
Example
Assume the interval [0, 1] contains the minimum. We evaluate the function at two points x1 < x2: f(x1) = f1, f(x2) = f2. There are 3 possibilities:
1. f1 < f2: the minimum point x* cannot lie to the right of x2, so the interval containing the minimum is [0, x2].
2. f1 > f2: the minimum point x* cannot lie to the left of x1, so the interval containing the minimum is [x1, 1].
3. f1 = f2: the two intervals [0, x1) and (x2, 1] can be removed, and the interval containing the minimum is [x1, x2].
Searching diagram
Example :
Write a script to find the minimum of f(x) = x(x − 1.5) on the interval (a, b) = (0, 1) with precision e = 0.01, compared with the exact solution x* = 0.75.
f = inline('x.*(x-1.5)','x');    % objective function
eps = 0.01;                      % required precision
a = 0; b = 1; k = 0;             % initial interval and iteration counter
while abs(b-a) >= 2*eps
    % two trial points placed symmetrically around the midpoint
    x1 = a + (b-a)/2 - eps/2;
    x2 = a + (b-a)/2 + eps/2;
    f1 = f(x1); f2 = f(x2); k = k+1;
    if f1 < f2
        b = x2;          % minimum lies in [a, x2]
    elseif f1 > f2
        a = x1;          % minimum lies in [x1, b]
    else
        a = x1; b = x2;  % minimum lies in [x1, x2]
    end
end
...
Example (cont.) :
...
fprintf('Number of iterations k = %d\n', k);
fprintf('Length of segment: b-a = %d\n', b-a);
fprintf('x = %d\n', x1);
Result
» Number of iterations k=7
Length of segment : b-a = 1.773438e-002
x=7.502344e-001
Fibonacci method
Definition: The Fibonacci sequence is defined recursively as follows
F0 = 1; F1 = 1;
Fk = Fk−1 + Fk−2 , k ≥ 2;
We need to determine the number of iterations N before starting the search. The two points x1^(k) and x2^(k) at iteration k are determined by the formulas:
x1^(k) = a_k + (F_{N−1−k} / F_{N+1−k}) (b_k − a_k), k = 0, 1, ··· , N − 1
x2^(k) = a_k + (F_{N−k} / F_{N+1−k}) (b_k − a_k), k = 0, 1, ··· , N − 1
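For reference, a minimal MATLAB sketch of the Fibonacci search following the formulas above (the test function, the interval, and the choice N = 10 are illustrative assumptions, not from the slides):
f = inline('x.*(x-1.5)','x');         % same test function as the earlier example
a = 0; b = 1;                         % initial interval [a0, b0]
N = 10;                               % number of iterations, fixed in advance
F = ones(1, N+2);                     % F(1) = F0 = 1, F(2) = F1 = 1 (1-based indexing)
for i = 3:N+2
    F(i) = F(i-1) + F(i-2);           % Fk = Fk-1 + Fk-2
end
for k = 0:N-1
    % trial points; F_{N-1-k} is stored in F(N-k) because of 1-based indexing
    x1 = a + F(N-k)/F(N+2-k)*(b-a);   % = a_k + F_{N-1-k}/F_{N+1-k} (b_k - a_k)
    x2 = a + F(N+1-k)/F(N+2-k)*(b-a); % = a_k + F_{N-k}/F_{N+1-k} (b_k - a_k)
    if f(x1) < f(x2)
        b = x2;                       % minimum lies in [a_k, x2]
    else
        a = x1;                       % minimum lies in [x1, b_k]
    end
end
fprintf('Interval containing the minimum: [%f, %f]\n', a, b);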
Example
Implement the golden-section method to find the minimum of the function f(x) = x² − 2x + 1 on the interval [0, 2].
f = inline('x.^2-2*x+1');
a = 0; b = 2; eps = 0.00001;
x1 = a + (b-a)*0.382;   % 0.382 ≈ 1 − 0.618, where 0.618 ≈ (√5 − 1)/2 is the golden-section ratio
x2 = a + (b-a)*0.618;
f1 = f(x1);
f2 = f(x2);
while abs(b-a)>2*eps
if f1 > f2
a = x1; x1 = x2; f1 = f2;
x2 = a + (b-a)*0.618;
f2 = f(x2); ...
Example (cont.)
...
else
b = x2; x2 = x1; f2 = f1;
x1 = a + (b-a)*0.382;
f1 = f(x1);
end
end
fprintf('The interval containing the minimum: [%f, %f]\n', a, b);
»
The interval containing the minimum : [0.999990,1.000004]
Unconstrained Optimization Methods
Introduction
Consider the unconstrained nonlinear programming problem
f(x) → min, x ∈ R^n (3)
where f(x) is continuously differentiable. To solve (3), if a solution exists, it could be found among the solutions of the system of equations
∇f(x) = 0 (4)
However, solving equations (4) in the general case is still quite complicated. This leads us to use efficient iterative methods to solve (3).
Introduction (cont.)
A common approach to solving (3) is to use iterative methods that start from an initial point x^0 and then move 'toward' the optimal solution x*; at each iteration we compute
x^{k+1} = x^k + α_k p^k, k = 0, 1, 2, ··· (5)
where
p^k is the displacement (search) direction from the point x^k,
α_k is the length of the step in the direction p^k.
The procedure (5) is completely determined once we specify the direction p^k of the move and the way to calculate the step length α_k.
Introduction (cont.)
Depending on how p^k and α_k are constructed, we obtain iterative procedures with different properties. We are particularly interested in the following two properties:
the variation of the objective function value f along the sequence {x^k};
the convergence of the sequence {x^k} to the solution x*.
Note also that different definitions of p^k and α_k require different amounts of computation.
Gradient methods
We select a direction p^k such that
⟨∇f(x^k), p^k⟩ < 0 (6)
Then, choosing α_k small enough, we have f(x^k + α_k p^k) < f(x^k); that is, moving in the direction p^k with a sufficiently small step length, we reach a point x^{k+1} with a smaller objective function value. A direction p^k satisfying (6) is therefore called a descent direction of the objective function f(x).
Iterative procedures that follow the formula (7) are called gradient methods.
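Assuming (7) is the standard gradient step x^{k+1} = x^k − α_k ∇f(x^k), a minimal MATLAB sketch with a fixed step length might look as follows (the test function, starting point, and step length are illustrative assumptions, not from the slides):
% Minimal gradient-method sketch with a constant step length alpha
f  = @(x) 0.5*(x(1)^2 + 10*x(2)^2);     % example quadratic objective
gf = @(x) [x(1); 10*x(2)];              % its gradient
x = [10; 1];                            % initial point x^0
alpha = 0.1;                            % fixed step length alpha_k
tol = 1e-6;
while norm(gf(x)) > tol
    x = x - alpha*gf(x);                % x^{k+1} = x^k - alpha_k * grad f(x^k)
end
fprintf('Approximate minimizer: (%g, %g)\n', x(1), x(2));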
Theorems with gradient methods (cont.)
Theorem 2: Suppose the function f is twice continuously differentiable and its Hessian matrix satisfies the condition
for all x, y ∈ R^n, and the sequence {x^k} is constructed according to the iterative procedure (7), where α_k is determined according to procedure 2. Then for every initial point x^0 we have
x^k → x*, f(x^k) → f(x*) as k → ∞,
f(x^k) − f(x*) ≤ q^k [f(x^0) − f(x*)],
||x^k − x*|| ≤ C q^{k/2}.
Example :
Consider the unconstrained minimization problem
f(x1, x2) = (1/2)(x1² + γx2²), γ > 0
Performing the gradient method starting from the initial solution x^0 = (γ, 1), with the step length determined according to procedure 1, we obtain a sequence of approximate solutions x^k = (x1^k, x2^k) where
x1^k = γ ((γ − 1)/(γ + 1))^k, x2^k = (−(γ − 1)/(γ + 1))^k
The approximation sequence converges slowly to the optimal solution x* = (0, 0) when γ ≫ 1 or γ ≪ 1.
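To observe this slow convergence numerically, a small MATLAB sketch (the value γ = 100 and the iteration count are illustrative choices) is:
gamma = 100;                      % badly conditioned case (gamma >> 1)
r = (gamma-1)/(gamma+1);          % contraction factor per iteration, here 99/101
k = 0:50;
x1 = gamma*r.^k;                  % first coordinate of x^k
x2 = (-r).^k;                     % second coordinate of x^k
err = sqrt(x1.^2 + x2.^2);        % distance ||x^k - x*||, x* = (0, 0)
fprintf('error after %d iterations: %.3f\n', k(end), err(end));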
Newton method
In the case that the function f is twice continuously differentiable and the computation of f'(x) and f''(x) is not difficult, we can use the quadratic term of the Taylor series expansion:
f(x) ≈ f_k(x) = f(x^k) + ⟨f'(x^k), x − x^k⟩ + (1/2) ⟨H(x^k)(x − x^k), x − x^k⟩ (9)
where f_k(x) is a quadratic approximation of the function f in a neighborhood of the point x^k.
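Assuming u^k in the update (11) below denotes the minimizer of the quadratic model (9), it is obtained by setting the gradient of f_k to zero:
∇f_k(x) = f'(x^k) + H(x^k)(x − x^k) = 0, which gives u^k = x^k − H(x^k)^{−1} f'(x^k),
so with α_k = 1 the update (11) is exactly the classical Newton step.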
x k+1 = x k + αk (u k − x k ) (11)
From (12) we infer that x^{k+1} is the minimum (stationary) point of the function f_k(x), i.e.,
where x* is the solution of the equation f'(x) = 0. Thus Newton's method converges quadratically: the distance to x* is roughly squared after each iteration.
Example
Illustrate Newton’s method for the following objective function without constraints
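The specific objective function used in this example is not shown above. As a stand-in so that the code on the following slides can be run end to end, one possible setup uses the well-known Rosenbrock function together with its gradient and Hessian (all definitions below are illustrative assumptions, not part of the original example):
f  = @(x) 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;        % stand-in objective
gf = @(x) [-400*x(1)*(x(2)-x(1)^2) - 2*(1-x(1));    % its gradient
            200*(x(2)-x(1)^2)];
hf = @(x) [1200*x(1)^2 - 400*x(2) + 2, -400*x(1);   % its Hessian
           -400*x(1),                   200];
xstart = [-1.2; 1];                                 % starting point
x = xstart;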
Example (cont.)
fval = f(x); gval = gf(x); H = hf(x);   % function value, gradient, Hessian at x
ng = norm(gval); nf = 1;                % gradient norm, number of function evaluations
tol = 0.05; iter = 0; alpha = 1;
while ng >= tol
    iter = iter + 1; nf = 0; alpha = 1;
    p = -inv(H)*gval;                   % Newton direction p = -H^{-1} grad f
    pass = 0;
    while pass == 0                     % backtracking (Armijo) line search
        ftest = f(x+alpha*p);
        nf = nf+1;
        if ftest <= fval + 0.01*alpha*gval'*p
            pass = 1;
            x = x+alpha*p;
            fval = ftest;
            gval = gf(x);
            H = hf(x);
...
Example (cont.)
...
ng = norm(gval);
else
alpha = alpha/2;
end
end
fprintf('%3i %3.2e %3.2e %3.2e %3.2e %i\n', iter, fval, ng, norm(x-xstart), alpha, nf);
end
iter f ||g|| ||x-x*|| alpha nf
1 2.18e+000 4.64e+000 2.21e+000 1.00e+000 1
2 2.08e+000 1.10e+001 2.06e+000 6.25e-002 5
3 2.00e+000 1.51e+001 1.93e+000 2.50e-001 3
4 1.91e+000 1.63e+001 1.83e+000 5.00e-001 2
5 1.79e+000 1.68e+001 1.72e+000 1.00e+000 1
6 1.46e+000 7.79e+000 1.65e+000 1.00e+000 1
7 1.36e+000 8.04e+000 1.59e+000 5.00e-001 2
8 1.23e+000 8.23e+000 1.50e+000 1.00e+000 1
9 9.87e-001 3.70e+000 1.40e+000 1.00e+000 1
10 9.82e-001 1.06e+001 1.23e+000 1.00e+000 1
11 6.72e-001 1.17e+000 1.12e+000 1.00e+000 1
.....
16 1.42e-001 4.59e+000 2.82e-001 1.00e+000 1
17 9.04e-002 4.41e-001 1.96e-001 1.00e+000 1
18 2.24e-002 2.13e+000 4.88e-002 1.00e+000 1
Searching methods
In practice the function f is not always smooth and twice differentiable. In addition, computing these derivatives may require a large amount of computation, which makes derivative-based methods unsuitable. We therefore need methods that only require the evaluation of the function value.
Denote by
e^i = (0, ··· , 0, 1, 0, ··· , 0)^T
the vector whose i-th component equals 1 and whose other components are all 0. Of course e^i ∈ R^n; this is the i-th unit (coordinate) vector of R^n.
Algorithm
Suppose we have a starting approximation x^0. Choose α_0 > 0 as a parameter of the algorithm. At iteration k = 0, 1, 2, ··· we have the approximation x^k and the step length α_k > 0.
Set
p^k = e^{i_k}, i_k = k − n[k/n] + 1 (15)
where [k/n] is the largest integer not exceeding k/n.
Calculate f(x^k + α_k p^k). If
f(x^k + α_k p^k) < f(x^k) (16)
is satisfied, then assign
x^{k+1} = x^k + α_k p^k, α_{k+1} = α_k
and go to step k + 1.
Algorithm (cont.)
If inequality (16) is not satisfied, then calculate f(x^k − α_k p^k) and check whether the inequality
f(x^k − α_k p^k) < f(x^k) (17)
holds. If it does, assign x^{k+1} = x^k − α_k p^k, α_{k+1} = α_k and go to step k + 1. If neither (16) nor (17) is satisfied, set
x^{k+1} = x^k
and update the step length α_{k+1} as follows.
Algorithm (cont.)
α_{k+1} = λ α_k, if i_k = n and x^k = x^{k−n+1}
α_{k+1} = α_k, if i_k ≠ n or x^k ≠ x^{k−n+1}
where 0 < λ < 1 is a parameter of the algorithm.
Formula (15) ensures a cyclic change of the search direction:
p^0 = e^1, p^1 = e^2, ··· , p^{n−1} = e^n, p^n = e^1, ··· , p^{2n−1} = e^n, p^{2n} = e^1, ···
The step length is reduced (α_{k+1} = λα_k) according to the above rule only if, after n consecutive steps, we have not succeeded in reducing the objective function value by satisfying either (16) or (17). A minimal code sketch of this search is given below.
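A minimal MATLAB sketch of this coordinate search following steps (15)-(17) (the test function, starting point, and parameter values are illustrative assumptions, not from the slides):
f = @(x) (x(1)-1)^2 + 2*(x(2)+0.5)^2;    % example objective
n = 2;                                   % dimension
x = [0; 0];                              % starting approximation x^0
alpha = 1;                               % initial step length alpha_0
lambda = 0.5;                            % reduction factor, 0 < lambda < 1
kmax = 200; fails = 0;
for k = 0:kmax-1
    ik = k - n*floor(k/n) + 1;           % i_k = k - n[k/n] + 1, cycles through 1..n
    p = zeros(n,1); p(ik) = 1;           % p^k = e^{i_k}
    if f(x + alpha*p) < f(x)             % condition (16)
        x = x + alpha*p; fails = 0;
    elseif f(x - alpha*p) < f(x)         % condition (17)
        x = x - alpha*p; fails = 0;
    else
        fails = fails + 1;               % no improvement along e^{i_k}
        if fails == n                    % a full cycle without improvement
            alpha = lambda*alpha;        % reduce the step length
            fails = 0;
        end
    end
end
fprintf('Approximate minimizer: (%g, %g), f = %g\n', x(1), x(2), f(x));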
Theorem: Suppose that the set
M(x^0) = {x : f(x) ≤ f(x^0)}
is bounded. Then the sequence {x^k} constructed by the method described above is a minimizing sequence for problem (14), that is,
f(x^k) → f* as k → ∞
Exercise:
Study the following optimization functions in Matlab:
FMINBND
FMINUNC
FMINSEARCH
OPTIMSET
INLINE
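For instance, a short illustrative session (the test functions and tolerances are assumptions for the example, not from the slides):
f = inline('x.*(x-1.5)','x');      % single-variable function from the earlier example
opts = optimset('TolX', 1e-6);     % OPTIMSET builds an options structure
xb = fminbnd(f, 0, 1, opts)        % FMINBND: minimization on an interval; expect xb close to 0.75
g = @(x) exp(x(1)^2 + x(2)^2);     % two-variable function from the earlier example
xs = fminsearch(g, [1; 1])         % FMINSEARCH: derivative-free Nelder-Mead; expect xs close to (0, 0)
xu = fminunc(g, [1; 1])            % FMINUNC (Optimization Toolbox): gradient-based unconstrained minimization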
Consider the unconstrained minimization problem:
Consider the unconstrained nonlinear programming problem:
Implement the gradient algorithm to solve the problem, starting from the initial solution x^0 = (1, 2). To determine the step length, use procedure 1 (which requires solving a one-variable minimization problem at each iteration).