
Chapter 7: Unconstrained Minimization Methods

Vu Van Thieu, Dinh Viet Sang, Ban Ha Bang

SoICT
Hanoi University of Science and Technology

2 / 66
Introduction

1 Recall some concepts from calculus

2 Unconstrained nonlinear programming

3 Single-variable minimization methods


Unimodal functions
Fibonacci method
Golden section method

4 Unconstrained Optimization Methods


Gradient methods
Newton method

3 / 66
Recall some concepts from calculus

n-dimensional Euclidean space


Denote by Rn the set of n-dimensional real vectors:

Rn = {x = (x1, x2, · · · , xn)^T : xi ∈ R, i = 1, 2, · · · , n}

where R is the set of real numbers. On Rn we define the following operations


Addition of two vectors u = (u1, u2, · · · , un)^T and v = (v1, v2, · · · , vn)^T

u + v = (u1 + v1 , u2 + v2 , · · · , un + vn )

Vector multiplication with a real number α

αu = (αu1 , αu2 , · · · , αun )T

Rn with the operations just defined forms a linear space. Elements of Rn are sometimes called points.
4 / 66
Recall some concepts from calculus

n-dimensional Euclidean space (cont.)


If we consider the concept of the scalar product of two vectors u, v ∈ Rn:

⟨u, v⟩ = Σ_{i=1}^{n} ui·vi

then Rn together with the scalar product becomes an n-dimensional Euclidean space.
The length of a vector u ∈ Rn is the number

||u|| = ⟨u, u⟩^{1/2} = ( Σ_{i=1}^{n} ui² )^{1/2}

5 / 66
Recall some concepts from calculus

n-dimensional Euclidean space (cont.)


The distance between two points u, v ∈ Rn is

ρ(u, v) = ||u − v|| = ( Σ_{i=1}^{n} (ui − vi)² )^{1/2}

For u, v, w ∈ Rn we have the triangle inequality

||u − v|| ≤ ||u − w|| + ||w − v||

6 / 66
Recall some concepts from calculus

n-dimensional Euclidean space (cont.)


Assume {u^k, k = 1, 2, · · · } is a sequence of points in Rn, that is, u^k ∈ Rn for k = 1, 2, · · · . The point v is called a limit point (cluster point) of the sequence {u^k} if we can find a subsequence {u^k(i)} converging to v.
The sequence {u^k} is said to be bounded if there exists a constant M ≥ 0 such that ||u^k|| ≤ M for all k = 1, 2, · · ·
The set O(x, ϵ) = {u ∈ Rn : ||u − x|| < ϵ}, the open ball centered at x with radius ϵ > 0, is called the ϵ-neighborhood of x.
The point v ∈ Rn is called a limit point of the set U ⊂ Rn if every ϵ-neighborhood of v contains a point of U different from v.

7 / 66
Recall some concepts from calculus

n-dimensional Euclidean space (cont.)


A point x ∈ X is said to be an interior point of the set X if there exists an ϵ-neighborhood of x that lies entirely in X. The set of interior points of X is denoted by int(X).
A point x is said to be a boundary point of the set X if every ϵ-neighborhood of x contains both points belonging to X and points not belonging to X. The set of boundary points of X is denoted by ∂(X).
The set X is said to be open if every point x ∈ X is an interior point of X.
A set X in Rn is said to be bounded if there exists a constant L > 0 such that ||u|| ≤ L for all u ∈ X.

8 / 66
Recall some concepts from calculus

n-dimensional Euclidean space (cont.)


A set X in Rn is said to be closed if it contains all of its boundary points.
If {x^k} is a sequence of points in the closed set X and lim_{k→+∞} x^k = x, then x ∈ X.
A set X is said to be compact if it is closed and bounded.
If {x^k} is a sequence of points in the compact set X, then from {x^k} we can always extract a convergent subsequence {x^k(i)} with lim_{k(i)→+∞} x^k(i) = x, and x ∈ X.

9 / 66
n-dimensional Euclidean space

10 / 66
Recall some concepts from calculus

Differential of the multivariable function


Definition 1: Assume the function f is defined in a neighborhood O(x, ϵ) of the point x. We say that f is differentiable at x if there exists a vector f′(x) ∈ Rn such that the increment of the function at x, ∆f(x) = f(x + ∆x) − f(x), ||∆x|| ≤ ϵ, can be written as

∆f(x) = ⟨f′(x), ∆x⟩ + o(x, ∆x)

where lim_{||∆x||→0} o(x, ∆x)/||∆x|| = 0.
The vector f′(x) is called the gradient of the function f at x and is denoted by ∇f(x).

11 / 66
Recall some concepts from calculus

Differential of the multivariable function (cont.)


Definition 2: Assume the function f is defined in a neighborhood O(x, ϵ) of the point x. We say that f is twice differentiable at x if, along with the vector f′(x), there exists a symmetric matrix f″(x) ∈ R^{n×n} such that the increment of the function at x can be written as

∆f(x) = f(x + ∆x) − f(x) = ⟨f′(x), ∆x⟩ + (1/2)⟨f″(x)∆x, ∆x⟩ + o(x, ∆x)

where lim_{||∆x||→0} o(x, ∆x)/||∆x||² = 0.
The matrix f″(x) is called the second-order derivative matrix (a.k.a. the Hessian) of the function f at x and is sometimes denoted by ∇²f(x).
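As a small illustration (not from the slides), the gradient and the Hessian can be checked numerically with central differences; the sample function, point and step size below are assumptions.

% Sketch: central-difference check of a gradient and a Hessian (sample f assumed)
f  = @(x) x(1)^2 + 3*x(1)*x(2) + exp(x(2));
gf = @(x) [2*x(1) + 3*x(2); 3*x(1) + exp(x(2))];   % analytic gradient
Hf = @(x) [2, 3; 3, exp(x(2))];                    % analytic Hessian
x0 = [1; 0.5]; h = 1e-4; n = numel(x0);
g = zeros(n,1); H = zeros(n,n); I = eye(n);
for i = 1:n
    g(i) = (f(x0 + h*I(:,i)) - f(x0 - h*I(:,i))) / (2*h);
    for j = 1:n
        H(i,j) = (f(x0 + h*I(:,i) + h*I(:,j)) - f(x0 + h*I(:,i) - h*I(:,j)) ...
                - f(x0 - h*I(:,i) + h*I(:,j)) + f(x0 - h*I(:,i) - h*I(:,j))) / (4*h^2);
    end
end
disp(norm(g - gf(x0)))       % both differences should be close to zero
disp(norm(H - Hf(x0)))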

12 / 66
Recall some concepts from calculus

Differential of the multivariable function (cont.)


Definition 3: Assume the function f is defined on an open set X. We say that f is continuously differentiable on X if f is differentiable at every point x of X and

||f′(x + ∆x) − f′(x)|| → 0 as ||∆x|| → 0, ∀x, x + ∆x ∈ X

The set of functions satisfying this property is denoted by C¹(X).

Definition 4: Assume the function f is defined on an open set X. We say that f is twice continuously differentiable on X if f is twice differentiable at every point x of X and

||f″(x + ∆x) − f″(x)|| → 0 as ||∆x|| → 0, ∀x, x + ∆x ∈ X

The set of functions satisfying this property is denoted by C²(X).

13 / 66
Recall some concepts from calculus

Differential of the multivariable function (cont.)


Taylor's formula: Assume f(x) is twice continuously differentiable in an ϵ-neighborhood of x^0. Then

f(x) = f(x^0) + ⟨f′(x^0), x − x^0⟩ + (1/2)⟨f″(x^0)(x − x^0), x − x^0⟩ + α(x, x^0)||x − x^0||²

where lim_{x→x^0} α(x, x^0) = 0; the error term can also be written as o(||x − x^0||²).
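A quick numerical sketch of Taylor's formula (an illustration only; the sample function, point and displacements are assumptions): the error of the quadratic approximation divided by ||x − x^0||² should tend to zero.

% Sketch: the quadratic Taylor model has error o(||x - x0||^2) (sample f assumed)
f  = @(x) exp(x(1)) + x(2)^2 + x(1)*x(2);
gf = @(x) [exp(x(1)) + x(2); 2*x(2) + x(1)];
Hf = @(x) [exp(x(1)), 1; 1, 2];
x0 = [0.3; -0.2]; d0 = [1; 1];
for t = [1e-1 1e-2 1e-3]
    d = t*d0;                                      % displacement x - x0
    q = f(x0) + gf(x0)'*d + 0.5*d'*Hf(x0)*d;       % quadratic model at x0 + d
    fprintf('||d|| = %.0e   |error|/||d||^2 = %.3e\n', norm(d), abs(f(x0+d) - q)/norm(d)^2);
end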

14 / 66
Recall some concepts from calculus

Differential of the multivariable function (cont.)


The finite-increments formula: Suppose the function f is continuously differentiable on the open set S and x is a vector of S. Then for any vector y satisfying x + y ∈ S, one can always find a number α ∈ [0, 1] such that

f(x + y) − f(x) = ⟨f′(x + αy), y⟩ = ∫₀¹ ⟨f′(x + ty), y⟩ dt

If f is twice continuously differentiable, then we have:

f(x + y) − f(x) = ⟨f′(x), y⟩ + (1/2)⟨f″(x + αy)y, y⟩

15 / 66
Differential of the multivariable function

16 / 66
Recall some concepts from calculus

Extrema of Multivariable Functions


Consider the optimization problem

f(x) → min, x ∈ X

where X ⊂ Rn and f is a function defined on X.


Definition 5: The point x∗ ∈ X is called a global minimum point of f on X if f(x∗) ≤ f(x), ∀x ∈ X.
The value f(x∗) is the minimum value of f on X and we denote it by min{f(x) : x ∈ X}.
The point x∗ ∈ X is called a local minimum point of f on X if there exists a neighborhood O(x∗, ϵ), ϵ > 0, such that f(x∗) ≤ f(x) for all x ∈ O(x∗, ϵ) ∩ X.

17 / 66
Recall some concepts from calculus

Extrema of Multivariable Functions (cont.)


Definition 6: Suppose the function f is bounded below on X. The value f∗ is called the greatest lower bound (infimum) of f on X if
1 f∗ ≤ f(x) for all x ∈ X
2 For every ϵ > 0 there exists u^ϵ ∈ X such that f(u^ϵ) < f∗ + ϵ
We then write inf_{x∈X} f(x) = f∗

Note
Obviously, if the function f attains a global minimum on X, then

inf_{x∈X} f(x) = min_{x∈X} f(x)

18 / 66
Recall some concepts from calculus

Some examples
f(x) = (x − 1)² has a global minimum at x∗ = 1 with f(x∗) = 0.
f(x) = e^x + e^{−x} − 3x²: the optimal value is f(x∗) ≈ −7.02; the global minimum is attained at the two points x∗ ≈ ±2.84, and there are no other local minima.
f(x) = e^{−x} has the lower bound 0, which is not attained; there is neither a local nor a global minimum.
f(x) = −x + e^{−x}: the objective function is not bounded below and has neither a local nor a global minimum.
f(x) = e^x + e^{−x} − 3x² + x has local minima at x1 ≈ −2.9226 and x2 ≈ 2.7418, of which x1 is the global minimum point; the optimal value is about −9.9040 (see the numerical check below).
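These values can be verified numerically; a small sketch using MATLAB's fminbnd for the last example (the search interval is an assumption chosen to bracket the stated global minimizer):

% Sketch: numerical check of the last example (search interval assumed)
f = @(x) exp(x) + exp(-x) - 3*x.^2 + x;
[xmin, fmin] = fminbnd(f, -5, 0);     % bracket the global minimizer x1
fprintf('x* = %.4f,  f(x*) = %.4f\n', xmin, fmin);   % approximately -2.9226 and -9.9040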

19 / 66
Recall some concepts from calculus

20 / 66
Recall some concepts from calculus

21 / 66
Extrema of Multivariable Functions

22 / 66
Unconstrained nonlinear programming

Definition
Consider the unconstrained nonlinear programming problem:

min{f (x) : x ∈ Rn } (1)

where f (x) is continuously differentiable.

Theorems
Theorem 1 (Necessary condition for optimality): A necessary condition for x^0 to be a local minimum is

∇f(x^0) = 0 (2)

Condition (2) is called the stationarity condition, and a point x^0 satisfying (2) is called a stationary point. Therefore, solving problem (1) can be reduced to solving the system of equations (2).

23 / 66
Unconstrained nonlinear programming

Some examples
f(x) = x² − 3x − 1: the equation f′(x) = 2x − 3 = 0 has the unique solution x^0 = 3/2, which is a local minimum and at the same time a global minimum.
f(x1, x2) = x1² − x2² − 2x1x2 + x1: the system ∇f(x) = (2x1 − 2x2 + 1, −2x1 − 2x2)^T = 0 has the unique solution x^0 = (−1/4, 1/4). However, x^0 is not an optimal solution of the problem min{f(x) : x ∈ R²}, because f(−1/4, 1/4) = −1/8 > −1 = f(0, 1).

24 / 66
Unconstrained nonlinear programming
Theorem (cont.)
Theorem 2 (Sufficient condition for optimality): Suppose f is twice continuously differentiable. A stationary point x^0 is a local minimum if the matrix f″(x^0) is positive definite.
To decide whether a matrix is positive definite, the following Sylvester criterion can be used.
Sylvester's criterion: A symmetric matrix A = (aij)_{n×n} is positive semidefinite if and only if all of its principal minors are non-negative:

△_{i1,i2,··· ,ik} = det( a_{i1 i1} a_{i1 i2} · · · a_{i1 ik} ; a_{i2 i1} a_{i2 i2} · · · a_{i2 ik} ; · · · ; a_{ik i1} a_{ik i2} · · · a_{ik ik} ) ≥ 0

for all 1 ≤ i1 < i2 < · · · < ik ≤ n and all k = 1, 2, · · · , n. In particular, A is positive definite if and only if its leading principal minors △_{1,2,··· ,k}, k = 1, · · · , n, are all strictly positive.
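As an illustration (the test matrix is an arbitrary assumption), the leading principal minors and the eigenvalues can be compared directly in MATLAB:

% Sketch: positive definiteness via leading principal minors (test matrix assumed)
A = [2 -1 0; -1 2 -1; 0 -1 2];                           % sample symmetric matrix
minors = arrayfun(@(k) det(A(1:k, 1:k)), 1:size(A,1));   % leading principal minors
fprintf('leading minors: %s\n', mat2str(minors));        % here: [2 3 4], all positive
fprintf('positive definite: %d  (smallest eigenvalue %g)\n', all(minors > 0), min(eig(A)));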
25 / 66
Unconstrained nonlinear programming

Some examples
Consider f(x1, x2) = e^{x1² + x2²}: the system

∇f(x) = ( 2x1·e^{x1² + x2²}, 2x2·e^{x1² + x2²} )^T = 0

has the unique solution x^0 = (0, 0). Since f″(0, 0) = [2 0; 0 2] is positive definite (its leading principal minors 2 and 4 are positive), x^0 is a local minimum and at the same time the optimal solution of the problem.

26 / 66
Unconstrained nonlinear programming

Some examples (cont.)


Consider f(x1, x2) = −x1² + x2² − 2x1x2 − x1: the system

∇f(x) = ( −2x1 − 2x2 − 1, −2x1 + 2x2 )^T = 0

has the unique solution x^0 = (−1/4, −1/4). Since f″(x^0) = [−2 −2; −2 2] has determinant det f″(x^0) = −8 < 0, the Hessian is indefinite, so x^0 is not a minimum of the function f(x).

27 / 66
Unconstrained nonlinear programming

28 / 66
Single-variable minimization methods

Unimodal function
Unimodal function is a function with only one maximum or minimum point on the specified
interval.
Definition: If x∗ is the unique minimum point of f(x) in the range a ≤ x ≤ b, then f(x) is unimodal on the interval if and only if, for any two points x1 and x2,
x∗ ≤ x1 ≤ x2 implies that f(x∗) ≤ f(x1) ≤ f(x2)
x∗ ≥ x1 ≥ x2 implies that f(x∗) ≤ f(x1) ≤ f(x2)

Note
The search methods introduced below all require the function to be unimodal.

29 / 66
Single-variable minimization methods

Example
Assuming the interval containing the minimum is [0, 1], we calculate the value of the function at two points x1 < x2: f(x1) = f1, f(x2) = f2. There are 3 possibilities:
1 f1 < f2: the minimum point x cannot lie to the right of x2, so the interval containing the minimum is [0, x2]
2 f1 > f2: the minimum point x cannot lie to the left of x1, so the interval containing the minimum is [x1, 1]
3 f1 = f2: the two intervals [0, x1) and (x2, 1] can both be removed, and the interval containing the minimum is [x1, x2]

30 / 66
Single-variable minimization methods

Searching diagram

Suppose we have an initial interval [a, b] containing the minimum.

1 Calculate x1 = a + (b − a)/2 − e/2 and x2 = a + (b − a)/2 + e/2, where e is the precision.
2 Calculate f1 = f(x1) and f2 = f(x2)
3 ▶ If f1 < f2 then set b = x2 (remove the segment x > x2);
▶ If f1 > f2 then set a = x1 (remove the segment x < x1);
▶ If f1 = f2 then set a = x1, b = x2 (remove the segments x < x1 and x > x2);
4 If |b − a| < 2e then terminate; otherwise go back to step 1

31 / 66
Single-variable minimization methods
Example :
Write a script to calculate the minimum of f(x) = x(x − 1.5) on the interval (a, b) = (0, 1) with precision e = 0.01, compared to the exact solution x∗ = 0.75.

f = inline(’x.*(x-1.5)’,’x’);
eps = 0.01;
a = 0; b = 1; k = 0;
while abs(b-a)>= 2*eps
x1=a + (b-a)/2 - eps/2; x2=a + (b-a)/2 + eps/2;
f1=f(x1); f2=f(x2); k=k+1;
if f1<f2 b=x2;
elseif f1>f2 a=x1;
else a=x1;b=x2;
end
end
...
32 / 66
Single-variable minimization methods

Example (cont.) :
...
fprintf(’Number of iterations k= %d ’,k);
fprintf(’Length of segment : b-a = %d’,b-a);
fprintf(’x= %d’,x1);
Result
» Number of iterations k=7
Length of segment : b-a = 1.773438e-002
x=7.502344e-001

33 / 66
Single-variable minimization methods

Fibonacci method
Definition: The Fibonacci sequence is defined recursively as follows

F0 = 1; F1 = 1;
Fk = Fk−1 + Fk−2, k ≥ 2;

We need to choose the number of iterations N before starting the search. The selection of the two points x1^(k) and x2^(k) at iteration k is determined by the formulas:

x1^(k) = ak + (F_{N−1−k} / F_{N+1−k}) (bk − ak), k = 0, 1, · · · , N − 1
x2^(k) = ak + (F_{N−k} / F_{N+1−k}) (bk − ak), k = 0, 1, · · · , N − 1
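The slides give only the placement formulas, so here is a minimal sketch of the Fibonacci method (the test function, interval and number of iterations N are assumptions; the degenerate last step, where the two trial points coincide, is skipped):

% Sketch of the Fibonacci method (test function, interval and N are assumed)
f = @(x) x.*(x - 1.5);                 % same test function as before, minimum at 0.75
a = 0; b = 1; N = 15;
F = ones(1, N + 2);                    % F(k+1) stores F_k, i.e. F(1) = F0, F(2) = F1, ...
for k = 3:N+2, F(k) = F(k-1) + F(k-2); end
for k = 0:N-2                          % last step skipped: there x1 and x2 coincide
    x1 = a + F(N-k)/F(N+2-k)*(b - a);    % = a + F_{N-1-k}/F_{N+1-k}*(b - a)
    x2 = a + F(N+1-k)/F(N+2-k)*(b - a);  % = a + F_{N-k}/F_{N+1-k}*(b - a)
    if f(x1) < f(x2), b = x2; else a = x1; end
end
fprintf('Interval containing the minimum: [%f, %f]\n', a, b);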

34 / 66
Single-variable minimization methods

Golden section method


To improve on the Fibonacci method without fixing the number of iterations N in advance, we can apply a fixed ratio when dividing the interval [ak, bk], using

lim_{N→∞} F_{N−1}/F_{N+1} ≈ 0.382,   lim_{N→∞} F_N/F_{N+1} ≈ 0.618

Thus the golden section method chooses the two points x1^(k) and x2^(k) at each iteration according to the formulas

x1^(k) = ak + 0.382 (bk − ak),
x2^(k) = ak + 0.618 (bk − ak),   k = 0, 1, 2, · · ·

35 / 66
Single-variable minimization methods
Example
Implement the golden section method to find the minimum of the function

f(x) = x² − 2x + 1

on the interval [0, 2].

f = inline('x^2-2*x+1');
a = 0; b = 2; eps = 0.00001;
x1 = a + (b-a)*0.382;
x2 = a + (b-a)*0.618;
f1 = f(x1);
f2 = f(x2);
while abs(b-a)>2*eps
if f1 > f2
a = x1; x1 = x2; f1 = f2;
x2 = a + (b-a)*0.618;
f2 = f(x2);
...
36 / 66
Single-variable minimization methods

Example (cont.)
...
else
b = x2; x2 = x1; f2 = f1;
x1 = a + (b-a)*0.382;
f1 = f(x1);
end
end
fprintf(’The interval containing the minimum : [%f,%f]’,a,b);
»
The interval containing the minimum : [0.999990,1.000004]

37 / 66
Single-variable minimization methods

38 / 66
Unconstrained Optimization Methods

Introduction
Consider the unconstrained nonlinear programming problem

f (x) → min, x ∈ Rn (3)

where f(x) is continuously differentiable. To solve (3), if a solution exists, it can be found among the solutions of the system of equations

∇f(x) = 0 (4)

However, solving the system (4) in the general case is still quite complicated. This motivates the use of efficient iterative methods for solving (3) directly.

39 / 66
Unconstrained Optimization Methods

Introduction (cont.)
A common approach to solving (3) is to use iterative methods that start from an initial value x^0 and then move 'toward' the optimal value x∗; at each iteration we calculate

x^{k+1} = x^k + αk p^k, k = 0, 1, 2, · · · (5)

where
p^k is the displacement (direction) vector at the point x^k,
αk is the length of the move in the direction p^k.
Clearly, procedure (5) is fully determined once we specify the direction p^k of the move and the way of calculating the step length αk.

40 / 66
Unconstrained Optimization Methods

Introduction (cont.)
Depending on how p^k and αk are constructed, we obtain iterative procedures with different properties. We are particularly interested in the following two properties:
The variation of the objective function values f(x^k) along the sequence {x^k}
The convergence of the sequence {x^k} to the solution x∗.
Note also that different choices of p^k and αk require different amounts of computation.

41 / 66
Unconstrained Optimization Methods

Gradient methods
We select the direction p^k such that

⟨∇f(x^k), p^k⟩ < 0 (6)

because, when αk is chosen small enough, we have

f(x^{k+1}) = f(x^k + αk p^k) = f(x^k) + αk⟨∇f(x^k), p^k⟩ + o(αk) < f(x^k)

that is, moving in the direction p^k with a sufficiently small step, we reach a point x^{k+1} with a smaller objective function value. A direction p^k satisfying (6) is therefore called a descent direction of the objective function f(x).

42 / 66
Unconstrained Optimization Methods

Gradient methods (cont.)


One vector satisfying inequality (6) is obtained by taking the direction opposite to the gradient of the function f at x^k, p^k = −∇f(x^k), scaled by a step length αk > 0. Then we have the iterative procedure

x^{k+1} = x^k − αk ∇f(x^k), αk > 0, k = 0, 1, 2, · · · (7)

Iterative procedures that follow formula (7) are called gradient methods.

43 / 66
Unconstrained Optimization Methods

Gradient methods (cont.)


Since the search direction is fixed, gradient methods differ in the choice of αk. We list some of the basic choices below.
Procedure 1: Solve the one-variable minimization problem

min{φk(λ) : λ ≥ 0}, where φk(λ) = f(x^k − λ∇f(x^k))

The optimal solution of this problem is taken as the value of αk.

Procedure 2: To select αk we perform the following process
1 Set α = ᾱ, where ᾱ > 0 is a given initial trial value
2 Set u = x^k − α∇f(x^k) and calculate f(u)
3 Check whether

f(u) − f(x^k) ≤ −ϵα||∇f(x^k)||², with 0 < ϵ < 1 (8)

4 If inequality (8) is satisfied then set αk = α; otherwise set α = α/2 and go back to step 2.
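A minimal sketch of the gradient method (7) with the step-size rule of Procedure 2 (the test function, the initial trial step ᾱ, ϵ and the stopping tolerance are assumptions):

% Sketch: gradient method with step-size test (8) (test function and parameters assumed)
f  = @(x) x(1)^2 + 10*x(2)^2;
gf = @(x) [2*x(1); 20*x(2)];
x = [5; 1]; alphaBar = 1; epsArm = 0.1; tol = 1e-6;
while norm(gf(x)) > tol
    g = gf(x); alpha = alphaBar; u = x - alpha*g;
    while f(u) - f(x) > -epsArm*alpha*norm(g)^2    % inequality (8) not yet satisfied
        alpha = alpha/2; u = x - alpha*g;          % halve the step and try again
    end
    x = u;                                         % accept the step
end
fprintf('x = (%g, %g),  f(x) = %g\n', x(1), x(2), f(x));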

44 / 66
Unconstrained Optimization Methods

Theorems with gradient methods


Theorem 1: Suppose f(x) is bounded below and its gradient f′(x) satisfies the Lipschitz condition

||f′(x) − f′(y)|| ≤ L||x − y||

for all x, y ∈ Rn, and that αk is selected according to Procedure 2. Then the iterative formula (7) produces a sequence {x^k} satisfying

||f′(x^k)|| → 0 as k → ∞

for every initial point x^0.

45 / 66
Unconstrained Optimization Methods
Theorems with gradient methods (cont.)
Theorem 2: Suppose the function f is twice continuously differentiable and its Hessian matrix satisfies the condition

m||y||² ≤ ⟨H(x)y, y⟩ ≤ M||y||², M ≥ m > 0

for all x, y ∈ Rn, and the sequence {x^k} is constructed according to the iterative procedure (7) where αk is determined according to Procedure 2. Then for every initial point x^0 we have

x^k → x∗, f(x^k) → f(x∗), as k → ∞

where x∗ is the minimum point of f(x); moreover, for some constants C and 0 ≤ q < 1,

f(x^k) − f(x∗) ≤ q^k [f(x^0) − f(x∗)],
||x^k − x∗|| ≤ C q^{k/2}
46 / 66
Unconstrained Optimization Methods

Example :
Consider the unconstrained minimization problem

f(x1, x2) = (1/2)(x1² + γx2²), (γ > 0)

Performing the gradient method starting from the initial solution x^0 = (γ, 1), with the step length determined according to Procedure 1, we obtain a sequence of approximate solutions x^k = (x1^k, x2^k) where

x1^k = γ ((γ − 1)/(γ + 1))^k,   x2^k = (−(γ − 1)/(γ + 1))^k

The approximation sequence converges slowly to the optimal solution x∗ = (0, 0) when γ ≫ 1 or γ ≪ 1.
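A sketch reproducing this example numerically (the value of γ and the number of iterations are assumptions); for the quadratic f(x) = ½⟨Hx, x⟩ the exact line search of Procedure 1 gives αk = ⟨g^k, g^k⟩ / ⟨Hg^k, g^k⟩ with g^k = ∇f(x^k):

% Sketch: steepest descent with exact line search on f = (x1^2 + gamma*x2^2)/2
gamma = 10;                                  % assumed value; try gamma = 100 as well
H = diag([1, gamma]);
x = [gamma; 1]; q = (gamma - 1)/(gamma + 1);
K = 25;                                      % assumed number of iterations
for k = 1:K
    g = H*x;                                 % gradient of the quadratic
    alpha = (g'*g)/(g'*H*g);                 % exact minimizer along -g (Procedure 1)
    x = x - alpha*g;
end
fprintf('computed x^K = (%.3e, %.3e)\n', x(1), x(2));
fprintf('closed form  = (%.3e, %.3e)\n', gamma*q^K, (-q)^K);   % formula from the slide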

47 / 66
Unconstrained Optimization Methods

48 / 66
Unconstrained Optimization Methods

Newton method
In the case that the function f is twice continuously differentiable and the calculation of f′(x) and f″(x) is not difficult, we can use the quadratic part of the Taylor-series expansion:

fk(x) = f(x^k) + ⟨f′(x^k), x − x^k⟩ + (1/2)⟨H(x^k)(x − x^k), x − x^k⟩ (9)

which is a quadratic approximation of the function f in a neighborhood of the point x^k.

49 / 66
Unconstrained Optimization Methods

Newton method (cont.)


It is easy to see that when x^k is very close to x∗, the gradient f′(x^k) is very close to 0, so the quadratic term of (9) gives us more precise information about the variation of the function f in the neighborhood of x^k.
We determine the approximation vector u k from the condition

fk (u k ) = min fk (x) (10)

and build the next approximation solution

x k+1 = x k + αk (u k − x k ) (11)

So, depending on the choice of αk, we obtain different methods. If αk = 1 for every k, we have Newton's method.

50 / 66
Unconstrained Optimization Methods

Newton method (cont.)


From (11) we have x^{k+1} = u^k when choosing αk = 1, and condition (10) becomes

fk(x^{k+1}) = min fk(x), k = 1, 2, · · · (12)

From (12) we infer that x^{k+1} is a stationary point of the function fk(x), i.e.

∇fk(x) = ∇f(x^k) + H(x^k)(x − x^k) = 0

So if H(x^k) is nonsingular, we obtain the following Newton formula

x^{k+1} = x^k − [H(x^k)]^{−1} ∇f(x^k), k = 1, 2, · · · (13)
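Formula (13) is only a few lines of MATLAB; a minimal sketch of the pure Newton iteration with αk = 1 (the test function, its derivatives and the starting point are assumptions), before the damped variant with step-length control shown in the next example:

% Sketch: pure Newton iteration (13), alpha_k = 1 (test function and start assumed)
% f(x) = x1^4 + x1^2 - x1*x2 + x2^2  (strictly convex, minimum at the origin)
gf = @(x) [4*x(1)^3 + 2*x(1) - x(2); -x(1) + 2*x(2)];   % gradient
Hf = @(x) [12*x(1)^2 + 2, -1; -1, 2];                   % Hessian
x = [1; 1];
while norm(gf(x)) > 1e-10
    x = x - Hf(x) \ gf(x);        % Newton step: solve H(x^k) p = -grad f(x^k)
end
disp(x')                          % converges to the stationary point (0, 0)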

51 / 66
Unconstrained Optimization Methods

Newton method (cont.)


Convergence theorem: Suppose H(x) is invertible, [H(x)]^{−1} is bounded, and H(x) satisfies the Lipschitz condition

||H(x) − H(y)|| ≤ L||x − y||, ∀x, y ∈ Rn

Then the sequence {x^k} built by (13) satisfies

||x^{k+1} − x∗|| ≤ c||x^k − x∗||²

where x∗ is the solution of the equation f′(x) = 0. So the distance to the solution is squared at each iteration, i.e. Newton's method converges quadratically.

52 / 66
Unconstrained Optimization Methods

Example
Illustrate Newton’s method for the following objective function without constraints

f (x) = 100(x2 − x12 )2 + (1 − x1 )2

The algorithm uses the functions f(x), gf(x) (gradient) and hf(x) (Hessian):

f = inline('100*(x(2)-x(1)^2)^2 + (1-x(1))^2');
gf = inline('[100*2*(x(2)-x(1)^2)*(-2*x(1)) + 2*x(1)-2; 200*(x(2)-x(1)^2)]');
hf = inline('[1200*x(1)^2 - 400*x(2) + 2, -400*x(1); -400*x(1), 200]');

53 / 66
Unconstrained Optimization Methods
Example (cont.)
% Initialization (not shown on the slide): a starting point x must be chosen first.
% The values below are assumptions; 'xstart' is only used in the progress printout,
% where, judging by the result table, it plays the role of the known minimizer x* = (1,1).
xstart = [1; 1];               % reference point for the ||x - x*|| column of the table
x = [-1.2; 1];                 % assumed starting point
fval = f(x); gval = gf(x); H = hf(x); ng = norm(gval); nf = 1; tol = 0.05; iter = 0; alpha = 1;
while ng >= tol
iter = iter + 1;nf = 0;alpha = 1;p = -inv(H)*gval;pass = 0;
while pass == 0
ftest = f(x+alpha*p);
nf = nf+1;
if ftest <= fval + 0.01*alpha*gval’*p
pass = 1;
x = x+alpha*p;
fval = ftest;
gval = gf(x);
H = hf(x);
...
54 / 66
Unconstrained Optimization Methods

Example (cont.)
...
ng = norm(gval);
else
alpha = alpha/2;
end
end
fprintf('%3i  %3.2e  %3.2e  %3.2e  %3.2e  %i\n',iter,fval,ng,norm(x-xstart),alpha,nf);
end

55 / 66
Unconstrained Optimization Methods
iter f ||g|| ||x-x*|| alpha nf
1 2.18e+000 4.64e+000 2.21e+000 1.00e+000 1
2 2.08e+000 1.10e+001 2.06e+000 6.25e-002 5
3 2.00e+000 1.51e+001 1.93e+000 2.50e-001 3
4 1.91e+000 1.63e+001 1.83e+000 5.00e-001 2
5 1.79e+000 1.68e+001 1.72e+000 1.00e+000 1
6 1.46e+000 7.79e+000 1.65e+000 1.00e+000 1
7 1.36e+000 8.04e+000 1.59e+000 5.00e-001 2
8 1.23e+000 8.23e+000 1.50e+000 1.00e+000 1
9 9.87e-001 3.70e+000 1.40e+000 1.00e+000 1
10 9.82e-001 1.06e+001 1.23e+000 1.00e+000 1
11 6.72e-001 1.17e+000 1.12e+000 1.00e+000 1
.....
16 1.42e-001 4.59e+000 2.82e-001 1.00e+000 1
17 9.04e-002 4.41e-001 1.96e-001 1.00e+000 1
18 2.24e-002 2.13e+000 4.88e-002 1.00e+000 1
56 / 66
Unconstrained Optimization Methods
Searching methods
In practice the function f is not always smooth and twice differentiable. Moreover, computing these derivatives may require a large amount of work, so derivative-based methods are not always suitable. We therefore need methods that only require evaluations of the function value:

f(x) → min, x ∈ Rn (14)

Denote by

e^i = (0, · · · , 0, 1, 0, · · · , 0)^T

the vector whose components are all zero except for a one in the i-th position. Of course e^i ∈ Rn; this is the i-th unit (coordinate) vector.
57 / 66
Searching methods

Algorithm
Suppose we have a starting approximation x^0 and choose α0 > 0 as a parameter of the algorithm. At iteration k = 0, 1, 2, · · · we have the current approximation x^k and step length αk > 0.
Set

p^k = e^{ik}, ik = k − n[k/n] + 1 (15)

where [k/n] is the largest integer not exceeding k/n.
Calculate f(x^k + αk p^k); if

f(x^k + αk p^k) < f(x^k) (16)

is satisfied, then assign

x^{k+1} = x^k + αk p^k, αk+1 = αk

and go to step k + 1.

58 / 66
Searching methods

Algorithm (cont.)
If inequality (16) is not satisfied, then calculate f(x^k − αk p^k) and check whether

f(x^k − αk p^k) < f(x^k) (17)

If (17) is satisfied, then assign

x^{k+1} = x^k − αk p^k, αk+1 = αk

and go to step k + 1.
An iteration succeeds if either (16) or (17) is satisfied. Otherwise, we set

x^{k+1} = x^k

and update the step length as follows:

59 / 66
Searching methods

Algorithm (cont.)
αk+1 = λαk, if ik = n and x^k = x^{k−n+1}
αk+1 = αk, if ik ≠ n or x^k ≠ x^{k−n+1}

where 0 < λ < 1 is a parameter of the algorithm.
Formula (15) ensures a cyclic change of the search direction:

p^0 = e^1, p^1 = e^2, · · · , p^{n−1} = e^n, p^n = e^1, · · · , p^{2n−1} = e^n, p^{2n} = e^1, · · ·

The step length is reduced to αk+1 = λαk only when, after n consecutive iterations (one full cycle of directions), we have not succeeded in reducing the objective function value by either test (16) or (17).
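A minimal sketch of this cyclic coordinate search (the test function, starting point, α0, λ and stopping rule are assumptions; for compactness a whole cycle of directions is done inside the for-loop):

% Sketch of the cyclic coordinate search (test function and parameters assumed)
f = @(x) (x(1) - 2)^2 + 5*(x(2) + 1)^2;
x = [0; 0]; alpha = 1; lambda = 0.5; n = numel(x);
while alpha > 1e-6                               % stop when the step becomes tiny
    success = false;
    for i = 1:n                                  % one full cycle p = e_1, ..., e_n
        e = zeros(n, 1); e(i) = 1;
        if f(x + alpha*e) < f(x)                 % test (16)
            x = x + alpha*e; success = true;
        elseif f(x - alpha*e) < f(x)             % test (17)
            x = x - alpha*e; success = true;
        end
    end
    if ~success, alpha = lambda*alpha; end       % no success in the whole cycle: shrink
end
fprintf('x = (%g, %g),  f(x) = %g\n', x(1), x(2), f(x));   % minimum is at (2, -1)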

60 / 66
Unconstrained Optimization Methods

Searching methods (cont.)


Convergence Theorem: Suppose f is a continuously differentiable convex function on Rn , and
x 0 is chosen such that the set

M(x 0 ) = {x : f (x) ≤ f (x 0 )}

is bounded. Then the sequence {x^k} constructed by the method described above is a minimizing sequence for problem (14), that is

f(x^k) → f∗, k → ∞

where f∗ is the optimal value of problem (14).

61 / 66
Unconstrained Optimization Methods

62 / 66
Exercise :
Study the following optimization-related functions in MATLAB:

FMINBND
FMINUNC
FMINSEARCH
OPTIMSET
INLINE
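For orientation, a small sketch of how these functions fit together (the objectives below are assumptions; fminbnd, fminsearch, optimset and inline are in base MATLAB, fminunc requires the Optimization Toolbox, and inline is deprecated in recent releases in favour of anonymous functions):

% Sketch: trying out the listed functions on assumed test objectives
f1 = @(x) x.*(x - 1.5);                      % one-variable objective, minimum at 0.75
f2 = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;       % two-variable objective, minimum at (1, -2)
opts = optimset('Display', 'final', 'TolX', 1e-8);
xb = fminbnd(f1, 0, 1, opts)                 % bounded one-variable minimization
xs = fminsearch(f2, [0; 0], opts)            % derivative-free Nelder-Mead search
xu = fminunc(f2, [0; 0])                     % gradient-based method (Optimization Toolbox)
g  = inline('x.^2 - x', 'x');                % inline objects, as used in the slide examples
g(0.5)                                       % evaluates to -0.25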

63 / 66
Consider the unconstrained minimization problem:

f(x) = e^{x1} (4x1² + 2x2² + 4x1x2 + 2x2 + 1)

Initial point x 0 = (−1, 1)

64 / 66
Consider the unconstrained nonlinear programming problem:

f (x1 , x2 ) = (3x1 − 9)2 + (4x2 − 10)2 → min, x = (x1 , x2 ) ∈ R2 .

Implement the gradient algorithm to solve the problem, starting from the initial solution x^0 = (1, 2). To determine the step length, use Procedure 1 (which requires solving a one-variable minimization problem).

65 / 66
66 / 66
