Line Search and Trust-Region


Line search and trust-region

Trust-region methods
The trust-region model
Least Squares; Levenberg-Marquardt
The trust-region subproblem
The Dogleg algorithm
The trust-region algorithm

Nonlinear Optimization
Trust-region methods

Niclas Börlin
Department of Computing Science
Umeå University
niclas.borlin@cs.umu.se

December 3, 2007

◮ Line search and trust-region are two examples of global strategies that modify a (usually) locally convergent algorithm, e.g. Newton, to become globally convergent.
◮ At every iteration k, both global strategies enforce the descent condition

      f(xk+1) < f(xk)

  by controlling the length and direction of the step.

© 2007 Niclas Börlin, CS, UmU

The trust-region model

◮ Line search strategies choose the direction first, followed by the distance.
◮ Trust-region strategies choose the maximum distance first, followed by the direction.
◮ Trust-region methods use the quadratic model

      mk(p) = fk + pᵀgk + ½ pᵀBk p,    fk = f(xk),  gk = ∇f(xk).

◮ Newton-type trust-region methods have Bk = ∇²f(xk).
◮ The model is “trusted” within a limited region around the current point xk defined by

      ‖p‖ ≤ ∆k.

  This will limit the length of the step from xk to xk+1.

[Figure: two contour plots comparing a line-search step and a trust-region step p from the current point xk.]

◮ The value of ∆k will be adjusted up if the model is found to be in “good” agreement with the objective function, and down if the model is a “poor” approximation.
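The quadratic model above is easy to write down in code. The following sketch (function names and the example objective are my own, not from the slides) builds mk from fk, gk, and Bk; for a quadratic objective with Bk equal to the exact Hessian, the model reproduces the objective exactly:

```python
import numpy as np

def quadratic_model(f_k, g_k, B_k):
    """Return the trust-region model m_k(p) = f_k + p^T g_k + 0.5 p^T B_k p."""
    def m_k(p):
        return f_k + p @ g_k + 0.5 * p @ B_k @ p
    return m_k

# Illustrative example: f(x) = x1^2 + 2*x2^2, modeled at x_k = (1, 1).
f = lambda x: x[0]**2 + 2 * x[1]**2
x_k = np.array([1.0, 1.0])
g = np.array([2.0, 4.0])        # gradient of f at x_k
B = np.diag([2.0, 4.0])         # Hessian of f (Newton-type: B_k = Hess f)
m = quadratic_model(f(x_k), g, B)
p = np.array([-0.5, -0.5])
# Since f is quadratic and B_k is its exact Hessian, m(p) equals f(x_k + p).
```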

The trust-region subproblem

◮ At iteration k of a trust-region method, the following subproblem must be solved:

      min over p:  mk(p) = fk + pᵀgk + ½ pᵀBk p,
      s.t.  ‖p‖ ≤ ∆k.

◮ It can be shown that the solution p* of this constrained problem is the solution of the linear equation system

      (Bk + λI)p* = −gk

  for some λ ≥ 0 such that the matrix (Bk + λI) is positive semidefinite.
◮ Furthermore, λ(∆k − ‖p*‖) = 0.
◮ Note that if Bk = ∇²f(xk) is positive definite and ∆k is big enough, the solution of the trust-region subproblem is the solution of

      ∇²f(xk)p = −∇f(xk),

  i.e. p is a Newton direction.
◮ Otherwise,

      ∆k ≥ ‖pk‖ = ‖(∇²f(xk) + λI)⁻¹∇f(xk)‖,

  so if ∆k → 0, then λ → ∞ and pk → −(1/λ)∇f(xk).
◮ When λ varies between 0 and ∞, the corresponding search direction pk(λ) will vary between the Newton direction and a multiple of the negative gradient.


The reduction ratio

◮ To enable adaption of the trust-region size ∆k, the reduction ratio

      ρk = (f(xk) − f(xk + pk)) / (mk(0) − mk(pk)) = actual reduction / predicted reduction

  is defined.
◮ If the reduction ratio is large, e.g. ρk > 3/4, the trust-region size is increased in the next iteration.
◮ If the reduction ratio is small, e.g. ρk < 1/4, the trust-region size is decreased in the next iteration.
◮ Furthermore, a step pk will only be accepted if the reduction ratio is not too small.

[Figure: the family of search directions p(λ) sweeping from the Newton step pN to the negative gradient −g as λ grows.]


The trust-region algorithm

◮ Specify a starting approximation x0, maximum step length ∆̂, initial trust-region size ∆0 ∈ (0, ∆̂), and acceptance constant η ∈ [0, 1/4).
◮ For k = 0, 1, . . . until xk is optimal:
  ◮ Solve

        min over p:  mk(p) = fk + pᵀgk + ½ pᵀBk p,
        s.t.  ‖p‖ ≤ ∆k

    approximately for a trial step pk.
  ◮ Calculate the reduction ratio

        ρk = (f(xk) − f(xk + pk)) / (mk(0) − mk(pk))

    for pk.
  ◮ Update the current point:

        xk+1 = xk + pk   if ρk > η,
        xk+1 = xk        otherwise.

  ◮ Update the trust-region radius:

        ∆k+1 = ∆k/4            if ρk < 1/4,
        ∆k+1 = min(2∆k, ∆̂)    if ρk > 3/4 and ‖pk‖ = ∆k,
        ∆k+1 = ∆k              otherwise.



The Levenberg-Marquardt algorithm

◮ The first trust-region algorithm was developed for least squares problems by Levenberg (1944) and Marquardt (1963).
◮ The original algorithm uses the approximation Bk = JkᵀJk and solves

      (Bk + λk I)p = −gk

  for different values of λk.
◮ The original algorithm adapts by modifying the λ value, i.e. if the reduction produced by p is good enough, λk+1 = λk/10; otherwise λk+1 = 10λk and the step is rejected.
◮ The Levenberg-Marquardt algorithm was put into the trust-region framework (∆-parameterized) in the early 1980s (Moré, 1981).
◮ The ∆ version of Levenberg-Marquardt has a number of advantages over the λ version:
  ◮ λ is nontrivially related to the problem, whereas ∆ is related to the size of x. E.g. ∆0 = ‖x0‖ is often a reasonable choice.
  ◮ The transition to λ = 0 is handled transparently.
  ◮ The λ algorithm needs to re-solve (Bk + λk I)p = −gk when a step is rejected and λ is increased. The ∆ algorithm has ways to avoid that.
◮ However, many popular implementations of Levenberg-Marquardt still use the original, λ-parameterized, formulation.
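The original λ-parameterized heuristic is short enough to sketch in full. This is an illustrative version of my own (function names and the simple gradient-based stopping test are assumptions, not part of the historical algorithm), minimizing ½‖r(x)‖² with Bk = JkᵀJk:

```python
import numpy as np

def levenberg_marquardt(r, J, x0, lam0=1e-2, max_iter=100, tol=1e-10):
    """Sketch of the original lambda-parameterized Levenberg-Marquardt
    for minimizing 0.5*||r(x)||^2, using the 10x / (1/10)x lambda heuristic."""
    x, lam = np.asarray(x0, float), lam0
    for _ in range(max_iter):
        Jx, rx = J(x), r(x)
        g = Jx.T @ rx                    # gradient of 0.5*||r||^2
        if np.linalg.norm(g) < tol:
            break
        B = Jx.T @ Jx                    # Gauss-Newton Hessian approximation
        p = np.linalg.solve(B + lam * np.eye(len(x)), -g)
        if 0.5 * r(x + p) @ r(x + p) < 0.5 * rx @ rx:
            x, lam = x + p, lam / 10.0   # good step: accept, decrease lambda
        else:
            lam *= 10.0                  # poor step: reject, increase lambda
    return x
```

For a linear residual r(x) = Ax − b this reduces to damped normal equations and converges to the least squares solution.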


The Dogleg algorithm

◮ The trust-region subproblem

      min over p:  mk(p) = fk + pᵀgk + ½ pᵀBk p,
      s.t.  ‖p‖ ≤ ∆k

  is a hard problem.
◮ If the unconstrained solution pB = −Bk⁻¹gk is too long, ‖pB‖ > ∆k, we have to find a λ such that

      ‖pk(λ)‖ = ‖(Bk + λI)⁻¹gk‖ = ∆k.

◮ This is a non-linear equation in λ.
◮ The dogleg algorithm solves this problem by approximating the function pk(λ) with a piecewise linear polygon p̃(τ) and solving ‖p̃(τ)‖ = ∆k.
◮ The polygon p̃(τ) is defined as

      p̃(τ) = τ pU,                    0 ≤ τ ≤ 1,
      p̃(τ) = pU + (τ − 1)(pB − pU),   1 ≤ τ ≤ 2.

◮ The point pU is the Cauchy point, i.e. the minimizer of m along the steepest descent direction:

      pU = −(gᵀg / gᵀBg) g.

◮ The dogleg algorithm works only if Bk is positive definite, e.g. for least squares problems.

[Figure: the dogleg path bending at the Cauchy point pU toward the Newton point pB.]
