General Condition for Solving Optimization Problem

Outline

• Preliminaries
• Optimality Conditions
• Lagrange Multiplier
• General Condition for Solving Optimization Problem

Reading:
Chapter 4, J.S. Arora, pp. 112–178
Chapter 6, Mark French
Existence of Minima

Motivation:
• Before applying any optimization method to an optimization problem, it is important to check whether a solution exists at all. A continuous function on a closed and bounded (compact) feasible domain is guaranteed to attain its global minimum and maximum; on an unbounded domain, or for an unbounded function, a global optimum may not exist.

[Figure] Representation of optimum points: (a) unbounded domain and function (no global optimum); (b) bounded domain and function (global minimum and maximum exist).
Preliminaries: Calculus Concepts

1. Gradient Vector:

What is the gradient vector? Consider a function f(x) of n variables x1, x2, …, xn. The partial derivative of the function with respect to x1 at a given point x* is defined as ∂f(x*)/∂x1, with respect to x2 as ∂f(x*)/∂x2, and so on. Collecting all n partial derivatives into a column vector gives the gradient vector:

∇f(x*) = [∂f(x*)/∂x1, ∂f(x*)/∂x2, …, ∂f(x*)/∂xn]ᵀ
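As a quick illustration, here is a minimal sketch (assuming SymPy is available; the function f below is an assumed example, not one from the slides) that forms the gradient vector by collecting the partial derivatives:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = x1**2 + 3*x1*x2                       # assumed example function

    grad_f = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])  # column vector of partials
    print(grad_f.T)                           # [2*x1 + 3*x2, 3*x1]
    print(grad_f.subs({x1: 1, x2: 2}).T)      # evaluated at x* = (1, 2): [8, 3]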
Preliminaries: Calculus Concepts

What is its direction? Normal to the tangent plane at x*.

Why? Along any direction lying in the tangent plane of the level surface f(x) = constant, the function value does not change, so the directional derivative, ∇f(x*)ᵀt for a tangent direction t, is zero. Hence ∇f(x*) is orthogonal to every tangent direction, i.e., normal to the tangent plane; it points in the direction of the maximum rate of increase of f.
Preliminaries: Calculus Concepts

2. Hessian Matrix: Second-Order Partial Derivatives

Not to get overwhelmed: think of the Hessian simply as the name of an arrangement. Differentiating the gradient vector once again, we obtain the matrix of second partial derivatives of the function f(x), called the Hessian matrix or, simply, the Hessian:

H(x*) = [∂²f(x*)/∂xᵢ∂xⱼ], i, j = 1, …, n

It is important to note that each element of the Hessian is a function in itself that is evaluated at the given point x*. For a twice continuously differentiable f, the Hessian is symmetric (∂²f/∂xᵢ∂xⱼ = ∂²f/∂xⱼ∂xᵢ).
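Continuing the same assumed example, a short sketch of the Hessian; sp.hessian builds the whole arrangement of second partial derivatives at once:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = x1**2 + 3*x1*x2                       # same assumed example function

    H = sp.hessian(f, (x1, x2))               # matrix of second partial derivatives
    print(H)                                  # Matrix([[2, 3], [3, 0]]), symmetric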
Preliminaries: Calculus Concepts
What does a Taylor series exactly do?

3. Taylor Series: It expresses a function in terms of its derivatives at a single point. It makes life easy whenever we are ready to work with approximations.

Expansion for a single variable about x*:

f(x) = f(x*) + f′(x*)(x − x*) + (1/2) f″(x*)(x − x*)² + higher-order terms

Expansion for two variables (x1, x2) about (x1*, x2*), with d1 = x1 − x1* and d2 = x2 − x2*:

f(x1, x2) = f(x1*, x2*) + (∂f/∂x1) d1 + (∂f/∂x2) d2 + (1/2)[(∂²f/∂x1²) d1² + 2 (∂²f/∂x1∂x2) d1 d2 + (∂²f/∂x2²) d2²] + higher-order terms,

where all derivatives are evaluated at (x1*, x2*).
Preliminaries: Calculus Concepts

Why do we care about the matrix form? Are we generalizing it to more than one variable?

Yes. Writing d = x − x* for any number of variables, the expansion becomes compact in matrix form:

f(x) ≈ f(x*) + ∇f(x*)ᵀ d + (1/2) dᵀ H(x*) d

The gradient captures the first-order (slope) behavior and the Hessian the second-order (curvature) behavior of f near x*.
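A minimal numerical sketch of the matrix form (the function f here is again an assumed example): compare f(x* + d) with the second-order approximation f(x*) + ∇fᵀd + (1/2)dᵀHd.

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = sp.exp(x1) + x1*x2**2                 # assumed example function

    xs = {x1: 0, x2: 1}                       # expansion point x*
    g = sp.Matrix([sp.diff(f, v) for v in (x1, x2)]).subs(xs)
    H = sp.hessian(f, (x1, x2)).subs(xs)

    d = sp.Matrix([sp.Rational(1, 10), -sp.Rational(1, 20)])   # step d = x - x*
    taylor2 = f.subs(xs) + (g.T*d)[0] + (d.T*H*d)[0]/2
    exact = f.subs({x1: xs[x1] + d[0], x2: xs[x2] + d[1]})
    print(sp.N(exact), sp.N(taylor2))         # 1.1954... vs 1.195: nearly equal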
Concepts for Necessary & Sufficient Condition

Necessary Condition:
• The conditions that must be satisfied at the optimum point are called necessary. Stated
differently, if a point does not satisfy the necessary conditions, it cannot be an optimum.

• Points satisfying the necessary conditions are called candidate optimum points.

Sufficient Condition:
• If a candidate optimum point satisfies the sufficient condition, then it is indeed an optimum
point.

Having oxygen in the earth's atmosphere is a necessary condition for human life. Certainly, having oxygen will not guarantee human life; there are many other conditions needed for human life besides oxygen in the atmosphere.

Rain pouring from the sky is a sufficient condition for the ground to be wet. However, it is not necessary for rain to be pouring from the sky for the ground to be wet. The sprinkler could be on as well.
Optimality Condition

Why do we care about optimality conditions?

The optimality conditions for unconstrained or constrained problems can be used in two ways:
1. They can be used to check whether a given point is a local optimum for the problem.
2. They can be solved for local optimum points.

What is the optimality condition for a single-variable objective function?

• How to think about it? Use the neighborhood idea.

• Consider the nature of the function f(x) in the neighborhood of x*.

For a minimum around x*, we need f(x) ≥ f(x*) for all x close to x*. Writing d = x − x* and expanding by Taylor series:

f(x) − f(x*) = f′(x*) d + (1/2) f″(x*) d² + higher-order terms ≥ 0
Optimality Condition

• First-Order Necessary Condition: f′(x*) = 0

1. How? In the expansion above, d can be of either sign, and for small d the first-order term f′(x*) d dominates and changes sign with d; the inequality can therefore hold for all d only if f′(x*) = 0.

2. What is the meaning of the first-order derivative being zero?

Function values are stationary with respect to small changes in the variable about x*.

3. Is this condition also sufficient to find minima?

No. Function values may also be stationary at maxima or at inflection points (neither maxima nor minima).
Optimality Condition

Second-Order Sufficient Condition:

• We must determine which of the stationary points are actually minima. With f′(x*) = 0, the expansion leaves f(x) − f(x*) ≈ (1/2) f″(x*) d², so f″(x*) > 0 is sufficient for x* to be a local minimum, as in the sketch below.

Do we need to worry about higher-order terms? Only when f″(x*) = 0; then the lowest non-vanishing higher derivative must be examined.
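A small sketch (an assumed SymPy illustration, not from the slides) that finds the stationary points of a one-variable function and classifies them with the second derivative:

    import sympy as sp

    x = sp.symbols('x')
    f = x**3 - 3*x                            # assumed example: stationary at x = +1, -1

    f1, f2 = sp.diff(f, x), sp.diff(f, x, 2)
    for p in sp.solve(f1, x):                 # necessary condition f'(x*) = 0
        curv = f2.subs(x, p)
        if curv > 0:
            print(p, 'local minimum')         # x = 1: f'' = 6 > 0
        elif curv < 0:
            print(p, 'local maximum')         # x = -1: f'' = -6 < 0
        else:
            print(p, 'check higher-order terms')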


Optimality Condition
Optimality Condition for function with several Variables

Using the Taylor series for multiple variables, with d = x − x*:

f(x) − f(x*) = ∇f(x*)ᵀ d + (1/2) dᵀ H(x*) d + higher-order terms

Necessary condition: ∇f(x*) = 0 (all first partial derivatives vanish).
Sufficient condition: ∇f(x*) = 0 and the Hessian H(x*) is positive definite, i.e., dᵀ H(x*) d > 0 for all d ≠ 0; both are checked in the sketch below.
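A self-contained sketch (assumed example function) checking both conditions: the gradient vanishes at x* and the Hessian there is positive definite.

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = (x1 - 1)**2 + 2*(x2 + 2)**2           # assumed example function

    grad_f = [sp.diff(f, v) for v in (x1, x2)]
    stationary = sp.solve(grad_f, [x1, x2], dict=True)
    print(stationary)                         # [{x1: 1, x2: -2}]: necessary condition

    H = sp.hessian(f, (x1, x2)).subs(stationary[0])
    print(H.is_positive_definite)             # True: sufficient condition holds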


Optimality Condition for function with several Variables

Geometrically, we know that for a minimum, f(x) > f(x*) for all x ≠ x* near x*; with ∇f(x*) = 0, this means the quadratic term dᵀ H(x*) d must be positive, which is exactly the positive-definiteness requirement on the Hessian.


Optimality Condition for function with several Variables

What is the significance of the Hessian matrix? What does it do?

• Hypothesis: Because it appears in the Taylor expansion, d can be viewed as (x − x*). For two variables, d is a vector, and the Hessian acts on that vector. Can we think of it as a transformation matrix?

1. Let's understand the action of a transformation matrix on a vector.

Consider a matrix A acting on the vectors (1, 1), (−1, 1), (−1, 0). (From the transformed vectors given on the next slide, A is evidently the symmetric matrix [[2, 1], [1, 2]].)
Optimality Condition for function with several Variables

Consider the matrix A acting on the vectors v = (1, 1), (−1, 1), (−1, 0).

• What would the transformed vectors be?

(3, 3), (−1, 1), (−2, −1)

• What do they look like?

• It seems that for certain orientations of these vectors, the transformation matrix A does not change the vector's direction: (1, 1) is stretched to 3 (1, 1), and (−1, 1) is left exactly as it is. That means that, for some scalar value λ, the resulting vector can be written as λv:

Av = λv

(A − λI)v = 0
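A quick check (the matrix [[2, 1], [1, 2]] is the one inferred above from the transformed vectors) that reproduces the transformations and extracts the eigenpairs:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                # matrix inferred from the slide's outputs

    for v in ([1, 1], [-1, 1], [-1, 0]):
        print(v, '->', A @ np.array(v))       # (3,3), (-1,1), (-2,-1)

    vals, vecs = np.linalg.eig(A)
    print(vals)                               # eigenvalues: 3 and 1
    print(vecs)                               # eigenvector directions (1,1) and (-1,1)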
Optimality Condition for function with several Variables

• (A − λI)v = 0

• For a nonzero v, the determinant must vanish: setting det(A − λI) = 0 and solving gives λ.

• What does that mean?

• Given a matrix A, you can find the directions in which a vector does not change direction under A.
• Also, if λ is positive, the vector remains in the same direction.
• If λ is negative, the vector points in the opposite direction.

• In the context of the Hessian matrix H, d is the vector…

• If Hd = λd, then dᵀHd = dᵀλd = λ dᵀd

• dᵀHd = λ dᵀd

• What is dᵀd? It is the squared length ‖d‖², which is positive for any d ≠ 0.

• Thus, the sign of the curvature dᵀHd depends on λ; a small classification sketch follows.
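A small sketch classifying the curvature at a stationary point from the signs of the Hessian eigenvalues (the matrices are assumed examples):

    import numpy as np

    def classify(H):
        # The signs of the eigenvalues of a symmetric Hessian decide the curvature.
        lam = np.linalg.eigvalsh(H)
        if (lam > 0).all():
            return 'positive definite: local minimum'
        if (lam < 0).all():
            return 'negative definite: local maximum'
        if (lam > 0).any() and (lam < 0).any():
            return 'indefinite: saddle point'
        return 'semidefinite: inconclusive, check higher-order terms'

    print(classify(np.array([[2, 1], [1, 2]])))   # eigenvalues 3, 1 -> minimum
    print(classify(np.array([[1, 2], [2, 1]])))   # eigenvalues 3, -1 -> saddle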


Eigenvectors and Eigenvalues:
• Eigenvectors and eigenvalues feature prominently in the analysis of linear transformations.
• The word "eigen" comes from a German word meaning "proper" or "characteristic".
• Originally, it was used in the analysis of the principal axes of the rotational motion of rigid bodies.
• Applying a transformation T to an eigenvector scales the eigenvector by a scalar value called the eigenvalue:

T(v) = λv

[Figure] The blue vector is an eigenvector, as it does not change its direction under the shear transformation. As it also retains its size, its eigenvalue is unity.
Optimality Condition for function with several Variables
With Constraints

What would the necessary and sufficient conditions be for a general optimization problem?
Optimality Condition for function with several Variables
With Constraints

What would the necessary and sufficient conditions be for an optimization problem with equality constraints?
Optimality Condition for function with several Variables
With Constraints

Convert the problem into an unconstrained optimization by using the constraint to substitute for one of the variables. We already know the necessary and sufficient conditions for the resulting unconstrained objective function.
Optimality Condition for function with several Variables
With Constraints

However, in general it is not possible to solve a constraint explicitly and cleanly, as we can for x1 + x2 − 2 = 0.

So, one way forward: assume the constraint implicitly defines x2 = φ(x1).

Now, let us substitute this into the objective function.

[Figure: contours of the objective centered at (1.5, 1.5) and the constraint line x1 + x2 = 2.]
Optimality Condition for function with several Variables
With Constraints

Necessary condition: after the substitution x2 = 2 − x1, set df/dx1 = 0 and solve for the candidate point.

Sufficient condition: check d²f/dx1² > 0 at that point.

Both conditions are satisfied, so a minimum exists; a worked sketch follows.

[Figure: contours of the objective centered at (1.5, 1.5), the constraint line x1 + x2 = 2, and the constrained minimum on that line.]
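A worked sketch of the substitution approach. Only the constraint x1 + x2 − 2 = 0 is given on the slide; the objective f = (x1 − 1.5)² + (x2 − 1.5)² is an assumption consistent with the contour center (1.5, 1.5) in the figure.

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = (x1 - sp.Rational(3, 2))**2 + (x2 - sp.Rational(3, 2))**2  # assumed objective
    g = f.subs(x2, 2 - x1)                    # substitute x2 = 2 - x1 from the constraint

    print(sp.solve(sp.diff(g, x1), x1))       # necessary condition: x1 = 1, so x2 = 1
    print(sp.diff(g, x1, 2))                  # sufficient condition: 4 > 0, a minimum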
Optimality Condition for function with several Variables
With Constraints

Recall: in general it is not possible to solve the constraint explicitly, as we could for x1 + x2 − 2 = 0.

So, one way forward: assume the constraint implicitly defines x2 = φ(x1).

Now, let us substitute this into the objective function, which becomes f(x1, φ(x1)).

Necessary condition (chain rule):

df/dx1 = ∂f/∂x1 + (∂f/∂x2)(dφ/dx1) = 0

How to eliminate φ? Let us use the equality constraint h(x1, x2) = 0 with x2 = φ(x1). Differentiating both sides with respect to x1:

∂h/∂x1 + (∂h/∂x2)(dφ/dx1) = 0, so dφ/dx1 = −(∂h/∂x1)/(∂h/∂x2)

Substituting this into the necessary condition above:

∂f/∂x1 − (∂f/∂x2)(∂h/∂x1)/(∂h/∂x2) = 0

Let us define a quantity v, the Lagrange multiplier:

v = −(∂f/∂x2)/(∂h/∂x2)

Substituting v, the highlighted equation and the definition of v become the necessary conditions:

∂f/∂x1 + v ∂h/∂x1 = 0
∂f/∂x2 + v ∂h/∂x2 = 0
h(x1, x2) = 0

These are precisely the stationarity conditions of the Lagrange function L(x, v) = f(x) + v h(x); a symbolic sketch follows.
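A sketch solving these three necessary conditions symbolically (the objective is the same assumed one as before; the constraint is from the slides):

    import sympy as sp

    x1, x2, v = sp.symbols('x1 x2 v')
    f = (x1 - sp.Rational(3, 2))**2 + (x2 - sp.Rational(3, 2))**2  # assumed objective
    h = x1 + x2 - 2                                                # equality constraint

    L = f + v*h                               # Lagrange function
    eqs = [sp.diff(L, x1), sp.diff(L, x2), h] # the three necessary conditions
    print(sp.solve(eqs, [x1, x2, v]))         # {x1: 1, x2: 1, v: 1}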
General Optimality Condition for
function with several Variables
With Inequality Constraints
• In the previous slides, we dealt with the necessary and sufficient conditions for equality constraints.
• If the optimization problem has inequality constraints, how do we find the necessary and sufficient conditions?
• One way is to convert the inequality constraints into equality constraints by introducing additional slack variables.

Some inequalities may or may not be active.
• Introduce the slack variable s into the inequality g(x) ≤ 0 as s², converting it into the equality constraint g(x) + s² = 0 (using s² avoids a sign restriction on s).
• The Lagrange function would be L = f(x) + u (g(x) + s²), with multiplier u.
• Write the necessary conditions of the Lagrange theorem by treating x1, x2, u, and s as unknowns:

We have 4 equations and 4 unknowns. As the equations are nonlinear, multiple solutions might exist.

Observe: substituting s = 0 makes the inequality active; solving the remaining equations gives the stationary point of L. So it is a candidate minimum.

What happens if u = 0? It violates the constraints.

Note that by keeping u ≥ 0, the gradient of the cost function and the gradient of the constraint function point in opposite directions. This way f cannot be reduced any further by stepping in the negative gradient direction without violating the constraint, as shown in the figure.
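A sketch of the slack-variable formulation (the objective and the inequality below are assumed for illustration; the slide's own example equations are in the figure):

    import sympy as sp

    x1, x2, u, s = sp.symbols('x1 x2 u s', real=True)
    f = (x1 - 2)**2 + (x2 - 2)**2             # assumed objective
    g = x1 + x2 - 2                           # assumed inequality constraint g <= 0

    L = f + u*(g + s**2)                      # slack variable enters as s**2
    eqs = [sp.diff(L, z) for z in (x1, x2, u, s)]     # 4 equations, 4 unknowns
    for sol in sp.solve(eqs, [x1, x2, u, s], dict=True):
        print(sol)    # real solution: x1 = x2 = 1, s = 0 (active), u = 2 >= 0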
Inequality Constraints:

Lagrangian Functions:

KKT Conditions:

For the candidate points of the worked example (the supporting equations appear in the figure):
• At one point, all the necessary KKT conditions are met, but the sufficient condition for a local minimum is not; it is a local maximum.
• At another point, the sufficient condition for a local minimum is met.
• At x = d, a KKT condition is violated, so x = d is not a candidate minimum.

Observe these outcomes from the graph.
Summary of KKT conditions:
Replacing the slack variable in the previous formulation, we obtain the alternate (standard) form of the KKT conditions for minimizing f(x) subject to hᵢ(x) = 0 and gⱼ(x) ≤ 0:

1. Stationarity: ∂f/∂xₖ + Σᵢ vᵢ ∂hᵢ/∂xₖ + Σⱼ uⱼ ∂gⱼ/∂xₖ = 0 for each k
2. Feasibility: hᵢ(x*) = 0 and gⱼ(x*) ≤ 0
3. Switching (complementary slackness): uⱼ gⱼ(x*) = 0, which replaces uⱼ sⱼ = 0 together with gⱼ + sⱼ² = 0
4. Non-negativity of the inequality multipliers: uⱼ ≥ 0
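A final sketch (same assumed example as above) verifying the alternate-form KKT conditions at the candidate points:

    import sympy as sp

    x1, x2, u = sp.symbols('x1 x2 u', real=True)
    f = (x1 - 2)**2 + (x2 - 2)**2             # assumed objective
    g = x1 + x2 - 2                           # assumed inequality constraint g <= 0

    L = f + u*g
    # Stationarity plus the switching condition u*g = 0:
    eqs = [sp.diff(L, x1), sp.diff(L, x2), u*g]
    for sol in sp.solve(eqs, [x1, x2, u], dict=True):
        ok = (g.subs(sol) <= 0) and (sol[u] >= 0)     # feasibility and u >= 0
        print(sol, 'KKT point' if ok else 'rejected')
        # {x1: 2, x2: 2, u: 0} is infeasible (g = 2 > 0); {x1: 1, x2: 1, u: 2} passes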
