
E0 230 Computational Methods of Optimization Assignment - 3

Sai Phanindra Korlepara — 24142 — M.Tech CSA Dept.

Problem 1. System of Linear Equations


   
A = [  2  −4   2  −14 ]        [ 10 ]
    [ −1   2  −2   11 ] ,  b = [ −6 ]
    [ −1   2  −1    7 ]        [ −5 ]

1. Prove that the above system has infinite solutions.

2. Reformulation as least norm convex problem.

3. Solution using KKT Conditions.

4. Deriving a projection operator for the constraint set.

5. Solve using projected gradient descent.

Solution.

1. To prove that the given system of linear equations Ax = b has infinitely many solutions,
we analyze the matrix A and use its rank. To compute the rank of A, we perform
row reduction:
R2 ← R2 + (1/2) R1 ,    R3 ← R3 + (1/2) R1

[ 2  −4   2  −14 ]
[ 0   0  −1    4 ]
[ 0   0   0    0 ]
The rank of A is the number of non-zero rows, so Rank(A) = 2. There are 4 columns but
the rank of A is only 2, which means the system has 2 free variables. Since the system is
also consistent (the augmented matrix below has no row of the form [0 0 0 0 | c] with
c ≠ 0), it has infinitely many solutions.
To further verify this claim, we can find the general solution. Apply the same row operations
to the augmented matrix [A | b], then reduce further with R1 ← R1 /2 and R2 ← −R2 :
 
[ 1  −2  1  −7  |  5 ]
[ 0   0  1  −4  |  1 ]
[ 0   0  0   0  |  0 ]

From the row-reduced matrix, we can simplify and rewrite the system as

x1 = 2x2 + 3x4 + 4, x3 = 4x4 + 1.

Thus, the general solution is
   
x = (x1, x2, x3, x4)ᵀ = (2x2 + 3x4 + 4,  x2,  4x4 + 1,  x4)ᵀ .

Rewriting in terms of free variables x2 and x4


     
x = (4, 0, 1, 0)ᵀ + x2 (2, 1, 0, 0)ᵀ + x4 (3, 0, 4, 1)ᵀ .

Since x2 and x4 are free, the system has infinitely many solutions.
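As a quick numerical sanity check (a minimal sketch assuming NumPy; the entries are exactly those of A and b above), the rank of A and of the augmented matrix [A | b] can be compared:

import numpy as np

A = np.array([[ 2, -4,  2, -14],
              [-1,  2, -2,  11],
              [-1,  2, -1,   7]], dtype=float)
b = np.array([10., -6., -5.])

# rank(A) = rank([A | b]) = 2 < 4 columns: the system is consistent
# and has 4 - 2 = 2 free variables, hence infinitely many solutions.
print(np.linalg.matrix_rank(A))                        # 2
print(np.linalg.matrix_rank(np.column_stack([A, b])))  # 2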

2. If the objective is to find the least-norm solution (ConvProb), we can restate the problem
as follows:

min_{x ∈ R⁴}  (1/2)||x||²

subject to the constraint Ax = b, where A is the 3 × 4 matrix and b is the 3 × 1 vector,
both as defined in the problem statement.
We can prove that this constraint set is convex as follows. A set C is convex if, for any
two points x1, x2 ∈ C and any α ∈ [0, 1], we have αx1 + (1 − α)x2 ∈ C.
Let x1, x2 be two points in the constraint set Ax = b. Then by definition Ax1 = b and
Ax2 = b. For the point xα = αx1 + (1 − α)x2 ,

Axα = A(αx1 + (1 − α)x2 ) = αAx1 + (1 − α)Ax2 = αb + (1 − α)b = b.


Hence the point xα ∈ C, which proves that the set defined above is convex.
Similarly, the objective function can be restated as

min_{x ∈ R⁴}  (1/2) xᵀx = (1/2) xᵀIx.

The objective function is quadratic and its Hessian is the identity matrix, which is positive
definite. Since ∇²f (x) is positive definite (λmin (I) = 1 > 0), f (x) is strongly convex with
strong convexity parameter 1.
Hence we conclude that the objective function of ConvProb is strongly convex and its
constraint set is convex.
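As a small numerical illustration of the convexity of the constraint set (a sketch assuming NumPy; the two particular solutions are taken from the general solution in part 1, with (x2, x4) = (0, 0) and (1, 1)):

import numpy as np

A = np.array([[ 2, -4,  2, -14],
              [-1,  2, -2,  11],
              [-1,  2, -1,   7]], dtype=float)
b = np.array([10., -6., -5.])

x_a = np.array([4., 0., 1., 0.])   # general solution with x2 = x4 = 0
x_b = np.array([9., 1., 5., 1.])   # general solution with x2 = x4 = 1
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    x_alpha = alpha * x_a + (1 - alpha) * x_b
    assert np.allclose(A @ x_alpha, b)   # every convex combination stays feasible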

3. Now let us solve this convex optimization problem using the KKT conditions. To begin
with, the Lagrangian for this problem is

L(x, λ) = (1/2)||x||² − λᵀ(Ax − b).
The KKT conditions are as follows,

(a) ∇x L(x, λ) = 0.
(b) Ax = b

To solve the KKT system, let us start with the first condition. The gradient of (1/2)||x||² = (1/2)xᵀx
is x, and the gradient of λᵀ(Ax − b) with respect to x is Aᵀλ. Thus

∇x L(x, λ) = x − Aᵀλ.

Setting ∇x L(x, λ) = 0, we get

x = Aᵀλ.

Now, substitute x = Aᵀλ into the second KKT condition Ax = b:

(AAᵀ)λ = b.

If AAᵀ (a 3 × 3 matrix) were invertible, we could take λ = (AAᵀ)⁻¹b. Here rank(A) = 2,
so AAᵀ is singular; however, since b lies in the range of A (the system is consistent),
(AAᵀ)λ = b still has a solution, and we can take λ = (AAᵀ)⁺b using the Moore-Penrose
pseudoinverse. Substituting λ back into x = Aᵀλ gives

x = Aᵀ(AAᵀ)⁺b.

This gives the least-norm solution to the problem, which concludes the solution using
the KKT conditions. A small piece of code was written and the value of x∗ was found
to be,
x∗ = [0.5957, −1.1915, −0.3617, −0.3404]ᵀ .
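A minimal sketch of that computation (assuming NumPy; the pseudoinverse is used in place of the inverse since AAᵀ has rank 2):

import numpy as np

A = np.array([[ 2, -4,  2, -14],
              [-1,  2, -2,  11],
              [-1,  2, -1,   7]], dtype=float)
b = np.array([10., -6., -5.])

# Least-norm solution x* = A^T (A A^T)^+ b
lam = np.linalg.pinv(A @ A.T) @ b
x_star = A.T @ lam
print(np.round(x_star, 4))         # [ 0.5957 -1.1915 -0.3617 -0.3404]
print(np.allclose(A @ x_star, b))  # True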


4. Now we want to find the projection of an arbitrary point z onto the constraint set of
ConvProb, Ax = b. This can be derived by solving the following constrained optimization
problem:

min_{x ∈ R⁴}  (1/2)||x − z||²

subject to the constraint Ax = b. This problem can be solved using the KKT conditions as
before, so let us state the Lagrangian and the KKT conditions for this problem:

L(x, λ) = (1/2)||x − z||² − λᵀ(Ax − b)
(a) ∇x L(x, λ) = 0.
(b) Ax = b

To solve we start with the gradient with respect to x

∇x L(x, λ) = x − z − Aᵀλ.

Setting ∇x L(x, λ) = 0, we get

x = z + Aᵀλ.

Substitute x = z + Aᵀλ into the second KKT condition Ax = b:

A(z + Aᵀλ) = b,
Az + AAᵀλ = b.
Solving for λ (again via the pseudoinverse, since AAᵀ is singular) we get
λ = (AAᵀ)⁺(b − Az).

Substitute λ back into x = z + Aᵀλ to get the projection

x = z + Aᵀ(AAᵀ)⁺(b − Az).

Thus, we have derived the projection of any point z onto the constraint set Ax = b.
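A minimal sketch of this projection operator (a hypothetical helper, make_projector; the pseudoinverse of AAᵀ is computed once since A is fixed):

import numpy as np

def make_projector(A, b):
    """Return a function projecting any point z onto the affine set {x : Ax = b}."""
    AAt_pinv = np.linalg.pinv(A @ A.T)   # pseudoinverse, since AA^T is singular here
    def project(z):
        return z + A.T @ (AAt_pinv @ (b - A @ z))
    return project

Calling project(z) then returns the point of the constraint set closest to z.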

5. A function was written to implement projected gradient descent to solve ConvProb.


As the step size α increases, the number of iterations required decreases. The convergence
criterion used is ||x_t − x∗|| < 1e−7.

Figure 1: Projected Gradient Descent with x0 = [5, 5, 5, 5]
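A minimal sketch of that routine (assuming NumPy; parameter names are illustrative, and the change between successive iterates is used as the stopping rule here, whereas the run above stops on ||x_t − x∗|| < 1e−7):

import numpy as np

def projected_gradient_descent(A, b, x0, alpha=0.1, tol=1e-7, max_iter=100000):
    AAt_pinv = np.linalg.pinv(A @ A.T)
    def project(z):                        # projection onto {x : Ax = b} from part 4
        return z + A.T @ (AAt_pinv @ (b - A @ z))
    x = np.asarray(x0, dtype=float)
    for t in range(1, max_iter + 1):
        x_new = project(x - alpha * x)     # gradient of (1/2)||x||^2 is x
        if np.linalg.norm(x_new - x) < tol:
            return x_new, t
        x = x_new
    return x, max_iter

# e.g.  x, iters = projected_gradient_descent(A, b, x0=[5, 5, 5, 5], alpha=0.1)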

Problem 2. Support Vector Machines
1. Value of the Primal objective Function.
2. Dual conversion of the primal.
3. Show that Σ_{i : yi = 1} λi = Σ_{i : yi = −1} λi = γ.

4. Value of the Dual objective Function.


5. Active set of the primal problem.
6. plots...
Solution.
1. Using CVXPY, the optimization problem was set up as follows:

minimize  (1/2) ||w||₂²

subject to the constraints

labeli (wᵀ datai + b) ≥ 1,   i = 1, 2, ..., N.

The solution was found to be

(1/2) ||w||₂² = 2.66,   w = [1.55, −2],   b = 1.
2
Note: when running the script, make sure that both .csv files are in the same directory as
the script.
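A minimal sketch of that setup (assuming CVXPY and NumPy; the file names data.csv and labels.csv are placeholders for the provided .csv files):

import cvxpy as cp
import numpy as np

X = np.loadtxt("data.csv", delimiter=",")     # one data point per row (placeholder name)
y = np.loadtxt("labels.csv", delimiter=",")   # labels in {+1, -1}    (placeholder name)

w = cp.Variable(X.shape[1])
b = cp.Variable()
constraints = [cp.multiply(y, X @ w + b) >= 1]   # label_i (w^T data_i + b) >= 1
problem = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints)
problem.solve()
print(problem.value, w.value, b.value)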
2. The dual of the primal problem is derived as follows:
Starting from the primal optimization problem:
minimize  (1/2) ∥w∥₂²
subject to  yi (w⊤ xi + b) ≥ 1   ∀ i = 1, 2, . . . , N
Here, yi is the i-th label and xi is a column vector of the coordinates of each point.
Form the Lagrangian with multipliers λi ≥ 0:
L(w, b, λ) = (1/2) ∥w∥₂² − Σ_{i=1}^{N} λi [ yi (w⊤ xi + b) − 1 ]

Taking derivatives and setting them to zero:


∂L/∂w = w − Σ_{i=1}^{N} λi yi xi = 0   ⇒   w = Σ_{i=1}^{N} λi yi xi

∂L/∂b = − Σ_{i=1}^{N} λi yi = 0   ⇒   Σ_{i=1}^{N} λi yi = 0
Substituting w back into the Lagrangian:
L = (1/2) ( Σ_{i=1}^{N} λi yi xi )⊤ ( Σ_{j=1}^{N} λj yj xj ) − Σ_{i=1}^{N} λi yi [ ( Σ_{j=1}^{N} λj yj xj )⊤ xi + b ] + Σ_{i=1}^{N} λi

Simplifying the expression:


L = (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λi λj yi yj xi⊤xj − Σ_{i=1}^{N} Σ_{j=1}^{N} λi λj yi yj xi⊤xj − b Σ_{i=1}^{N} λi yi + Σ_{i=1}^{N} λi
Since Σ_{i=1}^{N} λi yi = 0, the term involving b can be removed:

L = −(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λi λj yi yj xi⊤xj + Σ_{i=1}^{N} λi

Thus, the dual objective function is:


g(Λ) = Σ_{i=1}^{N} λi − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λi λj yi yj xi⊤xj

Writing this in matrix form:

g(Λ) = Λ⊤1 − (1/2) Λ⊤AΛ

where 1 is a column vector of ones and Aij = yi yj xi⊤xj .

Therefore, the dual problem is:


maximize  g(Λ) = Λ⊤1 − (1/2) Λ⊤AΛ
subject to  Λ ≥ 0,   y⊤Λ = 0

3. From the constraints of the dual we have y⊤Λ = 0, i.e.

Σ_{i=1}^{N} λi yi = Σ_{i : yi = 1} λi − Σ_{i : yi = −1} λi = 0,

since yi = 1 in the first group and yi = −1 in the second. Hence the two sums are equal,

Σ_{i : yi = 1} λi = Σ_{i : yi = −1} λi ,

and denoting this common value by γ we have shown that

Σ_{i : yi = 1} λi = Σ_{i : yi = −1} λi = γ.

4. The following objective and constraints, equivalent to the result from part 2 (the
maximization rewritten as a minimization), were implemented:

minimize  −Λ⊤1 + (1/2) Λ⊤AΛ
subject to  Λ ≥ 0,   y⊤Λ = 0
The optimal value of the dual problem came out to 2.66, which is the same as the primal;
this shows that the dual formulation and its implementation are consistent with the primal.
The λ values are non-zero only for points 0, 1, 6 and 9. We will see in the next part that
these match the constraints that are active in the primal.
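A minimal sketch of that implementation (same assumptions about the arrays X and y as in the primal sketch; the quadratic term Λ⊤AΛ is written as the squared norm of Σᵢ λi yi xi so that it is accepted by CVXPY's DCP rules):

import cvxpy as cp
import numpy as np

def solve_dual(X, y):
    N = X.shape[0]
    lam = cp.Variable(N)
    # (1/2) Lambda^T A Lambda with A_ij = y_i y_j x_i^T x_j, written as a sum of squares
    quad = 0.5 * cp.sum_squares(X.T @ cp.multiply(y, lam))
    problem = cp.Problem(cp.Minimize(-cp.sum(lam) + quad),
                         [lam >= 0, y @ lam == 0])
    problem.solve()
    return lam.value, -problem.value   # dual variables and the dual objective value g(Lambda)

# The entries of lam should be non-zero only for the support vectors (points 0, 1, 6, 9),
# and lam[y == 1].sum() should equal lam[y == -1].sum() (the common value gamma from part 3).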
5. We can check which constraints of the primal are active by finding the indices i where
labeli (wᵀ datai + b) = 1. A function was written to check all the indices, and the points at
which the constraints are active are 0, 1, 6 and 9.
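A minimal sketch of that check (reusing X, y and the solved w, b from the primal sketch in part 1; a small tolerance absorbs floating-point error):

import numpy as np

margins = y * (X @ w.value + b.value)                      # label_i (w^T data_i + b)
active = np.where(np.isclose(margins, 1.0, atol=1e-6))[0]
print(active)                                              # expected: [0 1 6 9]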
6. We can see from the scatter plot (Figure 2) that the line wᵀx + b = 0 separates the y = 1
and y = −1 points optimally, and the active constraints lie on the margin of the separating
line.

Figure 2: Scatter plot for SVM Hyperplane fit
