Lec3 Convex Function Exercise

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

Optimization Methods - Convex Functions

Bolei Zhang
December 7, 2022

1 Problems
Problem 1: A function f (·) is convex ⇐⇒ ∀x, v ∈ domf, g(t) = f (x + tv) is convex.

Solution: “ =⇒ ”: For any t1 , t2 ∈ domg, x, v ∈ domf , there is

g(θt1 + (1 − θ)t2 ) = f (x + θt1 v + (1 − θ)t2 v)


= f (θ(x + t1 v) + (1 − θ)(x + t2 v))
(1)
≤ θf (x + t1 v) + (1 − θ)f (x + t2 v))
= θg(t1 ) + (1 − θ)g(t2 )
“⇐=”: For any x1 , x2 ∈ domf , there is x1 + tx2 ∈ domg. We have

f (θx1 + (1 − θ)x2 ) = f (x1 + (1 − θ)x2 − (1 − θ)x1 )


= f (x1 + (1 − θ)(x2 − x1 ))
= g(0 + (1 − θ) · 1) (2)
≤ θg(0) + (1 − θ)g(1)
= θf (x1 ) + (1 − θ)f (x2 )

Theorem 1: First order condition: For a differentiable function f , if dom f is a convex set, then f is
a convex function ⇐⇒ f (y) ≥ f (x) + ∇f (x)T (y − x), ∀x, y ∈ domf .

Proof: First consider the case n = 1: We show that a differentiable function f : R → R is convex if
and only if
f (y) ≥ f (x) + f ′ (x)(y − x) (3)
for all x and y in domf .
Assume first that f is convex and x, y ∈ domf . Since domf is convex, we conclude that for all
0 < t ≤ 1, x + t(y − x) ∈ domf , and by convexity of f ,

f (x + t(y − x)) ≤ (1 − t)f (x) + tf (y). (4)

If we divide both sides by t, we obtain

f (x + t(y − x)) − f (x)


f (y) ≥ f (x) + , (5)
t
and taking the limit as t → 0 yields (3).
To show sufficiency, assume the function satisfies (3) for all x and y in domf (which is an interval).
Choose any x ̸== y, and 0 ≤ θ ≤ 1, and let z = θx + (1 − θ)y. Applying (3) twice yields

f (x) ≥ f (z) + f ′ (z)(x − z), f (y) ≥ f (z) + f ′ (z)(y − z). (6)

Multiplying the first inequality by θ, the second by 1 − θ, and adding them yields

θf (x) + (1 − θ)f (y) ≥ f (z), (7)

which proves that f is convex.

1
Now we can prove the general case, with f : Rn → R. Let x, y ∈ Rn and consider f restricted
to the line passing through them, i.e., the function defined by g(t) = f (ty + (1 − t)x), so g ′ (t) =
∇f (ty + (1 − t)x)T (y − x).
First assume f is convex, which implies g is convex, so by the argument above we have g(1) ≥
g(0) + g ′ (0), which means
f (y) ≥ f (x) + ∇f (x)T (y − x). (8)
Now assume that this inequality holds for any x and y, so if ty +(1−t)x ∈ domf , and t̃y +(1− t̃)x ∈
domf , we have

f (ty + (1 − t)x) ≥ f (t̃y + (1 − t̃)x) + ∇f (t̃y + (1 − t̃)x)T (y − x)(t − t̃), (9)

i.e., g(t) ≥ g(t) + g ′ (t̃)(t − t̃). We have seen that this implies that g is convex.

Definition 1: Norm functions Given a vector space X, a norm on X is a real-valued function


p : X → R with the following properties, where |s| denotes the usual absolute value of a scalar s:

• Subadditivity/Triangle inequality: p(x + y) ≤ p(x) + p(y) for all x, y ∈ X;


• Absolute homogeneity: p(sx) = |s| p(x) for all x ∈ X and all scalars s.
• Positive definiteness/Point-separating: for all x ∈ X, if p(x) = 0, then x = 0.

Theorem 2: Norm functions are convex.

Proof: If f : Rn → R is an norm, and 0 ≤ θ ≤ 1,then f (θx + (1 − θ)y) ≤ f (θx) + f ((1 − θ)y) =


θf (x) + (1 − θ)f (y). The inequality follows from the triangle inequality, and the equality follows from
homogeneity of a norm.

Problem 2: The maximum function f (x) = max{x1 , x2 , ..., xn }, x ∈ Rm is convex.

Solution: The function f (x) = maxi xi satisfies, for 0 ≤ θ ≤ 1,

f (θx + (1 − θ)y) = max(θxi + (1 − θ)yi )


i
≤ θ max xi + (1 − θ) max yi (10)
i i
= θf (x) + (1 − θ)f (y)

Problem 3: Determine whether f (x) = x−2 , x ̸= 0 is convex function.

Solution: The domain of this function is not a convex set.

Problem 4: L0 norm: ||x||0 is the number of non-zero elements in x. Explain that whether L0 norm
is a norm function, whether it is a convex function.

Solution: Let x = (0, 1). There is ||x||0 = 1 and ||2x||0 = 1. However, 2||x||0 = 2, which violates the
homogeneity. Therefore, L0 norm is not a norm function.
Let x = (0, 1), y = (1, 0). Let θ = 0.5,

f (θx + (1 − θ)y) = ||(0.5, 0.5)||0 = 2, (11)

θf (x) + (1 − θ)f (y) = 0.5 + 0.5 = 1, (12)


Therefore, L0 norm is not convex.
x2
Problem 5: f (x, y) = y , y > 0 is a convex function.

Solution: The Hessian matrix of f (x, y) is (for y > 0)


" 2 #
− 2x
  T
2 y y2 2 y y
∇ f= 2 = ⪰ 0. (13)
− 2x
y2
2x
y3 y 3 −x −x

2
Problem 6: Compute the conjugate of the functions:
(1) f (x) = ax + b, domf = R.
(2) f (x) = − log x, domf = R++

Solution: (1). As a function of x, yx − ax − b is bounded if and only if y = a, in which case it is


constant. Therefore the domain of the conjugate function f ∗ is the singleton {a}, and f ∗ (a) = −b.
(2) The function xy + log x is unbounded above if y ≥ 0 and reaches its maximum at x = −1/y
otherwise. Therefore, domf ∗ = {y|y < 0} = −R++ and f ∗ (y) = − log(−y) − 1 for y < 0.

Problem 7: Determine whether the following functions are convex


• f (x) = ex − 1, x ∈ R
• f (x1 , x2 ) = x1 x2 , x1 , x2 ∈ R
Pk
• h(z) = log( i=1 exi ), xi ∈ R

Solution: (1) The function is convex as ∇2 f = ex >0.


0 1
(2) The function is not convex as ∇2 f = is not positive semi-definite.
1 0
(3) The Hessian of the log-sum-exp function is
1
∇2 f (x) = T
((1T z)diag(z) − zz T ),
(1 z)2
where z = (ex1 , ..., exn ). To verify that ∇2 f (x) ⪰ 0 we must show that for all v, v T ∇2 f (x)v ≥ 0, i.e.,
n n n
1 X X X
v T ∇2 f (x)v = (( zi )( v 2
i zi ) − ( vi zi )2 ) ≥ 0
(1T z)2 i=1 i=1 i=1

. But this follows from the Cauchy-Schwarz inequality (aT a)(bT b) ≥ (aT b)2 applied to the vectors
√ √
with components ai = vi zi , bi = zi .
Pn
Problem 8: The KL-divergence is convex DKL (u, v) = i=1 (ui log uvii − ui + vi )
2
Solution: First, we show that g(x, t) = −t log(x/t) is convex on R++ : Let domf = C. The domain
2
domg = R++ is a convex set.
The function g(·) is convex because the function is defined as a scaled and shifted version of convex
function f (x).
n
Therefore, the relative entropy of two vectors u, v ∈ R++ , defined as
n
X
ui log(ui /vi )
i=1

is convex in (u, v), since it is a sum of relative entropies of ui , vi .


The KL-divergence is convex as it is the relative entropy plus a linear function of (u, v).

Problem 9: Convex-concave functions and saddle-points. We say the function f : Rn × Rm → R is


convex-concave if f (x, z) is a concave function of z, for each fixed x, and a convex function of x, for
each fixed z. We also require its domain to have the product form domf = A × B, where A ⊆ Rn and
B ⊆ Rm are convex.
(1) Give a second-order condition for a twice differentiable function f : Rn × Rm → R to be
convex-concave, in terms of its Hessian ∇2 f (x, z).
(2) Suppose that f : Rn × Rm → R is convex-concave and differentiable, with ∇f (x̃, z̃) = 0. Show
that the saddle-point property holds: for all x, z, we have

f (x̃, z) ≤ f (x̃, z̃) ≤ f (x, z̃)

. Show that this implies that f satisfies the strong max-min property:

sup inf f (x, z) = inf sup f (x, z)


z x x z

3
(and their common value is f (x̃, z̃).
(3) Now suppose that f : Rn × Rm → R is differentiable, but not necessarily convex-concave, and
the saddle-point property holds at x̃, z̃:

f (x̃, z) ≤ f (x̃, z̃) ≤ f (x, z̃)

for all x, z. Show that ∇f (x̃, z̃) = 0

Solution: (1) The Hessian matrix of f is


 
2 A11 A12
∇ f=
A21 A22

where A11 ∈ Rn×n , A12 ∈ Rn×m , A21 ∈ Rm×n , A22 ∈ Rm×m . And there is ∇2 f = A11 , ∇2z f = A22 .
As f is convex when z is fixed, then A11 is positive semi-definite; As f is concave when x is fixed, then
A22 is negative semi-definite.
(2) For the first inequality, as f (x, z) is convex of x, there is

f (x, z̃) ≥ f (x̃, z̃) + ∇x f (x̃, z̃)(x − x̃) = f (x̃, z̃)


The same is true for the left part of the inequality.
Next we prove supz inf x f (x, z) = f (x̃, z̃). First

inf f (x, z) ≤ f (x̃, z) ≤ f (x̃, z̃),


x

As
sup inf f (x, z) ≤ f (x̃, z̃),
z x

and
sup inf f (x, z) ≥ inf f (x, z̃) ≥ f (x̃, z̃),
z x x

Therefore, there is supz inf x f (x, z) = f (x̃, z̃). The same is true for inf x supz f (x, z) = f (x̃, z̃), which
proves the equality.
(3) We need to prove ∇x f (x̃, z̃) = 0 and ∇z f (x̃, z̃) = 0. We prove by contradiction. Assume
∇x f (x̃, z̃) ̸= 0, let v = (∇x f (x̃, z̃)T , 0)T . Get the first order of f (x̃ + tv, z̃) at (x̃, z̃) (t ̸= 0),

f (x̃ + tv, z̃) = f (x̃, z̃) + t||∇x f (x̃, z̃)||2 + O(t2 ).


Take t < 0 with small enough absolute value, there is f (x̃ + tv, z̃) < f (x̃, z̃), which contradicts with
the assumption. Therefore, there is ∇x f (x̃, z̃) = 0 and ∇z f (x̃, z̃) = 0.

Problem 10: Show that f (Ax + b) is convex if f (x) is a convex function.

Solution: The domain of f (Ax + b) is the same with f (x), which is a convex set. For any two points
x, y ∈ domf , 0 ≤ θ ≤ 1, there is,

f (A(θx + (1 − θ)y) + b) = f (θ(Ax + b) + (1 − θ)(Ay + b)) ≤ θf (Ax + b) + (1 − θ)f (Ay + b). (14)

Therefore, f (Ax + b) is convex.

You might also like