
August 4, 2018

CHAPTER 4: HIGHER ORDER DERIVATIVES

In this chapter D denotes an open subset of Rn .

1. Introduction
Definition 1.1. Given a function f : D → R we define the second partial derivatives as

Dij f = ∂/∂xi (∂f/∂xj) = ∂²f/∂xi∂xj

Likewise, we may define the higher order derivatives.

Example 1.2. Consider the function

f(x, y, z) = xy² + e^{zx}

then
∂f/∂x = y² + z e^{zx},    ∂f/∂y = 2xy,    ∂f/∂z = x e^{zx}

and, for example,

∂²f/∂x∂x = z² e^{zx},    ∂²f/∂x∂z = (1 + xz) e^{zx},    ∂²f/∂z∂x = (1 + xz) e^{zx}

We see that

∂²f/∂x∂z = ∂²f/∂z∂x

We may check that this also holds for the other variables:

∂²f/∂x∂y = ∂²f/∂y∂x    and    ∂²f/∂y∂z = ∂²f/∂z∂y
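These equalities are easy to verify with a computer algebra system. The following sketch checks the mixed partials of Example 1.2 with sympy (our choice of tool, not part of these notes):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x*y**2 + sp.exp(z*x)   # the function of Example 1.2

# Every pair of mixed second partial derivatives coincides for this f
for u, v in [(x, y), (x, z), (y, z)]:
    assert sp.simplify(sp.diff(f, u, v) - sp.diff(f, v, u)) == 0

# The common value of the (x, z) mixed partial is (1 + xz) e^{zx}
print(sp.factor(sp.diff(f, x, z)))
```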

Example 1.3. Consider the function

f(x, y) = xy(x² − y²)/(x² + y²)  if (x, y) ≠ (0, 0);    f(x, y) = 0  if (x, y) = (0, 0).

Here is the graph of f. [Figure omitted.]

We may check easily that for (x, y) ≠ (0, 0),

∂f/∂x (x, y) = (x⁴y + 4x²y³ − y⁵)/(x² + y²)²,    ∂f/∂y (x, y) = (x⁵ − 4x³y² − xy⁴)/(x² + y²)²

and

∂f/∂x (0, 0) = 0,    ∂f/∂y (0, 0) = 0
Then

∂²f/∂x∂y (0, 0) = lim_{x→0} [∂f/∂y (x, 0) − ∂f/∂y (0, 0)]/(x − 0) = lim_{x→0} x/x = 1

and

∂²f/∂y∂x (0, 0) = lim_{y→0} [∂f/∂x (0, y) − ∂f/∂x (0, 0)]/(y − 0) = lim_{y→0} −y/y = −1

so

∂²f/∂x∂y (0, 0) ≠ ∂²f/∂y∂x (0, 0)

On the other hand, one can check that if (x, y) ≠ (0, 0) then

∂²f/∂x∂y (x, y) = ∂²f/∂y∂x (x, y)
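The two iterated limits above can be reproduced symbolically. A sketch with sympy (which these notes do not assume), starting from the formula for f away from the origin:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x*y*(x**2 - y**2)/(x**2 + y**2)   # the formula valid away from (0, 0)

fx = sp.simplify(sp.diff(f, x))   # (x^4 y + 4x^2 y^3 - y^5)/(x^2 + y^2)^2
fy = sp.simplify(sp.diff(f, y))   # (x^5 - 4x^3 y^2 - x y^4)/(x^2 + y^2)^2

# Difference quotients at the origin, using f_x(0,0) = f_y(0,0) = 0
fxy = sp.limit(fy.subs(y, 0)/x, x, 0)   # d/dx of f_y at (0, 0)
fyx = sp.limit(fx.subs(x, 0)/y, y, 0)   # d/dy of f_x at (0, 0)
print(fxy, fyx)   # 1 -1
```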

The following result provides conditions under which the cross partial derivatives
coincide.
Theorem 1.4 (Schwarz). Suppose that for some i, j = 1, . . . , n the partial derivatives

∂f/∂xi,  ∂f/∂xj,  ∂²f/∂xi∂xj,  ∂²f/∂xj∂xi

exist and are continuous in some ball B(p, r), with r > 0. Then,

∂²f/∂xi∂xj (x) = ∂²f/∂xj∂xi (x)

for every x in the ball B(p, r).
Definition 1.5. Let D be an open subset of Rn and let f : D → R. We say that f is of class
• C1(D) if all the first partial derivatives ∂f/∂xi of f exist and are continuous on D for all i = 1, . . . , n.
• C2(D) if all the first partial derivatives ∂f/∂xi of f exist and are of class C1(D) for every i = 1, . . . , n.
• Ck(D) if all the first partial derivatives ∂f/∂xi of f exist and are of class Ck−1(D) for every i = 1, . . . , n.
We write f ∈ Ck(D).
Definition 1.6. Let f ∈ C2(D). The Hessian matrix of f at p is the matrix

D²f(p) = Hf(p) = (∂²f/∂xi∂xj (p))i,j=1,...,n

Remark 1.7. Note that by Schwarz's theorem, if f ∈ C2(D) then the matrix Hf(p) is symmetric.
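The Hessian and its symmetry can be computed directly in sympy; a short sketch using the function of Example 1.2 (the library choice is ours):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x*y**2 + sp.exp(z*x)

H = sp.hessian(f, (x, y, z))   # 3 x 3 matrix of second partial derivatives
# By Schwarz's theorem the Hessian of a C^2 function is symmetric
assert sp.simplify(H - H.T) == sp.zeros(3, 3)
```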

2. The Implicit function Theorem


In this section we are going to study non-linear systems of equations. For example,

(2.1)    x² + z e^{xy} + z = 1
         3x + 2y + z = 3

In general, it is extremely difficult to prove that a solution exists (it may not) or to solve such systems explicitly. Nevertheless, in Economics it often happens that the model we are interested in is described by a system of equations such as, for example, the system 2.1. And we would like to be able to say something about the dependence of the solution on the parameters. In this section we address this problem.
Firstly, we note that a system of m equations and n unknowns may be written in the following form

f1(u) = 0
f2(u) = 0
...
fm(u) = 0

where u ∈ Rn and f1, f2, . . . , fm : Rn → R. For example, the system 2.1 may be written as

f1(u) = 0
f2(u) = 0

with f1(x, y, z) = x² + z e^{xy} + z − 1 and f2(x, y, z) = 3x + 2y + z − 3.


What do the solutions of the system 2.1 look like? Comparing the situation with a linear system, we should expect to be able to solve for two of the variables in terms of one parameter, since there are 2 equations and 3 unknowns. Suppose, for example, that we want to solve for y, z as functions of x. This might be complicated and in most cases impossible. In this situation, the implicit function Theorem,
• provides sufficient conditions under which the system 2.1 has a unique solution, that is, it proves the existence of two functions y(x) and z(x) which satisfy the equations 2.1, even if we do not know how to compute these functions.
• when the system of equations 2.1 has a solution, it permits us to obtain an expression for y′(x) and z′(x), even if we do not know how to compute y(x), z(x).

Let us consider a system of equations

(2.2)    f1(u, v) = 0
         f2(u, v) = 0
         ...
         fm(u, v) = 0

where u = (u1, . . . , un) ∈ Rn are the independent variables, v = (v1, . . . , vm) ∈ Rm are the variables for which we want to solve¹ and f1, f2, . . . , fm : Rn × Rm → R. To this system we associate the following expression

∂(f1, f2, . . . , fm)/∂(v1, . . . , vm) = det(∂fi/∂vj)i,j=1,...,m
For example, for the system 2.1

∂(f1, f2)/∂(y, z) = det[[xz e^{xy}, e^{xy} + 1], [2, 1]] = xz e^{xy} − 2e^{xy} − 2

¹In the example 2.1, n = 1, m = 2, u = x, v = (y, z).



Theorem 2.1 (implicit function Theorem). Suppose that the functions f1, f2, . . . , fm : Rn × Rm → R are of class C1 and that there is a point (u0, v0) ∈ Rn × Rm such that
(1) f1(u0, v0) = f2(u0, v0) = · · · = fm(u0, v0) = 0; and
(2) ∂(f1, f2, . . . , fm)/∂(v1, . . . , vm) (u0, v0) ≠ 0.
Then, there are sets U ⊂ Rn and V ⊂ Rm and functions g1, . . . , gm : U → R such that
(1) u0 ∈ U, v0 ∈ V.
(2) for every u ∈ U,
f1(u, g1(u), . . . , gm(u)) = f2(u, g1(u), . . . , gm(u)) = · · · = fm(u, g1(u), . . . , gm(u)) = 0
(3) If u ∈ U and v = (v1, . . . , vm) ∈ V are solutions of the system of equations f1(u, v) = f2(u, v) = · · · = fm(u, v) = 0, then v1 = g1(u), . . . , vm = gm(u).
(4) The functions g1, . . . , gm : U → R are differentiable and for each i = 1, 2, . . . , m and j = 1, 2, . . . , n we have that

(2.3)    ∂gi/∂uj = − [∂(f1, f2, . . . , fm)/∂(v1, . . . , vi−1, uj, vi+1, . . . , vm)] / [∂(f1, f2, . . . , fm)/∂(v1, . . . , vm)]
Remark 2.2. Explicitly, ∂(f1, f2, . . . , fm)/∂(v1, . . . , vi−1, uj, vi+1, . . . , vm) is the determinant of the matrix obtained from (∂fk/∂vl) by replacing its i-th column with the derivatives with respect to uj:

det[[∂f1/∂v1, . . . , ∂f1/∂vi−1, ∂f1/∂uj, ∂f1/∂vi+1, . . . , ∂f1/∂vm], . . . , [∂fm/∂v1, . . . , ∂fm/∂vi−1, ∂fm/∂uj, ∂fm/∂vi+1, . . . , ∂fm/∂vm]]

Remark 2.3. The conclusion of the implicit function Theorem may be expressed in the following way,
(1) The functions
z1 = g1(u), z2 = g2(u), . . . , zm = gm(u)
are a solution of the system of equations 2.2.
(2) The derivatives of the functions g1, . . . , gm : U → R may be computed by implicitly differentiating the system of equations 2.2 and applying the chain rule.
Remark 2.4. Applying the implicit function Theorem several times, we may also compute the higher order derivatives of the dependent variables.
Example 2.5. Let us apply the implicit function Theorem to the system of equations
(2.4) x2 + zexy + z = 1
3x + 2y + z = 3
First we note that x = 1, y = z = 0 is a solution of the system. On the other hand, we have seen that

∂(f1, f2)/∂(y, z) (1, 0, 0) = det[[xz e^{xy}, e^{xy} + 1], [2, 1]]|x=1,y=z=0 = (xz e^{xy} − 2e^{xy} − 2)|x=1,y=z=0 = −4 ≠ 0

The implicit function Theorem guarantees that we may solve for the variables y and z as functions of x for values of x near 1. Furthermore, differentiating with respect to x in the system we obtain

(2.5)    2x + z′ e^{xy} + z(y + xy′) e^{xy} + z′ = 0
         3 + 2y′ + z′ = 0

Now substitute x = 1, y = z = 0,

(2.6)    2 + 2z′(1) = 0
         3 + 2y′(1) + z′(1) = 0

so that z′(1) = y′(1) = −1. This could be computed as well using formula 2.3,

y′(1) = − [∂(f1, f2)/∂(x, z) (1, 0, 0)] / (−4) = (1/4) det[[2x + yz e^{xy}, e^{xy} + 1], [3, 1]]|x=1,y=z=0 = −4/4 = −1

and

z′(1) = − [∂(f1, f2)/∂(y, x) (1, 0, 0)] / (−4) = (1/4) det[[xz e^{xy}, 2x + yz e^{xy}], [2, 3]]|x=1,y=z=0 = −4/4 = −1
To compute the second derivatives y″(x) and z″(x), we differentiate each equation of the system 2.5 with respect to x. After simplifying we obtain

2 + z″ e^{xy} + 2z′(y + xy′) e^{xy} + z(2y′ + xy″) e^{xy} + z(y + xy′)² e^{xy} + z″ = 0
2y″ + z″ = 0

and substituting x = 1, y(1) = z(1) = 0, z′(1) = y′(1) = −1,

2 + 2z″(1) = 0
2y″(1) + z″(1) = 0

from here we see that z″(1) = −1, y″(1) = 1/2. Iterated differentiation allows us to obtain the derivatives of any order z⁽ⁿ⁾(1), y⁽ⁿ⁾(1).
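The computation in Example 2.5 can be automated: treat y and z as unknown functions of x, differentiate the equations, and solve the resulting linear system for the derivatives. A sketch with sympy (illustrative, not the notes' method):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')(x)
z = sp.Function('z')(x)

# The system 2.1 with y, z viewed as functions of x
eq1 = x**2 + z*sp.exp(x*y) + z - 1
eq2 = 3*x + 2*y + z - 3

yp, zp = sp.diff(y, x), sp.diff(z, x)
# Differentiate both equations and solve the linear system for y', z'
sol = sp.solve([sp.diff(eq1, x), sp.diff(eq2, x)], [yp, zp])

# Evaluate at the known solution x = 1, y = 0, z = 0
at_point = {x: 1, y: 0, z: 0}
print(sp.simplify(sol[yp].subs(at_point)),
      sp.simplify(sol[zp].subs(at_point)))   # -1 -1
```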
Example 2.6. Consider the macroeconomic model

(2.7)    Y = C + I + G
         C = f(Y − T)
         I = h(r)
         r = m(M)

where the variables are Y (national income), C (consumption), I (investment) and r (interest rate) and the parameters are M (money supply), T (taxes collected) and G (public spending). We assume that 0 < f′(z) < 1. Compute

∂Y/∂M,  ∂Y/∂T,  ∂Y/∂G
The system may be written as follows

f1 = C + I + G − Y = 0
f2 = f(Y − T) − C = 0
f3 = h(r) − I = 0
f4 = m(M) − r = 0

First we compute

∂(f1, f2, f3, f4)/∂(Y, C, I, r) = det[[−1, 1, 1, 0], [f′(Y − T), −1, 0, 0], [0, 0, −1, h′(r)], [0, 0, 0, −1]] = 1 − f′(Y − T) ≠ 0
By the implicit function Theorem, the system 2.7 implicitly defines Y, C, I and r as functions of M, T and G (we assume that the system has some solution). Differentiating in 2.7 with respect to M we obtain

∂Y/∂M = ∂C/∂M + ∂I/∂M
∂C/∂M = f′(Y − T) ∂Y/∂M
∂I/∂M = h′(r) ∂r/∂M
∂r/∂M = m′(M)

Solving these equations, we obtain

∂Y/∂M = h′(r) m′(M) / (1 − f′(Y − T))
Differentiating in 2.7 with respect to T we obtain

∂Y/∂T = ∂C/∂T + ∂I/∂T
∂C/∂T = f′(Y − T)(∂Y/∂T − 1)
∂I/∂T = h′(r) ∂r/∂T
∂r/∂T = 0

Solving these equations, we obtain

∂Y/∂T = −f′(Y − T) / (1 − f′(Y − T))
Differentiating in 2.7 with respect to G we obtain

∂Y/∂G = ∂C/∂G + ∂I/∂G + 1
∂C/∂G = f′(Y − T) ∂Y/∂G
∂I/∂G = h′(r) ∂r/∂G
∂r/∂G = 0

Solving these equations, we obtain

∂Y/∂G = 1 / (1 − f′(Y − T))
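Each of these comparative statics is a linear system in the unknown partial derivatives and can be solved mechanically. A sympy sketch for the derivatives with respect to M, where fp, hp, mp stand for f′(Y − T), h′(r), m′(M) (the symbol names are ours):

```python
import sympy as sp

fp, hp, mp = sp.symbols('fp hp mp')          # f'(Y-T), h'(r), m'(M)
dY, dC, dI, dr = sp.symbols('dY dC dI dr')   # the partials with respect to M

sol = sp.solve([sp.Eq(dY, dC + dI),
                sp.Eq(dC, fp*dY),
                sp.Eq(dI, hp*dr),
                sp.Eq(dr, mp)],
               [dY, dC, dI, dr])

# dY/dM = h'(r) m'(M) / (1 - f'(Y-T)), as derived above
assert sp.simplify(sol[dY] - hp*mp/(1 - fp)) == 0
```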

Example 2.7 (Indifference curves). Suppose that there are two consumption goods and a consumer whose preferences are represented by a utility function u(x, y). The indifference curves of the consumer are the sets

{(x, y) ∈ R2 : x, y > 0, u(x, y) = C}

with C ∈ R. Suppose that the function u(x, y) is differentiable and that

∂u/∂x > 0,  ∂u/∂y > 0
Applying the implicit function Theorem, we see that the equation
u(x, y) = C
defines y as a function of x. The set
{(x, y) ∈ R2 : x, y > 0, u(x, y) = C}
may be represented as the graph of the function y(x).

[Figure: the indifference curve {(x, y) : u(x, y) = C} represented as the graph of y(x), with y(a) marked.]

Differentiating implicitly, we may compute the derivative y′:

∂u/∂x + (∂u/∂y) y′(x) = 0

so that

y′(x) = − (∂u/∂x) / (∂u/∂y)
We see that y(x) is a decreasing function. The absolute value of y′(x) (that is, the absolute value of the slope of the straight line tangent to the indifference curve) is the marginal rate of substitution of the consumer. Therefore, we define the marginal rate of substitution of the consumer as

MRS(x, y) = (∂u/∂x) / (∂u/∂y) (x, y)

Suppose that a consumer has a bundle of consumption goods (a, b), with b = y(a). Recalling the interpretation of the derivative y′(a), we see that the marginal rate of substitution MRS(a, b) of the agent measures (approximately) the maximum amount of good y that the agent would be willing to exchange for an additional consumption of one unit of good x.

For example, if the consumer has a Cobb-Douglas utility function u(x, y) = x²y⁴, the marginal rate of substitution is

MRS(x, y) = (∂u/∂x) / (∂u/∂y) = 2xy⁴ / (4x²y³) = y / (2x)
On the other hand, recall that the slope of the straight line tangent to the graph of y(x) at the point (a, y(a)) is y′(a). That is, the direction vector of the straight line tangent to the graph of y(x) at the point (a, y(a)) is the vector (1, y′(a)). Taking the scalar product of this vector with the gradient vector of u at the point (a, y(a)) we obtain that

(1, y′(a)) · ∇u(a, y(a)) = (1, −(∂u/∂x)/(∂u/∂y)) · (∂u/∂x, ∂u/∂y) = 0

And we have checked again that the gradient vector ∇u is perpendicular to the straight line tangent to the indifference curve of the consumer.

[Figure: the gradient ∇u(a, y(a)) perpendicular to the indifference curve {(x, y) : u(x, y) = C} at the point (a, y(a)).]

Example 2.8. Suppose that there are two consumption goods and the agent has preferences over these which may be represented by a utility function u(x, y). Suppose the prices of the goods are px and py. Consuming the bundle (x, y) costs

px x + py y

to the agent. If his income is I then

px x + py y = I

That is, if the agent buys x units of the first good, then the maximum amount he can consume of the second good is

I/py − (px/py) x

so his utility is

(2.8)    u(x, I/py − (px/py) x)

In Economic Theory one assumes that the agent chooses the bundle of goods (x, y) that maximizes his utility. That is, the agent maximizes the function 2.8. Differentiating implicitly with respect to x we obtain

(2.9)    ∂u/∂x − (∂u/∂y)(px/py) = 0

Thus, the first order condition is

MRS(x, y) = px/py

The above equation together with the budget restriction

px x + py y = I

determines the demand of the agent.
For example, if the preferences of the consumer may be represented by a Cobb-Douglas utility function

u(x, y) = x²y

the MRS is

MRS(x, y) = 2xy / x² = 2y/x

and the demand of the agent is determined by the system of equations

2y/x = px/py
px x + py y = I

from these we obtain the demand of the agent

x(px, py, I) = 2I/(3px)
y(px, py, I) = I/(3py)
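This demand computation can be reproduced symbolically. A sketch with sympy for the utility u(x, y) = x²y (the symbol names are illustrative):

```python
import sympy as sp

x, y, px, py, I = sp.symbols('x y px py I', positive=True)
u = x**2*y

MRS = sp.diff(u, x)/sp.diff(u, y)   # equals 2y/x for this utility
# First order condition MRS = px/py together with the budget constraint
demand = sp.solve([sp.Eq(MRS, px/py), sp.Eq(px*x + py*y, I)],
                  [x, y], dict=True)[0]
print(demand[x], demand[y])   # the demands 2I/(3 px) and I/(3 py)
```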
Example 2.9 (Isoquants and the marginal rate of technical substitution). Suppose that a firm uses the production function Y = f(x1, x2), where (x1, x2) are the units of inputs used in manufacturing Y units of the product. Given a fixed level of production ȳ, the corresponding isoquant is the level set

{(x1, x2) ∈ R2 : x1, x2 > 0, f(x1, x2) = ȳ}

As in the previous exercise, we see that on the isoquants we may write x2 as a function of x1 and that

x2′(x1) = − (∂f/∂x1) / (∂f/∂x2)

The marginal rate of technical substitution is defined as

MRTS = −x2′(x1) = (∂f/∂x1) / (∂f/∂x2)

For example, if the production function of the firm is Y = x1^{1/3} x2^{1/2}, then the marginal rate of technical substitution is

MRTS = (∂Y/∂x1) / (∂Y/∂x2) = [(1/3) x1^{−2/3} x2^{1/2}] / [(1/2) x1^{1/3} x2^{−1/2}] = 2x2 / (3x1)

3. Taylor’s Approximations of First and Second order


Definition 3.1. Let f ∈ C1(D), p ∈ D. The first order Taylor polynomial of f at p is

P1(x) = f(p) + ∇f(p) · (x − p)

Remark 3.2. If f(x, y) is a function of two variables and p = (a, b), then Taylor's first order polynomial for the function f around the point p = (a, b) is the polynomial

P1(x, y) = f(a, b) + ∂f/∂x (a, b) · (x − a) + ∂f/∂y (a, b) · (y − b)
Definition 3.3. If f ∈ C2(D) we define the Taylor polynomial of order 2 around the point p as

P2(x) = f(p) + ∇f(p)·(x − p) + (1/2)(x − p)·Hf(p)(x − p) = P1(x) + (1/2)(x − p)·Hf(p)(x − p)

Remark 3.4. If f(x, y) is a function of two variables and p = (a, b), then Taylor's second order polynomial for the function f around the point p = (a, b) is the polynomial

P2(x, y) = f(a, b) + ∂f/∂x (a, b)(x − a) + ∂f/∂y (a, b)(y − b) + (1/2)[∂²f/∂x∂x (a, b)(x − a)² + 2 ∂²f/∂x∂y (a, b)(x − a)(y − b) + ∂²f/∂y∂y (a, b)(y − b)²]
Remark 3.5. These are good approximations to f(x) in the sense that if f is of class C1(D), then

lim_{x→p} [f(x) − P1(x)] / ‖x − p‖ = 0

and if f is of class C2(D), then

lim_{x→p} [f(x) − P2(x)] / ‖x − p‖² = 0
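As an illustration, the second order Taylor polynomial can be assembled directly from the gradient and the Hessian. A sympy sketch for f(x, y) = eˣ cos y at p = (0, 0) (the example function is our choice):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x)*sp.cos(y)
a, b = 0, 0                      # the expansion point p = (a, b)
d = sp.Matrix([x - a, y - b])    # the vector x - p

grad = sp.Matrix([f.diff(x), f.diff(y)]).subs({x: a, y: b})
H = sp.hessian(f, (x, y)).subs({x: a, y: b})

# P2(x) = f(p) + grad f(p).(x - p) + (1/2)(x - p).Hf(p)(x - p)
P2 = f.subs({x: a, y: b}) + (grad.T*d)[0] + sp.Rational(1, 2)*(d.T*H*d)[0]
print(sp.expand(P2))   # 1 + x + x^2/2 - y^2/2, up to term ordering
```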

4. Quadratic forms
Definition 4.1. A quadratic form of order n is a function Q : Rn → R of the form

Q(x1, x2, . . . , xn) = Σ_{i,j=1}^{n} aij xi xj

for some real numbers aij ∈ R, i, j = 1, . . . , n.


Example 4.2. Q(x, y, z) = x² − 2xy + 4xz + 6yz + 5z²
Remark 4.3. A quadratic form can be expressed in matrix notation. For example,

Q(x, y, z) = (x y z) [[1, −1, 2], [−1, 0, 3], [2, 3, 5]] (x y z)ᵗ = x² − 2xy + 4xz + 6yz + 5z²

This can be done in infinitely many ways. For example, the previous quadratic form can also be expressed as

Q(x, y, z) = (x y z) [[1, −2, 1], [0, 0, 4], [3, 2, 5]] (x y z)ᵗ

(The condition is that xAxᵗ = xBxᵗ if and only if aij + aji = bij + bji for every i, j = 1, 2, . . . , n.)
But there is a unique way if we require that A be symmetric.
Proposition 4.4. Every quadratic form Q : Rn → R can be written in a unique way as Q(x) = xAxᵗ with A = Aᵗ a symmetric matrix.
Remark 4.5. Observe that the symmetric matrix A = (aij) is associated with the following quadratic form

Q(x) = Σ_{i,j=1}^{n} aij xi xj = Σ_{i=1}^{n} aii xi² + 2 Σ_{1≤i<j≤n} aij xi xj

We will identify the quadratic form Q(x) = xAxᵗ with the matrix A.
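The symmetric representative is A = (B + Bᵗ)/2, which preserves the sums aij + aji and hence the quadratic form. A quick numpy sketch with the matrices of Remark 4.3 (numpy is our choice of tool here):

```python
import numpy as np

B = np.array([[1, -2, 1],
              [0,  0, 4],
              [3,  2, 5]], dtype=float)   # a non-symmetric representation
A = (B + B.T)/2                           # the unique symmetric representative

rng = np.random.default_rng(0)
for _ in range(5):
    v = rng.standard_normal(3)
    assert np.isclose(v @ B @ v, v @ A @ v)   # same quadratic form
print(A)   # recovers the symmetric matrix [[1,-1,2],[-1,0,3],[2,3,5]]
```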

4.1. Classification of quadratic forms.


Definition 4.6. A quadratic form Q : Rn → R is
(1) Positive definite if Q(x) > 0 for every x ∈ Rn, x ≠ 0.
(2) Negative definite if Q(x) < 0 for every x ∈ Rn, x ≠ 0.
(3) Positive semidefinite if Q(x) ≥ 0 for every x ∈ Rn and Q(x) = 0 for some x ≠ 0.
(4) Negative semidefinite if Q(x) ≤ 0 for every x ∈ Rn and Q(x) = 0 for some x ≠ 0.
(5) Indefinite if there are some x, y ∈ Rn such that Q(x) > 0 and Q(y) < 0.
Example 4.7. Q1(x, y, z) = x² + 3y² + z² is positive definite.
Example 4.8. Q2(x, y, z) = −2x² − y² is negative semidefinite.
Example 4.9. Q3(x, y) = −2x² − y² is negative definite.
Example 4.10. Q4(x, y, z) = x² − y² + 3z² is indefinite.
The previous quadratic forms are easy to classify because they are in diagonal form, i.e.

Q1 ⇔ [[1, 0, 0], [0, 3, 0], [0, 0, 1]]    Q2 ⇔ [[−2, 0, 0], [0, −1, 0], [0, 0, 0]]
Q3 ⇔ [[−2, 0], [0, −1]]    Q4 ⇔ [[1, 0, 0], [0, −1, 0], [0, 0, 3]]

Proposition 4.11. Consider the diagonal matrix A = diag(λ1, λ2, . . . , λn). Then, the quadratic form

Q(x) = xAxᵗ = λ1x1² + λ2x2² + · · · + λnxn²

is,
(1) positive definite if and only if λi > 0 for every i = 1, 2, . . . , n;
(2) negative definite if and only if λi < 0 for every i = 1, 2, . . . , n;
(3) positive semidefinite if and only if λi ≥ 0 for every i = 1, 2, . . . , n and λk = 0 for some k = 1, 2, . . . , n;
(4) negative semidefinite if and only if λi ≤ 0 for every i = 1, 2, . . . , n and λk = 0 for some k = 1, 2, . . . , n;
(5) indefinite if and only if there is some λi > 0 and some λj < 0.
We are going to study some methods to determine whether a quadratic form is positive/negative definite, semidefinite or indefinite. They are based on making a change of variables that puts the quadratic form in diagonal form.
Let A = (aij) be a symmetric n × n matrix and let

D1 = a11,  D2 = det[[a11, a12], [a21, a22]],  D3 = det[[a11, a12, a13], [a12, a22, a23], [a13, a23, a33]],  . . . ,  Dn = |A|

be the leading principal minors of A. Suppose D1 ≠ 0, D2 ≠ 0, . . . , Dn−1 ≠ 0. Then, there is a change of variables T x = z such that the quadratic form Q(x) = xAxᵗ becomes

Q̃(z) = D1 z1² + (D2/D1) z2² + (D3/D2) z3² + · · · + (Dn/Dn−1) zn²

This helps to remember the following.
Proposition 4.12. Let Q(x) = xAxᵗ with A symmetric and suppose that |A| ≠ 0. Then,
(1) A is positive definite if and only if Di > 0 for every i = 1, 2, . . . , n;
(2) A is negative definite if and only if (−1)^i Di > 0 for every i = 1, 2, . . . , n;
(3) if (1) and (2) do not hold, then Q is indefinite.
The previous proposition applies when |A| ≠ 0. What can we say if |A| = 0? The following is a partial answer.
Proposition 4.13. Let Q(x) = xAxᵗ with A symmetric and suppose Dn = 0 and D1 ≠ 0, D2 ≠ 0, . . . , Dn−1 ≠ 0. Then A is
(1) positive semidefinite if and only if D1, D2, . . . , Dn−1 > 0;
(2) negative semidefinite if and only if D1 < 0, D2 > 0, . . . , (−1)^{n−1} Dn−1 > 0;
(3) indefinite otherwise.
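Proposition 4.12 can be turned into a small procedure. A numpy sketch (the function names are ours; like the proposition, it assumes |A| ≠ 0):

```python
import numpy as np

def leading_minors(A):
    """Leading principal minors D1, ..., Dn of a square matrix A."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def classify(A, tol=1e-12):
    """Classify x A x^t via Proposition 4.12; assumes det(A) != 0."""
    D = leading_minors(A)
    if all(d > tol for d in D):
        return "positive definite"
    if all((-1)**i*d > tol for i, d in enumerate(D, start=1)):
        return "negative definite"
    return "indefinite"

# The diagonal forms Q1, Q3, Q4 from the examples above
print(classify(np.diag([1.0, 3.0, 1.0])),      # positive definite
      classify(np.diag([-2.0, -1.0])),         # negative definite
      classify(np.diag([1.0, -1.0, 3.0])))     # indefinite
```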

Next, we study some examples that illustrate some of the things one can do when Dn = 0 and some of D1, . . . , Dn−1 vanish.
Example 4.14. Consider the matrix

[[1, 0, 0], [0, 0, 0], [0, 0, a]]

We see that D1 = 1, D2 = D3 = 0. Clearly, the eigenvalues are λ1 = 1, λ2 = 0, λ3 = a. So, the associated quadratic form is positive semidefinite if and only if a ≥ 0 and indefinite if and only if a < 0. But this is impossible to tell from D1 = 1, D2 = D3 = 0.
Remark that in the previous example, if we exchange the variables y and z then the associated matrix becomes

[[1, 0, 0], [0, a, 0], [0, 0, 0]]

and then the above propositions apply. This is formalized in the following observation.
Definition 4.15. A principal minor is centered if it includes the same rows and columns. For example, the minor

det[[a11, a13], [a31, a33]]

is a centered minor, because it includes rows and columns 1, 3. But the minor

det[[a11, a12], [a31, a32]]

is NOT centered, because it includes rows 1, 3 and columns 1, 2.
Proposition 4.16. Proposition 4.13 still holds if we replace the chain of leading
principal minors by any other chain consisting of principal centered minors.
Remark 4.17. The methods above are especially useful for symmetric 2 × 2 matrices. For example, if A is a 2 × 2 matrix and |A| < 0, then the associated quadratic form is indefinite. Why?

5. Concavity and convexity


In this section we assume that D ⊂ Rn is a convex, open set and f : D → R.
Definition 5.1. We say that
(1) f is concave on D if for every λ ∈ [0, 1] and x, y ∈ D we have that
f (λx + (1 − λ)y) ≥ λf (x) + (1 − λ)f (y)
(2) f is convex on D if for every λ ∈ [0, 1] and x, y ∈ D we have that
f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)
Remark 5.2. Note that f is convex on D if and only if −f is concave on D.
Definition 5.3. We say that
(1) f is strictly concave on D if for every λ ∈ (0, 1) and x, y ∈ D, x ≠ y, we have that
f (λx + (1 − λ)y) > λf (x) + (1 − λ)f (y)
(2) f is strictly convex on D if for every λ ∈ (0, 1) and x, y ∈ D, x ≠ y, we have that
f (λx + (1 − λ)y) < λf (x) + (1 − λ)f (y)
Remark 5.4. Note that f is strictly convex on D if and only if −f is strictly concave
on D.
Proposition 5.5. Let D be a convex, open subset of Rn . Then,
(1) f is concave ⇔ the set {(x, y) : x ∈ D, y ≤ f (x)} is convex.
(2) f is convex ⇔ the set {(x, y) : x ∈ D, y ≥ f (x)} is convex.
(3) f is strictly convex ⇔ the set {(x, y) : x ∈ D, y ≥ f (x)} is convex and the
graph of f contains no segments.
(4) f is strictly concave ⇔ the set {(x, y) : x ∈ D, y ≤ f (x)} is convex and the graph of f contains no segments.
(5) If f is convex, then the lower contour set {x ∈ D : f (x) ≤ α} is convex for
every α ∈ R
(6) If f is concave, then the upper contour set {x ∈ D : f (x) ≥ α} is convex
for every α ∈ R
Example 5.6. f (x, y) = x² + y² is strictly convex.
Example 5.7. f (x, y) = (x − y)² is convex, but not strictly convex.
Remark 5.8. The conditions in (5) and (6) are necessary but not sufficient. For
example, any monotone function f : R → R satisfies that both sets
{x ∈ D : f (x) ≤ α} and {x ∈ D : f (x) ≥ α}
are convex.

6. First order conditions for concavity and convexity


Suppose f : Rn → R is concave and differentiable on a convex set D. Then,
the plane tangent to the graph of f at p ∈ D is above the graph of f . Recall that
the tangent plane is the set of points (x1 , . . . , xn , xn+1 ) ∈ Rn+1 that satisfy the
equation,
xn+1 = f (p) + ∇f (p) · (x − p)
Hence, if f is concave and differentiable on D, then we have that,
f (x) ≤ f (p) + ∇f (p) · (x − p)
for every x ∈ D.
Proposition 6.1. Suppose f ∈ C 1 (D). Then,
(1) f is concave on D if and only if for all u, v ∈ D we have that
f (u) ≤ f (v) + ∇f (v) · (u − v)
(2) f is strictly concave on D if and only if for all u, v ∈ D, u 6= v, we have
that
f (u) < f (v) + ∇f (v) · (u − v)

(3) f is convex on D if and only if for all u, v ∈ D we have that


f (u) ≥ f (v) + ∇f (v) · (u − v)
(4) f is strictly convex on D if and only if for all u, v ∈ D, u 6= v, we have that
f (u) > f (v) + ∇f (v) · (u − v)

7. Second order conditions for concavity and convexity


Proposition 7.1. Let D ⊂ Rn be open and convex. Let f ∈ C2(D) and let Hf(p) be the Hessian matrix of f at p. Then,
(1) f is concave on D if and only if for every p ∈ D, Hf(p) is negative semidefinite or negative definite. That is, f is concave on D if and only if for every p ∈ D and x ∈ Rn we have that x · Hf(p)x ≤ 0.
(2) f is convex on D if and only if for every p ∈ D, Hf(p) is positive semidefinite or positive definite. That is, f is convex on D if and only if for every p ∈ D and x ∈ Rn we have that x · Hf(p)x ≥ 0.
(3) If Hf(p) is negative definite for every p ∈ D, then f is strictly concave on D.
(4) If Hf(p) is positive definite for every p ∈ D, then f is strictly convex on D.
Remark 7.2. One can show that if f is strictly convex, then Hf(x, y) is positive definite except on a "small" set. For example, f(x, y) = x⁴ + y⁴ is strictly convex and

Hf(x, y) = [[12x², 0], [0, 12y²]]

is positive definite if xy ≠ 0, that is, it is positive definite on all of R2 except on the two axes {(x, y) ∈ R2 : xy = 0}. For points on the two axes (that is, for points (x, y) ∈ R2 such that xy = 0) the Hessian matrix is positive semidefinite.
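The Hessian computation of Remark 7.2 can be checked symbolically; a sketch with sympy (our choice of tool):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 + y**4

H = sp.hessian(f, (x, y))
print(H)   # Matrix([[12*x**2, 0], [0, 12*y**2]])

# Leading principal minors: D1 = 12x^2 and D2 = 144 x^2 y^2, both > 0
# exactly when xy != 0, so H is positive definite off the two axes
assert sp.simplify(H.det() - 144*x**2*y**2) == 0
```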

8. Applications to convex sets


Proposition 8.1. If X1 , . . . , Xk are convex subsets of Rn , then X1 ∩ X2 ∩ · · · ∩ Xk
is also a convex subset.
Example 8.2. Using the theory of this chapter, prove that the set {(x, y) ∈ R2 : 3x² + 10y² ≤ 10, x ≥ 0, y ≤ 0} is convex.
Example 8.3 (Concavity, convexity and preferences). In example 2.7 we considered
a consumer whose preferences (over two consumption goods) are represented by the
utility function u(x, y). The indifference curves of the consumer are the sets
{(x, y) ∈ R2 : x, y > 0, u(x, y) = C}
with C ∈ R. Suppose that the function u(x, y) is differentiable and that

∂u/∂x > 0,  ∂u/∂y > 0
Applying the implicit function Theorem, we see that the equation
u(x, y) = C
defines y as a function of x. The set
{(x, y) ∈ R2 : x, y > 0, u(x, y) = C}

may be represented as the graph of the function y(x). Differentiating implicitly the equation u(x, y) = C we may compute the derivative y′:

∂u/∂x + (∂u/∂y) y′(x) = 0

Applying again the implicit function Theorem we obtain an equation for the second derivative y″:

(8.1)    ∂²u/∂x∂x + 2 (∂²u/∂x∂y) y′(x) + (∂²u/∂y∂y) (y′(x))² + (∂u/∂y) y″(x) = 0
One of the standard assumptions in Economic Theory is that the set which consists of all the consumption bundles which are preferred to a given consumption bundle is a convex set. In terms of the utility function this means that the set

{(x, y) ∈ R2 : x, y > 0, u(x, y) > C}

is convex.

[Figure: the convex set {(x, y) : u(x, y) ≥ C} lying above the graph of y(x).]

By Proposition 5.5, the set {(x, y) ∈ R2 : x, y > 0, u(x, y) > C} is convex if we assume that the function u(x, y) is concave². Suppose that the function u(x, y) is concave and of class C2. According to the definition of concavity, this means that for every h, k ∈ R we have that

(∂²u/∂x∂x) h² + 2 (∂²u/∂x∂y) hk + (∂²u/∂y∂y) k² ≤ 0

If in this equation we plug in h = 1, k = y′(x) we obtain that

∂²u/∂x∂x + 2 (∂²u/∂x∂y) y′(x) + (∂²u/∂y∂y) (y′(x))² ≤ 0

and solving for y″ in the equation 8.1 we obtain

y″(x) = − [∂²u/∂x∂x + 2 (∂²u/∂x∂y) y′(x) + (∂²u/∂y∂y) (y′(x))²] / (∂u/∂y) ≥ 0
²But, that the set {(x, y) ∈ R2 : x, y > 0, u(x, y) ≥ C} is convex does not necessarily imply that the function u is concave.

that is, the function y(x) is convex, so that y′(x) is increasing. Since MRS(x, y(x)) = −y′(x), we see that if the preferences of the consumer are convex, his marginal rate of substitution is decreasing.
