Ammar Khanfer - Applied Functional Analysis-Springer (2024)
Applied Functional Analysis
Ammar Khanfer
Mathematics Subject Classification: 46B70, 46B50, 46A22, 47B07, 47B38, 47B99, 46B25, 46A30,
54E52, 46C05, 35D30, 35J20, 35A15
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
The present book is the third volume of our series in advanced analysis, which
consists of
• Volume 1: “Measure Theory and Integration”.
• Volume 2: “Fundamentals of Functional Analysis”.
The field of applied functional analysis is concerned with the applications of functional
analysis to different areas of applied mathematics and includes various subfields
and research directions. Historically, functional analysis emerged as a consequence
of the investigations made on the minimization problems of the calculus of variations
(COV), but it soon thrived when connected to the theory of partial differential
equations. The theories of Sobolev spaces and distributions were established in the
beginning of the twentieth century to offer genuine and brilliant answers to the big
question of the existence of solutions of PDEs. This direction: (Sobolev spaces,
minimization problems in COV, existence and uniqueness theorems of solutions of
PDEs, regularity theory) is one of the greatest and most remarkable mathematical
achievements in the twentieth century, and should be regarded as one of the most
successful stories in the history of mathematics.
The present volume highlights this direction and introduces it to the readers by
studying its fundamentals and main theories, and providing a careful treatment of the
subject with clear exposition and in a student-friendly manner. The book is intended
to help students and junior researchers focusing on the theory of PDEs and calculus of
variations. The book serves as a one-semester, or alternatively two-semester, graduate
course for mathematics students concentrating on analysis. Essential prerequisites
for the book include real and functional analysis in addition to linear algebra. A
course on PDEs can be helpful but is not necessary. The book consists of five chapters,
with eleven sections in each chapter.
Chapter 1 discusses linear bounded operators: compact operators, Hilbert–
Schmidt operators, self-adjoint operators and their spectral properties, and the Fred-
holm alternative theorem. After that, the unbounded operators are discussed in detail,
with a special focus on differential and integral operators.
I am forever grateful and thankful to God for giving me the strength, health,
knowledge, and patience to endure and complete this work successfully.
I would like to express my sincere thanks to Prince Sultan University for its
continuing support. I also wish to express my deep thanks and gratitude to Prof.
Mahmoud Al Mahmoud, the dean of our college (CHS), and Prof. Wasfi Shatanawi,
the chair of our department (MSD), for their support and recognition of my work. My
sincerest thanks to my colleagues in our department for their warm encouragement.
Contents
1 Operator Theory
 1.1 Quick Review of Hilbert Space
  1.1.1 Lebesgue Spaces
  1.1.2 Convergence Theorems
  1.1.3 Complete Space
  1.1.4 Hilbert Space
  1.1.5 Fundamental Mapping Theorems on Banach Spaces
 1.2 The Adjoint of Operator
  1.2.1 Bounded Linear Operators
  1.2.2 Definition of Adjoint
  1.2.3 Adjoint Operator on Hilbert Spaces
  1.2.4 Self-adjoint Operators
 1.3 Compact Operators
  1.3.1 Definition and Properties of Compact Operators
  1.3.2 The Integral Operator
  1.3.3 Finite-Rank Operators
 1.4 Hilbert–Schmidt Operator
  1.4.1 Definition of Hilbert–Schmidt Operator
  1.4.2 Basic Properties of HS Operators
  1.4.3 Relations with Compact and Finite-Rank Operators
  1.4.4 The Fredholm Operator
  1.4.5 Characterization of HS Operators
 1.5 Eigenvalues of Operators
  1.5.1 Spectral Analysis
  1.5.2 Definition of Eigenvalues and Eigenfunctions
  1.5.3 Eigenvalues of Self-adjoint Operators
  1.5.4 Eigenvalues of Compact Operators
 1.6 Spectral Analysis of Operators
  1.6.1 Resolvent and Regular Values
  1.6.2 Bounded Below Mapping
  1.6.3 Spectrum of Bounded Operator
References
Index
About the Author
Ammar Khanfer earned his Ph.D. from Wichita State University, USA. His area of
interest is analysis and partial differential equations (PDEs), focusing on the interface
and links between elliptic PDEs and hypergeometry. He has notably contributed to
the field by providing prototypes studying the behavior of generalized solutions of
elliptic PDEs in higher dimensions in connection to the behavior of hypersurfaces
near nonsmooth boundaries. He also works on the qualitative theory of differential
equations, and in the area of inverse problems of mathematical physics. He has
published articles of high quality in reputable journals.
Ammar taught at several universities in the USA: Western Michigan University,
Wichita State University, and Southwestern College in Winfield. He was a member
of the Academy of Inquiry Based Learning (AIBL) in the USA. During the period
2008–2014, he participated in AIBL workshops and conferences on effective teaching
methodologies and strategies of creative thinking. He then moved to Saudi Arabia
to teach at Imam Mohammad Ibn Saud Islamic University, where he taught and
supervised undergraduate and graduate students of mathematics. Furthermore, he
was appointed as coordinator of the Ph.D. program establishment committee in the
department of mathematics. In 2020, he moved to Prince Sultan University in Riyadh,
and has been teaching there since then.
Chapter 1
Operator Theory
This section provides a very brief and quick review of the basics of Hilbert space
theory and functional analysis that are needed for this text. We list some of the most
important notions and results that will be used throughout this book. It should be
noted that the objective of this section is merely to refresh the memory rather than
to explain these concepts, as they have already been explained in detail in volume 2 of
this series [58]. The reader who has not studied this material should consult [58] or,
alternatively, any introductory book on functional analysis.
Definition 1.1.2 (L^p Spaces) The space L[a, b] is the space consisting of all
Lebesgue-integrable functions on [a, b], that is, those functions f : [a, b] → R such
that
$$\|f\| = \int_a^b |f(x)|\,dx < \infty.$$
The space L[a, b] can also be generalized to L^p[a, b], the space of all functions such
that |f|^p is Lebesgue-integrable on [a, b] for every f ∈ L^p[a, b], where 1 ≤ p < ∞,
endowed with the norm
$$\|f\|_p = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}.$$
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 1
A. Khanfer, Applied Functional Analysis,
https://doi.org/10.1007/978-981-99-3788-2_1
$$\|f + g\|_p \le \|f\|_p + \|g\|_p.$$
Theorem 1.1.13 Let X be a complete space and Y ⊂ X. Then, Y is closed if and only
if Y is complete.
$$\langle \cdot, \cdot \rangle : V \times V \longrightarrow F$$
where V is a vector space over the field F, which could be R or C, such that the
following hold:
(1) For any x ∈ V, ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 iff x = 0.
(2) For x, y ∈ V and α ∈ F, we have ⟨αx, ·⟩ = α⟨x, ·⟩ and ⟨x + y, ·⟩ = ⟨x, ·⟩ + ⟨y, ·⟩.
We also have the conjugate-linearity property: ⟨·, αx⟩ = ᾱ⟨·, x⟩ and
⟨·, x + y⟩ = ⟨·, x⟩ + ⟨·, y⟩.
(3) $\langle x, y \rangle = \overline{\langle y, x \rangle}$. If F = R, then we have ⟨x, y⟩ = ⟨y, x⟩.
Definition 1.1.16 (Hilbert Space) Let V be a vector space. The space X = (V, ⟨·, ·⟩)
is said to be an inner product space. A complete inner product space is called a
Hilbert space.
Recall that in a basic functional analysis course, a linear operator was defined to be a
mapping T from a normed space X to another normed space Y such that
$$T(x + y) = T(x) + T(y) \quad \text{and} \quad T(cx) = cT(x)$$
for all x, y ∈ X and for any scalar c in the field underlying the spaces X, Y, which
is usually taken to be R.
Let T be a linear operator, and let X and Y be two normed spaces with norms ‖·‖_X and
‖·‖_Y, respectively. Then T is called a bounded linear operator if there exists M ∈ R
such that for all x ∈ X,
$$\|Tx\|_Y \le M\|x\|_X.$$
If there is no such M, then T is said to be unbounded. If Y = R then T is called a
functional, and if X is finite-dimensional then T is called a transformation. In general,
T is a mapping that maps an element x to a unique element in Y. The norm can be
written as
$$\|T\| = \sup_{\|x\|=1} \|T(x)\|.$$
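As a numerical aside (not from the text; numpy and the 2×2 matrix below are illustrative assumptions), the operator norm $\|T\| = \sup_{\|x\|=1}\|T(x)\|$ of a matrix can be approximated by sampling unit vectors, and compared against the largest singular value, which is the exact operator norm of a matrix:

```python
import numpy as np

# A hypothetical 2x2 operator, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

# Approximate ||A|| = sup_{||x|| = 1} ||Ax|| by sampling the unit circle.
thetas = np.linspace(0.0, 2.0 * np.pi, 100000)
unit_vectors = np.stack([np.cos(thetas), np.sin(thetas)])   # shape (2, N)
sampled_norm = np.linalg.norm(A @ unit_vectors, axis=0).max()

# For a matrix, the operator norm equals its largest singular value.
exact_norm = np.linalg.norm(A, 2)

assert abs(sampled_norm - exact_norm) < 1e-6
```

The supremum is attained here because the unit sphere of a finite-dimensional space is compact; on infinite-dimensional spaces this is precisely what can fail.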
The fundamental theorem of bounded linear operators states that T is bounded iff T
is continuous at all x ∈ X iff T is continuous at 0. So, for linear functionals, bound-
edness and continuity are equivalent, and this feature is not available for nonlinear
operators. In fact, it can be easily shown that every linear operator defined on a finite-
dimensional space is bounded (i.e., continuous). An operator T is said to be injective
(or one-to-one) if for every x, y ∈ X such that T (x) = T (y), we have x = y. An
operator T is said to be surjective (or onto) if for every y ∈ Y, there exists at least
one x ∈ X such that T (x) = y. An operator T is said to be bijective if it is injective
and surjective. If dim X = dim Y = n < ∞, then T is injective if and only if T is
surjective. If T is bijective, and T, T⁻¹ are continuous, then T is an isomorphism
between X and Y. Moreover, T is called an isometry if
$$\|x\|_X = \|T(x)\|_Y$$
for all x ∈ X.
An important operator that is defined in association with the operator T is the adjoint
operator.
Definition 1.2.1 (Adjoint Operator) Let X and Y be two Banach spaces, and let
T : X −→ Y be a linear operator. Then, the adjoint operator of T, denoted by T ∗ ,
is the operator T ∗ : Y ∗ −→ X ∗ defined as
$$T^*(f) = f \circ T,$$
for f ∈ Y ∗ .
A basic property that can be easily established from the definition is that T ∗ is
linear for linear operators, and, moreover, if S : X −→ Y is another linear operator,
then
(T + S)∗ = T ∗ + S ∗
and
(T S)∗ = S ∗ T ∗
(2) T is bounded and invertible with bounded inverse if and only if T ∗ has bounded
inverse, and
(T ∗ )−1 = (T −1 )∗ .
and so
$$\|T^*\| \le \|T\| < \infty. \quad (1.2.1)$$
Since T* : Y* → X*,
$$\|T^*\| \ge \|T^*(g_x)\| = |g_x(T(x))| = \|T(x)\|.$$
Taking the supremum over all x ∈ X gives the reverse direction of (1.2.1), and this
proves (1). For (2), we have
Conversely, suppose T* has a bounded inverse. Then T** has a bounded inverse. Then
$$T^{**}\big|_{X} = T,$$
and for any f ∈ Y* vanishing on R(T),
$$T^*(f)(x) = f(T(x)) = 0,$$
which implies that f = 0. This contradiction implies that T is onto, and hence a
bijection.
then f is clearly linear and bounded on H1 , and by the Riesz representation theorem,
there exists a unique z ∈ H1 such that
$$\langle Tx, y \rangle = \langle x, z \rangle.$$
We define z as T ∗ (y).
Definition 1.2.3 (Adjoint Operator on Hilbert Spaces) Let T : H1 → H2 be a
bounded linear operator between two Hilbert spaces H1 and H2 . The adjoint operator
of T , denoted by T ∗ , is defined to be T ∗ : H2 → H1 given by
$$\langle Tx, y \rangle = \langle x, T^*(y) \rangle.$$
(1) T** = T.
(2) ‖T*T‖ = ‖T‖².
(3) N(T) = R(T*)^⊥ and N(T)^⊥ = $\overline{R(T^*)}$.
(4) N(T*) = R(T)^⊥ and N(T*)^⊥ = $\overline{R(T)}$.
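In finite dimensions the adjoint of a real matrix is its transpose, so the relations in (3) and (4) can be sanity-checked numerically. A minimal sketch (numpy assumed; the rank-1 matrix is an arbitrary illustrative choice):

```python
import numpy as np

# For a real matrix T, T* is the transpose, so N(T) = R(T*)^perp says the
# null space of T is the orthogonal complement of its row space.
T = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, so N(T) is 2-dimensional

U, s, Vt = np.linalg.svd(T)
rank = int(np.sum(s > 1e-12))
row_space = Vt[:rank].T                   # orthonormal basis of R(T*)
null_basis = Vt[rank:].T                  # orthonormal basis of N(T)

# Null vectors are killed by T and are orthogonal to the row space.
assert np.allclose(T @ null_basis, 0.0)
assert np.allclose(row_space.T @ null_basis, 0.0)
```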
Recall from a linear algebra course that a self-adjoint matrix is a matrix that is equal
to its own adjoint. This extends to operators on infinite-dimensional spaces.
Definition 1.2.5 (Self-Adjoint Operator) A bounded linear operator on a Hilbert
space T ∈ B(H) is called self-adjoint if T = T ∗ .
Proposition 1.2.6 Let T ∈ B(H). Then T is self-adjoint if and only if T x, x ∈ R
for all x ∈ H.
Proof If T = T* then
$$\langle Tx, x \rangle = \langle x, T(x) \rangle = \overline{\langle Tx, x \rangle},$$
so ⟨Tx, x⟩ ∈ R. Conversely, if ⟨Tx, x⟩ ∈ R for all x ∈ H, then
$$\langle Tx, x \rangle = \overline{\langle Tx, x \rangle} = \overline{\langle x, T^*x \rangle} = \langle T^*x, x \rangle,$$
hence T = T*.
Every compact linear operator is bounded, thus continuous. One of the basic properties
of compact operators is that composing them with bounded linear operators retains
compactness.
Proposition 1.3.2 Let T be compact, and let S be a bounded linear operator on a normed
space X. Then ST and T S are compact operators.
Proof If {xn } is bounded, then S(xn ) is bounded, hence T (S(xn )) has a convergent
subsequence, so T S is compact. Also, since {xn } is bounded, T (xn ) has a convergent
subsequence T (xn j ), but since S is bounded, it is continuous in norm, so S(T (xn j ))
also converges.
Theorem 1.3.3 Let T ∈ K(X) for a normed space X. If T is invertible with a bounded
inverse, then dim(X) < ∞.
One of the most fundamental and important results in analysis which provides a
compactness criterion is the Arzela–Ascoli theorem.
Theorem 1.3.4 (Arzela–Ascoli Theorem) Let f n ∈ C(K ), for some compact set
K . If the sequence { f n } is bounded and equicontinuous, then it has a uniformly
convergent subsequence.
This implies that {h_n(x)} is Cauchy and thus converges to, say, h(x),
which is continuous by equicontinuity of {f_n}. Now, using a similar argument, we
conclude that {h_n} converges uniformly to h.
Hence (T ∗ f n j ) converges.
For the converse, if T ∗ is compact then so is T ∗∗ . Let J : X −→ X ∗∗ be the
canonical mapping given by
So
$$T^{**}(J(x)) = J(T(x)),$$
and consequently
$$\|T^{**}(J(x_{n_j}))\| = \|T(x_{n_j})\|.$$
Hence T(x_{n_j}) converges due to the convergence of T**(J(x_{n_j})), and this completes the
proof of the other direction.
Example 1.3.6 (Integral Operator) Let X = L^p[a, b], 1 < p < ∞, and let k ∈
C([a, b] × [a, b]) be a mapping from [a, b] × [a, b] to R. Consider the Fredholm
integral operator K : X → X defined by
$$(Ku)(x) = \int_a^b k(x, y)u(y)\,dy.$$
We will show that K is a compact operator for all p > 1. It is easy to see that
$$|Ku| \le \|k\|_\infty \|u\|_p,$$
from which we conclude that K is bounded, and so for a bounded set B ⊂ X, K(B)
is bounded. Now we show that K(B) is equicontinuous. Since u ∈ L^p[a, b], let
$$\int_a^b |u(x)|^p\,dx = \alpha.$$
Moreover, since k is uniformly continuous on [a, b] × [a, b], for every ε > 0 there
exists δ > 0 such that for all x₁, x₂ ∈ [a, b] with |x₁ − x₂| < δ we have
$$|k(x_2, y) - k(x_1, y)| < \frac{\varepsilon}{\alpha}, \quad \forall y \in [a, b].$$
Hence, for all u ∈ B,
$$|(Ku)(x_2) - (Ku)(x_1)| \le \int_a^b |k(x_2, y) - k(x_1, y)|\,|u(y)|\,dy \le \varepsilon.$$
Define
$$K^*v(y) = \int_a^b k(x, y)v(x)\,dx.$$
So the integral operator is an example of a compact operator (in fact the earli-
est example in the literature), and this operator is also self-adjoint if its kernel is
symmetric and square-integrable.
An important question raised is whether K(X, Y ) is closed, in the sense that if
{Tn } is a sequence of compact operators, and Tn → T , is T compact? The following
theorem demonstrates that this property holds if Y is Banach.
Theorem 1.3.7 Let {Tn } be a sequence of compact operators from a normed space
X to a Banach space Y . If {Tn } converges in norm to T , then T : X → Y is compact.
In particular, the set K(X, Y ) is closed.
Proof Let {x_n} ⊂ X be a bounded sequence, so ‖x_n‖ ≤ M for some M > 0 for all
n. Since T₁ is compact, {x_n} has a subsequence {x_n^1} such that T₁(x_n^1) converges in Y.
But {x_n^1} must be bounded, and since T₂ is compact, {x_n^1} has a subsequence {x_n^2} such
that T₂(x_n^2) converges in Y, keeping in mind that T₁(x_n^2) converges as well since
{x_n^2} is a subsequence of {x_n^1}. Proceeding inductively, we obtain for each k a
subsequence {x_n^k} of {x_n^{k−1}} such that T_k(x_n^k) converges. Choose the diagonal sequence
{x_n^n}, that is, the first term of the first sequence {x_n^1}, the second term of the second sequence
{x_n^2}, etc. The proof is done if we can prove that T(x_n^n) converges. Clearly, T_k(x_n^n)
converges, so it is a Cauchy sequence and
$$\|T_N(x_n^n) - T_N(x_m^m)\| < \frac{\varepsilon}{3} \quad (1.3.4)$$
for large n, m > N. On the other hand, since T_n → T, for every ε > 0 there exists
N ∈ N such that for all n ≥ N, we have
$$\|T - T_n\| < \frac{\varepsilon}{3M}. \quad (1.3.5)$$
From (1.3.4) and (1.3.5), we obtain
$$\|T(x_n^n) - T(x_m^m)\| \le \|T(x_n^n) - T_N(x_n^n)\| + \|T_N(x_n^n) - T_N(x_m^m)\| + \|T_N(x_m^m) - T(x_m^m)\|$$
$$\le \|T - T_N\|\,\|x_n^n\| + \frac{\varepsilon}{3} + \|T - T_N\|\,\|x_m^m\| \le \varepsilon.$$
Recall from a linear algebra course that the rank of a linear transformation is defined
as the dimension of its range, and represents the maximum number of linearly
independent vectors in the range space of the operator. The definition extends to
operators on infinite-dimensional spaces, as we shall see next.
Definition 1.3.8 (Finite-Rank Operator) Let X and Y be normed spaces. A bounded
linear operator T : X → Y is said to be of finite rank if its range is a finite-
dimensional subspace, i.e.,
$$r = \dim(T(X)) < \infty.$$
An operator having a finite rank is called a finite-rank operator, or f.r. operator for
short. The rank of an operator T is denoted by r(T). The class of all bounded linear
operators of finite rank is denoted by K₀(X, Y).
Note that if at least one of the spaces X or Y is finite-dimensional, then T is of
finite rank. Note also that if the range is finite-dimensional then every closed bounded
set is compact. Choosing any bounded set in the domain of a f.r. operator, this set
will be mapped to a bounded set in the finite-dimensional range, and so its closure
is compact. Thus:
Proposition 1.3.9 Finite-rank operators are compact operators, i.e.,
K0 (X, Y ) ⊂ K(X, Y ).
Note that the inclusion is proper, i.e., there exist compact operators that are not
f.r. operators. A simple example which is left to the reader to verify is to consider
the sequence space ℓ² and define the operator T : ℓ² → ℓ² as
$$T(x_1, x_2, \ldots) = \left( x_1, \frac{x_2}{2}, \frac{x_3}{3}, \ldots, \frac{x_n}{n}, \ldots \right).$$
The source of the problem here is the range of T, as it is not closed. If it were closed,
then T would be a f.r. operator, as in the next proposition.
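A finite-section sketch (numpy; the cutoff is an arbitrary illustrative choice) makes the compactness of this operator visible: the finite-rank truncations T_n, which keep only the first n coordinates, satisfy ‖T_n − T‖ = 1/(n+1) → 0, even though R(T) itself is infinite-dimensional:

```python
import numpy as np

# Diagonal representation of T(x) = (x1, x2/2, x3/3, ...) on a large
# finite section of l^2 (an illustrative truncation only).
N = 2000
diagonal = 1.0 / np.arange(1, N + 1)

def truncation_error(n):
    # T - T_n is diagonal with entries 0, ..., 0, 1/(n+1), 1/(n+2), ...,
    # so its operator norm is the largest remaining entry, 1/(n+1).
    tail = diagonal.copy()
    tail[:n] = 0.0
    return tail.max()

for n in (10, 100, 1000):
    assert np.isclose(truncation_error(n), 1.0 / (n + 1))
```

Since T is a norm limit of finite-rank operators, Theorem 1.3.7 gives compactness; no truncation has the full range of T, so T itself is not of finite rank.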
Proposition 1.3.10 If T : X −→ Y is compact and R(T ) is closed in the Banach
space Y, then T is of finite rank.
$$y = T(x) = \sum_{j=1}^{n} \langle y, e_j \rangle e_j = \sum_{j=1}^{n} \langle T(x), e_j \rangle e_j. \quad (1.3.6)$$
$$T(x) = \sum_{j=1}^{n} \langle T(x), e_j \rangle e_j = \sum_{j=1}^{n} \langle x, T^*(e_j) \rangle e_j.$$
$$\left\langle \sum_{j=1}^{n} \langle x, \theta_j \rangle e_j, \; y \right\rangle = \left\langle x, \sum_{j=1}^{n} \langle y, e_j \rangle \theta_j \right\rangle.$$
Thus
$$T^*(\cdot) = \sum_{j=1}^{n} \langle \cdot, e_j \rangle \theta_j.$$
Now we discuss another important class of compact operators, which was investigated
by Hilbert and Schmidt in 1907.
Note that the definition requires an orthonormal basis; hence the space is essen-
tially separable. Another thing to observe is that separable Hilbert spaces have more
than one basis, so this raises the question of whether the condition holds for any
orthonormal basis or only for a particular one. The answer is that the condition does not
depend on the basis. First, let us find a convenient form for the norm. Let x_k ∈ H.
Then, T(x_k) can be written as
$$T(x_k) = \sum_j \langle T(x_k), \varphi_j \rangle \varphi_j.$$
So
$$\|T(\varphi_k)\|^2 = \sum_j |\langle T(\varphi_k), \varphi_j \rangle|^2.$$
Now, we show that the condition is independent of the choice of the basis. Let {u k }
be another orthonormal basis for H. By representing T (ϕk ) and T (u k ) by {u k }, it is
easy to see that
$$\sum_k \|T\varphi_k\|^2 = \sum_{j,k} |\langle T(\varphi_k), u_j \rangle|^2 = \sum_{j,k} |\langle T^*(u_j), \varphi_k \rangle|^2 = \sum_j \|T^*(u_j)\|^2. \quad (1.4.1)$$
Similarly,
$$\sum_k \|Tu_k\|^2 = \sum_j \|T^*(u_j)\|^2. \quad (1.4.2)$$
The following basic properties of HS operators follow easily from the above discus-
sion.
Proposition 1.4.2 Let T be a HS operator (i.e., T ∈ K₂(H)). Then
(1) ‖T‖₂ = ‖T*‖₂.
(2) ‖T‖ ≤ ‖T‖₂.
Proof The first assertion follows immediately from (1.4.2). For the second assertion,
let x ∈ H and {φ_n} be an orthonormal basis for H. Then
$$\|Tx\|^2 = \sum_k |\langle Tx, \varphi_k \rangle|^2 = \sum_k |\langle x, T^*\varphi_k \rangle|^2 \le \|x\|^2 \|T\|_2^2,$$
so
$$\|Tx\| \le \|x\|\,\|T\|_2$$
Proof The first inclusion asserts that every f.r. operator is HS. To prove this, let
T ∈ K0 (H). Since
r (T ) = dim(Im(T )) = m,
$$T_n(x) = \sum_{j=1}^{n} \langle x, \varphi_j \rangle T(\varphi_j).$$
Then, Tn is clearly a sequence of finite-rank operators for all n since R(Tn ) is con-
tained in the span of {ϕ1 , ϕ2 , . . . ϕn }. By Proposition 1.4.2(2),
Taking n → ∞ gives
$$\|T_n - T\| \to 0.$$
Note that {Tn } is a sequence of f.r. operators which are compact operators, and the
result follows from Theorem 1.3.7.
The preceding result simply states that every f.r. operator is a HS operator, and
every HS operator is compact. The proper inclusions imply the existence of compact
operators that are not HS, and the existence of HS that are not of finite rank. A useful
conclusion drawn from the preceding theorem is
Corollary 1.4.5 $\overline{K_0(H)} = K_2(H)$ in the ‖·‖₂ (HS) norm; that is, for every T ∈ K₂(H)
there exists a sequence T_n ∈ K₀(H) such that ‖T_n − T‖₂ → 0.
Theorem 1.4.6 Consider the operator
$$T(x) = \sum_n \alpha_n \langle x, \varphi_n \rangle u_n,$$
where (φ_n) and (u_n) are two orthonormal bases in H and (α_n) is a bounded sequence
in the underlying field F, say R. Then
(1) T is a compact operator if and only if lim α_n = 0.
(2) T is a HS operator if and only if $\sum |\alpha_n|^2 < \infty$.
(3) T is of finite rank if and only if there exists N ∈ N such that α_n = 0 for all
n ≥ N.
Hence, the sequence {T(φ_{n_j})} has no convergent subsequence, and this implies that
T cannot be compact. Suppose now that lim α_n = 0. Define the sequence
$$T_n(x) = \sum_{k=1}^{n} \alpha_k \langle x, \varphi_k \rangle u_k.$$
Then each T_n is of finite rank, and using the same argument as in the proof of the
previous theorem we see that
$$\|T_n - T\| \to 0$$
$$R(T) \subseteq \operatorname{span}\{u_1, u_2, \ldots, u_m\}$$
for some m. This means that α_k = 0 for all k ≥ m + 1. On the other hand, if for each
k ∈ N, α_k ≠ 0, then u_k ∈ R(T) for all k; hence
$$\dim(R(T)) = \infty,$$
In light of the preceding theorem, if T is a f.r. operator, then α_n = 0 for all but
finitely many terms, and this implies
$$\sum |\alpha_n|^2 < \infty, \quad (1.4.3)$$
which also implies that α_n → 0. Thus, T is compact. This leads to Theorem 1.4.4.
Moreover, if T is compact, then α_n → 0, but this doesn't necessarily imply (1.4.3),
and so T may not be HS. If (1.4.3) holds, then T is HS, but this doesn't necessarily
mean that α_n = 0 for all but finitely many terms; hence T may not be of finite
rank. It turns out that the results of the last two theorems are fully consistent with
each other, and the last theorem is very helpful for constructing examples of operators
that are compact but not HS, and HS but not of finite rank.
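The characterization makes such examples easy to produce numerically. A sketch (numpy; the two sequences are my own illustrative choices, not from the book): α_n = 1/√n tends to 0, so the corresponding T is compact, but Σ|α_n|² is the divergent harmonic series, so T is not HS; α_n = 1/n gives Σ|α_n|² = π²/6 < ∞, so T is HS, yet no tail of the sequence vanishes, so T is not of finite rank:

```python
import numpy as np

n = np.arange(1, 10**6 + 1)

# alpha_n = 1/sqrt(n): alpha_n -> 0 (compact), but sum |alpha_n|^2 = sum 1/n
# grows like log n (not Hilbert-Schmidt).
compact_not_hs = (1.0 / np.sqrt(n)) ** 2
assert compact_not_hs.sum() > np.log(10**6)

# alpha_n = 1/n: sum |alpha_n|^2 converges to pi^2/6 (Hilbert-Schmidt),
# yet alpha_n != 0 for every n (not of finite rank).
hs_not_fr = (1.0 / n) ** 2
assert abs(hs_not_fr.sum() - np.pi**2 / 6) < 1e-5
```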
The following example is a good application of the preceding theorem.
Example 1.4.7 Let T : ℓ² → ℓ² be given by
$$T(x) = \sum_n \alpha_n x_n \varphi_n,$$
Example 1.3.6 demonstrated the fact that Fredholm integral operators defined on L^p
are compact. In the particular case p = 2, we have the advantage of dealing with
an orthonormal basis for the space, which will allow us to work on HS norms. In
particular, we have the following result.
Theorem 1.4.8 The Fredholm integral operator K on L²([a, b]) defined by
$$(Ku)(x) = \int_a^b k(x, y)u(y)\,dy$$
with square-integrable kernel k is a Hilbert–Schmidt operator.
Proof Let {φ_n} be an orthonormal basis for L²([a, b]). Writing ⟨k, φ_n⟩ for the inner
product of k(x, ·) with φ_n, we have
$$\sum_{n=1}^{\infty} \|K\varphi_n\|^2 = \sum_{n=1}^{\infty} \int_a^b |\langle k, \varphi_n \rangle|^2\,dx = \int_a^b \sum_{n=1}^{\infty} |\langle k, \varphi_n \rangle|^2\,dx \quad \text{(DCT)}$$
$$= \int_a^b \|k\|^2\,dx = \int_a^b \int_a^b |k(x, y)|^2\,dy\,dx = \|k\|_2^2 < \infty.$$
The preceding theorem demonstrates that the Fredholm integral operator defined
on L 2 [a, b] is HS. In fact, the converse is also true. Namely, every HS operator
defined on L 2 [a, b] is an integral operator. This striking result justifies the particular
importance of HS operators as compact operators that behave as integral operators
and could also be self-adjoint if their kernels are symmetric.
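A discretized sketch of this correspondence (numpy; the grid size and the kernel k(x, y) = xy are my own illustrative choices, not from the text). In the orthonormal basis of normalized indicator functions on a uniform grid, the matrix of K is approximately (k(x_i, y_j)·h), so its Frobenius norm, the discrete HS norm, should approximate ‖k‖₂:

```python
import numpy as np

# Discretize K on [0, 1] with the illustrative kernel k(x, y) = x * y.
m = 400
h = 1.0 / m
x = (np.arange(m) + 0.5) * h            # midpoints of the grid cells
kernel = np.outer(x, x)                  # k(x_i, y_j) = x_i * y_j

# <phi_i, K phi_j> ~ k(x_i, y_j) * h in the basis of normalized indicators,
# so the squared HS norm is the squared Frobenius norm of that matrix.
hs_norm_sq = np.sum((kernel * h) ** 2)

# ||k||_2^2 = int_0^1 int_0^1 (x*y)^2 dx dy = (1/3) * (1/3) = 1/9.
assert abs(hs_norm_sq - 1.0 / 9.0) < 1e-3
```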
The next theorem demonstrates that every HS operator from L² to L² is identified
with an integral operator with kernel k ∈ L².
Theorem 1.4.9 Every Hilbert–Schmidt operator defined on X = L 2 ([a, b]) is an
integral operator with square-integrable kernel.
Proof Consider a Hilbert–Schmidt operator K : X → X and let {φ_n} be an
orthonormal basis for X. So
$$\sum_{n=1}^{\infty} \|K\varphi_n\|^2 < \infty,$$
hence the series $\sum_{n=1}^{\infty} \varphi_n(y)(K\varphi_n)(x)$ converges in L²([a, b] × [a, b]). Now for u ∈ X we have
$$u = \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle \varphi_n.$$
It follows that
$$(Ku)(x) = K\left( \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle \varphi_n \right)(x) = \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle (K\varphi_n)(x)$$
$$= \sum_{n=1}^{\infty} \left( \int_a^b u(y)\varphi_n(y)\,dy \right)(K\varphi_n)(x) = \int_a^b u(y) \left( \sum_{n=1}^{\infty} \varphi_n(y)(K\varphi_n)(x) \right) dy,$$
where we used the Dominated Convergence Theorem (DCT) in the last step. Now,
define
$$k(x, y) = \sum_{n=1}^{\infty} \varphi_n(y)(K\varphi_n)(x).$$
Then k is clearly a mapping from [a, b] × [a, b] to R and k ∈ L²([a, b] × [a, b]).
Therefore, the HS operator K can be written as
$$(Ku)(x) = \int_a^b k(x, y)u(y)\,dy.$$
We end the section with the following observation: It was shown in Theorem 1.4.6(2)
that the operator
$$T(x) = \sum_n \alpha_n \langle x, \varphi_n \rangle u_n$$
is HS if and only if $\sum |\alpha_n|^2 < \infty$. Note here that if u_k = φ_k then we obtain
$$T(\varphi_k) = \alpha_k \varphi_k.$$
It turns out that the sequence (α_k) is nothing but the eigenvalues of T. These
form a countable (possibly finite) set of eigenvalues. This motivates us to investigate
the spectral properties of operators to elaborate more on the eigenvalues and
eigenvectors of HS and compact operators. Before we start this investigation, we
would like to obtain one final result in this section. The preceding theorem shows
that every HS operator on L² is an integral operator with a square-integrable kernel.
We will combine this result with the preceding theorem to show that the scalar sum
$\sum |\alpha_n|^2$ is, in fact, the square of the HS norm of the operator, so the result that T is HS iff this
sum is finite comes as no surprise.
Theorem 1.4.10 Let T ∈ K₂ be a HS operator and let (λ_n) be the eigenvalues of T.
If the kernel k ∈ L²([a, b] × [a, b]) of T is symmetric, then
$$\|k\|_2^2 = \sum_{n=1}^{\infty} |\lambda_n|^2.$$
Proof Since L² is a Hilbert space, let (φ_n) be an orthonormal basis of L²([a, b])
consisting of the corresponding eigenvectors for (λ_n). Define the set of functions
ψ_{nm}(x, y) = φ_n(x)φ_m(y).
Then it can be shown that (ψ_{nm}) is an orthonormal basis of L²([a, b] × [a, b]) (see
Problem 1.11.32). Since k ∈ L²([a, b] × [a, b]),
$$k(x, y) = \sum_n \sum_m \langle k, \psi_{nm} \rangle \psi_{nm},$$
so
$$\|k\|_2^2 = \sum_n \sum_m |\langle k, \psi_{nm} \rangle|^2 = \sum_{n=1}^{\infty} |\lambda_n|^2.$$
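The finite-dimensional analogue of this identity says that for a symmetric matrix (a symmetric "kernel"), the squared Frobenius norm, i.e. the HS norm, equals the sum of the squared eigenvalues. A quick numerical check (numpy; the random matrix is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
K = (A + A.T) / 2.0                        # a symmetric "kernel"

# Squared Frobenius (HS) norm vs. sum of squared eigenvalues.
eigenvalues = np.linalg.eigvalsh(K)
assert np.isclose(np.sum(K**2), np.sum(eigenvalues**2))
```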
The study of eigenvalues in functional analysis was begun by Hilbert in 1904 during
his investigations of quadratic forms in infinitely many variables. In 1904, Hilbert
used the terms "eigenfunction" and "eigenvalue" for the first time, and he called this
new direction of research "spectral analysis". Hilbert's early theory led to the study
of infinite systems of linear equations, and mathematicians like Schmidt and Riesz,
and then John von Neumann, were among the first to pursue this direction. Finite
systems of linear equations were investigated during the eighteenth and nineteenth
centuries and were based centrally on the notions of matrices and determinants. Then
the problem of solving integral equations emerged at the beginning of the twentieth
century. It turned out that the problem of solving an integral or differential equation
could be boiled down to solving linear systems with infinitely many unknowns. Spectral
theory deals with eigenfunctions and eigenvalues of operators in infinite-dimensional
spaces and the conditions under which operators can be expressed in terms of their eigen-
values and eigenfunctions, which helps in solving integral and differential equations
by expanding the solution as a series of the eigenfunctions. Extending matrices to
infinite dimensions leads to the notion of an operator, but many fundamental and crucial
properties of matrices are lost upon that extension.
We begin our discussion by the following definition, which is analogous to the finite-
dimensional case.
Definition 1.5.1 (Eigenvalue, Eigenfunction) Let X be a normed space and T ∈
B(X). Then, a constant λ ∈ C is called an eigenvalue of T if there exists a nonzero vector
x ∈ X such that Tx = λx. The element x is called an eigenvector, or
eigenfunction, of T corresponding to λ.
Notice the following:
(1) The concept of eigenvalue and eigenfunction has been defined for bounded linear
operators, but it can also be defined for unbounded operators.
(2) We always exclude the case x = 0, since x = 0 satisfies Tx = λx for every scalar λ and thus leads to no useful information.
The next proposition gives two basic properties of self-adjoint operators: properties
related to the eigenvalues and the norm of the self-adjoint operator.
Proposition 1.5.2 Let T ∈ B(H) be a self-adjoint operator. Then
(1) All eigenvalues {λi } of T are real numbers, and their corresponding eigenvectors
are orthogonal.
$$\lambda \|u\|^2 = \langle Tu, u \rangle = \langle u, Tu \rangle = \bar{\lambda}\|u\|^2,$$
$$Tv = \mu v,$$
$$\langle Tu, v \rangle = \langle u, Tv \rangle.$$
Then
$$0 = \langle Tu, v \rangle - \langle u, Tv \rangle = \lambda \langle u, v \rangle - \mu \langle u, v \rangle = (\lambda - \mu)\langle u, v \rangle.$$
Then, clearly,
$$|\langle Tx, x \rangle| \le \|T\|\,\|x\|^2,$$
so
$$M \le \|T\|.$$
On the other hand, choosing x, y ∈ B_X, and using the Polarization Identity and then
the Parallelogram Law for the inner product,
$$\mathrm{Re}\langle Tx, y \rangle \le \frac{1}{4}\big[ |\langle T(x+y), x+y \rangle| + |\langle T(x-y), x-y \rangle| \big]$$
$$\le \frac{1}{4} M \big[ \|x+y\|^2 + \|x-y\|^2 \big] = \frac{1}{2} M \big[ \|x\|^2 + \|y\|^2 \big] \le M.$$
Choosing a suitable scalar c with |c| = 1, we get
$$|\langle Tx, y \rangle| = \mathrm{Re}\langle Tx, \bar{c}y \rangle \le M.$$
Taking the supremum over all y ∈ B_X, then the supremum over all x ∈ B_X, gives
$$\|T\| \le M.$$
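For a symmetric matrix the identity just proved, ‖T‖ = sup over unit x of |⟨Tx, x⟩|, can be observed directly; both sides equal the largest absolute eigenvalue. A numerical sketch (numpy; the matrix is an arbitrary illustrative choice):

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [1.0, 3.0]])               # symmetric, hence self-adjoint

operator_norm = np.linalg.norm(T, 2)
eigs = np.linalg.eigvalsh(T)

# Sample the quadratic form |<Tx, x>| over unit vectors on the circle.
thetas = np.linspace(0.0, 2.0 * np.pi, 100000)
X = np.stack([np.cos(thetas), np.sin(thetas)])
quad_form_sup = np.abs(np.einsum('in,ij,jn->n', X, T, X)).max()

assert np.isclose(operator_norm, np.abs(eigs).max())
assert abs(quad_form_sup - operator_norm) < 1e-6
```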
Let T* = T ∈ K(H). Choose a sequence of unit vectors (x_n) such that
$$|\langle Tx_n, x_n \rangle| \to \|T\|,$$
and, passing to a subsequence, assume
$$\langle Tx_n, x_n \rangle \to \lambda \in \mathbb{R}.$$
Then, since ‖Tx_n‖ ≤ ‖T‖ = |λ|,
$$\|Tx_n - \lambda x_n\|^2 = \|Tx_n\|^2 + \lambda^2 - 2\lambda \langle Tx_n, x_n \rangle \le 2\lambda^2 - 2\lambda \langle Tx_n, x_n \rangle \to 0.$$
$$Tx_{n_k} \to z. \quad (1.5.1)$$
Since Tx_{n_k} − λx_{n_k} → 0, we get λx_{n_k} → z, and applying T gives
$$\lambda T(x_{n_k}) \to Tz,$$
or
$$Tx_{n_k} \to \frac{1}{\lambda} Tz. \quad (1.5.2)$$
Then from (1.5.1) and (1.5.2), we obtain Tz = λz. It remains to show that z ≠ 0.
Note that ‖T‖ > 0 and ‖Tx_{n_k}‖ → |λ| = ‖T‖.
Taking the limit of both sides, and using continuity of the norm, gives
$$\|z\| > 0.$$
An immediate consequence is
Corollary 1.5.5 If T ∗ = T ∈ K(H), then T has at least one nonzero eigenvalue.
This contrasts with the finite-dimensional case, where matrices may
have no eigenvalues if the underlying field is R, since the characteristic polynomial
may have no roots in R. It is well-known from linear algebra that the set of
eigenvalues of a matrix is finite. Since they are simply the roots of the characteristic
polynomial, there can be at most n eigenvalues for an n × n matrix. As mentioned
at the beginning of this section, things change when we turn to operators on infinite-
dimensional spaces. The next example illustrates the idea.
Example 1.5.6 Consider the left-shift operator on ℓ^p, 1 ≤ p < ∞,
T(x1, x2, x3, …) = (x2, x3, …).
The equation T x = λx gives xn+1 = λxn for all n, hence
xn+1 = λⁿ x1,
so every eigenvector is a multiple of x = (1, λ, λ², …), and this element belongs to ℓ^p if and only if |λ| < 1. Hence, the set of eigenvalues
of T is the open unit disk {λ ∈ C : |λ| < 1}.
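A quick numerical sketch of the example (illustrative, not from the text): truncating the geometric eigenvector x = (1, λ, λ², …) shows that T x = λx holds coordinatewise and that the ℓ^p norm is finite when |λ| < 1.

```python
import numpy as np

def left_shift(x):
    # T(x1, x2, x3, ...) = (x2, x3, ...)
    return x[1:]

lam = 0.5
N = 200
x = lam ** np.arange(N)          # x = (1, lam, lam^2, ...), truncated

Tx = left_shift(x)
# T x = lam * x on the surviving coordinates
assert np.allclose(Tx, lam * x[:-1])

# The eigenvector lies in l^p only when |lam| < 1: its p-norm is finite.
p = 2
norm_p = (np.abs(x) ** p).sum() ** (1 / p)   # -> (1/(1 - lam^2))^(1/2)
assert norm_p < 2.0
```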
We have seen one similarity that compact operators share with matrices: the discrete
set of eigenvalues (finite or countable). Now we exhibit another property regarding
the eigenvalues. Recall from linear algebra that a scalar λ ∈ C is called an eigenvalue
of a linear mapping T if (T − λI ) is singular, i.e., if
det(T − λI ) = 0.
(i.e., Tλ = T − λI is a bijection with a bounded inverse operator). The set of all regular values
of T is called the resolvent set, and is denoted by ρ(T ).
The definition is stated for bounded linear operators on Banach spaces, but it
can always be extended to unbounded operators on normed spaces. Furthermore, every bounded bijective operator between Banach spaces has a
bounded inverse, thanks to the OMT. So the condition of the bounded inverse is automatically satisfied in the case of Banach spaces. If a scalar λ ∉ ρ(T), then one of
the following holds:
(1) The scalar λ is an eigenvalue, so that Tλ has no inverse, i.e., ker(Tλ) ≠ {0}, or
(2) λ is not an eigenvalue, i.e., ker(Tλ) = {0} and Tλ has an inverse, but λ is not a
regular value. By the OMT, Tλ must be nonsurjective. This is due to one of two
reasons:
(a) R(Tλ) is dense in X but Tλ⁻¹ is unbounded.
(b) R(Tλ) is not dense in X.
Hence, there can be more than one reason for a scalar not to be regular. All these
values are called spectral values, and the set containing all of them is called the spectrum.
Definition 1.6.2 (Spectrum) Let T be an operator. The set consisting of all scalars
that are not in the resolvent set ρ(T) is called the spectrum of T, and is denoted by
σ(T). It consists of three sets: the point spectrum σp(T), consisting of all eigenvalues
of T; the continuous spectrum σc(T), consisting of all scalars λ for which R(Tλ) is dense in X
and Tλ⁻¹ is unbounded; and the residual spectrum σr(T), consisting of all scalars λ
for which R(Tλ) is not dense in X.
As an immediate consequence of the two preceding definitions, we have the
following formula:
σ p (T ) ∪ σc (T ) ∪ σr (T ) = σ(T ) = C \ ρ(T ).
Since T is invertible,
‖T⁻¹‖ = M < ∞,
and for every x ∈ X,
‖x‖ = ‖T⁻¹T x‖ ≤ M‖T x‖,
so that
c‖x‖ ≤ ‖T x‖
for c = 1/M. Notice how the operator T is bounded from below. This suggests the
following definition:
Definition 1.6.3 (Bounded Below) An operator T is said to be bounded below if for
some c > 0, we have
c‖x‖ ≤ ‖T x‖
for all x ∈ X.
The previous argument shows that if an operator is invertible, i.e., bijection with
bounded inverse, then it is bounded below. For the converse, we have the following
proposition.
Proposition 1.6.4 Let T ∈ B(X ) for a Banach space X. Then, T is bounded below
if and only if T is injective and R(T ) is closed.
c‖xn − xm‖ ≤ ‖T xn − T xm‖,
‖T x‖_R(T) = ‖T x‖_Y.
R(T) = R̄(T) = X.
Example 1.6.6 In Example 1.5.6, the point spectrum of the left-shift operator on
ℓ^p, 1 ≤ p < ∞,
T(x1, x2, x3, …) = (x2, x3, …),
was found to be
σp(T) = {λ ∈ C : |λ| < 1}.
Consequently, since every λ with
|λ| > ‖T‖ = 1
is a regular value, we obtain
σ(T) ⊆ {λ ∈ C : |λ| ≤ 1}.
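The inclusion σ(T) ⊆ {λ ∈ C : |λ| ≤ ‖T‖} can be observed numerically in finite dimensions, where the spectrum is exactly the set of matrix eigenvalues. A small illustrative check (names are assumptions of the sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))

op_norm = np.linalg.norm(A, 2)       # ||A||
eigvals = np.linalg.eigvals(A)

# Every eigenvalue (hence every spectral value) lies in the
# closed disk of radius ||A||.
assert np.all(np.abs(eigvals) <= op_norm + 1e-10)
```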
The next proposition provides information about the structure of the spectrum.
Proposition 1.6.7 Let T ∈ B(X ) for a Banach space X.
(1) If ‖T‖ < 1, then T − I is invertible.
(2) If ‖T‖ < |λ|, then λ ∈ ρ(T).
Proof For (1), note that since ‖T^{n+1}‖ ≤ ‖T‖^{n+1} → 0,
lim_n Σ_{k=0}^{n} T^k (I − T) = lim_n (I − T) Σ_{k=0}^{n} T^k = lim_n (I − T^{n+1}) = I.
So
(I − T)⁻¹ = Σ_{k=0}^{∞} T^k. (1.6.2)
For (2), we set S = T/λ. Then, by assumption, ‖S‖ < 1, and so by (1), S − I is
invertible, and so is T − λI.
The series in (1.6.2) is called the Neumann series.
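The Neumann series can be verified numerically for a matrix with ‖T‖ < 1: the partial sums of Σ T^k converge to (I − T)⁻¹. An illustrative sketch (the rescaling factor 0.4 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((5, 5))
T *= 0.4 / np.linalg.norm(T, 2)      # rescale so that ||T|| = 0.4 < 1

I = np.eye(5)
inverse = np.linalg.inv(I - T)

# Partial sums S_n = I + T + T^2 + ... + T^n of the Neumann series
S, power = np.eye(5), np.eye(5)
for _ in range(60):
    power = power @ T
    S += power

assert np.allclose(S, inverse, atol=1e-10)
```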
Proposition 1.6.8 For any operator T ∈ B(X ), the spectrum σ(T) is compact.
Proof Let μ ∈ ρ(T ), so T − μI is invertible. Then for z ∈ C, we can write
T − zI = (T − μI)[I − (z − μ)(T − μI)⁻¹].
If
‖(z − μ)(T − μI)⁻¹‖ = |z − μ| ‖(T − μI)⁻¹‖ < 1,
then the bracketed operator is invertible by Proposition 1.6.7(1), hence z ∈ ρ(T). Thus ρ(T) is open and σ(T) is closed. Moreover, by Proposition 1.6.7(2), every λ ∈ σ(T) satisfies
|λ| ≤ ‖T‖ < ∞,
so σ(T) is bounded, and therefore compact.
p(z) − p(λ) = c(z − λ1) ⋯ (z − λn),
where λ is one of the roots λj. Then, p(λ) ∉ σ(p(T)) iff p(T) − p(λ)I is invertible iff (T − λjI) is invertible for all j =
1, …, n iff λj ∉ σ(T) for all j, which holds iff λ ∉ σ(T). The details are
left to the reader as an exercise (see Problem 1.11.42).
In the case of compact operators, things change. Again, compact operators prove
to be the natural generalization of matrices, since they retain spectral properties of linear
maps on finite-dimensional spaces.
Proposition 1.6.11 Let T ∈ K(X ) for some Banach space X. If dim(X ) = ∞, then
0 ∈ σ(T ).
Proof If 0 ∉ σ(T), then T is invertible, and if T is invertible and compact, then I = T⁻¹T
is compact by Proposition 1.3.2, and therefore by Theorem 1.3.3, X must be finite-dimensional, a contradiction.
Theorem 1.7.1 Let T ∗ = T ∈ K(H). Then the set of all eigenvalues {λn } of T is at
most countable, and λn → 0.
Proof If the set of all eigenvalues is finite, we are done. Suppose it is infinite and let
ε > 0. We claim that the set
S_ε = {λ : |λ| ≥ ε}
is finite. If not, then we can construct a sequence {λn } with corresponding (orthonor-
mal) eigenvectors {ϕn } such that
ϕi , ϕ j = δi j
The result reveals one of the most important spectral properties characterizing
compact operators. The situation in the case of compact operators is very similar to
that of linear maps on finite-dimensional spaces, in the sense that both have a discrete (finite
or countable) set of eigenvalues. What about eigenvectors? No information on the
behavior of these eigenvalues or their corresponding eigenvectors is known yet.
We will show that they also retain some features from the finite-dimensional
case. Before that, we need to introduce the concept of invariant subspaces.
T_Y = T|_Y : Y −→ Y, T_Y(x) = T(x) ∀x ∈ Y.
The new mapping is well-defined, but note that it is not necessarily surjective. A
trivial restriction can be made by choosing Y = {0}, so that T_Y = 0. Invariant subspaces
are helpful in reducing operators to simpler operators acting on invariant
subspaces. The following result will be very helpful in proving the main theorem of
this section.
Proposition 1.7.3 Let T ∈ B(H), and let Y be a closed subspace of H, that is,
T −invariant. Then
(1) If T is self-adjoint, then TY is self-adjoint.
(2) If T is compact, then TY is compact.
Proof Let T_Y : Y −→ Y, T_Y(x) = T(x). Note that since Y is closed in H, Y is a
Hilbert space. Suppose T = T ∗ and let x, z ∈ Y. Then
⟨T_Y x, z⟩_Y = ⟨T_Y x, z⟩ = ⟨T x, z⟩ = ⟨x, T z⟩ = ⟨x, T_Y z⟩ = ⟨x, T_Y z⟩_Y.
For (2), note that for yn ∈ Y,
‖yn‖ = ‖yn‖_Y.
Now, we come to a major result in the spectral theory of operators. The following
theorem provides these missing pieces of information about eigenfunctions.
Theorem 1.7.4 (Hilbert–Schmidt Theorem) If T ∗ = T ∈ K(H) (i.e., compact and
self-adjoint), then its eigenfunctions {ϕn} form an orthonormal basis for R̄(T), and
their corresponding eigenvalues behave as
|λ1| ≥ |λ2| ≥ ⋯, with λn → 0.
Moreover, for x ∈ H,
T(x) = Σ_{j=1}^{∞} λj ⟨x, ϕj⟩ ϕj.
H1 = (span{ϕ1})^⊥.
⟨T x, ϕ1⟩ = ⟨x, T ϕ1⟩ = λ1⟨x, ϕ1⟩ = 0.
H2 = (span{ϕ1, ϕ2})^⊥.
T2 = T_{H2} : H2 −→ H2
It is clear that
for all n ≥ 1. If the process stops at some n = N < ∞, such that T_{H_{N+1}} = 0, then for every
x ∈ H,
T(x) = Σ_{j=1}^{N} λj ⟨x, ϕj⟩ ϕj,
If the process doesn’t stop at any N < ∞, we continue the process and we get a
sequence
and since
Σ_{j=1}^{n} ⟨x, ϕj⟩ ϕj ∈ span{ϕ1, ϕ2, …, ϕn},
we have
(x − Σ_{j=1}^{n} ⟨x, ϕj⟩ ϕj) ∈ Hn.
Now, let
zn = x − Σ_{j=1}^{n} ⟨x, ϕj⟩ ϕj.
But
T(x − Σ_{j=1}^{n} ⟨x, ϕj⟩ ϕj) = T(x) − Σ_{j=1}^{n} ⟨x, ϕj⟩ T(ϕj) = T(x) − Σ_{j=1}^{n} λj ⟨x, ϕj⟩ ϕj.
This shows that the range space of T is spanned by the eigenvectors {ϕn }. That is,
letting
M = Span{ϕn }n∈N ,
R(T ) ⊆ M.
Then, we have
x = Σ_{j=1}^{∞} (αj/λj) ϕj ∈ H,
and
T(x) = T(Σ_{j=1}^{∞} (αj/λj) ϕj)
= Σ_{j=1}^{∞} αj (T(ϕj)/λj)
= Σ_{j=1}^{∞} αj ϕj
= y.
The operator form
T(x) = Σ_j λj ⟨x, ϕj⟩ ϕj,
which was concluded at the end of the proof, is called the diagonal operator with
entries {λn} on the diagonal:
diag(λ1, λ2, …).
It turns out that compact self-adjoint operators defined on Hilbert spaces can be
unitarily diagonalized, a property which is similar to those of finite-dimensional
linear mappings.
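In finite dimensions, unitary diagonalization is exactly the eigendecomposition of a symmetric matrix, and the spectral representation A x = Σ λj⟨x, ϕj⟩ϕj can be checked directly. An illustrative sketch, not from the text:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
A = (A + A.T) / 2                    # self-adjoint (symmetric)

lam, Phi = np.linalg.eigh(A)         # columns of Phi: orthonormal eigenvectors

x = rng.standard_normal(5)
# Spectral representation: A x = sum_j lam_j <x, phi_j> phi_j
Ax_spectral = sum(l * (x @ phi) * phi for l, phi in zip(lam, Phi.T))
assert np.allclose(Ax_spectral, A @ x)

# Equivalently, A is unitarily diagonalized: A = Phi diag(lam) Phi^T
assert np.allclose(Phi @ np.diag(lam) @ Phi.T, A)
```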
The next theorem is a continuation of the previous theorem in case the space H is
separable. This will extend the eigenvectors to cover the whole space and a complete
representation will be given.
Theorem 1.7.5 (Spectral Theorem For Self-adjoint Compact Operators)
Let T ∗ = T ∈ K(H) (i.e., compact and self-adjoint) on a Hilbert space H. Then, its
eigenfunctions form an orthonormal basis for the space H. This orthonormal basis
is countable if H is separable.
Proof Let
H = R̄(T) ⊕ N(T),
where {en } are the orthonormal basis for R(T ) that was constructed in the proof of
the Hilbert–Schmidt theorem. Note that Proposition 1.5.2(1) implies that
⟨en, ϕm⟩ = 0
This set is not necessarily countable. We proceed with the same argument for the
separable case.
The spectral theorem in its two parts shows that a compact self-adjoint operator
T ∈ K(H) can be written as
T(x) = Σ_j λj ⟨x, ϕj⟩ ϕj,
where {ϕn} is an orthonormal basis for H and {λn} is the set of eigenvalues,
which is either finite or countable, with |λn| decreasing and λn → 0. The next theorem
shows that the converse of the spectral theorem holds as well.
Theorem 1.7.6 Let T ∈ B(H) be a bounded linear operator on a Hilbert space H
such that for every x ∈ H,
T(x) = Σ_j λj ⟨x, ϕj⟩ ϕj,
where {ϕn} is an orthonormal basis for H and {λn} is a set of scalars,
either finite or countable, with |λn| decreasing and λn → 0. Then T is a compact
self-adjoint operator.
Proof If dim(H) < ∞ and the system {λn , ϕn } is finite then T is of finite rank and
thus it is compact. If not then we define the following operators:
Tn(x) = Σ_{j=1}^{n} λj ⟨x, ϕj⟩ ϕj.
⟨T x, y⟩ = ⟨Σ_j λj ⟨x, ϕj⟩ ϕj, y⟩
= Σ_j λj ⟨x, ϕj⟩⟨ϕj, y⟩
= ⟨x, Σ_j λj ⟨y, ϕj⟩ ϕj⟩
= ⟨x, T y⟩.
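For a diagonal operator with entries λn → 0, the operator-norm error of the finite-rank truncation Tn is sup_{j>n} |λj|, which tends to 0; this is the mechanism behind the compactness argument. A small numerical illustration, assuming the sample sequence λn = 1/n:

```python
import numpy as np

n_max = 200
lam = 1.0 / np.arange(1, n_max + 1)      # eigenvalues 1, 1/2, 1/3, ... -> 0

def trunc_error(n):
    # ||T - T_n|| for the diagonal operator T e_j = lam_j e_j
    # equals the supremum of |lam_j| over the tail j > n.
    return np.max(lam[n:])

errors = [trunc_error(n) for n in (10, 50, 100)]
assert errors[0] > errors[1] > errors[2]       # error decreases to 0
assert np.isclose(errors[2], lam[100])         # tail sup = next eigenvalue
```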
λxnj = T xnj − (T − λI)(xnj) → y − 0 = y,
which implies
xnj → y/λ.
By continuity of T and the fact that ‖xn‖ = 1, we get T y = λy, with ‖y‖ = |λ| ≠ 0.
Proof The first part of the conclusion follows from Proposition 1.6.4. For the second
part, it can be readily seen from the previous proposition that Tλ|_Y is also bounded below,
and the claim follows by applying Proposition 1.6.4 again.
Y = Ȳ ⊊ X.
X ⊃ Y1 ⊃ Y2 ⊃ ⋯,
with each Yn being closed. Let 0 < γ < 1. By the Riesz lemma, there exists yn ∈
Yn \ Yn+1 such that ‖yn‖ = 1 and d(yn, Yn+1) ≥ γ. For m > n, we have Ym+1 ⊊ Ym ⊆ Yn+1.
It follows that
T yn − T ym = λyn − ((λI − T)yn + T ym),
where the term in parentheses belongs to Yn+1. This gives
‖T yn − T ym‖ ≥ |λ| d(yn, Yn+1) ≥ |λ| γ > 0,
so {T yn} has no convergent subsequence, contradicting the compactness of T.
Now, the combination of Propositions 1.6.4 and 1.8.1, and Theorem 1.8.3 implies
the following.
Corollary 1.8.4 Let T ∈ K(X ) for a Banach space X , and let 0 = λ ∈ C be a
complex number. Then, Tλ = T − λI is injective iff Tλ is invertible.
Using the notion of a compact operator, we have been able to find an analog of the
result in the finite-dimensional case, which states that invertibility and injectivity are
equivalent for linear maps. The previous corollary implies that if
0 ≠ λ ∉ σp(T)
for some compact T, then λ ∈ ρ(T). This also means that for a compact operator,
every nonzero spectral value is an eigenvalue,
or in other words,
σ(T) = σp(T) ∪ {0},
since 0 ∈ σ(T) whenever X is infinite-dimensional (Proposition 1.6.11).
This leads to a remarkable result, commonly called the Fredholm Alternative, which
states the following.
Theorem 1.8.5 (Fredholm Alternative) Let T ∈ K(X ) for a Banach space X , and
let 0 = λ ∈ C be a complex number. Then, we have one of the following:
(1) Tλ is noninjective, i.e., N(T − λI) ≠ {0}, or
(2) Tλ is invertible, i.e., T − λI has bounded inverse.
The operator Tλ satisfying this Fredholm Alternative principle is called a
Fredholm operator. The two statements mean precisely the following: either the
equation
T(u) − λu = 0 (1.8.3)
has a nonzero solution u, or the equation
T(u) − λu = v (1.8.4)
has a unique solution u for every v ∈ X.
In the language of integral equations, we can also say: either the equation
λ f(x) = ∫_a^b k(x, t) f(t)dt
has a nonzero solution f, or the equation
λ f(x) − ∫_a^b k(x, t) f(t)dt = g(x)
has a unique solution f for every function g, keeping in mind that the integral operator
is a Hilbert–Schmidt operator, which is a compact operator. We state the Fredholm
Alternative theorem for Fredholm integral equations.
Theorem 1.8.6 (Fredholm Alternative for Fredholm Equations) Either the equation
∫_a^b k(x, y)u(y)dy − λu(x) = f(x)
has a unique solution u for every f, or the corresponding homogeneous equation (with f = 0) has a nonzero solution.
Consider the Volterra operator
(V u)(x) = ∫_0^x k(x, y)u(y)dy,
where V is defined on L^p([0, 1]), 1 < p < ∞, and k ∈ C([0, 1] × [0, 1]). Then for
all λ ≠ 0, there exists a unique solution for the equation
(λI − V)u = f
for 0 < x ≤ 1. Let M > 0 be such that |k| < M on [0, 1] × [0, 1], and suppose u solves the homogeneous equation λu = V u, with
‖u‖_∞ = α.
Then, since |λu(y)| ≤ αM y ≤ αM for each y,
|λ|² |u(x)| ≤ |∫_0^x k(x, y)(λu(y))dy|
≤ ∫_0^x |k(x, y)| αM dy
≤ αM²x
≤ αM².
Iterating this n times,
|λ|ⁿ |u(x)| ≤ Mⁿ α (1/(n − 1)!),
and letting n → ∞ forces u ≡ 0; hence λI − V is injective, and the claim follows from the Fredholm Alternative.
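The unique solvability of (λI − V)u = f for every λ ≠ 0 can be illustrated by discretizing the Volterra operator. For the sketch we assume the simple kernel k ≡ 1 and a midpoint-rule discretization; the resulting matrix is lower triangular, so λI − V is invertible:

```python
import numpy as np

n = 400
h = 1.0 / n
x = (np.arange(n) + 0.5) * h                 # midpoint grid on [0, 1]

# Discretized Volterra operator with kernel k(x, y) = 1:
# (V u)(x_i) ~ h * sum_{j <= i} u(x_j)  -> lower-triangular matrix
V = np.tril(np.ones((n, n))) * h

lam = 0.3
f = np.sin(np.pi * x)
u = np.linalg.solve(lam * np.eye(n) - V, f)  # unique solution for lam != 0

assert np.allclose((lam * np.eye(n) - V) @ u, f)

# The homogeneous equation lam*u = V u has only the zero solution:
u0 = np.linalg.solve(lam * np.eye(n) - V, np.zeros(n))
assert np.allclose(u0, 0.0)
```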
1.9.1 Introduction
All the operators studied so far fall under the class of bounded linear
operators. In this section, we will investigate operators that are unbounded. The
theory of unbounded operators is an important aspect of applied functional analysis,
since some important operators encountered in
applied mathematics and physics, such as the differential operators, are unbounded, so developing a theory that treats
these operators is of utmost importance. The following theorems are central in the
treatment of bounded linear operators:
is a closed subspace of X × Y.
Every continuous linear operator with closed domain is
closed, but the converse is not necessarily true. The only type of discontinuity that
may occur for closed linear operators is the (infinite) essential discontinuity. Loosely
speaking, if the domain of the operator is complete, then x ∈ D(T), i.e., T(x) = y ∈
Im T, and this forces the operator to be bounded, since otherwise there would be a
convergent sequence xn −→ x ∈ D(T) such that ‖T(xn)‖ −→ ∞, and this would break
the closedness of the graph of T. It turns out that closed operators retain some
of the properties of continuous operators.
Theorem 1.9.2 (Closed Range Theorem) Let T : X −→ Y be linear closed opera-
tor between Banach spaces X and Y. If T is bounded below, then R(T ) is closed in
Y.
Proof The proof is similar to that of Proposition 1.6.4. Let yn ∈ R(T) with yn −→ y ∈ Y, and let xn ∈
D(T) be such that T xn = yn. Then {T xn} is Cauchy. It follows that
c‖xn − xm‖ ≤ ‖T xn − T xm‖,
hence {xn} is Cauchy, and by completeness it converges to, say, x ∈ X. Since T
is closed,
y = T(x) ∈ R(T).
so if G(T ) is a closed subspace, then so is G(T −1 ) since the same argument for G(T )
can be made for G(T −1 ) with
T (xn ) = yn
T −1 (yn ) = xn −→ x.
This is a very interesting property of closed operators: a bijective closed linear
operator between Banach spaces has a closed, and in fact bounded, inverse even if the operator itself
is unbounded. It also shows that the solution u of the equation Lu = f for a closed
bijective operator L is controlled and bounded by f. Indeed, if Lu = f for some
f ∈ R(L), then
‖u‖ = ‖L⁻¹ f‖ ≤ ‖L⁻¹‖ ‖f‖.
The sum and product of operators can be defined the same way as for the bounded
case.
Definition 1.9.4 Let T and S be two operators on X. Then
(1) D(T + S) = D(T ) ∩ D(S).
(2) D(ST ) = {x ∈ D(T ) : T (x) ∈ D(S)}.
(3) T = S if D(T ) = D(S) and T x = Sx for all x ∈ D(T ).
(4) S ⊂ T if D(S) ⊂ D(T ) and T |D(S) = S. In this case, T is said to be an extension
of S.
Note that from (2) and (3) above, we have in general T S ≠ ST, and they are equal
only if
D(T S) = D(ST).
Furthermore, if
L : D(L) ⊂ X −→ X,
‖xn − x‖ → 0
and
‖T xn − (λxn + y)‖ → 0.
Then, since λxn + y → λx + y,
T xn −→ λx + y.
Since T is closed,
T x = λx + y,
or
T x − λx = Tλ(x) = y.
∫_0^x v(τ)dτ = lim ∫_0^x un(τ)dτ = lim[un(x) − un(0)] = u(x) − u(0).
So
Hence D is unbounded.
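The unboundedness of the differentiation operator D can be seen concretely with the functions un(x) = sin(nπx), for which ‖Dun‖/‖un‖ = nπ in L²(0, 1). A numerical sketch (illustrative; the grid-mean norm approximates the L² norm up to a constant factor):

```python
import numpy as np

x = np.linspace(0, 1, 20001)

def l2_norm(f):
    # discrete proxy for the L^2(0,1) norm (uniform grid)
    return np.sqrt(np.mean(f ** 2))

ratios = []
for n in (1, 5, 25, 125):
    u = np.sin(n * np.pi * x)                 # ||u|| stays bounded in n
    du = n * np.pi * np.cos(n * np.pi * x)    # D u
    ratios.append(l2_norm(du) / l2_norm(u))   # equals n*pi in L^2(0,1)

# ||Du|| / ||u|| = n*pi grows without bound, so D is unbounded.
assert all(r2 > 4 * r1 for r1, r2 in zip(ratios, ratios[1:]))
assert abs(ratios[0] - np.pi) < 1e-3
```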
So it became apparent in view of the preceding example that the class of closed
unbounded operators represents some of the most important linear operators, and
a comprehensive theory for this class of operators is certainly needed to elaborate
more on their properties. It is well-known that the inverse of the differential operator
is the integral operator, which was found to be compact and self-adjoint. So our next
step will be to investigate ways to define the adjoint of these unbounded operators.
T* : Y* −→ X*,
given by
T*y* = y* ∘ T,
for y* ∈ Y*. If X and Y are Hilbert spaces, then the adjoint operator for an operator
T : H1 → H2 is defined as T* : H2 → H1, where
⟨T x, y⟩ = ⟨x, T*y⟩
for all x ∈ H1. In the unbounded case, this construction may cause trouble and
won't give rise to a well-defined adjoint mapping, since the functional x ↦ ⟨T x, y⟩ need not be bounded;
therefore, we need to restrict the domain of T* to consist of only
those elements y that make ⟨T x, y⟩ bounded for all x ∈ D(T). The following
theorem illustrates this.
Theorem 1.9.7 (Toeplitz Theorem) Let T : H −→ H be a linear operator with
D(T) = H. If ⟨T x, y⟩ = ⟨x, T y⟩ for all x, y ∈ H, then T is bounded.
‖T(zn)‖ −→ ∞. (1.9.1)
Define fn(x) = ⟨x, T zn⟩ = ⟨T x, zn⟩, where ‖zn‖ = 1; then
|fn(x)| = |⟨T x, zn⟩| ≤ ‖T x‖,
so by the Uniform Boundedness Principle there is M with
|fn(x)| ≤ M‖x‖
for all x. Then
‖T zn‖² = ⟨T zn, T zn⟩ = |fn(T zn)| ≤ M‖T zn‖,
so ‖T zn‖ ≤ M, contradicting (1.9.1).
The Toeplitz theorem indicates that for symmetric unbounded operators, we must
have D(T) ⊊ H. Another problem that arises in obtaining a well-defined adjoint is
that T*y must be uniquely determined for each y ∈ D(T*). This can be guaranteed if
D(T) is made as large as possible; since D(T) ⊊ H, we require instead that D(T) be dense, i.e.,
D̄(T) = H. Indeed, if D̄(T) ⊊ H, then by the orthogonal decomposition of Hilbert
spaces,
H = D̄(T) ⊕ (D̄(T))^⊥,
we can find
0 ≠ y0 ∈ (D̄(T))^⊥,
and then
⟨T x, y⟩ = ⟨x, y*⟩ = ⟨x, y*⟩ + ⟨x, y0⟩ = ⟨x, y* + y0⟩,
so both y* and y* + y0 qualify as values of T*y, and the adjoint is not uniquely determined.
In the bounded case, we have
⟨T x, y⟩ = ⟨x, T*y⟩,
T* + S* = (T + S)*,
and
T*S* = (ST)*.
The next proposition shows that this is not the case for unbounded operators.
Proposition 1.9.9 Let T, S ∈ L(H) be two densely defined operators. Then
(1) (αT)* = ᾱT*.
(2) If S ⊂ T then T ∗ ⊂ S ∗ .
(3) T ∗ + S ∗ ⊂ (T + S)∗ .
(4) If ST is densely defined, then T ∗ S ∗ ⊂ (ST )∗ .
Proof For (1), we have
⟨αT x, y⟩ = ⟨x, (αT)*y⟩ (1.9.2)
and
⟨αT x, y⟩ = α⟨T x, y⟩ = α⟨x, T*y⟩ = ⟨x, ᾱT*y⟩. (1.9.3)
For (2), let y ∈ D(T*). Then
⟨x, T*y⟩ = ⟨T x, y⟩
for all x ∈ D(T), hence for all x ∈ D(S). But from Definition 1.9.4(4),
T(x) = S(x)
for all x ∈ D(S), so ⟨Sx, y⟩ = ⟨x, T*y⟩ for all x ∈ D(S); that is, y ∈ D(S*) and
S* = T* on D(T*).
For (3), let y ∈ D(T*) ∩ D(S*) = D(T* + S*). Then
⟨x, T*y⟩ + ⟨x, S*y⟩ = ⟨T x, y⟩ + ⟨Sx, y⟩ (1.9.4)
for all
x ∈ D(T) ∩ D(S) = D(T + S).
Hence
⟨x, (T* + S*)y⟩ = ⟨(T + S)x, y⟩ = ⟨x, (T + S)*y⟩,
so y ∈ D((T + S)*) and
T* + S* ⊂ (T + S)*.
⟨x, T*S*y⟩ = ⟨T x, S*y⟩
= ⟨ST x, y⟩
= ⟨x, (ST)*y⟩.
Proposition 1.2.4 gives the relations between the null space and the range of an
operator and its adjoint. The result holds in the unbounded case as well.
Proposition 1.9.10 Let T ∈ L(H) be densely defined. Then N(T*) = R(T)^⊥, together with the analogous identities.
Proof Note that y ∈ R(T)^⊥ iff ⟨T x, y⟩ = 0 for all x ∈ D(T) iff ⟨x, T*y⟩ = 0 iff
T*y = 0 iff y ∈ N(T*), and this gives N(T*) = R(T)^⊥. All the other identities
can be proved similarly and are left to the reader to verify.
R̄(T) ⊕ N(T*) = H.
The spaces N(Tλ*) and R(Tλ) are called the deficiency spaces, and their dimensions the deficiency indices.
Due to the densely defined domains, we need the following definition for symmetric
operators.
Definition 1.9.11 (Symmetric Operator) Let T ∈ L(H) be densely defined. Then T is symmetric if
⟨T x, y⟩ = ⟨x, T y⟩
for all x, y ∈ D(T).
Proof For (1), let y ∈ D(T). If ⟨T x, y⟩ = ⟨x, T y⟩ for all x ∈ D(T), then clearly
y ∈ D(T*), and so D(T) ⊂ D(T*). This gives the first direction. Conversely, let
T ⊂ T*. Then T* = T on D(T), so for all x, y ∈ D(T),
⟨T x, y⟩ = ⟨x, T*y⟩ = ⟨x, T y⟩,
so T is symmetric.
⟨T x, y⟩ = ⟨x, T y⟩ = 0
⟨z, T x⟩ = 0 = ⟨T z, x⟩
for all x ∈ D(T), and this implies that T z = 0; but because T is injective, we must
have z = 0, hence T* is injective.
For (3), let T be symmetric. By (1), T ⊂ T ∗ , so it suffices to show that D(T ∗ ) ⊂
D(T ) which will give the other direction. Let y ∈ D(T ∗ ). Then
T x, y = x, T ∗ y
for all x ∈ D(T). But note that T*y ∈ H, so since T is surjective, there exists z ∈ D(T)
such that
T z = T ∗ y.
Consequently,
⟨T x, y⟩ = ⟨x, T*y⟩ = ⟨x, T z⟩ = ⟨T x, z⟩
for all x ∈ D(T), and since R(T) = H, this forces y = z ∈ D(T).
Note that using the adjoint operator, the Toeplitz theorem becomes more accessible
and follows easily from the preceding proposition, since we can simply argue as
follows: if T is symmetric, then by the preceding proposition T ⊂ T*, which implies
D(T) ⊂ D(T*);
but D(T) = H, so
D(T*) ⊆ D(T),
which implies
D(T) = D(T*),
and therefore T = T*.
The next theorem discusses the connection between adjoints and inverses. In the
bounded case, it is known that T is invertible if and only if T ∗ is invertible, and
(T ∗ )−1 = (T −1 )∗ .
This identity extends to general linear invertible densely defined operators that are
not necessarily bounded. A more interesting result is to assume symmetry rather than
injectivity.
Theorem 1.9.13 Let T ∈ L(H) be symmetric. If D(T ) = H and R(T ) = H then
(T −1 )∗ exists, (T ∗ )−1 exists, and
(T −1 )∗ = (T ∗ )−1 .
⟨x, y⟩ = ⟨T T⁻¹x, y⟩
= ⟨T⁻¹x, T*y⟩
= ⟨x, (T⁻¹)*T*y⟩,
so
(T⁻¹)*T*y = y on D(T*).
Consequently, T*y ∈ D((T⁻¹)*) for every y ∈ D(T*), and
therefore
(T*)⁻¹ ⊂ (T⁻¹)*. (1.9.5)
Similarly,
⟨x, y⟩ = ⟨T⁻¹T x, y⟩
= ⟨T x, (T⁻¹)*y⟩
= ⟨x, T*(T⁻¹)*y⟩,
so
T*(T⁻¹)*y = y on D((T⁻¹)*),
whence
(T⁻¹)* ⊂ (T*)⁻¹. (1.9.6)
Combining (1.9.5) and (1.9.6) gives (T⁻¹)* = (T*)⁻¹.
An important corollary is
Corollary 1.9.14 Let T ∈ L(H) be densely defined and injective. If T is self-adjoint
then T −1 is self-adjoint.
The next result asserts that the adjoint of any densely defined operator is closed.
Theorem 1.9.15 If T ∈ L(H) such that D(T ) = H, then T ∗ is closed.
Proof Let yn ∈ D(T*) be such that yn −→ y and T*(yn) −→ z. Then for every x ∈
D(T),
⟨T x, yn⟩ = ⟨x, T*yn⟩ −→ ⟨x, z⟩
and
⟨T x, yn⟩ −→ ⟨T x, y⟩.
Hence
⟨x, z⟩ = ⟨T x, y⟩ = ⟨x, T*y⟩,
so y ∈ D(T*) and T*y = z; that is, T* is closed.
The spectral properties of the unbounded linear operators retain much of those for
the bounded operators. The definitions are the same.
Definition 1.9.16 (Resolvent and Spectrum) Let X be a Banach space and T ∈ L(X).
A scalar λ ∈ C is called a regular value of T if the resolvent operator
Rλ(T) = (T − λI)⁻¹
exists as a bounded operator defined on all of X;
that is, Tλ is boundedly invertible (i.e., a bijection with a bounded inverse operator).
The set of all regular values of T is called the resolvent set, and is denoted by ρ(T).
The set C \ ρ(T) = σ(T) is the spectrum of T.
Note that to have a bounded inverse for an unbounded operator is more challeng-
ing. The notion of closedness will be invoked here as it will play an important role
in establishing some interesting properties for the unbounded operators.
Theorem 1.9.17 Let T ∈ L(X) be a densely defined closed operator on a Banach
space X. Then λ ∈ ρ(T) iff Tλ is bijective; that is,
ρ(T) = {λ ∈ C : T − λI is bijective}.
The next result characterizes closed operators in terms of their resolvents. It basically
says that if you can find at least one element in the resolvent set, then the
operator is necessarily closed.
Proposition 1.9.18 Let T ∈ L(H) be a densely defined operator on a Hilbert space
H. If ρ(T) ≠ Ø, then T is closed.
Proof If T is not closed, then for any λ ∈ C neither Tλ nor Tλ⁻¹ is closed; in particular,
Tλ⁻¹ cannot be a bounded operator defined on all of H (such an operator is closed), and
therefore λ ∉ ρ(T). Consequently ρ(T) is empty, a contradiction.
The preceding result indicates why dealing with closed operators is efficient.
This will simplify the work on self-adjoint operators knowing that every self-adjoint
operator is closed.
Theorem 1.9.19 Let T ∈ L(H) be a densely defined operator on a Hilbert space
H. Then T is self-adjoint iff T is symmetric and σ(T ) ⊂ R.
Ti = T − i I
Now, let y ∈ D(T ∗ ). Since R(Ti ) = H, there exists x ∈ D(T ) such that
(Ti∗ )(x − y) = 0,
Proof Clearly (1) gives (2) by Proposition 1.9.18 and Theorem 1.9.19. Suppose (2)
holds. By Proposition 1.9.10
R(T ± i I ) = H.
R(T ± i I) = R̄(T ± i I) = H
D(T ) ⊂ D(T ∗ ).
(T ∗ + i I )y = z ∈ R(T ∗ + i I ) ⊆ H.
(T + i I )x = z = (T ∗ + i I )y.
Now, (1) can be obtained using the same argument as in the proof of the
preceding theorem.
for some differential operator L . If L is invertible then the solution of the equation
above is given by u = L −1 f, and so the equation is written as
L(L −1 f ) = f.
Note that since L⁻¹ is an integral operator, it has a kernel, say, k(x, t), namely
L⁻¹ f = ∫ k(x, t) f(t)dt,
hence
L(L⁻¹ f) = ∫ (L k(x, t)) f(t)dt = f. (1.10.2)
is the solution to Eq. (1.10.1). The behavior of the function Lk(x, t) in (1.10.2) is
rather unusual, since the integral of its product with f gives f again, and this has no
explanation in the classical theory of derivatives. This problem was investigated by
Dirac in 1922, and he extended the concept of the Kronecker delta
δij = 1 if i = j, and δij = 0 if i ≠ j,
which helps select an element, say ak, from a set S = {a1, a2, …} by means of the
operation
ak = Σ_j δjk aj, (1.10.3)
and in general
∫_{−∞}^{∞} δ(x − t) f(x)dx = f(t). (1.10.5)
Consequently, we obtain
Of course, the way the Dirac delta was introduced does not make it well-defined. Moreover,
the treatment above does not stand on a firm mathematical foundation,
and so a rigorous analysis was needed to validate the construction of the Dirac delta.
Some great mathematicians, such as Sobolev, Schwartz, and others, were among the
first to carry out this mathematical analysis, which led to the creation of distribution
theory and Sobolev spaces. In fact, the observation and the debate about the Dirac
delta was a stepping stone that led to the creation of this important area of functional
analysis. The kernel k(x, t) is called Green's function, and it is the solution of the
equation
L_x k(x, t) = δ(x − t),
where L_x is the differential operator whose inverse is the integral operator with
kernel k(x, t), and the subscript x of L_x denotes the variable under
differentiation.
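The defining property (1.10.5) can be illustrated by replacing δ with Gaussian approximations of shrinking width ε, a standard regularization used here only as a sketch (the test function f is an arbitrary choice):

```python
import numpy as np

# Gaussian approximations delta_eps(s) = exp(-s^2/(2 eps^2)) / (eps sqrt(2 pi))
x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]
f = np.cos(x) + x ** 2 / 10.0        # a smooth test function
t = 1.3

vals = []
for eps in (0.5, 0.1, 0.02):
    delta = np.exp(-(x - t) ** 2 / (2 * eps ** 2)) / (eps * np.sqrt(2 * np.pi))
    vals.append(np.sum(delta * f) * dx)   # Riemann sum for int delta(x-t) f(x) dx

target = np.cos(t) + t ** 2 / 10.0
# The integrals converge to f(t) as eps -> 0.
assert abs(vals[-1] - target) < 1e-3
assert abs(vals[-1] - target) < abs(vals[0] - target)
```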
The strategy is to define L_x so that it is densely defined on a separable Hilbert
space, say L², injective, and symmetric. To prove it is self-adjoint, we can
use the definition to show that D(L*) = D(L), but this might be a challenging task
in many cases, so using the theorems and results of the preceding section can be more
helpful: one can show that L is surjective, and then use Proposition 1.9.12(3) to conclude
that L is self-adjoint; alternatively, one can show that σ(L) ⊆ R, and then use Theorem
1.9.19. After we prove that L is self-adjoint, we apply Corollary 1.9.14 to conclude
that L⁻¹ (which is an integral operator) is self-adjoint. We also know that the kernel
of an integral operator must be symmetric in order for the integral operator to be
self-adjoint; it turns out that whenever
the differential operator L is self-adjoint, the kernel k of the integral operator L⁻¹
is guaranteed to be symmetric, so that L⁻¹ is self-adjoint. Therefore, we can apply
Theorems 1.7.4 and 1.7.5 to conclude the existence of a decreasing countable set of
eigenvalues (λn ) for the integral operator L −1 with a countable set of eigenfunctions
(ϕn ) that form an orthonormal basis for the space, and such that
L −1 ϕn = λn ϕn .
But the eigenfunctions of the operators L and L −1 are the same, and the eigenvalues
are the reciprocals of each other. Thus we have
Lϕn = μn ϕn ,
where
μn = 1/λn −→ ∞
are the eigenvalues of the differential operator L. This can be simply seen from
(1.10.1). Indeed, if Eq. (1.10.1) is of the form Lu = λu, then
u = ∫ k(x, t)λu(t)dt,
or
(1/λ)u(x) = L⁻¹u = ∫ k(x, t)u(t)dt,
Lu = −∇²u,
where the minus sign is adopted for convenience. Consider the following one-dimensional equation:
Lu = −u″ = f,
and it is well-known that not all functions in L²([a, b]) are twice differentiable, so
L cannot be defined on all of L²([a, b]). In fact, L is densely defined, since the space C²[a, b],
consisting of all functions that are twice continuously differentiable on [a, b], is
dense in L²([a, b]), as we will see in Chap. 3, so let us take it for granted for now. So L*
is well-defined.
Therefore, the one-dimensional Laplacian L = −d²/dx² is a positive operator, and thus all its
eigenvalues are nonnegative (see Problem 1.11.58). Therefore, by Theorem 1.9.19,
we see that L is self-adjoint, and so by Corollary 1.9.14, the inverse operator L⁻¹
exists and is also self-adjoint. The operator L⁻¹ is given by
L⁻¹ f = K f = ∫_a^b G(x, t) f(t)dt. (1.10.7)
Here, G(x, t) is Green’s function and it is the kernel of the operator, which is neces-
sarily symmetric since L −1 is self-adjoint. If G is continuous on [a, b] × [a, b] then
L −1 is compact, and consequently we can apply the Hilbert–Schmidt theorem and
the spectral theory of compact self-adjoint operators. From (1.10.6), and since δ(x − t) = 0
for x ≠ t, Green's function necessarily satisfies the boundary conditions:
G(a, t) = G(b, t) = 0.
Moreover,
L K f = −(d²/dx²) K f = −∫_a^b G_xx(x, t) f(t)dt = f(x),
If x ≠ t, we have
−G_xx(x, t) = 0.
At x = t, integrating across the singularity gives
−∫_{t−ε}^{t+ε} G_xx dx = −[G_x(t⁺; t) − G_x(t⁻; t)] = 1.
−u″ = f,
u(a) = u(b) = 0,
is given by
u(x) = Σ_{n=1}^{∞} (1/λn) ⟨f, ϕn⟩ ϕn(x),
where
λ1 < λ2 < ⋯.
Consider, for example,
−u″ = f,
u(0) = u(1) = 0.
Then it can be shown using classical techniques of ODEs that the eigenvalues of the
problem are
λn = n²π².
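The eigenvalues λn = n²π² can be approximated by discretizing −d²/dx² with central differences on (0, 1) under the boundary conditions u(0) = u(1) = 0. An illustrative sketch (grid size is an arbitrary choice):

```python
import numpy as np

# Discretize -u'' on (0, 1) with u(0) = u(1) = 0 by central differences.
n = 1000
h = 1.0 / (n + 1)
main = 2.0 * np.ones(n) / h ** 2
off = -1.0 * np.ones(n - 1) / h ** 2
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigvals = np.linalg.eigvalsh(L)        # ascending order

# The smallest eigenvalues approximate lam_n = (n*pi)^2.
for k in (1, 2, 3):
    assert abs(eigvals[k - 1] - (k * np.pi) ** 2) < 0.1
```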
L = a2(x) ∂²/∂x² + a1(x) ∂/∂x + a0(x)
is not self-adjoint. To convert it to a self-adjoint form, we first multiply L by the factor
(1/a2(x)) exp(∫^x (a1(t)/a2(t)) dt).
If we let
p(x) = exp(∫^x (a1(t)/a2(t)) dt), q(x) = (a0(x)/a2(x)) exp(∫^x (a1(t)/a2(t)) dt),
such that p ∈ C¹[a, b] and q ∈ C[a, b], we obtain the so-called Sturm–Liouville
Operator:
L = d/dx (p(x) d/dx) + q(x). (1.10.8)
It remains to find the appropriate boundary conditions that yield symmetry. For
this, we assume the Sturm–Liouville equation Lu = f is defined on an interval [a, b].
Then
⟨Lu, v⟩ − ⟨u, Lv⟩ = ∫_a^b (v Lu − u Lv) dx = [p(u′v − uv′)]_{x=a}^{x=b}.
Lu = f
has the eigenfunction-expansion solution
u(x) = Σ_{n=1}^{∞} (1/λn) ⟨f, ϕn⟩ ϕn(x).
For the eigenvalues λn of the S–L system, consider the eigenvalue problem
Lu + λu = 0.
If
[p u u′]_a^b ≤ 0
and q ≤ 0 on [a, b], then λ ≥ 0, and the absolute values in the above theorem can
be removed.
It is important to observe that the more restrictive the boundary conditions are,
the greater the chance that the operator won't be self-adjoint (even if it is symmetric), since then
D(T) ⊊ D(T*). The following operator illustrates this observation.
P*u = Pu
on D(P). Here, D(P*) consists of all functions in C¹[a, b] such that u(a) = u(b).
If we adopt the same space C¹[a, b] subject to the conditions
u(a) = u(b) = 0, (1.10.10)
then D(P) ⊊ D(P*), and P won't be self-adjoint; hence the homogeneous conditions (1.10.10) will only
establish symmetry but won't lead to self-adjointness. Therefore, we always need
to choose suitable boundary conditions to ensure not only symmetry, but also D(P) =
D(P*).
1.11 Problems
|⟨T x, y⟩|² ≤ ⟨T x, x⟩⟨T y, y⟩.
c‖x‖² ≤ ⟨T x, x⟩
for all x ∈ H.
(a) Prove that T −1 exists.
(b) Prove that T⁻¹ is bounded, with ‖T⁻¹‖ ≤ 1/c.
(6) If T ∈ B(H) is a normal operator, show that T is invertible iff T ∗ T is invertible.
(7) If Tn ∈ B(H) is a sequence of self-adjoint operators and Tn −→ T, show that
T is self-adjoint.
(8) Let T ∈ B(H) such that ‖T‖ ≤ 1. Show that T x = x if and only if T ∗ x = x.
(9) Consider the integral operator
(T u)(x) = ∫_{−π}^{π} K(x − t)u(t)dt.
(10) (a) Give an example of T ∈ B(H) such that T 2 is compact, but T is not.
(b) Show that if T ∈ B(H) is self-adjoint, and T 2 is compact, then T is compact.
(11) Let T ∈ B(X, 1 ). If X is reflexive, show that T is compact.
(12) Let T : ℓ^p −→ ℓ^p, 1 < p < ∞, be defined as
T(x1, x2, …) = (α1x1, α2x2, …),
where |αn| < 1 for all n. Show that T is compact iff lim αn = 0.
(13) Determine if the operator T : ℓ^∞ −→ ℓ^∞, defined as
T(x1, x2, …) = (x1, x2/2, …, xk/k, …),
is compact.
(14) Consider X = (C¹[0, 1], ‖·‖_{1,∞}) and Y = (C[0, 1], ‖·‖_∞), where
‖f‖_{1,∞} = max{|f(t)|, |f′(t)| : t ∈ [0, 1]}.
for all n ∈ N.
(21) Consider Example 1.3.6.
(a) Prove that the integral operator K in the example is compact if X =
L^p([a, b]), 1 < p < ∞, and k is piecewise continuous on [a, b] × [a, b].
(b) Prove that K is compact if X = L^p([a, b]), 1 < p < ∞, and
T u(x) = xu(x)
is compact if
(a) D(T ) = C[0, 1].
(b) D(T ) = L 2 [0, 1].
(32) Show that the system
ψnm (x, y) = ϕn (x)ϕm (y)
in (1.4.4) is an orthonormal basis for L 2 ([a, b] × [a, b]) given that (ϕn ) is an
orthonormal basis for L 2 [a, b].
(33) Let T ∈ B(X ) for some Banach space X. Show that for all n ∈ N,
(3) T : ℓ² −→ ℓ² defined by
T(x₁, x₂, …) = (x₂/1, x₃/2, …, xₙ/(n−1), …).
(T u)(x) = xu(x).
(T u)(x) = u .
T(x₁, x₂, …) = (x₂, x₃, x₄, …)
(a) on c.
(b) on ℓ∞.
(37) Let T ∈ B(H) and (T − λ0 I )−1 be compact for some λ0 ∈ ρ(T ). Show that
(a) (T − λI )−1 is compact for all λ ∈ ρ(T ).
(b) dim N (T − λI ) < ∞ for all λ ∈ ρ(T ).
(38) Let T ∈ B(X ) be a normal operator and μ ∈ ρ(T ). Show that (T − μI )−1 is
normal.
(39) Let T ∈ B(H) and λ ∈ ρ(T ). Show that T is symmetric if and only if
(T − λI )−1 is symmetric.
(40) Let T ∈ K2 (H). Show that if T is a finite rank operator, then σ(T ) is finite.
(41) Write the details of the proof of Proposition 1.6.9.
(42) Write the details of the proof of the Spectral Mapping Theorem.
(43) Let T ∈ K(X) for some Banach space X. Show that for λ ≠ 0,
ker(T − λI ) = N (Tλ )
is finite-dimensional.
(44) Let T ∈ B(X ) for some Banach space X. Show that T is invertible if and only
if T and T ∗ are bounded below.
(45) Consider the differential operator D : C 1 [0, 1] → C[0, 1],
D( f ) = f′.
M = Span{ϕn }n∈N .
(λI − A)u = f
(λI − T )u = f
is
u(x) = ∑ (λ − λₙ)⁻¹⟨ f, ϕₙ⟩ϕₙ.
for some Ω ⊂ Rⁿ. If 1 ∉ σ(V), show that there exists a unique solution for the
equation
u(x) − ∫_Ω k(x, y)u(y) dy = f(x)
converges uniformly.
(56) Give an example, other than the examples mentioned in the text, of a linear
operator defined on a Banach space that is
(a) bounded but a non-closed operator.
(b) bounded with a non-closed range.
(c) closed but unbounded.
(57) Let T ∈ L(H) be a closed and densely defined operator.
(a) Show that
σ(T∗) = {λ̄ : λ ∈ σ(T)}.
D(ST ) = H.
T ∗ S ∗ = (ST )∗ .
T ⊂ T ∗∗ ⊂ T ∗ .
T = T ∗∗ ⊂ T ∗ .
(65) Let T ∈ L(H) be a densely defined operator on a Hilbert space H. Use the
preceding problem to show that T is bounded if and only if T ∗ is bounded.
(66) Let T ∈ L(H) be an unbounded closed densely defined operator defined on a
Hilbert space H.
(a) Show that σ(T ) is closed.
(b) Show that λ ∈ σ(T) iff λ̄ ∈ σ(T∗).
(c) If i ∈ ρ(T ), show that (T ∗ − i)−1 is the adjoint of (T + i)−1 .
(67) Let T ∈ L(H) be a densely defined operator on a Hilbert space H. Show that
if λ ∈ ρ(T ) then Tλ is bounded below.
(68) Let T ∈ L(H) be a densely defined operator on a Hilbert space H. Show that
if there exists a real number λ ∈ ρ(T ) then T is symmetric if and only if T is
self-adjoint.
(69) Let T ∈ L(H) be a densely defined operator and symmetric on H. Show that
if there exists λ ∈ C such that
then T is self-adjoint.
(70) Find the integral operator, and find the eigenvalues and the corresponding eigen-
vectors for the problem Lu = −u″ provided that
(a) u(0) = u′(1) = 0.
(b) u′(0) = u(1) = 0.
(c) u′(0) = u′(1) = 0.
(71) Let T : L²[0, 1] −→ L²[0, 1], T f = f′. Find the spectrum of T if
(a) D(T) = { f ∈ C¹[0, 1] : f(0) = 0}.
(b) D(T) = { f ∈ AC[0, 1] : f′ ∈ L²[0, 1] and f(0) = f(1)}.
(72) Consider the differential operator
L = ex D2 + ex D
defined on [0, 1] such that u′(0) = u(1) = 0. Determine whether or not the
operator is self-adjoint (where D is the first derivative).
(73) Show that the following operators are of the Sturm–Liouville type.
(a) Legendre: (1 − x²)D² − 2x D + λ on [−1, 1].
(b) Bessel: x²D² + x D + (x² − n²).
(c) Laguerre: x D² + (1 − x)D + λ on 0 < x < ∞.
(d) Chebyshev: √(1 − x²) D[√(1 − x²) D] + λ on [−1, 1].
(74) Convert the equation
y″ − 2x y′ + 2n y = 0
L = D2 + 1
where f ∈ L 2 [0, 1]. Find L ∗ if the equation is subject to the boundary condi-
tions
(a) u(0) = u′(0) = 0.
(b) u′(0) = u′(1) = 0.
(c) u(0) = u(1).
(77) Consider the problem
Lu = u″ = f
where 0 < x < 1, under the conditions u(0) = u(1) and u′(0) = u′(1).
(a) Show that L is injective.
u(x) = ∑ₙ₌₁^∞ (1/λₙ)⟨ f, ϕₙ⟩ϕₙ(x),
where |λₙ| → ∞ and {ϕₙ} are the corresponding eigenfunctions that form an orthonormal basis
for L²[a, b].
(82) Let L be a self-adjoint differential operator and let f ∈ L 2 [0, 1]. Use Fredholm
Alternative to discuss the solvability of the two boundary value problems
(1) Lu = f defined on [0, 1] subject to the conditions u(0) = α and u(1) = β.
(2) Lu = 0 defined on [0, 1] subject to the conditions u(0) = u(1) = 0.
(83) Determine the value of λ ∈ R for which the operator T : C[0, 1] −→ C[0, 1]
defined by
(T u)(x) = u(0) + λ ∫₀ˣ u(t) dt
is a contraction.
Chapter 2
Distribution Theory
Recall that in Sect. 1.10 the Dirac delta was introduced with no mathematical founda-
tion, and we mentioned that a rigorous analysis is needed to validate the construction
of delta. This is one of the main motivations for developing the theory of distributions,
and the purpose of this chapter is to introduce the theory to the reader and discuss
its most important basics. As explained earlier, the Dirac delta cannot be considered
a function. We shall call such mathematical objects distributions. Distributions
are not functions in the classical sense because they exhibit features that are
beyond the definition of a function. We can, however, view them as “generalized”
functions provided that the definition of function is extended to include them.
This “generalized” feature provides more power and flexibility to these distributions,
enabling them to represent some more complicated behavior that cannot be repre-
sented by functions. For this reason, distributions are very useful in applications to
topics related to physics and engineering, such as quantum mechanics, electromag-
netic theory, aerodynamics, and many other fields. This chapter defines the notion
of distribution and discusses some fundamental properties. Then, we will perform
some operational calculus on rigorous mathematical settings, such as derivatives,
convolutions, and Fourier transforms.
In principle, the theory suggests that distributions should act on other functions
rather than being evaluated at particular points; the action on a particular function
determines the distribution's values, and this action is defined through an integral over a domain.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
A. Khanfer, Applied Functional Analysis,
https://doi.org/10.1007/978-981-99-3788-2_2
⟨T, αϕ + βψ⟩ = α⟨T, ϕ⟩ + β⟨T, ψ⟩.
⟨T, ϕₙ⟩ → ⟨T, ϕ⟩.
Since for linear functionals continuity at a point implies continuity at all points, it is
enough to study convergence at zero, i.e.,
⟨T, ϕₙ⟩ → ⟨T, 0⟩
The notion of locally integrable functions ignites the following idea: Since ϕ is
continuous on a compact set K , it has a maximum value on that set, and if we
multiply a locally integrable function f with a test function ϕ and integrate over Rn ,
this gives
∫_{Rⁿ} f(x)ϕ(x) dx = ∫_K f(x)ϕ(x) dx ≤ ∫_K | f(x)ϕ(x)| dx
= ⟨T, ϕ₁⟩ + ⟨T, ϕ₂⟩ = T(ϕ₁) + T(ϕ₂).
max_{x∈K} |ϕₙ| → 0
αT₁ + βT₂ ∈ D′.
for every T ∈ D′. The convergence in D′ takes the weak-star form. Recall that
a sequence Tₙ → T in D′ in the weak-star topology if
Tₙ(ϕ) → T(ϕ)
T (ϕ) = S(ϕ)
T = S ⟺ ⟨T, ϕ⟩ = ⟨S, ϕ⟩ ∀ϕ ∈ D. (2.2.1)
By continuity,
Tₙ → T ⟺ ⟨Tₙ, ϕ⟩ → ⟨T, ϕ⟩ ∀ϕ ∈ D. (2.2.2)
Note that when we say T = S, it means they are equal in the distributional sense.
Proof The properties from (1) to (7) can be easily concluded from the definition
directly and using some simple substitutions, so the reader is asked to present proofs
for them. For (8), we let y = g −1 (x) in the integral
∫ T(g(y))ϕ(y) dy.
The terms “regular” and “singular”, if coined with functions, determine whether
a function is finite or infinite on some particular domain. The points at which the
function blows up are called singularities. If we extend the notion to distributions, we
shall say that a regular distribution T is a distribution when the value ∫_{−∞}^{∞} f(x)ϕ(x) dx
is finite and well-defined. This can be achieved if there exists a function f that is
integrable on every compact set. This ensures that f is integrable on supp(ϕ); hence
the integral is well defined. If no such f exists, then the distribution is called singular.
One way to construct a singular distribution is through the Cauchy principal value
of a nonintegrable function having a singularity at a point a. Such a function cannot
define a regular distribution, but we can define the principal value of it, denoted by
p.v. f (x), as follows
⟨p.v. f(x), ϕ(x)⟩ = lim_{ε→0⁺} ∫_{|x−a|>ε} f(x)ϕ(x) dx. (2.3.1)
Since ϕ has compact support, we can find a compact interval K = [−r, r], with r > ε, such
that ϕ vanishes outside K. So (2.3.1) can be written as
lim_{ε→0⁺} [ ∫_{ε<|x|<r} (ϕ(x) − ϕ(0))/x dx + ∫_{ε<|x|<r} ϕ(0)/x dx ].
The second integral is zero due to symmetry, so passing to the limit we get
⟨p.v. 1/x, ϕ⟩ = ∫_{−r}^{r} (ϕ(x) − ϕ(0))/x dx. (2.3.2)
⟨p.v. 1/x, ϕₙ⟩ → 0.
This can be easily established knowing that the convergence of the integral is uni-
form, as can be seen from (2.3.2). Therefore, the principal value of f is a singular
distribution.
Example 2.3.2 The next example illustrates the earliest and the most important
distribution known in literature. To simplify the treatment, we restrict to R, though
it is valid for Rn . Let ϕ ∈ D, and define the following
Tₓ₀(ϕ) = ⟨Tₓ₀, ϕ⟩ = ∫_{−∞}^{∞} T(x − x₀)ϕ(x) dx = ϕ(x₀). (2.3.3)
so Tx0 is continuous, and hence it is a distribution. The question arises: Is Tx0 a regular
or singular distribution? In other words, can we find a function g ∈ L¹loc(Ω) such that
Tₓ₀(ϕ) = ∫_{−∞}^{∞} g(x)ϕ(x) dx = ϕ(x₀) (2.3.4)
We can also find a test function ϕ with supp(ϕ) ⊆ B and max{ϕ} = ϕ(0) (why?).
Then,
∫_{−∞}^{∞} g(x)ϕ(x) dx ≤ ϕ(0) ∫_{−∞}^{∞} |g(x)| dx < ϕ(0),
which contradicts (2.3.5). Hence, no such g exists, and so Tx0 cannot be regular,
hence it is singular. In (2.3.5), let us release the condition that
g ∈ L¹loc(Ω).
What behavior must g exhibit to maintain (2.3.4) for all test functions ϕ? One way
to allow this is to require that g(x) = 0 for all x ≠ x₀. If g(x₀) = c < ∞,
then
∫_{−∞}^{∞} g(x)ϕ(x) dx = 0 ≠ ϕ(x₀).
So, the function behaves abnormally at x0 with no possible finite value. Further, we
can find ϕ ∈ D such that ϕ = 1 a.e., and
∫_{−∞}^{∞} g(x) dx = 1.
These conditions remind us of the famous Dirac delta. It is the first generalized
function that appeared in the literature, and it is the most basic and important one, playing
a dominant role in this theory. It is called the Dirac delta in honor of Paul Dirac, the
British physicist who introduced his delta in 1922 as the continuum analog of the
discrete Kronecker delta. Dirac introduced delta as a function and gave some of its
basic properties, and it soon became a famous and powerful tool for solving problems
in mechanics with discontinuous pulses. However, the mathematical community
rejected the Dirac delta and refused to deal with it as a function since it does not
behave like other functions. Recall that in applied mathematics courses, the Dirac
delta is represented by
δ(x − x₀) = ∞ if x = x₀, and 0 if x ≠ x₀. (2.3.6)
If x₀ = 0, then (2.3.6) reduces to (1.10.4). Note that (2.3.6) and (2.3.7) do not define a
function in the classical sense. Under the representation (2.3.3), the
integral cannot be evaluated in the Riemann theory, and is evaluated as 0 in the Lebesgue
theory. No function in the usual sense would satisfy (2.3.6)–(2.3.7). This
resulted in the famous debate of whether the delta notion defined in (2.3.6) and
(2.3.7) is a function. The debate continued until it was settled in 1936, when Sergei
Sobolev, a Soviet mathematician and pioneer in the theory of partial differential
equations, proposed a rigorous definition for the delta notion based on an integral
operator. In 1943, Laurent Schwartz, a leading French mathematician, constructed a
comprehensive theory on these “generalized functions”. The idea is that it is best to
think of delta as a distribution of the form of (2.3.1), i.e.,
This implies that (2.3.6)–(2.3.7) and (2.3.8) are, in fact, equivalent, and the Dirac delta
defined in (2.3.6)–(2.3.7) can be defined more rigorously in terms of a distribution
of the form (2.3.8). Now, we are ready to give a rigorous definition for the delta.
Definition 2.3.3 (Delta Distribution) The Dirac delta distribution, denoted by δ, is
a singular distribution defined as
⟨δ(x − x₀), ϕ(x)⟩ = ∫_{Rⁿ} δ(x − x₀)ϕ(x) dx = ϕ(x₀)
The Dirac delta is indeed a singular distribution because, in light of the discussion
above, it is not characterized by any locally integrable function. No measurable
function can be used to define δ as represented by the integral in the above formula.
We will, however, introduce a sequence of functions in D(R) that will converge
to δ.
Definition 2.3.4 (Delta Sequence) A sequence φₘ ∈ L¹loc(Rⁿ) (n ≥ 1) is called a delta
sequence if the following conditions hold:
(1) φm ≥ 0.
(2) ∫_{−∞}^{∞} φₘ(x) dx = 1.
(3) φₘ(x) → 0 uniformly in |x| ≥ ε > 0.
The last condition means: for every ε > 0 and η > 0, there exists N ∈ R such that |φₘ(x)| < η
for every m > N and every |x| ≥ ε. This implies that as m −→ ∞, we have
∫_{|x|≥ε} φₘ(x) dx −→ 0. (2.3.9)
We will use this sequence to conclude an important result that justifies the name of
the sequence.
Theorem 2.3.5 If φₘ ∈ L¹loc(Rⁿ) is a delta sequence, then φₘ(x) → δ(x) as
m → ∞.
Proof We restrict the argument to n=1 and leave the general case for the reader.
According to Definitions 2.3.3 and 2.3.4, it suffices to prove that for any ϕ ∈ D, we
have
lim_{m→∞} ∫_{−∞}^{∞} φₘ(x)ϕ(x) dx = ϕ(0).
We have
lim_{m→∞} ∫_{−∞}^{∞} φₘ(x)ϕ(x) dx = lim_{m→∞} ∫_{−∞}^{∞} φₘ(x)[ϕ(x) − ϕ(0)] dx + ϕ(0) ∫_{−∞}^{∞} φₘ(x) dx.
(2.3.10)
But the first term of the RHS of the equation above can be written as
lim_{m→∞} [ ∫_{|x|≥r>0} φₘ(x)[ϕ(x) − ϕ(0)] dx + ∫_{−r}^{r} φₘ(x)[ϕ(x) − ϕ(0)] dx ].
For the first integral, we pass the limit inside the integral using (2.3.9) due to uniform
convergence on
|x| ≥ r > 0,
and this gives 0. For the second integral, note that ϕ is continuous at x = 0. Let
ε > 0. Then, there exists r₁ > 0 such that |ϕ(x) − ϕ(0)| < ε
for |x| < r₁. Moreover, we can find r₂ sufficiently small such that
∫_{−r₂}^{r₂} φₘ(x) dx < 1.
This argument raises the following question: Does this sequence exist? The
answer is yes, and one way to construct it is to consider a function φ ∈ D(Ω), with the
properties φ ≥ 0 on K = supp(φ), and ∫_K φ = 1. Since φ is continuous on a compact
set K, it is uniformly continuous on K, so for every η > 0 we can find n sufficiently
large such that
|φ(x/n) − φ(0)| < η
for all x ∈ K. This shows that
φ(x/n) → φ(0)
uniformly as n −→ ∞, and this can be written as
φ(εy) → φ(0)
uniformly as ε → 0. Now, for the function φ ∈ D(Ω) and n > 0, we define the
sequence
φₙ(x) = nφ(nx). (2.3.12)
If we represent δ by (2.3.6) which physicists use, then observation (2) leads to the
result directly, but we prefer to maintain a rigorous treatment. Given the observations
above, we see that {φn } in (2.3.12) is a delta sequence, hence by Theorem 2.3.5, we
obtain
lim φn (x) = δ(x).
n→∞
ϕ(y/n) → ϕ(0)
uniformly. Hence, we can pass the limit inside the previous integral and obtain
lim ∫_{−∞}^{∞} φ(y)ϕ(y/n) dy = ϕ(0) ∫_{−∞}^{∞} φ(y) dy = ϕ(0).
Therefore, as n → ∞ we have
φn (x) → δ(x).
Here is a famous example of a delta sequence known as the Gaussian delta sequence
which is derived from the Gaussian function.
Example 2.3.6 Consider the Gaussian function
φ(x) = (1/√π) e^{−x²}.
Clearly, φ ∉ D(R) because it is not of compact support. It is well known that
∫_{−∞}^{∞} e^{−t²} dt = √π.
Define
φₙ(x) = nφ(nx) = (n/√π) e^{−n²x²}.
One can check that ∫_{−∞}^{∞} φₙ(x) dx = 1 and that φₙ(x) ≤ (n/√π) e^{−n²r²} → 0 uniformly
on |x| ≥ r. In view of Definition 2.3.4 we conclude that {φn } is a delta sequence, and
hence by Theorem 2.3.5,
φn −→ δ.
Let x ≠ 0; then
lim_{n→∞} (n/√π) e^{−n²x²} = 0.
exercise.
(3) We can define the function φ over Rn and we obtain the same results. Moreover,
we can rewrite the sequence in (2.3.12) as
φ_ε(x) = (1/εⁿ) φ(x/ε) (2.3.14)
and taking ε → 0⁺. Letting n² = 1/(4ε) in
φₙ(x) = (n/√π) e^{−n²x²}
gives
φ_ε(x) = (1/√(4πε)) e^{−x²/(4ε)}.
One of the most fundamental and important properties of distributions is that they
ignore values at points and act on functions through an integration process. This
seems interesting because it enables us to differentiate discontinuous functions. The
generalized definition of functions is the main tool to allow this process to occur.
Assume a distribution is T and its derivative is T′. Then
⟨T′, ϕ⟩ = ∫_{−∞}^{∞} T′ϕ dx.
If we perform integration by parts, making use of the fact that ϕ is differentiable
and of compact support, then the above integral will be of the form
⟨T′, ϕ⟩ = 0 − ∫_{−∞}^{∞} Tϕ′ dx.
We have no problem with that as long as ϕ ∈ D, which is in fact one of the main
reasons why test functions are restricted to that condition. It should be noted that
T^{(m)} can never be the zero function. We pointed out previously that some normal
functions, such as the locally integrable functions, can be considered distributions.
This implies the following:
(1) Derivatives of the distributions should extend the notion of derivative for func-
tions. Otherwise, we may get two different derivatives for the same function if
treated as a normal function and as a distribution.
(2) The rules governing the distributional derivatives should be the same in classical
cases.
for every c ∈ R.
(3) Product Rule: If g ∈ C∞, then
(gT)′ = g′T + gT′.
Also, by the chain rule,
(d/dx) T(g(x)) = T′(g(x)) · g′(x).
and
⟨(cT)′, ϕ⟩ = c⟨T′, ϕ⟩.
δ′ = −ϕ′(0),
but it says that when δ′ acts on ϕ, the outcome is −ϕ′(0), i.e., δ′[ϕ] = −ϕ′(0).
Further, using induction one can continue to find derivatives to get the general
formula
⟨δ⁽ⁿ⁾, ϕ⟩ = (−1)ⁿ ϕ⁽ⁿ⁾(0).
Let ϕ ∈ D. Then
⟨H′, ϕ⟩ = −⟨H, ϕ′⟩
= −∫_{−∞}^{∞} Hϕ′ dx
= −∫₀^∞ ϕ′ dx = ϕ(0).
0
By integration by parts, and the fact that ϕ is of compact support, the RHS of (2.4.1)
equals
− lim_{ε→0⁺} ( ∫_{|x|>ε} (ϕ(x)/x) dx + [(ϕ(ε) − ϕ(−ε)) ln ε] ). (2.4.2)
2.4.4 Properties of δ
The following theorem provides some interesting properties for the delta distribution
that can be very useful in computations.
Theorem 2.4.7 The following properties hold for δ(x).
(1) x · δ′(x) = −δ(x).
(2) If c ≠ 0, then
δ(x² − c²) = (1/(2|c|)) [δ(x − c) + δ(x + c)].
(3) If g has only simple zeros x₁, …, xₙ, then
δ(g(x)) = ∑ₖ₌₁ⁿ (1/|g′(xₖ)|) δ(x − xₖ).
Proof For (1), we make use of Proposition 2.2.3(3) and Theorem 2.4.2(3). Since
x · δ(x) = 0, we have
⟨xδ′, ϕ⟩ = ⟨(xδ)′, ϕ⟩ − ⟨δ, ϕ⟩ = −⟨δ, ϕ⟩.
H (x 2 − c2 ) = 1 − [H (x + c) − H (x − c)].
Taking the derivative of both sides of the equation using the chain rule then gives the identity in (2).
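Property (2) of Theorem 2.4.7 can also be checked numerically by replacing δ with a narrow Gaussian approximation; this is only a sketch under that assumption (NumPy assumed, and the test function is an arbitrary smooth rapidly decaying choice):

```python
import numpy as np

def delta_n(t, n):
    # narrow Gaussian approximation of the delta distribution
    return n / np.sqrt(np.pi) * np.exp(-(n * t) ** 2)

phi = lambda x: np.exp(-x**2) * np.cos(x)   # smooth rapidly decaying stand-in for a test function
c, n = 1.5, 200.0

x = np.linspace(-8.0, 8.0, 2000001)
dx = x[1] - x[0]
y = delta_n(x**2 - c**2, n) * phi(x)        # pairing <delta_n(x^2 - c^2), phi>
lhs = float(np.sum((y[1:] + y[:-1]) / 2) * dx)
rhs = (phi(c) + phi(-c)) / (2 * abs(c))     # property (2): (1/(2|c|))[phi(c) + phi(-c)]
```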
2.5.1 Introduction
The Fourier transform is one of the main tools used in the theory of distributions and
its applications to partial differential equations. In fact, a comprehensive study of
the theory of Fourier transforms and its techniques requires a whole separate book.
We will, however, confine ourselves to the material that suffices our needs and meets
the aims of the present book. Our main goal is to enlarge the domain of the Fourier
transform to apply to a wide variety of functions. If we confine distribution theory
to test functions, we cannot do much work on transformations. It is well-known
that some functions such as the Heaviside function, constant functions, polynomials,
periodic sine and cosine, and other functions are good examples of external sources
imposed on systems, so they appear in the PDEs representing the systems. Unfortu-
nately, these functions do not possess Fourier transforms. The duality of the Fourier
transform is not consistent with test functions because the Fourier transform of a test
function need not be a test function. The key is to ensure the following two points:
1. To find a property that keeps a function vanishing at infinity.
2. If multiplied by other smooth and nice functions, the integrand is integrable over R, or Rⁿ.
F{ f (x)}(ω) = fˆ(ω).
We can recover the function from the Fourier transform through the inverse Fourier
transform
f(x) = (1/2π) ∫_{−∞}^{∞} f̂(ω)e^{iωx} dω
on R, and
F⁻¹{ f̂(ω)} = (1/(2π)ⁿ) ∫_{Rⁿ} f̂(ω)e^{i(ω·x)} dω
f ∈ L 1 (R) ∩ L 2 (R).
How do we establish a Fourier transform for f ? The idea is to define the transform F on
a dense subspace of L 2 (R), then we extend the domain of definition to L 2 (R) using
the closure obtained by continuity. The typical example of such a subspace is the
space of simple functions because this space is dense in L 2 . Consider the truncated
sequence
f n = f · χ[−n,n] ,
Then
f n ∈ L 1 (R) ∩ L 2 (R),
‖ fₙ − f ‖₂ → 0, ‖ fₙ − f ‖₁ → 0,
Now let 0 < n < m. With the aid of the Plancherel theorem, which will be discussed
next, we have as n, m −→ ∞
So fˆn (ω) is Cauchy in the complete space L 2 (R), hence fˆn (ω) converges in the
L 2 −norm to a function in L 2 (R), call it h(x). Note that h was found by means of
the sequence { fₙ}. Let us assume there exists another Cauchy sequence, say gₙ ∈
L¹(R) ∩ L²(R), such that ‖gₙ − f ‖₂ → 0, which implies that ‖gₙ − fₙ‖₂ → 0. By
the same argument above we conclude that ĝₙ −→ g in the L²(R) norm. Using
(2.5.3) again, it is easy to show that
‖h − g‖₂ ≤ ‖h − f̂ₙ‖₂ + √(2π) ‖ fₙ − gₙ‖₂ + ‖ĝₙ − g‖₂ → 0.
This means that h = g a.e., i.e. h does not depend on the choice of the approximating
sequence { f n }, and therefore we can define now the Fourier transform of f on L 2 (R)
to be fˆ(ω) = F{ f } = h, as an equivalence class of functions in L 2 , and where
F{ f } = l.i.m.ₙ→∞ F{ fₙ} = l.i.m.ₙ→∞ ∫_{−n}^{n} f(t)e^{−iωt} dt.
The notation l.i.m. stands for the limit in mean, i.e., the limit in the L²-norm. For
convenience, we will, however, write it simply as lim, keeping in mind that it
is not a pointwise convergence but a convergence in the L²-norm. It remains to prove
(2.5.3).
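The truncation construction of F on L²(R) can be illustrated numerically. In the sketch below (assuming NumPy), f is the sinc function, which lies in L²(R) but not in L¹(R), and the transforms of the truncations fₙ are compared in an L² norm over a fixed frequency window:

```python
import numpy as np

def truncated_ft(f, n, omegas, dt=0.01):
    # \hat{f}_n(w) = integral of f(t) e^{-iwt} over [-n, n], by the trapezoid rule
    t = np.arange(-n, n + dt / 2, dt)
    w = np.ones_like(t); w[0] = w[-1] = 0.5
    E = np.exp(-1j * np.outer(omegas, t))
    return E @ (f(t) * w) * dt

# sinc is in L^2(R) but not in L^1(R), so its transform needs the l.i.m. construction
f = lambda t: np.where(np.abs(t) < 1e-12, 1.0,
                       np.sin(t) / np.where(np.abs(t) < 1e-12, 1.0, t))

omegas = np.linspace(-4.0, 4.0, 401)
dw = omegas[1] - omegas[0]
hs = [truncated_ft(f, n, omegas) for n in (4, 8, 16, 32)]
diffs = [float(np.sqrt(np.sum(np.abs(hs[i + 1] - hs[i]) ** 2) * dw)) for i in range(3)]
# successive truncations get closer in the L^2 norm: the sequence is Cauchy
```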
The following theorem is one of the central theorems of Fourier analysis.
It is called the Plancherel theorem, and sometimes Parseval's identity. It
demonstrates the fact that the Fourier transform F on L²(R) is a bijective linear oper-
ator which maps f to f̂, and is an isometry up to a constant, so it is an isomorphism
of L²(R) onto itself.
Theorem 2.5.2 (Plancherel Theorem) Let f ∈ L 2 (R), and let its Fourier transform
be fˆ. Then,
‖ f ‖₂ = (1/√(2π)) ‖ f̂ ‖₂.
Proof We have
∫_{−∞}^{∞} | f(t)|² dt = ∫_{−∞}^{∞} f(t) \overline{f(t)} dt
= (1/2π) ∫_{−∞}^{∞} f(t) ( ∫_{−∞}^{∞} \overline{f̂(ω)} e^{−iωt} dω ) dt
= (1/2π) ∫_{−∞}^{∞} \overline{f̂(ω)} ( ∫_{−∞}^{∞} f(t)e^{−iωt} dt ) dω
= (1/2π) ∫_{−∞}^{∞} \overline{f̂(ω)} f̂(ω) dω
= (1/2π) ∫_{−∞}^{∞} | f̂(ω)|² dω.
This result shows that the space L²(R) is a perfect environment for the Fourier transform.
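The Plancherel identity can be verified numerically for a concrete function by computing f̂ with a quadrature rule; the following sketch (assuming NumPy, with a Gaussian as sample function) compares the two norms:

```python
import numpy as np

def trap(y, dx):
    return float(np.sum((y[1:] + y[:-1]) / 2) * dx)

t = np.linspace(-10.0, 10.0, 2001)
dt = t[1] - t[0]
f = np.exp(-t**2 / 2)

omega = np.linspace(-10.0, 10.0, 1001)
E = np.exp(-1j * np.outer(omega, t))
tw = np.ones_like(t); tw[0] = tw[-1] = 0.5
fhat = E @ (f * tw) * dt                    # quadrature Fourier transform

norm_f = np.sqrt(trap(np.abs(f) ** 2, dt))
norm_fhat = np.sqrt(trap(np.abs(fhat) ** 2, omega[1] - omega[0]))
ratio = norm_fhat / norm_f                  # Plancherel predicts sqrt(2*pi)
```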
One of the central problems of Fourier analysis is how to apply the Fourier transform to a
broader class of functions. The main obstacle is that we cannot guarantee F{ f } ∈ L¹,
even if F{ f } exists for some f ∈ L¹, which creates a problem in satisfying
the essential identity
F −1 {F{ f }} = f.
One reason for this is that f is not decaying fast enough. This tells us that the rate
of convergence of the Fourier transform plays a significant role. To see this, let
f ∈ L 1 (R), with
lim_{t→±∞} f(t) = 0.
Using the definition and basic properties of the Fourier transform, it can be easily shown
that
F{ f ′(t)} = iω f̂(ω),
which gives
| f̂(ω)| ≤ M/|ω| for some M > 0.
This implies that f̂(ω) converges to 0 like 1/ω. If we proceed further, assuming that
f ′ is absolutely integrable over R, and
lim_{t→±∞} f ′(t) = 0,
then we obtain
| f̂(ω)| ≤ M/ω²,
i.e., f̂(ω) converges to 0 like 1/ω². This shows that the smoother the function and the
more integrable its derivatives, the faster the decay of its Fourier transform will
be. If f and all its derivatives are absolutely integrable over R and vanish at ∞, then
its Fourier transform decays faster than any power of 1/ω. If we continue the process, we
find that f̂(ω) converges faster than any inverse of a polynomial, i.e., f̂(ω) is a
rapidly decreasing function. On the other hand, it is well-known that
F{t f(t)} = i (d/dω) f̂(ω).
Repeating these processes infinitely many times can only work if we are dealing with
infinitely differentiable functions of rapid decay. We conclude that if f has a high
rate of decay, then fˆ is smooth, and if f is smooth, then fˆ has a high rate of decay.
Due to the duality between the smoothness of a function and the rate of decay of
its Fourier transform, and the rate of decay of a function with the smoothness of its
Fourier transform, the Fourier transform of a function can be used to measure how
smooth that function is, and the faster f decays the smoother its Fourier transform
will be. This idea motivated Laurent Schwartz in the late 40s of the last century
to introduce the class of rapidly decreasing functions which provides the bases for
Schwartz spaces.
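The duality between smoothness and decay described above can be observed numerically. The sketch below (assuming NumPy) compares the decay of the transform of a discontinuous box function with that of a continuous triangle function, by measuring the sup of |f̂| over doubling frequency bands:

```python
import numpy as np

t = np.linspace(-1.5, 1.5, 6001)
dt = t[1] - t[0]
tw = np.ones_like(t); tw[0] = tw[-1] = 0.5

def envelope(fvals, w0):
    # sup of |f^(w)| over the band [w0, 2*w0], via quadrature
    w = np.linspace(w0, 2 * w0, 400)
    E = np.exp(-1j * np.outer(w, t))
    return float(np.max(np.abs(E @ (fvals * tw) * dt)))

box = (np.abs(t) <= 1).astype(float)        # jump discontinuity: |f^| decays like 1/w
tri = np.clip(1 - np.abs(t), 0, None)       # continuous with corners: |f^| decays like 1/w^2

box_ratio = envelope(box, 20.0) / envelope(box, 40.0)   # roughly 2: one power of w per doubling
tri_ratio = envelope(tri, 20.0) / envelope(tri, 40.0)   # larger: closer to two powers of w
```

Doubling the frequency roughly halves the envelope for the box but shrinks it considerably faster for the triangle, consistent with the 1/ω versus 1/ω² rates derived above.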
Definition 2.6.1 (Rapidly Decreasing Function) Let ϕ ∈ C∞(R). Then ϕ is said to
be a rapidly decreasing function if
‖ϕ‖ₖ,ₘ = sup_{x∈R} |x^k ϕ^{(m)}(x)| < ∞ for all k, m ≥ 0. (2.6.1)
sup_{x∈R} (1 + |x|²)^k |ϕ^{(m)}(x)| < ∞, for all k, m ≥ 0, (2.6.3)
sup_{x∈R} (1 + |x|)^k |ϕ^{(m)}(x)| < ∞, for all k, m ≥ 0. (2.6.4)
The definition can be easily extended to Rⁿ. In this case, we need partial differen-
tiation. We define a multi-index α = (α₁, …, αₙ) to be an n-tuple of nonnegative
integers, and we write
x^α = x₁^{α₁} ⋯ xₙ^{αₙ}.
It is a good exercise to show that if ϕ and φ are two rapidly decreasing functions,
then aϕ + bφ is also a rapidly decreasing function (verify), so the collection of all
rapidly decreasing functions on R forms a linear space. This space is called the Schwartz
space.
Definition 2.6.2 (Schwartz Space) A linear space is called the Schwartz space, denoted
by S, if it consists of all rapidly decreasing functions, which are also known as
Schwartz functions.
It is clear from the definition that every test function is a Schwartz function. That
is,
D(Rn ) ⊂ S(Rn ).
On the other hand, ϕ(x) = e^{−x²} is clearly a Schwartz function but is not a test function.
Indeed, supp(ϕ) = R, which is not compact (note, in contrast, that f(x) = e^{x²} ∈ L¹loc, a
fact we shall return to later).
Hence, ϕ is not a test function. Thus we have the following important proper inclusion
D(Rⁿ) ⊊ S(Rⁿ).
for every k, m ∈ N0 .
Under this new class of functions, if ϕ ∈ S then e^{−iωx}ϕ ∈ S, and so F{ϕ} exists.
Indeed, the equivalent definition (2.6.3) with m = 0 gives
∫ |ϕ(x)e^{−iωx}| dx ≤ ∫ |ϕ(x)| dx ≤ sup_{x∈R}[(1 + |x|²)^k |ϕ(x)|] ∫ dx/(1 + |x|²)^k < ∞.
So the Fourier transform of a function in S exists, and using the properties of Fourier
transform and the fact that
F{t f(t)} = i (d/dω) f̂(ω),
we can claim the same result for all derivatives of F.
Proof To prove (1), we perform integration by parts k times, taking into account that ϕ and its derivatives vanish at infinity.
The previous result can be easily extended to S(Rn ). Recall that the n-dimensional
Fourier transform of f : Rn −→ R, denoted by F{ f (x)}(ω), is given by
F{ f(x)}(ω) = f̂(ω) = ∫_{Rⁿ} f(x)e^{−i(ω·x)} dx
where
ω = (ω₁, ω₂, …, ωₙ), ω · x = ω₁x₁ + ⋯ + ωₙxₙ.
Proof The proof is the same as for the previous proposition. To prove (1), note that the
integral in (2.6.5) becomes an integral over Rⁿ. Then
F{D_{x_j}ϕ} = ∫_{Rⁿ} (D_{x_j}ϕ(x)) e^{−i(ω·x)} dx.
Then, we proceed the same as in the proof of Proposition 2.6.3, and repeating |α|
times, we obtain (1).
To prove (2) we write
D_{ω_j} ϕ̂(ω) = D_{ω_j} ∫_{Rⁿ} ϕ(x)e^{−i(ω·x)} dx.
Again,
|x_j ϕ(x)e^{−i(ω·x)}| = |x_j ϕ(x)|
and x_j ϕ(x) ∈ L¹. So
D_{ω_j} ϕ̂(ω) = ∫_{Rⁿ} ϕ(x) D_{ω_j} e^{−i(ω·x)} dx
= ∫_{Rⁿ} (−i x_j)ϕ(x)e^{−i(ω·x)} dx
= −i F{x_j ϕ(x)}.
Notice the correspondence between the two processes: differentiation and multipli-
cation by polynomials, and one advantage of the Schwartz space is that it can deal
well with this correspondence because it is closed under the two operations. If we
add to this the advantage of being closed under Fourier transform, we realize why
such a space is the ideal space to utilize. As a consequence of the previous result, if
ϕ ∈ S(Rⁿ), then F{D_x^α ϕ} and F{x^α ϕ(x)} exist; hence by Proposition 2.6.4, ω^α ϕ̂(ω)
and D_ω^α ϕ̂(ω) exist, and we have
F{D_x^α ϕ} = (i)^{|α|} ω^α ϕ̂(ω)
and
(−i)^{|α|} F{x^α ϕ(x)} = D_ω^α ϕ̂(ω).
It turns out that F{ϕ} is a Schwartz function as claimed in the discussion at the
beginning of the section. We have the following important result.
Corollary 2.6.5 If ϕ ∈ S(Rn ), then F{ϕ} ∈ S(Rn ), that is, the Schwartz space is
closed under the Fourier transform.
The result is not valid for D(Rn ) because the Fourier transform of a test function
is not necessarily a test function.
ω^α D^β F{ f } = (−i)^{|β|} ω^α F{x^β f }
= (−i)^{|β|+|α|} F{D^α x^β f }.
But
D α x β f ∈ S(Rn ) ⊂ L 1 (Rn ). (2.6.7)
Therefore,
F{ f } ∈ S(Rn ).
Note that if the sequence f_j ∈ S(Rⁿ) and f_j → f in S(Rⁿ), i.e., ‖ f_j − f ‖α,β → 0,
then
Dα x β f j → Dα x β f
F{D^α x^β f_j} → F{D^α x^β f }
and consequently,
ω^α D^β f̂_j → ω^α D^β f̂.
we have
F 2 { f (x)} = (2π)n f (−x), (2.6.8)
we get
T 2 ( f ) = f (−x),
It follows that
T (T 3 ( f )) = f = T 3 (T ( f )),
and hence
T 3 = T −1 .
It was illustrated in the previous section that the existence of Fourier transforms
is one of the central problems in Fourier analysis, and it has motivated Schwartz
to introduce a new class of distributions. The idea of Schwartz was to extend the
space of test functions to include more functions in addition to the smooth functions
of compact supports. Since the space of functions associated with distributions is
getting larger, we expect the new space of distributions to be smaller, hoping that
this new class of distributions will have all the properties we need to define a Fourier
transform.
then
⟨T, ϕ_j⟩ → ⟨T, ϕ⟩.
The space of all continuous linear functionals on S(Rⁿ) is called the space of tem-
pered distributions, and is denoted by S′(Rⁿ).
The tempered distribution can be defined through a function f with the property that
f ϕ is rapidly decreasing. This can be achieved by what is known as “functions of
slow growth”.
Definition 2.7.2 (Function of Slow Growth) A function f ∈ C∞(Rⁿ) is said to be
of slow growth if for every m there exists cₘ ≥ 0 such that
| f^{(m)}(x)| ≤ cₘ(1 + |x|²)^k
for some k ∈ N.
The definition implies that functions of slow growth grow at infinity but no more
than polynomials, i.e., for some k, we have
f(x)/x^k → 0.
The reader should be able to prove that if f is a function of slow growth and ϕ is
Schwartz function, then f ϕ is Schwartz (see Problem 2.11.35), hence integrable.
Therefore, this class of functions can be used to define a tempered distribution. Let
f be of slow growth, and ϕn ∈ S, then
|⟨ f, ϕₙ⟩ − ⟨ f, ϕ⟩| = |⟨ f, ϕₙ − ϕ⟩| ≤ ∫ | f | |ϕₙ − ϕ| dx. (2.7.1)
T_f(ϕₙ) = ⟨ f, ϕₙ⟩
= ∫_{−∞}^{∞} f(x)ϕₙ(x) dx
= ∫_{−∞}^{∞} [ f(x)/(1 + |x|²)^k ] (1 + |x|²)^k ϕₙ(x) dx
≤ sup |(1 + |x|²)^k ϕₙ| ∫_{−∞}^{∞} | f(x)|/(1 + |x|²)^k dx.
Since f is of slow growth, the integral exists for some large k. If ϕₙ → 0, then
‖ϕₙ‖ₖ,ₘ → 0.
Let m = 0; then
sup |(1 + |x|²)^k ϕₙ| → 0,
and hence
⟨ f, ϕₙ⟩ → 0.
It should be noted that every tempered distribution is a distribution, but not the
converse. Again, the function
f(x) = e^{x²}
defines a regular distribution, but it is not tempered: with
ϕ(x) = e^{−x²} ∈ S,
the pairing ⟨ f, ϕ⟩ = ∫ e^{x²}e^{−x²} dx diverges. Hence
S′(R) ⊊ D′(R).
On the other hand, the Heaviside function is tempered: if ϕₙ → 0 in S, then
‖ϕₙ‖₁,₀ → 0,
so
sup |(1 + |x|²)ϕₙ| → 0,
which implies ⟨H, ϕₙ⟩ → 0, and therefore
H(x) ∈ S′(R).
|⟨δ, ϕₙ⟩| = |ϕₙ(0)|.
If ϕₙ → 0, then
‖ϕₙ‖ₖ,ₘ → 0,
and hence
⟨δ, ϕₙ⟩ → 0.
2.8.1 Motivation
Suppose we need to take the Fourier transform of a tempered distribution T , and for
simplicity, let n = 1. Then
⟨T̂, ϕ⟩ = ∫_R T̂(ω)ϕ(ω) dω
= ∫_R ∫_R T(x)e^{−iωx} ϕ(ω) dx dω.
Thus, we have
⟨T̂, ϕ⟩ = ⟨T, ϕ̂⟩.
In order for ⟨T, ϕ̂⟩ to make sense, it is required that ϕ̂ ∈ S for every ϕ ∈ S. So now
we understand the purpose of introducing a new class of distributions which is the
dual of rapidly decreasing (Schwartz) functions. The tempered distribution seems to
behave nicely with Fourier transforms. Now, we state the definition of the Fourier
transform of distributions.
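The defining identity ⟨T̂, ϕ⟩ = ⟨T, ϕ̂⟩ can be checked numerically for a regular tempered distribution T_f: on a quadrature grid, the two sides are the same double sum with the order of summation exchanged, which is exactly the Fubini step above. A sketch (assuming NumPy; the chosen f and ϕ are arbitrary Schwartz-type assumptions):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 1501)
dx = x[1] - x[0]
w = np.ones_like(x); w[0] = w[-1] = 0.5      # trapezoid weights
E = np.exp(-1j * np.outer(x, x))             # kernel e^{-iwt}; one grid serves both variables

f   = np.exp(-x**2 / 2)                      # slow-growth (in fact Schwartz) function defining T = T_f
phi = np.exp(-x**2) * np.cos(x)              # a Schwartz test function (an assumption)

fhat   = E @ (f * w) * dx                    # quadrature Fourier transforms
phihat = E @ (phi * w) * dx

lhs = np.sum(fhat * phi * w) * dx            # <T^, phi>: integral of f^(w) * phi(w)
rhs = np.sum(f * phihat * w) * dx            # <T, phi^>: integral of f(x) * phi^(x)
# the two agree: swapping the order of integration is the definition of T^
```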
2.8.2 Definition
Remark The notation F{T } is commonly used to denote classical Fourier trans-
forms of functions, but the notation T̂ is more commonly used for distributions.
for every ϕ ∈ S, and that makes sense because ϕ̂ ∈ S and T ∈ S′; this means
that every tempered distribution has a Fourier transform. Now, the question arises:
how do we find the Fourier transform of a distribution? We need to manipulate the
integration in ⟨T, ϕ̂⟩ and rewrite it as
∫ g(x)ϕ(x) dx,
which implies
T̂ = g.
The next proposition shows that the result of Proposition 2.6.4 for Schwartz functions
is valid for distributions.
Proposition 2.8.2 Let T be a tempered distribution. Then
(1) D_ω^α(T̂(ω)) = (−i)^{|α|} F{x^α T},
(2) ω^α T̂ = (−i)^{|α|} F{D_x^α T}.
By Proposition 2.6.4(1),
This gives
(−1)|α| T, F{Dωα ϕ(ω)} = (−i)|α| T, x α ϕ̂(ω) = (−i)|α| x α T , ϕ(ω) .
Hence,
Dωα (T̂ (ω)), ϕ(ω) = (−i)|α|
x α T , ϕ(ω) .
Hence,
α |α| α ˆ
D x T = (i) ω T.
i.e.,
$$\mathcal{F}\{\mathcal{F}^{-1}\{T\}\} = \mathcal{F}^{-1}\{\mathcal{F}\{T\}\} = T. \tag{2.9.1}$$
How can we construct a formula for Ť? The next example is helpful in establishing some subsequent results.
Example 2.9.1 Consider the Gaussian function
$$f(x) = e^{-x^2/2}.$$
By definition, we have
$$\mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} e^{-\left(\frac{1}{2}t^2 + i\omega t\right)}\,dt.$$
Write
$$\frac{1}{2}t^2 + i\omega t = \left(\frac{t}{\sqrt{2}} + \frac{i\omega}{\sqrt{2}}\right)^2 + \frac{\omega^2}{2}.$$
Substituting, we get
$$\mathcal{F}\{f(t)\} = \sqrt{2}\,e^{-\omega^2/2}\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{2\pi}\,e^{-\omega^2/2}.$$
If x ∈ R^n, then f is written as
$$f(x) = e^{-|x|^2/2}.$$
So
$$\mathcal{F}\{f\} = \int_{\mathbb{R}^n} e^{-|x|^2/2 - i(\omega\cdot x)}\,dx.$$
By the previous argument for the case n = 1, taking into account that the integral is taken over R^n, we have
$$\mathcal{F}\{f\} = e^{-\frac{1}{2}|\omega|^2}\prod_{k=1}^{n}\int_{-\infty}^{\infty} e^{-\frac{1}{2}\left[x_k + i\omega_k\right]^2}\,dx_k = e^{-\frac{1}{2}|\omega|^2}\prod_{k=1}^{n}\sqrt{2\pi} = (2\pi)^{n/2}\,e^{-|\omega|^2/2}.$$
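The one-dimensional transform pair computed above can be checked numerically. The following sketch (illustrative, not part of the text) approximates F{e^{−t²/2}} by a midpoint Riemann sum; since the function is real and even, the transform reduces to a cosine integral:

```python
import math

def ft_real_even(f, w, a=-30.0, b=30.0, n=120_000):
    # For real, even f: integral of f(t) e^{-iwt} dt equals integral of f(t) cos(wt) dt.
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * math.cos(w * (a + (k + 0.5) * h))
               for k in range(n)) * h

f = lambda t: math.exp(-t * t / 2)
for w in (0.0, 1.0, 2.0):
    exact = math.sqrt(2 * math.pi) * math.exp(-w * w / 2)
    print(w, ft_real_even(f, w), exact)  # numerical and closed-form values agree
```

The Gaussian is its own transform shape up to the factor √(2π), which is exactly what the printed comparison shows.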
The Gaussian function shall be used to obtain the inversion formula of the Fourier transform. Indeed, let f ∈ C_c^∞(R^n), and consider the sequence
$$g_\varepsilon(x) = g(\varepsilon x),$$
for which
$$\mathcal{F}\{g_\varepsilon\} = \frac{1}{\varepsilon^n}\,\hat{g}\!\left(\frac{\omega}{\varepsilon}\right). \tag{2.9.2}$$
It follows that
$$\langle f, \mathcal{F}\{g_\varepsilon\}\rangle = \int f\,\mathcal{F}\{g_\varepsilon\}\,dx = \int f(\varepsilon y)\hat{g}(y)\,dy.$$
Then, we either use the Dominated Convergence Theorem (verify), or the fact that f(εy) → f(0) uniformly as ε → 0 (justify), to pass the limit inside the integral, and this gives
$$\langle f, \mathcal{F}\{g_\varepsilon\}\rangle \to f(0)\int \hat{g}(y)\,dy. \tag{2.9.3}$$
On the other hand, ⟨f, F{g_ε}⟩ = ⟨f̂, g_ε⟩ = ∫ f̂(y)g(εy)\,dy → g(0)∫ f̂(y)\,dy. Hence,
$$f(0)\int\hat{g}(y)\,dy = g(0)\int\hat{f}(y)\,dy. \tag{2.9.4}$$
This holds for all possible g. So let g(x) be the Gaussian function discussed in Example 2.9.1. Then g(0) = 1, and
$$\int\hat{g}(y)\,dy = (2\pi)^{n/2}\prod_{k=1}^{n}\sqrt{2\pi} = (2\pi)^n.$$
Substituting in (2.9.4), we obtain
$$f(0) = \frac{1}{(2\pi)^n}\int\hat{f}(y)\,dy. \tag{2.9.5}$$
Then, for any x, f(x) can be obtained by using the shifting property on (2.9.5), namely
$$f(x - x_0) \longleftrightarrow e^{-i\omega x_0}\hat{f}(\omega),$$
and we obtain
$$f(x) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\hat{f}(y)e^{iy\cdot x}\,dy.$$
This suggests (in fact establishes) the inversion formula for the Fourier transform. The inverse Fourier transform, denoted by f̌(x), is defined as
$$\mathcal{F}^{-1}\{\hat{f}(\omega)\} = \check{f}(x) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\hat{f}(\omega)e^{i(\omega\cdot x)}\,d\omega. \tag{2.9.6}$$
We have
$$\mathcal{F}\mathcal{F}^{-1}(f) = \mathcal{F}^{-1}\mathcal{F}(f) = f.$$
Hence, for a tempered distribution T,
$$(\hat{T})^{\vee} = (\check{T})^{\wedge} = T.$$
$$\langle \mathcal{F}\{\delta\}, \varphi\rangle = \langle \delta, \mathcal{F}\{\varphi\}\rangle = \hat{\varphi}(0) = \int_{\mathbb{R}}\varphi(x)\,dx = \langle 1, \varphi\rangle.$$
Hence F{δ} = 1.
The result of the example seems plausible. Let us see why. Let
$$\phi_n(x) = \frac{1}{\pi x}\sin(nx).$$
It is well-known that
$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = \pi.$$
Therefore, φ_n(x) is a delta sequence. Now let us find the inverse Fourier transform of 1:
$$\mathcal{F}^{-1}\{1\} = \frac{1}{2\pi}\int_{-\infty}^{\infty} 1\cdot e^{i\omega x}\,d\omega = \lim_{L\to\infty}\frac{1}{2\pi}\int_{-L}^{L} 1\cdot e^{i\omega x}\,d\omega = \lim_{L\to\infty}\frac{1}{2\pi}\,\frac{e^{iLx} - e^{-iLx}}{ix} = \lim_{L\to\infty}\frac{\sin Lx}{\pi x} = \delta(x).$$
Hence
$$\mathcal{F}\{\delta_c\}(\omega) = e^{-ic\omega}.$$
Moreover,
$$\mathcal{F}\left\{\frac{\delta_c + \delta_{-c}}{2}\right\} = \cos c\omega,$$
and
$$\mathcal{F}\{\cos cx\} = \pi\big(\delta(\omega - c) + \delta(\omega + c)\big).$$
Since sgn′(x) = 2δ(x), we have F{sgn′} = 2, and also
$$\mathcal{F}\{\mathrm{sgn}'\} = i\omega\,\mathcal{F}\{\mathrm{sgn}\}.$$
So
$$\mathcal{F}\{\mathrm{sgn}\} = \frac{2}{i\omega}.$$
The unit step function can be written as
$$H(x) = \frac{1}{2}(1 + \mathrm{sgn}(x)),$$
so we obtain
$$\mathcal{F}\{H\} = \frac{1}{2}\left(2\pi\delta + \frac{2}{i\omega}\right) = \pi\delta + \frac{1}{i\omega}.$$
The convolution of two functions is a special type of product that satisfies elementary algebraic properties, such as commutativity, associativity, and distributivity. First, we need the following result, which discusses the derivative of convolutions.
Lemma 2.10.1 $(\varphi * \psi)^{(k)} = \varphi^{(k)} * \psi = \varphi * \psi^{(k)}$ for k = 0, 1, 2, ....

Proof We differentiate ϕ ∗ ψ to obtain
$$\frac{d}{dx}(\varphi * \psi) = \int_{\mathbb{R}}\varphi'(x - y)\psi(y)\,dy = (-1)\int_{\mathbb{R}}\frac{d}{dy}\big[\varphi(x - y)\big]\psi(y)\,dy = \int_{\mathbb{R}}\varphi(x - y)\psi'(y)\,dy = (\varphi * \psi')(x),$$
integrating by parts in the third equality (the boundary terms vanish). Iterating gives the result for all k, and consequently
$$\varphi * \psi \in C^\infty(\mathbb{R}^n).$$
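Lemma 2.10.1 is easy to test numerically: the derivative of a convolution, computed by a difference quotient, should match convolving with the differentiated factor on either side. A small sketch with two Gaussians (illustrative choices, not from the text):

```python
import math

def conv(f, g, x, a=-10.0, b=10.0, n=20_000):
    # (f * g)(x) approximated by a midpoint Riemann sum.
    h = (b - a) / n
    return sum(f(x - (a + (j + 0.5) * h)) * g(a + (j + 0.5) * h)
               for j in range(n)) * h

phi = lambda x: math.exp(-x * x)
dphi = lambda x: -2.0 * x * math.exp(-x * x)                 # phi'
psi = lambda x: math.exp(-(x - 1.0) ** 2 / 2)
dpsi = lambda x: -(x - 1.0) * math.exp(-(x - 1.0) ** 2 / 2)  # psi'

x0, d = 0.3, 1e-4
deriv = (conv(phi, psi, x0 + d) - conv(phi, psi, x0 - d)) / (2 * d)
print(deriv, conv(dphi, psi, x0), conv(phi, dpsi, x0))  # all three agree
```

All three numbers coincide to quadrature accuracy, matching (ϕ ∗ ψ)′ = ϕ′ ∗ ψ = ϕ ∗ ψ′.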
Then, using $|x|^k \le 2^k(|x - y|^k + |y|^k)$, we have
$$|x|^k\left|(\varphi * \psi)^{(m)}(x)\right| \le 2^k\int_{\mathbb{R}}\Big(|x - y|^k\,|\varphi(x - y)|\,|\psi^{(m)}(y)| + |y|^k\,|\varphi(x - y)|\,|\psi^{(m)}(y)|\Big)\,dy. \tag{2.10.1}$$
Since ϕ, ψ ∈ S(R), the integral on the RHS of (2.10.1) exists and is finite (why?). Hence
$$|x|^k\left|(\varphi * \psi)^{(m)}(x)\right| < \infty.$$
Taking the supremum over all x ∈ R, the result follows. For S(R^n), the proof is the same as above, with k replaced by |α| and m by β for some α, β ∈ N_0^n, and the integral taken over R^n.
The convolution of a distribution T ∈ S′ with ψ ∈ S is defined by ⟨T ∗ ψ, ϕ⟩ = ⟨T, ψ⁻ ∗ ϕ⟩, where
$$\psi^-(x) = \psi(-x);$$
this makes sense because ψ⁻ ∗ ϕ ∈ S.
As a consequence, for the delta distribution we have
$$f * \delta = f, \qquad f * \delta^{(n)} = f^{(n)}.$$
For n = 1, we have
$$\langle f * \delta', \varphi\rangle = \langle \delta', f^- * \varphi\rangle = -\left\langle \delta, \frac{d}{dx}(f^- * \varphi)\right\rangle = -(f^- * \varphi)'(0) = (-1)(-1)\int_{\mathbb{R}} f'(y - x)\varphi(y)\,dy\Big|_{x=0} = \langle f', \varphi\rangle.$$
Using induction, one can easily prove that the result is valid for all n.
The result shows that the delta distribution plays the role of the identity of the
convolution process over all distributions. The advantage of this property is that
the delta function and its derivatives can be used in computing the derivatives of
functions.
Proof We have
$$\langle \mathcal{F}\{\psi * T\}, \varphi\rangle = \langle \psi * T, \mathcal{F}\{\varphi\}\rangle = \langle T, \psi^- * \mathcal{F}\{\varphi\}\rangle,$$
and a direct computation (verify) shows that $\psi^- * \mathcal{F}\{\varphi\} = \mathcal{F}\{\mathcal{F}\{\psi\}\cdot\varphi\}$, so the pairing equals $\langle \mathcal{F}\{T\}, \mathcal{F}\{\psi\}\varphi\rangle$. Therefore, we obtain
$$\mathcal{F}\{\psi * T\} = \mathcal{F}\{\psi\}\cdot\mathcal{F}\{T\}.$$
2.11 Problems
$$D(fT) = f(DT) + T(Df).$$
$$L^r_{\mathrm{loc}} \subset L^s_{\mathrm{loc}}.$$
(2) $\varphi_n(x) = \dfrac{\sin^2 nx}{n\pi x^2}$, for n → ∞.
(15) Find the first distributional derivative of each of the following functions.
(1) $f(x) = |x|$.   (5) $f(x) = \dfrac{1}{\sqrt{x_+}}$.
(2) $f(x) = \dfrac{x}{|x|}$.   (6) $f(x) = \cos x$ for x irrational.
(3) $f(x) = \dfrac{1}{2}|x|^2$.   (7) $f(x) = \ln|x|\,\mathrm{sgn}(x)$.
(4) $f(x) = \dfrac{1}{\sqrt{x}}$.   (8) $f(x) = H(x)\sin x$.
(16) Determine whether each of the following functions is a Schwartz function or not.
(1) $f(x) = e^{-a|x|}$ for all a > 0.   (5) $f(x) = e^{-\sqrt{x^2+1}}$.
(2) $f(x) = e^{-a|x|^2}$ for all a > 0.   (6) $f(x) = e^{-x^2}\cos e^{x^2}$.
$$\mathcal{F}\{\varphi(ax)\} = \frac{1}{|a|}\,\hat{\varphi}\!\left(\frac{x}{a}\right).$$
$\mathcal{F}\{F(x)\} = 2\pi f(-\omega)$, where $F = \mathcal{F}\{f\}$ (the duality property).
for any c ≥ 0.
(25) Show that if ϕ, ψ are rapidly decreasing functions, then ϕ · ψ ∈ S(R).
Conclude that
ϕ ∗ ψ ∈ S(R).
$$|f(x)| \le c_m(1 + |x|)^{-m}$$
(33) Determine whether each of the following functions defines a tempered distribution.
(1) $f(x) = e^{x^4}$.
(2) $f(x) = e^{x}$.
(3) $f(x) = x^3$.
(34) Show that if f (x) ∈ S (R) then
for a ∈ R.
(35) Show that a product of a function of slow growth with a Schwartz function is
again a Schwartz function.
(36) If $f_n \to f$ in S, show that $f_n \to f$ in S′.
(37) Show that if T is a tempered distribution, then so is T′.
(38) Determine whether $e^x\cos(e^x)$ belongs to S′(R).
(39) Show that
$$\langle (T_f)^-, \varphi\rangle = \langle T_f, \varphi^-\rangle.$$
(41) Use the duality property ($\mathcal{F}\{F(x)\} = 2\pi f(-\omega)$) to find the Fourier transform of $f(x) = \dfrac{1}{x}$.
(42) Show that
$$\delta(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega x}\,dx.$$
(3) $\dfrac{1}{x + ia}$.   (7) $\mathrm{p.v.}\,\dfrac{1}{x}$.
(4) $\dfrac{1}{(x + ia)^2}$.   (8) $\mathrm{p.v.}\,\dfrac{1}{x^2}$.
(44) Find $\mathcal{F}\{(-x)^\alpha\}$ and $\mathcal{F}\{D^\alpha\delta\}$.
(45) Show that if $f \in S(\mathbb{R}^n)$, then $\mathcal{F}\{f\}(\omega) = (2\pi)^n\,\mathcal{F}^{-1}\{f\}(-\omega)$.
(48) Let $T_{\mathcal{F}\{f\}}$ be the distribution defined by the function $\mathcal{F}\{f\}$, for some $f \in S(\mathbb{R}^n)$. Show that $T_{\mathcal{F}\{f\}} = \mathcal{F}\{T_f\}$.
(49) Fourier Transform of Polynomials: Show that the following is true.
(a) $\mathcal{F}\{e^{-icx}\} = 2\pi\delta(\omega + c)$ for x ∈ R, and $\mathcal{F}\{e^{-ic\cdot x}\} = (2\pi)^n\delta(\omega + c)$ for x ∈ R^n.
(b) $\mathcal{F}\{i^k x^k e^{-icx}\} = (-1)^k(2\pi)\,D^k(\delta(\omega + c))$ for x ∈ R, and
(c) $\mathcal{F}\{i^{|\alpha|} x^\alpha e^{-ic\cdot x}\} = (-1)^{|\alpha|}(2\pi)^n\,D^\alpha(\delta(\omega + c))$ for x ∈ R^n.
(d) $\mathcal{F}\{x^k\} = 2\pi i^k D^k\delta(\omega)$.
(e) $\mathcal{F}\{P(x)\} = 2\pi\sum_{j=0}^{n} a_j (i)^j D^j(\delta(\omega))$ for the polynomial $P(x) = \sum_{j=0}^{n} a_j x^j$.
(50) Show that the following inclusions are proper:
(a) $C_c^\infty(\mathbb{R}^n) \subset S(\mathbb{R}^n)$.
(b) $S'(\mathbb{R}) \subset D'(\mathbb{R})$.
(51) If T ∈ S′ and ϕ, ψ ∈ S, show that
$$(T * \varphi) * \psi = T * (\varphi * \psi).$$
Show that
$$\mathcal{F}^{-1}\{\mathcal{F}\{f\} * \mathcal{F}\{g\}\} = (2\pi)^n\,fg$$
for x ∈ R^n.
(56) Let $f \in L^1_{\mathrm{loc}}(\mathbb{R})$.
(a) Show that if $g \in C_c(\mathbb{R})$ then $f * g \in C(\mathbb{R})$.
(b) Show that if $g \in C_c^1(\mathbb{R})$ then $f * g \in C^1(\mathbb{R})$.
(c) Show that if $g \in C_c^\infty(\mathbb{R})$ then $f * g \in C^\infty(\mathbb{R})$.
Chapter 3
Theory of Sobolev Spaces
Recall that Definition 1.4.1 gave the distributional derivative in the form
$$\langle T^{(k)}, \varphi\rangle = (-1)^k\langle T, \varphi^{(k)}\rangle.$$
Under this type of derivative, distributions have derivatives of all orders. Another generalization of differentiation is proposed for locally integrable functions that are not necessarily differentiable in the usual sense. Such a derivative has two advantages: it provides derivatives for nondifferentiable functions, and it generalizes the notion of partial derivative. Recall the multi-index
$$\alpha = (\alpha_1, \ldots, \alpha_n), \qquad \alpha_i \in \mathbb{N}_0 = \mathbb{N}\cup\{0\},$$
and we denote
$$|\alpha| = \alpha_1 + \cdots + \alpha_n, \qquad x^\alpha = x_1^{\alpha_1}\cdots x_n^{\alpha_n}.$$
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 133
A. Khanfer, Applied Functional Analysis,
https://doi.org/10.1007/978-981-99-3788-2_3
$$\partial_x^\alpha = \partial_1^{\alpha_1}\cdots\partial_n^{\alpha_n}, \qquad \partial^\alpha u = \frac{\partial^{|\alpha|}u}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}.$$
For example, if n = 2 and α = (2, 1), then
$$D^\alpha u = \partial^\alpha u = \frac{\partial^3 u}{\partial x^2\,\partial y}.$$
The distributional partial derivative is defined by
$$\langle \partial^\alpha T, \varphi\rangle = (-1)^{|\alpha|}\langle T, \partial^\alpha\varphi\rangle.$$
If u, v ∈ L¹_loc(Ω) satisfy
$$\int_\Omega u\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx$$
for every ϕ ∈ C_c^∞(Ω), then D^α u = v, and we say that u is weakly differentiable of order α, and its αth weak derivative is v. We also say that the αth partial derivative of the distribution T is given by
$$D^\alpha T(\varphi) = (-1)^{|\alpha|}\,T(D^\alpha\varphi)$$
and, for n = 1 with |α| = k,
$$D^\alpha u = u^{(k)} = \frac{d^k}{dx^k}u.$$
(3) If T is a regular distribution (i.e., T = T_u for some locally integrable u), then the distributional derivative and the weak derivative coincide; that is, if |α| = k, then D^α u = T^{(k)}.
(4) If u ∈ L¹_loc(Ω), but there is no v ∈ L¹_loc(Ω) such that
$$\int_\Omega u\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx$$
for every ϕ ∈ C_c^∞(Ω), then we say that u has no weak αth partial derivative.
The next theorem discusses some basic properties of the weak derivative.
Theorem 3.1.2 Let u, D^α u ∈ L¹_loc(Ω). Then
(1) D^α u is unique.
(2) If D^β u, D^{α+β}u ∈ L¹_loc(Ω) exist, then
$$D^\beta(D^\alpha u) = D^\alpha(D^\beta u) = D^{\alpha+\beta}u.$$
Proof For (1), suppose v₁ and v₂ are both weak derivatives of u (take |α| = 1 for simplicity). Then
$$\int v_1\varphi\,dx = -\int u\varphi'\,dx \qquad\text{and}\qquad \int v_2\varphi\,dx = -\int u\varphi'\,dx,$$
hence we have
$$\int(v_2 - v_1)\varphi\,dx = 0.$$
For (2),
$$\int D^\beta(D^\alpha u)\,\varphi\,dx = (-1)^{|\beta|}\int D^\alpha u\,D^\beta\varphi\,dx = (-1)^{|\beta|+|\alpha|}\int u\,D^\alpha D^\beta\varphi\,dx = (-1)^{|\beta|+|\alpha|}\int u\,D^{\alpha+\beta}\varphi\,dx.$$
This yields
$$D^\beta(D^\alpha u) = D^{\alpha+\beta}u, \qquad D^\alpha(D^\beta u) = D^{\alpha+\beta}u.$$
For (3), note that convergence in L¹_loc implies convergence in the sense of distributions, which allows us (verify) to write the required limit.

Proposition 3.1.3 Let u ∈ C^k(Ω), Ω ⊆ R^n. Then the weak derivatives D^α u, |α| ≤ k, exist and are equal to the classical derivatives up to order k.

Proof If u ∈ C^k(Ω), then integration by parts gives, for every ϕ ∈ C_c^∞(Ω) and |α| = 1,
$$\int_\Omega u\,D^\alpha\varphi\,dx = -\int_\Omega(\partial^\alpha u)\varphi\,dx.$$
Hence the components of the classical gradient ∇u serve as the weak derivatives D^α u for |α| = 1, and iterating the argument gives the result up to order k.
$$D^\alpha u = v_\alpha \in L^1_{\mathrm{loc}}(\Omega).$$
Therefore,
$$\int u\,D^\alpha\varphi\,dx = \lim\int u_j\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\lim\int D^\alpha u_j\,\varphi\,dx = (-1)^{|\alpha|}\int v_\alpha\,\varphi\,dx.$$
Proof The L^p convergence implies that (verify), for all ϕ ∈ C_c^∞(Ω),
$$\lim\int D^\alpha v_j\,\varphi\,dx = \int w_\alpha\,\varphi\,dx.$$
We conclude the section with the following transformation property between weak
derivatives and powers of the independent variables.
Proposition 3.1.6 Let $u \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, and $\mathcal{F}\{u\}(\omega) = \hat{u}(\omega)$. Then
(1) $\mathcal{F}\{D_x^\alpha u\} = (i)^{|\alpha|}\omega^\alpha\hat{u}(\omega)$.
(2) $D_\omega^\alpha(\hat{u}(\omega)) = (-i)^{|\alpha|}\mathcal{F}\{x^\alpha u(x)\}$.
This demonstrates that the Fourier transform behaves well with weak derivatives in
a similar manner to the usual (and distributional) derivatives, and it preserves the
correspondence between the smoothness of the function and the rate of decay of its
Fourier transform.
In light of the discussion in the preceding sections, it turns out that Schwartz spaces
play a dominant role in the theory of distributions. We will further obtain interesting
results that will lead to essential consequences on distribution theory. The main tool
for this purpose is “mollifiers”.
A well-known result in measure theory is that any function f in L p can be approx-
imated by a continuous function g. If g is smooth, this will give an extra advantage
since g in this case can serve as a solution to some differential equation. Since the
space of smooth functions C ∞ is a subspace of C, we hope that any function f ∈ L p
can be approximated by a function g ∈ C ∞ , which will play the role of “mollifier”.
There are two remarks to consider about g:
(1) Since our goal is to approximate any f ∈ L^p by a function g ∈ C^∞, this function is characterized by f, so we can view it as f_ε such that f_ε → f as ε → 0.
(2) The smoothening process of producing a smooth function out of a continuous (not necessarily smooth) one reminds us of convolutions, in which the performed integration has the effect of smoothening the curve of the function and thus eliminating sharp points on the graph, which justifies the name "mollifier". As the word implies, to mollify an edge means to smoothen it. In general, if f is continuous and g is not, then f ∗ g is continuous. If f is differentiable but g is not, then f ∗ g is differentiable; i.e., f ∗ g takes on the better regularity of the two functions.
Mollifiers are functions that can be linked to other functions by convolution to
smoothen the resulting function and give it more regularity. Therefore, we may write
$$f_\varepsilon = \phi_\varepsilon * f \tag{3.2.1}$$
for some
$$\phi_\varepsilon \in C^\infty(\mathbb{R}^n)$$
independent of the choice of f. To get an idea of φ_ε, note that if we take the Fourier transform of (3.2.1), then
$$\mathcal{F}\{f_\varepsilon\} = \mathcal{F}\{\phi_\varepsilon\}\,\mathcal{F}\{f\}. \tag{3.2.2}$$
On the other hand, the continuity of the Fourier transform and the fact that f_ε → f as ε → 0 imply
$$\mathcal{F}\{f_\varepsilon\} \to \mathcal{F}\{f\}. \tag{3.2.3}$$
Comparing, we must have
$$\mathcal{F}\{\phi_\varepsilon\} \to 1.$$
But
$$1 = \mathcal{F}\{\delta\}.$$
Thus, we have
$$\phi_\varepsilon \to \delta.$$
Such a φ_ε can be built from a fixed bump function ϕ ≥ 0 normalized so that
$$\int_{\mathbb{R}^n}\varphi = 1.$$
The function ϕ is smooth and has the ball B₁(0) as support. The scaled family ϕ_ε then satisfies supp(ϕ_ε) = B_ε(0); i.e., for n = 1,
$$\mathrm{supp}(\varphi_\varepsilon) = [-\varepsilon, \varepsilon].$$
3.2.2 Mollifiers
Definition 3.2.1 (Mollifier) Let ϕ ∈ C_c^∞(R^n) with ϕ ≥ 0, supp(ϕ) = B₁(0), and ∫_{R^n}ϕ = 1. Then the family of functions ϕ_ε given by
$$\varphi_\varepsilon(x) = \frac{1}{\varepsilon^n}\,\varphi\!\left(\frac{x}{\varepsilon}\right)$$
is called a mollifier. Many functions can play the role of ϕ, but the function given in (3.2.5) is the standard one, so the family ϕ_ε is called the standard mollifier if ϕ is as given in (3.2.5). It is easy to check that ϕ_ε has the following standard properties:
(1) ϕ_ε(x) ≥ 0 for all x ∈ R^n.
(2) ϕ_ε ∈ C_c^∞(R^n), with supp(ϕ_ε) = B_ε(0).
(3) ∫_{R^n}ϕ_ε = 1.
(4) ϕ_ε(x) → δ(x) as ε → 0.
Mollifiers are thus C^∞ approximations to the delta distribution. Now, we can mollify a function f ∈ L^p(R^n) by convolving it with any mollifier, say the standard one:
$$f_\varepsilon = f * \varphi_\varepsilon.$$
The family f_ε is called the mollification of f. For ε > 0, define the following set:
$$\Omega_\varepsilon = \{x \in \Omega : B_\varepsilon(x) \subseteq \Omega\},$$
so it is clear that Ω_ε → Ω as ε → 0. Note that if f : Ω ⊆ R^n → R, then
$$f_\varepsilon = \int_{\mathbb{R}^n}\varphi_\varepsilon(x - y)f(y)\,dy = \int_{B_\varepsilon(0)}\varphi_\varepsilon(y)f(x - y)\,dy.$$
The following theorem discusses the properties of f_ε and the importance of its formulation.
Theorem 3.2.2 Let f ∈ L^p, 1 ≤ p < ∞, and let f_ε = f ∗ ϕ_ε for some mollifier ϕ_ε, ε > 0. Then we have the following:
$$f_\varepsilon = f * \varphi_\varepsilon = \int_{\mathbb{R}^n}\varphi_\varepsilon(x - y)f(y)\,dy.$$
Observe that
$$D^\alpha f_\varepsilon = \int_{\mathbb{R}^n} D_x^\alpha\varphi_\varepsilon(x - y)f(y)\,dy = (-1)^{|\alpha|}\int_{\mathbb{R}^n} D_y^\alpha\varphi_\varepsilon(x - y)f(y)\,dy.$$
That is,
$$(D^\alpha f_\varepsilon)(x) = (D^\alpha f)_\varepsilon(x).$$
For the difference quotients,
$$\frac{f_\varepsilon(x + he_i) - f_\varepsilon(x)}{h} = \frac{1}{\varepsilon^n}\int\frac{1}{h}\left[\varphi\!\left(\frac{x + he_i - y}{\varepsilon}\right) - \varphi\!\left(\frac{x - y}{\varepsilon}\right)\right]f(y)\,dy = \frac{1}{\varepsilon^n}\int_K\frac{1}{h}\left[\varphi\!\left(\frac{x + he_i - y}{\varepsilon}\right) - \varphi\!\left(\frac{x - y}{\varepsilon}\right)\right]f(y)\,dy,$$
and the bracketed quotient converges uniformly on K as h → 0. Hence the derivative of f_ε exists, and a similar argument to the above can be made for D^α f_ε to obtain
$$(D^\alpha f_\varepsilon)(x) = (D^\alpha f)_\varepsilon(x).$$
Write $\frac{1}{p} + \frac{1}{q} = 1$ and
$$\varphi_\varepsilon = \varphi_\varepsilon^{1/q}\cdot\varphi_\varepsilon^{1/p}.$$
Then, by Hölder's inequality,
$$|f_\varepsilon(x)| = \left|\int\varphi_\varepsilon(x - y)f(y)\,dy\right| \le \left(\int_{B_\varepsilon(x)}\varphi_\varepsilon(x - y)\,dy\right)^{1/q}\left(\int_{B_\varepsilon(x)}\varphi_\varepsilon(x - y)\,|f(y)|^p\,dy\right)^{1/p}.$$
But the first integral of the mollifier is equal to 1, and since supp(ϕ_ε) is compact, we conclude that the second integral exists and is finite. So
$$\|f_\varepsilon\|_p^p = \int|f_\varepsilon(x)|^p\,dx \le \int\int_{B_\varepsilon(x)}\varphi_\varepsilon(x - y)\,|f(y)|^p\,dy\,dx,$$
and interchanging the order of integration,
$$\|f_\varepsilon\|_{L^p(\Omega_\varepsilon)}^p \le \int|f(y)|^p\left(\int\varphi_\varepsilon(x - y)\,dx\right)dy \le \int_\Omega|f(y)|^p\,dy = \|f\|_{L^p(\Omega)}^p,$$
and this proves (3). The result can also be proved similarly for the case Ω = R^n.
$$K_\varepsilon = \{x : \mathrm{dist}(x, K) \le \varepsilon\},$$
$$\chi_{K_\varepsilon}(x) = \begin{cases} 1 & x \in K_\varepsilon \\ 0 & x \notin K_\varepsilon.\end{cases}$$
Mollifying χ_{K_ε} gives a function ξ with
$$\mathrm{supp}(\xi) \subset K_\varepsilon + B_\varepsilon(0) = K_{2\varepsilon} \subset U.$$
Therefore, we have ξ(x) ∈ D(R^n). Moreover, 0 ≤ ξ ≤ 1 and
$$\xi(x) = \begin{cases} 1 & |x| \le 1 \\ 0 & |x| \ge 2.\end{cases} \tag{3.2.7}$$
More generally,
$$\xi_m(x) = \begin{cases} 1 & |x| \le m \\ 0 & |x| \ge 2m.\end{cases}$$
Now, one can use the Dominated Convergence Theorem to show that if f ∈ L^p(R^n), then
$$\|\xi_m f - f\|_p \to 0$$
in L^p(R^n).
One important advantage of using cut-off functions over characteristic functions is that multiplying a smooth function by a cut-off function preserves smoothness: if f ∈ C^∞(Ω) then fξ ∈ C^∞(Ω), while multiplying by a characteristic (indicator) function of the form χ_{K_m}(x) may produce jump discontinuities at the boundary of K_m, so fχ_{K_m} won't be smooth on Ω.
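The construction just described — mollifying the characteristic function of an enlarged set — can be sketched numerically. With K = [−1, 1] and ε = 0.25 (illustrative values), ξ = χ_{K_ε} ∗ ϕ_ε equals 1 on K, vanishes outside K_{2ε}, and transitions smoothly in between:

```python
import math

def bump(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

n0, h0 = 20_000, 2.0 / 20_000
C = 1.0 / (sum(bump(-1.0 + (k + 0.5) * h0) for k in range(n0)) * h0)

def xi(x, eps=0.25, n=4000):
    # xi(x) = (chi_{K_eps} * phi_eps)(x) with K = [-1, 1], K_eps = [-1.25, 1.25].
    h = 2.0 * eps / n
    total = 0.0
    for k in range(n):
        y = -eps + (k + 0.5) * h
        if abs(x - y) <= 1.0 + eps:      # chi_{K_eps}(x - y)
            total += C * bump(y / eps) / eps
    return total * h

print(xi(0.0), xi(1.0), xi(1.35), xi(1.6))  # 1 on K, intermediate value, then 0
```

The smooth transition zone lies entirely inside K_{2ε} \ K, which is exactly what makes ξ a cut-off function rather than a jump.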
The third regularizing tool that will be studied is the partition of unity, which can
be obtained by use of cut-off functions. This is an important tool to globalize local
approximations.
Definition 3.2.4 (Locally Finite Cover) Let M be a manifold in R^n, and let {U_i} be a collection of subsets of R^n. Then {U_i} is said to be a locally finite cover of M if for every x ∈ M there exists a neighborhood N(x) of x such that N(x) ∩ U_i = ∅ for all but a finite number of indices i.
In other words, a locally finite cover means that for every point we can find a neighborhood intersecting at most finitely many of the sets in the cover. A topological space is called paracompact if every open cover admits a locally finite refinement. All locally compact Hausdorff spaces are paracompact, hence R^n is paracompact; i.e., every open cover {U_α} of a manifold in R^n has a locally finite refinement {V_i}. If we apply this result again to {V_i}, we obtain another cover {W_i}, and it can be shown that $W_i \subset \overline{W_i} \subset V_i$.
Definition 3.2.5 (Subordinate) Let F₁ = {V_i, i ∈ Λ} and F₂ = {U_i, i ∈ Λ} be two families of sets. Then we say that F₁ is subordinate to F₂ if V_i ⊂ U_i for all i ∈ Λ. Note here that if the two families are covers of some set M, then we say that F₁ is a refinement of F₂ and a subcover of it.
Definition 3.2.6 (Partition of Unity) Let M be a manifold in Rn , and {φi : M →
R : i ∈ I } be a collection of nonnegative smooth functions. Then {φi } is a partition
of unity on M if
(2) $\sum_i \varphi_i(x) = 1$ for all x ∈ M.
In light of the definition, one can also describe the functions as φ_i ∈ C^∞(M), φ_i : M → [0, 1]. The partition of unity can be used to "glue" local smooth paths to obtain a global smooth one, and this will allow us to approximate nonsmooth functions by smooth functions. It is worth noting that the set I is not necessarily countable, but condition (2) implies that the summation is only over countably many indices i ∈ I′ ⊂ I, where I′ is a countable set, so we may assume WLOG that I is countable. Consequently, we have φ_i(x) = 0 for all but countably many i ∈ I. If we want the summation to be over a finite number of indices, then we need extra conditions.
Proposition 3.2.7 Let M be a manifold in R^n, {U_i} be a locally finite cover of M, and let {ξ_i} be a sequence of cut-off functions defined on M. If the subcover {supp(ξ_i)} is subordinate to {U_i}, then for each x ∈ M we have ξ_i(x) = 0 for all but finitely many i ∈ I.
Another way of saying that the subcover {supp(φ_i)} is subordinate to the open cover {U_i} is to say that {φ_i} is subordinate to {U_i}. Note that if {supp(φ_i)} is subordinate to a locally finite cover {U_i}, then {supp(φ_i)} is itself locally finite.
Proof Consider the collection {Ω_i}_{i∈N} of subsets of Ω defined by Ω₀ = ∅ and, for i ≥ 1,
$$\Omega_i = B_i(0) \cap \left\{x \in \Omega : d(x, \partial\Omega) > \frac{1}{i}\right\}.$$
Then for each i we have: Ω_i is open, $\overline{\Omega_i}$ is compact, $\overline{\Omega_i} \subset \Omega_{i+1}$, and $\bigcup_i \Omega_i = \Omega$. Furthermore, for every x ∈ Ω there exists N ∈ N such that
$$x \in \overline{\Omega}_{N+2}\setminus\Omega_{N+1} \subset \Omega_{N+3}\setminus\overline{\Omega}_N.$$
Thus, let
$$K_i = \overline{\Omega}_{i+2}\setminus\Omega_{i+1}$$
and
$$U_i = \Omega_{i+3}\setminus\overline{\Omega}_i.$$
Then clearly {K_i} and {U_i} are collections of compact sets and open sets, respectively, and for each i, K_i ⊂ U_i. It is easy to see from the construction of {U_i} that it is a locally finite cover of Ω (for example, if i < r < i + 1, then B_r(0) will intersect at most U_i, U_{i−1}, and U_{i−2}). By Theorem 3.2.3, there exists a sequence of cut-off functions {ξ_i}_{i∈N} such that for each i we have ξ_i ∈ D(Ω), 0 ≤ ξ_i ≤ 1, with ξ_i ≡ 1 on K_i and supp(ξ_i) ⊂ U_i. This implies by Definition 3.2.5 that the sequence {ξ_i} is subordinate to {U_i}, hence locally finite. By Proposition 3.2.7, the summation $\sum_i \xi_i(x)$ is finite for every x ∈ Ω (only three nonzero terms), so we set
$$\xi(x) = \sum_i \xi_i(x).$$
The function ξ is well-defined and smooth, and ξ(x) ≥ 1 for every x ∈ Ω, since each x lies in some K_i where ξ_i ≡ 1. Now, we define the sequence φ_i ∈ D(Ω) given by
$$\varphi_i(x) = \frac{\xi_i(x)}{\xi(x)}.$$
It is clear that 0 ≤ φ_i ≤ 1 and $\sum\varphi_i(x) = 1$ for all x ∈ Ω. So this is the desired smooth partition of unity.
The three tools that we studied: mollifiers, cut-off functions, and partition of
unity, are among the most powerful tools that can be used to establish density and
approximation results. Mollifiers provide a mollification (smoothening the edges),
and cut-off functions provide compact support, and finally, the partition of unity
enables us to pass from local property to global.
$$\int uv\,dx = 0,$$
$$\int uv \ge \int u\xi > 0,$$
and u_j → u in C. Hence, we begin our density results with the following well-known result in real analysis, which implies that a continuous function can always be approximated by another continuous function of compact support.
Theorem 3.3.1 C_c^∞(R^n) is dense in C(R^n).
Recall that ϕ ∈ C^∞(R^n) is said to be rapidly decreasing if
$$\sup_x\left|x^\alpha D^\beta\varphi(x)\right| < \infty \quad\text{for all multi-indices } \alpha, \beta,$$
which is equivalent to
$$\left|D^\beta\varphi(x)\right| \le C_k(1 + |x|)^{-k}$$
for all k ∈ N and β ∈ N_0^n, |β| ≤ k. The Schwartz space S(R^n) is the space of all such functions.
This section discusses some interesting properties of this space and how it can be
used to construct other function spaces. Schwartz space S(Rn ) has three significant
properties that make it very rich in construction and important in applications. These
properties are:
(1) S(Rn ) is closed under differentiation and multiplication by polynomials.
(2) S(Rn ) is dense in L p (Rn ).
(3) S(R^n) is invariant under the Fourier transform, i.e., the Fourier transform carries S(R^n) onto itself.
The first property is clear from the previous chapter. To demonstrate the other prop-
erties, we need the following theorem.
Theorem 3.3.2 If u ∈ L^p(Ω) for some Ω ⊆ R^n and 1 ≤ p < ∞, then u_ε → u in L^p(Ω).
Proof By density of compactly supported continuous functions in L^p, choose a continuous v of compact support with
$$\|u - v\|_p < \frac{\sigma}{3}, \tag{3.3.3}$$
and define
$$v_\varepsilon = v * \varphi_\varepsilon.$$
Then, we have v_ε ∈ C^∞(R^n). Note that v and ϕ_ε are both of compact support, say K and B_ε(0) respectively, so
$$\mathrm{supp}(v_\varepsilon) \subseteq K + B_\varepsilon(0) \subseteq B_r(0)$$
for some r > 0. Hence, v_ε ∈ C_c^∞(R^n). Moreover, since v is continuous with compact support, it is uniformly continuous on K. So there exists ρ > 0 such that if
$$|x - y| < \rho,$$
then
$$|v(x) - v(y)| < \frac{\sigma}{3}$$
for all x, y ∈ K. Then, using standard property (3) in the preceding section for ϕ_ε, for ε < ρ we get
$$|v_\varepsilon(x) - v(x)| \le \int|v(x - y) - v(x)|\,\varphi_\varepsilon(y)\,dy \le \frac{\sigma}{3}\int_{B_\varepsilon(0)}\varphi_\varepsilon(y)\,dy = \frac{\sigma}{3}. \tag{3.3.4}$$
Taking the supremum of (3.3.4) over x, we conclude that v_ε(x) → v(x) uniformly. Hence, there exists ε₀ sufficiently small such that for all ε < ε₀,
$$\|v_\varepsilon - v\|_p < \frac{\sigma}{3}. \tag{3.3.5}$$
Finally, we have
$$u_\varepsilon - v_\varepsilon = (u - v) * \varphi_\varepsilon, \tag{3.3.6}$$
and one can easily show that an approximation of u by v gives the same bound for u_ε and v_ε (verify). Now, from (3.3.3), (3.3.5), and (3.3.6), we obtain
$$\|u_\varepsilon - u\|_p \le \|u_\varepsilon - v_\varepsilon\|_p + \|v_\varepsilon - v\|_p + \|v - u\|_p < \sigma.$$
Hence
$$\|u_\varepsilon - u\|_{L^p(\Omega)} \to 0.$$
One important consequence that can be established from the previous theorem is the
following.
Theorem 3.3.3 Let Ω ⊆ R^n be an open set. Then, the space C_c^∞(Ω) is dense in L^p(Ω) for 1 ≤ p < ∞.
Proof Extend u by zero to all of R^n, call the extension ū, and define the mollification ū_ε as in the proof of the previous theorem, so that ū_ε converges to ū in L^p(R^n) by the previous theorem. Let ε = ε_m be such that ε_m → 0 as m → ∞. Define a collection of compact sets K_m ⊂ Ω with
$$d(K_m, \partial\Omega) \ge \frac{1}{m}.$$
Then clearly each K_m is compact for all m, and $\bigcup_{m=1}^{\infty} K_m = \Omega$. Now, define a sequence of cut-off functions
$$\xi_m(x) = \begin{cases} 1 & x \in K_m \\ 0 & x \in \mathbb{R}^n\setminus\Omega.\end{cases} \tag{3.3.8}$$
Note that
$$|\bar{u}_{\varepsilon}\xi_m(x) - \bar{u}(x)| \to 0$$
a.e. on R^n, and the conclusion follows from the Dominated Convergence Theorem.
The above density result is not valid for p = ∞ (choose f(x) = c for some nonzero constant c). The significance of this result lies in the fact that any function in L^p can be approximated by a smooth function in C_c^∞. It is clear from the proof how mollifiers and cut-off functions serve as effective tools to construct smooth sequences with compact support: the mollification provides C^∞ and the cutting-off provides compact support. Note that we performed the mollification and then the cutting-off. If the reverse order were performed, we wouldn't obtain a smooth sequence on Ω but rather on K_m ⊂ Ω. This is one crucial advantage of convolving on R^n: it produces a convolution approximating sequence on R^n, not on a subset of it.
Next, we will use the argument of the proof above to prove a generalization of the Fundamental Lemma of COV (Lemma 3.2.9), which will play a crucial role in the theory of elliptic PDEs in Chap. 4.
Theorem 3.3.4 (Fundamental Theorem) If u ∈ L¹_loc(Ω) satisfies
$$\int uv\,dx = 0$$
for every v ∈ C_c^∞(Ω), then u = 0 a.e. in Ω.
Proof Consider
$$u_m = \bar{u}_{\varepsilon}\,\xi_m(x) \in C_c^\infty(\Omega)$$
as in the proof of the previous theorem and with the same settings for ξ_m(x) and ū_ε. Substituting
$$v = \varphi_\varepsilon(x - y)\xi_m(y) \in C_c^\infty(\Omega)$$
in the hypothesis gives u_m(x) = 0,
Proof If there are two weak derivatives v₁, v₂ of u, then using the same argument as in Theorem 3.1.2(1), we obtain
$$\int(v_1 - v_2)\varphi\,dx = 0,$$
The following result is very important and will be useful for upcoming results.
Theorem 3.3.6 S(R^n) is dense in L^p(R^n).
Proof We have D(R^n) ⊂ S(R^n) ⊂ L^p(R^n). The first inclusion is clear since functions in D(R^n), and hence their derivatives, are of compact support, so their suprema exist on compact subsets. For the second inclusion, let u ∈ S(R^n); then u can be written as
$$u(x) = (1 + |x|)^{-\frac{n+1}{p}}\,(1 + |x|)^{\frac{n+1}{p}}u(x).$$
It follows that
$$\|u\|_p^p = \int_{\mathbb{R}^n}|u|^p\,dx \le \sup_x\left((1 + |x|)^{\frac{n+1}{p}}|u(x)|\right)^p\int_{\mathbb{R}^n}\frac{dx}{(1 + |x|)^{n+1}} < \infty, \tag{3.3.9}$$
taking into account that $(1 + |x|)^{\frac{n+1}{p}}u(x)$ is bounded because u ∈ S(R^n). Combining (3.3.9) together with Theorem 3.3.3 and the fact that L^p(R^n) is complete, the result follows.
The Fourier transform is isometric (up to a factor of 2π) with respect to the L² norm, i.e., it preserves the norm. One may eliminate the constant difference by adopting the definition
$$\mathcal{F}\{f\} = \int_{\mathbb{R}^n} f(x)e^{-2\pi i\omega\cdot x}\,dx,$$
and
$$\rho(\varphi + \phi) \le \rho(\varphi) + \rho(\phi).$$
The family {ρ_{k,m}} is a countable base at 0. Consequently, (S(R), {ρ_{k,m}}) is a metrizable space but not a normed space. Consider the Fréchet metric
$$d(\varphi, \phi) = \sum_{k,m}\frac{1}{2^{k+m}}\,\frac{\rho_{k,m}(\varphi - \phi)}{1 + \rho_{k,m}(\varphi - \phi)}. \tag{3.4.1}$$
It is easy to see that the metric d turns S(R) into a Fréchet space. Indeed, let {ϕ_n} ⊂ S(R) be a Cauchy sequence; then it is bounded, and $x^k\varphi_n^{(m)}(x)$ is Cauchy with respect to each seminorm, hence Cauchy in the supremum norm, so it converges uniformly to a bounded continuous function (why?). We therefore have
$$\|\varphi - \varphi_n\|_{k,m} \to 0$$
for all k, m, and thus ϕ ∈ S(R). The same argument can be extended to S(R^n) by considering the seminorms
$$\rho_{\alpha,\beta}(\varphi) = \sup_x\left|x^\alpha D^\beta\varphi(x)\right|$$
and the metric
$$d(\varphi, \phi) = \sum_{\alpha,\beta}\frac{1}{2^{|\alpha|+|\beta|}}\,\frac{\rho_{\alpha,\beta}(\varphi - \phi)}{1 + \rho_{\alpha,\beta}(\varphi - \phi)}.$$
It should be noted that S(R) is not a complete normed space; i.e., there is no norm that would make the space complete. Thus, it is important to extend the space to its completion. This is nothing but the Sobolev space, as in the following definition.
Definition 3.4.1 (Sobolev Space) Let Ω be open in R^n. Then, the Sobolev space, denoted by W^{k,p}(Ω), is defined by
$$W^{k,p}(\Omega) = \{u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } 0 \le |\alpha| \le k\}.$$
In particular, the space W^{k,2}(Ω) is denoted by H^k(Ω). Note that if k = 0, then W^{0,p} = L^p. For k = 1, the definition reads
$$W^{1,p}(\Omega) = \left\{u \in L^p(\Omega) : \frac{\partial u}{\partial x_i} \in L^p(\Omega),\ i = 1, 2, \ldots, n\right\}.$$
The Sobolev space W^{k,p}(Ω) is a normed vector space with the norm
$$\|u\|_{k,p} = \left(\sum_{|\alpha|\le k}\|D^\alpha u\|_p^p\right)^{1/p}, \tag{3.4.2}$$
and, using Minkowski's inequality in L^p and then in the finite sum,
$$\|u + v\|_{k,p} = \left(\sum_{|\alpha|\le k}\|D^\alpha(u + v)\|_p^p\right)^{1/p} \le \left(\sum_{|\alpha|\le k}\big(\|D^\alpha u\|_p + \|D^\alpha v\|_p\big)^p\right)^{1/p} \le \left(\sum_{|\alpha|\le k}\|D^\alpha u\|_p^p\right)^{1/p} + \left(\sum_{|\alpha|\le k}\|D^\alpha v\|_p^p\right)^{1/p} = \|u\|_{k,p} + \|v\|_{k,p}.$$
The Sobolev space contains all functions in L^p whose distributional derivatives up to order k are also in L^p; since these derivatives belong to L^p, they exist and are unique up to a set of measure zero. In general we have the proper inclusion
$$W^{1,p} \subsetneq L^p.$$
The above proper inclusion indicates that there are examples of functions that are in L^p but not in W^{1,p}. They could even be weakly differentiable, but nevertheless they are not Sobolev functions (see Problem 3.11.25). We can tell whether a function is in W^{1,p} by checking whether its derivatives, in the sense of distributions, belong to L^p. For example, let u(x) = x^{−1/3}; then it is readily seen that u ∈ L²(0, 1) but u′ ∉ L²(0, 1), hence u ∉ H¹(0, 1). In general, the Sobolev spaces are nested:
$$W^{k,p}(\Omega) \subset W^{k-1,p}(\Omega) \subset \cdots \subset W^{1,p}(\Omega) \subset L^p(\Omega).$$
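The example u(x) = x^{−1/3} can be probed numerically: its L² mass on (0, 1) is finite (the exact value is 3), while the L² mass of u′ blows up as the lower limit of integration approaches 0. The sketch below uses crude midpoint sums (illustrative only — quadrature near the singularity is inaccurate, but the qualitative dichotomy is clear):

```python
def l2_mass(g, a, b, n=200_000):
    # Squared L^2 mass of g on (a, b) by the midpoint rule.
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) ** 2 for k in range(n)) * h

u = lambda x: x ** (-1.0 / 3.0)
du = lambda x: -x ** (-4.0 / 3.0) / 3.0   # classical derivative of u

print(l2_mass(u, 0.0, 1.0))               # finite: the exact value is 3
for a in (1e-2, 1e-4, 1e-6):
    print(a, l2_mass(du, a, 1.0))         # grows without bound as a -> 0
```

So u ∈ L²(0, 1) while u′ ∉ L²(0, 1): u is an L² function, and even weakly differentiable on (0, 1), yet it fails to lie in H¹(0, 1).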
The importance of the following result is that it helps propose another equivalent definition for H^s.
Proposition 3.4.2 Let u ∈ L²(R^n). Then u ∈ H^k(R^n) iff $(1 + |\omega|^2)^{k/2}\hat{u} \in L^2(\mathbb{R}^n)$.
Proof By Parseval's relation,
$$\sum_{|\alpha|\le k}\|D^\alpha u\|_2^2 = \frac{1}{(2\pi)^n}\sum_{|\alpha|\le k}\|\mathcal{F}\{D^\alpha u\}\|_2^2 = \frac{1}{(2\pi)^n}\sum_{|\alpha|\le k}\left\|i^{|\alpha|}\omega^\alpha\hat{u}(\omega)\right\|_2^2 = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\sum_{|\alpha|\le k}|\omega^\alpha|^2\,|\hat{u}(\omega)|^2\,d\omega.$$
Moreover, there exists M > 1 such that
$$\frac{1}{M}(1 + |\omega|^2)^k \le \sum_{|\alpha|\le k}|\omega^\alpha|^2 \le M(1 + |\omega|^2)^k$$
for all ω ∈ R^n. Let a = 1/M and b = M. Then, by the above argument, we conclude that the norm ‖u‖_{k,2} is equivalent to the norm defined by
$$\|u\| = \left(\int(1 + |\omega|^2)^k\,|\hat{u}(\omega)|^2\,d\omega\right)^{1/2}.$$
The space H^k(R^n) is a Hilbert space endowed with the inner product
$$\langle u, v\rangle_{k,2} = \sum_{|\alpha|\le k}\langle D^\alpha u, D^\alpha v\rangle_2. \tag{3.4.3}$$
Note here that, in this case, the space H s (Rn ) is not a subspace of L 2 (Rn ) (why?).
for all x ∈ X .
Proposition 3.5.1 For the space W^{k,p}, 1 ≤ p < ∞, the norm
$$\|u\|_{k,p} = \left(\sum_{|\alpha|\le k}\|D^\alpha u\|_p^p\right)^{1/p}$$
is equivalent to the norm
$$\|u\| = \sum_{|\alpha|\le k}\|D^\alpha u\|_p.$$
Proof For convenience, let us call the first norm ‖·‖₁ and the second norm ‖·‖₂. Using the fact that |x|^p is convex for p ≥ 1, it is easy to show that, for c₁, c₂ ≥ 0,
$$c_1^p + c_2^p \le (c_1 + c_2)^p \le 2^{p-1}(c_1^p + c_2^p),$$
and more generally $\left(\sum_{i=1}^{n} c_i\right)^p \le n^{p-1}\sum_{i=1}^{n} c_i^p$. It follows that
$$\|u\|_1^p = \sum_{|\alpha|\le k}\|D^\alpha u\|_p^p \le \left(\sum_{|\alpha|\le k}\|D^\alpha u\|_p\right)^p = \|u\|_2^p \le n^{p-1}\sum_{|\alpha|\le k}\|D^\alpha u\|_p^p = n^{p-1}\|u\|_1^p,$$
where n here denotes the number of multi-indices with |α| ≤ k. That is,
$$C_1\|u\|_1 \le \|u\|_2 \le C_2\|u\|_1,$$
where C₁ = 1 and $C_2 = \sqrt[p]{n^{p-1}}$.
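The two-sided convexity inequality used in the proof can be spot-checked numerically over random nonnegative inputs (an illustrative check, not a proof):

```python
import random

random.seed(0)
for _ in range(1000):
    p = random.uniform(1.0, 5.0)
    c1 = random.uniform(0.0, 10.0)
    c2 = random.uniform(0.0, 10.0)
    lo = c1 ** p + c2 ** p
    mid = (c1 + c2) ** p
    hi = 2 ** (p - 1) * (c1 ** p + c2 ** p)
    # c1^p + c2^p <= (c1 + c2)^p <= 2^(p-1) (c1^p + c2^p)
    assert lo <= mid + 1e-9 and mid <= hi + 1e-9
print("two-sided inequality held on 1000 random samples")
```

Both bounds are sharp: the left is attained as one of the cᵢ tends to 0, the right at c₁ = c₂.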
The convergence in Sobolev spaces can now be defined as follows:
Definition 3.5.2 (Convergence in Sobolev Spaces) Let u_m, u ∈ W^{k,p}(Ω), for Ω ⊆ R^n. Then u_m → u in W^{k,p}(Ω) if
$$\lim_{m\to\infty}\|u_m - u\|_{k,p} = 0.$$
Equivalent norms that are often used are
$$\|u\|_{k,p} = \sum_{|\alpha|\le k}\|D^\alpha u\|_p$$
and (for k = 1)
$$\|u\|_{W^{1,p}(\Omega)} = \left(\int_\Omega|u|^p\right)^{1/p} + \left(\int_\Omega|Du|^p\right)^{1/p},$$
respectively. Each one of them can be convenient to use in some cases, and we will
constantly interchange the two norms and use whichever is suitable for the specific
argument.
$$\|u_m - u\|_p \to 0, \qquad \|D^\alpha u_m - v\|_p \to 0.$$
For any ϕ ∈ C_c^∞,
$$\left|\int(D^\alpha u_m)\varphi - \int v\varphi\right| \le \int\left|(D^\alpha u_m - v)\varphi\right| \le \|D^\alpha u_m - v\|_p\,\|\varphi\|_q \to 0$$
as m → ∞. Consequently,
$$\lim_{m\to\infty}\int(D^\alpha u_m)\varphi = \int v\varphi. \tag{3.5.1}$$
Moreover, as m → ∞, we have
$$\left|\int u_m(D^\alpha\varphi) - \int u(D^\alpha\varphi)\right| \le \|u_m - u\|_p\,\|D^\alpha\varphi\|_q \to 0,$$
so
$$\int u\,D^\alpha\varphi = \lim\int u_m\,D^\alpha\varphi = (-1)^{|\alpha|}\lim\int D^\alpha u_m\,\varphi = (-1)^{|\alpha|}\int v\varphi.$$
Hence D^α u = v, and therefore
$$\|D^\alpha u_m - D^\alpha u\|_p \to 0.$$
It follows that
$$\|u_m - u\|_{k,p} \to 0$$
for all 0 ≤ |α| ≤ k. Therefore u_m → u in W^{k,p}(R^n), and consequently W^{k,p}(R^n) is complete.
To show separability, note that the product space (Z, ‖·‖_Z), defined as
$$Z = \prod_{j=1}^{N+1}(L^p)_j,$$
where N + 1 is the number of multi-indices |α| ≤ k, is separable. Define
$$T : W^{k,p}(\mathbb{R}^n) \to Z, \qquad (Tu)_j = D^\alpha u,$$
so that
$$\|u\|_{k,p} = \|Tu\|_Z.$$
Recall that in Sect. 3.1 we defined L¹_loc(Ω), the space of locally integrable functions on Ω, to be the functions that are Lebesgue integrable on every compact subset of Ω. We would like to define a similar space for the Sobolev spaces. However, one critical issue arises here. Sobolev spaces involve weak derivatives, and these derivatives might possess bizarre behavior on the boundary of compact sets, so defining the locality in terms of compactness might be problematic. Alternatively, we strengthen the idea of locality to involve an open set whose closure is a compact proper subset of the domain. This gives control over the derivatives within the domain.
Definition 3.5.4 (Compact Inclusion) A set Ω′ is said to be compactly contained in Ω (denoted by Ω′ ⊂⊂ Ω) if $\Omega' \subset \overline{\Omega'} \subset \Omega$, where $\overline{\Omega'}$ is compact.
Remark In other textbooks, this may also be denoted by Ω′ ⋐ Ω, but we will not adopt this notation.
Now, we define the local Sobolev space.
Definition 3.5.5 (Local Sobolev Space) Let Ω ⊆ R^n. Then, the local Sobolev space, denoted by $W^{k,p}_{\mathrm{loc}}(\Omega)$, 0 ≤ |α| ≤ k, is defined by
$$W^{k,p}_{\mathrm{loc}}(\Omega) = \{u \in L^p(\Omega) : u \in W^{k,p}(\Omega')\ \text{for every } \Omega' \subset\subset \Omega\}.$$
We can alternatively say that $u \in W^{k,p}_{\mathrm{loc}}(\Omega)$ if u ∈ W^{k,p}(K) for any compact set K ⊂ Ω, which might be a more convenient way of describing the functions in the space. The functions in this space don't have any growth constraints at the boundary. Convergence in local Sobolev spaces can be defined similarly to that for the Sobolev space: let $u_m, u \in W^{k,p}_{\mathrm{loc}}(\Omega)$, for Ω ⊆ R^n. Then we say that the sequence u_m converges to u in $W^{k,p}_{\mathrm{loc}}(\Omega)$ if for every Ω′ ⊂⊂ Ω we have
$$\lim_{m\to\infty}\|u_m - u\|_{W^{k,p}(\Omega')} = 0.$$
$$D^\alpha u = \frac{\partial u}{\partial x_i},$$
i = 1, ..., n. Let ϕ ∈ C_c^∞(Ω). Note that ψϕ ∈ C_c^∞(Ω), hence we can use the classical product rule
$$\frac{\partial}{\partial x_i}(\varphi\psi) = \varphi\frac{\partial\psi}{\partial x_i} + \psi\frac{\partial\varphi}{\partial x_i}.$$
Then
$$\int\frac{\partial(u\psi)}{\partial x_i}\,\varphi = -\int u\psi\,\frac{\partial\varphi}{\partial x_i} = -\int u\,\frac{\partial}{\partial x_i}(\varphi\psi) + \int u\varphi\,\frac{\partial\psi}{\partial x_i} = \int\frac{\partial u}{\partial x_i}\,\varphi\psi + \int u\varphi\,\frac{\partial\psi}{\partial x_i} = \int\left(u\frac{\partial\psi}{\partial x_i} + \psi\frac{\partial u}{\partial x_i}\right)\varphi.$$
Hence,
$$\frac{\partial(u\psi)}{\partial x_i} = u\frac{\partial\psi}{\partial x_i} + \psi\frac{\partial u}{\partial x_i}.$$
Now, assume the formula holds for |α| = k, and consider α = β + γ such that |β| = k and |γ| = 1. Then by Theorem 3.1.2(2) we have
$$D^\beta D^\gamma = D^\alpha.$$
It follows that
$$\int u\psi\,D^\alpha\varphi = \int u\psi\,D^\beta(D^\gamma\varphi) = (-1)^{|\beta|}\sum_{\eta\le\beta}\binom{\beta}{\eta}\int(D^\eta\psi)(D^{\beta-\eta}u)(D^\gamma\varphi) = (-1)^{|\beta|+|\gamma|}\sum_{\eta\le\beta}\binom{\beta}{\eta}\int D^\gamma\big(D^\eta\psi\,D^{\beta-\eta}u\big)\,\varphi.$$
Now, we apply Theorem 3.1.2 to the RHS of the equality and rearrange terms; making use of the fact that
$$\binom{\beta}{\eta-\gamma} + \binom{\beta}{\eta} = \binom{\alpha}{\eta},$$
we obtain
$$\int u\psi\,D^\alpha\varphi = (-1)^{|\alpha|}\sum_{\eta\le\alpha}\binom{\alpha}{\eta}\int(D^\eta\psi)(D^{\alpha-\eta}u)\,\varphi.$$
Therefore,
$$D^\alpha(u\psi) = \sum_{\eta\le\alpha}\binom{\alpha}{\eta}(D^\eta\psi)(D^{\alpha-\eta}u),$$
and
$$D^\alpha(u\psi) \in L^p(\Omega).$$
$$D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u = (D^\alpha u)_\varepsilon$$
for all x ∈ R^n.
(2) If u ∈ W^{k,p}(Ω), then u_ε ∈ C^∞(Ω_ε), and
$$D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u = (D^\alpha u)_\varepsilon$$
for all x ∈ Ω_ε.
(3) ‖u_ε‖_{k,p} ≤ ‖u‖_{k,p}.
Proof (1) and (2) follow immediately from Theorem 3.2.2 since every u in W^{k,p}(R^n) is in L^p(R^n). For (3), Theorem 3.2.2 proved that
$$\|u_\varepsilon\|_p \le \|u\|_p; \tag{3.5.3}$$
applying (3.5.3) to each D^α u gives
$$\|(D^\alpha u)_\varepsilon\|_p \le \|D^\alpha u\|_p. \tag{3.5.4}$$
Now (3) follows from (3.5.3) and (3.5.4). The result can also be proved similarly for the case Ω = R^n.
3.5.6 $W_0^{k,p}(\Omega)$
One of the important Sobolev spaces is the so called: “zero-boundary Sobolev space”.
This is defined in most textbooks as the closure (i.e., completion) of the space Cc∞ .
However, since we haven’t yet discussed approximation results, we shall adopt for
the time being an equivalent definition which may seem to be a bit more natural.
Definition 3.5.8 (Zero-Boundary Sobolev Space) Let ⊆ Rn . Then, the zero-
k, p
boundary Sobolev space, denoted by W0 (), is defined by
k, p
In words, the Sobolev functions in W0 () together with all their weak derivatives
up to k − 1 vanish on the boundary. More precisely,
1, p
W0 (Rn ) = {W 1, p (Rn ) ∈ L P (Rn ) such that u |∂ = 0}.
Two advantages of this property, as we shall see later, is that: 1. the regularity of
the boundary is not necessary, and 2. extensions can be easily constructed outside .
3.6 W 1, p () 167
k, p
Functions in W0 () are thus important in the theory of PDEs since they naturally
satisfy Dirichlet condition on the boundary ∂.
Proposition 3.5.9 For any $\Omega \subseteq \mathbb{R}^n$, the space $W_0^{k,p}(\Omega)$ is Banach, separable, and reflexive, for every $k \ge 0$, $1 \le p \le \infty$.

Proof It suffices to prove that $W_0^{1,p}(\Omega)$ is closed in $W^{1,p}(\Omega)$. The proofs of separability and reflexivity are very similar to those for $W^{1,p}(\Omega)$. The details are left to the reader as an exercise.
3.6 $W^{1,p}(\Omega)$

The space $W^{1,p}(\Omega)$ is particularly important since it provides the basis for several properties of Sobolev spaces. Sometimes it is simpler to prove certain properties in $W^{1,p}(\Omega)$, because the techniques involved can become more complicated when dealing with $k > 1$; so we establish the result for $W^{1,p}(\Omega)$, knowing that the results can be extended to general $k$ by induction. The next result demonstrates one of the features of this space that distinguish it from other Sobolev spaces. We consider the simplest type of this space, which is $W^{1,1}(I)$, where $I$ is an interval in $\mathbb{R}$. The Sobolev norm in this case takes the form
$$\|u\|_{1,1} = \int_I |u|\,dx + \int_I |u'|\,dx.$$
The following theorem gives a relation between classical derivatives and weak derivatives.
$$(u - \tilde{u})' = 0,$$
and so
$$u - \tilde{u} = c.$$
Hence $\tilde{u}' = u'$. This implies that $Du$ exists, and since $u$ is absolutely continuous, we have by the Fundamental Theorem of Calculus
$$Du = u' \in L^1(I).$$
$$K \subset \Omega' \subset\subset \Omega,$$
$$\operatorname{supp}(\xi) \subseteq \Omega' \subset \Omega$$
for some open $\Omega' \supset K$. By Theorem 3.3.3, there exists $u_m \in \mathcal{D}(\Omega)$ such that
$$u_m \longrightarrow u$$
in $L^p(\Omega)$, and
$$\frac{\partial u_m}{\partial x_i} \longrightarrow \frac{\partial u}{\partial x_i}.$$
Further, we have
$$\left\|\frac{\partial(\xi u_m)}{\partial x_i} - \frac{\partial(\xi u)}{\partial x_i}\right\|_{L^p(\Omega)} = \left\|(u_m - u)\frac{\partial \xi}{\partial x_i} + \xi\left(\frac{\partial u_m}{\partial x_i} - \frac{\partial u}{\partial x_i}\right)\right\|_{L^p(\Omega)}$$
$$\le \left\|\frac{\partial \xi}{\partial x_i}\right\|_\infty \|u_m - u\|_{L^p(\Omega)} + \|\xi\|_\infty \left\|\frac{\partial u_m}{\partial x_i} - \frac{\partial u}{\partial x_i}\right\|_{L^p(\Omega)} \longrightarrow 0.$$
Therefore,
$$\xi u_m \longrightarrow \xi u = u.$$
By completeness, $u \in W_0^{1,p}(\Omega)$.

Remark The result of the previous proposition can easily be extended to $W^{k,p}(\Omega)$.
$$u * f \in W^{1,p}(\mathbb{R}^n)$$
and
$$D_{x_i}(f * u) = f * D_{x_i}u.$$
Indeed, for $\varphi \in C_c^\infty(\mathbb{R}^n)$,
$$\begin{aligned}
&= -\int_{\mathbb{R}^n} u\,(f * D_{x_i}\varphi)\\
&= -\int_{\mathbb{R}^n} u\,D_{x_i}(f * \varphi)\\
&= \int_{\mathbb{R}^n} (D_{x_i}u)(f * \varphi)\\
&= \int_{\mathbb{R}^n} (f * D_{x_i}u)\,\varphi.
\end{aligned}$$
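The identity $D_{x_i}(f*u) = f*D_{x_i}u$ can be checked numerically. The sketch below (an illustration of ours, with arbitrarily chosen smooth sample functions) works on a periodic grid, where both the central difference and the discrete convolution are circulant operators and therefore commute exactly.

```python
import numpy as np

# Discrete check that differentiation commutes with convolution:
# D(f*u) = f*(Du), on a periodic grid via FFT convolution.
N = 1024
x = np.linspace(0, 2*np.pi, N, endpoint=False)
h = x[1] - x[0]

u = np.exp(np.sin(x))           # smooth periodic sample function
f = np.exp(-np.cos(x))          # smooth kernel (normalization irrelevant here)

conv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a)*np.fft.fft(b))) * h
diff = lambda a: (np.roll(a, -1) - np.roll(a, 1)) / (2*h)  # central difference

lhs = diff(conv(f, u))   # D_x (f * u)
rhs = conv(f, diff(u))   # f * (D_x u)
err = np.max(np.abs(lhs - rhs))
print(err)               # floating-point noise only
```

Both sides agree to rounding error because the two discrete operators commute, mirroring the integration-by-parts argument above.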
3.6.2 Inclusions

1. If $k_1 \le k_2$ then
$$W^{k_2,p}(\Omega) \subset W^{k_1,p}(\Omega).$$
2. If $\Omega' \subset \Omega$ then
$$W^{k,p}(\Omega) \subset W^{k,p}(\Omega').$$
4. If $\Omega$ is bounded then
$$W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega) \subset W_{loc}^{k,p}(\Omega) \subset L_{loc}^p(\Omega).$$
Proof The proofs follow directly from the definitions of the spaces.
The next result is a continuation of the calculus of weak derivatives. We have established various types of derivative formulas, and it remains to establish the chain rule, which plays an important role when we discuss the extension of Sobolev functions.

Theorem 3.6.5 Let $u \in W^{1,p}(\Omega)$, $1 \le p \le \infty$, and $F \in C^1(\mathbb{R})$ such that $|F'| \le M$. Then
$$\frac{\partial}{\partial x_j}(F \circ u) = F'(u)\cdot\frac{\partial u}{\partial x_j}$$
and
$$F \circ u \in W^{1,p}(\Omega).$$
Proof Let $1 \le p < \infty$. Since $\frac{\partial u}{\partial x_i} \in L^p(\Omega)$ for $1 \le i \le n$, and $F' \in C(\mathbb{R})$ with $|F'| \le M$,
$$\int_\Omega \left|F'(u)\,\frac{\partial u}{\partial x_i}\right|^p dx \le M^p \int_\Omega \left|\frac{\partial u}{\partial x_i}\right|^p dx < \infty.$$
Hence,
$$F'(u)\,\frac{\partial u}{\partial x_i} \in L^p(\Omega). \tag{3.6.1}$$
Let $u_\varepsilon$ be the mollification of $u$, so that $u_\varepsilon \to u$ a.e. Hence,
$$F(u_\varepsilon) \longrightarrow F(u), \qquad F'(u_\varepsilon) \longrightarrow F'(u),$$
and since $\frac{\partial u}{\partial x_i} \in L^p(\Omega)$, we have by Theorem 3.3.2
$$\frac{\partial u_\varepsilon}{\partial x_i} \longrightarrow \frac{\partial u}{\partial x_i} \text{ in } L^p(\Omega).$$
Then
$$\left\|F'(u_\varepsilon)\frac{\partial u_\varepsilon}{\partial x_i} - F'(u)\frac{\partial u}{\partial x_i}\right\|_{L^p(\Omega)} \le M\left\|\frac{\partial u_\varepsilon}{\partial x_i} - \frac{\partial u}{\partial x_i}\right\|_{L^p(\Omega)} + \left\|\big(F'(u_\varepsilon) - F'(u)\big)\frac{\partial u}{\partial x_i}\right\|_{L^p(\Omega)}.$$
Hence,
$$F'(u_\varepsilon)\frac{\partial u_\varepsilon}{\partial x_i} \longrightarrow F'(u)\frac{\partial u}{\partial x_i}.$$
Consequently,
$$\frac{\partial F(u(x))}{\partial x_i} = F'(u)\,\frac{\partial u}{\partial x_i}.$$
Integrating over $\Omega$, we conclude $F \circ u \in W^{1,p}(\Omega)$. If $\Omega$ is bounded, then
$$\int_\Omega |F(0)|^p\,dx < \infty,$$
and since $|F(u)| \le M|u| + |F(0)|$ with
$$M|u| + |F(0)| \in L^p(\Omega),$$
it follows that $F \circ u \in L^p(\Omega)$.
For $p = \infty$, note that if $u \in W^{1,\infty}(\Omega)$, then $u \in W^{1,p}(\Omega)$ and the chain rule holds for all $p < \infty$.
It follows that
$$\int_\Omega F_\varepsilon(u)\,\frac{\partial \varphi}{\partial x_i}\,dx = -\int_\Omega \varphi\,\frac{u}{\sqrt{u^2+\varepsilon^2}}\,\frac{\partial u}{\partial x_i}\,dx,$$
and so $|u| \in W^{1,p}(\Omega)$, and the weak derivative is the sum of (3.6.2) and (3.6.3). The details are left to the reader.
Recall from basic functional analysis that the dual space $X^*$ of a normed space $X$ is the space consisting of all bounded linear functionals on $X$; i.e., $f \in X^*$ if $f: X \to \mathbb{R}$ is a bounded linear functional. (Here the scalar field could also be $\mathbb{C}$, but we focus only on $\mathbb{R}$ in this book.) A fundamental result is that if $p$ and $q$ are conjugates in the sense that $\frac{1}{p} + \frac{1}{q} = 1$, then
$$(L^p)^* = L^q.$$
Since $W^{1,p}(\Omega)$ contains $L^p$ functions and their derivatives, one can define the dual space as follows:

Definition 3.6.7 (Dual of Sobolev Space) Let $q$ be the conjugate of $p$. Then the dual space $(W^{1,p}(\Omega))^*$ of the Sobolev space $W^{1,p}(\Omega)$ is defined as the space of all bounded linear functionals on $W^{1,p}(\Omega)$, and is denoted by $W^{-1,q}(\Omega)$.

In general, we have
$$(W^{k,p}(\Omega))^* = W^{-k,q}(\Omega).$$
For $p = 2$, the space $H^{-1}$ is the dual space of $H_0^1$. The norm defined on $H^{-1}$ can be written accordingly. Note that
$$\int_\Omega v\,D^2 u = -\int_\Omega Dv \cdot Du$$
for all $v \in H_0^1(\Omega)$. On the other hand, every $u \in L^2$ defines a functional
$$f_u \in H^{-1}(\Omega).$$
Recall that in Theorem 3.2.2 it was shown that if $u \in L^p(\Omega)$ then $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$, and
$$D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u.$$
Theorem 3.3.2 also established that $u_\varepsilon \to u$ in $L^p(\Omega)$. This gives an approximating sequence, in the form of a convolution, which approaches $u$ from the interior of the domain. For every fixed $\varepsilon > 0$, we have an interior neighborhood $\Omega_\varepsilon \subset \Omega$ that approaches $\Omega$ from inside. This is a "local" type of approximation. This localness necessarily requires a bounded subset $\Omega$. As soon as we consider the whole space $\mathbb{R}^n$, the localness property should disappear.
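As a concrete illustration of this interior approximation, the following sketch (our own; all names and the sample function are assumptions, not the book's) mollifies $u(x) = |x|$ with the standard bump kernel and observes the $L^1$ convergence $u_\varepsilon \to u$ as $\varepsilon \to 0^+$, measured away from the boundary.

```python
import numpy as np

def bump(t, eps):
    # standard mollifier profile, supported in (-eps, eps)
    s = np.clip(1.0 - (t/eps)**2, 1e-12, None)
    return np.exp(-1.0/s) * (np.abs(t) < eps)

def mollify(u, x, eps):
    h = x[1] - x[0]
    t = np.arange(-eps, eps + h/2, h)
    phi = bump(t, eps)
    phi /= phi.sum() * h                       # normalize to unit mass
    return np.convolve(u, phi, mode='same') * h

x = np.linspace(-1.0, 1.0, 4001)
u = np.abs(x)                                  # kink at 0: u is not C^1
inner = np.abs(x) <= 0.5                       # stay away from the boundary
errs = [np.sum(np.abs(mollify(u, x, e) - u)[inner]) * (x[1] - x[0])
        for e in (0.2, 0.1, 0.05)]
print(errs)   # decreasing: u_eps -> u in L^1 on the interior
```

Each $u_\varepsilon$ is smooth, and the interior $L^1$ error shrinks with $\varepsilon$, while near the boundary the convolution only "sees" $u$ on $\Omega_\varepsilon$, which is the localness discussed above.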
If $\Omega = \mathbb{R}^n$ then
$$u_\varepsilon \longrightarrow u \text{ in } W^{k,p}(\mathbb{R}^n).$$
Indeed,
$$D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u, \tag{3.7.1}$$
so $u_\varepsilon \in W^{k,p}(\Omega_\varepsilon)$. Choosing any compact set $\Omega' \subset\subset \Omega$ such that $\Omega' \subset \Omega_\varepsilon$ for some small $\varepsilon > 0$, then for all $|\alpha| \le k$, letting $\varepsilon \to 0^+$, we have
$$\|D^\alpha u_\varepsilon - D^\alpha u\|_{L^p(\Omega')} = \|(D^\alpha u)_\varepsilon - D^\alpha u\|_{L^p(\Omega')} \longrightarrow 0,$$
i.e., $u_\varepsilon \longrightarrow u$ in $W^{k,p}(\mathbb{R}^n)$.
As discussed above, in order to remove the localness property, one needs to work on the whole space. However, one can still obtain global results when considering bounded sets. The following theorem extends our previous result from local to global.

Theorem 3.7.2 (Meyers–Serrin Theorem) For every open set $\Omega \subseteq \mathbb{R}^n$ and $1 \le p < \infty$, $C^\infty(\Omega) \cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$.

Proof For $\Omega = \mathbb{R}^n$, let
$$u_\varepsilon = \varphi_\varepsilon * u,$$
so that
$$\|u_\varepsilon\|_{k,p} \le \|u\|_{k,p}, \qquad \|D^\alpha u_\varepsilon - D^\alpha u\|_p \to 0$$
in $L^p(\mathbb{R}^n)$. Therefore,
$$\|u_\varepsilon - u\|_{k,p}^p = \sum_{|\alpha|\le k} \|D^\alpha u_\varepsilon - D^\alpha u\|_p^p \xrightarrow{\;\varepsilon\to 0\;} 0,$$
so
$$u_\varepsilon \in C^\infty(\mathbb{R}^n) \cap W^{k,p}(\mathbb{R}^n),$$
and $u_\varepsilon \longrightarrow u$ in $W^{k,p}(\mathbb{R}^n)$.
Now, let $\Omega$ be open in $\mathbb{R}^n$. Then there exists a smooth locally finite partition of unity $\tilde\xi_i \in C_c^\infty(\Omega)$ subordinate to a cover $\{U_i\}$. Let $\varepsilon > 0$ and $u \in W^{k,p}(\Omega)$. One then obtains
$$\|D^\alpha u_n - D^\alpha u\|_{W^{k,p}(\Omega)} \longrightarrow 0$$
for all $|\alpha| \le k$. This is a significant advantage over the local approximation. The theorem has several important consequences. One consequence is the following corollary, which is analogous to (3.7.4).
Corollary 3.7.3 $\overline{C_c^\infty(\Omega)} = W_0^{k,p}(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$.

Proof In the proof of the previous theorem, let $u \in W_0^{k,p}(\Omega)$; then the sequence $u_i$ in (3.7.2) belongs to $W_0^{k,p}(\Omega)$, hence by (3.7.3) and the argument thereafter we have $v \in C_c^\infty(\Omega)$. This gives
$$\overline{C_c^\infty(\Omega) \cap W^{k,p}(\Omega)} = W_0^{k,p}(\Omega).$$
This result serves as an alternative definition of $W_0^{k,p}(\Omega)$.

Definition 3.7.4 (Zero-Boundary Sobolev Space) Let $\Omega$ be open in $\mathbb{R}^n$. The space $W_0^{k,p}(\Omega)$ is defined as the closure of $C_c^\infty(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$.

The proof of the previous corollary clearly shows that Definition 3.5.8 implies Definition 3.7.4, whereas Definition 3.7.4 trivially implies Definition 3.5.8; thus the two definitions are equivalent.
Another important consequence of Meyers–Serrin is that any Sobolev function on the whole space $\mathbb{R}^n$ can be approximated by a smooth function in the $\|\cdot\|_{k,p}$ norm, i.e.,
$$\overline{C^\infty(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$$
in the Sobolev norm $\|\cdot\|_{k,p}$. The next result has even more to say.

Proposition 3.7.5 $\overline{C_c^\infty(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$ in the Sobolev norm $\|\cdot\|_{k,p}$ for $1 \le p < \infty$.
and set
$$v_j = \xi_j u_j.$$
Then $v_j \in C_c^\infty(\mathbb{R}^n)$, and clearly $v_j(x) \to u$ a.e. Differentiating for $|\alpha| = 1$ and $i = 1,2,\dots,n$, we obtain
$$\frac{\partial v_j}{\partial x_i} = \xi_j\,\frac{\partial u_j}{\partial x_i} + \frac{1}{j}\,\xi'\,u_j \xrightarrow{\text{a.e.}} \frac{\partial u_j}{\partial x_i} = \frac{\partial u}{\partial x_i},$$
i.e.,
$$\frac{\partial v_j}{\partial x_i} \longrightarrow \frac{\partial u}{\partial x_i} \text{ in } L^p(\mathbb{R}^n).$$
A similar argument can be made for higher derivatives up to order $k$, using the Leibniz rule, to obtain
$$\|D^\alpha(v_j) - D^\alpha u\|_p \to 0.$$
Thus,
$$D^\alpha(v_j) \in L^p(\mathbb{R}^n).$$
Two important results can be inferred from the previous result. It was indicated earlier that the Sobolev space is supposed to be the completion of the Schwartz space. Since
$$C_c^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n) \subset W^{k,p}(\mathbb{R}^n),$$
we infer

Corollary 3.7.6 $\overline{\mathcal{S}(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$.

The second result that can be inferred from Corollary 3.7.3 concerns the connection between the two Sobolev spaces $W$ and $W_0$. Although we have the inclusion
$$W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega),$$
the two spaces coincide when $\Omega = \mathbb{R}^n$.

Proof Note that from the proof of Proposition 3.7.5, for every $u \in W^{k,p}(\mathbb{R}^n)$ we can find an approximating sequence $v_j \in C_c^\infty(\mathbb{R}^n)$. So $v_j \in W_0^{k,p}(\mathbb{R}^n)$. Since $v_j \longrightarrow u$ in the $\|\cdot\|_{k,p}$ norm, by the completeness of the space we obtain $u \in W_0^{k,p}(\mathbb{R}^n)$.
3.8 Extensions

3.8.1 Motivation

In the previous sections, we obtained our results on bounded sets and on $\mathbb{R}^n$. In general, the behavior of Sobolev functions on the boundary of the domain has always been a critical issue that can significantly affect the properties of the (weak) solutions of a partial differential equation. In this regard, it is sometimes useful to extend from $W^{k,p}(\Omega)$ to $W^{k,p}(\Omega')$ for some $\Omega \subset \Omega'$, in particular from $W^{k,p}(\Omega)$ to $W^{k,p}(\mathbb{R}^n)$, because functions in $W^{k,p}(\Omega)$ then inherit some important properties from those in $W^{k,p}(\mathbb{R}^n)$. This boils down to extending Sobolev functions defined on a bounded set to functions defined on $\mathbb{R}^n$. However, we need to make certain that our new functions preserve the weak derivative and other geometric properties across the boundary. One of the many important goals is to use the extension to obtain embedding results for $W^{k,p}(\Omega)$ from $W^{k,p}(\mathbb{R}^n)$. It should be noted that we have already used the zero extension in (3.3.7) in the proof of Theorem 3.3.2 in the case $\Omega \subset \mathbb{R}^n$. The treatment there wasn't really problematic, because we dealt merely with $L^p$ functions. In Sobolev spaces, the issue becomes more delicate due to the involvement of weak derivatives.
The first type of extension is the zero extension. For a function $f$ defined on $\Omega$, the zero extension is simply defined by
$$\bar f(x) = f(x)\cdot\chi_\Omega(x) = \begin{cases} f(x) & x \in \Omega\\ 0 & x \in \mathbb{R}^n \setminus \Omega. \end{cases} \tag{3.8.1}$$
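The delicacy alluded to above can already be seen numerically: zero-extending a function that does not vanish on $\partial\Omega$ creates a jump whose difference quotients blow up like a boundary delta (hence no weak derivative in $L^p$), while a function vanishing on the boundary extends harmlessly. A small sketch of ours, in one dimension:

```python
import numpy as np

# Zero extension (3.8.1): fbar = f on Omega = (0,1), 0 outside.
x = np.linspace(-1, 2, 3001)
h = x[1] - x[0]
inside = (x > 0) & (x < 1)

f = np.where(inside, 1.0, 0.0)           # f == 1 on Omega: jumps at boundary
g = np.where(inside, x*(1 - x), 0.0)     # g vanishes on the boundary

Df = np.diff(f) / h                       # difference quotients of extensions
Dg = np.diff(g) / h

print(np.max(np.abs(Df)), np.max(np.abs(Dg)))
# Df spikes like 1/h at x = 0, 1 (a delta appears in the distributional
# derivative), while Dg stays bounded, consistent with g in W_0^{1,p}.
```

Refining the grid makes the spike in `Df` grow without bound, which is the numerical shadow of the broken graph described next.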
Proof Let $\bar u \in L^p(\mathbb{R}^n)$ be the zero extension of $u$. For $\varepsilon > 0$, consider the convolution approximating sequence
$$\bar u_\varepsilon(x) = \int_{\Omega \cap B_\varepsilon(x)} u(y)\,\varphi_\varepsilon(x-y)\,dy = u(x) * \varphi_\varepsilon(x) \text{ on } \Omega.$$
Proof Let $u \in L^p(\Omega)$. Then $\bar u \in L^p(\mathbb{R}^n)$. Hence, by Theorems 3.3.2 and 3.3.3, the mollification $\bar u_m \in C^\infty(\mathbb{R}^n)$ and $\bar u_m \to \bar u$ in $L^p(\mathbb{R}^n)$. Hence,
$$\|u_m - u\|_{L^p(\Omega)} \le \|\bar u_m - \bar u\|_{L^p(\mathbb{R}^n)} \longrightarrow 0.$$
For convenience, set $u_m = \bar u_m|_\Omega$, consider a partition of unity $\xi_m \in C_c^\infty(\Omega)$, and define the sequence
$$w_m = u_m \xi_m.$$
But
$$\|w_m - \xi_m u\|_{L^p(\Omega)} = \|\xi_m(u_m - u)\|_{L^p(\Omega)} \le \|u_m - u\|_{L^p(\Omega)} \to 0,$$
and
$$\|\xi_m u - u\|_{L^p(\Omega)} \longrightarrow 0.$$
Hence, $w_m \longrightarrow u$ in $L^p(\Omega)$.
The situation in Sobolev spaces is more delicate, since they involve weak derivatives. The zero extension breaks the graph of the function across the boundary, and a jump discontinuity may occur; consequently, the weak derivatives could fail to exist. The space $W_0^{k,p}(\Omega)$ will play an important role here, since functions in this space are already assumed to vanish at the boundary, so the zero extension won't break the graph.

Proposition 3.8.3 Let $\Omega$ be open in $\mathbb{R}^n$, and let $u \in W_0^{k,p}(\Omega)$ for some $1 < p < \infty$. Then $\bar u \in W^{k,p}(\mathbb{R}^n)$ and
$$\|\bar u\|_{W^{k,p}(\mathbb{R}^n)} = \|u\|_{W^{k,p}(\Omega)}.$$
Proof If $u \in W_0^{k,p}(\Omega)$, then $u \in L^p(\Omega)$, and so $\bar u \in L^p(\mathbb{R}^n)$. Moreover, by the Meyers–Serrin Theorem there exists a sequence $u_m \in C_c^\infty(\Omega)$ such that $u_m \to u$ in $W^{k,p}(\Omega)$, so $u_m \to u$ in $L^p(\Omega)$. Also, there exists $\bar u_m \in C_c^\infty(\mathbb{R}^n)$ such that
$$\begin{aligned}
\int_{\mathbb{R}^n} \bar u\,D^\alpha\varphi\,dx &= \int_\Omega u\,D^\alpha\varphi\,dx\\
&= \int_\Omega (\lim u_m)\,D^\alpha\varphi\,dx\\
&= \lim \int_\Omega u_m\,D^\alpha\varphi\,dx\\
&= (-1)^{|\alpha|}\lim \int_\Omega D^\alpha u_m\,\varphi\,dx\\
&= (-1)^{|\alpha|}\int_\Omega D^\alpha u\,\varphi\,dx\\
&= (-1)^{|\alpha|}\int_{\mathbb{R}^n} \overline{D^\alpha u}\,\varphi\,dx.
\end{aligned}$$
So
$$D^\alpha \bar u = \overline{D^\alpha u} \text{ on } \mathbb{R}^n,$$
$$u_m \longrightarrow u \text{ in } L^p(\Omega)$$
and
$$D^\alpha u_m \longrightarrow D^\alpha u \text{ in } L^p(\Omega')$$
for every $\Omega' \subset\subset \Omega$.
$$\frac{\partial v_m}{\partial x_i} \longrightarrow \frac{\partial u}{\partial x_i} \text{ in } L^p(\Omega')$$
and set
$$u_m \longrightarrow \bar u = u \text{ a.e. on } \Omega.$$
Then
$$\begin{aligned}
\int \bar u_m(x)\,\frac{\partial\varphi}{\partial x_i}\,dx &= \int \big(\varphi_m(x) * \bar u(x)\xi_m(x)\big)\,\frac{\partial\varphi(x)}{\partial x_i}\,dx\\
&= \int_{\mathbb{R}^n} \left[\int \bar u(x-y)\,\xi_m(x-y)\,\varphi_m(y)\,dy\right]\frac{\partial\varphi(x)}{\partial x_i}\,dx\\
&= \int_{\mathbb{R}^n} \frac{\partial\varphi(x)}{\partial x_i}\,dx \int \bar u(x-y)\,\xi_m(x-y)\,\varphi_m(y)\,dy\\
&= -\int_{\mathbb{R}^n} \varphi(x) \int \frac{\partial}{\partial x_i}\big(\bar u(x-y)\,\xi_m(x-y)\big)\,\varphi_m(y)\,dy\,dx\\
&= -\int_{\mathbb{R}^n} \varphi(x) \int \frac{\partial \bar u(x-y)}{\partial x_i}\,\xi_m(x-y)\,\varphi_m(y)\,dy\,dx\\
&= -\int_{\mathbb{R}^n} \left(\frac{\partial \bar u}{\partial x_i}\right)_m \varphi(x)\,dx.
\end{aligned}$$
That is,
$$\left\|\left(\frac{\partial \bar u}{\partial x_i}\right)_m - \frac{\partial u}{\partial x_i}\right\|_{L^p(\Omega')} \longrightarrow 0$$
in $L^p(\Omega')$ for every $\Omega' \subset\subset \Omega$.
The previous result enables us to construct a sequence in $C_c^\infty(\mathbb{R}^n)$ that converges to $u \in W^{k,p}(\Omega)$ in $W_{loc}^{k,p}(\Omega)$. One may wonder: when can we get the convergence in $W^{k,p}(\Omega)$? We will either let $\Omega = \mathbb{R}^n$, so that no extension is needed to pass across the boundary, or we need to impose extra conditions on the boundary $\partial\Omega$ to guarantee nice behavior of the weak derivatives. The next result deals with the first option, i.e., extending the domain to the whole space.
Proposition 3.8.5 Let $u \in W^{k,p}(\mathbb{R}^n)$ for some $1 \le p \le \infty$. Then there exists $u_m \in C_c^\infty(\mathbb{R}^n)$ such that $u_m \longrightarrow u$ in $W^{k,p}(\mathbb{R}^n)$.

Proof Use the cutoff defined by
$$\xi_m = \begin{cases} 1 & \|x\| \le m\\ 0 & \|x\| > m. \end{cases}$$
Hence, $u_m \longrightarrow u$ in $W^{k,p}(\mathbb{R}^n)$.
but may require a boundary with a "nice" structure. The word "nice" shall be formulated mathematically in the following definition.

Definition 3.8.6 Let $\Omega \subset \mathbb{R}^n$ be bounded and connected.
(1) The set $\Omega$ is said to be Lipschitz (denoted by Lip) if for every $x \in \partial\Omega$ there exists a neighborhood $N(x)$ such that
$$\Gamma(x) = N(x) \cap \partial\Omega$$
is the graph of a Lipschitz function.
(2) The set $\Omega$ is said to be of class $C^k$ if for every $x \in \partial\Omega$ there exists a neighborhood $N(x)$ such that
$$\Gamma(x) = N(x) \cap \partial\Omega$$
is the graph of a $C^k$ function.

In words, a Lipschitz domain means its boundary locally coincides with the graph of a Lipschitz function. A $C^k$-class domain means its boundary locally coincides with a $C^k$-surface. A bounded Lip domain has the extension property for all $k$. Roughly speaking, a bounded domain is Lip if its boundary behaves like a Lipschitz function. So every convex domain is Lip, and all smooth domains are Lip. On the other hand, a polyhedron in $\mathbb{R}^3$ is an example of a Lip domain that is not smooth.
Remark We need to note the following. Write
$$\varphi = (\varphi_1, \varphi_2, \dots, \varphi_n), \qquad \psi = (\psi_1, \psi_2, \dots, \psi_n),$$
and let $J = D\varphi(x)$ be the $n \times n$ matrix with entries $\frac{\partial \varphi_j}{\partial x_i}$, $1 \le i,j \le n$. The determinant of $J$ is known as the Jacobian determinant of $\varphi$, and is denoted by
$$|J(\varphi)| = \det(D\varphi(x)).$$
Since $\psi = \varphi^{-1}$,
$$(\psi \circ \varphi)(x) = \mathrm{Id}(x) = x.$$
Taking the derivatives of both sides of the equation, then taking the determinant of each side, and using the fact that $\det(AB) = \det(A)\cdot\det(B)$, gives
$$1 = \det(D\varphi(x))\,\det(D\psi(y));$$
in other words,
$$\frac{\partial(y_1,\dots,y_n)}{\partial(x_1,\dots,x_n)} = \frac{1}{\dfrac{\partial(x_1,\dots,x_n)}{\partial(y_1,\dots,y_n)}}.$$
for some $M > 0$, so the Jacobian determinant doesn't vanish for $C^k$-diffeomorphisms. This coordinate system helps us change variables when performing multiple integrals. Namely, let $f \in L^1(V)$; then substituting $y = \varphi(x)$ gives
$$\int_V f\,dy = \int_U (f \circ \varphi)\,|J(\varphi)|\,dx.$$
Similarly, if $f \in L^1(U)$, then
$$\int_U f\,dx = \int_V (f \circ \psi)\,|J(\psi)|\,dy.$$
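A quick numerical sanity check of the substitution formula, using the familiar polar-coordinate map as an assumed example (there $|J(\varphi)| = r$), with $f(y) = |y|^2$ on the unit disk, whose exact integral is $\pi/2$:

```python
import numpy as np

# Change of variables: int_V f dy = int_U (f o phi) |J(phi)| dx, with
# phi(r, theta) = (r cos theta, r sin theta), U = (0,1) x (0, 2 pi),
# V = the unit disk, and Jacobian determinant |J(phi)| = r.
f = lambda y1, y2: y1**2 + y2**2

r = np.linspace(0, 1, 801)
t = np.linspace(0, 2*np.pi, 801)
R, T = np.meshgrid(r, t, indexing='ij')
dr, dt = r[1] - r[0], t[1] - t[0]

integrand = f(R*np.cos(T), R*np.sin(T)) * R     # (f o phi) * |J(phi)|
lhs = integrand.sum() * dr * dt                 # Riemann sum over U

rhs = np.pi / 2                                 # exact integral over the disk
print(lhs, rhs)                                 # agree to ~1e-2
```

The Riemann sum over the rectangle $U$, weighted by the Jacobian, reproduces the integral over the disk $V$, as the formula asserts.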
$$\varphi^{-1} = \psi \in C^k(\bar V).$$
This strong version of the diffeomorphism guarantees that the mappings $\varphi$ and $\psi$, together with all their derivatives up to $k$th order, are bounded, since they are continuously defined on closed sets; i.e.,
$$\max_{1\le i,j\le n}\left\{\left\|\frac{\partial \varphi_j}{\partial x_i}\right\|_\infty,\ \left\|\frac{\partial \psi_j}{\partial y_i}\right\|_\infty\right\} < \infty.$$
such that $\Omega, \Omega^*$ are open sets in $\mathbb{R}^n$, and $\bar U \subset \Omega$ and $\bar V \subset \Omega^*$. This guarantees that all first derivatives of $\varphi, \psi$ are bounded on $\Omega$ and $\Omega^*$, respectively.
Another advantage of the definition is that it defines $\varphi$ on a compact set $\bar U$, which allows us to define new Sobolev spaces on compact manifolds in $\mathbb{R}^n$. Indeed, $\partial\Omega$ can be covered by a finite number of open sets. In particular, each point $x$ in the set $\partial\Omega$ is contained in some neighborhood $N(x)$ that can be represented by the graph of $\varphi$, and so $\partial\Omega$ is covered by a finite number of these neighborhoods, say $\{N_i\}$. In other words, $\partial\Omega$ is covered by a finite number of subgraphs of mappings
$$\varphi_i \in C^k(N_i),$$
and thus a system of local coordinates is constructed via the mappings $\{\psi_i\}$ for $\Omega$.
The following result, which is helpful in proving the next theorem, provides a sufficient condition for a composition with a diffeomorphism to be a Sobolev function.
Lemma 3.8.9 (Change of Coordinates) Let $U, V$ be open bounded sets in $\mathbb{R}^n$, let $u \in W^{1,p}(U)$, let $\varphi: U \to V$ be a $C^1$ strong diffeomorphism, and let
$$\varphi^{-1} = \psi = (\psi_1, \dots, \psi_n).$$
Then $v = u \circ \psi \in W^{1,p}(V)$ and
$$\frac{\partial v}{\partial y_i} = \sum_{k=1}^n \frac{\partial u(\psi)}{\partial x_k}\cdot\frac{\partial \psi_k}{\partial y_i}.$$
Moreover,
$$\|v\|_{W^{k,p}(V)} \le C\,\|u\|_{W^{k,p}(U)}.$$
Also, note that $\frac{\partial u}{\partial x_i} \in L^p(U)$ and $\nabla\psi$ is continuous; consequently $\frac{\partial v}{\partial y_i}$ exists. The next step is to evaluate $\frac{\partial v}{\partial y_j}$, then to show that $\frac{\partial v}{\partial y_j} \in L^p(V)$, which implies that $v \in W^{1,p}(V)$. By the Meyers–Serrin Theorem, there exists a sequence
$$u_m \in W^{1,p}(U) \cap C^\infty(U)$$
such that
$$\frac{\partial u_m}{\partial x_i} \longrightarrow \frac{\partial u}{\partial x_i} \text{ in } L^p(U).$$
Define
$$v_m = u_m \circ \psi.$$
Since $\frac{\partial u_m}{\partial x_i} \to \frac{\partial u}{\partial x_i}$ in $L^p(U)$, this gives
$$\frac{\partial v_m}{\partial y_i} \longrightarrow \sum_{k=1}^n \frac{\partial u(\psi)}{\partial x_k}\cdot\frac{\partial \psi_k}{\partial y_i} = w \in L^p(V), \tag{3.8.3}$$
where $|J(\psi)|^p < M$. This implies that $v_m \to v$ in $L^p(V)$ and $\frac{\partial v_m}{\partial y_i} \to w$ in $L^p(V)$; but since $v \in L^p(V)$ and $\frac{\partial v}{\partial y_i}$ exists, we conclude from (3.8.3) and the uniqueness of weak derivatives that
$$\frac{\partial v_m}{\partial y_i} \longrightarrow \frac{\partial v}{\partial y_i} = w.$$
$$\left\|\frac{\partial v}{\partial y_i}\right\|_{L^p(V)}^p = \int_V \left|\sum_{k=1}^n \frac{\partial u(\psi)}{\partial x_k}\cdot\frac{\partial \psi_k}{\partial y_i}\right|^p dy \le \sum_{k=1}^n \int_V \left|\frac{\partial u}{\partial x_k}\cdot\frac{\partial \psi_k}{\partial y_i}\right|^p dy.$$
Changing variables back to $U$,
$$\left\|\frac{\partial v}{\partial y_i}\right\|_{L^p(V)}^p \le MC_1\sum_{k=1}^n \int_U \left|\frac{\partial u(x)}{\partial x_k}\right|^p dx \le MC_1\sum_{k=1}^n \left\|\frac{\partial u}{\partial x_k}\right\|_{L^p(U)}^p < \infty.$$
This implies that $\frac{\partial v}{\partial y_j} \in L^p(V)$, and hence $v \in W^{1,p}(V)$. Letting $C^p = MC_1$, this gives
$$\left\|\frac{\partial v}{\partial y_i}\right\|_{L^p(V)} \le C\left(\sum_{k=1}^n \left\|\frac{\partial u}{\partial x_k}\right\|_{L^p(U)}^p\right)^{1/p},$$
and therefore
$$\|v\|_{W^{1,p}(V)} \le C\,\|u\|_{W^{1,p}(U)}.$$
For general $k$, one takes a sequence
$$u_m \in W^{k,p}(U) \cap C^\infty(U)$$
converging to $u \in W^{k,p}(U)$, applies the chain and Leibniz rules, and then takes limits. We leave the details to the reader.
$$H = \{x = (x_1,\dots,x_{n-1},0) \in \mathbb{R}^n\},$$
$$\Gamma(x_0) = N(x_0) \cap \partial\Omega$$
is "flat" and lies in the hyperplane $H = \{x_n = 0\}$. For a small $\delta > 0$, let
$$\bar u(x) = \begin{cases} u(x) & x \in B^+\\ 3u(x_1,\dots,x_{n-1},-x_n) - 2u(x_1,\dots,x_{n-1},-2x_n) & x \in B^-. \end{cases} \tag{3.8.4}$$
Letting
$$u^+ = \bar u|_{B^+}, \qquad u^- = \bar u|_{B^-},$$
we have
$$\lim_{x_n\to 0^+} u^+(x',x_n) = \lim_{x_n\to 0^-} u^-(x',x_n),$$
and for $1 \le i \le n-1$,
$$\frac{\partial u^-}{\partial x_i} = 3\frac{\partial u}{\partial x_i}(x',-x_n) - 2\frac{\partial u}{\partial x_i}(x',-2x_n),$$
hence
$$\lim_{x_n\to 0^-}\frac{\partial u^-}{\partial x_i} = \frac{\partial u}{\partial x_i}(x',0) = \lim_{x_n\to 0^+}\frac{\partial u^+}{\partial x_i}.$$
For $i = n$, we have
$$\frac{\partial u^-}{\partial x_n} = -3\frac{\partial u}{\partial x_n}(x',-x_n) + 4\frac{\partial u}{\partial x_n}(x',-2x_n),$$
so
$$\lim_{x_n\to 0^-}\frac{\partial u^-}{\partial x_n} = \frac{\partial u}{\partial x_n}(x',0) = \lim_{x_n\to 0^+}\frac{\partial u^+}{\partial x_n}.$$
Therefore,
$$\bar u \in C^1(B) \subset W^{1,p}(\Omega)$$
(Proposition 3.6.4(5)), and by a simple calculation one can easily see that
$$\|\bar u\|_{W^{1,p}(B)} \le C\,\|u\|_{W^{1,p}(B^+)},$$
where $C$ is a constant that doesn't depend on $u$. Now, suppose that $u \notin C^1(\bar\Omega)$ and $\partial\Omega$ is not flat near $x_0$. We flatten out the boundary
$$\Gamma = \partial\Omega \cap N$$
through a change in the coordinate system that maps it to a subset of the hyperplane $x_n = 0$. We will use a $C^1$ strong diffeomorphism
$$\varphi: N \longrightarrow B, \quad \varphi \in C^1(\bar N),$$
and
$$\psi = \varphi^{-1} \in C^1(\bar B),$$
$$\varphi(U) = V = B^+,$$
where $U = N \cap \Omega$, and
$$\psi(V) = U.$$
Under this new coordinate system, consider the restriction of $u$ to $U$, and define
$$v = u \circ \psi,$$
which makes $V = B^+$ the domain of $v$, with $u(U) = v(V)$. By Lemma 3.8.9 and the same procedure as above, it can be shown that
$$v \in C^1(B^+) \cap W^{1,p}(B^+),$$
and the corresponding norm estimate holds. Similar to (3.8.2), we extend $v$ from $B^+$ to $B$ through the even reflection $\bar v$. Again, it can be shown that $\bar v \in W^{1,p}(B)$ with the corresponding bound. Then we pull the function back to the original system by composing $\bar v$ with $\varphi$ to produce a new extension
$$\bar u = \bar v(\varphi(x)).$$
Note that we have two issues with the treatment above. First, the extensions are still not of compact support, so they are not extended to the whole space. Second, the construction can be implemented only locally, because the coordinate system provides a local representation. So we need to make use of the powerful tool of partition of unity to globalize our results and compactly support the extensions, so that we can extend the functions by zero to $\mathbb{R}^n$. Since $\partial\Omega$ is compact, there exists a finite cover $\{N_i\}$ of $\partial\Omega$ with $\bigcup_{i=1}^m N_i = N$, and let $N_0 \subset\subset \Omega$ be such that
$$\Omega \subseteq \bigcup_{i=0}^m N_i = N \cup \Omega = A \subseteq \Omega^*.$$
On each $U_i = N_i \cap \Omega$ the restriction gives
$$u_i \in W^{1,p}(U_i),$$
which extends by the local construction above to
$$\tilde u_i \in W^{1,p}(N_i).$$
Set
$$\bar u = \sum_{i=0}^m \tilde u_i$$
for $x \in \bigcup_{i=0}^m N_i$. Then $\bar u \in W^{1,p}(A)$, with $\bar u = u$ on $\Omega$ and
$$\operatorname{supp}(\bar u) \subseteq A \subseteq \Omega^*.$$
Define the extension
$$Eu = \begin{cases} \bar u & x \in A\\ 0 & x \in \mathbb{R}^n \setminus A, \end{cases}$$
where
$$M = \max\{C_i,\ i = 0,\dots,m\}, \qquad K = \sum_{i=0}^m \|\xi_i\|_{W^{1,\infty}(N_i)}, \qquad C = KM(m+1).$$
i=0
(1) The result holds for all k ∈ N, and in this case the procedure of the proof becomes
harder since the diffeomorphism will be C k instead of C 1 , which requires a
more complicated even reflection. For example, u − of (3.8.4) will be of the form
k xn
ci u(x , − ) for some coefficients ci such that
i=0 i +1
198 3 Theory of Sobolev Spaces
k+1 j
−1
ci = 1, j = 0, 1, . . . , k
i=1
i
where
x0 = (x0 , x0n ), y0 = (y0 , y0n ).
$$\|\bar u\|_{W^{1,p}(V)} = \|u\|_{W^{1,p}(U)},$$
with the even reflection
$$\bar u(x) = \begin{cases} u(x) & x \in B^+\\ u(x_1,\dots,x_{n-1},-x_n) & x \in B^-. \end{cases}$$
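One can check numerically that the higher-order reflection (3.8.4) really glues to a $C^1$ function across the hyperplane, unlike the plain even reflection, which preserves continuity but flips the normal slope. A one-variable sketch with a sample function of our choosing ($u = e^x$):

```python
import numpy as np

# Reflection (3.8.4): ubar(x) = u(x) for x >= 0 and
# 3u(-x) - 2u(-2x) for x < 0; it matches u to first order at 0.
u = np.exp                       # sample C^1 function on the half-line

def ubar(x):
    x = np.asarray(x, float)
    return np.where(x >= 0, u(x), 3*u(-x) - 2*u(-2*x))

h = 1e-6
# continuity and matching of one-sided first derivatives at 0
jump_val = ubar(h) - ubar(-h)
d_plus  = (ubar(2*h) - ubar(h)) / h
d_minus = (ubar(-h) - ubar(-2*h)) / h
print(jump_val, d_plus - d_minus)   # both ~ 0: the glued graph is C^1
```

The coefficients $3$ and $-2$ solve exactly the moment conditions stated in the remark above for $k = 1$: they sum to $1$ (continuity) and give $-3 + 4 = 1$ for the normal derivative.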
Now, with the help of the Extension Theorem, we can strengthen the result of Proposition 3.8.4.

Proposition 3.8.11 Let $\Omega$ be open and bounded in $\mathbb{R}^n$ and of class $C^k$, and let $u \in W^{k,p}(\Omega)$ for some $1 \le p \le \infty$. Then there exists $u_m \in C_c^\infty(\mathbb{R}^n)$ such that $u_m \longrightarrow u$ in $W^{k,p}(\Omega)$.

Proof Let $\partial\Omega$ be bounded. Then by the Extension Theorem, there exists an extension operator
$$E: W^{k,p}(\Omega) \longrightarrow W^{k,p}(\mathbb{R}^n).$$
The sequence
$$u_m = \xi_m(Eu)$$
is then the desired sequence. The case when $\partial\Omega$ is unbounded is left to the reader as an exercise (see Problem 3.11.31).
One of the advantages of imposing a nice structure on the boundary of the domain is that it allows us to construct our approximating functions on $\bar\Omega$ rather than on $\Omega$; i.e., our approximating functions will belong to $C^\infty(\bar\Omega)$. This provides a global approximation up to the boundary. Functions in $C^\infty(\bar\Omega)$ are functions that are smooth up to the boundary.
We next prove another variant of the Meyers–Serrin Theorem, which establishes a global approximation by smooth functions up to the boundary. The idea is to extend functions in $W^{k,p}(\Omega)$ to functions in $W^{k,p}(\mathbb{R}^n)$ in order to apply Proposition 3.7.5.

Theorem 3.8.12 Let $\Omega$ be open and bounded in $\mathbb{R}^n$ and of class $C^k$. Then $C^k(\bar\Omega)$ is dense in $W^{k,p}(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$ for all $1 \le p < \infty$.

Proof Let $u \in W^{k,p}(\Omega)$, $1 \le p < \infty$. We want to show that there exists a sequence
$$u_j \in W^{k,p}(\Omega) \cap C^\infty(\bar\Omega)$$
converging to $u$. By Proposition 3.7.5 applied to the extension $E(u) \in W^{k,p}(\mathbb{R}^n)$, there exists a sequence $u_j \in C_c^\infty(\mathbb{R}^n)$ with
$$(u_j)|_{\bar\Omega} \in C^\infty(\bar\Omega),$$
and $u_j$ converges to $E(u)|_\Omega = u$. This proves that $C^\infty(\bar\Omega)$ is dense in $W^{k,p}(\Omega)$. Now the result follows from (3.8.10).
3.9 Sobolev Inequalities

This section establishes some inequalities that play an important role in embedding theorems and other results related to elliptic theory and partial differential equations. There are many such inequalities, but we will discuss some of the important ones that will be used later and that may provide the foundations for other inequalities. In particular, we will study estimate inequalities in Sobolev or Hölder spaces of the following forms:
(1) $\|u\|_L \le C\,\|Du\|_L$.
(2) $\|u\|_L \le C\,\|u\|_W$.
(3) $\|u\|_C \le C\,\|u\|_W$.
Here, $L$ refers to an arbitrary Lebesgue space $L^p$, $W$ refers to a Sobolev space, and $C$ refers to a Hölder continuous space. A main requirement for all these inequalities is that the constant $C$ of the estimate be kept independent of the function $u$; otherwise the estimate loses its power and efficiency in applications, in producing further inequalities, and in other embedding results. Another concern is the conjugate we need for a number $p$. Recall from measure theory that the conjugate of a number $p$ is the number $q$ such that $p^{-1} + q^{-1} = 1$. This parameter was required to guarantee the validity of Hölder's inequality, which is a fundamental inequality in the theory of Lebesgue spaces, and this is the reason why it is sometimes known as the "Hölder conjugate". Likewise, the conjugate needed for the number $p$ to obtain Sobolev inequalities shall be called the Sobolev conjugate, and will be denoted by $p^*$. Inequality (1) above is fundamental in this subject and plays the same role as Hölder's inequality in Lebesgue spaces; therefore, it is important to establish this inequality.
The most basic form of inequality (1) is the following: if $u \in C^1[a,b]$ then
$$\|u\|_{L^1[a,b]} \le C\,\|u'\|_\infty.$$
Indeed,
$$\int_a^b |u| \le \int_a^b |u'(x)|\,dx \cdot (b-a) \le (b-a)\max_{[a,b]}|u'(x)|;$$
i.e., the constant $C = b - a$ depends only on the domain $[a,b]$. How can we extend this estimate to more general Lebesgue spaces? What if the estimate is taken over $\mathbb{R}^n$ instead of $\mathbb{R}$? Let us assume that for some $p, q \ge 1$,
$$\|u\|_{L^q(\mathbb{R}^n)} \le C\,\|Du\|_{L^p(\mathbb{R}^n)},$$
where $\varphi$ is the function defined in (3.2.5). Then integrating the dilations of $\varphi$ over $\mathbb{R}^n$, using the change of variable $x = \lambda y$, shows that the two sides of the estimate scale by a factor $\lambda^\alpha$, where
$$\alpha = \frac{n}{p} - \frac{n}{q} - 1.$$
For the estimate to hold with a constant independent of the scaling we must have $\alpha = 0$; i.e., $q = \frac{np}{n-p}$. Thus, this value will be defined as the Sobolev conjugate (or Sobolev exponent) of $p$ for all $p \in [1, n)$.

Definition 3.9.1 (Sobolev Exponent) Let $p \in [1, n)$. Then the Sobolev exponent of $p$ is
$$p^* = \frac{np}{n-p}.$$
Remark (1) For the definition to make sense, we should have $1 \le p < n$. For $p = n$, we agree that $p^* = \infty$.
(2) The new conjugate takes into account the space $\mathbb{R}^n$, but it cannot be reduced to the classical Hölder conjugate $q^{-1} + p^{-1} = 1$, as is apparent from the definition of $p^*$; so it cannot be regarded as a generalization of the Hölder conjugate.
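A small helper makes the definition and the remark concrete (exact rational arithmetic; the function name is ours):

```python
from fractions import Fraction

# Sobolev exponent p* = np/(n-p) for p in [1, n); note p* > p always,
# and p* grows without bound as p approaches n.
def sobolev_conjugate(p, n):
    assert 1 <= p < n
    return Fraction(n*p, n - p)

print(sobolev_conjugate(1, 3))   # 3/2
print(sobolev_conjugate(2, 3))   # 6
print(sobolev_conjugate(2, 4))   # 4
# Unlike the Holder conjugate (1/p + 1/q = 1), here 1/p* = 1/p - 1/n.
```

For instance, with $n = 4$, $p = 2$ one checks $1/p^* = 1/2 - 1/4 = 1/4$, confirming the relation $\frac{1}{p^*} = \frac{1}{p} - \frac{1}{n}$ rather than the Hölder relation.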
Now we come to the next stage: proving the inequality, under the assumption that $q = p^*$. Before we prove the inequality, we recall some basic inequalities from the theory of Lebesgue spaces.
and
$$\left\|\prod_{i=1}^n u_i\right\|_1 \le \prod_{i=1}^n \|u_i\|_{p_i}.$$

Theorem 3.9.3 (Nested Inequality) Let $u \in L^q(\Omega)$ for some bounded measurable set $\Omega$ of measure $\mu(\Omega) = M$. If $1 \le p < q$, then $u \in L^p(\Omega)$ and
$$\|u\|_p \le C\,\|u\|_q,$$
where $C = M^{\frac{1}{p}-\frac{1}{q}} = C(p, q, \Omega)$.

Proof Note that $|u|^p \in L^{q/p}(\Omega)$. Let $v = |u|^p$; then $v \in L^{q/p}(\Omega)$. Let $r = \frac{q}{p}$, let $s$ be the conjugate of $r$, and use Hölder's inequality on $v \in L^r(\Omega)$ and $1 \in L^s(\Omega)$. This gives
$$\int_\Omega |u|^p = \int_\Omega |v| \le \left(\int_\Omega |u|^{pr}\right)^{1/r}\cdot(\mu(\Omega))^{1/s}.$$
Taking the power $\frac{1}{p}$ of both sides of the inequality, given that $pr = q$ and
$$\frac{1}{sp} = \frac{q-p}{qp},$$
the result follows.

So we write
$$\int_\Omega |u|^q = \int_\Omega |u|^{\theta q}\,|u|^{(1-\theta)q} \le \|u\|_p^{\theta q}\cdot\|u\|_r^{(1-\theta)q}.$$
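The nested inequality can be sanity-checked by Monte Carlo on $\Omega = (0, 2)$. Note that the inequality holds exactly for the empirical averages as well (it is Jensen's inequality for the empirical measure), so the assertion below is safe; the sample function is chosen arbitrarily.

```python
import numpy as np

# Nested inequality on a bounded set of measure M:
# ||u||_p <= M^(1/p - 1/q) ||u||_q for 1 <= p < q.
rng = np.random.default_rng(0)
a, b = 0.0, 2.0                  # Omega = (0, 2), so M = 2
x = rng.uniform(a, b, 100000)
u = np.abs(np.sin(7*x)) + x      # arbitrary bounded sample function

M = b - a
def Lp(u, p):
    # Monte Carlo approximation: ||u||_p^p = M * E|u|^p
    return (M * np.mean(np.abs(u)**p))**(1.0/p)

for p, q in [(1, 2), (2, 5), (3, 7)]:
    C = M**(1.0/p - 1.0/q)
    assert Lp(u, p) <= C * Lp(u, q) * (1 + 1e-9)
print("nested inequality verified on samples")
```

The factor $M^{1/p - 1/q}$ is exactly the constant in Theorem 3.9.3, and the check would fail on an unbounded domain, where no such containment $L^q \subset L^p$ holds.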
Now we come to our Sobolev inequalities. Our first inequality is fundamental, and is known as the Gagliardo–Nirenberg–Sobolev inequality; Gagliardo and Nirenberg proved the inequality for the case $p = 1$, and Sobolev for $1 < p < n$, in the space $\mathbb{R}^n$. The first three inequalities are known as "Gagliardo–Nirenberg–Sobolev inequalities", although the first of them (Theorem 3.9.5) is the best known.

Theorem 3.9.5 (Gagliardo–Nirenberg–Sobolev Inequality I) Let $1 \le p < n$ and $u \in C_c^1(\mathbb{R}^n)$. Then
$$\|u\|_{L^{p^*}(\mathbb{R}^n)} \le C\,\|Du\|_{L^p(\mathbb{R}^n)}.$$
Proof Since $u$ has compact support,
$$|u(x)| \le \int_{-\infty}^{x_i} |D_{x_i}u|\,dt_i, \qquad i = 1, \dots, n,$$
hence
$$|u(x)|^{\frac{n}{n-1}} \le \prod_{i=1}^n\left(\int_{-\infty}^\infty |D_{x_i}u|\,dt_i\right)^{\frac{1}{n-1}}.$$
Now we integrate with respect to $x_1$ over $\mathbb{R}$, then use the extended Hölder inequality:
$$\begin{aligned}
\int_{-\infty}^\infty |u(x)|^{\frac{n}{n-1}}\,dx_1 &\le \left(\int_{-\infty}^\infty |D_{x_1}u|\,dt_1\right)^{\frac{1}{n-1}}\cdot\int_{-\infty}^\infty \prod_{i=2}^n\left(\int_{-\infty}^\infty |D_{x_i}u|\,dt_i\right)^{\frac{1}{n-1}} dx_1\\
&\le \left(\int_{-\infty}^\infty |D_{x_1}u|\,dt_1\right)^{\frac{1}{n-1}}\prod_{i=2}^n\left(\int_{-\infty}^\infty\int_{-\infty}^\infty |D_{x_i}u|\,dx_1\,dt_i\right)^{\frac{1}{n-1}}.
\end{aligned}$$
Continuing in this manner over $x_2, \dots, x_n$ gives
$$\int_{\mathbb{R}^n} |u(x)|^{\frac{n}{n-1}}\,dx \le \prod_{i=1}^n\left(\int_{-\infty}^\infty\!\!\cdots\!\int_{-\infty}^\infty |D_{x_i}u|\,dx_1\cdots dt_i\cdots dx_n\right)^{\frac{1}{n-1}} \le \left(\int_{\mathbb{R}^n} |Du|\,dx\right)^{\frac{n}{n-1}}. \tag{3.9.4}$$
This establishes the inequality for $p = 1$.
For $1 < p < n$, apply (3.9.4) to $|u|^\alpha$, where $\alpha > 1$ is to be determined later. Then
$$\left(\int_{\mathbb{R}^n} |u(x)|^{\frac{\alpha n}{n-1}}\,dx\right)^{\frac{n-1}{n}} \le \int_{\mathbb{R}^n} \big|D|u|^\alpha\big|\,dx = \alpha\int_{\mathbb{R}^n} |u|^{\alpha-1}|Du|\,dx.$$
We apply Hölder's inequality to the last term, taking into account that $|Du| \in L^p$ and $|u|^{\alpha-1} \in L^q$, where $q$ is the Hölder conjugate of $p$, so that $p^{-1} + q^{-1} = 1$; consequently
$$q = \frac{p}{p-1}.$$
This gives
$$\left(\int_{\mathbb{R}^n} |u(x)|^{\frac{\alpha n}{n-1}}\,dx\right)^{\frac{n-1}{n}} \le \alpha\left(\int_{\mathbb{R}^n} |u|^{\frac{(\alpha-1)p}{p-1}}\,dx\right)^{\frac{p-1}{p}}\left(\int_{\mathbb{R}^n} |Du|^p\,dx\right)^{1/p}. \tag{3.9.5}$$
Now we choose $\alpha$ such that the powers of $u$ on both sides of the above inequality are equal to $p^*$; i.e.,
$$\frac{\alpha n}{n-1} = \frac{p(\alpha-1)}{p-1} = \frac{np}{n-p} = p^*.$$
This gives
$$\alpha = \frac{p(n-1)}{n-p}.$$
Substituting into (3.9.5), and dividing both sides of the inequality by the first factor of the RHS, noting that
$$\frac{n-1}{n} - \frac{p-1}{p} = \frac{1}{p^*},$$
we thus obtain
$$\left(\int_{\mathbb{R}^n} |u(x)|^{p^*}\,dx\right)^{1/p^*} \le \frac{p(n-1)}{n-p}\left(\int_{\mathbb{R}^n} |Du|^p\,dx\right)^{1/p},$$
or
$$\|u\|_{L^{p^*}(\mathbb{R}^n)} \le C\,\|Du\|_{L^p(\mathbb{R}^n)}.$$
Remark Note that from the proof of the case $1 < p < n$, we cannot allow the case $p \ge n$, since the choice of $\alpha$ would be invalid otherwise.
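The scaling argument that produced $p^*$ can be revisited numerically: with $n = 2$, $p = 1$, $p^* = 2$, both sides of the GNS inequality scale identically under the dilation $u_\lambda(x) = u(\lambda x)$, so the ratio $\|u\|_{p^*}/\|Du\|_p$ is dilation-invariant. A grid-based sketch with a Gaussian (our choice of test function):

```python
import numpy as np

# GNS scaling check, n = 2, p = 1, p* = np/(n-p) = 2: the ratio
# ||u||_{L^2} / ||Du||_{L^1} is invariant under u_lam(x) = u(lam x).
def ratio(lam, L=6.0, N=1201):
    s = np.linspace(-L, L, N)
    h = s[1] - s[0]
    X, Y = np.meshgrid(s, s, indexing='ij')
    u = np.exp(-(lam*X)**2 - (lam*Y)**2)
    ux, uy = np.gradient(u, h, h)
    l2  = np.sqrt(np.sum(u**2) * h*h)              # ||u||_{L^2}
    du1 = np.sum(np.sqrt(ux**2 + uy**2)) * h*h     # ||Du||_{L^1}
    return l2 / du1

r1, r2 = ratio(1.0), ratio(2.0)
print(r1, r2)   # nearly equal: both sides scale the same way
```

If one instead compared $\|u\|_{L^q}$ for $q \ne p^*$ against $\|Du\|_{L^p}$, the ratio would drift with $\lambda$, which is exactly why no such inequality can hold with a universal constant.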
$$\|u\|_{L^p} \le \|u\|_{W^{k,p}},$$
hence
$$\|Du\|_{L^p} \le \|u\|_{W^{1,p}}.$$
The next result extends the corollary to include not only $L^{p^*}(\mathbb{R}^n)$, but every $L^q(\mathbb{R}^n)$ for $q \in [p, p^*]$.

Theorem 3.9.7 Let $1 \le p < n$ and $u \in C_c^1(\mathbb{R}^n)$. Then
$$\|u\|_{L^q(\mathbb{R}^n)} \le C\,\|u\|_{W^{1,p}(\mathbb{R}^n)}$$
for all $q \in [p, p^*]$.
Note that
$$1 = \frac{1}{p'} + \frac{1}{q'},$$
where
$$p' = \frac{1}{\theta} > 1, \qquad q' = \frac{1}{1-\theta} > 1.$$
The next step is to generalize the above results in two ways. In particular, we will establish the inequalities for any Sobolev function in $W^{1,1}$ rather than $C_c^1$, and on any bounded open set rather than the whole space. Of course, we need the Meyers–Serrin Theorem for the former idea, and the extension operator for the latter. The second inequality (Theorem 3.9.7) shall also be generalized to hold for any Sobolev function in $W^{1,1}$ and on any bounded open set. The first inequality is the famous Poincaré inequality, which is one of the most useful and important inequalities in the theory of PDEs. Note that $q$ here is just a parameter that doesn't play the role of a Hölder conjugate.
By the GNS inequality,
$$\|u_j\|_{L^{p^*}(\Omega)} = \|u_j\|_{L^{p^*}(\mathbb{R}^n)} \le C\,\|Du_j\|_{L^p(\mathbb{R}^n)} \le C\,\|Du_j\|_{L^p(\Omega)}, \tag{3.9.6}$$
and
$$\|u_j - u_i\|_{L^{p^*}(\mathbb{R}^n)} \le C\,\|Du_j - Du_i\|_{L^p(\mathbb{R}^n)} \longrightarrow 0.$$
Hence $\{u_j\}$ is Cauchy in $L^{p^*}(\mathbb{R}^n)$, which is Banach, so $u_j \longrightarrow v \in L^{p^*}(\mathbb{R}^n)$, hence $v = u$. Take the limit of both sides of (3.9.6) using Fatou's Lemma (Theorem 1.1.5). Since $q \le p^*$, the result now follows from the Nested Inequality (Theorem 3.9.3), and the second inequality follows from the fact that $\|Du\|_{L^p} \le \|u\|_{W^{1,p}}$.
Remark The particular case $q = p$ in the first estimate is the classical Poincaré inequality. In this case, the inequality holds for $1 \le p < \infty$, since we always have $p < p^*$ for all $p, n$.
We infer from the above inequality that if we measure the size of a function in $W_0^{1,p}$ by the $p$-norm, then its size is bounded above by the size of its weak derivative.
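This is easy to test in one dimension: for $u_k(x) = \sin(k\pi x) \in W_0^{1,2}((0,1))$, the ratio $\|u\|_2/\|u'\|_2$ equals $1/(k\pi)$, comfortably below a Poincaré constant of order $|\Omega| = 1$ (a numerical illustration of ours, not the sharp constant):

```python
import numpy as np

# Classical Poincare inequality on Omega = (0,1) for boundary-vanishing
# functions: ||u||_{L^2} <= C ||u'||_{L^2}. For u = sin(k pi x) the
# ratio is exactly 1/(k pi), so higher oscillation means smaller size
# relative to the derivative.
x = np.linspace(0, 1, 100001)
h = x[1] - x[0]
for k in (1, 2, 5):
    u  = np.sin(k*np.pi*x)                 # u(0) = u(1) = 0
    du = k*np.pi*np.cos(k*np.pi*x)
    ratio = np.sqrt(np.sum(u**2)*h) / np.sqrt(np.sum(du**2)*h)
    print(k, ratio, 1/(k*np.pi))           # ratio matches 1/(k pi) < 1
```

The trend visible here (the ratio shrinking like $1/k$) is precisely the statement that the derivative norm controls the function norm on $W_0^{1,p}$.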
where $C = C(p, q, n, \Omega)$.

Proof We proceed as in the previous inequality. The details are left to the reader as an exercise.
All the above results hold for $p < n$. For the borderline case $p = n$, we see that $p^* = \infty$. So in fact we have the following:

Theorem 3.9.10 (Sobolev's Inequality) If $u \in W^{1,n}(\mathbb{R}^n)$, $n \ge 2$, then $u \in L^q(\mathbb{R}^n)$ for all $n \le q < \infty$ and
$$\|u\|_{L^q(\mathbb{R}^n)} \le C\,\|u\|_{W^{1,n}(\mathbb{R}^n)},$$
which implies
$$\|u\|_{L^{r_1}(\mathbb{R}^n)}^n \le n\,\|u\|_{L^n(\mathbb{R}^n)}^{n-1}\,\|Du\|_{L^n(\mathbb{R}^n)},$$
where $r_1 = \frac{n^2}{n-1}$. We apply Young's inequality with exponents $\frac{n}{n-1}$ and $n$. This gives
$$\|u\|_{L^{r_1}(\mathbb{R}^n)}^n \le n\left(\frac{n-1}{n}\|u\|_{L^n(\mathbb{R}^n)}^n + \frac{1}{n}\|Du\|_{L^n(\mathbb{R}^n)}^n\right) \le n\left[\|u\|_{L^n(\mathbb{R}^n)}^n + \|Du\|_{L^n(\mathbb{R}^n)}^n\right].$$
Now, take the power $\frac{1}{n}$ of both sides and make use of the equivalence of norms:
$$\|u\|_{L^{r_1}(\mathbb{R}^n)} \le n^{1/n}\left[\|u\|_{L^n(\mathbb{R}^n)}^n + \|Du\|_{L^n(\mathbb{R}^n)}^n\right]^{1/n} \le C\,\|u\|_{W^{1,n}(\mathbb{R}^n)}. \tag{3.9.7}$$
Now we have $1 < n < r_1$, so we apply the interpolation inequality for all $q \in [n, r_1]$ and make use of (3.9.7). We can repeat the same argument for $p = n$ and $\alpha = n + 1$; this will also give us the same estimate.
We will now discuss some inequalities that connect Sobolev spaces to Hölder spaces. Thus, we need to review some facts about Hölder spaces.

Definition 3.9.11 (Hölder-Continuous Function) A function $u: \Omega \to \mathbb{R}$ is called Hölder continuous with exponent $\beta \in (0,1]$ if there exists a constant $C > 0$ such that for all $x, y \in \Omega$,
$$|u(x) - u(y)| \le C\,|x - y|^\beta.$$
In general, we write
$$\|u\|_{k,\beta} = \sum_{|\alpha| \le k} \|D^\alpha u\|_\infty + \sum_{|\alpha| = k} [D^\alpha u]_{0,\beta}.$$
Here, it is important to note that $u \in C^{k,\beta}(\Omega)$ does not necessarily mean that $u$ itself is $\beta$-Hölder continuous; only its $k$th partial derivatives are $\beta$-Hölder continuous. It can be shown that the space $C^{k,\beta}(\Omega)$, endowed with the norm $\|u\|_{k,\beta}$, is a Banach space.
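A grid estimate of the seminorm $[u]_{0,\beta} = \sup_{x \ne y} |u(x)-u(y)|/|x-y|^\beta$ shows the standard example: $u(x) = \sqrt{x}$ on $[0,1]$ is Hölder continuous with $\beta = \tfrac12$ but not Lipschitz ($\beta = 1$). The sketch below is our own illustration:

```python
import numpy as np

# Holder seminorm [u]_{0,beta} = sup |u(x)-u(y)| / |x-y|^beta, estimated
# over all grid pairs for u(x) = sqrt(x) on [0,1].
x = np.linspace(0, 1, 1001)
X, Y = np.meshgrid(x, x, indexing='ij')
D = np.abs(X - Y)
mask = D > 0
U = np.abs(np.sqrt(X) - np.sqrt(Y))

half = np.max(U[mask] / D[mask]**0.5)   # beta = 1/2
lip  = np.max(U[mask] / D[mask])        # beta = 1
print(half, lip)
# half stays at 1 (the sup, attained against y = 0), while lip grows
# like 1/sqrt(h) as the grid is refined: no finite Lipschitz constant.
```

Algebraically, $|\sqrt{x}-\sqrt{y}| = |x-y|/(\sqrt{x}+\sqrt{y}) \le \sqrt{|x-y|}$, which is why the $\beta = \tfrac12$ quotient is capped at $1$ while the $\beta = 1$ quotient blows up near the origin.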
The reason for studying this type of space in this section is that for relatively high values of $p$ ($p > n$), Sobolev functions tend to embed into a Hölder space. This will be illustrated through what is known as Morrey's inequality. The next lemma is useful in proving the inequality.

Lemma 3.9.13 Let $u \in C_c^1(\mathbb{R}^n)$ and $p > n$. Then
(1) For all $r > 0$, we have, for some constant $C_1 = C(n) > 0$,
$$\frac{1}{|B_r(x)|}\int_{B_r(x)} |u(y) - u(x)|\,dy \le C_1\int_{B_r(x)} \frac{|Du(y)|}{|x-y|^{n-1}}\,dy.$$
(2) If $q$ is the Hölder conjugate of $p$ (i.e., $q = \frac{p}{p-1}$), then for some constant $C_2 = C(n,p)$ we have
$$\int_{B_r(x)} \frac{1}{|x-y|^{(n-1)q}}\,dy = C_2\cdot r^{(p-n)/(p-1)}.$$
Then
t t
|u(x + tv) − u(x)| = |Du(x + τ v)| · vdτ ≤ |Du(x + τ v)| dτ .
0 0
212 3 Theory of Sobolev Spaces
Integrating over the unit sphere $S$ gives
\[
\int_S |u(x + tv) - u(x)|\,dS \le \int_S \int_0^t |Du(x + \tau v)|\,d\tau\,dS
= \int_0^t \int_S \frac{|Du(x + \tau v)|}{\tau^{n-1}}\,\tau^{n-1}\,dS\,d\tau
= \int_0^t \int_S \frac{|Du(x + \tau v)|}{|x + \tau v - x|^{n-1}}\,\tau^{n-1}\,dS\,d\tau.
\]
Now, substitute $y = x + \tau v$, and note that the first integral is over the $(n-1)$-dimensional sphere $S_\tau^{n-1}(x)$ of radius $\tau \le t$. Converting to polar coordinates, given that
\[ |S_\tau^{n-1}(x)| = \tau^{n-1}\, |S_1^{n-1}(0)|, \]
where $|S|$ denotes the surface area, the above integral becomes
\[ \int_S |u(x + tv) - u(x)|\,dS \le \int_0^t \int_{S_\tau(x)} \frac{|Du(y)|}{|y - x|^{n-1}}\,dS\,d\tau. \]
Since $t \le r$, we have $B_t(x) \subseteq B_r(x)$. So we have
\[ \int_S |u(x + tv) - u(x)|\,dS \le \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}}\,dy. \]
Multiplying both sides by $t^{n-1}$, and then integrating both sides with respect to $t$ from $0$ to $r$, yields
\[ \int_0^r \int_S |u(x + tv) - u(x)|\,t^{n-1}\,dS\,dt \le \int_0^r t^{n-1} \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}}\,dy\,dt. \]
Again, the integration on the LHS is an integration over the ball $B_r(x)$, and $t^{n-1}\,dt$ can be integrated using the usual calculus rules. This gives
\[ \int_{B_r(x)} |u(y) - u(x)|\,dy \le \frac{r^n}{n} \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}}\,dy. \]
The constant $\frac{r^n}{n} = C(r, n)$ depends also on $r$. To eliminate $r$, we divide both sides of the inequality by the volume of the $n$-dimensional ball
\[ |B_r(x)| = \frac{\pi^{n/2}\, r^n}{\Gamma\!\left(\frac{n}{2} + 1\right)}, \]
where $\Gamma$ is the gamma function. This gives the inequality in (1) with
\[ C_1 = \frac{\Gamma\!\left(\frac{n}{2} + 1\right)}{n \pi^{n/2}} = C(n). \]
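The gamma-function formula for $|B_r|$ and the resulting constant $C_1$ can be sanity-checked against the familiar low-dimensional volumes. The following is an illustrative sketch (the function names are ours, not the book's):

```python
import math

def ball_volume(n, r):
    # |B_r| = pi^(n/2) r^n / Gamma(n/2 + 1)
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)

def morrey_C1(n):
    # C1 = Gamma(n/2 + 1) / (n pi^(n/2)), the constant in Lemma 3.9.13(1)
    return math.gamma(n / 2 + 1) / (n * math.pi ** (n / 2))

# n = 2: area pi r^2;  n = 3: volume (4/3) pi r^3
assert abs(ball_volume(2, 1.0) - math.pi) < 1e-12
assert abs(ball_volume(3, 2.0) - (4 / 3) * math.pi * 8) < 1e-10
# C1 is exactly the reciprocal of n * |B_1|, so |B_1| * n * C1 = 1
assert abs(ball_volume(3, 1.0) * 3 * morrey_C1(3) - 1.0) < 1e-12
```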
To prove (2), convert to polar coordinates by letting $\rho = |x - y|$, so that $dy = \rho^{n-1}\,d\rho$ (up to the surface-area factor). We obtain
\[
\int_{B_r(x)} \frac{1}{|y - x|^{(n-1)p/(p-1)}}\,dy = \int_0^r \rho^{(1-n)p/(p-1)}\, \rho^{n-1}\,d\rho = \int_0^r \rho^{(1-n)/(p-1)}\,d\rho = \frac{p-1}{p-n} \cdot r^{(p-n)/(p-1)}.
\]
Hence
\[ C_2 = \frac{p-1}{p-n} = C(n, p). \]
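The radial computation in (2) can be checked numerically; the sketch below (midpoint-rule discretization of our choosing, not from the text) compares the Riemann sum of $\rho^{(1-n)p/(p-1)}\rho^{n-1}$ with the closed form $\frac{p-1}{p-n}\, r^{(p-n)/(p-1)}$:

```python
def radial_integral(n, p, r, steps=200_000):
    # midpoint rule for the radial integral in Lemma 3.9.13(2); the combined
    # exponent simplifies to (1-n)/(p-1), an integrable singularity at 0
    expo = (1 - n) / (p - 1)
    h = r / steps
    return sum(((i + 0.5) * h) ** expo for i in range(steps)) * h

def closed_form(n, p, r):
    return (p - 1) / (p - n) * r ** ((p - n) / (p - 1))

n, p, r = 3, 5.0, 2.0
assert abs(radial_integral(n, p, r) - closed_form(n, p, r)) < 1e-2
```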
Theorem 3.9.14 (Morrey's Inequality) Let $p \in (n, \infty]$. If $u \in C_c^1(\mathbb{R}^n)$, then
\[ \|u\|_{C^{0,\beta}(\mathbb{R}^n)} \le C\,\|u\|_{W^{1,p}(\mathbb{R}^n)}, \]
where $\beta = 1 - \dfrac{n}{p}$.
Proof We will only prove the case $n < p < \infty$; the case $p = \infty$ is left to the reader as an exercise (see Problem 3.11.44). The inclusion $u \in C_c^1(\mathbb{R}^n) \subset W^{1,p}(\mathbb{R}^n)$ is clear, so we just need to prove the estimate. Let $x \in \mathbb{R}^n$, and let $|B_1(x)|$ denote the volume of the unit ball centered at $x$. Then we have
\[
|u(x)| \le \frac{1}{|B_1(x)|} \int_{B_1(x)} |u(x) - u(y)|\,dy + \frac{1}{|B_1(x)|} \int_{B_1(x)} |u(y)|\,dy
\le C_1 \int_{B_1(x)} \frac{|Du(y)|}{|x - y|^{n-1}}\,dy + \frac{1}{|B_1(x)|} \int_{B_1(x)} |u(y)|\,dy.
\]
Note that
\[ |x - y|^{1-n} \in L^q(B_1(x)), \]
where $q = p' = \frac{p}{p-1}$. Hence, we apply Hölder's inequality to each term to obtain
\[
|u(x)| \le C_1 \left( \int_{B_1} |Du(y)|^p\,dy \right)^{1/p} \left( \int_{B_1} |x - y|^{(1-n)p/(p-1)}\,dy \right)^{(p-1)/p} + C_1 \|u\|_{L^p(\mathbb{R}^n)}.
\]
By Lemma 3.9.13(2) (with $r = 1$), the second factor equals $C_2^{(p-1)/p}$, so
\[ |u(x)| \le C_3 \|Du\|_{L^p(\mathbb{R}^n)} + C_1 \|u\|_{L^p(\mathbb{R}^n)}, \]
where $C_3 = C_1 C_2^{(p-1)/p}$. Taking the supremum over all $x \in \mathbb{R}^n$ gives $\|u\|_\infty \le C \|u\|_{W^{1,p}(\mathbb{R}^n)}$.
Next, we estimate the Hölder seminorm. Let $x, y \in \mathbb{R}^n$ and set $r = |x - y|$. Define the open set $N = B_r(x) \cap B_r(y)$. Letting $z = \frac{x+y}{2}$, it is clear that
\[ |B_{r/2}(x)| = |B_{r/2}(z)| \le |N| < |B_r(x)| = |B_r(y)|. \]
We can write
\[ |u(x) - u(y)| \le |u(x) - u(z)| + |u(z) - u(y)|. \]
Then, averaging over $z \in N$,
\[
|u(x) - u(y)| \le \frac{1}{|N|} \left( \int_{B_r(x)} |u(x) - u(z)|\,dz + \int_{B_r(y)} |u(z) - u(y)|\,dz \right)
\]
\[
\le \frac{1}{|B_{r/2}(z)|} \left( \int_{B_r(x)} |u(x) - u(z)|\,dz + \int_{B_r(y)} |u(z) - u(y)|\,dz \right)
= \frac{2^n}{|B_r(x)|} \left( \int_{B_r(x)} |u(x) - u(z)|\,dz + \int_{B_r(y)} |u(z) - u(y)|\,dz \right)
\]
\[
= 2^{n+1}\, \frac{1}{|B_r(x)|} \int_{B_r(x)} |u(x) - u(z)|\,dz,
\]
the last step since the two terms admit the same bound. By Lemma 3.9.13(1) and then Hölder's inequality together with Lemma 3.9.13(2),
\[
|u(x) - u(y)| \le 2^{n+1} C_1 \int_{B_r(x)} \frac{|Du(z)|}{|x - z|^{n-1}}\,dz
\le 2^{n+1} C_1 \|Du\|_{L^p(\mathbb{R}^n)} \left( \int_{B_r(x)} |x - z|^{(1-n)p/(p-1)}\,dz \right)^{(p-1)/p}
= 2^{n+1} C_1 C_2^{(p-1)/p}\, \|Du\|_{L^p(\mathbb{R}^n)} \left( r^{(p-n)/(p-1)} \right)^{(p-1)/p}.
\]
So we obtain
\[ |u(x) - u(y)| \le C \|Du\|_{L^p(\mathbb{R}^n)}\, r^{\beta}, \tag{3.9.9} \]
where $C = 2^{n+1} C_3 = 2^{n+1} C_1 C_2^{(p-1)/p}$. Now, dividing both sides of (3.9.9) by $r^{\beta} = |x - y|^{\beta}$ gives
\[ \frac{|u(x) - u(y)|}{|x - y|^{\beta}} \le C \|Du\|_{L^p(\mathbb{R}^n)} < \infty. \]
where
\[ C = 2^{n+1}\, \frac{\Gamma\!\left(\frac{n}{2} + 1\right)}{n \pi^{n/2}} \left( \frac{p-1}{p-n} \right)^{(p-1)/p}. \]
Morrey’s inequality holds for Rn . We can, however, generalize it to hold for any
subset of Rn that satisfies the hypotheses of Theorem 3.8.10, thanks to the extension
operator.
for $p \in (n, \infty]$, and $\beta = 1 - \dfrac{n}{p}$.
Proof This follows from the Extension Theorem, the Meyers–Serrin Theorem, and Morrey's inequality. The details are left to the reader (see Problem 3.11.45).
Let $u \in W^{k,p}(\Omega)$. Then:
(1) If $k < \frac{n}{p}$, then $u \in L^q(\Omega)$, where $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$.
(2) If $k = \frac{n}{p}$, then $u \in L^q(\Omega)$ for all $1 \le q < \infty$.
(3) If $k > \frac{n}{p}$, then $u \in C^{k-m-1,\beta}(\bar{\Omega})$, where
\[ m = \left\lfloor \frac{n}{p} \right\rfloor, \qquad
\beta = \begin{cases} m + 1 - \dfrac{n}{p} & \dfrac{n}{p} \notin \mathbb{N} \\[2mm] \text{any } \theta \in (0, 1) & \dfrac{n}{p} \in \mathbb{N}. \end{cases} \]
Proof To prove (1), note that $D^\alpha u \in W^{1,p}(\Omega)$ for all $|\alpha| \le k - 1$, and so by GNS
\[ \|D^\alpha u\|_{L^{p^*}(\Omega)} \le C \|D^\alpha u\|_{W^{1,p}(\Omega)}, \]
where $\frac{1}{p^*} = \frac{1}{p} - \frac{1}{n}$. So $u \in W^{k-1,p^*}(\Omega)$ and
\[ \|u\|_{W^{k-1,p^*}(\Omega)} \le C_1 \|u\|_{W^{k,p}(\Omega)}. \]
We repeat the same argument for $u \in W^{k-1,p^*}(\Omega)$ so that we get $u \in W^{k-2,p^{**}}(\Omega)$ and
\[ \|u\|_{W^{k-2,p^{**}}(\Omega)} \le C_2 \|u\|_{W^{k-1,p^*}(\Omega)}, \]
where
\[ \frac{1}{p^{**}} = \frac{1}{p^*} - \frac{1}{n} = \frac{1}{p} - \frac{1}{n} - \frac{1}{n} = \frac{1}{p} - \frac{2}{n}. \]
Continue repeating this process until, after $k$ steps in total, we obtain $u \in W^{0,q}(\Omega) = L^q(\Omega)$, where
\[ \frac{1}{q} = \frac{1}{p} - \frac{k}{n}. \]
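The exponent bookkeeping $\frac{1}{p^*} = \frac{1}{p} - \frac{1}{n}$, iterated $k$ times, can be sketched as follows (an illustrative helper of our own, not from the text):

```python
def iterate_sobolev_exponent(p, n, k):
    # iterate 1/p_{j+1} = 1/p_j - 1/n, starting from p_0 = p, k times
    inv = 1.0 / p
    for _ in range(k):
        inv -= 1.0 / n
        if inv <= 0:
            raise ValueError("k >= n/p: the iteration leaves the L^q scale")
    return 1.0 / inv

# one step: p = 2, n = 3 gives the GNS conjugate p* = 6
assert abs(iterate_sobolev_exponent(2, 3, 1) - 6.0) < 1e-12
# k steps at once agree with 1/q = 1/p - k/n  (here p = 2, n = 5, k = 2 gives q = 10)
assert abs(iterate_sobolev_exponent(2, 5, 2) - 10.0) < 1e-9
```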
To prove (2), let $k = \frac{n}{p}$; then $u \in W^{k,p}(\Omega)$. For $p = \frac{n}{k}$, there is $\tilde{p} < p$ such that $k < \frac{n}{\tilde{p}}$, so $W^{k,p}(\Omega) \subset W^{k,\tilde{p}}(\Omega)$, and we apply (1) for this chosen $\tilde{p}$: given any $q$, we may choose $\tilde{p}$ so that the resulting exponent $\tilde{q}$ satisfies $\tilde{q} > q$ and $W^{k,\tilde{p}}(\Omega) \subset L^{\tilde{q}}(\Omega)$. The result follows from the combination of the above two inclusions in addition to the nested inequality (Theorem 3.9.3).
For (3), we only prove the first case. Let $u \in W^{k,p}(\Omega)$ where $k > \frac{n}{p}$ and $\frac{n}{p} \notin \mathbb{N}$. Let $m = \lfloor \frac{n}{p} \rfloor$. Then we clearly have $m < \frac{n}{p} < m + 1$. Applying the same argument as in case (1), we obtain $u \in W^{k-m,r}(\Omega)$ and
\[ \|u\|_{W^{k-m,r}(\Omega)} \le C \|u\|_{W^{k,p}(\Omega)}, \]
where $\frac{1}{r} = \frac{1}{p} - \frac{m}{n}$. But this implies that $D^\alpha u \in W^{1,r}(\Omega)$ for all $|\alpha| \le k - m - 1$. Moreover, note that $\frac{n}{p} < m + 1$, so we have $r > n$, and using Morrey's inequality, we conclude that
\[ D^\alpha u \in C^{0,\beta}(\bar{\Omega}), \]
where
\[ \beta = 1 - \frac{n}{r} = 1 - \frac{n}{p} + m. \]
Since $D^\alpha u \in C^{0,\beta}(\bar{\Omega})$ for all $|\alpha| \le k - m - 1$, we must have $u \in C^{k-m-1,\beta}(\bar{\Omega})$.
If $\frac{n}{p} \in \mathbb{N}$, then letting $m = \frac{n}{p} - 1$, by a similar argument we can show that $u \in W^{k-m,r}(\Omega)$, and then we proceed as above.
3.10 Embedding Theorems

This section studies the embedding of Sobolev spaces into other function spaces, which plays a critical role in the theory of PDEs and demonstrates the fact that Sobolev spaces are, in many cases, the perfect spaces to deal with when searching for solutions of PDEs due to their nice integrability properties. Here, we recall the definition again.
Definition 3.10.1 (Embedding) Let $X$ and $Y$ be two Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$, respectively, and let $\varphi : X \longrightarrow Y$ be a mapping. If $\varphi$ is an isometric injection, then $\varphi$ is said to be an "embedding", and this is written as $\varphi : X \hookrightarrow Y$ (and sometimes $X \subset\subset Y$).
If we consider the map $\imath : X \to Y$, $\imath(x) = x$, with $X \subset Y$, then $\imath$ is called the inclusion map. In general, the map $\imath$ is called an embedding in the sense that it embeds (or sticks) $X$ inside $Y$, and we can think of the elements of $X$ as if they are in $Y$, or say that $Y$ contains an isomorphic copy of $X$.
If this map is bounded, then we have more to say about this type of embedding. Recall that a linear operator is bounded if and only if it is continuous, and so the inclusion map $\imath : X \to Y$ is continuous if there exists a constant $C$ such that
\[ \|\imath(x)\|_Y = \|x\|_Y \le C \|x\|_X \]
for every $x \in X$. The equality above is due to the isometry of $\imath$. In other words, if $\|x\|_X < \infty$ (i.e., $x \in (X, \|\cdot\|_X)$), then $\|x\|_Y < \infty$ (i.e., $x \in (Y, \|\cdot\|_Y)$).
This embedding map is continuous, and in this case we say that $X$ is continuously embedded into $Y$. It is important to note that when we say that $X$ is embedded in $Y$ and $x \in X$, we don't necessarily mean that $x \in Y$, but rather that there is a representative element $y \in Y$ such that $x = y$ a.e.
In the previous section we established some important estimates connecting Sobolev spaces to other Banach spaces (Lebesgue or Hölder). This gives rise to inclusion and embedding results. In view of the preceding estimates, we have the following continuous embeddings.
Theorem 3.10.2 All the following inclusions are continuous:
(1) If $1 \le p < n$, then
\[ W^{1,p}(\mathbb{R}^n) \subset L^{p^*}(\mathbb{R}^n). \]
Moreover, if $p \le q \le p^*$, then
\[ W^{1,p}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n). \]
(2) If $\Omega$ is open, bounded, and $C^1$, and $1 \le p < n$, then
\[ W^{1,p}(\Omega) \subset L^q(\Omega) \]
for all $1 \le q \le p^*$.
(3) If $n < p \le \infty$, then
\[ W^{1,p}(\mathbb{R}^n) \subset L^\infty(\mathbb{R}^n). \]
(4) If $\Omega$ is open, bounded, and $C^1$, and $n < p \le \infty$, then
\[ W^{1,p}(\Omega) \subset L^\infty(\Omega), \]
and $W^{1,p}(\Omega) \subset C^{0,\beta}(\bar{\Omega})$, where $\beta = 1 - \frac{n}{p}$.
The theorem is an immediate conclusion of the estimates established in the previous section. Note that all the above inclusions are continuous; i.e., for all $1 \le p < n$ the space $W^{1,p}(\mathbb{R}^n)$ is continuously embedded in $L^{p^*}(\mathbb{R}^n)$ and in $L^q(\mathbb{R}^n)$ for all $q \in [p, p^*]$, and for all $n < p \le \infty$ it is continuously embedded in $C^{0,\beta}(\mathbb{R}^n)$, which in turn is embedded in $C_b(\mathbb{R}^n)$. The condition $n < p$ in (3) and (4) is sharp (see Problem 3.11.50).
One of the interesting properties of these continuous embeddings is that any
Cauchy sequence in X is Cauchy in Y, and any convergent sequence in X is con-
vergent in Y . A more interesting type of embedding is what is known as compact
embedding, where the inclusion operator is not only bounded, but also compact. Here
is the definition.
Definition 3.10.3 (Compact Embedding) Let $X$ and $Y$ be two Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$, respectively. Then an inclusion mapping that is a compact operator is called a compact embedding. This is denoted by $X \overset{c}{\hookrightarrow} Y$.
The compactness property means that for every bounded sequence $\{x_k\}$ in $X$, the sequence $\{\varphi(x_k)\}$ has a subsequence convergent in $Y$. So one simple argument to show that an embedding $X \hookrightarrow Y$ is not compact is to find an example of a bounded sequence in $X$ that has no subsequence converging in $Y$ (see Problem 3.11.57).
The next theorem, due to Rellich and Kondrachov, is a powerful tool in establishing
compactness property. Rellich proved the result in 1930 for the case p = q = 2, and
Kondrachov generalized it in 1945 to p, q ≥ 1.
An important example to which the theorem can be applied is the convolution approximating sequence $u_\epsilon = \varphi_\epsilon * u$.
Lemma 3.10.4 If $(u_m) \subset W^{1,p}(\mathbb{R}^n)$ is a bounded sequence with compact support $K$, then
\[ \lim_{\epsilon \to 0} \|(u_m)_\epsilon - u_m\|_{L^1(K)} = 0 \quad \text{uniformly in } m, \]
where $(u_m)_\epsilon = \varphi_\epsilon * u_m$.
Proof We have
\[
(u_m)_\epsilon(x) - u_m(x) = \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)\,(u_m(y) - u_m(x))\,dy
= \frac{1}{\epsilon^n} \int_{B_\epsilon(x)} \varphi\!\left( \frac{x - y}{\epsilon} \right) (u_m(y) - u_m(x))\,dy.
\]
Using the substitution $z = \frac{x - y}{\epsilon}$ in the above integral, and also using the fundamental theorem of calculus on $u_m$,
\[ (u_m)_\epsilon(x) - u_m(x) = -\epsilon \int_{B_1(0)} \varphi(z) \int_0^1 Du_m(x - \epsilon t z) \cdot z\,dt\,dz. \]
It follows that
\[ \int_K |(u_m)_\epsilon(x) - u_m(x)|\,dx \le \epsilon \int_{B_1(0)} \varphi(z) \int_0^1 \int_K |Du_m(x - \epsilon t z)|\,dx\,dt\,dz. \]
Thus
\[ |(u_m)_\epsilon(x)| \le \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)\,|u_m(y)|\,dy \le \|\varphi_\epsilon\|_\infty \|u_m\|_{L^1(K)} \le \frac{C}{\epsilon^n}, \tag{3.10.2} \]
\[ |D(u_m)_\epsilon(x)| \le \int_{\mathbb{R}^n} |D\varphi_\epsilon(x - y)|\,|u_m(y)|\,dy \le \|D\varphi_\epsilon\|_\infty \|u_m\|_{L^1(K)} \le \frac{C}{\epsilon^{n+1}}. \tag{3.10.3} \]
It follows that
\[ \|D(u_m)_\epsilon\|_\infty = \sup_x |D(u_m)_\epsilon(x)| \le \frac{C}{\epsilon^{n+1}}. \]
Consequently,
\[ |(u_m)_\epsilon(y) - (u_m)_\epsilon(x)| \le |y - x| \int_0^1 |D(u_m)_\epsilon(x + t(y - x))|\,dt \le \frac{C}{\epsilon^{n+1}}\,\delta = \varepsilon' \]
whenever $|y - x| < \delta$, for $\delta = \dfrac{\varepsilon'\, \epsilon^{n+1}}{C}$. Therefore, $\{(u_m)_\epsilon\}$ is equicontinuous in $C(K)$, which, in turn, is continuously embedded in $L^1(K)$.
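A one-dimensional numerical sketch of the lemma's mechanism — mollifying an indicator function and watching the $L^1$ discrepancy shrink with $\epsilon$. The grid sizes and the bump profile below are our own choices, not the text's:

```python
import math

def bump(t):
    # smooth bump supported on (-1, 1), the standard mollifier profile
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1 else 0.0

def mollify_L1_error(eps, dx=0.01, half=300):
    # f = indicator of [-1, 1] on a grid over [-3, 3]; f_eps = phi_eps * f
    xs = [i * dx for i in range(-half, half + 1)]
    f = [1.0 if abs(x) <= 1 else 0.0 for x in xs]
    # discrete normalization so that phi_eps has unit mass on the grid
    Z = sum(bump(x / eps) / eps for x in xs) * dx
    err = 0.0
    for i, x in enumerate(xs):
        conv = sum(bump((x - y) / eps) / eps * fy for y, fy in zip(xs, f)) * dx / Z
        err += abs(conv - f[i]) * dx
    return err

# the L^1 discrepancy shrinks as eps -> 0, as in Lemma 3.10.4
assert mollify_L1_error(0.2) < mollify_L1_error(0.4)
```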
The previous lemma will be used next to prove a fundamental compact embedding
result: Rellich–Kondrachov Theorem, which states that the inclusion in
Theorem 3.10.2(2) is not only continuous, but also compact.
Theorem 3.10.5 (Rellich–Kondrachov Theorem) Let $\Omega$ be open, $C^1$, and bounded in $\mathbb{R}^n$. If $1 \le p < n$, then
\[ W^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega) \]
for all $1 \le q < p^*$.
Proof Theorem 3.10.2(2) already established the fact that the inclusion is
continuous, so we only need to prove compactness. Consider the bounded sequence
u m ∈ W 1, p (), for 1 ≤ p < n. By Theorem 3.8.10, there exists an extension
Then by the interpolation inequality (Theorem 3.9.4), for any $q \in [1, p^*)$ we have
\[
\|u_{m_i} - u_{m_j}\|_{L^q(\Omega^*)} \le \|u_{m_i} - (u_{m_i})_\epsilon\|_{L^q(\Omega^*)} + \|(u_{m_i})_\epsilon - (u_{m_j})_\epsilon\|_{L^q(\Omega^*)} + \|(u_{m_j})_\epsilon - u_{m_j}\|_{L^q(\Omega^*)} < \frac{3}{k}.
\]
Note that since $k$ is fixed, we cannot yet conclude that $\{u_{m_i}\}$ is Cauchy in $L^q(\Omega)$. But we can repeat the same argument above for $k + 1, k + 2, \ldots$, obtaining for each choice and every $\epsilon > 0$ indices $i, j \ge N_{k+1} > N_k$; continuing the process as $i, j \longrightarrow \infty$ and passing to a diagonal subsequence yields the required Cauchy subsequence in $L^q(\Omega)$.
The significance of this result stems from the fact that for every bounded sequence
of functions in W 1, p we can always extract a convergent subsequence in some L q
space for some suitable q, which turns out to be extremely useful in applications to
PDEs. Note that we required the domain to be bounded, open, and $C^1$ in order to apply the extension operator. If $u \in W_0^{1,p}(\Omega)$, then by Proposition 3.8.3 we don't need this condition on $\Omega$.
Corollary 3.10.6 Let $\Omega$ be open and bounded in $\mathbb{R}^n$. If $1 \le p < n$, then
\[ W_0^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega) \]
for all $1 \le q < p^*$.
Proof Use Proposition 3.8.3 to obtain a zero extension, then proceed the same as in
the proof of Theorem 3.10.5.
and consequently
\[ W^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega) \]
for all $1 \le q < \infty$.
Proof (1) Let $u_m \in W^{1,n}(\Omega)$ be a bounded sequence. Since $\Omega$ is bounded, by the nested inequality $u_m$ is bounded in $W^{1,q}(\Omega)$ for all $1 \le q < n$, so we can apply the Rellich–Kondrachov Theorem. For $q \ge n$, choose $p < n$ such that $q < p^*$. Then $u_m$ is bounded in $W^{1,p}(\Omega)$, and so there is a subsequence with
\[ u_{m_j} \longrightarrow u \quad \text{in } L^q(\Omega). \]
Moreover, it is easy to show that bounded sets in $C^{0,\beta}(\bar{\Omega})$ are uniformly bounded and equicontinuous, so by the Arzelà–Ascoli Theorem,
\[ C^{0,\beta}(\bar{\Omega}) \overset{c}{\hookrightarrow} C(\bar{\Omega}) \overset{c}{\hookrightarrow} L^q(\Omega). \]
(2) For all $q$ such that $\dfrac{1}{q} > \dfrac{1}{p} - \dfrac{m}{n}$, and $k \ge m \ge 1$, we have
\[ W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-m,q}(\Omega). \]
\[ D^\alpha u_j \in W^{1,p}(\Omega), \]
\[ W^{k,p}(\Omega) \subseteq W^{k-m+1,p^*}(\Omega), \tag{3.10.4} \]
(1) If $k < \frac{n}{p}$, then
\[ W^{k,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega) \]
for all $q \ge 1$ such that $\dfrac{1}{q} > \dfrac{1}{p} - \dfrac{k}{n}$.
(2) If $k = \frac{n}{p}$, then
\[ W^{k,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega) \]
for $1 \le q < \infty$.
(3) If $k > \frac{n}{p}$, then
\[ W^{k,p}(\Omega) \overset{c}{\hookrightarrow} C^{0,\beta}(\bar{\Omega}) \]
for $0 < \beta < \gamma$, where $\gamma = \min\{1, k - \frac{n}{p}\}$.
p
1 1 j
where = − , 1 ≤ j ≤ k. After k iterations we obtain
pj p n
c
W 1, p () → W 0,q () = L q (),
1 1 k
where q < pk∗ and = − .
pk∗ p n
For (2), repeat the argument above k − 1 iterations.
For (3), we use Morrey’s inequality to show that every u ∈ W k, p () is Holder con-
tinuous. We leave the details for the reader.
\[ H^r(\mathbb{R}^n) \hookrightarrow H^t(\mathbb{R}^n). \]
We write
\[
\mathcal{F}^{-1}\{(1 + |w|^2)^{t/2}\,\hat{u}(w)\} = \mathcal{F}^{-1}\left\{ (1 + |w|^2)^{-\frac{r-t}{2}} \cdot (1 + |w|^2)^{r/2}\,\hat{u}(w) \right\}
= \mathcal{F}^{-1}\{(1 + |w|^2)^{-\frac{r-t}{2}}\} * \mathcal{F}^{-1}\{(1 + |w|^2)^{r/2}\,\hat{u}(w)\}.
\]
From the hypothesis, the exponent $-\frac{r-t}{2} < 0$; hence
\[ \mathcal{F}^{-1}\{(1 + |w|^2)^{-\frac{r-t}{2}}\} \in L^1, \]
which implies
\[ E(u_n) \in H^r(\mathbb{R}^n). \]
Define a cut-off function $\xi \in C_0^\infty$ such that $\xi = 1$ on $\Omega$, and define the sequence $v_n = \xi E(u_n)$. Then
\[ \mathrm{supp}(v_n) \subseteq \mathrm{supp}(\xi) \subseteq K. \]
The theorem implies that in a fractional Sobolev space $H^r(\Omega)$ on a bounded domain with nice regularity, any bounded sequence has a subsequence that converges in another fractional Sobolev space $H^t(\Omega)$, for any $t < r$. Another type of compact embedding for fractional Sobolev spaces is the following:
Theorem 3.10.12 Let $\Omega$ be bounded and $C^k$ (or Lipschitz) in $\mathbb{R}^n$. Then
(1) If $k > \frac{n}{2}$, then
\[ H^k(\mathbb{R}^n) \hookrightarrow C_b(\mathbb{R}^n). \]
(2) If $k > \frac{n}{2}$, then
\[ H^k(\Omega) \overset{c}{\hookrightarrow} C(\bar{\Omega}). \]
(3) If $k > m + \frac{n}{2}$, then
\[ H^k(\mathbb{R}^n) \hookrightarrow C_b^m(\mathbb{R}^n). \]
(4) If $k > m + \frac{n}{2}$, then
\[ H^k(\Omega) \overset{c}{\hookrightarrow} C^m(\bar{\Omega}). \]
Proof We will only prove the continuous inclusion (1). By performing $m$ successive iterations we can prove (3), and by the extension theorem and the Arzelà–Ascoli theorem we can prove (2) and (4). Since $\mathcal{S}(\mathbb{R}^n)$ is dense in $H^k(\mathbb{R}^n)$, it suffices to prove the result for $u \in \mathcal{S}(\mathbb{R}^n) \cap H^k(\mathbb{R}^n)$. But this implies that
\[ \|u\|_\infty \le \int_{\mathbb{R}^n} |\hat{u}(w)|\,dw = C \int_{\mathbb{R}^n} \frac{1}{(1 + |w|^2)^{k/2}}\,(1 + |w|^2)^{k/2}\,|\hat{u}(w)|\,dw < \infty. \]
By the Cauchy–Schwarz inequality,
\[ \|u\|_\infty \le C \left( \int_{\mathbb{R}^n} \frac{1}{(1 + |w|^2)^k}\,dw \right)^{1/2} \left( \int_{\mathbb{R}^n} (1 + |w|^2)^k\, |\hat{u}(w)|^2\,dw \right)^{1/2}. \tag{3.10.6} \]
Since $k > \frac{n}{2}$, we obtain
\[ \int_{\mathbb{R}^n} \frac{1}{(1 + |w|^2)^k}\,dw < \infty, \]
so (3.10.6) becomes
\[ \|u\|_\infty \le C\, \|u\|_{H^k(\mathbb{R}^n)}. \]
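The integrability condition driving (3.10.6) can be seen concretely for $n = 1$, $k = 1$ (so $k > n/2$): $\int_{\mathbb{R}} (1 + w^2)^{-1}\,dw = \pi$. A midpoint-rule sketch, with truncation radius and step count of our own choosing:

```python
import math

def finite_integral(k, R=1000.0, steps=200_000):
    # midpoint rule for the truncated integral of (1 + w^2)^(-k) over [-R, R],
    # the n = 1 instance of the first factor in (3.10.6)
    h = 2 * R / steps
    return sum((1 + (-R + (i + 0.5) * h) ** 2) ** (-k) for i in range(steps)) * h

# k = 1 > n/2 = 1/2: the integral converges, to pi
assert abs(finite_integral(1) - math.pi) < 0.01
```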
In Theorem 3.6.1, it was shown that any function in $W^{1,1}(I)$ has an absolutely continuous representative $\tilde{u} \in C(I)$. Theorem 3.10.12 shows that functions in $H^k(\mathbb{R})$ are always continuous and bounded for all $k \ge 1$, while there need not be continuous representatives for functions in $H^1(\mathbb{R}^2)$. In order to get bounded continuous Sobolev functions on $\mathbb{R}^3$, we need at least $H^2$.
3.11 Problems
\[ f(x) = \begin{cases} x & x > 0 \\ 0 & x \le 0. \end{cases} \]
supp{D k f } ⊆ supp{ f }.
$f_\epsilon = f * \varphi_\epsilon$.
\[ f = e^{-(a + bi)|x|^2}. \]
(a) Find $\mathcal{F}\{f\}$.
(b) Find $\mathcal{F}\{e^{-bi|x|^2}\}$.
\[ h_\epsilon(x) = \frac{1}{(2\pi\epsilon)^{n/2}}\, e^{-|x|^2/2\epsilon}, \]
and let $f_\epsilon = h_\epsilon * f$.
(a) Show that $f_\epsilon \longrightarrow f$.
(b) Show that
\[ \mathcal{F}\{h_\epsilon\} = e^{-\epsilon|x|^2/2}. \]
is a seminorm on $\mathcal{S}(\mathbb{R}^n)$.
(b) Use the Fréchet metric
\[ d(f, g) = \sum_{k=0}^{\infty} 2^{-k}\, \frac{\rho_k(f - g)}{1 + \rho_k(f - g)} \]
\[ \alpha < \frac{n - p}{p}. \]
converges to u in L p ().
(30) If $u \in L^1(\mathbb{R}^n)$, show that $u_\epsilon \in L^\infty(\mathbb{R}^n)$.
(31) Prove the case when ∂ is unbounded in Proposition 3.8.11.
(32) Show that if $\bar{u} \in W^{1,p}(\mathbb{R}^n)$ and $\Omega$ is of class $C^1$, then $u \in W_0^{1,p}(\Omega)$.
(33) If $u \in W^{1,p}((0, \infty))$, show that
\[ \lim_{x \to \infty} u(x) = 0. \]
(34) (a) Let u ∈ W 1,1 (I ) for some interval I ⊂ R. If u is weakly differentiable and
Du = 0 then u = c a.e. for some constant c ∈ R.
\[ \frac{\partial(\xi w)}{\partial x_i} = \xi\, \frac{\partial w}{\partial x_i} + w\, \frac{\partial \xi}{\partial x_i} \]
(36) Use approximation results to show that if u ∈ W 1, p (Rn ) and Dxi u = 0 for all
i = 1, 2, . . . , n, then u is constant a.e.
(37) (a) Show that if $\varphi_n \in C_c^\infty(\Omega)$ and $u \in W^{k,p}(\Omega)$, then $u\varphi_n \in W_0^{k,p}(\Omega)$.
(b) Show that if $v \in C^k(\bar{\Omega}) \cap W^{k,\infty}(\Omega)$ and $u \in W_0^{k,p}(\Omega)$, then $uv \in W_0^{k,p}(\Omega)$.
(38) Show that for every $u \in W^{k,p}(\Omega)$, there exists $w \in W^{k,p}(\Omega')$ such that $w = u$ on $\Omega$ and
\[ \|w\|_{W^{k,p}(\Omega')} \le c\, \|u\|_{W^{k,p}(\Omega)}. \]
(39) Let
\[ w_\epsilon(x) = u(x + 2\epsilon e_n), \]
and show that
\[ w_\epsilon(x) \longrightarrow u(x) \]
in $W^{k,p}(\mathbb{R}_+^n)$.
(40) Let $u \in W^{1,p}(\mathbb{R}_+^n)$. Define
\[ \bar{u}(x) = \begin{cases} u(x) & x_n > 0 \\ u(x', -x_n) & x_n < 0. \end{cases} \]
(43) Let $\Omega$ be open, $C^1$, and bounded in $\mathbb{R}^n$. Show that if $u \in W^{1,n}(\Omega)$, $n \ge 2$, then $u \in L^q(\Omega)$ for all $1 \le q < \infty$.
(51) Let u ∈ W 1, p () for some open ⊂ Rn . If p > n show that u is pointwise
differentiable and
∇u = Du a.e.
(52) Use only the Arzelà–Ascoli Theorem together with Theorem 3.6.1 to show that for all $1 < p < \infty$, we have the compact embedding
\[ W^{1,p}(I) \overset{c}{\hookrightarrow} C(\bar{I}). \]
Show that the embedding
\[ W^{1,p}(\mathbb{R}^n) \hookrightarrow L^p(\mathbb{R}^n) \]
is not compact.
(56) Let $k \le \frac{n}{2}$.
(a) Show that
\[ H^k(\mathbb{R}^n) \hookrightarrow L^p(\mathbb{R}^n) \]
for all $2 \le p < \dfrac{2n}{n - 2k}$.
(b) If $\Omega \subset \mathbb{R}^n$ is open, bounded, and $C^1$, show that
\[ H^k(\Omega) \overset{c}{\hookrightarrow} L^p(\Omega) \]
for all $2 \le p < \dfrac{2n}{n - 2k}$.
(57) (a) Give an example of a bounded sequence $u_n \in L^p(\Omega)$ that has no convergent subsequence in $L^p(\Omega)$.
(b) Use (a) to show that for q > p the inclusion
ι : L q () → L p ()
cannot be compact.
(58) If $1 \le p < \infty$ and $p = \frac{n}{m}$, then show that
\[ W^{m,p}(\mathbb{R}^n) \hookrightarrow L^q(\mathbb{R}^n). \]
Show that
\[ W^{k,p}(\Omega) \hookrightarrow L^q(\Omega), \]
where
\[ q = p^*, \qquad \frac{1}{p^*} = \frac{1}{p} - \frac{k}{n}. \]
4.1 Elliptic Partial Differential Equations

The general form of a second-order partial differential equation in $\mathbb{R}^2$ takes the form
\[ Au_{xx} + 2Bu_{xy} + Cu_{yy} + Du_x + Eu_y + F = 0. \]
In $n$ variables, consider the operator
\[ Lu(x) = -\sum_{i,j=1}^{n} a_{ij}(x)\, \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{n} b_i(x)\, \frac{\partial u}{\partial x_i}(x) + c(x)u(x), \tag{4.1.1} \]
where the coefficient matrix $A(x) = (a_{ij}(x))$ is positive definite for all $x \in \Omega$, i.e., $\xi^T A \xi > 0$ for every nonzero $\xi \in \mathbb{R}^n$. A more convenient way of writing it is in the divergence form
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 239
A. Khanfer, Applied Functional Analysis,
https://doi.org/10.1007/978-981-99-3788-2_4
240 4 Elliptic Theory
\[ Lu(x) = -\mathrm{div}(A(x)\nabla u) + b(x) \cdot \nabla u + c(x)u(x), \]
i.e.,
\[ Lu(x) = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\!\left( a_{ij}(x)\, \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i(x)\, \frac{\partial u}{\partial x_i}(x) + c(x)u(x). \tag{4.1.2} \]
The equation models many steady-state natural and physical systems (e.g., heat
conduction, diffusion, heat and mass transfer, flow of fluids, and electric potential).
The divergence term
\[ \sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\!\left( a_{ij}(x)\, \frac{\partial u}{\partial x_j} \right) \]
is the diffusion term, the first-order term $\sum_{i} b_i \frac{\partial u}{\partial x_i}$ is the advection term, and the zeroth-order term $c(x)u(x)$ is the decay term. The matrix $A(x)$ is called symmetric if $a_{ij} = a_{ji}$ for all $1 \le i, j \le n$. For the matrix $A$
to be positive definite means that all its eigenvalues are positive. In particular, for every $x \in \Omega$,
\[ \frac{\sum_{i,j=1}^{n} a_{ij}(x)\,\xi_i \xi_j}{|\xi|^2} \ge \lambda(x), \tag{4.1.3} \]
where $\lambda(x)$ is an eigenvalue of $A(x)$ and $\xi \in \mathbb{R}^n$ is nonzero. Taking the minimum over all such vectors $\xi$ gives the smallest eigenvalue $\lambda_{\min}(x) > 0$. So (4.1.3) characterizes the ellipticity of PDEs, and the smallest eigenvalue $\lambda_{\min}(x)$ depends on the chosen value of $x$ and serves as the minimum of the LHS of (4.1.3).
To make this lower bound uniform, we need to make it independent of x. This gives
rise to the following definition.
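For a symmetric $2 \times 2$ coefficient matrix, the bound (4.1.3) is just the Rayleigh-quotient characterization of the smallest eigenvalue. A small numerical illustration (the matrix entries below are hypothetical values of our choosing):

```python
import math, random

def lambda_min_2x2(a11, a12, a22):
    # smallest eigenvalue of the symmetric matrix [[a11, a12], [a12, a22]]
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    return (tr - math.sqrt(tr * tr - 4 * det)) / 2

A = (2.0, 0.5, 3.0)        # hypothetical a11, a12, a22; positive definite
lam = lambda_min_2x2(*A)
assert lam > 0             # ellipticity: lambda_min > 0

random.seed(0)
for _ in range(1000):
    x1, x2 = random.uniform(-1, 1), random.uniform(-1, 1)
    quad = A[0] * x1 * x1 + 2 * A[1] * x1 * x2 + A[2] * x2 * x2
    # (4.1.3): the quadratic form dominates lambda_min * |xi|^2
    assert quad >= lam * (x1 * x1 + x2 * x2) - 1e-12
```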
Definition 4.1.1 (Uniformly Elliptic Operator) A second-order partial differential operator of the form
\[ Lu(x) = -\sum_{i,j=1}^{n} a_{ij}(x)\, \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{n} b_i(x)\, \frac{\partial u}{\partial x_i}(x) + c(x)u(x), \qquad x \in \Omega \subseteq \mathbb{R}^n, \]
is called uniformly elliptic if there exists a positive number $\lambda_0 > 0$ such that
\[ \sum_{i,j=1}^{n} a_{ij}(x)\,\xi_i \xi_j \ge \lambda_0 |\xi|^2 \]
for all $x \in \Omega$ and all $\xi \in \mathbb{R}^n$. A stronger version of uniform ellipticity requires, in addition, an upper bound
\[ \lambda_0 |\xi|^2 \le \sum_{i,j=1}^{n} a_{ij}(x)\,\xi_i \xi_j \le \Lambda_0 |\xi|^2, \]
so that the diffusion term can be controlled from above and below. It is easy to see that choosing suitable values of $\xi_i$, $\xi_j$ yields $a_{ij} \in L^\infty(\Omega)$. This is interpreted by the fact that there is no blow-up in the diffusion process, which, in many situations, arises naturally, so we will adopt this assumption throughout this chapter. One advantage of this two-sided control of the diffusion process is that all eigenvalues of the matrix $A$ lie between these two bounds. Namely, for any eigenvalue $\lambda(x)$ of $A$ and for all $x \in \Omega$ we have
\[ \lambda_0 \le \lambda(x) \le \Lambda_0. \tag{4.1.4} \]
Another particular case is when $A(x)$ is the identity matrix (i.e., $a_{ij}(x) = \delta_{ij}$) and $b_i = c = 0$. In this case, the operator (4.1.2) reduces to the Laplace operator
\[ Lu(x) = -\nabla^2 u(x) = -\sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(x). \]
Elliptic equations are extremely important and can be used in a wide range of appli-
cations in applied mathematics and mathematical physics. The most basic examples
of elliptic equations are
(1) Laplace equation:
∇ 2 u = 0.
2. Neumann condition:
∂u
= ∇u · n = g, on ∂,
∂n
where n is the outward unit normal vector.
Laplace’s equation is the cornerstone of potential theory and describes the elec-
tric potential in a region. It is one of the most important partial differential equations
because it has numerous applications in applied mathematics, physics, and engineer-
ing. Poisson’s equation is the nonhomogeneous version of Laplace’s equation and
describes the electric potential in the presence of a charge. It plays a dominant role
in electrostatic theory and gravity. The Helmholtz equation is another fundamental
equation which has important applications in many areas of physics, such as electro-
magnetic theory, acoustics, classical and quantum mechanics, thermodynamics, and
geophysics. It is no wonder that the theory of elliptic partial differential equations stands as one of the most active areas in applied mathematics and attracts increasing interest from researchers. Therefore, studying solutions of elliptic PDEs provides a comprehensive overview of the equations of mathematical physics and a wide scope of the theory of PDEs, so it suffices for our needs in this text. Moreover, another feature of the elliptic type is that elliptic equations have no real characteristic curves; consequently, solutions of elliptic equations don't possess discontinuous derivatives, because if they did, the discontinuities could occur only along characteristic curves. This makes elliptic equations the perfect tool for investigating equilibrium (time-independent) steady-state processes, in which time has no effect and no singularities in the solution are transported.
A classical solution of a boundary value problem is a solution that satisfies the problem pointwise everywhere. It should be differentiable as many times as needed to fulfill the PDE. If the equation is of order $n$ and defined in a domain $\Omega$, then the classical solution must be in $C^n(\Omega)$, and it must also satisfy the boundary conditions of the problem for every $x \in \partial\Omega$. The problem of finding a solution to an equation in a domain that satisfies given conditions on the boundary of this domain is called a "Boundary Value Problem".
4.2 Weak Solution
Our elliptic BVP takes the following form: Let $\Omega$ be a bounded and open set in $\mathbb{R}^n$. Find $u \in C^2(\Omega) \cap C(\bar{\Omega})$ such that
\[ \begin{cases} Lu = f & x \in \Omega \\ u = g & x \in \partial\Omega \end{cases} \tag{4.2.1} \]
for some linear elliptic operator $L$ as in (4.1.2). Because of the boundary condition, this is called a Dirichlet problem. Notice that we can write the solution of this problem as $v + g$, where $v$ is the solution to the problem
\[ \begin{cases} L(v + g) = f & x \in \Omega \\ v = 0 & x \in \partial\Omega, \end{cases} \]
so for simplicity we can just assume $g = 0$ in (4.2.1). If we can find such $u$, then it will be a classical solution of the problem (4.2.1). As said earlier, these equations model natural and physical phenomena occurring in the real world. Unfortunately, these models may not admit classical solutions. In fact, many equations in various areas of applied mathematics may have solutions that are not continuously differentiable, or not even continuous (e.g., shock waves), and so finding classical solutions to these equations in general may be too restrictive. Consider for example the case when $f$ is merely a regular distribution; then the equation
\[ \nabla^2 u = f \]
need not admit a classical solution, and we must look for solutions in a weaker sense.
How can we find such solutions? Sobolev spaces come to the rescue as they provide
all the essentials to obtain these solutions. Sobolev spaces are the completion of
C ∞ spaces, and they include weakly differentiable functions that are not necessarily
continuous or differentiable in the usual sense, but they can be approximated by
some smooth functions in Cc∞ (). These proposed solutions are supposed to satisfy
these equations in a distributional sense not in pointwise sense. Recall in (2.2.1) two
distributions T and S are equal if
\[ \langle T, \varphi \rangle = \langle S, \varphi \rangle \]
for all ϕ ∈ Cc∞ (). We will use the same formulation here. Namely, we will multiply
both sides of the equation Lu = f in (4.2.1) by a test function ϕ ∈ Cc∞ () then
integrate over to obtain
\[ \int_\Omega (Lu)\,\varphi\,dx = \int_\Omega f\varphi\,dx. \tag{4.2.2} \]
By density, this extends to all functions $v \in H_0^1(\Omega)$. Indeed, for every $v \in H_0^1(\Omega)$ we can choose a sequence $\varphi_n \in C_c^\infty(\Omega)$ such that $\varphi_n \longrightarrow v$. Since the above equation is written in terms of $\varphi_n$, we pass to the limit using the Dominated Convergence Theorem to obtain
\[ \int_\Omega (Lu)\,v\,dx = \int_\Omega f v\,dx. \]
Performing integration by parts in the divergence term (the first term), making use of the fact that $v = 0$ on $\partial\Omega$, yields
\[ \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}\, \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} + \sum_{i=1}^{n} b_i\, \frac{\partial u}{\partial x_i}\, v + cuv \right) dx = \langle f, v \rangle_{L^2(\Omega)}. \tag{4.2.3} \]
Definition 4.2.1 (Weak Formulation, Weak Solution) Consider the Dirichlet problem
\[ \begin{cases} Lu = f(x) & x \in \Omega \\ u = 0 & x \in \partial\Omega, \end{cases} \tag{4.2.4} \]
where
\[ Lu(x) = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\!\left( a_{ij}(x)\, \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i(x)\, \frac{\partial u}{\partial x_i}(x) + c(x)u(x). \]
The weak formulation of the problem is
\[ \begin{cases} \displaystyle\int_\Omega (Lu)\,v\,dx = \langle f, v \rangle_{L^2(\Omega)} & x \in \Omega \\ u = 0 & x \in \partial\Omega \end{cases} \tag{4.2.5} \]
for all $v \in H_0^1(\Omega)$.
Here, we need to emphasize that $\langle f, v \rangle_{L^2(\Omega)}$ is not really an inner product, but rather a convenient abuse of notation, because it behaves the same way. Since
\[ |\langle f, v \rangle| < \infty, \]
$f$ defines a bounded linear functional
\[ f(v) = \langle f, v \rangle. \]
So the problem is equivalent to the problem of finding $u \in H_0^1(\Omega)$ such that for $f \in H^{-1}(\Omega)$,
\[ \langle f, v \rangle = B[u, v], \]
⎛ ⎞
n n
⎝− ∂ ∂u ∂u
ai j (x) + bi (x) (x) + c(x)u(x) − f ⎠ vd x = 0.
i, j=1
∂x i ∂x j i=1
∂x i
By the Fundamental Lemma of the Calculus of Variations (Lemma 3.2.9), we see that $u$ satisfies (4.2.4) almost everywhere; but since the terms on both sides of (4.2.4) are continuous, the result extends by continuity to all $x \in \Omega$, and thus $u$ is a classical solution of (4.2.4). Define $B[u, v]$ to be the integral on the LHS of (4.2.5). This is called the bilinear form associated with $L$, and equation (4.2.5) can be written as
\[ B[u, v] = f(v). \]
This $B$ will play a dominant role in establishing the existence of weak solutions of elliptic PDEs.
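As a concrete one-dimensional sketch of solving $B[u, v] = f(v)$: for $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, the exact solution is $u(x) = x(1-x)/2$, and the standard finite-difference system (the discrete analogue of the weak formulation) reproduces it at the nodes. The discretization below is our illustration, not the book's method:

```python
def solve_poisson_1d(m, f=lambda x: 1.0):
    # solve -u'' = f on (0,1), u(0)=u(1)=0, by the standard 3-point scheme;
    # the tridiagonal system is solved with the Thomas algorithm
    h = 1.0 / (m + 1)
    a = [-1.0] * m; b = [2.0] * m; c = [-1.0] * m     # sub/diag/super (scaled by 1/h^2)
    d = [f((i + 1) * h) * h * h for i in range(m)]
    for i in range(1, m):                              # forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    u = [0.0] * m
    u[-1] = d[-1] / b[-1]
    for i in range(m - 2, -1, -1):                     # back substitution
        u[i] = (d[i] - c[i] * u[i + 1]) / b[i]
    return u

m = 99
u = solve_poisson_1d(m)
exact = [x * (1 - x) / 2 for x in ((i + 1) / (m + 1) for i in range(m))]
assert max(abs(ui - ei) for ui, ei in zip(u, exact)) < 1e-10
```

Since the exact solution is quadratic, the 3-point difference is exact here; for general $f$ the scheme converges at rate $O(h^2)$.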
\[ \begin{cases} -\nabla^2 u = f & \text{in } \Omega \\ u = 0 & \text{on } \partial\Omega \end{cases} \]
for some f ∈ L 2 (). If u ∈ C 2 () and satisfies the equation pointwise for every
x ∈ together with the boundary condition, then u is a classical solution to the
problem, and in this case we get $f \in C(\Omega)$. When it comes to applications in science and engineering, the condition of continuous data is a bit restrictive in practice, and requiring $f$ to be measurable and $L^2$-integrable seems more realistic in many situations. If it happens that $u \in H^2(\Omega) \cap H_0^1(\Omega)$ satisfies the equation at almost all $x \in \Omega$, i.e., except for a set of measure zero, then $u$ is a strong solution. Notice the difference between the two notions: $u$ continues
to be a Sobolev function, and so it is measurable but not continuous, and conse-
quently the boundary condition cannot be taken pointwise because the boundary of
has measure zero and measurable functions don’t change by a set of measure zero
(remember that in L p spaces we are dealing with classes of functions rather than
functions). Nevertheless, the function belongs to H 2 , so it possesses second weak
derivatives and this allows it to satisfy the equation pointwise almost everywhere and
produces f ∈ L 2 () as a result of the calculations. If u ∈ H01 () and satisfies the
variational weak formulation
4.3 Poincare Equivalent Norm 247
Du.Dvd x = f vd x
for all $v \in H_0^1(\Omega)$, then $u$ is a weak solution of the equation. Observe here that $u$ satisfies neither the equation nor the boundary condition pointwise, but rather globally via an integration over the domain. We only require the first weak derivative of $u$ to exist, and since $H^2(\Omega) \subset H^1(\Omega)$ we see that every strong solution is indeed a weak solution, but the converse is not necessarily true. Thus it may happen that the equation has a weak solution but not a strong solution; but if the weak solution turns out to be in $H^2(\Omega)$, then it becomes a strong solution. Sections 4.9 and 4.10 investigate this direction thoroughly.
We end the section with the following important remark: the notion of weak solution should not be confused with the notion of weak derivative. The former is called "weak" because it satisfies the weak formulation of the PDE, which is a weaker condition than satisfying the equation pointwise, but this doesn't mean it satisfies the equation with its weak derivatives. For example, the function (3.1.1) is a weak solution of the Laplace equation $\nabla^2 u = 0$ although it is not weakly differentiable, as illustrated in Sect. 3.1.
4.3 Poincaré Equivalent Norm

In this section, we will deduce some important results that are very useful in establishing existence and uniqueness of weak solutions. Theorem 3.9.8 discussed the Poincaré inequality in $W_0^{1,p}$ as a consequence of the Gagliardo–Nirenberg–Sobolev inequality. According to the remark after Theorem 3.9.8, the inequality holds for $p = 2$ for all $n$. Here is a restatement of the result with an alternative proof that doesn't depend on the GNS inequality, which shows that it holds for $1 \le p < \infty$, and in which the domain can be unbounded in general but bounded in at least one direction; this gives extra flexibility in the choice of $\Omega$.
Theorem 4.3.1 (Poincaré Inequality on $H_0^1$) Let $\Omega \subset \mathbb{R}^n$ be an open set that is bounded in at least one direction of $\mathbb{R}^n$. Then there exists $C > 0$ (which depends only on $\Omega$) such that for every $u \in H_0^1(\Omega)$, we have
\[ \|u\|_{L^2(\Omega)} \le C\, \|Du\|_{L^2(\Omega)}. \]
Proof For $n > 1$, we assume that $\Omega$ is bounded in the $x_i$ direction, that is,
\[ |x_i| \le M \]
for all $x \in \Omega$. For $u \in C_c^\infty(\Omega)$, integration by parts gives
\[ \|u\|^2 = \int_\Omega |u|^2\,dx = -\int_\Omega 2x_i\, u\, \frac{\partial u}{\partial x_i}\,dx. \]
The inequality extends by density to $u \in H_0^1(\Omega)$, where we can take a sequence $u_n \in C_c^\infty(\Omega)$ such that $u_n \longrightarrow u$ in $H_0^1$, which implies
\[ D^i u_n \longrightarrow D^i u \quad \text{in } L^2(\Omega) \]
for $i = 0, 1$.
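On $\Omega = (0,1)$ the optimal Poincaré constant is known to be $1/\pi$, attained by $\sin(\pi x)$. A grid check of $\|u\|_{L^2} \le C \|Du\|_{L^2}$ for this extremal function (the midpoint discretization is ours):

```python
import math

N = 100_000
h = 1.0 / N
xs = [(i + 0.5) * h for i in range(N)]                               # midpoints of (0,1)
u_sq  = sum(math.sin(math.pi * x) ** 2 for x in xs) * h              # ||u||_2^2 = 1/2
du_sq = sum((math.pi * math.cos(math.pi * x)) ** 2 for x in xs) * h  # ||u'||_2^2 = pi^2/2
assert abs(u_sq - 0.5) < 1e-6
assert abs(du_sq - math.pi ** 2 / 2) < 1e-4
# Poincare inequality with the sharp constant C = 1/pi (equality for this u)
assert math.sqrt(u_sq) <= (1 / math.pi) * math.sqrt(du_sq) + 1e-8
```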
defines an inner product on $H_0^1(\Omega)$, and $H_0^1(\Omega)$ endowed with this inner product is a Hilbert space. This implies that the norm $\|Du\|_{L^2(\Omega)}$ is equivalent to the standard norm $\|u\|_{H_0^1(\Omega)}$, and so we can define the following inner product on $H_0^1(\Omega)$:
such that
\[ \langle u, v \rangle_\partial = \int_\Omega Du \cdot Dv\,dx, \]
and the result follows since $H_0^1(\Omega)$ endowed with $\|\cdot\|_{H_0^1(\Omega)}$ is a complete space.
The norm
\[ \|u\|_\partial = \|Du\|_{L^2(\Omega)} \]
on $H_0^1(\Omega)$ shall be called the Poincaré norm. In a similar fashion, we write $\|u\|_{\partial^2}$ to denote $\|D^2 u\|_{L^2(\Omega)}$.
One main concern about the Poincaré inequality is when $u$ is constant over $\Omega$. Of course, in this case $u \notin H_0^1(\Omega)$. So we need to generalize the inequality to include this case, and in the general space $H^1(\Omega)$. To motivate this, we need the following definition:
Definition 4.3.3 (Mean Value) Let $\Omega \subset \mathbb{R}^n$. Then the mean value of a function $u$ over $\Omega$, denoted by $\bar{u}_\Omega$, is given by
\[ \bar{u}_\Omega = \frac{1}{|\Omega|} \int_\Omega u\,dx. \]
Proof If the estimate above doesn't hold, then for every $m > 0$ there exists $u_m \in H^1(\Omega)$ violating it with constant $m$. Then, for every $\varphi \in C_c^\infty(\Omega)$,
\[
\int_\Omega v\, \frac{\partial \varphi}{\partial x_i}\,dx = \lim_m \int_\Omega v_m\, \frac{\partial \varphi}{\partial x_i}\,dx = -\lim_m \int_\Omega \varphi\, \frac{\partial v_m}{\partial x_i}\,dx = -\lim_m \frac{1}{m} \int_\Omega \varphi\,dx = 0.
\]
In view of the above result, we can reduce the space $H^1(\Omega)$ by collapsing the constants to zero, considering the quotient Sobolev space
\[ \tilde{H}^1(\Omega) = H^1(\Omega)/\mathbb{R}, \]
which consists of all the equivalence classes $[u] = \tilde{u}$ with respect to the equivalence relation $u \sim v$ iff $u - v$ is constant; thus $u, v \in \tilde{u}$ implies that $u - v$ is a constant. So, in this space the functions $u$ and $u + 1$ are the same because they belong to the same equivalence class $[u]$. If this space is endowed with the Poincaré norm
In words, a bilinear mapping is linear in each coordinate; i.e., $x \longmapsto B(x, y)$ is linear for every fixed $y \in H_2$, and $y \longmapsto B(x, y)$ is linear for every fixed $x \in H_1$. A typical example of a bilinear form on the Euclidean space $\mathbb{R}^n$ takes the form
\[ B[x, y] = \langle Ax, y \rangle = \sum_{i,j} a_{ij}\, x_i y_j \]
for some $n \times n$ matrix $A = (a_{ij})$; for $A = I$ this is the dot product. In general, if $H$ is any Hilbert space with $\mathbb{R}$ as the underlying field, then the inner product on $H \times H$ is bilinear.
Moreover, we have the following.
Definition 4.4.2 Let $B : H \times H \to \mathbb{R}$ be a bilinear mapping. Then
(1) $B$ is bounded if there exists $C > 0$ such that
\[ |B[u, v]| \le C \|u\|_H \|v\|_H \]
for all $u, v \in H$.
(2) $B$ is symmetric if
\[ B[u, v] = B[v, u] \]
for all $u, v \in H$.
(3) $B$ is positive definite if
\[ B[u, u] > 0 \]
for all $u \ne 0$.
(4) $B$ is coercive if there exists $\eta > 0$ such that
\[ B[u, u] \ge \eta\, \|u\|_H^2 \]
for all $u \in H$.
In view of Definition 4.4.2, we expect that many of the properties of linear mappings extend to bilinear mappings. In particular, we have the following, whose proof is left to the reader.
Proof Exercise.
Now we come to our elliptic bilinear map that we already introduced in the previous
section.
Definition 4.4.4 (Elliptic Bilinear Map) Let $a_{ij}, b_i, c \in L^\infty(\Omega)$ for some open $\Omega \subseteq \mathbb{R}^n$. The elliptic bilinear map $B : H_0^1(\Omega) \times H_0^1(\Omega) \longrightarrow \mathbb{R}$ is given by
\[ B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}\, \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} + \sum_{i=1}^{n} b_i\, \frac{\partial u}{\partial x_i}\, v + cuv \right) dx. \tag{4.4.1} \]
This $B$ is the bilinear form associated with the elliptic operator $L$ defined in (4.1.1), which serves our weak formulation given in (4.2.5). The conditions adopted in the definition suffice for our needs in this text. Before we establish our estimates, we need the following.
Lemma 4.4.5 (Cauchy's inequality) Let $\epsilon > 0$. Then for $s, t > 0$ we have
$$st \leq \epsilon s^2 + \frac{t^2}{4\epsilon}.$$
Proof We have
$$st = \big((2\epsilon)^{1/2}s\big)\,\frac{t}{(2\epsilon)^{1/2}}.$$
The result follows using Young's inequality with $a = (2\epsilon)^{1/2}s$ and $b = \dfrac{t}{(2\epsilon)^{1/2}}$.
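The inequality is elementary but easy to misremember; a quick numerical sanity check (values chosen arbitrarily) confirms it, including the equality case $t = 2\epsilon s$:

```python
import numpy as np

# Numerical check of Cauchy's inequality with epsilon: s*t <= eps*s^2 + t^2/(4*eps).
rng = np.random.default_rng(1)
for _ in range(1000):
    s, t, eps = rng.uniform(0.01, 10.0, size=3)
    assert s * t <= eps * s**2 + t**2 / (4 * eps) + 1e-12

# Equality holds exactly when t = 2*eps*s (the AM-GM case).
s, eps = 3.0, 0.5
t = 2 * eps * s
assert np.isclose(s * t, eps * s**2 + t**2 / (4 * eps))
```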
Theorem 4.4.6 (Elliptic Estimates) Let $B$ be the elliptic bilinear map (4.4.1) for some open $\Omega$ in $\mathbb{R}^n$.
4.4 Elliptic Estimates 253
(1) If $a_{ij}, b_i, c \in L^\infty(\Omega)$, then $B$ is bounded; i.e., there exists $\alpha > 0$ such that for every $u, v \in H_0^1(\Omega)$ we have
$$|B[u,v]| \leq \alpha\,\|u\|_{H_0^1(\Omega)}\|v\|_{H_0^1(\Omega)}.$$
Remark For convenience, we write $\|\cdot\|_{L^\infty(\Omega)} = \|\cdot\|_\infty$ and $\|\cdot\|_{L^2(\Omega)} = \|\cdot\|_2$.
Then we have
$$|B[u,v]| \leq \sum_{i,j=1}^{n}\|a_{ij}\|_\infty\int_\Omega\Big|\frac{\partial u}{\partial x_i}\Big|\Big|\frac{\partial v}{\partial x_j}\Big|\,dx + \sum_{i=1}^{n}\|b_i\|_\infty\int_\Omega\Big|\frac{\partial u}{\partial x_i}\Big||v|\,dx + \|c\|_\infty\int_\Omega|u||v|\,dx.$$
For part (2), by the uniform ellipticity of $L$ there exists $\lambda_0 > 0$ such that
$$\sum_{i,j=1}^{n}a_{ij}(x)\xi_i\xi_j \geq \lambda_0|\xi|^2 \tag{4.4.2}$$
for all $\xi \in \mathbb{R}^n$. Let $\xi = Du$, substitute in (4.4.2), and integrate both sides over $\Omega$. Then we have
$$\lambda_0\int_\Omega|Du|^2\,dx \leq \int_\Omega\sum_{i,j=1}^{n}a_{ij}(x)\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\,dx$$
$$= B[u,u] - \sum_{i=1}^{n}\int_\Omega b_i(x)\frac{\partial u}{\partial x_i}\,u\,dx - \int_\Omega cu^2\,dx \leq B[u,u] + \sum_{i=1}^{n}\|b_i\|_\infty\int_\Omega\Big|\frac{\partial u}{\partial x_i}\Big||u|\,dx + \|c\|_\infty\int_\Omega|u|^2\,dx.$$
Now we substitute $s = \Big|\dfrac{\partial u}{\partial x_i}\Big|$ and $t = |u|$ in Cauchy's inequality with
$$0 < \epsilon < \frac{\lambda_0}{2\sum_{i=1}^{n}\|b_i\|_\infty}.$$
This gives
$$\lambda_0\int_\Omega|Du|^2\,dx \leq B[u,u] + \epsilon\sum_{i=1}^{n}\|b_i\|_\infty\int_\Omega|Du|^2\,dx + \Big(\frac{1}{4\epsilon} + \|c\|_\infty\Big)\int_\Omega u^2\,dx.$$
It follows that
$$\frac{\lambda_0}{2}\int_\Omega|Du|^2\,dx \leq \Big(\lambda_0 - \epsilon\sum_{i=1}^{n}\|b_i\|_\infty\Big)\int_\Omega|Du|^2\,dx \leq B[u,u] + \Big(\frac{1}{4\epsilon} + \|c\|_\infty\Big)\int_\Omega u^2\,dx,$$
which implies
$$\beta\,\|u\|^2_{H_0^1(\Omega)} \leq \frac{\lambda_0}{2}\|Du\|_2^2 \leq B[u,u] + \gamma\,\|u\|_2^2,$$
where
$$\beta = \frac{\lambda_0}{2(C^2+1)}, \qquad \gamma = \frac{1}{4\epsilon} + \|c\|_\infty.$$
If $b_i = 0$, then the same estimate holds with
$$\beta = \frac{\lambda_0}{C^2+1}, \qquad \gamma = \|c\|_\infty.$$
4.5 Symmetric Elliptic Operators 255
Theorem 4.5.1 (Riesz Representation Theorem (RRT) for Hilbert Spaces) Let $H$ be a Hilbert space, and $0 \neq f \in H^*$. Then there exists a unique element $u \in H$ such that
$$f(v) = \langle v, u\rangle$$
for all $v \in H$.
$$0 = f(v)\langle y_0, y_0\rangle - f(y_0)\langle v, y_0\rangle.$$
Then
$$u = \frac{f(y_0)}{\langle y_0, y_0\rangle}\,y_0.$$
Choosing $v = u - u'$ gives
$$u = u',$$
which establishes uniqueness. Note that $f \in H^*$ and $\|f\| \leq \|u\|$. On the other hand,
$$-\nabla^2u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{4.5.1}$$
where $f \in L^2(\Omega)$ for some open $\Omega \subset \mathbb{R}^n$ that is bounded in at least one direction. Then there exists a unique weak solution $u \in H_0^1(\Omega)$ for problem (4.5.1).
$$B[u,v] = \int_\Omega Du\cdot Dv\,dx = \langle u, v\rangle_\partial,$$
which is the Poincaré inner product defined in (4.3.4). Proposition 4.3.5 asserts that $H_0^1(\Omega)$ with this inner product is a Hilbert space, and since $f \in L^2(\Omega)$, the functional takes the form
$$f(v) = \int_\Omega f v\,dx,$$
so
$$f \in (H_0^1(\Omega))^* = H^{-1}(\Omega),$$
and therefore by the Riesz Representation Theorem (RRT) there exists a unique $u \in H_0^1(\Omega)$ satisfying the equation, and clearly
$$u\big|_{\partial\Omega} = 0.$$
$$-\nabla^2u + u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{4.5.2}$$
where $f \in H^{-1}(\Omega)$ for some $\Omega \subseteq \mathbb{R}^n$. Then there exists a unique weak solution $u \in H_0^1(\Omega)$ for problem (4.5.2).
which is elliptic with $a_{ij}(x) = \delta_{ij}$, $b_i(x) = 0$ for all $i = 1, \ldots, n$, and $c(x) = 1$. Here, the weak formulation takes the following form:
$$B[u,v] = \langle f, v\rangle = f(v)$$
for all $v \in H_0^1(\Omega)$, where $f$ is a bounded linear functional on $H_0^1(\Omega)$. Thus, by the Riesz Representation Theorem (RRT), there exists a unique function $u \in H_0^1(\Omega)$ satisfying the equation, and of course
$$u\big|_{\partial\Omega} = 0.$$
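The existence-and-uniqueness statement can be seen at work in a one-dimensional finite-difference discretization of problem (4.5.2) (a sketch under assumed data $f$, not taken from the book): the resulting matrix is symmetric positive definite, so the discrete problem has exactly one solution.

```python
import numpy as np

# Finite-difference sketch of -u'' + u = f on (0,1), u(0) = u(1) = 0,
# illustrating unique solvability of the discrete Helmholtz-type problem.
n = 200
h = 1.0 / n
x = np.linspace(0, 1, n + 1)[1:-1]           # interior grid points

# Discrete operator: tridiagonal, symmetric positive definite, hence invertible.
L = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2 + np.eye(n - 1)

u_exact = np.sin(np.pi * x)                  # manufactured solution
f = (np.pi**2 + 1) * np.sin(np.pi * x)       # chosen so that -u'' + u = f
u = np.linalg.solve(L, f)                    # the unique discrete solution

assert np.max(np.abs(u - u_exact)) < 1e-3    # second-order accuracy
```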
$$Lu = -\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial u}{\partial x_j}\Big) + c(x)u(x).$$
$$L = -\sum_{i,j=1}^{n}\partial_{x_i}\big(a_{ij}(x)\,\partial_{x_j}\big) + c(x), \tag{4.5.3}$$
defined on $H_0^1(\Omega)$ for some open $\Omega \subseteq \mathbb{R}^n$ bounded in at least one direction, and let $a_{ij}, c \in L^\infty(\Omega)$ with $c(x) \geq 0$.
(1) If L is uniformly elliptic, then the associated elliptic bilinear map B[u, v] is
coercive.
(2) Moreover, if A = (ai j ) is symmetric then B defines a complete inner product on
H01 ().
Proof The elliptic bilinear map associated with $L$ takes the form
$$B[u,v] = \int_\Omega\Big(\sum_{i,j=1}^{n}a_{ij}(x)\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j} + c(x)u(x)v(x)\Big)dx.$$
By the uniform ellipticity of $L$, there exists $\lambda_0 > 0$ such that for all $\xi \in \mathbb{R}^n$
$$\sum_{i,j=1}^{n}a_{ij}(x)\xi_i\xi_j \geq \lambda_0|\xi|^2. \tag{4.5.4}$$
$$\begin{aligned}
B[u,u] &\geq \int_\Omega \lambda_0|Du|^2 + c(x)u^2\,dx \geq \lambda_0\int_\Omega|Du|^2\,dx = \int_\Omega \frac{\lambda_0}{2}|Du|^2 + \frac{\lambda_0}{2}|Du|^2\,dx\\
&\geq \int_\Omega \frac{\lambda_0}{2}|Du|^2 + \frac{\lambda_0}{2C^2}u^2\,dx \quad\text{(by the Poincaré inequality)}\\
&\geq \sigma\int_\Omega |Du|^2 + u^2\,dx = \sigma\,\|u\|^2_{H_0^1(\Omega)},
\end{aligned}$$
where $\sigma = \min\{\lambda_0/2,\ \lambda_0/(2C^2)\}$.
In particular, $B[u,u] > 0$ for $u \neq 0$, or
$$\|u\|_B \geq \sqrt{\sigma}\,\|u\|_{H_0^1(\Omega)}. \tag{4.5.5}$$
where
$$M = \max\Big\{\sum_{i,j=1}^{n}\|a_{ij}\|_{L^\infty(\Omega)},\ \|c\|_{L^\infty(\Omega)}\Big\}.$$
Then (4.5.5) and (4.5.6) imply that the inner product $\langle u,u\rangle_B$ is equivalent to the standard inner product $\langle u,u\rangle_{H_0^1(\Omega)}$, and thus the space $(H_0^1(\Omega), \langle\cdot,\cdot\rangle_B)$ is a Hilbert space.
The next theorem provides an existence and uniqueness result for (4.2.4) for a symmetric uniformly elliptic operator of the form (4.5.3). Remember that the condition $b_i = 0$ is essential for the symmetry of $L$.
Theorem 4.5.5 Consider the Dirichlet elliptic problem
$$Lu = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{4.5.7}$$
where $L$ is a uniformly elliptic operator of the form (4.5.3) defined on some open set $\Omega \subseteq \mathbb{R}^n$ bounded in at least one direction. If $A = (a_{ij})$ is symmetric, and $a_{ij}, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \geq 0$, then there exists a unique weak solution for the problem (4.5.7).
Proof As in Theorem 4.5.4, we have
$$B[u,v] = \int_\Omega\Big(\sum_{i,j=1}^{n}a_{ij}(x)\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j} + c(x)u(x)v(x)\Big)dx.$$
The elliptic bilinear map $B$ in Theorem 4.5.4 for a symmetric elliptic operator defines the most general inner product on $H_0^1(\Omega)$, in the sense that if $A$ is the identity matrix and $c = 1$, then
$$\langle u, u\rangle_B = \langle u, u\rangle_{H_0^1(\Omega)},$$
while if $A$ is the identity matrix and $c = 0$, then
$$\langle u, u\rangle_B = \langle u, u\rangle_\partial.$$
4.6 General Elliptic Operators 261
If $b_i \neq 0$ for at least one $i$, then $L$ is not symmetric, which implies that $B$ is not symmetric. In this case, $B$ cannot define an inner product, and consequently we cannot apply the Riesz representation theorem. Therefore, we need to investigate a more general version of the Riesz representation theorem that allows us to deal with general elliptic operators that are not symmetric. The following theorem is fundamental and serves our needs.
Theorem 4.6.1 (Lax–Milgram) Let $B: H \times H \to \mathbb{R}$ be a bilinear mapping on a Hilbert space $H$. If $B$ is bounded and coercive, then for every $f \in H^*$ there exists a unique $u \in H$ such that
$$B[u,v] = \langle f, v\rangle$$
for all $v \in H$.
Proof Let u ∈ H be a fixed element. Then the mapping v −→ B[u, v] defines a
bounded linear functional on H and so by the Riesz representation theorem, there
exists a unique w = wu ∈ H such that
B[u, v] = wu , v
for all $v \in H$. Our claim is that the mapping $u \longmapsto w_u$ is onto and one-to-one, which implies the existence of $u \in H$ such that
$$w_u = f.$$
Writing $Tu = w_u$, we have
$$B[u,v] = \langle Tu, v\rangle.$$
By coercivity, $\eta\|u\|_H^2 \leq B[u,u] = \langle Tu, u\rangle \leq \|Tu\|_H\|u\|_H$, so
$$\eta\,\|u\|_H \leq \|Tu\|_H. \tag{4.6.1}$$
The next step is to show that R(T ) is closed, then show that
R(T )⊥ = {0}.
This implies that R(T ) = H and so T is onto. To show that R(T ) is closed, let wn
be Cauchy in R(T ). Then for each n, there exists u n such that
T (u n ) = wn .
$$H = R(T) \oplus R(T)^\perp.$$
Suppose $y \in R(T)^\perp$, so that $\langle y, w\rangle = 0$ for all $w \in R(T)$. By coercivity of $B$,
$$\eta\,\|y\|_H^2 \leq B[y,y] = \langle Ty, y\rangle = 0,$$
hence $y = 0$ and $R(T)^\perp = \{0\}$. For injectivity, if $Tu_1 = Tu_2$, then (4.6.1) applied to $u_1 - u_2$ gives $u_1 = u_2$,
The next theorem is basically the same as Theorem 4.5.5 except that the symmetry condition for $A$ is relaxed.
Theorem 4.6.2 Consider the Dirichlet elliptic problem
$$Lu = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega. \tag{4.6.2}$$
Here
$$f \in L^2(\Omega) \subset H^{-1}(\Omega),$$
and we seek $u$ satisfying
$$f(v) = B[u,v]$$
for all $v \in H_0^1(\Omega)$. The result follows from the Lax–Milgram theorem.
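A discrete analogue (assumed coefficients $b$, $c$, not from the text) shows what Lax–Milgram buys us: the stiffness matrix of a nonsymmetric problem $-u'' + bu' + cu = f$ is not symmetric, so no inner-product/RRT argument applies, yet the system remains uniquely solvable.

```python
import numpy as np

# Sketch: -u'' + b u' + c u = f on (0,1) with u(0) = u(1) = 0 by central
# finite differences. The matrix is nonsymmetric but strictly diagonally
# dominant, hence invertible: a discrete shadow of the Lax-Milgram theorem.
n, b, c = 100, 3.0, 1.0
h = 1.0 / n
main = 2.0 / h**2 + c
upper = -1.0 / h**2 + b / (2 * h)            # central difference for u'
lower = -1.0 / h**2 - b / (2 * h)
A = (np.diag(main * np.ones(n - 1)) + np.diag(upper * np.ones(n - 2), 1)
     + np.diag(lower * np.ones(n - 2), -1))

assert not np.allclose(A, A.T)               # the bilinear form is not symmetric
f = np.ones(n - 1)
u = np.linalg.solve(A, f)                    # unique discrete weak solution
assert np.allclose(A @ u, f)                 # it really solves the system
```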
Recall that in an elliptic operator $L$ the coefficient matrix $A$ is positive definite, so its eigenvalues are all positive. In the previous theorem it was assumed that $c \geq 0$. For arbitrary $c$, however, we need to avoid the values of $c$ for which $0$ becomes an eigenvalue of $L$, since then the LHS of (4.5.7) vanishes for some nonzero $u$ and hence $f$ cannot always be obtained.
Now we will study the solution of the equation
$$Lu + \mu u = f \ \text{in } \Omega,$$
where
$$Lu = -\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial u}{\partial x_j}\Big) + c(x)u(x),$$
so the zeroth-order term in the equation is $(c(x)+\mu)u$. When we relax the condition $c \geq 0$, we will make use of
$$\gamma = \|c\|_\infty.$$
If $\mu \geq \gamma$, then $c(x) + \mu \geq 0$ for all choices of $c$, so by Theorem 4.5.4 the elliptic bilinear map $B[u,v]$ is coercive. Thus we have the following.
Theorem 4.6.3 Consider the Dirichlet elliptic problem
$$Lu + \mu u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{4.6.4}$$
for some uniformly elliptic operator $L$ of the form (4.5.3), such that $a_{ij}, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$ for some open $\Omega$ bounded in at least one direction in $\mathbb{R}^n$. If $\mu \geq \gamma$, where $\gamma$ is the constant obtained in Gårding's inequality, then there exists a unique weak solution for the problem (4.6.4).
$$L_\mu = (L + \mu I): H_0^1(\Omega) \longrightarrow H^{-1}(\Omega) \tag{4.6.5}$$
Now we investigate elliptic PDEs with Neumann conditions. Consider the problem
$$Lu = f \ \text{in } \Omega, \qquad \nabla u\cdot n = 0 \ \text{on } \partial\Omega, \tag{4.6.6}$$
where $L$ is the elliptic operator in (4.5.3). Here, we will assume $\Omega$ is bounded and open in $\mathbb{R}^n$ for $n \geq 2$. The weak formulation of the problem takes the following form: find $u \in H^1(\Omega)$ such that
$$\int_\Omega\Big(\sum_{i,j=1}^{n}a_{ij}(x)\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j} + c(x)u(x)v(x)\Big)dx = \langle f, v\rangle \ \text{in } \Omega, \qquad \nabla u\cdot n = 0 \ \text{on } \partial\Omega.$$
The argument for finding the weak formulation of the problem is almost the same as for the Dirichlet problem, except for one thing: the solution doesn't vanish on the boundary, so our test functions will be in $C^\infty(\overline\Omega)$, and consequently our solution space will be $H^1(\Omega)$ rather than $H_0^1(\Omega)$. This means that we won't be able to use the Poincaré inequality, since it requires $H_0^1(\Omega)$. To solve the problem we assume that $a_{ij}, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \geq 0$ on $\Omega$. Here, the case $c = 0$ should be treated with extra care, for reasons that will be discussed shortly. So we need to discuss the two cases separately: the first case is when $c(x)$ is bounded away from $0$, and the second is when $c = 0$.
Theorem 4.6.4 Let $a_{ij}, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$ in the problem (4.6.6) defined on a bounded open $\Omega$ in $\mathbb{R}^n$ for $n \geq 2$. If $c(x) \geq m > 0$, then there exists a unique weak solution $u \in H^1(\Omega)$ for problem (4.6.6), and for some $C > 0$ we have
$$\|u\|_{H^1(\Omega)} \leq C\,\|f\|_{L^2(\Omega)}.$$
Then,
$$|B[u,v]| \leq \sum_{i,j=1}^{n}\|a_{ij}\|_{L^\infty(\Omega)}\int_\Omega|Du||Dv|\,dx + \|c\|_{L^\infty(\Omega)}\int_\Omega|u||v|\,dx \leq \alpha\Big(\int_\Omega|Du||Dv|\,dx + \int_\Omega|u||v|\,dx\Big) \leq 2\alpha\,\|u\|_{H^1(\Omega)}\|v\|_{H^1(\Omega)},$$
where
$$\alpha = \max\Big\{\sum_{i,j=1}^{n}\|a_{ij}\|_{L^\infty(\Omega)},\ \|c\|_{L^\infty(\Omega)}\Big\}.$$
$$Lu = f \ \text{in } \Omega, \qquad \nabla u\cdot n = 0 \ \text{on } \partial\Omega. \tag{4.6.8}$$
The weak formulation of the problem takes the following form: find $u \in H^1(\Omega)$ such that
$$\sum_{i,j=1}^{n}\int_\Omega a_{ij}(x)\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_j}\,dx = \langle f, v\rangle \ \text{in } \Omega, \qquad \nabla u\cdot n = 0 \ \text{on } \partial\Omega.$$
The difficulty of this problem stems from the fact that existence and uniqueness are not guaranteed: if $u$ is a weak solution to the problem, then $u + k$ is also a solution for any constant $k$. Moreover, taking a constant $k$ as a test function forces the compatibility condition
$$\langle f, k\rangle = 0.$$
We therefore normalize the solution by requiring
$$\bar{u} = 0,$$
and so the Poincaré–Wirtinger inequality and the quotient Sobolev space will be invoked here.
Theorem 4.6.5 Let $a_{ij} \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$ in problem (4.6.8) defined on a bounded connected open $\Omega$ in $\mathbb{R}^n$ for $n \geq 2$. If
$$\bar{u} = 0$$
and
$$\langle f, 1\rangle_{L^2(\Omega)} = 0,$$
then there exists a unique weak solution $u \in H^1(\Omega)$ for problem (4.6.8). Moreover, for some $C > 0$ we have
$$\|u\|_{H^1(\Omega)} \leq C\,\|f\|_{L^2(\Omega)}.$$
Proof Consider the quotient Sobolev space $\tilde{H}^1(\Omega)$ with the Poincaré norm, which is a Hilbert space by Proposition 4.3.5. The associated elliptic bilinear map satisfies
$$|B[\tilde u,\tilde v]| \leq \sum_{i,j=1}^{n}\|a_{ij}\|_{L^\infty(\Omega)}\int_\Omega|Du||Dv|\,dx,$$
where
$$\alpha = \sum_{i,j=1}^{n}\|a_{ij}\|_{L^\infty(\Omega)},$$
and, since $\langle f, \bar v\rangle = 0$,
$$|f(\tilde v)| = |\langle f, v\rangle| = |\langle f, v - \bar v\rangle| \leq \|f\|_{L^2}\,\|v - \bar v\|_{L^2}.$$
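The compatibility condition and the loss of uniqueness can be seen in a small finite-difference model of the pure Neumann problem (the discretization details below are assumptions of this sketch, not from the text):

```python
import numpy as np

# Sketch: -u'' = f on (0,1), u'(0) = u'(1) = 0. The discrete operator is
# singular (constants lie in its null space), so a solution exists only when
# f is compatible (zero mean), and is unique only after fixing the mean of u.
n = 100
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
A = np.zeros((n + 1, n + 1))
A[0, 0], A[0, 1] = 1.0, -1.0                 # reflected (Neumann) boundary rows
A[n, n], A[n, n - 1] = 1.0, -1.0
for i in range(1, n):
    A[i, i - 1], A[i, i], A[i, i + 1] = -1.0, 2.0, -1.0
A /= h**2
w = np.full(n + 1, 1.0)
w[0] = w[n] = 0.5                            # trapezoidal row weights

assert np.linalg.matrix_rank(A) == n         # null space = constants

f_ok = np.cos(np.pi * x)                     # zero-mean: compatible
u = np.linalg.lstsq(A, w * f_ok, rcond=None)[0]
u -= u.mean()                                # pick the mean-zero representative
assert np.max(np.abs(u - np.cos(np.pi * x) / np.pi**2)) < 1e-2

f_bad = np.ones(n + 1)                       # mean one: incompatible
r = A @ np.linalg.lstsq(A, w * f_bad, rcond=None)[0] - w * f_bad
assert np.linalg.norm(r) > 1e-3              # no solution exists
```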
$$L_\mu^{-1}: H^{-1}(\Omega) \longrightarrow H_0^1(\Omega).$$
$$L = -\sum_{i,j=1}^{n}\partial_i\big(a_{ij}\,\partial_j\big) + c(x) + \mu,$$
defined on an open bounded set $\Omega \subset \mathbb{R}^n$. It was proved that for every $f \in L^2(\Omega)$ there exists a unique weak solution $u \in H_0^1(\Omega)$ such that $B[u,v] = \langle f, v\rangle$ for every $v \in H_0^1(\Omega)$. Adding the term $\mu u$ to both sides of the equation gives
$$Lu + \mu u = f + \mu u.$$
Then
$$u = L_\mu^{-1}(g) = L_\mu^{-1}(f) + \mu L_\mu^{-1}(u). \tag{4.7.3}$$
Let us denote
$$\mu L_\mu^{-1} = K, \qquad L_\mu^{-1}(f) = h,$$
so that
$$(I - K)u = h, \tag{4.7.4}$$
with
$$K = \mu L_\mu^{-1}: H^{-1}(\Omega) \longrightarrow H_0^1(\Omega). \tag{4.7.5}$$
The following theorem gives the first result of this section, which implies that a uniformly elliptic operator has a compact resolvent.
Theorem 4.7.1 The operator
$$K = \mu L_\mu^{-1}: L^2(\Omega) \longrightarrow L^2(\Omega)$$
is compact.
$$\beta\,\|u\|^2_{H_0^1(\Omega)} \leq B_\mu[u,u] = \langle g, u\rangle \leq \|g\|_{L^2(\Omega)}\|u\|_{L^2(\Omega)} \leq \|g\|_{L^2(\Omega)}\|u\|_{H_0^1(\Omega)}. \tag{4.7.7}$$
Then (4.7.6) and (4.7.7) give
$$\|u\|_{H_0^1(\Omega)} \leq \frac{1}{\beta}\,\|g\|_{L^2(\Omega)},$$
and hence K is a bounded linear operator which maps bounded sequences to bounded
sequences in H01 (), which, in turn, is compactly embedded in L 2 () (by the
Rellich–Kondrachov theorem for n > 2 and Theorem 3.10.7 for n = 1, 2) and there-
fore K = ι ◦ K is compact.
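A finite-dimensional shadow of this compactness can be computed directly (a sketch: the discrete Dirichlet Laplacian stands in for $L$, an assumption of this illustration): the eigenvalues of $K = \mu(L+\mu I)^{-1}$ are positive, decreasing, and accumulate only at $0$, the hallmark of a compact operator.

```python
import numpy as np

# Resolvent K = mu (L + mu I)^{-1} for the discrete Dirichlet Laplacian on (0,1).
n, mu = 100, 1.0
h = 1.0 / n
L = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2
K = mu * np.linalg.inv(L + mu * np.eye(n - 1))

nu = np.sort(np.linalg.eigvalsh(K))[::-1]    # eigenvalues in decreasing order
assert np.all(nu > 0) and np.all(np.diff(nu) < 0)
assert nu[-1] < 1e-4                         # the tail accumulates at zero
```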
$$Lu = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega,$$
or
(2) there exists a nonzero weak solution $u \in H_0^1(\Omega)$ for the equation $Lu = 0$.
Proof We start from the fact that the operator I − K is Fredholm by Theorem 4.7.1.
So either
(i) for every $h \in L^2(\Omega)$ the equation
$$(I - K)u = h \tag{4.7.8}$$
has a unique solution, in which case (4.7.8) becomes
$$(I - \mu L_\mu^{-1})u = L_\mu^{-1}(f) = h. \tag{4.7.9}$$
Apply $L_\mu$ to both sides of (4.7.9) and rearrange terms to finally obtain statement (1) of the theorem. Suppose statement (ii) above holds. Again, letting $K = \mu L_\mu^{-1}$,
$$(I - \mu L_\mu^{-1})u = 0.$$
4.8 Self-adjoint Elliptic Operators 271
Applying $L_\mu$ to both sides gives $Lu = 0$, which is statement (2) of the theorem. This completes the proof.
An immediate conclusion is:
Corollary 4.7.3 Let $K$ be the resolvent of a uniformly elliptic operator of the form (4.7.5). Then the eigenvalues of $K$ are countable and decreasing,
$$\lambda_1 > \lambda_2 > \lambda_3 > \cdots,$$
and $\lambda_n \longrightarrow 0$.
Proof This is a consequence of the Hilbert–Schmidt theorem and the spectral theorem for self-adjoint compact operators.
$$B^*[u,v] = B[v,u],$$
and the adjoint problem is defined as finding the weak solution $v \in H_0^1(\Omega)$ of the adjoint equation
$$L^*v = f \ \text{in } \Omega, \qquad v = 0 \ \text{on } \partial\Omega,$$
such that
$$B^*[v,u] = \langle f, u\rangle.$$
$$\langle Lu, v\rangle = \langle u, Lv\rangle.$$
To achieve this equality, we integrate $(Lu)v$ by parts (ignoring the summations for simplicity). This gives
$$\int_\Omega (Lu)v\,dx = \int_\Omega\big({-(a_{ij}u_{x_i})_{x_j}} + b_iu_{x_i} + cu\big)v\,dx = \int_\Omega (L^*v)u\,dx.$$
Thus:
Theorem 4.8.1 Let $a_{ij} \in C^1(\Omega)$, and let each $b_i$ be constant. Then the elliptic operator
Proof The argument above implies $L = L^*$, and since $I$ is also self-adjoint, then so is $L + \mu I$; hence
$$K^* = \big((L+\mu I)^{-1}\big)^* = (L^* + \mu I)^{-1} = (L + \mu I)^{-1} = K.$$
$$Lu + \mu u(x) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{4.8.1}$$
$$L = -\sum_{i,j=1}^{n}\partial_{x_i}\big(a_{ij}(x)\,\partial_{x_j}\big) + c(x), \tag{4.8.2}$$
$$Lu = \lambda u \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega.$$
The following theorem provides the main property of the spectrum of $L$.
Theorem 4.8.2 (Spectral Theorem for Elliptic Operators) Consider the uniformly elliptic operator $L$ in (4.8.2) defined on an open bounded set $\Omega$ in $\mathbb{R}^n$. Then the eigenfunctions of $L$ form a countable orthonormal basis for the space $L^2(\Omega)$, and the corresponding eigenvalues form an increasing sequence
$$0 < \lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \cdots,$$
with $\lambda_n \longrightarrow \infty$.
Proof Consider the case when $Lu - \lambda u = 0$ has a nontrivial solution, which identifies $\lambda$ as an eigenvalue of $L$. Add the term $\gamma u$ to both sides of the equation, where $\lambda > -\gamma$. This gives
$$Lu + \gamma u - \lambda u = \gamma u,$$
or
$$L_\gamma u = (\gamma + \lambda)u. \tag{4.8.3}$$
Then
$$u = (\gamma + \lambda)\,L_\gamma^{-1}u.$$
Substituting $K = \gamma L_\gamma^{-1}$,
$$Ku = \frac{\gamma}{\gamma + \lambda}\,u.$$
From Corollary 4.7.3, the eigenvalues of $K$ are countable and decreasing, so let them be
$$\nu_n = \frac{\gamma}{\gamma + \lambda_n}.$$
Then the $\lambda_n$ are the eigenvalues of $L$, and they increase; since $\nu_n \longrightarrow 0$, we must have $\lambda_n \longrightarrow \infty$.
4.9 Regularity for the Poisson Equation 275
Investigating the regularity of weak solutions of elliptic PDEs has been a major research direction since the 1940s. We begin with one of the earliest and most basic regularity results. The result, surprisingly, asserts that a weak solution of the Laplace equation is, in fact, a classical solution.
Theorem 4.9.1 (Weyl's Lemma) If $u \in H^1(\Omega)$ is such that
$$\int_\Omega Du\cdot Dv\,dx = 0$$
for every $v \in C_c^\infty(\Omega)$, then $u \in C^\infty(\Omega)$ and $\nabla^2u = 0$ in $\Omega$.
Letting $\Omega' \subset K \subset \Omega$ for some compact set $K$, it can easily be shown that the sequence $u_\epsilon$ (the mollification of $u$) is uniformly bounded and equicontinuous on $K$; hence there exists $v \in C^\infty(K)$ such that $u_\epsilon \longrightarrow v$ uniformly on $K$, so $\nabla^2v = 0$ in $K$, and since $u_\epsilon \longrightarrow u$ in $L^2(\Omega)$ we conclude that $u = v$.
Corollary 4.9.2 If u ∈ H 1 () is a weak solution to the Laplace equation then
u ∈ C ∞ (). In other words, weak solutions of the Laplace equation are classical
solutions.
The significance of the result is that it shows that the weak solution is actually
smooth and gives a classical solution. This demonstrates interior regularity.
Now we turn our discussion to the Poisson equation. The treatment for this equation is standard and can be used to establish regularity results for other general elliptic equations. The main tool of this topic is the difference quotient. The difference quotient of a function $u \in L^p(\Omega)$ is given by the formula
$$D_k^h u(x) = \frac{u(x + he_k) - u(x)}{h}$$
for all $x \in \Omega$ for which $x + he_k \in \Omega$.
(2) Sum Rule:
$$D_k^h(u + v)(x) = D_k^h u(x) + D_k^h v(x)$$
for all $x \in \Omega$.
(3) Product Rule:
$$D_k^h(uv)(x) = u(x + he_k)\,D_k^h v(x) + v(x)\,D_k^h u(x)$$
for all $x \in \Omega$.
(4) Integration by Parts:
$$\int_\Omega u(x)\,D_k^h v(x)\,dx = -\int_\Omega v(x)\,D_k^{-h} u(x)\,dx.$$
Proof The first three statements can be proved by arguments similar to those for classical derivatives in ordinary calculus and are thus left to the reader. For (4), note that by using the substitution $y = x + he_k$,
Therefore we have
$$\begin{aligned}
\int_\Omega u(x)\,D_k^h v(x)\,dx &= \int_\Omega u(x)\,\frac{v(x+he_k) - v(x)}{h}\,dx = \int_\Omega \frac{u(x-he_k)v(x) - u(x)v(x)}{h}\,dx\\
&= -\int_\Omega \frac{u(x-he_k)v(x) - u(x)v(x)}{-h}\,dx = -\int_\Omega v(x)\,\frac{u(x-he_k) - u(x)}{-h}\,dx\\
&= -\int_\Omega v(x)\,D_k^{-h}u(x)\,dx.
\end{aligned}$$
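The identity just proved can be verified numerically for discrete, compactly supported functions (the grid and functions below are arbitrary choices for this sketch):

```python
import numpy as np

# Check of integration by parts for difference quotients:
#   int u(x) D^h v(x) dx = - int v(x) D^{-h} u(x) dx,
# for compactly supported u, v (zero-padded arrays on a uniform grid).
m, h = 64, 0.1
u = np.zeros(m)
v = np.zeros(m)
u[20:40] = np.sin(np.linspace(0, np.pi, 20))     # supported away from the ends
v[25:45] = np.exp(-np.linspace(-2, 2, 20)**2)

def Dq(w, s):                                    # D^{s h} w, shift by one cell
    return (np.roll(w, -s) - w) / (s * h)

lhs = np.sum(u * Dq(v, 1)) * h                   # int u D^h v
rhs = -np.sum(v * Dq(u, -1)) * h                 # -int v D^{-h} u
assert np.isclose(lhs, rhs)                      # the identity holds exactly
```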
The next theorem investigates the relation between the difference quotients of a Sobolev function in $W^{1,p}(\Omega)$ and its weak derivatives. Of course, if $u \in W^{1,p}(\Omega)$ then $Du \in L^p(\Omega)$, so
$$\|D_ku\|_{L^p(\Omega)} < \infty.$$
How do the norms of $D_ku$ and $D_k^hu$ compare? In view of the next theorem, we see that the difference quotients of functions in $W^{1,p}(\Omega)$ are bounded above by their partial derivatives. On the other hand, if $u \in L^p(\Omega)$ and its difference quotients $D_k^hu$ are uniformly bounded above, independently of $h$, then its weak derivative exists and is bounded by the same bound; that is, $u \in W^{1,p}(\Omega)$.
Theorem 4.9.5 Let $\Omega \subseteq \mathbb{R}^n$, and suppose $\Omega' \subset\subset \Omega$ and $\Omega' \subseteq \Omega_h$.
(1) If $u \in W^{1,p}(\Omega)$, then $D_k^hu \in L^p(\Omega')$ and
$$\|D_k^hu\|_{L^p(\Omega')} \leq \|D_ku\|_{L^p(\Omega)}.$$
(2) If $u \in L^p(\Omega)$, $1 < p$, and there exists $M > 0$ such that for every $\Omega'$ and $h$ as above we have $D_k^hu \in L^p(\Omega')$ and
$$\|D_k^hu\|_{L^p(\Omega')} \leq M,$$
then the weak derivative $D_ku$ exists and $u \in W^{1,p}(\Omega)$.
Proof (1): Assume first that $u \in W^{1,p}(\Omega) \cap C^1(\Omega)$. Then
$$\begin{aligned}
\int_{\Omega'}\big|D_k^hu(x)\big|^p\,dx &\leq \frac{1}{h}\int_{\Omega'}\int_0^h |D_ku(x+te_k)|^p\,dt\,dx\\
&= \frac{1}{h}\int_0^h\int_{\Omega'}|D_ku(x+te_k)|^p\,dx\,dt \quad\text{(by Fubini's theorem)}\\
&\leq \frac{1}{h}\int_0^h\int_{\Omega}|D_ku(x)|^p\,dx\,dt = \int_\Omega |D_ku(x)|^p\,dx.
\end{aligned}$$
(2): A well-known result in functional analysis states that every bounded sequence in a reflexive space has a weakly convergent subsequence (see Theorem 5.2.6 in the next chapter). So, letting $h = h_n$ with $h_n \longrightarrow 0$ as $n \longrightarrow \infty$, there exists a weakly convergent subsequence, say $D_k^{h_n}u$, such that $D_k^{h_n}u \rightharpoonup v \in L^p(\Omega')$, and so for every $\varphi \in C_0^\infty(\Omega)$ we have, by the definition of weak convergence,
$$\int_\Omega \varphi v\,dx = -\int_\Omega u\,D_k\varphi\,dx,$$
which implies that $v = D_ku \in L^p(\Omega')$, and since $D_k^{h_n}u \rightharpoonup v$, we obtain (see Proposition 5.3.3(3) in the next chapter)
$$\|D_ku\|_{L^p(\Omega')} \leq \liminf \|D_k^{h_n}u\|_{L^p(\Omega')} \leq M.$$
$$\|D_ku\|_{L^p(\Omega)} \leq M.$$
The above results will help us establish regularity results. The regularity proofs contain tedious calculations, so we will start with the simplest setting, the Poisson equation in a Hilbert space; the results for general equations can be proved by a similar argument.
Proof Since $u$ is a weak solution to the Poisson equation, it satisfies the weak formulation
$$\int_\Omega Du\cdot Dv\,dx = \int_\Omega fv\,dx \tag{4.9.2}$$
for every $v \in H_0^1$. Define $v = \xi^2u$, where $\xi \in C_c^\infty$ is a cut-off function with the following properties:
1. On $\Omega'$, we have $\xi = 1$.
2. On $\Omega \setminus \Omega'$, we have $0 \leq \xi \leq 1$, and $|\nabla\xi| \leq M$ for some $M > 0$.
3. $\operatorname{supp}(\xi) \subseteq \Omega$.
Substituting it in (4.9.2) gives
$$\int_\Omega \xi^2|Du|^2 + 2\xi u\,\nabla\xi\cdot Du\,dx = \int_\Omega \xi^2fu\,dx.$$
It follows, using the Cauchy inequality (Lemma 4.4.5) on the second integral on the LHS with $\epsilon = \frac14$ and Young's inequality on the RHS integral, that
$$\int_\Omega \xi^2|Du|^2 \leq \frac12\int_\Omega \xi^2|Du|^2 + 2\int_\Omega u^2|\nabla\xi|^2 + \frac12\int_\Omega \xi^2f^2 + \frac12\int_\Omega \xi^2u^2.$$
Proof We will use the same settings as in the preceding lemma. Let $\Omega' \subset\subset \Omega_h \subset \Omega$, and define a cut-off function $\xi \in C_c^\infty$ with the following properties:
1. In $\Omega'$, we have $\xi = 1$.
2. In $\Omega_h \setminus \Omega'$, we have $0 \leq \xi \leq 1$, and $|\nabla\xi| \leq M$ for some $M > 0$.
3. $\operatorname{supp}(\xi) \subseteq \Omega_h$.
Since $u$ is a weak solution to the Poisson equation, it satisfies the weak formulation (4.9.2). Consider $v \in H_0^1(\Omega)$ with $\operatorname{supp}(v) \subseteq \Omega_h$, and substitute it in (4.9.2) with $u$ replaced by $-D_k^hu$.
This gives an identity whose right-hand side is
$$\int_{\Omega_h} f\cdot D_k^hv\,dx.$$
So we have
Now, define
$$v = -\xi^2D_k^hu$$
in (4.9.3). Note that, using Proposition 4.9.4(3), the expression $D(\xi^2D_k^hu)$ can be expanded, so
$$\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} = \int_{\Omega_h} D(D_k^hu)\cdot D(\xi^2D_k^hu)\,dx - 2\int_{\Omega_h} \xi\,\nabla\xi\cdot D_k^hu\,D(D_k^hu)\,dx. \tag{4.9.5}$$
Given property 2 for $\xi$ above and using it in (4.9.4) gives
$$\begin{aligned}
\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} &= \int_{\Omega_h} D(D_k^hu)\cdot D(\xi^2D_k^hu)\,dx - 2\int_{\Omega_h}\xi\,\nabla\xi\cdot D_k^hu\,D(D_k^hu)\,dx\\
&\leq \big\|D(\xi^2D_k^hu)\big\|_{L^2(\Omega_h)}\|f\|_{L^2(\Omega)} + 2\int_{\Omega_h}\xi\,\big|D(D_k^hu)\big|\,|\nabla\xi|\,\big|D_k^hu\big|\,dx\\
&\leq 2M\big\|D_k^hu\big\|_{L^2(\Omega_h)}\|f\|_{L^2(\Omega)} + \big\|\xi D(D_k^hu)\big\|_{L^2(\Omega_h)}\|f\|_{L^2(\Omega)}\\
&\quad + \int_{\Omega_h}\xi\,\big|D(D_k^hu)\big|\,2M\big|D_k^hu\big|\,dx \quad\text{(by (4.9.6))}.
\end{aligned}$$
Again, invoking the Cauchy inequality on the RHS of the inequality above, with $\epsilon = \frac14$ and $s, t$ taken in the order in which they appear, we have
$$\begin{aligned}
\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} &\leq M^2\big\|D_k^hu\big\|^2_{L^2(\Omega_h)} + \|f\|^2_{L^2(\Omega)} + \frac14\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} + \|f\|^2_{L^2(\Omega)}\\
&\quad + \frac14\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} + 4M^2\big\|D_k^hu\big\|^2_{L^2(\Omega_h)}\\
&\leq \frac12\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} + 2\|f\|^2_{L^2(\Omega)} + 5M^2\|Du\|^2_{L^2(\Omega_h)} \quad\text{(Theorem 4.9.5(1))},
\end{aligned}$$
which implies
$$\big\|\xi D(D_k^hu)\big\|^2_{L^2(\Omega_h)} \leq 4\|f\|^2_{L^2(\Omega)} + 10M^2\|Du\|^2_{L^2(\Omega_h)} \leq C\Big(\|f\|^2_{L^2(\Omega)} + \|Du\|^2_{L^2(\Omega_h)}\Big) \leq C\Big(\|f\|_{L^2(\Omega)} + \|Du\|_{L^2(\Omega_h)}\Big)^2.$$
Substituting the above and using Lemma 4.9.6 (Caccioppoli's Inequality) with $\Omega' = \Omega_h$ in the lemma gives
$$\big\|D^2u\big\|_{L^2(\Omega')} \leq C\big(\|f\|_{L^2(\Omega)} + \|Du\|_{L^2(\Omega)}\big), \tag{4.9.7}$$
which is incorrect. It should also be noted here that having $\nabla^2u \in L^2$ doesn't necessarily mean that the weak derivatives $D_i^2u$ exist and belong to $L^2$ because, as discussed earlier in Section 3.1, the existence of pointwise derivatives doesn't always imply the existence of weak derivatives, so our case should be handled with extra care. For example, consider the equation $u' = 0$ in some interval in $\mathbb{R}$. If $u$ is a strong solution to the equation, it is also a weak solution since it satisfies the weak formulation; but we cannot conclude that $u \in W^{1,p}$, because $u$ might be a step function, and step functions of the form (3.1.1) are not weakly differentiable.
Now we take the result one step further: we will prove it for the general elliptic equation (4.2.4), namely
$$-\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\big(a_{ij}(x)D_ju\big) + \sum_{i=1}^{n}b_i(x)D_iu(x) + c(x)u(x) = f. \tag{4.10.1}$$
The argument of the proof is similar to that of the preceding theorem, so we will give a sketch of the proof, leaving the details to the reader. Note also that the Caccioppoli inequality can be proved for the operator in (4.10.1) by a similar argument.
Theorem 4.10.1 (Interior Regularity Theorem) Consider the elliptic equation $Lu = f$, where $L$ is a uniformly elliptic operator given by (4.10.1) with $a_{ij} \in C^1(\Omega)$, $b_i, c \in L^\infty(\Omega)$, and $f \in L^2(\Omega)$ for some bounded open $\Omega$ in $\mathbb{R}^n$. If $u \in H^1(\Omega)$ is a weak solution to the equation $Lu = f$, then $u \in H^2_{\mathrm{Loc}}(\Omega)$. Furthermore, for any $\Omega' \subset\subset \Omega$ and some constant $C > 0$, we have
$$\|u\|_{H^2(\Omega')} \leq C\big(\|f\|_{L^2(\Omega)} + \|u\|_{H^1(\Omega)}\big).$$
Proof We will use the same settings as before. Namely, let $\Omega' \subset\subset \Omega_h \subset \Omega$, and define a cut-off function $\xi \in C_c^\infty$ with the following properties: in $\Omega'$, we have $\xi = 1$; in $\Omega_h \setminus \Omega'$, we have $0 \leq \xi \leq 1$ and $|\nabla\xi| \leq M$ for some $M > 0$; and $\operatorname{supp}(\xi) \subseteq \Omega_h$. Since $u$ is a weak solution to (4.10.1), it satisfies (4.10.2), so
$$\sum_{i,j=1}^{n}\int_\Omega a_{ij}(x)\,D_ju\,D_iv\,dx = \int_\Omega \hat f v\,dx, \tag{4.10.3}$$
where
$$\hat f(x) = f - \sum_{i=1}^{n}b_i(x)D_iu - c(x)u(x).$$
Choose
$$v(x) = -D_k^{-h}\big(\xi^2D_k^hu\big),$$
so that
$$\sum_{i,j=1}^{n}\int_\Omega a_{ij}(x)\,D_ju\,D_i\big({-D_k^{-h}(\xi^2D_k^hu)}\big)\,dx = -\int_\Omega \hat f\,D_k^{-h}\big(\xi^2D_k^hu\big)\,dx. \tag{4.10.4}$$
Employing all the previous useful results that we used in the preceding theorem, the LHS of the equation can be written as
$$\sum_{i,j=1}^{n}\int_\Omega a_{ij}(x)\,D_ju\,D_i\big({-D_k^{-h}(\xi^2D_k^hu)}\big)\,dx = A + B,$$
where
$$A = \sum_{i,j=1}^{n}\int a_{ij}\,D_k^hD_ju\,\big(\xi^2D_k^hD_iu\big)\,dx,$$
$$B = \sum_{i,j=1}^{n}\int \big[a_{ij}\,D_k^hD_ju\,\big(2\xi\,\nabla\xi\,D_k^hu\big) + \cdots\big]\,dx,$$
and for the integral $B$, we can use Cauchy's inequality with $\epsilon = \frac{\lambda}{2}$, then Theorem 4.9.5(1). We obtain
$$|B| \leq \frac{\lambda}{2}\int_{\Omega_h}\big|\xi D_k^hDu\big|^2\,dx + C\int_\Omega |Du|^2\,dx.$$
$$\frac{\lambda}{2}\int_{\Omega_h}\big|\xi D_k^hDu\big|^2\,dx - C_1\int_\Omega|Du|^2\,dx \leq \sum_{i,j=1}^{n}\int_\Omega a_{ij}\,D_ju\,D_iv\,dx = \int_\Omega \hat f v\,dx. \tag{4.10.5}$$
On the other hand, performing the necessary calculations on the RHS of the inequality above and using Cauchy's inequality with $\epsilon = \frac{\lambda}{4}$, Theorem 4.9.5(1) gives
$$\Big|\int_\Omega \hat f\,D_k^{-h}\big(\xi^2D_k^hu\big)\,dx\Big| \leq \frac{\lambda}{4}\int_{\Omega_h}\big|\xi D_k^hDu\big|^2\,dx + C_2\int_\Omega\big(|f|^2 + |u|^2 + |Du|^2\big)\,dx.$$
Remark It should be noted that the estimate of Theorem 4.10.1 can be expressed in terms of the $L^2$-norm of $u$ rather than the $H^1$-norm, and becomes
$$\|u\|_{H^2(\Omega')} \leq C\big(\|f\|_{L^2(\Omega)} + \|u\|_{L^2(\Omega)}\big). \tag{4.10.6}$$
Now that we have proved the regularity result for $u \in H^1$ and shown that $u \in H^2$, one can use induction to repeat the argument above, iterate the estimates, and obtain higher-order regularity. Indeed, if $f \in H^1(\Omega)$, then by the preceding theorem $u \in H^2_{\mathrm{Loc}}(\Omega)$.
Let us, for simplicity, consider the Poisson equation. Recall that a weak solution satisfies (4.9.2), which deals only with the first derivative; consequently, one cannot perform integration by parts to obtain the equation $-\nabla^2u = f$, because there is no guarantee that $D^2u \in L^2$ or that $D_if$ exists and belongs to $L^2$, and this is the main reason why weak solutions cannot automatically be regarded as strong or classical solutions to the original equation. However, if the weak solution is found to be in $H^2$ and $D_if \in L^2$, then we can perform integration by parts. In this case, we can choose to deal with $Dv$ instead of $v \in C_c^\infty$. Substituting it in (4.9.2) gives
$$\int_\Omega Du\cdot D(Dv)\,dx = \int_\Omega f\,Dv\,dx.$$
The solution $u$ in this case satisfies the original equation pointwise almost everywhere, and is thus a strong solution.
Corollary 4.10.2 Under the assumptions of the preceding theorem, if u ∈ H 1 () is
a weak solution of the equation Lu = f , then u is a strong solution to the equation.
Another important consequence is that, based on (4.10.6) and the argument above, if we repeat the proof of the preceding theorem and iterate our estimates, we will obtain $u \in H^3_{\mathrm{Loc}}(\Omega)$. This process can be repeated inductively for $k \in \mathbb{N}$. In general, we have the following.
Theorem 4.10.3 (Higher Order Interior Regularity Theorem) Consider the elliptic equation $Lu = f$, where $L$ is a uniformly elliptic operator given by (4.10.1) with $a_{ij} \in C^{k+1}(\Omega)$, $b_i, c \in L^\infty(\Omega)$, and $f \in H^k(\Omega)$ for some open bounded $\Omega$ in $\mathbb{R}^n$. If $u \in H^1(\Omega)$ is a weak solution to the equation $Lu = f$, then $u \in H^{k+2}_{\mathrm{Loc}}(\Omega)$. Furthermore, for any $\Omega' \subset\subset \Omega$ and some constant $C > 0$, we have
$$\|u\|_{H^{k+2}(\Omega')} \leq C\big(\|f\|_{H^k(\Omega)} + \|u\|_{L^2(\Omega)}\big).$$
A natural question arises now: if the preceding theorem holds for all $k \in \mathbb{N}$, shouldn't that imply that the solution is smooth? The following theorem answers the question positively.
Theorem 4.10.4 (Interior Smoothness Theorem) Let $a_{ij}, b_i, c, f \in C^\infty(\Omega)$, and let $u \in H^1(\Omega)$ be a weak solution to the equation $Lu = f$. Then $u \in C^\infty(\Omega)$ and $u$ is a classical solution to the equation.
All the preceding results establish the interior regularity of weak solutions in $H^k(\Omega')$ for sets $\Omega' \subset\subset \Omega$. This shows that a solution of an elliptic equation with regular/smooth data is locally regular/smooth in the interior of its domain of definition; but it gives no information about the smoothness of the solution at the boundary. In other words, the results above do not yield a solution in $H^k(\Omega)$. In order to obtain a solution that is smooth up to the boundary, and based on the treatment we gave to obtain interior regularity and smoothness, we require the boundary itself to be sufficiently regular. The following theorem can also be iterated to yield smoothness provided the data are smooth. The proof is long and very technical and does not fit the scope of the present book. The interested reader may consult books on the theory of partial differential equations for the details of the proof.
Theorem 4.10.5 (Boundary Regularity Theorem) In addition to the assumptions of Theorem 4.10.1, suppose that $\Omega$ is of $C^1$-class, and $a_{ij} \in C^1(\overline\Omega)$. If $u \in H_0^1(\Omega)$ is a weak solution to the equation $Lu = f$ under the boundary condition $u = 0$ on $\partial\Omega$, then $u \in H^2(\Omega)$. Furthermore, for some constant $C > 0$, we have
$$\|u\|_{H^2(\Omega)} \leq C\big(\|f\|_{L^2(\Omega)} + \|u\|_{L^2(\Omega)}\big).$$
It becomes clear now that the weak solutions of elliptic equations can become strong or even classical solutions provided the data of the equation are sufficiently regular. It is well known that establishing the existence of a solution of an elliptic equation is not an easy task. In light of the present section, one can seek weak solutions over Sobolev spaces, then use regularity techniques to show that these weak solutions are in fact strong or classical solutions. In the next chapter, we will see that weak solutions of elliptic partial differential equations are closely connected with minimizers of integral functionals that are related to these elliptic PDEs through their weak formulation.
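The connection previewed here can be sketched numerically (a one-dimensional finite-difference setting assumed for this illustration, not taken from the book): the minimizer of the discrete Dirichlet energy $J(u) = \frac12\int|u'|^2 - \int fu$ coincides with the finite-difference solution of $-u'' = f$.

```python
import numpy as np

# Minimizing the discrete Dirichlet energy J(u) = 1/2 u^T L u - f^T u over
# grid functions vanishing at the boundary recovers the solution of -u'' = f,
# since grad J(u) = L u - f vanishes exactly at the PDE solution.
n = 50
h = 1.0 / n
x = np.linspace(0, 1, n + 1)[1:-1]
L = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2
f = np.pi**2 * np.sin(np.pi * x)

u_pde = np.linalg.solve(L, f)                # finite-difference weak solution

u = np.zeros(n - 1)                          # minimize J by gradient descent
step = 0.9 * 2 / np.linalg.eigvalsh(L).max()
for _ in range(30000):
    u -= step * (L @ u - f)

assert np.max(np.abs(u - u_pde)) < 1e-6      # minimizer = PDE solution
```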
4.11 Problems
(1) If $a_{ij}(x) \in C^2(\Omega)$ and $b_i = 0$ in the elliptic operator $L$ in (4.1.1), prove that $B[u,v]$ is symmetric.
(2) Suppose that $L$ is an elliptic operator and there exists $0 < \Lambda < \infty$ such that
$$\sum_{i,j=1}^{n}a_{ij}\,\xi_i\xi_j \leq \Lambda|\xi|^2$$
for some bounded $\Omega \subset \mathbb{R}^n$, then show that $u$ is a classical solution to the problem.
(10) Consider $H_0^1(\Omega)$ with the Sobolev norm
$$\|u\|_{H_0^1(\Omega)} = \big(\|u\|^2_{L^2(\Omega)} + \|Du\|^2_{L^2(\Omega)}\big)^{1/2}$$
for $u \in H_0^1(\Omega)$. Show that the norm $\|Du\|_{L^2(\Omega)}$ is equivalent to $\|u\|_{H_0^1(\Omega)}$ on $H_0^1(\Omega)$.
(11) Determine whether or not $H_0^1(\mathbb{R}^n)$ with the inner product
$$(u, v) = \int_{\mathbb{R}^n} \nabla u\cdot\nabla v\,dx$$
is a Hilbert space.
(12) Show that if $f \in L^2(\Omega)$ and $Df$ is its distributional derivative, then
$$\|Df\|_{H^{-1}(\Omega)} \leq \|f\|_{L^2(\Omega)}.$$
γ = λ0 − min(c(x)).
(b) Show that the bilinear map associated with the operator
$$Lu + \mu u = f$$
is coercive for $\mu \geq \gamma$.
(c) Establish the existence and uniqueness of the weak solution of the Dirichlet problem for
$$Lu + \mu u = f$$
with $u = 0$ on $\partial\Omega$.
(14) Let $L$ be a uniformly elliptic general operator with $a_{ij}, b_i, c \in L^\infty(\Omega)$ for some open and bounded $\Omega$ in $\mathbb{R}^2$. Write Gårding's inequality with
$$\gamma = \frac{\lambda_0}{2} - m + \frac{M}{2\lambda_0},$$
$$\int f\,dx + \int g\,dx = 0.$$
(b) Establish the existence and uniqueness of the weak solution of the problem.
(17) Prove that there exists a weak solution for the nonhomogeneous Helmholtz problem
$$-\nabla^2u + u = f, \quad x \in \Omega; \qquad u = 0, \quad x \in \partial\Omega.$$
(18) Consider the problem
$$-\operatorname{div}(\nabla u) + u = f \ \text{in } \Omega, \qquad \frac{\partial u}{\partial n} = g \ \text{on } \partial\Omega.$$
(a) Find the weak formulation of the problem.
(b) Prove the existence and uniqueness of the weak solution of the problem.
(19) Consider the problem
$$-(pu')' + qu = f, \qquad u(0) = u(1) = 0,$$
$$Lu + \mu u = f, \quad x \in \Omega; \qquad u = 0, \quad x \in \partial\Omega,$$
$$L = -\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial}{\partial x_j}\Big) - \sum_{i=1}^{n}\frac{\partial}{\partial x_i}(b_i\,\cdot\,) + \sum_{i=1}^{n}c(x)\frac{\partial}{\partial x_i} + d,$$
$$\beta = \frac{\lambda_0}{2}, \qquad \gamma = \frac{1}{2\lambda_0}\big(\max_i\|b_i\|_\infty + \|c\|_\infty\big)^2 + \|d\|_\infty.$$
(c) If $\mu \geq \gamma$, show that
$$B_\mu[u,v] = B[u,v] + \mu\langle u, v\rangle_{L^2}$$
(26) Without using Theorem 4.5.4, show that the elliptic bilinear map associated
with
Lu + μu = f
is coercive for μ ≥ γ.
(27) Show that the norm
$$\|u\|_{\partial^2} = \|\nabla^2u\|_{L^2(\Omega)}$$
is equivalent to the standard norm $\|u\|_{H_0^1(\Omega)}$ for some open and bounded $\Omega$ in $\mathbb{R}^2$.
(28) Consider the equation
$$-\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial u}{\partial x_j}\Big) + \sum_{i=1}^{n}\frac{\partial}{\partial x_i}(b_iu) + c(x)u(x) + \mu u(x) = f \quad \text{in } \Omega.$$
$$\nabla^2\nabla^2u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad \nabla u\cdot n = 0 \ \text{on } \partial\Omega,$$
where $f \in L^2(\Omega)$ and $\Omega$ is open and bounded in $\mathbb{R}^2$. Use the preceding problem to show the existence and uniqueness of the weak solution of the problem.
(30) Show that the eigenvalues of a self-adjoint uniformly elliptic operator are
bounded from below.
(31) Show that
$$K^* = (L^* + \mu I)^{-1}\big|_{L^2},$$
$$K^*(v) = -v.$$
(a) Show that
$$u + \gamma Ku = g$$
has a weak solution in $H_0^1(\Omega)$ iff $\langle h, v\rangle = 0$ for every weak solution $v \in H_0^1(\Omega)$ of $K^*v - v = 0$.
(b) If $Lu = 0$ has a nontrivial solution, show that
$$Lu = f$$
has a weak solution in $H_0^1(\Omega)$ iff $\langle f, v\rangle = 0$ for every weak solution $v \in H_0^1(\Omega)$ of $L^*v = 0$.
(34) In the previous problem, show that if
$$Lu - \lambda u = f$$
has a weak solution, then
$$\|u\|_{L^2} \leq C\,\|f\|_{L^2}$$
(35) Let $L = -\nabla^2$ and $\Omega = \mathbb{R}^n$.
(a) Show that L has a continuous spectrum.
(b) Conclude that the boundedness of $\Omega$ in Theorem 4.8.2 is essential to obtain a discrete set of eigenvalues.
(36) Let $\{\varphi_n\}$ be the orthonormal basis for $L^2(\Omega)$ consisting of the eigenfunctions of the Laplacian:
$$-\nabla^2\varphi_n = \lambda_n\varphi_n.$$
Show that $\Big\{\dfrac{\varphi_n}{\sqrt{\lambda_n}}\Big\}$ is an orthonormal basis for $H_0^1(\Omega)$ endowed with the Poincaré inner product
$$\langle u, v\rangle_\partial = \int_\Omega Du\cdot Dv\,dx.$$
(42) In Theorem 4.10.3, show that if $k > \frac{n}{2}$ then $u \in C^2(\Omega)$ is a classical solution for the equation.
Chapter 5
Calculus of Variations
The problem of finding the maximum value or minimum value of a function over some set in the domain of the function is called a variational problem. This problem, and finding ways to deal with it, is as old as humanity itself. In case we are looking for a minimum value, the problem is called a minimization problem. We will focus on minimization problems due to their particular importance: in physics and engineering, we look for the minimum energy; in geometry, we look for the shortest distance or the smallest area or volume; and in economics, we look for the minimum cost. Historically, the minimization problems of physics and geometry were the main motivation to develop the necessary tools and techniques that laid the foundations of this field of mathematics, which connects functional analysis with applied mathematics. We will discuss the relations between minimization problems and the existence and uniqueness of solutions of PDEs. We confine ourselves to elliptic equations. We shall give a formal definition of the problem.
Definition 5.1.1 (Minimization Problem) Let $f: X \longrightarrow \mathbb{R}$ be a real-valued function. The minimization problem is the problem of searching for some $x_0 \in A$, for some set $A \subseteq X$, such that
$$f(x_0) = \inf_{x \in A} f(x).$$
If there exists such x0 ∈ A, then we say that x0 is the solution to the problem,
and the point x0 is said to be the minimizer of f over A. If f (x0 ) is the minimum
value over X then x0 is the global minimizer of f . The set A is called: admissible
set, and it consists of all the values of x that shall compete to obtain the minimum
value attained in A. We observe the following points:
(2) If a minimizer is not found over a set A, it may still be found if the
admissible set A is enlarged to a bigger one; the larger the admissible set,
the better the chance that the problem has a solution.
(3) The maximization problem is the dual of the minimization problem, noting that
over an admissible set A, we always have
sup_{x∈A} f(x) = − inf_{x∈A} (−f(x)).
does have a minimum value of 0, but it does not have a maximum value. If we restrict
our goal to exploring minimizers only, then functions such as the above may serve as a
good example to motivate us. In fact, such a function is known as lower semicontinuous.
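To make the notion concrete, here is a small numerical sketch in Python with a hypothetical function of this kind (not necessarily the example alluded to above): it is lower semicontinuous, attains its minimum value 0, but never attains its supremum.

```python
import numpy as np

# Hypothetical l.s.c. function on [0, 1]: f(x) = x for x < 1, f(1) = 0.
# It attains its minimum 0 (at x = 0 and x = 1) but never its supremum 1.
def f(x):
    return x if x < 1.0 else 0.0

# Defining inequality of lower semicontinuity at the jump point x0 = 1:
# f(x0) <= liminf f(xn) for any sequence xn -> 1.
xn = 1.0 - 1.0 / np.arange(100, 200)       # increasing sequence -> 1
tail = min(f(x) for x in xn)                # tail values approximate the liminf
print(f(1.0) <= tail)                       # True: f is l.s.c. at x = 1

grid = np.linspace(0.0, 1.0, 10001)
vals = [f(x) for x in grid]
print(min(vals) == 0.0, max(vals) < 1.0)    # min attained, max not attained
```

The jump at x = 1 drops downward, which is exactly what lower semicontinuity permits; an upward jump there would break the inequality.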
f⁻¹((−∞, r]) = {x : f(x) ≤ r}
In words, the epigraph of a function f is the set of all points in X × R lying above
or on the graph of f . The next result shows that l.s.c. functions can be identified by
their epigraphs.
Proposition 5.1.8 Let f be defined on some normed space X . Then f is l.s.c. iff
epi(f ) is closed.
Proof Let f be l.s.c., and consider a sequence (xₙ, rₙ) ∈ epi(f), which implies that f(xₙ) ≤ rₙ. Let
(xₙ, rₙ) −→ (x₀, r₀) for some x₀ ∈ X and r₀ ∈ ℝ. Then by lower semicontinuity,
f(x₀) ≤ lim inf f(xₙ) ≤ lim inf rₙ = r₀.
Hence
(x₀, r₀) ∈ epi(f),
and this proves one direction. Conversely, let epi(f) be closed, and let xₙ −→ x₀ in X.
Passing to a subsequence along which f(xₙ) −→ lim inf f(xₙ) = r₀, we have
(xₙ, f(xₙ)) ∈ epi(f),
which is closed, so
(xₙ, f(xₙ)) −→ (x₀, r₀) ∈ epi(f),
i.e., f(x₀) ≤ r₀ = lim inf f(xₙ).
With the use of the l.s.c. notion, the Bolzano–Weierstrass Theorem can be extended to the
following version:
Theorem 5.1.9 If f : X −→ (−∞, ∞] is l.s.c. on a compact set K ⊆ X, then f is
bounded from below and attains its infimum on K, i.e., there exists x₀ ∈ K such that
f(x₀) = inf_{x∈K} f(x).
Proof Let
inf_{x∈K} f(x) = m ≥ −∞.
Choose a sequence rₙ ↓ m (say rₙ = m + 1/n if m is finite, and rₙ = −n otherwise), and set
Cₙ = {x ∈ K : f(x) ≤ rₙ}.
Since f is l.s.c., every Cₙ is closed in X, and since they are subsets of K, they are all
compact; notice that Cₙ₊₁ ⊆ Cₙ for all n. By Cantor's intersection theorem,
⋂_{n=1}^{∞} Cₙ ≠ ∅.
Picking x₀ in this intersection gives f(x₀) ≤ rₙ for all n; therefore,
f(x₀) = inf_{x∈K} f(x) = m > −∞.
The above theorem establishes the existence of a minimizer without giving a con-
structive procedure to find it. To find the minimizer, we need to implement various
analytical and numerical tools to obtain its exact or approximate value, but this
is beyond the scope of the book. To determine whether the minimizer predicted by
the above theorem is global or not, we need to impose further conditions giving
extra information about it. One interesting and important such condition is
the notion of convexity.
5.1.4 Convexity

A function f is called convex if
f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)
for all 0 < θ < 1. If the inequality is strict, then f is called strictly convex. It is
evident from the definition of a convex function that the domain D(f) must be convex.
The following theorem benefits from the property of convexity: every local minimizer
of a convex function is a global minimizer.
Proof If x₀ is a local minimizer, then for any y ∈ D(f), we can choose a suitable 0 <
θ < 1, small enough that x₀ + θ(y − x₀) lies in the neighborhood on which x₀ is minimal; then
f(x₀) ≤ f(x₀ + θ(y − x₀)) = f((1 − θ)x₀ + θy) ≤ (1 − θ)f(x₀) + θf(y),
which gives
f(x₀) ≤ f(y).
We have seen that compactness and lower semicontinuity were the basic tools used
to search for minimizers in finite-dimensional spaces. This situation totally changes
when it comes to infinite-dimensional spaces. As discussed in a previous course of
functional analysis, compactness is not easy to achieve in these spaces, and Heine–
Borel Theorem, which states that every closed bounded set is compact, fails in
infinite-dimensional spaces. In fact, a well-known result in analysis states that a
space is finite-dimensional if and only if its closed unit ball is compact, and hence
the closed unit ball in infinite-dimensional spaces is never compact. A good sug-
gestion to remedy this difficult situation is to change the topology on the space.
More specifically, we replace the norm topology with another “weaker” topology
that allows more closed and compact sets to appear in the space. This is the weak
topology, which is the subject of the next section.
5.2 Weak Topology

It is well known that in normed spaces, the stronger the norm, the more open sets we
obtain in the space, which makes it harder to obtain convergence and compactness.
Replacing the norm topology with a coarser, or weaker, topology results in fewer
open sets, which will simplify our task. We are concerned with the smallest topology
that makes every bounded linear functional continuous. This merely requires f⁻¹(O)
to be open in the topology for every open set O in ℝ. The generated topology is called the
weak topology.
Historically, the minimization problem was one of the main motivations to explore
the weak topology. As noted in the preface, the reader of this book must have prior
knowledge of weak topology in a previous course of analysis, but we shall give a
quick overview and remind the reader of the important results that will be used later
in the chapter.
Theorem 5.2.1 (Radon–Riesz Theorem) Let fₙ, f ∈ Lᵖ, 1 < p < ∞. If fₙ ⇀ f weakly in
Lᵖ and ‖fₙ‖_p −→ ‖f‖_p, then ‖fₙ − f‖_p −→ 0.

x ∈ ⋂_{k=1}^{n} fₖ⁻¹(Oₖ) ⊂ U.
Now, we state two important theorems about closed sets in the weak topology.
Theorem 5.2.2 (Mazur’s Theorem) If a set is convex and closed, then it is weakly
closed.
x0 ∈ U = {x ∈ X : f (x) > b} ⊂ Ac .
B_X = {x ∈ X : ‖x‖ ≤ 1}
is weakly compact.
Theorem 5.2.5 Let X be a normed space and A be a subset of X. Then the following
statements are equivalent:
(1) X is reflexive.
(2) A is weakly compact if and only if A is weakly closed and weakly bounded.
Proof Let
Y = span{xn } ⊂ X .
The following are well-known results that can be easily proved using the above
theorems:
(1) Every reflexive space is a Banach space.
(2) Every Hilbert space is reflexive.
(3) All Lp spaces for 1 < p < ∞ are reflexive.
(4) A closed subspace of a reflexive space is reflexive.
We provided brief proofs for the theorems above.1
In the weak topology, lower semicontinuity takes the following form:
Definition 5.2.7 (Weakly Lower Semicontinuous Mapping) A mapping f : X −→
(−∞, ∞] is said to be weakly lower semicontinuous (w.l.s.c.) at
x₀ ∈ D(f) if
f(x₀) ≤ lim inf f(xₙ)
for every sequence xₙ ⇀ x₀.
1 For details, the reader can consult volume 2 of this series, or alternatively any other textbook on
functional analysis.
Since convergence in norm implies weak convergence, it is evident that every w.l.s.c.
mapping is indeed l.s.c. The converse, however, is not necessarily true, unless the two
types of convergence are equivalent, which is the case in finite-dimensional spaces.
The convexity property once again proves its power and efficiency. We first need to
prove the following result.
Lemma 5.2.8 A function f is convex if and only if epi(f) is convex.
Proof Let f be convex, and let (x, r), (y, s) ∈ epi(f), so that f(x) ≤ r and f(y) ≤ s.
For 0 < θ < 1, set z = θx + (1 − θ)y and t = θr + (1 − θ)s. Then
t = θr + (1 − θ)s
≥ θf(x) + (1 − θ)f(y)
≥ f(θx + (1 − θ)y)
= f(z).
Hence (z, t) ∈ epi(f), so epi(f) is convex. Conversely, let epi(f) be convex and suppose x, y ∈ D(f).
Then
(x, f(x)), (y, f(y)) ∈ epi(f),
and so
(θx + (1 − θ)y, θf(x) + (1 − θ)f(y)) = θ(x, f(x)) + (1 − θ)(y, f(y)) ∈ epi(f),
which means f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y), i.e., f is convex.
Theorem 5.2.9 If f is convex and l.s.c., then f is w.l.s.c.
Proof Since f is convex, by the preceding lemma epi(f) is convex, and since f
is l.s.c., by Proposition 5.1.8 epi(f) is closed; hence by Mazur's theorem (The-
orem 5.2.2) epi(f) is weakly closed, i.e., closed in the weak topology, and hence by
Proposition 5.1.8 again, f is l.s.c. in the weak topology.
The next result shows that if a sequence converges weakly, then there exists a finite
convex linear combination of the sequence that converges strongly to the weak limit
of the sequence.
Lemma 5.2.10 (Mazur's Lemma) Let xₙ ⇀ x in a normed space X with norm
‖·‖_X. Then for every n ∈ ℕ, there exist a corresponding k = k(n) and convex
coefficients λₙ, …, λ_k ≥ 0 with
λₙ + λₙ₊₁ + · · · + λ_k = 1,
such that the sequence
yₙ = Σ_{j=n}^{k} λⱼxⱼ
satisfies
‖yₙ − x‖_X −→ 0.
Proof Let
Cₙ = { Σ_{j=n}^{k} λⱼxⱼ : k ≥ n, λⱼ ≥ 0, Σ_{j=n}^{k} λⱼ = 1 }.
Then it is clear from Proposition 5.1.10(5) that Cₙ is the convex hull of the set {xⱼ : j ≥
n}, so it is convex, and hence by Proposition 5.1.10(3) its closure C̄ₙ is convex and closed. By Mazur's
Theorem 5.2.2, this implies that C̄ₙ is weakly closed, and since the xⱼ, j ≥ n, converge weakly
to x, we must have x ∈ C̄ₙ, from which we can find a member yₙ ∈ Cₙ such that
‖yₙ − x‖_X ≤ 1/n. (5.2.1)
Taking the limit as n −→ ∞ in (5.2.1) gives the desired result.
5.3 Direct Method

The main tool, and the key to solving the problem, is to start with minimizing sequences.
The method consists of three steps:
(1) Construct a minimizing sequence that converges to the infimum of the
function.
(2) Prove that a subsequence of it converges.
(3) Show that the limit is the minimizer.
Step 2 is the crucial one here. The general argument is to extract a subsequence
of the minimizing sequence which converges in the weak topology. We will use
the Kakutani Theorem (Theorem 5.2.6) which is a generalization of the Bolzano–
Weierstrass Theorem.
For step 1, the construction of the minimizing sequence comes naturally, i.e., from
the definition of the infimum. Another way of showing the finiteness of the infimum
is to prove that the function is bounded from below, which implies the infimum can
never be −∞. It is important to note that if the function is not proved to be bounded
from below, its infimum may or may not be −∞, and assuming that
f : X −→ (−∞, ∞]
doesn’t help in this regard because the infimum may still be −∞ even if the function
doesn’t take this value. For example, if f (x) = e−x then f : R −→ (0, ∞) although
inf f = 0. One advantage of defining the range of the function to be (−∞, ∞] is
to guarantee that there is no x₀ ∈ D(f) with f(x₀) = −∞. So if we can show the
existence of a minimizer x₀, then we have
f(x₀) = inf f > −∞.
In fact, a function that is convex and l.s.c. cannot take the value −∞ since other-
wise it would be nowhere finite (verify). Thus, it is important to avoid this situation
by assuming the function to be proper. A function is called proper if its range lies in
(−∞, ∞] and f ≢ ∞; that is, a proper function f never takes the value −∞ and is not
identically ∞.
Remark Throughout, we use the assumption that f is proper in all subsequent
results.
5.3.4 Coercivity
We will assume that our space X is a reflexive Banach space in order to take advan-
tage of the Banach–Alaoglu Theorem and its consequences, such as the Kakutani
Theorem. We will also assume the function to be proper, convex, and l.s.c., and the
admissible set to be bounded, closed, and convex. If the admissible set is bounded, then
any sequence in the set is also bounded, so we can use the Kakutani Theorem if
the space is a reflexive Banach space, and extract a subsequence that converges
weakly. If the set is not bounded, then we can control the boundedness of the sequence
using the coercivity of the function. One equivalent variant of coercivity of f is that
f(xₙ) −→ ∞
whenever
‖xₙ‖ −→ ∞.
If xₙ is a minimizing sequence, then
f(xₙ) −→ inf f,
so
f(xₙ) ↛ ∞,
and consequently ‖xₙ‖ ↛ ∞, which proves that (xₙ) is bounded. We can also look at it
from a different point of view. Letting z ∈ D(f) be such that f(z) < ∞, we can choose r > 0
large enough that
f(x) > f(z) whenever ‖x‖ > r.
We can then exclude all such members x by taking the intersection
M = D(f) ∩ B̄ᵣ(0),
which is clearly a bounded set, is closed if D(f) is closed, and is convex if
D(f) is convex; moreover, the infimum of f over D(f) is the same as the infimum over M.
Our minimizing sequence xₙ eventually lies inside the bounded set M, so it is bounded.
If the space is a reflexive Banach space, then M lies in a large fixed closed ball which
is weakly compact, so we can extract a subsequence from xₙ that converges weakly
to some member x₀. It remains to show that this limit is the minimizer and that it belongs
to M, keeping in mind that
inf_M f = inf_{D(f)} f.
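A small Python sketch of this coercivity argument, with a hypothetical coercive function on ℝ² (the ball radius 1 below comes from the bound f(x) ≥ ‖x‖² − 1): every strict improvement on f(0) is trapped in a fixed ball, so the minimizing sequence is bounded.

```python
import numpy as np

# Hypothetical coercive function: ||x||^2 dominates, so f(x) -> ∞ as ||x|| -> ∞.
def f(x):
    x = np.asarray(x, dtype=float)
    return float(x @ x + np.sin(5.0 * x[0]))

# Crude minimizing sequence: record every random sample that improves on f(0) = 0.
rng = np.random.default_rng(0)
best, improving = f(np.zeros(2)), []
for _ in range(2000):
    x = rng.normal(scale=3.0, size=2)
    if f(x) < best:
        best = f(x)
        improving.append(x)

# Since f(x) >= ||x||^2 - 1, any x with f(x) < 0 must satisfy ||x|| < 1:
# the improving sequence stays in the bounded set M = D(f) ∩ B_1(0).
print(all(np.linalg.norm(x) < 1.0 for x in improving))
```

The bound holds for any random draws: an iterate escapes the ball only by paying the ‖x‖² penalty, which immediately disqualifies it from improving on f(0).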
Now we state the main result of the section, which solves the minimization problem.
Theorem 5.3.2 Let X be a reflexive Banach space, and let J : X −→ (−∞, ∞] be a
proper, coercive, w.l.s.c. functional. Then J attains its infimum; if, in addition, J is
strictly convex, then the minimizer is unique.
Proof Let m = inf J, and let (uₙ) be a minimizing sequence,
J[uₙ] −→ m.
By coercivity, (uₙ) is bounded, so by the Kakutani Theorem a subsequence converges
weakly to some u ∈ X. It follows from weak lower semicontinuity that
m ≤ J[u] ≤ lim inf J[uₙ] ≤ m,
and therefore
J[u] = m > −∞.
For uniqueness, suppose u₁ ≠ u₂ are both minimizers, and let
u₀ = (u₁ + u₂)/2.
Then by strict convexity,
J[u₀] = J((u₁ + u₂)/2) < ½(J[u₁] + J[u₂]) = m,
and this contradicts the fact that m is the infimum. The proof is complete.
In light of the preceding theorem, we see that the property of being a reflexive
Banach space is very useful in establishing the existence of minimizers. We already proved
in Theorem 3.5.3 that all Sobolev spaces W^{k,p}(Ω) are Banach, separable, and reflex-
ive for 1 < p < ∞, which justifies the great importance of Sobolev spaces in applied
mathematics. We also notice the importance of the admissible set being convex, as
this guarantees the variational integral to be convex over it and the weak limit u to lie in
the admissible set. Proving that a functional is w.l.s.c. can be the most challenging condition
to satisfy; once the functional is proved to be l.s.c. and convex, Theorem 5.2.9 can be used.
We recall some basic results from analysis:
Proposition 5.3.3 Let uₙ, u ∈ Lᵖ, 1 ≤ p < ∞. The following statements hold:
(1) If uₙ −→ u in norm, then there exists a subsequence (uₙⱼ) of (uₙ) such that
uₙⱼ −→ u a.e.
(2) If (uₙ) is bounded and 1 < p < ∞, then there exists a subsequence (uₙⱼ) of (uₙ) such that
uₙⱼ ⇀ u for some u ∈ Lᵖ.
(3) If uₙ ⇀ u, then (uₙ) is bounded and ‖u‖ ≤ lim inf ‖uₙ‖.
Proof (1) follows immediately from the convergence in measure. (2) follows from the
weak compactness of bounded sets in the reflexive space Lᵖ, 1 < p < ∞. (3) can be proved
using the definition of weak convergence, noting that the sequence {f(uₙ)} is convergent
for every bounded linear functional f ∈ X∗. Now, we use the Hahn–Banach Theorem to choose
f such that ‖f‖ = 1 and f(u) = ‖u‖.
The third statement is especially important and can be a very efficient tool in prov-
ing weak lower semicontinuity property for functionals as we shall see in the next
section.
5.4 The Dirichlet Problem

The goal of this section is to employ the direct method in establishing minimizers
of some functionals. Then we proceed to investigate connections between these
functionals and weak solutions of some elliptic PDEs. It turns out that there is a
close relation between the weak formulation (4.2.5) of the form
B[u, v] − ⟨f, v⟩ = 0
and the variational functional
J[v] = ½B[v, v] − ⟨f, v⟩.
It was observed that the minimizer of the functional J, i.e., the element u₀ that
minimizes (locally or globally) the value of J, is the solution of the weak formulation,
which in turn implies that u₀ is the weak solution of the associated PDE from which
the bilinear form B was derived; and since B is given by an integral, the same is true of J.
Recall that the weak formulation of the Laplace equation takes the form
∫_Ω ∇u · ∇v dx = ∫_Ω f v dx.
Theorem 5.4.2 (Dirichlet Principle) Let Ω ⊂ ℝⁿ be bounded, and consider the col-
lection
A = {v ∈ C_c²(Ω) : v = 0 on ∂Ω}.
A function
u ∈ C_c²(Ω) ∩ C⁰(Ω̄)
is a solution to problem (5.4.1) if and only if u is the minimizer over A of the Dirichlet
integral
E(u) = ½ ∫_Ω |∇u|² dx.
Proof Note that for u, v ∈ C_c²(Ω), we have by integration by parts (the divergence the-
orem)
∫_Ω ∇u · ∇v dx = ∫_{∂Ω} v (∂u/∂n) ds − ∫_Ω (∇²u)v dx = − ∫_Ω (∇²u)v dx, (5.4.2)
since v vanishes on ∂Ω.
Now, let u ∈ C_c²(Ω) be a solution to the Laplace equation, and let v ∈ A. Then by (5.4.2)
0 = ∫_Ω (∇²u)v dx = − ∫_Ω ∇u · ∇v dx,
so
∫_Ω |∇(u + v)|² dx = ∫_Ω |∇u|² dx + 2 ∫_Ω ∇u · ∇v dx + ∫_Ω |∇v|² dx
= ∫_Ω |∇u|² dx + ∫_Ω |∇v|² dx
≥ ∫_Ω |∇u|² dx
for an arbitrary v in A. Multiplying both sides of the inequality by ½, we conclude
that u is a minimizer of E.
Conversely, assume that the functional E has the minimizer u, so that E(u) ≤ E(v)
for all v ∈ A. Let v ∈ A, choose t ∈ ℝ, and set
h(t) = E(u + tv).
Note that the derivative is taken in t while the integration is in x, and since
u ∈ C_c²(Ω) ∩ C⁰(Ω̄) we can take the derivative inside the integral, so
h′(t) = ∫_Ω (∇u · ∇v + t|∇v|²) dx.
Since h attains its minimum at t = 0, we get h′(0) = ∫_Ω ∇u · ∇v dx = 0 for every
v ∈ A, and by (5.4.2) this forces ∇²u = 0.
The principle gives the equivalence between the two famous problems. Based on this
principle, one can guarantee the existence of the harmonic function (which solves
the Laplace equation) if and only if a minimizer for the Dirichlet integral is obtained.
It is well known that there exist harmonic functions satisfying Laplace's equation.
Now, if
∇²u = 0
and u = 0 on the boundary, then we must have u = 0 in the entire region, and so the
energy integral equals zero. However, that wasn't the end of the story. Going the other
way around, mathematicians in the late nineteenth century began to wonder: does there
exist a minimizer for the integral? Riemann, a pupil of Dirichlet, argued that the minimizer
of the integral exists since
E ≥ 0,
so the functional is bounded below and has a greatest lower bound; from this he incorrectly
concluded that the infimum must be attained. In 1869, Weierstrass provided the following
example of a functional with an infimum but with no minimizer, i.e., the functional doesn't
attain its infimum.
Example 5.4.3 (Weierstrass's Example) Consider the integral
E[v] = ∫_{−1}^{1} x²(v′(x))² dx,
where
v ∈ A = {v ∈ C²([−1, 1]) : v(−1) = −1, v(1) = 1}.
Take the sequence
uₙ(x) = arctan(nx)/arctan(n).
Then it is easy to see that E[uₙ] −→ 0, and so we conclude that
inf_{u∈A} E[u] = 0.
Now, if there existed u ∈ A such that E[u] = 0, then u′ = 0 away from x = 0, which implies that u is
constant, but this contradicts the boundary conditions.
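A quick numerical check of this example in Python (assuming the functional E[v] = ∫_{−1}^{1} x²(v′(x))² dx as written above): the energies E[uₙ] decrease toward 0, even though no admissible function has zero energy.

```python
import numpy as np

# E[u_n] for u_n(x) = arctan(nx)/arctan(n), computed by the trapezoid rule.
def E(n, m=200001):
    x = np.linspace(-1.0, 1.0, m)
    du = (n / (1.0 + (n * x) ** 2)) / np.arctan(n)   # u_n'(x)
    y = x ** 2 * du ** 2
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

vals = [E(n) for n in (1, 10, 100, 1000)]
print(all(a > b for a, b in zip(vals, vals[1:])))    # strictly decreasing
print(vals[-1] < 1e-2)                               # E[u_1000] is already near 0
```

As n grows, uₙ concentrates its slope near x = 0, exactly where the weight x² kills the integrand — which is how the infimum 0 is approached without ever being attained.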
The example came as a shock to the mathematical community at the time, and the
credibility of the Dirichlet principle became questionable. Several mathematicians
set out to rebuild confidence in the principle by providing a rigorous proof of
the existence of the minimizer of the Dirichlet integral. In 1904, Hilbert provided
a "long and complicated" proof of the existence of the minimizer of E in C² using
the technique of minimizing sequences. After the rise of functional analysis, it soon
became clear that the key to a fully accessible proof lies in the direct method.
The Dirichlet principle is valid on C², but a minimizer cannot be guaranteed unless
we have a reflexive Banach space, and C² is not one. The best idea was to enlarge the
space to the completion of C², which is nothing but the Sobolev space. It
should come as no surprise that defining Sobolev spaces as the completion of
smooth functions C^∞ was for a long time the standard definition of these spaces.
Furthermore, according to Theorem 3.5.3, Sobolev spaces are reflexive Banach spaces
for 1 < p < ∞. It turns out that Sobolev spaces are the perfect spaces for handling
these problems in light of the direct method that we discussed in Sect. 5.3.
This was among the main reasons that motivated Sergei Sobolev to develop a theory
constructing these spaces, and it justifies Definition 3.7.4 from a historical point of
view.
5.5 Dirichlet Principle in Sobolev Spaces
The next two theorems solve the minimization problem of the Dirichlet integral over
two different admissible sets in a simple manner by applying Theorem 5.3.2.
Theorem 5.5.1 For bounded Ω ⊂ ℝⁿ, there exists a unique minimizer of the Dirichlet
integral over A = H₀¹(Ω).
Proof It is clear that E is bounded from below by 0, and so E has a finite infimum.
Moreover, E is coercive by Poincaré's inequality (or by considering the Poincaré
norm on H₀¹). Let uₙ ⇀ u in H₀¹(Ω). Then uₙ ⇀ u and Duₙ ⇀ Du in L²(Ω).
By Proposition 5.3.3(3),
‖Du‖_{L²} ≤ lim inf ‖Duₙ‖_{L²},
which implies
E[u] ≤ lim inf E[uₙ], (5.5.1)
and so E[·] is w.l.s.c. Finally, it can easily be proved that E[·] is strictly convex, due
to the strict convexity of |·|² and the linearity of the integral. We therefore see that E
is bounded from below, coercive, w.l.s.c., and strictly convex, and it is defined on the
reflexive Banach space H₀¹(Ω). The result now follows from Theorem 5.3.2.
Now, we investigate the same equation, but this time with u = g on ∂Ω. Here we
assume g ∈ H¹(Ω), and consequently we also have u ∈ H¹(Ω). This boundary con-
dition can be interpreted as saying that the functions u and g have the same trace on
the boundary. Since both functions are Sobolev functions, they are measurable, so point-
wise values on a set of measure zero have no effect. To avoid this problematic
issue on the boundary, we reformulate the condition in the form
u − g ∈ H₀¹(Ω).
Accordingly, the admissible set consists of all functions in H¹(Ω) that are
equal, in the trace sense, to a fixed function g ∈ H¹(Ω) on ∂Ω. The following result
proves an interesting property of such a set.
Proposition 5.5.2 The admissible set
A = {v ∈ H¹(Ω) : v − g ∈ H₀¹(Ω)}
is weakly closed.
Proof Notice first that the set A is convex. Indeed, let u, v ∈ A and let w = θu +
(1 − θ)v; clearly w ∈ H¹(Ω), this being a linear space. We also have
w − g = θu + (1 − θ)v − g
= θ(u − g) + (1 − θ)(v − g)
∈ H₀¹(Ω),
since
(u − g), (v − g) ∈ H₀¹(Ω).
Next, A is closed: if uₙ ∈ A and uₙ −→ u in H¹(Ω), then uₙ − g −→ u − g, and since
uₙ − g ∈ H₀¹(Ω),
which is closed, we get u − g ∈ H₀¹(Ω), i.e., u ∈ A. Hence A is convex and closed,
and therefore weakly closed.
The next is a variant of Theorem 5.5.1 for the space H¹(Ω). The source of difficulty
here is that the Poincaré inequality is not directly applicable.
Theorem 5.5.3 For a bounded set Ω ⊂ ℝⁿ, there exists a unique minimizer for the
Dirichlet integral over the set
A = {v ∈ H¹(Ω) : v − g ∈ H₀¹(Ω)}.
Proof In view of the proof of Theorem 5.5.1, it suffices to show that the minimizing
sequence is bounded. If uⱼ ∈ A is a minimizing sequence such that J[uⱼ] < ∞, then
sup_j ‖Duⱼ‖_{L^q} < ∞.
We also have
uⱼ − g ∈ W₀^{1,q}(Ω),
so by the Poincaré inequality,
‖uⱼ‖_{L^q} = ‖uⱼ − g + g‖_{L^q}
≤ ‖uⱼ − g‖_{L^q} + ‖g‖_{L^q}
≤ C‖D(uⱼ − g)‖_{L^q} + ‖g‖_{L^q}
≤ C(‖Duⱼ‖_{L^q} + ‖Dg‖_{L^q}) + C₁
≤ C₂.
Hence,
sup_j ‖uⱼ‖_{L^q} < ∞,
and the minimizing sequence is bounded.
Now we are ready to prove the Dirichlet principle in the general setting of Sobolev
spaces.
Theorem 5.5.4 Let Ω ⊂ ℝⁿ be bounded. A function u ∈ A = H₀¹(Ω) is a weak
solution to problem (5.4.1) if and only if u is the minimizer over A of the Dirichlet
integral.
Proof If u ∈ A is a weak solution to the Laplace equation, then u satisfies the weak
formulation of the Laplace equation, namely,
∫_Ω Du · Dv dx = 0
for every v ∈ A. Arguing as in the proof of Theorem 5.4.2, for any v ∈ A we write
v = u + w with w ∈ H₀¹(Ω) and obtain
∫_Ω |Dv|² dx = ∫_Ω |Du|² dx + ∫_Ω |Dw|² dx ≥ ∫_Ω |Du|² dx,
hence
E(u) ≤ E(v)
for all v ∈ A. Conversely, Theorem 4.5.2 proved the existence and uniqueness of the weak
solution of the problem (with f = 0), and it was shown above that this solution is a minimizer
of the variational integral E[·], while Theorem 5.5.1 proved the uniqueness of the
minimizer; so the other direction is proved.
The next result discusses the Dirichlet principle under a Neumann boundary condi-
tion,
−∇²u = 0, x ∈ Ω,
∂u/∂n = g, x ∈ ∂Ω, (5.5.2)
for a bounded Lipschitz domain Ω ⊂ ℝⁿ and g ∈ C¹(∂Ω). It can be seen from the problem
that the solution is unique up to a constant. Let u ∈ C²(Ω̄) be a classical solution of
problem (5.5.2). Multiplying the equation by v ∈ C²(Ω̄), integrating over Ω, and using
Green's formula,
0 = ∫_Ω (∇²u)v dx = ∫_{∂Ω} (∂u/∂n)v ds − ∫_Ω ∇u · ∇v dx,
or
∫_Ω ∇u · ∇v dx = ∫_{∂Ω} (∂u/∂n)v ds. (5.5.3)
This suggests the variational integral
J[v] = ½ ∫_Ω |∇v|² dx − ∫_{∂Ω} gv ds. (5.5.4)
Theorem 5.5.5 A function u ∈ C²(Ω̄) is a solution of problem (5.5.2) if and only if u
is a minimizer of (5.5.4) over A = {v ∈ C²(Ω̄) : ∫_{∂Ω} (∂v/∂n) ds = 0}.
Proof Note that the admissible space here is C²(Ω̄), so our functions u, v don't
necessarily vanish on ∂Ω. Let u ∈ C²(Ω̄) be a solution of problem (5.5.2). Let
w ∈ A, and write w = u − v for some v ∈ C²(Ω̄). Then
J[w] = J[u − v]
= ½ ∫_Ω |∇(u − v)|² dx − ∫_{∂Ω} (u − v)g ds
= ½ ∫_Ω |∇u|² dx − ∫_{∂Ω} gu ds + ½ ∫_Ω |∇v|² dx − ∫_Ω ∇u · ∇v dx + ∫_{∂Ω} (∂u/∂n)v ds.
Note that if v is a constant, then J[w] = J[u]. Substituting v = c in the above equality
shows that the condition
∫_{∂Ω} (∂u/∂n) ds = 0
is verified, and so u ∈ A. Using Green's first formula on the last two integrals on
the right-hand side yields
− ∫_Ω ∇u · ∇v dx + ∫_{∂Ω} (∂u/∂n)v ds = ∫_Ω v∇²u dx = 0.
Hence J[w] = J[u] + ½ ∫_Ω |∇v|² dx ≥ J[u], so u is a minimizer of J over A.
Conversely, let u be a minimizer of J over A, so that
J(u) ≤ J(u + tv)
for every v ∈ A and t ∈ ℝ. Set
h(t) = J(u + tv)
= ½ ∫_Ω |∇(u + tv)|² dx − ∫_{∂Ω} g(u + tv) ds
= ½ ∫_Ω |∇u|² dx + t ∫_Ω ∇u · ∇v dx + (t²/2) ∫_Ω |∇v|² dx − ∫_{∂Ω} gu ds − t ∫_{∂Ω} gv ds.
Since h attains its minimum at t = 0, we have h′(0) = ∫_Ω ∇u · ∇v dx − ∫_{∂Ω} gv ds = 0,
and by Green's formula this can be written as
∫_Ω v∇²u dx = ∫_{∂Ω} (∂u/∂n)v ds − ∫_{∂Ω} gv ds. (5.5.5)
Remember that we have yet to prove that u is a solution of the Laplace equation, so we
cannot say that ∇²u = 0. The trick here is to reduce the admissible space by adding a
suitable condition, namely v = 0 on ∂Ω. This gives the following reduced admissible
space:
A₀ = {v ∈ C_c²(Ω)} ⊂ A.
So if (5.5.5) holds for all v ∈ A, then it holds for all v ∈ A₀, i.e., for v = 0 on ∂Ω, and
consequently the integrals on the right-hand side of (5.5.5) vanish; we thus obtain
∫_Ω v∇²u dx = 0
for all v ∈ A₀, and hence
∇²u = 0.
Now, it remains to show that u satisfies the boundary condition. Getting back to our
full admissible space A, and since the left-hand side of (5.5.5) is now zero, this gives
0 = ∫_{∂Ω} ((∂u/∂n) − g) v ds
for all v ∈ A. By the Fundamental Lemma of the calculus of variations, we get the
Neumann boundary condition, and the proof is complete.
It is important to observe how the variational integral changes its form even though it
corresponds to the same equation, namely the Laplace equation. It is therefore essential
in this type of problem to determine the admissible set in which the candidate
functions compete, since changing the boundary conditions usually causes a change
in the corresponding variational integral.
We end the section with the Dirichlet principle with Neumann condition over Sobolev
spaces.
Theorem 5.5.6 A function u ∈ H¹(Ω), for bounded Ω ⊂ ℝⁿ, is a weak solution of
problem (5.5.2) if and only if u is the minimizer of the variational integral (5.5.4)
over
A = {v ∈ H¹(Ω) : ∫_{∂Ω} (∂v/∂n) ds = 0}.
Proof The (if) direction is quite similar to the argument for the preceding theorem.
Conversely, let u be a minimizer of J . It has been shown that the problem (5.5.2) has
a unique weak solution (see Problem 4.11.15(a)), and the argument above showed
that a weak solution is a minimizer to the variational integral (5.5.4), so it suffices
to prove that there exists a unique minimizer for (5.5.4), but this is true since J is
strictly convex.
We end the section with the following important remark: in the proof of the preceding
theorem, to show that a minimizer is a weak solution of the problem, one may argue
as in the proof of Theorem 5.5.5. Namely, let t ∈ ℝ, and define the function
h(t) = J(u + tv) = ½ ∫_Ω |∇(u + tv)|² dx − ∫_{∂Ω} (u + tv)g ds.
Again, h has a minimum at 0, and so h′(0) = 0. Differentiating under the integral sign,
then substituting t = 0, gives
0 = h′(0) = ∫_Ω ∇u · ∇v dx − ∫_{∂Ω} gv ds. (5.5.6)
Hence, u satisfies the weak formulation (5.5.3). Although the argument seems valid,
in fact we may have a technical issue with it. Generally speaking, moving
the derivative inside the integral when one of the functions in the integrand is not
smooth can be problematic, and this operation should be performed with extra care.
We should also note that differentiating h presumes that the integral functional J is
differentiable in some sense and that the two derivatives are equal. The next section will
elaborate on this point, and legitimize the previous argument by introducing a
generalization of the notion of derivative that can be applied to functionals
defined on infinite-dimensional spaces.
5.6 Gateaux Derivatives of Functionals

5.6.1 Introduction
In Sect. 5.3, we discussed the direct method and the indirect method and the comparison
between them. The former is based on functional analysis, whereas the latter is based
on calculus. Two reasons for choosing the direct method were:
1. discussing the direct method fits the objective and scope of this book, and
2. we didn't have the tools to differentiate the variational integrals E[·] and J[·].
In this section, we will introduce the notion of differentiability of functionals and
apply it to our variational integrals. As the title of the section suggests, we are
merely concerned with the idea of differentiating the variational integrals; the
topic of differential calculus on metric function spaces is beyond the scope of the
book.
One of the main reasons the direct method appeared and thrived was the
lack, at the time, of the differential calculus required to deal with functionals defined
on function spaces. Hilbert, his pupils, and their contemporaries didn't
have the machinery of calculus needed to deal with these "functionals". The theory of
differential calculus in function and metric spaces was developed by Maurice Fréchet and
René Gateaux, who were both still students of mathematics when Hilbert
published his direct method in 1900. The first work on this differential
calculus appeared in Fréchet's thesis in 1906, but the theory wasn't clear and rich enough
to use at that time. Gateaux started to work on the theory in 1913, and his work was not
published until 1922. The theory of differential calculus in metric and Banach spaces
soon started to attract attention, and it became an indispensable tool in the area
of calculus of variations due to its efficiency and richness in techniques. The present
section gives a brief introduction to the theory. We will not give a comprehensive
treatment of the theory, but rather highlight the important results that meet our needs
in this chapter, and show how calculus can enrich the area of variational methods
and simplify the task of solving minimization problems.
Dᵥf(x) = lim_{t→0} (f(x + tv) − f(x))/t. (5.6.1)
and given by the limit definition (5.6.1). If the limit above exists, then the quantity
Dᵥf(x) is called the Gateaux differential of f at x in the direction of v ∈ X, or simply the
G-differential. So at each single point, there are two G-differentials in one dimension
and infinitely many of them in two or more dimensions. Let us take a closer look at
the differential in (5.6.1). Let x₀ ∈ X, for some normed space X, be a fixed point at
which the G-differential in the direction of v = x − x₀, for some x ∈ X, exists. Then
Df(x₀, x − x₀) ∈ ℝ,
and
Dᵥf(x₀) = Df(x₀, v)
(i.e., Df(x₀) is bounded and linear in v). If Df(x₀, v) exists at x₀ in all directions
v, then Df(x₀, v) is called the Gateaux derivative of f, and f is said to be
Gateaux differentiable at x₀.
Care must be taken in case the domain of f is not all the space X , as we must
ensure the definition applies to points in the interior of D(f ). Furthermore, as we
observe from the definition, it requires the Gateaux derivative to be bounded and
linear in v. This is especially helpful in the calculus of variations to obtain results
that are consistent with the classical theory of differential calculus.
The following proposition shows that the rules for the G-derivative operate quite similarly to
those for the classical derivative.
Proposition 5.6.2 Let f : X −→ ℝ be G-differentiable on X, and let c ∈ ℝ. Then the
familiar linearity rules hold.
Proof Exercise.
One significant difference between the classical derivative and the G-derivative is
that if a functional is differentiable at x₀ in the classical sense, then it is continuous at
x₀. This is not the case for Df(x₀, v). The following example demonstrates this
fact.
Example 5.6.3 Consider the function f : ℝ² −→ ℝ given by
f(x, y) = xy⁴/(x² + y⁸) for (x, y) ≠ (0, 0), and f(0, 0) = 0.
It is easy to show that f is not continuous at (0, 0) by showing the limit at (0, 0)
doesn't exist (e.g., along the path x = y⁴ we have f ≡ ½, while along the axes f ≡ 0).
However, let v = (v₁, v₂) ∈ ℝ². Then applying the limit definition at
x₀ = (0, 0) gives
Dᵥf(0, 0) = lim_{t→0} f(tv₁, tv₂)/t
= lim_{t→0} t²v₁v₂⁴/(v₁² + t⁶v₂⁸)
= 0,
so
Df(0, 0)v = 0
for every direction v, although f is discontinuous at (0, 0).
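A numerical companion to this example (Python, finite difference quotients): the quotients defining Dᵥf(0, 0) shrink to 0 in every direction, while along the path x = y⁴ the function sits at the constant value ½, confirming the discontinuity.

```python
import numpy as np

def f(x, y):
    return 0.0 if (x, y) == (0.0, 0.0) else x * y**4 / (x**2 + y**8)

# Difference quotients f(t*v)/t -> 0, so the G-differential at (0,0) is 0.
v1, v2 = 0.7, -1.3
quotients = [f(t * v1, t * v2) / t for t in (1e-1, 1e-2, 1e-3)]
print(abs(quotients[-1]) < 1e-4)

# But along x = y^4 the function is identically 1/2: f is not continuous at (0,0).
ys = [1e-1, 1e-2, 1e-3]
print(np.allclose([f(y**4, y) for y in ys], 0.5))
```

The straight lines through the origin miss the thin curved region x ≈ y⁴ where f stays away from 0, which is why every directional derivative vanishes despite the discontinuity.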
where the norm ‖·‖_X is the norm on the space X. Here, if the Fréchet derivative exists
at a point x₀, then the G-derivative exists at x₀ and they coincide; moreover, the
function is continuous at x₀. However, the G-derivative may exist while the F-derivative
does not. That is why dealing with the "weaker" type of differentiation may
sound more flexible and realistic in many cases. Moreover, due to the norm and the
uniform convergence involved, demonstrating Fréchet differentiability is sometimes not
an easy task, and evaluating the G-derivative is usually easier than the F-derivative.
More importantly, since this derivative can differentiate discontinuous functions, it
suits Sobolev functions and suffices for our needs in this regard, so we will confine the
discussion to it throughout the chapter.
Recall that a differentiable function f on a convex domain in ℝⁿ is convex if and only if
f(y) − f(x) ≥ ∇f(x) · (y − x)
for all x, y ∈ D(f). The following result extends this fact to
infinite-dimensional spaces.
Theorem 5.6.4 Let f : Ω ⊆ X −→ ℝ be G-differentiable on a convex set Ω
in a normed space X. Then f is convex if and only if
f(v) − f(u) ≥ Df(u, v − u)
for all u, v ∈ Ω.
Proof Let f be convex. Then for u, v ∈ Ω and 0 < t < 1,
w = tv + (1 − t)u = u + t(v − u) ∈ Ω,
and by convexity f(w) ≤ f(u) + t(f(v) − f(u)), so
(f(u + t(v − u)) − f(u))/t ≤ f(v) − f(u);
letting t −→ 0 gives Df(u, v − u) ≤ f(v) − f(u). Conversely, suppose the inequality
holds for all u, v ∈ Ω, and let w = tv + (1 − t)u. Then
u − w = −t(v − u), v − w = (1 − t)(v − u),
and applying the inequality at w in the directions u − w and v − w gives
f(u) − f(w) ≥ Df(w, u − w) = −t Df(w, v − u), (5.6.5)
f(v) − f(w) ≥ Df(w, v − w) = (1 − t) Df(w, v − u). (5.6.6)
Multiplying (5.6.5) by (1 − t) and (5.6.6) by t, respectively, then adding the two inequalities,
gives
f(u) + t(f(v) − f(u)) ≥ f(w),
i.e., f(tv + (1 − t)u) ≤ (1 − t)f(u) + tf(v), so f is convex.
but we know that Df(u) ∈ X*; then by the definition of weak convergence, we have

Df(u)(u_n − u) −→ 0,

and the result follows by taking the liminf of both sides of the inequality above.
We can use the same discussion above to define a second-order G-derivative. Let Df(x_0) ∈ X* be the G-derivative of a functional f : X −→ R. We define the G-derivative of Df(x_0, v) in the direction of w ∈ X; this gives the second G-derivative of f in the directions v and w, taking the form

(D²f(x_0)v)w = ⟨D²f(x_0)v, w⟩ ∈ R.

Moreover, a Taylor-type expansion holds: for some t_0 ∈ (0, t),

f(u + tv) = f(u) + t⟨Df(u), v⟩ + (t²/2)⟨D²f(u + t_0 v)v, v⟩.   (5.6.7)
Proof Define the real-valued function

ϕ(t) = f(u + tv).

By Taylor's theorem in one variable, there exists t_0 ∈ (0, t) such that

ϕ(t) = ϕ(0) + tϕ′(0) + (t²/2!)ϕ″(t_0).

Since ϕ(0) = f(u), ϕ′(0) = ⟨Df(u), v⟩, and ϕ″(t_0) = ⟨D²f(u + t_0 v)v, v⟩, this yields the expansion (5.6.7).
The following statements are then equivalent for a twice G-differentiable functional f on a convex set Ω:
(1) f is convex;
(2) f(v) − f(u) ≥ Df(u, v − u) for all u, v ∈ Ω;
(3) ⟨D²f(u)v, v⟩ ≥ 0 for all u, v ∈ Ω.
Proof The equivalence between (1) and (2) has been proved in Theorem 5.6.4. The
equivalence between (2) and (3) follows easily from (5.6.7).
DJ(u)v = lim_{t→0} (J[u + tv] − J[u]) / t
= lim_{t→0} (1/t) [ (1/2)B[u + tv, u + tv] − L(u + tv) − (1/2)B[u, u] + L(u) ]
= lim_{t→0} (1/t) [ tB[u, v] + (t²/2)B[v, v] − tL(v) ]
= B[u, v] − L(v).

Similarly,

D²J(u, v)v = lim_{t→0} (DJ(u + tv)v − DJ(u)v) / t
= lim_{t→0} (1/t) [ B[u + tv, v] − L(v) − (B[u, v] − L(v)) ]
= lim_{t→0} (1/t) tB[v, v] = B[v, v].
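In finite dimensions this computation can be verified numerically. The sketch below (the matrix A, vector b, and the tolerance are our choices) models B[u, v] = vᵀAu and L(v) = bᵀv:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A = A.T @ A + n * np.eye(n)          # symmetric positive definite
b = rng.standard_normal(n)

# Finite-dimensional model: B[u, v] = v^T A u, L(v) = b^T v, J[u] = 1/2 B[u,u] - L(u)
J = lambda u: 0.5 * u @ A @ u - b @ u

u = rng.standard_normal(n)
v = rng.standard_normal(n)
t = 1e-6
numerical = (J(u + t * v) - J(u)) / t
exact = v @ A @ u - b @ v            # DJ(u)v = B[u, v] - L(v)
assert abs(numerical - exact) < 1e-3

# The critical point DJ(u) = 0, i.e. B[u, v] = L(v) for all v, is u = A^{-1} b.
u_star = np.linalg.solve(A, b)
assert np.allclose(A @ u_star, b)
```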
The arguments were fairly straightforward, but we preferred to write out the details due to the significance of the result. The functional J in the theorem represents the variational functional from which an equivalent minimization problem can be defined for an elliptic PDE. One consequence of the above theorem is that the G-derivative of the variational functional is nothing but the weak formulation of the associated PDE. Moreover, the second G-derivative of the functional is the elliptic bilinear form B evaluated at the new variation v. If B is coercive, then the second G-derivative of J is positive for all v ≠ 0.
Another consequence is that any critical point of J is a weak solution of the PDE associated with the functional J. Note from Theorem 5.6.8(1) that the equation DJ(u, v) = 0 gives B[u, v] = L(v), which is the weak formulation of the equation associated to the variational integral J.
We end the section with some calculus techniques for functionals, exploring minimizers by means of the first and second G-derivatives. The first result concerns the critical points of a functional. The term "local" carries the same meaning as in elementary calculus: a local minimum point (or function) is an interior point at which the function (or functional) attains a minimum value.
Theorem 5.6.9 Let J : Ω ⊂ X −→ R be a functional that is G-differentiable on a convex set Ω.

Proof We will only prove (1) and (2), leaving (3) as an easy exercise for the reader.
(1): Let u ∈ Ω be a local minimizer of the G-differentiable functional J. Let v ∈ Ω be any vector, and choose t > 0 small enough that u + tv ∈ Ω. This yields

J(u + tv) − J(u) ≥ 0.

Dividing by t and letting t → 0⁺ gives

DJ(u, v) ≥ 0.

Since this holds for any arbitrary v, we can choose −v, and using Proposition 5.6.2(1),

−DJ(u, v) = DJ(u, −v) ≥ 0,

hence DJ(u, v) = 0.
(2): A similar argument using the expansion (5.6.7) gives

D²J(u)(v, v) ≥ 0.
Proof The result follows easily using Theorem 5.6.9(1) above and Taylor's Formula (5.6.7).

5.7 Poisson Variational Integral
In this section, we will employ the tools and results learned in the previous section to establish some existence results and necessary conditions for minimizers. This "calculus-based" method is more flexible than the direct method (which deals only with existence problems), and provides us with various tools from calculus. We will pick the Poisson variational integral (which is a generalization of the Dirichlet integral) as our first example, since we already established minimization results and the equivalence to weak solutions for the Laplace equation. Recall that the Poisson variational integral takes the form

J[v] = (1/2) ∫_Ω |Dv|² dx − ∫_Ω fv dx.   (5.7.1)
This implies

(J[u + tv] − J[u]) / t = ∫_Ω [ (t/2)|Dv|² + Du · Dv ] dx − ∫_Ω fv dx.

The integrand in the first integral is clearly dominated by |Dv|², which is integrable, so applying the Dominated Convergence Theorem gives

lim_{t→0} (t/2) ∫_Ω |Dv|² dx = 0.
then

h′(0) = DJ(u, v).

Theorem 5.7.2 There exists a unique minimizer u ∈ H₀¹(Ω) of the Poisson variational integral (5.7.1). Moreover, u ∈ H₀¹(Ω) is the weak solution of the Poisson problem

−∇²u = f in Ω,  u = 0 on ∂Ω,

for f ∈ L²(Ω), if and only if u is the minimizer of the Poisson variational integral.
Proof For the first part of the theorem, note that J is clearly strictly convex, coercive by the Poincaré inequality, G-differentiable by Proposition 5.7.1, and so weakly l.s.c. by Theorem 5.6.5. The result follows from Theorem 5.3.2. Now we prove the equivalence of the two problems. Let u ∈ H₀¹(Ω) be the weak solution of the Poisson problem. Let v ∈ H₀¹(Ω), and write v = u − w for some w ∈ H₀¹(Ω). Then
J[v] = J[u − w] = ∫_Ω [ (1/2)|∇(u − w)|² − (u − w)f ] dx.

Expanding the square,

J[v] = ∫_Ω [ (1/2)|∇u|² − fu ] dx + ∫_Ω (1/2)|∇w|² dx − ∫_Ω ∇u · ∇w dx + ∫_Ω fw dx.

Performing integration by parts on the ∫_Ω ∇u · ∇w dx term on the right-hand side, given that

w |_{∂Ω} = 0,

yields

−∫_Ω ∇u · ∇w dx = ∫_Ω w∇²u dx = −∫_Ω fw dx,

so the last two integrals cancel and

J[v] = J[u] + (1/2) ∫_Ω |∇w|² dx ≥ J[u],

which holds for every v ∈ H₀¹(Ω); hence u is the minimizer of J. Conversely, let u be a minimizer of J. We have
J[u + tv] − J[u] = (1/2)B[u + tv, u + tv] − ∫_Ω f(u + tv) dx − (1/2)B[u, u] + ∫_Ω fu dx
= t ( B[u, v] − ∫_Ω fv dx ) + (t²/2) B[v, v].
As a consequence, we have

Corollary 5.7.3 u ∈ H₀¹(Ω) is a minimizer of the Poisson variational integral J[·] if and only if DJ(u) = 0.
Proof Note that DJ(u)v = B[u, v] − ∫_Ω fv dx, and the right-hand side represents the weak formulation of the Poisson equation with homogeneous Dirichlet condition. Hence, DJ(u) = 0 if and only if u is a weak solution to the Poisson equation, and by the previous theorem this occurs if and only if u is a minimizer for the functional J.
The above corollary views the vanishing of DJ(u) as a necessary and sufficient condition for minimization, but only for variational integrals of Poisson type.
Now we attempt to generalize our discussion of the Dirichlet principle to the following problem:

Lu = f in Ω,
u = 0 on ∂Ω,   (5.7.2)

for some uniformly elliptic operator L as in (4.5.3) with symmetric a_ij, c ∈ L∞(Ω), and c(x) ≥ 0 a.e. x ∈ Ω, for some open and bounded Ω in Rⁿ, and f ∈ L²(Ω). Recall that the elliptic bilinear form B[u, v] associated with an elliptic operator L is continuous, and B[v, v] is coercive if L is uniformly elliptic. Moreover, by uniform ellipticity, we have

Σ_{i,j=1}^{n} a_ij(x) ξ_i ξ_j ≥ λ_0 |ξ|².
It has been shown (Theorem 4.5.5) that there exists a unique weak solution in H₀¹(Ω) for the problem. The bilinear map takes the form

B[u, v] = ∫_Ω ( Σ_{i,j=1}^{n} a_ij(x) (∂u/∂x_i)(∂v/∂x_j) + c(x)u(x)v(x) ) dx.   (5.7.3)
We will follow the same plan. Namely, we prove the existence of the minimizer, then we prove that the problem of finding the minimizer of (5.7.4) is equivalent to the problem of finding the weak solution of (5.7.2), i.e., the solution of the equation and the minimizer are the same. Remember that Theorem 4.5.5 states that there exists only one weak solution, so if our result is valid, there should be only one minimizer. Before establishing the result, we recall that for bounded sequences, the following identities are known and can be easily proved:

lim inf(−x_n) = −lim sup(x_n),   (5.7.5)

and

lim inf(x_n) + lim inf(y_n) ≤ lim inf(x_n + y_n).   (5.7.6)
Theorem 5.7.4 There exists a unique minimizer over H₀¹(Ω), where Ω is bounded in Rⁿ, of the variational integral (5.7.4) for f ∈ L²(Ω).
Hence, J is bounded from below. Also, from the third inequality above, we have

J[v] ≥ C₁‖v‖²_{H¹} − ∫_Ω f² dx − C₂ ∫_Ω v² dx ≥ C₃‖Dv‖²_{L²} − ∫_Ω f² dx ≥ C‖v‖²_{H¹} − ∫_Ω f² dx,

with C = 2C₃/(C_P + 1), where C_P is the Poincaré constant.
Hence, J[·] is coercive. Further, J is strictly convex, being the sum of two strictly convex terms (i.e., |Du|² and u²) and a linear term. Finally, let u_n ⇀ u weakly in H¹(Ω). Since f is a bounded linear functional on H¹(Ω), by the definition of weak convergence, we have

lim ∫_Ω u_n f dx = ∫_Ω uf dx,
and

‖Du‖² ≤ lim inf ‖Du_n‖²,

and given that A = [a_ij(x)] ∈ L∞(Ω), it is readily seen (verify) that

(1/2) ∫_Ω A|Du|² dx ≤ lim inf (1/2) ∫_Ω A|Du_n|² dx.   (5.7.9)
Using (5.7.5), we add −lim sup(∫_Ω u_n f dx) and lim inf(−∫_Ω u_n f dx) to the left-hand side and to the right-hand side of (5.7.9), respectively. This gives

I[u] − lim sup(∫_Ω u_n f dx) ≤ lim inf I[u_n] + lim inf(−∫_Ω u_n f dx).   (5.7.10)

Adding (5.7.8) to (5.7.10), we conclude that J[·] is w.l.s.c. The result now follows from Theorem 5.3.2.
The step where we proved strict convexity is not necessary, since we already proved the functional is weakly l.s.c. In fact, establishing strict convexity in these cases is important only to prove uniqueness of the solution, which has already been verified by Theorem 4.5.5.
Next, we consider the problem of minimizing the functional (5.7.4) over the set of admissible functions A = {v ∈ H₀¹(Ω)}. The next theorem shows that the problem of finding a solution of (5.7.2) and the problem of finding the minimizer of (5.7.4) are equivalent.
Theorem 5.7.5 Consider the uniformly elliptic operator L defined in (5.7.2) for symmetric a_ij ∈ L∞(Ω), f ∈ L²(Ω), and c(x) ≥ 0 a.e. x ∈ Ω, for some open and bounded Ω in Rⁿ, and let B[u, v] be the elliptic bilinear map associated with L. Then u ∈ H₀¹(Ω) is a weak solution of (5.7.2) if and only if u is a minimizer of the variational integral (5.7.4) over A = {v ∈ H₀¹(Ω)}.
Proof Let u be a weak solution of (5.7.2). Then it satisfies the weak form

B[u, v] = ∫_Ω fv dx

for every v ∈ H₀¹(Ω). Now let w ∈ H₀¹(Ω) be arbitrary, and write w = u + v for some v ∈ H₀¹(Ω). Our claim is that

0 ≤ J[w] − J[u].
We have

J[u + v] − J[u] = (1/2)B[u + v, u + v] − ∫_Ω f(u + v) dx − (1/2)B[u, u] + ∫_Ω fu dx.   (5.7.12)

By simple computations, taking into account that B is symmetric, and the fact from Theorem 4.5.4 that if L is a uniformly elliptic operator then B is coercive, we obtain

J[u + v] − J[u] = −∫_Ω fv dx + B[u, v] + (1/2)B[v, v]
= (1/2)B[v, v]
≥ β‖v‖²_{H₀¹(Ω)}   (by coercivity of B)
≥ 0.
This implies that u is the unique minimizer of (5.7.4); note that the above inequality becomes an equality only when v = 0.
Conversely, let u ∈ H₀¹(Ω) be the minimizer of (5.7.4). Theorem 4.5.5 establishes the existence and uniqueness of a weak solution of problem (5.7.2), which we have just shown to be a minimizer of (5.7.4); on the other hand, Theorem 5.7.4 establishes the existence and uniqueness of the minimizer of (5.7.4). Hence the weak solution and the minimizer coincide. This completes the proof.
5.8 Euler–Lagrange Equation

Now we are ready to generalize this work to more general variational functionals. We have developed all the necessary tools and techniques to implement a "calculus-based" method for solving minimization problems and studying their connections with their original PDEs. Our goal is to investigate variational problems and see if the minimizers of
these problems hold as the weak solutions for the corresponding partial differential
equations. Consider the general variational integral
J[u] = ∫_Ω L(∇u, u, x) dx,   (5.8.1)

where

L : Rⁿ × R × Ω −→ R,

u, v ∈ C¹(Ω), and Ω is a C¹ open and bounded set. The first variable, in place of ∇u, is denoted by p; the second variable, in place of u, is denoted by z. This is a common practice in the differentiation process when a function and its derivatives are the arguments of another function, so that the chain rule is not misused. Such a function with the properties above is known as the Lagrangian functional. The functional (5.8.1) shall be called the Lagrangian integral. To differentiate L with respect to any of these variables, we write L_{p_i} = ∂L/∂p_i and L_z = ∂L/∂z.
We will establish an existence theorem for the minimizer of the general variational
functional (5.8.1).
One of the consequences of the preceding section is that by defining the function
h : R −→ R, by
h(t) = J (u + tv),
we see that, assuming sufficient smoothness of the integrand, the function h is differentiable if and only if J is G-differentiable. Indeed, if L is C², then J is G-differentiable and both ∇_p L and L_z are continuous, so we can apply the chain rule; and since u, v ∈ C¹(Ω), dL/dt is continuous, which implies that h is differentiable on R. Now, if u is a minimizer of a G-differentiable variational integral J, then

h′(0) = (∂/∂t) J[u + tv] |_{t=0} = DJ(u, v).   (5.8.2)

Equation (5.8.2) is called the first variation of J[·], and it provides the weak form of the PDE which is associated with J.
Let us see how to obtain the weak formulation explicitly from the first variation.
Writing

h(t) = J(u + tv) = ∫_Ω L(∇(u + tv), u + tv, x) dx,

with u + tv ∈ C₀¹(Ω), we thus have

0 = h′(0) = ∫_Ω [ Σ_{i=1}^{n} L_{p_i}(∇u, u, x) v_{x_i} + (∂L/∂z)(∇u, u, x) v ] dx.
This is the weak formulation of the PDE associated with the variational integral J. To find the equation, we use Green's identity (or integrate by parts with respect to x in the first n terms of the integral); taking into account v|_{∂Ω} = 0, we obtain
0 = ∫_Ω [ −Σ_{i=1}^{n} (∂/∂x_i) L_{p_i}(∇u, u, x) + (∂L/∂z)(∇u, u, x) ] v dx.

Since this holds for every such v, the fundamental lemma of the calculus of variations yields

−Σ_{i=1}^{n} (∂/∂x_i) L_{p_i}(∇u, u, x) + (∂L/∂z)(∇u, u, x) = 0,

or in vector form

−div(∇_p L) + (∂L/∂z)(∇u, u, x) = 0.
Theorem 5.8.1 (Necessary Condition for Minimality I) Let

L(∇u, u, x) ∈ C²(Ω)

for some Ω ⊂ Rⁿ, and consider the Lagrangian integral (5.8.1). If u ∈ C₀¹(Ω) is a minimizer for J over A = {v ∈ C₀¹(Ω)}, then u is a solution for the equation

−Σ_{i=1}^{n} (∂/∂x_i)(∂L/∂p_i)(∇u, u, x) + (∂L/∂z)(∇u, u, x) = 0.   (5.8.3)

For example, if

L(∇u, u, x) = (1/2)|p|²,

then equation (5.8.3) reduces to the Laplace equation

∇²u = 0.

If

L(∇u, u, x) = (1/2)|p|² − fu,

then

∇_p L = ∇u,  L_z = −f,

and (5.8.3) becomes the Poisson equation

−∇²u = f.
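The Euler–Lagrange computation for this Lagrangian can be reproduced symbolically. A one-dimensional sketch using SymPy (the symbol names are our choices; in 1-D the equation reads −d/dx L_p + L_z = 0):

```python
import sympy as sp

# 1-D Euler-Lagrange equation for L(p, z, x) = p^2/2 - f(x)*z:
# -d/dx L_p + L_z should reduce to -u'' - f, i.e. the Poisson equation -u'' = f.
x = sp.symbols('x')
u = sp.Function('u')
f = sp.Function('f')
p, z = sp.symbols('p z')

L = p**2 / 2 - f(x) * z

# substitute p = u'(x), z = u(x) after differentiating in p and z
Lp = sp.diff(L, p).subs({p: u(x).diff(x), z: u(x)})
Lz = sp.diff(L, z).subs({p: u(x).diff(x), z: u(x)})

euler_lagrange = -sp.diff(Lp, x) + Lz
assert sp.simplify(euler_lagrange - (-u(x).diff(x, 2) - f(x))) == 0
```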
In the preceding theorem, the minimizer of the functional J was taken over the space C₀¹(Ω), and consequently, the direction vector v was also chosen to be in C₀¹(Ω) so that

u + tv ∈ C₀¹(Ω).

If the admissible set is chosen to be C¹(Ω), then we must have v ∈ C¹(Ω). Writing
again the first variation as

0 = h′(0) = ∫_Ω [ ∇_p L(∇u, u, x) · Dv + (∂L/∂z)(∇u, u, x) v ] dx,

and integrating by parts, we now pick up a boundary term:

0 = ∫_Ω [ −div(∇_p L) + L_z ] v dx + ∫_{∂Ω} ( ∇_p L(∇u, u, x) · n ) v dS,

where n is the outward normal vector, and this equation holds for all v ∈ C¹(Ω). We thus have the following:
Theorem 5.8.3 Let L(∇u, u, x) ∈ C²(Ω) for some Ω ⊂ Rⁿ. If u ∈ C²(Ω) is a minimizer of the Lagrangian integral J over C²(Ω), then u is a solution of the Euler–Lagrange equation with the Neumann boundary condition

∇_p L(∇u, u, x) · n = 0 on ∂Ω.
A natural question arises: does the converse of Theorem 5.8.1 hold? We know from the discussion above that a weak solution of the E–L equation is a critical point of the associated variational integral. But is it necessarily a minimizer? In general the answer is no: it may also be a maximizer, or neither. As in elementary calculus, the second derivative needs to be invoked here. Consider again the function

h(t) = J(u + tv) = ∫_Ω L(∇(u + tv), u + tv, x) dx.
If h has a minimum value at 0, then h′(0) = 0 and h″(0) ≥ 0. The first derivative was found to be

h′(t) = ∫_Ω [ Σ_{i=1}^{n} L_{p_i}(∇u + t∇v, u + tv, x) v_{x_i} + L_z(∇u + t∇v, u + tv, x) v ] dx.
Then

h″(t) = ∫_Ω [ Σ_{i,j=1}^{n} L_{p_i p_j} v_{x_i} v_{x_j} + 2 Σ_{j=1}^{n} L_{z p_j} v v_{x_j} + L_{zz} v² ] dx,
More precisely, at t = 0,

0 ≤ ∫_Ω [ Σ_{i,j=1}^{n} L_{p_i p_j}(∇u, u, x) v_{x_i} v_{x_j} + 2 Σ_{i=1}^{n} L_{z p_i}(∇u, u, x) v v_{x_i} + L_{zz}(∇u, u, x) v² ] dx,

which is valid for all v ∈ C_c^∞(Ω). The above integral is called the second variation.
In particular, we obtain the Legendre condition (Theorem 5.8.4): if u is a local minimizer, then for every η ∈ Rⁿ,

Σ_{i,j=1}^{n} L_{p_i p_j}(∇u, u, x) η_i η_j ≥ 0.
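For instance, for the Dirichlet Lagrangian L = |p|²/2 the matrix (L_{p_i p_j}) is the identity, so the Legendre condition holds trivially; a minimal numerical sketch (dimension and test directions are our choices):

```python
import numpy as np

# Legendre condition for L(p, z, x) = |p|^2 / 2: the Hessian in p is the
# identity matrix, so sum_{i,j} L_{p_i p_j} eta_i eta_j = |eta|^2 >= 0.
n = 3
H = np.eye(n)                     # (L_{p_i p_j}) for L = |p|^2 / 2
rng = np.random.default_rng(3)
for _ in range(10):
    eta = rng.standard_normal(n)
    assert eta @ H @ eta >= 0.0   # quadratic form is nonnegative
```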
where

A = (L_{p_i p_j})_{i,j=1}^{n},  B = (L_{z p_i})_{i=1}^{n},  C = L_{zz},

then it can be seen that if u is a local minimizer, then Q[u, v] is positive semidefinite. Taylor's formula for J reads

J[u + tv] = J[u] + t DJ(u)v + (t²/2) Q[u, v],

where

Q[u, v] = h″(0) = (d²/dt²) J[u + tv] |_{t=0}.
Therefore, we have

Theorem 5.8.5 Let L(∇u, u, x) ∈ C²(Ω) for some bounded Ω ⊂ Rⁿ, and suppose u ∈ C₀¹(Ω) is a critical point of the Lagrangian integral J. Then u is a local minimizer for J over

A = {v ∈ C₀¹(Ω)}

provided the second variation Q[u, v] is positive for every v ≠ 0.
5.9 Dirichlet Principle for Euler–Lagrange Equation

Our first task is to prove that J is G-differentiable and to find its derivative. We need to impose further conditions on L. To ensure that the functional J[·] is finite, we assume L, L_p, and L_z are all Carathéodory. We also assume a q-growth condition on L, together with

max{ |L_p(p, z, x)|, |L_z(p, z, x)| } ≤ C( |p|^{q−1} + |z|^{q−1} + 1 ).   (5.9.3)
Set

f_t = (1/t) ( L(Du + tDv, u + tv, x) − L(Du, u, x) ).

It is clear that

f_t −→ (dL/dt)|_{t=0} = D_p L(Du, u, x) Dv + L_z(Du, u, x) v   a.e.   (5.9.6)
On the other hand, letting 0 < t ≤ 1, f_t can be written as

f_t = (1/t) ∫₀ᵗ (d/dτ) L(Du + τDv, u + τv, x) dτ
= (1/t) ∫₀ᵗ [ D_p L(Du + τDv, u + τv, x) Dv + L_z(Du + τDv, u + τv, x) v ] dτ.
Note that since u, v ∈ W₀^{1,q}, we have

( |Du + τDv|^{q−1} )^{q*} = |Du + τDv|^q ∈ L¹,

where q* = q/(q − 1) is the conjugate exponent. Together with some constant C, set them all to be the function g(x) ∈ L¹(Ω); we then have

|f_t| ≤ (1/t) ∫₀ᵗ g(x) dτ = g(x) ∈ L¹(Ω).
Now from (5.9.5) and (5.9.6), the Dominated Convergence Theorem gives (5.9.4).
Theorem 5.9.2 Under the assumptions of the preceding theorem, if u ∈ W^{1,q}(Ω) is a local minimizer of the functional J over W₀^{1,q}(Ω), then u is a weak solution of the Euler–Lagrange equation.

Proof Multiplying the equation above by v ∈ C_c^∞(Ω) and integrating by parts gives

∫_Ω [ D_p L(Du, u, x) Dv + L_z(Du, u, x) v ] dx = 0.   (5.9.7)
The task of proving that a minimizer is unique for such general functionals is somewhat challenging, and further conditions must be imposed. One way to deal with this problem is to assume convexity in the two variables (p, z) jointly, rather than in p alone. This property is called joint convexity. If the inequality is strict, then the functional F is said to be jointly strictly convex.
L(x, v, Dv) − L(x, u, Du) ≥ Lp (x, u, Du)(Dv − Du) + Lz (x, u, Du)(v − u).
(5.9.8)
Proof For 0 < t ≤ 1, set

w = tv + (1 − t)u.

Joint convexity implies

L(x, w, Dw) ≤ tL(x, v, Dv) + (1 − t)L(x, u, Du);

rearranging, dividing by t, and letting t → 0⁺ yields (5.9.8).
Proof The variational integral J is G-differentiable by Theorem 5.9.1. Let u_n ⇀ u weakly in W^{1,q}(Ω). Then by (5.9.8),

L(x, u_n, Du_n) − L(x, u, Du) ≥ L_p(x, u, Du)(Du_n − Du) + L_z(x, u, Du)(u_n − u).

Note that L_p(x, u, Du) and L_z(x, u, Du) act as bounded linear functionals, and u_n ⇀ u, Du_n ⇀ Du weakly, so

L_p(x, u, Du)(Du_n − Du) −→ 0

and

L_z(x, u, Du)(u_n − u) −→ 0.
This gives, after integrating over Ω and taking the limit inferior,

lim inf ∫_Ω L(x, u_n, Du_n) dx ≥ ∫_Ω L(x, u, Du) dx.

So, J is weakly l.s.c., and therefore, by Theorem 5.3.2, there exists a minimizer. The uniqueness of the minimizer follows from the joint strict convexity by an argument similar to that of Theorem 5.3.2; this is left to the reader as an easy exercise.
Now we are ready to establish the Dirichlet principle for the Lagrangian integral.
Theorem 5.9.6 Under the assumptions of Theorem 5.9.1, and assuming L is jointly convex in (z, p), u ∈ W^{1,q}(Ω) is a weak solution of the Euler–Lagrange equation if and only if u is a minimizer of the Lagrangian integral.
Proof Let u ∈ W^{1,q}(Ω) be a weak solution of the Euler–Lagrange equation. Integrating both sides of inequality (5.9.8) over Ω yields

J[v] − J[u] ≥ ∫_Ω [ L_p(x, u, Du)(Dv − Du) + L_z(x, u, Du)(v − u) ] dx = 0,

where the last equality is the weak form (5.9.7). This gives J[v] ≥ J[u], and hence u is a minimizer. Theorem 5.9.2 gives the other direction.
For example, the function

f(x, y) = xy

can be shown to be convex in x and convex in y, but not jointly convex in (x, y). A main motivation for us is the Legendre condition (Theorem 5.8.4), in the sense that the inequality

Σ_{i,j=1}^{n} L_{p_i p_j}(Du, u, x) η_i η_j ≥ 0

is essential for the critical point to be a minimizer. The above inequality implies that L is convex in p, which seems to be the natural property to replace the joint convexity of L, so we will adopt this property in the next two results. A classical result in real analysis shall be invoked here. Recall that Egoroff's theorem states the following: if {f_n} is a sequence of measurable functions and f_n → f a.e. on a set E of finite measure, then for every ε > 0, there exists a set A with μ(A) < ε such that f_n −→ f uniformly on E \ A. The theorem shall be used to prove the following.
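The counterexample f(x, y) = xy mentioned above can be checked numerically; a minimal sketch (the test points are our choices):

```python
import numpy as np

# f(x, y) = x*y is convex (indeed linear) in each variable separately, but its
# Hessian [[0, 1], [1, 0]] is indefinite, so f is not jointly convex.
H = np.array([[0.0, 1.0], [1.0, 0.0]])
eig = np.linalg.eigvalsh(H)
assert eig.min() < 0 < eig.max()                   # indefinite Hessian

f = lambda x, y: x * y
# Joint convexity would require f(midpoint) <= average of the endpoint values.
mid = f(0.0, 0.0)                                  # midpoint of (1,-1) and (-1,1)
chord = 0.5 * f(1.0, -1.0) + 0.5 * f(-1.0, 1.0)    # average of endpoint values
assert mid > chord                                 # convexity fails
```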
but we know that L_p(Du, u_n, x) is a bounded linear functional, and Du_n ⇀ Du weakly, so
Secondly, since L is bounded from below, J is also bounded from below; we then have

lim ∫_Ω L(Du, u_n, x) dx = ∫_Ω L(Du, u, x) dx.   (5.10.2)
Thirdly, since L is bounded from below by, say, c > −∞, WLOG we can assume L > 0 (since we can use the shift transformation L −→ L + c). Also, with Ω_ε as given by Egoroff's theorem, we note that μ(Ω \ Ω_ε) → 0 as ε → 0, so we can write

∫_Ω L(Du, u_n, x) dx ≥ ∫_Ω χ_{Ω_ε} L(Du, u_n, x) dx,
We have seen that the two main conditions to guarantee the existence of minimizers
are the coercivity and lower semicontinuity. The preceding theorem deals with the
latter condition, and we need assumptions to guarantee the former. As in the preceding
theorem, we gave conditions on L rather than J , so we will continue to do that for
the existence theorem.
Theorem 5.10.2 Let L = L(p, z, x) be the Lagrangian functional. Suppose that L is bounded from below and convex in p. Moreover, suppose there exist α > 0 and β ≥ 0 such that

L ≥ α|p|^q − β

for 1 < q < ∞. Then there exists u ∈ W^{1,q}(Ω), for some C¹ open bounded Ω ⊂ Rⁿ, such that u is the minimizer of the Lagrangian variational integral

J[u] = ∫_Ω L(Du, u, x) dx
Remark To avoid triviality of the problem, we assume that inf J < ∞ and A ≠ ∅.

Proof WLOG we can assume β = 0 (or use the shift L −→ L + β). The bound condition on L implies that

J[u] = ∫_Ω L(Du, u, x) dx ≥ α ∫_Ω |Du|^q dx,

hence J is bounded from below; note also from the preceding theorem that J is weakly l.s.c. Now let u_n ∈ A be such that J[u_n] < ∞. Then
Therefore,

sup_n ‖Du_n‖_{L^q} < ∞,

and consequently, (u_n) is bounded in W^{1,q}(Ω), which shows that J is coercive.
Finally, Proposition 5.5.2 shows that A is weakly closed. The result follows now
from Theorem 5.3.2.
Lastly, we prove that the minimizer for the Lagrangian variational integral is a weak
solution to the Euler–Lagrange equation.
Theorem 5.10.3 Suppose that the Lagrangian functional L satisfies all the assumptions of Theorems 5.9.1 and 5.10.2. If u ∈ W^{1,q}(Ω) is a local minimizer of the functional J over
5.11 Problems
(5) Show that if f : R −→ R is l.s.c. and satisfies

f(x) ≥ α|x|^p − β

for some α, β > 0 and 1 < p < ∞, then f has a minimizer over R.
(6) Give an example of a function f : R −→ R such that f is coercive, bounded
from below, but does not have a minimizer on R.
(7) Let f : X −→ [−∞, ∞) be convex and l.s.c. If there exists x₀ ∈ X such that f(x₀) = −∞, show that f ≡ −∞.
(8) Give an example of a minimizing sequence with no subsequence converging in
norm.
(9) Let {fi : i ∈ I } be a family of convex functionals defined on a Hilbert space.
Show that sup{fi : i ∈ I } is convex.
(10) Show that if f, g are l.s.c. and both are bounded from below, then f + g is l.s.c.
(11) Show that if fn is a sequence of l.s.c. functions and fn converges uniformly to
f , then f is l.s.c.
(12) Let f be bounded from below, convex, and l.s.c. Prove or disprove: f is continuous on its domain.
(13) Use Prop 5.3.3(3) to prove the statement of Theorem 4.9.5(2).
(14) (a) Show that a function f is coercive if and only if its lower level sets
{x : f (x) ≤ b, b ∈ R}
are bounded.
(b) Deduce from (a) that if f : H −→ (−∞, ∞] is proper coercive then every
minimizing sequence of f is bounded.
(15) A function is called quasi-convex if its lower-level sets

{x : f(x) ≤ b},  b ∈ R,

are convex.
(a) Show that every convex function is quasi-convex.
(b) Show that every monotone function is quasi-convex.
(c) Let f : H −→ (−∞, ∞] be quasi-convex. Show that f is l.s.c. if and only
if f is weakly l.s.c.
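A numerical illustration related to problem (15) (the choice f(x) = x³ and the grid are ours): f is monotone, hence quasi-convex, yet not convex.

```python
import numpy as np

# f(x) = x**3 is monotone, hence quasi-convex: every lower level set
# {x : f(x) <= b} is an interval (-inf, b^(1/3)]. It is not convex.
f = lambda x: x**3

xs = np.linspace(-3, 3, 601)
for b in [-1.0, 0.0, 2.0]:
    inside = xs[f(xs) <= b]
    # the level set intersected with the grid is a contiguous block of points
    assert np.allclose(np.diff(inside), xs[1] - xs[0])

# Not convex: the midpoint value lies above the chord between x = -2 and x = 0.
assert f(-1.0) > 0.5 * f(-2.0) + 0.5 * f(0.0)
```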
(16) Let f : H −→ (−∞, ∞] be quasi-convex and l.s.c., and suppose C ⊂ H is weakly closed. If there exists b ∈ R such that

C ∩ {x : f(x) ≤ b}
(17) (a) Show that the Dirichlet integral I is not coercive on W₀^{1,p}(Ω) for p > 2.
(b) Show that the Dirichlet integral I is strictly convex on W^{1,p}(Ω) for p ≥ 2.
(18) (a) Show that ‖x‖² is weakly l.s.c.
(b) Determine the values of p for which ‖x‖^p is weakly l.s.c.
(19) Let F : Rⁿ −→ (−∞, ∞] be l.s.c. and convex. Let Ω be bounded and Lipschitz in Rⁿ, and define the variational integral J : W^{1,p}(Ω) −→ R,

J[u] = ∫_Ω F(Du) dx.
∇²u − u² = f, x ∈ Ω,
u = 0, x ∈ ∂Ω.
(24) Let ψ : Rⁿ −→ R be l.s.c. and convex. Consider the functional J : W^{1,p}(Ω) −→ R, for some open Lipschitz Ω in Rⁿ and 1 < p < ∞, defined by

J[u] = ∫_Ω ψ(Du) dx.
for all u, v ∈ X .
(28) Consider the integral functional J : C¹[0, 1] −→ R defined by

J[u] = ∫₀¹ |u| dx
(d) Show that the minimizer u is the weak solution of the problem

div( |∇u|^{p−2} ∇u ) = 0, x ∈ Ω,
u = g, x ∈ ∂Ω.
(33) Let J[u] = B[u, u] + L[u] for some bilinear form B and linear functional L. Show that
(35) Consider the problem of minimizing the variational integral (5.7.1) over H₀¹(Ω), where Ω is bounded in at least one direction in Rⁿ and f ∈ L²(Ω).
(a) Prove the following identity:

inf_{v∈X} J[v] ≤ (1/2)J[u_i] + (1/2)J[u_j] − (1/4) ∫_Ω |∇(u_i − u_j)|² dx,
−∇²u = f, x ∈ Ω,
∂u/∂n = g, x ∈ ∂Ω.
Δ_p u = ∇ · ( |∇u|^{p−2} ∇u ).

(a) Find the G-derivative of u ↦ ‖u‖^p_{L^p}.
(b) Consider the p-Laplace equation (for 1 < p < ∞)

−Δ_p u = f, x ∈ Ω,
u = 0, x ∈ ∂Ω.
has a minimizer over A ={u ∈ C 1 [0, 1] such that u(0) = 0 and u(1) = 1}.
(45) Consider the variational integral J : H₀¹(Ω) −→ R given by

J[u] = ∫_Ω [ (1/2)|Du|² − f(x)Du ] dx
is strictly convex.
(b) If f (x, y) is convex in (x, y) and strictly convex in x and strictly convex in
y, then f is strictly convex.
(c) If f (x, y) is jointly convex in (x, y) and strictly convex in x and strictly
convex in y, then f is jointly strictly convex.
(51) A function f : Rⁿ −→ R is said to be strongly convex if there exists β > 0 such that f(x) − β‖x‖² is convex.
(a) Show that if a function is strongly convex, then it is strictly convex.
(b) Give an example of a function that is strictly convex but not strongly convex.
(c) Show that if a function is strongly convex, then

f(y) ≥ f(x) + ∇f(x) · (y − x) + (β/2)‖y − x‖².
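A numerical sketch of the inequality in (c) for f(x) = ‖x‖², which is strongly convex with β = 1 (the test points and dimension are our choices):

```python
import numpy as np

# f(x) = ||x||^2 satisfies f(x) - beta*||x||^2 convex for beta = 1, and the
# inequality f(y) >= f(x) + grad f(x).(y - x) + (beta/2)||y - x||^2 holds:
# here f(y) - RHS = (1/2)||y - x||^2 >= 0 exactly.
rng = np.random.default_rng(2)
beta = 1.0
f = lambda x: x @ x
grad = lambda x: 2 * x

for _ in range(100):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    rhs = f(x) + grad(x) @ (y - x) + 0.5 * beta * (y - x) @ (y - x)
    assert f(y) >= rhs - 1e-12
```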
1. R.A. Adams, J.J.F. Fournier, Sobolev Spaces (Academic, Elsevier Ltd., 2003)
2. N.I. Akhiezer, I.M. Glazman, Theory of Linear Operators in Hilbert Space (Dover Publica-
tions, 1993)
3. C. Alabiso, I. Weiss, A Primer on Hilbert Space Theory (Springer International Publishing
Switzerland, 2015)
4. F. Albiac, N.J. Kalton, Topics in Banach Space Theory (Springer International Publishing
Switzerland, 2006; 2nd edn., 2016)
5. C.D. Aliprantis, K.C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide (Springer,
Berlin, 1999; Heidelberg, 2006)
6. T. Apostol, Mathematical Analysis, 2nd edn. (Pearson, 1974)
7. J.-P. Aubin, Applied Functional Analysis, 2nd edn. (Wiley, 2000)
8. G. Bachman, L. Narici, Functional Analysis (Dover Publications, 1998)
9. V. Barbu, T. Precupanu, Convexity and Optimization in Banach Spaces (Springer Netherlands,
2012)
10. H.H. Bauschke, P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert
Spaces (Springer, 2011)
11. B. Beauzamy, Introduction to Banach Spaces and their Geometry (North- Holland Publishing
Company, 1982)
12. L. Beck, Elliptic Regularity Theory a First Course (Springer, 2016)
13. S. Berberian, Fundamentals of Real Analysis (Springer, New York, 1999)
14. S.K. Berberian, P.R. Halmos, Lectures in Functional Analysis and Operator Theory (Springer,
1974)
15. K. Bichteler, Integration - A Functional Approach (Birkhäuser, Basel, 1998)
16. A. Bowers, N.J. Kalton, An Introductory Course in Functional Analysis (Springer, New York,
2014)
17. S. Boyd, L. Vandenberghe, Convex Optimization (Cambridge University Press, 2004)
18. A. Bressan, Lecture Notes on Functional Analysis: With Applications to Linear Partial Dif-
ferential Equations (American Mathematical Society, 2012)
19. H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations (Springer,
New York, 2010)
20. D.S. Bridges, Foundations of Real and Abstract Analysis (Springer, New York, 1998)
21. T. Bühler, D.A. Salamon, Functional Analysis (American Mathematical Society, 2018)
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer 361
Nature Singapore Pte Ltd. 2024
A. Khanfer, Applied Functional Analysis,
https://doi.org/10.1007/978-981-99-3788-2
362 References
22. C. Caratheodory, Calculus of Variations and Partial Differential Equations of First Order,
3rd edn. (American Mathematical Society, 1999)
23. K. Chandrasekharan, Classical Fourier Transforms (Springer, Berlin, Heidelberg, 1989)
24. N.L. Carothers, A short Course on Banach Space Theory (Cambridge University Press, 2004)
25. Ward Cheney, Analysis for Applied Mathematics (Springer, New York Inc, 2001)
26. M. Chipot, Elliptic Equations: An Introductory Course (Birkhäuser, Berlin, 2009)
27. M. Chipot, Elements of Nonlinear Analysis (Birkhauser Advanced Texts, 2000)
28. C. Chidume, Geometric Properties of Banach Spaces and Nonlinear Iterations (Springer-
Verlag London Limited, 2009)
29. P.G. Ciarlet, Linear and Nonlinear Functional Analysis with Applications (SIAM-Society for
Industrial and Applied Mathematics, 2013)
30. R. Coleman, Calculus on Normed Vector Spaces (Springer, 2012)
31. J.B. Conway, A Course in Functional Analysis (Springer, New York, 1985)
32. R.F. Curtain, A. Pritchard, Functional Analysis in Modern Applied Mathematics (Academic,
1977)
33. B. Dacorogna, Direct Methods in the Calculus of Variations (Springer, Berlin, 1989)
34. J. Diestel, Geometry of Banach Spaces - Selected Topics (Springer, Berlin, Heidelberg, NY,
1975)
35. G. van Dijk, Distribution Theory: Convolution, Fourier Transform, and Laplace Transform,
De Gruyter Graduate Lectures (Walter de Gruyter GmbH, Berlin/Boston, 2013)
36. J.J. Duistermaat, J.A.C. Kolk, Distributions: Theory and Applications (Springer, New York,
2006)
37. Y. Eidelman, V. Milman, A. Tsolomitis, Functional Analysis: An Introduction (American
Mathematical Society, 2004)
38. L.D. Elsgolc, Calculus of Variations (Dover Books on Mathematics, 2007)
39. L.C. Evans, Partial Differential Equations, 2nd edn. (American Mathematical Society, 2010)
40. Marián Fabian, Petr Habala, Petr Hájek, Vicente Montesinos, Václav. Zizler, Functional
Analysis and Infinite-Dimensional Geometry (Springer, New York, 2001)
41. A. Friedman, Foundations of Modern Analysis (Dover Publications Inc, 1970)
42. I.M. Gelfand, S.V. Fomin, Calculus of Variations (Prentice-Hall, Inc, 1963)
43. M.G.S. Hildebrandt, Calculus of Variations (Springer, 1996)
44. D. Gilbarg, N.S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer,
2001)
45. G. Giorgi, A. Guerraggio, J. Thierfelder, Mathematics of Optimization: Smooth and Nons-
mooth Case (Elsevier Science, 2004)
46. I. Gohberg, S. Goldberg, Basic Operator Theory (Birkhäuser, 1980)
47. H.H. Goldstine, A History of the Calculus of Variations from the 17th through the 19th Century
(Springer, 1980)
48. D.H. Griffel, Applied Functional Analysis (Ellis Horwood LTD, Wiley, 1981)
49. G. Grubb, Distributions and Operators (Springer Science+Business Media, 2009)
50. C. Heil, A Basis Theory Primer (Springer Science+Business Media, LLC, 2011)
51. V. Hutson, J.S. Pym, M.J. Cloud, Applications of Functional Analysis and Operator Theory
(Elsevier Science, 2006)
52. W.B. Johnson, J. Lindenstrauss, Handbook of the Geometry of Banach Spaces, vol. 2 (Elsevier
Science B.V., 2003)
53. J. Jost, Partial Differential Equations, 2nd edn. (Springer, 2007)
54. V. Kadets, A Course in Functional Analysis and Measure Theory (Springer, 2006)
55. L.V. Kantorovich, G.P. Akilov, Functional Analysis (Pergamon Pr, 1982)
56. S. Kantorovitz, Introduction to Modern Analysis (Oxford University Press, 2003)
57. N. Katzourakis, E. Varvaruca, An Illustrative Introduction To Modern Analysis (CRC Press,
2018)
58. A. Khanfer, Fundamentals of Functional Analysis (Springer, 2023)
59. H. Kielhöfer, Calculus of Variations, An Introduction to the One-Dimensional Theory with
Examples and Exercises (Springer, 2018)
References 363
60. A.N. Kolmogorov, S.V. Fomin, Elements of the Theory of Functions and Functional Analysis (Martino Fine Books, 2012)
61. V. Komornik, Lectures on Functional Analysis and the Lebesgue Integral (Springer, 2016)
62. S.G. Krantz, A Guide to Functional Analysis (Mathematical Association of America, 2013)
63. E. Kreyszig, Introductory Functional Analysis with Applications (Wiley Classics Library, 1989)
64. A.J. Kurdila, M. Zabarankin, Convex Functional Analysis (Springer Science & Business Media, 2005)
65. S.S. Kutateladze, Fundamentals of Functional Analysis (Springer Science+Business Media, B.V., 1995)
66. S. Lang, Real and Functional Analysis (Springer, New York, 1993)
67. P.D. Lax, Functional Analysis. Wiley-Interscience Series of Texts in Pure and Applied Mathematics (2002)
68. L.P. Lebedev, I.I. Vorovich, Functional Analysis in Mechanics (Springer, New York, 2003)
69. G. Leoni, A First Course in Sobolev Spaces (American Mathematical Society, 2009)
70. E.H. Lieb, M. Loss, Analysis (American Mathematical Society, 2001)
71. J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces II: Function Spaces (Springer, Berlin, Heidelberg, 1979)
72. Yu.I. Lyubich, Functional Analysis I: Linear Functional Analysis (Springer, Berlin, Heidelberg, 1992)
73. T.-W. Ma, Classical Analysis on Normed Spaces (World Scientific Publishing, 1995)
74. M.V. Markin, Elementary Operator Theory (De Gruyter, 2020)
75. R. Megginson, An Introduction to Banach Space Theory (Springer, New York, 1998)
76. M. Miklavcic, Applied Functional Analysis and Partial Differential Equations (World Scientific Publishing Co., 1998)
77. D. Mitrea, Distributions, Partial Differential Equations, and Harmonic Analysis (Springer, 2018)
78. T.J. Morrison, Functional Analysis: An Introduction to Banach Space Theory (Wiley-Interscience, 2000)
79. J. Muscat, Functional Analysis: An Introduction to Metric Spaces, Hilbert Spaces, and Banach Algebras (Springer, 2014)
80. L. Narici, E. Beckenstein, Topological Vector Spaces (Chapman & Hall/CRC, Taylor & Francis Group, 2011)
81. J.T. Oden, L.F. Demkowicz, Applied Functional Analysis (CRC Press, Taylor & Francis Group, 2018)
82. M.S. Osborne, Locally Convex Spaces (Springer International Publishing, Switzerland, 2014)
83. S. Ponnusamy, Foundations of Functional Analysis (Alpha Science International Ltd, 2002)
84. V. Maz'ya, Sobolev Spaces with Applications to Elliptic Partial Differential Equations, 2nd edn. (Springer, 2011)
85. M. Renardy, R. Rogers, An Introduction to Partial Differential Equations, 2nd edn. (Springer, 2004)
86. M. Reed, B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, 1981)
87. F. Riesz, B. Sz.-Nagy, Functional Analysis (Dover Publications, 1990)
88. A.W. Roberts, D.E. Varberg, Convex Functions (Academic Press, 1973)
89. R.T. Rockafellar, Convex Analysis (Princeton University Press, 1970)
90. R.T. Rockafellar, R. Wets, Variational Analysis (Springer, 2010)
91. W. Rudin, Functional Analysis (McGraw-Hill, 1991)
92. B.P. Rynne, M.A. Youngson, Linear Functional Analysis. Springer Undergraduate Mathematics Series (2008)
93. H.H. Schaefer, M.P. Wolff, Topological Vector Spaces (Springer Science+Business Media, New York, 1999)
94. M. Schechter, Principles of Functional Analysis (American Mathematical Society, 2002)
I
Inclusion map, 219
Infimum, 296
Inner product space, 3
Interior regularity theorem, 283
Interior smoothness theorem, 287
Interpolation inequality, 203
Invariant subspace, 34

J
Jointly convex function, 346

O
Open mapping theorem, 4

P
Parseval's identity, 4
Partition of unity, 146
Plancherel theorem, 104
Poincaré inequality, 207
Poincaré norm, 249
Poincaré–Wirtinger inequality, 249
Poisson equation, 241
Proper function, 307

Q
Quotient Sobolev spaces, 250

R
Radon–Riesz property, 301
Rapidly decreasing function, 106
Reflexive space, 302
Regular distribution, 85
Regular value, 28
Rellich–Kondrachov theorem, 222
Resolvent, 28
Riesz–Fischer theorem, 2
Riesz representation theorem for Hilbert space, 255
Riesz's lemma, 2

S
Schwartz space, 107
Self-adjoint operator, 8
Sequentially lower semicontinuous, 297
Singular distribution, 87
Smooth domain, 187
Smooth functions, 82
Sobolev conjugate, 200
Sobolev embedding theorem, 226
Sobolev exponent, 202
Sobolev's inequality, 208
Sobolev space, 156
Spectral mapping theorem, 33
Spectral theorem for self-adjoint compact operators, 39
Spectral theorem of elliptic operator, 274
Spectrum, 29
Strictly convex, 300
Strong diffeomorphism, 189
Strong solution, 246
Sturm–Liouville operator, 67
Subordinate, 146

T
Tempered distribution, 113
Test function, 83
Toeplitz theorem, 51

U
Uniform boundedness principle, 4
Uniformly elliptic operator, 240
Upper semicontinuous, 297

V
Variational integral, 311
Variational problem, 295
Volterra equation, 45

W
Weak derivative, 134
Weak formulation of elliptic equation, 245
Weakly bounded set, 302
Weakly closed, 301
Weakly closed set, 301
Weakly compact, 302
Weakly compact set, 302
Weak convergence, 301
Weakly differentiable, 134
Weakly lower semicontinuous, 303
Weakly sequentially closed set, 301
Weakly sequentially compact set, 302
Weak solution, 245
Weak topology, 301
Weierstrass's example, 314
Weyl's lemma, 275

Z
Zero-boundary Sobolev space, 166, 179
Zero extension, 182