Series editors
Editor-in-Chief
A. Chenciner J. Coates S.R.S. Varadhan
For further volumes:
www.springer.com/series/138
Dominique Bakry · Ivan Gentil · Michel Ledoux
Analysis and
Geometry of
Markov Diffusion
Operators
Dominique Bakry
Institut de Mathématiques de Toulouse
Université de Toulouse
and Institut Universitaire de France
Toulouse, France
Michel Ledoux
Institut de Mathématiques de Toulouse
Université de Toulouse
and Institut Universitaire de France
Toulouse, France
Ivan Gentil
Institut Camille Jordan
Université Claude Bernard Lyon 1
Villeurbanne, France
To Leonard Gross
Preface
Semigroups of operators on a Banach space provide very general models and tools
in the analysis of time evolution phenomena and dynamical systems. They have a
long history in mathematics and have been studied in a number of settings, from
functional analysis and mathematical physics to probability theory, Riemannian ge-
ometry, Lie groups, analysis of algorithms, and elsewhere.
The part of semigroup theory investigated in this book deals with Markov dif-
fusion semigroups and their infinitesimal generators, which naturally arise as solu-
tions of stochastic differential equations and partial differential equations. As such,
the topic covers a large body of mathematics ranging from probability theory and
partial differential equations to functional analysis and differential geometry for op-
erators or processes on manifolds. Within these frameworks, research interests have
grown over the years, now encompassing a wide variety of questions such as reg-
ularity and smoothing properties of differential operators, Sobolev-type estimates,
heat kernel bounds, non-explosion properties, convergence to equilibrium, existence
and regularity of solutions of stochastic differential equations, martingale problems,
stochastic calculus of variations and so on.
This book is more precisely focused on the concrete interplay between the ana-
lytic, probabilistic and geometric aspects of Markov diffusion semigroups and gen-
erators involved in convergence to equilibrium, spectral bounds, functional inequal-
ities and various bounds on solutions of evolution equations linked to geometric
properties of the underlying structure.
One prototypical example at this interface is simply the standard heat semigroup
$(P_t)_{t \geq 0}$ on the Euclidean space $\mathbb{R}^n$ whose Gaussian kernel
\[ u = u(t,x) = p_t(x) = \frac{1}{(4\pi t)^{n/2}} \, e^{-|x|^2/4t}, \quad t > 0, \ x \in \mathbb{R}^n, \]
is a fundamental solution of the heat equation
\[ \partial_t u = \Delta u, \quad u(0,x) = \delta_0, \]
for the standard Laplace operator $\Delta$, thus characterized as the infinitesimal generator
of the semigroup $(P_t)_{t \geq 0}$.
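The two properties of the Gaussian kernel used here (unit total mass and the heat equation $\partial_t u = \Delta u$) can be checked numerically. The following is only a minimal sketch in dimension $n = 1$; the function names, the integration window and the finite-difference steps are illustrative choices, not part of the text.

```python
import math

def heat_kernel(t, x):
    """Gaussian heat kernel p_t(x) on the real line (n = 1)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def total_mass(t, half_width=40.0, steps=40000):
    """Midpoint-rule approximation of the integral of p_t over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return sum(heat_kernel(t, -half_width + (i + 0.5) * h) for i in range(steps)) * h

def heat_equation_residual(t, x, dt=1e-4, dx=1e-3):
    """Central-difference estimate of (d/dt - d^2/dx^2) p_t(x); expected to be near 0."""
    d_t = (heat_kernel(t + dt, x) - heat_kernel(t - dt, x)) / (2.0 * dt)
    d_xx = (heat_kernel(t, x + dx) - 2.0 * heat_kernel(t, x) + heat_kernel(t, x - dx)) / dx ** 2
    return d_t - d_xx

mass = total_mass(1.0)                       # should be close to 1
residual = heat_equation_residual(1.0, 0.7)  # should be close to 0
```

The residual is only zero up to discretization error, so it should be compared with a tolerance rather than tested for equality.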
1 The terminology “Markov triple” should not, of course, be confused with solutions of the Markov
to cover all the possible interesting cases. In order to keep the monograph within a
reasonable size, we have had to omit, among other things, the specific analysis re-
lated to hypoelliptic diffusions, special features of diffusions on Lie groups, and
many interesting developments arising from infinite interacting particle systems.
In addition, although we have largely been motivated by the analysis of the be-
havior of diffusion processes (that is, solutions of time homogeneous stochastic dif-
ferential equations), rather than concentrating on the probabilistic aspects of the
subject, such as almost sure convergence of functionals of the trajectories of the un-
derlying Markov processes, recurrence or transience, we instead chose to translate
most of the features of interest into functional analytic properties of the Markov
structure $(E, \mu, \Gamma)$ under investigation.
Heat kernel bounds, functional inequalities and their applications to convergence
to equilibrium and geometric features of Markov operators are among the main
topics of interest developed in this monograph. A particular emphasis is placed on
families of inequalities relating, on a Markov Triple $(E, \mu, \Gamma)$, functionals of
functions $f : E \to \mathbb{R}$ to the energy induced by the invariant measure $\mu$ and the carré du
champ operator $\Gamma$,
\[ \mathcal{E}(f,f) = \int_E \Gamma(f,f) \, d\mu. \]
Typical functionals are the variance, entropy or Lp -norms leading to the main func-
tional inequalities of interest, the Poincaré or spectral gap inequality, the logarithmic
Sobolev inequality and the Sobolev inequality. A particular goal is to establish such
families of inequalities under suitable curvature conditions which may be described
by the carré du champ operator $\Gamma$ and its iterated $\Gamma_2$ operator.
Similar inequalities are investigated at the level of the underlying semigroup
$(P_t)_{t \geq 0}$ for the heat kernel measures, comparing $P_t(\varphi(f))$ (for some $\varphi : \mathbb{R} \to \mathbb{R}$)
to $P_t(\Gamma(f,f))$ or $\Gamma(P_t f, P_t f)$, which give rise to heat kernel bounds. With this
task in mind, we will develop the main powerful tool of heat flow monotonicity, or
semigroup interpolation, with numerous illustrative applications and strong intuitive
content. To illustrate the principle, as a wink towards what is to come, let us briefly
present here a heat flow proof of the classical Hölder inequality which is very much
in the spirit of this book. In particular, the reduction to a quadratic bound is typi-
cal of the arguments developed in this work. Let f, g be suitable (strictly) positive
functions on Rn and θ ∈ (0, 1). For fixed t > 0, consider, at any point (omitted), the
interpolation
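\[ \Lambda(s) = P_s \big( e^{\theta \log P_{t-s} f + (1-\theta) \log P_{t-s} g} \big), \quad s \in [0,t], \]
where $(P_t)_{t \geq 0}$ is the standard heat semigroup on $\mathbb{R}^n$ as recalled above. Together
with the heat equation $\partial_s P_s = \Delta P_s = P_s \Delta$, the derivative in $s$ of $\Lambda$ is given by
\[ \Lambda'(s) = P_s \big( \Delta e^H - e^H \big[ \theta e^{-F} \Delta e^F + (1-\theta) e^{-G} \Delta e^G \big] \big) \]
where $F = \log P_{t-s} f$, $G = \log P_{t-s} g$ and $H = \theta F + (1-\theta) G$. Since
$e^{-H} \Delta e^H = \Delta H + |\nabla H|^2$, and similarly for $F$ and $G$, it follows that
\[ \Lambda'(s) = P_s \big( e^H \big[ |\theta \nabla F + (1-\theta) \nabla G|^2 - \theta |\nabla F|^2 - (1-\theta) |\nabla G|^2 \big] \big) \]
which is negative by convexity of the square function. Hence $\Lambda(s)$, $s \in [0,t]$, is
decreasing, and thus
\[ \Lambda(t) = P_t \big( f^\theta g^{1-\theta} \big) \leq (P_t f)^\theta (P_t g)^{1-\theta} = \Lambda(0). \]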
Normalizing by $t^{n/2}$ and letting $t$ tend to infinity yields Hölder’s inequality for the
Lebesgue measure. Actually, the same argument may be performed at the level of
a Markov semigroup with invariant finite discrete measure, thus yielding Hölder’s
inequality for arbitrary measures.
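The monotonicity of the interpolation can be observed on a toy model. The sketch below is an illustration under stated assumptions, not the argument of the text: the heat semigroup is replaced by the powers $P^k$ of a hypothetical 3-state Markov matrix (a discrete-time stand-in), and the test functions and $\theta$ are arbitrary positive choices. It tabulates $\Lambda(k) = P^k\big((P^{N-k} f)^\theta (P^{N-k} g)^{1-\theta}\big)$ at every state, which should be decreasing in $k$.

```python
def apply(P, f):
    """Apply a Markov matrix P to a function f on a finite state space."""
    return [sum(p_xy * f_y for p_xy, f_y in zip(row, f)) for row in P]

def iterate(P, f, k):
    """Return P^k f."""
    for _ in range(k):
        f = apply(P, f)
    return f

def interpolation(P, f, g, theta, N):
    """Lambda(k) = P^k( (P^{N-k} f)^theta * (P^{N-k} g)^(1-theta) ), for k = 0..N."""
    values = []
    for k in range(N + 1):
        F = iterate(P, f, N - k)
        G = iterate(P, g, N - k)
        h = [a ** theta * b ** (1.0 - theta) for a, b in zip(F, G)]
        values.append(iterate(P, h, k))
    return values

# Hypothetical lazy random walk on three states (each row sums to 1)
# and two strictly positive test functions.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
f = [1.0, 4.0, 2.0]
g = [3.0, 0.5, 5.0]
lam = interpolation(P, f, g, theta=0.3, N=6)

# lam[0] is (P^N f)^theta (P^N g)^(1-theta); lam[N] is P^N(f^theta g^(1-theta)).
is_decreasing = all(lam[k + 1][x] <= lam[k][x] + 1e-12
                    for k in range(6) for x in range(3))
```

Comparing `lam[6]` with `lam[0]` state by state gives Hölder's inequality for each of the measures $P^N(x, \cdot)$.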
While functional inequalities and their related applications are an important focal
point, they also give us the opportunity to discuss a number of issues related to
examples and properties of Markov semigroups and operators. One objective of
this work is thus also to present the basic tools and ideas revolving around Markov
semigroups and to illustrate their usefulness in different contexts.
The monograph comprises three main parts.
The first part, covering Chaps. 1 to 3, presents some of the main features,
properties and examples of Markov diffusion semigroups and operators as con-
sidered in this work. In a somewhat informal but intuitive way, Chap. 1 intro-
duces Markov semigroups, their infinitesimal generators and associated Markov
processes, stochastic differential equations and diffusion semigroups. It also de-
scribes a few of the standard operations and techniques while working with semi-
groups. Chapter 2 develops in detail a number of central geometric models which
will serve as references for later developments, namely the heat semigroups and
Laplacians on the flat Euclidean space, the sphere and the hyperbolic space. Sturm-
Liouville operators on the line, and some of the most relevant examples (Ornstein-
Uhlenbeck, Laguerre and Jacobi), are also presented therein. On the basis of these
preliminary observations and examples, Chap. 3 then tries to describe a general
framework of investigation. While it would not be appropriate to try to cover in a
unique formal mould all the cases of interest, it is nevertheless useful to emphasize
the basic properties and tools in order to easily and suitably develop the $\Gamma$-calculus.
In particular, it is necessary to describe with some care the various classes and al-
gebras of functions that we shall be dealing with and to show their relevance in
the classical smooth settings. Note that while infinite-dimensional models would re-
quire further care in this abstract formalism, the methods and principles emphasized
throughout this work are similarly relevant for them. Taking the more classical pic-
ture as granted, Chap. 3 may be skipped at first reading (or limited to the summary
Sect. 3.4).
Part II, forming the core of the text, includes Chaps. 4 to 6 and covers the three
main functional inequalities of interest, Poincaré or spectral gap inequalities, loga-
rithmic Sobolev inequalities and Sobolev inequalities. For each family, some basic
properties and tools are detailed, in tight connection with the reference examples of
Chap. 2 and their geometric properties. Stability, perturbation and comparison prop-
erties, characterization in dimension one, concentration bounds and convergence to
equilibrium are thus addressed for each family. The discussion then distinguishes
between inequalities for the heat kernel measures (local) and for the invariant mea-
sure (global) which are analyzed and established under curvature hypotheses. Chap-
ter 4 is thus devoted to Poincaré or spectral gap inequalities, closely related to
spectral decompositions. Chapter 5 deals with logarithmic Sobolev inequalities, em-
phasized as the natural substitute for classical Sobolev-type inequalities in infinite
dimension, and their equivalent hypercontractive smoothing properties. Sobolev in-
equalities form a main family of interest for which Chap. 6 provides a number of
equivalent descriptions (entropy-energy, Nash or Gagliardo-Nirenberg inequalities)
and associated heat kernel bounds. A significant proportion of this chapter is devoted
to the rich geometric content of Sobolev inequalities, their conformal invariance, and
the curvature-dimension conditions.
On the basis of the main functional inequalities of Part II, Part III, consisting of
Chaps. 7 to 9, addresses several variations, extensions and related topics of interest.
Chapter 7 deals with general families of functional inequalities, each of them hav-
ing their own interest and usefulness. The exposition mainly emphasizes entropy-
energy (on the model of logarithmic Sobolev inequalities) and Nash-type inequal-
ities. In addition, the tightness of functional inequalities is studied by employing
the tool of weak Poincaré inequalities. Chapter 8 is an equivalent description of the
various families of inequalities for functions presented so far in terms of sets and
capacities for which co-area formulas provide the suitable link. The second part of
this chapter is concerned with isoperimetric-type inequalities for which semigroup
tools again prove most useful. Chapter 9 briefly presents some of the recent impor-
tant developments in optimal transportation in connection with the semigroup and
$\Gamma$-calculus, including in particular a discussion of the relationships between
functional and transportation cost inequalities (in a smooth Riemannian setting).
The last part of the monograph consists of three appendices, on semigroups of
operators on a Banach space, elements of stochastic calculus and the basics of dif-
ferential and Riemannian geometry. At the interface between analysis, probability
and geometry, these appendices aim to possibly supplement the reader’s knowledge
depending on his own background. They are not strictly necessary for the compre-
hension of the core of the text, but may serve as a support for the more special-
ized parts. It should be mentioned, however, that the last two sections of the third
appendix on the basics of Riemannian geometry actually contain material on the
$\Gamma$-calculus (in a Riemannian context) which will be used in a critical way in some
parts of the book.
This book has been designed to be both an introduction to the subject, intended to
be accessible to non-specialists, and an exposition of both basic and more advanced
results of the theory of Markov diffusion semigroups and operators. Indeed we chose
to concentrate on those points where we felt that the techniques and ideas are central
and may be used in a wider context, even though we have not attempted to reach
the widest generality. Every chapter starts at a level which is elementary for the
notions developed in it, but may evolve to more specialized topics which in general
may be skipped at first reading. It should be stressed that the level of exposition
throughout the book is fairly non-uniform, sometimes putting emphasis on facts or
results which may appear as obvious or classical for some readers while developing
at the same time more sophisticated issues. This choice is motivated by our desire
to make the text accessible to readers with different backgrounds, and also by our
aim to provide tools and methods to access more difficult parts of the theory or to
be applied in different contexts. This delicate balance is not always reached but we
nevertheless hope that the chosen style of exposition is helpful.
The monograph is intended for students and researchers interested in the mod-
ern aspects of Markov diffusion semigroups and operators and their connections
with analytic functional inequalities, probabilistic convergence to equilibrium and
geometric curvature. Selected chapters may be used for advanced courses on the
topic. Readers who wish to get a flavor of Markov semigroups and their applica-
tions should concentrate on Part I (with the exception of Chap. 3) and Part II. Via
an appropriate selection of topics, Part III tries to synthesise the developments of
the last decade. The book demands from the reader only a reasonable knowledge
of basic functional analysis, measure theory and probability theory. It is also ex-
pected that it may be read in a non-linear way, although the various chapters are not
completely independent. The reader not familiar with the main themes (analysis,
probability and geometry) will find some of the basic material collected together in
the appendices.
Each Chapter is divided into Sections, often themselves divided in Sub-Sections.
Section 1.8 is the eighth section in Chap. 1. Theorem 4.6.2 indicates a theorem in
Chap. 4, Sect. 4.6, and (3.2.2) is a formula in Sect. 3.2. An item of a given chapter
is also referred to in other chapters by the page on which it appears. There are no
references to articles or books within the exposition of a given chapter. The Sections
“Notes and References” at the end of each chapter briefly describe some historical
developments with pointers to the literature. The references are far from exhaustive
and in fact are rather limited. There is no claim for completeness and we apologize
for omissions and errors. For books and monographs, we have tried to present the
references in historical order with respect to original editions (although the links
point toward the latest editions).
This book began its life in the form of lectures presented by the first author
at Saint-Louis du Sénégal in April 2009. He thanks the organizers of this school
for the opportunity to give this course and the participants for their interest. This
work presents results and developments which have emerged during the last three
decades. Over the years, we have benefited from the vision, expertise and help of a
number of friends and colleagues, among them M. Arnaudon, F. Barthe, W. Beck-
ner, S. Bobkov, F. Bolley, C. Borell, E. Carlen, G. Carron, P. Cattiaux, D. Chafaï,
D. Cordero-Erausquin, T. Coulhon, J. Demange, J. Dolbeault, K. D. Elworthy,
M. Émery, A. Farina, P. Fougères, N. Gozlan, L. Gross, A. Guillin, E. Hebey,
B. Helffer, A. Joulin, C. Léonard, X. D. Li, P. Maheux, F. Malrieu, L. Miclo, E. Mil-
man, B. Nazaret, V. H. Nguyen, Z.-M. Qian, M.-K. von Renesse, C. Roberto, M. de
la Salle, L. Saloff-Coste, K.-T. Sturm, C. Villani, F.-Y. Wang, L. Wu and B. Ze-
garliński. We wish to thank them for their helpful remarks and constant support.
F. Bolley, S. Campese and C. Léonard went through parts of the manuscript at sev-
eral stages of the preparation, and we warmly thank them for all their corrections
and comments that helped to improve the exposition.
We sincerely thank the Springer Editors C. Byrne and M. Reizakis and the pro-
duction staff for a great editing process.
We apologize for all the errors, and invite the readers to report any remarks,
mistakes and misprints. A list of errata and comments will be maintained online.
Lyon, Toulouse Dominique Bakry
June 2013 Ivan Gentil
Michel Ledoux
Basic Conventions
Here are some classical and basic conventions used throughout the book.
N is the set of integers {0, 1, 2, . . .}. The set of real numbers is denoted by R.
Functions (on some state space E) are always real-valued. Points in R are usually
denoted by x (if R is the underlying state space) or by r.
An element r ∈ R is positive if r ≥ 0, strictly positive if r > 0, negative if r ≤ 0
and strictly negative if r < 0. Moreover, R+ = [0, ∞) is the set of positive real
numbers while (0, ∞) denotes the set of strictly positive numbers. For r, s ∈ R,
r ∧ s = min(r, s) and r ∨ s = max(r, s). We agree that 0 log 0 = 0.
In the same way (and somewhat against the current), a positive (respectively
negative) function f (on E) is such that f (x) ≥ 0 (respectively f (x) ≤ 0) for every
x ∈ E. The function is strictly positive or strictly negative whenever the inequal-
ities are strict. Similarly, an increasing (respectively decreasing) function f on R
or some interval of R satisfies f (x) ≤ f (y) (respectively f (x) ≥ f (y)) for every
x ≤ y. The function f is said to be strictly increasing or strictly decreasing when-
ever the preceding inequalities are strict. A function is monotone if it is increasing
or decreasing.
Points in $\mathbb{R}^n$ are denoted by $x = (x_1, \ldots, x_n) = (x_i)_{1 \leq i \leq n}$ (or sometimes
$x = (x^1, \ldots, x^n) = (x^i)_{1 \leq i \leq n}$ depending on the geometric context). The scalar
product and Euclidean norm in $\mathbb{R}^n$ are given by
\[ x \cdot y = \sum_{i=1}^n x_i y_i, \qquad |x| = (x \cdot x)^{1/2} = \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2}. \]
The notation | · | is used throughout to denote the Euclidean norm of vectors and of
tensors.
The constant function equal to 1 on a state space E is denoted by 1. If A ⊂ E,
1A is the characteristic or indicator function of A.
All measures on a measurable space $(E, \mathcal{F})$ considered here are positive
measures. Positive (measurable) functions on $(E, \mathcal{F})$ may take the value $+\infty$. If $\mu$ is a
(positive) measure on $(E, \mathcal{F})$, and if $f$ is a function on $E$ which is integrable with
respect to $\mu$, its integral with respect to $\mu$ is denoted by $\int_E f \, d\mu$ or $\int_E f(x) \, d\mu(x)$,
Contents
Appendices
Appendix A Semigroups of Bounded Operators on a Banach Space
A.1 The Hille-Yosida Theory
A.2 Symmetric Operators
A.3 Friedrichs Extension of Positive Operators
A.4 Spectral Decompositions
A.5 Essentially Self-adjoint Operators
A.6 Compact and Hilbert-Schmidt Operators
A.7 Notes and References
\[ p_t(x) = \frac{1}{(4\pi t)^{n/2}} \, e^{-|x|^2/4t}, \quad t > 0, \ x \in \mathbb{R}^n. \]
Formally, we may already agree that $p_0(x)$ may be considered as the Dirac mass
at 0. The normalization is chosen so that $p_t(x)$ are probability densities with respect
to the Lebesgue measure (denoted $dx$), that is $\int_{\mathbb{R}^n} p_t(x) \, dx = 1$. It is a classical
result, and easy to see, that $p_t$ solves the (parabolic) heat equation
\[ \partial_t p_t = \Delta p_t \]
\[ \partial_t P_t f = \Delta P_t f = P_t(\Delta f). \]
second order space derivatives, highlighting the central importance of second order
differential operators of interest in this monograph.
On the probabilistic side, the densities pt describe the transition probabilities of
a random process (Bt )t≥0 in Rn called Brownian motion. More precisely, (Bt )t≥0 is
a (continuous) random process with values in Rn such that B0 = 0 (almost surely)
and the increments
mension of a Markov (diffusion) operator which will play a crucial role throughout,
in particular in the study of functional inequalities.
of the use of Markov evolutions in mathematical physics and biology, for example,
dealing in particular with processes on spaces with an infinite number of coordi-
nates. For such spaces, the setting of locally compact spaces is clearly not sufficient.
Although we do not really discuss such infinite-dimensional models in this book, it
is important to choose the correct framework to start with.
Markov processes are characterized by the basic Markov property. Let $(X_t^x)_{t \geq 0}$
be a measurable process on a probability space $(\Omega, \Sigma, \mathbb{P})$ starting at time $t = 0$
from $x \in E$. When the initial point $x$ is fixed or clear from the context, $(X_t^x)_{t \geq 0}$
is denoted more simply by (Xt )t≥0 . For a given starting point x ∈ E, denote by
Ft = σ (Xu ; u ≤ t), t ≥ 0, the natural filtration of (Xt )t≥0 . The Markov property
then indicates that for t > s, the law of Xt given Fs is the law of Xt given Xs ,
as well as the law of Xt−s given X0 , the latter property reflecting the fact that the
Markov process is time homogeneous, which is the unique case that will be consid-
ered here. The process (Xt )t≥0 (starting at x) is then said to be a Markov process.
One prototypical example is Brownian motion (with values in E = Rn ) for which
the independence of the increments immediately ensures the Markov property.
In a more analytical language, the Markov property may be described on
the finite-dimensional distributions of the process (Xt )t≥0 . Namely, if pt (x, dy)
is the probability kernel describing the distribution of Xt starting from x, the
law of the pair (Xt1 , Xt2 ), t1 ≤ t2 , is obtained after conditioning by Ft1 as
pt1 (x, dy1 ) pt2 −t1 (y1 , dy2 ). Iterating the procedure, the law of the sample
(Xt1 , . . . , Xtk ), 0 < t1 ≤ · · · ≤ tk , given that X0 = x, is
pt1 (x, dy1 ) pt2 −t1 (y1 , dy2 ) · · · ptk −tk−1 (yk−1 , dyk ).
The description of the laws of the various k-uples (Xt1 , . . . , Xtk ), 0 < t1 ≤ · · · ≤ tk ,
is certainly not enough to describe all the properties of the Markov process itself. For example,
it does not allow the description of the law of the random variable XT where T
is the first (random) time when the process (Xt )t≥0 enters some set A. For such
tasks, it is in general necessary to request additional assumptions such as the fact
that (Xt )t≥0 lives in a topological space and has regular paths (continuous, or at
least right-continuous). For the purpose of our investigation, the study of finite-
dimensional marginal laws will suffice, so we shall mainly concentrate on them.
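Given such a Markov process (or family of processes) $\{X_t^x \, ; \, t \geq 0, \, x \in E\}$,
define the associated Markov semigroup $(P_t)_{t \geq 0}$ on suitable measurable functions
$f : E \to \mathbb{R}$ by the conditional formula
\[ P_t f(x) = \mathbb{E}\big( f(X_t^x) \big) = \mathbb{E}\big( f(X_t) \mid X_0 = x \big), \quad t \geq 0, \ x \in E. \tag{1.1.1} \]
In particular $P_0 f = f$. The Markov property then indicates that, for every suitably
integrable function $f : E \to \mathbb{R}$, and every $t, s \geq 0$,
\begin{align*}
P_{t+s} f(x) &= \mathbb{E}\big( f(X_{t+s}^x) \big) = \mathbb{E}\big( \mathbb{E}\big( f(X_{t+s}^x) \mid \mathcal{F}_t \big) \big) \\
&= \mathbb{E}\big( \mathbb{E}\big( f(X_{t+s}^x) \mid X_t^x \big) \big) \\
&= \mathbb{E}\big( P_s f(X_t^x) \big) = P_t(P_s f)(x).
\end{align*}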
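The semigroup identity $P_{t+s} f = P_t(P_s f)$ can be illustrated on the simplest non-trivial state space. The sketch below is an assumption-laden example rather than anything from the text: it uses a hypothetical two-state continuous-time Markov chain on $E = \{0, 1\}$ with jump rates `a` (from 0 to 1) and `b` (from 1 to 0), for which $P_t$ has a closed form, and also checks invariance of $\mu = (b, a)/(a+b)$.

```python
import math

def two_state_semigroup(t, a, b):
    """P_t = exp(tL) for the generator L = [[-a, a], [b, -b]] on E = {0, 1}."""
    e = math.exp(-(a + b) * t)
    return [[(b + a * e) / (a + b), a * (1.0 - e) / (a + b)],
            [b * (1.0 - e) / (a + b), (a + b * e) / (a + b)]]

def apply(P, f):
    """(P f)(x) = sum over y of P(x, y) f(y)."""
    return [sum(P[x][y] * f[y] for y in range(len(f))) for x in range(len(P))]

a, b = 2.0, 0.5        # hypothetical jump rates
f = [1.0, -3.0]        # an arbitrary function on E
t, s = 0.4, 1.1

lhs = apply(two_state_semigroup(t + s, a, b), f)                     # P_{t+s} f
rhs = apply(two_state_semigroup(t, a, b),
            apply(two_state_semigroup(s, a, b), f))                  # P_t (P_s f)

# Invariant probability measure and the integrals of f and P_t f against it.
mu = [b / (a + b), a / (a + b)]
mean_f = sum(m * v for m, v in zip(mu, f))
mean_Ptf = sum(m * v for m, v in zip(mu, apply(two_state_semigroup(t, a, b), f)))
```

Here `lhs` and `rhs` agree up to rounding, as do `mean_f` and `mean_Ptf`.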
1.2 Markov Semigroups, Invariant Measures and Kernels
This section introduces the basic notions of Markov semigroup and invariant mea-
sure and, still in an informal way, some of the general principles governing their
analysis. The framework will be made progressively more precise in subsequent
sections. The basics for the construction and properties of abstract semigroups are
presented in Appendix A.
1.2.2 Kernels
where pt (x, dy) is, for every t ≥ 0, a probability kernel (that is, for every x ∈ E,
pt (x, ·) is a probability measure, and for every measurable set A ∈ F , x → pt (x, A)
is measurable). Recall that such a representation formula is actually understood for
μ-almost every x in E, as are all identities between functions throughout this work.
The distribution at time t of the underlying Markov process Xtx starting at x is thus
given by the probability pt (x, ·).
From the representation (1.2.4), the operators Pt , t ≥ 0, can be extended to pos-
itive measurable functions, possibly with infinite values. Often, inequalities involv-
ing Pt f are accordingly stated for bounded or positive measurable functions f .
The representation (1.2.4) requires good measurable spaces and the measure
decomposition theorem and is established in the next proposition (more for self-
consistency, since again Markov semigroups associated with Markov processes are
usually given explicitly in terms of kernels).
Proof We use a result from measure theory, holding on good measurable spaces,
known as the bi-measure theorem. This theorem states that whenever (E, F) is
a good measurable space, if there is a map (A, B) → ν(A, B) from F × F into
R+ such that for every A, B ∈ F , the maps C → ν(A, C) and C → ν(C, B)
are measures on F , then there exists a measure ν1 on the product space E × E
equipped with the product σ -field F ⊗ F such that for every (A, B) ∈ F × F ,
ν(A, B) = ν1 (A × B).
On the basis of this result, we establish the proposition when μ is a probability
measure (the case when μ is σ -finite then easily follows). Set, for (A, B) ∈ F × F ,
\[ \nu(A, B) = \int_E 1_A \, P(1_B) \, d\mu, \]
It is then easily checked that the kernel p(x, dy) fulfills the required conditions. The
proposition is established.
14 1 Markov Semigroups
Very often, the family of kernels pt (x, dy) have densities with respect to a refer-
ence measure (often also the invariant measure).
In this case, $\int_E p_t(x,y) \, dm(y) = 1$ for ($m$-almost) every $x \in E$ (reflecting the fact
that $P_t(1) = 1$).
Proof We only prove the result in the case when $m$ is a probability measure, the
general case being easily adapted by $\sigma$-finiteness of $m$. For a good measurable space,
there exists a sequence $(A_\ell)_{\ell \in \mathbb{N}}$ of measurable sets which generates the $\sigma$-field
$\mathcal{F}$ up to sets of $m$-measure 0. For each $\ell \in \mathbb{N}$, let $\mathcal{F}_\ell$ be the $\sigma$-field generated by
$A_0, \ldots, A_\ell$. It is also generated by a partition $(B_1^\ell, \ldots, B_{j_\ell}^\ell)$ of $E$ from which the
sets with $m$-measure 0 have been removed. For $f \in L^1(m)$, define $P_\ell f$ as the
conditional expectation $\mathbb{E}(Pf \mid \mathcal{F}_\ell)$ of $Pf$ with respect to $\mathcal{F}_\ell$. The operator $P_\ell$ is bounded
from $L^1(m)$ into $L^\infty(m)$ with norm $M$ and may be represented as
\[ P_\ell f(x) = \sum_{j=1}^{j_\ell} Q_j^\ell f \, 1_{B_j^\ell}(x) \]
where
\[ Q_j^\ell f = \frac{1}{m(B_j^\ell)} \int_{B_j^\ell} Pf \, dm. \]
The $Q_j^\ell$’s, $j = 1, \ldots, j_\ell$, are linear operators, bounded on $L^1(m)$, with norm
bounded above by $M$. By duality,
\[ Q_j^\ell f = \int_E f(y) \, q_j^\ell(y) \, dm(y) \]
In other words, $p_\ell$ is the kernel of the operator $\mathbb{E}(P(\mathbb{E}(f \mid \mathcal{F}_\ell)) \mid \mathcal{F}_\ell)$. It is then not
hard to see that $p_\ell$, $\ell \in \mathbb{N}$, is an $m \otimes m$-martingale with respect to the increasing
family of $\sigma$-fields $\mathcal{F}_\ell \otimes \mathcal{F}_\ell$, $\ell \in \mathbb{N}$. Since it is bounded, by elementary martingale
theory, it converges ($m \otimes m$-almost everywhere) to a measurable function $p(x,y)$,
$(x,y) \in E \times E$, bounded by $M$, which is easily seen to be a kernel of the operator
$P$. The proof is complete.
In particular,
√ 2 the density kernel p 2 (x, y) of P 2 is bounded from below by
(1 − ε ) (1 + ε).
a P (1{g≥a} ). But
It is easy to see that the condition ε < 1 is necessary in the above statement:
consider for example on E = {0, 1} the Markov operator P given by Pf (0) = f (1)
and $Pf(1) = f(0)$, invariant with respect to the uniform measure $\mu$, for which $\varepsilon = 1$
and $P^2 = \mathrm{Id}$. For this $P$, the kernel $p^2(x,y)$ of $P^2$ with respect to $\mu$ is such that
$p^2(0,1) = p^2(1,0) = 0$.
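This two-point counterexample is small enough to compute by hand, and the following sketch simply spells the computation out. The matrix representation of $P$ and the helper name `matmul` are illustrative choices; on a finite space the density kernel of an operator with respect to $\mu$ is its matrix entry divided by $\mu(y)$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# P f(0) = f(1) and P f(1) = f(0): the swap on E = {0, 1}.
P = [[0, 1], [1, 0]]
mu = [0.5, 0.5]  # uniform measure on E

# Invariance of mu: sum over x of mu(x) P(x, y) equals mu(y).
mu_P = [sum(mu[x] * P[x][y] for x in range(2)) for y in range(2)]

# P^2 is the identity, so its density kernel with respect to mu
# vanishes off the diagonal.
P2 = matmul(P, P)
density = [[P2[x][y] / mu[y] for y in range(2)] for x in range(2)]
```

The off-diagonal entries of `density` are exactly zero, so no strictly positive lower bound on the density kernel of $P^2$ can hold in this case.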
Section 1.1 describes the basic principle used to construct a Markov semigroup from
a Markov process. Now conversely, given a Markov semigroup, the construction of a
Markov process associated to it relies on the Chapman-Kolmogorov equation which
expresses the semigroup property from a probabilistic point of view. As constructed
in the preceding sections, kernels and density kernels are only defined almost ev-
erywhere with respect to the underlying measure, and identities between them are
understood in this sense below.
Let P = (Pt )t≥0 be a Markov semigroup on L2 (μ) according to Definition 1.2.2.
The semigroup property Pt ◦ Ps = Pt+s translates to the kernels pt (x, dy) of the
representation (1.2.4) via the composition property, for all t, s ≥ 0, x ∈ E,
\[ p_{t+s}(x, dy) = \int_{z \in E} p_t(z, dy) \, p_s(x, dz). \tag{1.3.1} \]
When the kernels admit densities pt (x, y), t > 0, (x, y) ∈ E × E, with respect to
a reference measure m on (E, F) as in Definition 1.2.4, the latter equation expresses
that, for all t, s > 0, (x, y) ∈ E × E,
\[ p_{t+s}(x, y) = \int_E p_t(z, y) \, p_s(x, z) \, dm(z). \tag{1.3.2} \]
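For the heat kernel on the real line, with $m$ the Lebesgue measure, the Chapman-Kolmogorov equation for densities can be verified by direct numerical integration. This is a hedged sketch only: the choice of the one-dimensional Gaussian kernel as the example, the integration window and the number of quadrature points are all assumptions made for illustration.

```python
import math

def p(t, x, y):
    """Heat kernel density p_t(x, y) on R with respect to Lebesgue measure."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def chapman_kolmogorov_rhs(t, s, x, y, half_width=30.0, steps=60000):
    """Midpoint-rule approximation of the integral over z of p_t(z, y) p_s(x, z)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        z = -half_width + (i + 0.5) * h
        total += p(t, z, y) * p(s, x, z)
    return total * h

t, s, x, y = 0.3, 0.5, 0.2, -1.0
lhs = p(t + s, x, y)                      # p_{t+s}(x, y)
rhs = chapman_kolmogorov_rhs(t, s, x, y)  # integral of p_t(z, y) p_s(x, z) dz
```

The two values agree up to quadrature error, reflecting the fact that the convolution of two centered Gaussians is a Gaussian whose variance is the sum of the variances.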
for every, say, positive or bounded measurable function f on the product space
E × · · · × E. When the distribution of such a k-dimensional vector (Xt1 , . . . , Xtk ) is
specified, it determines the distributions of each extracted lower dimensional vector
(for example, the law of (Xt1 , Xt3 ) can be deduced from that of (Xt1 , Xt2 , Xt3 )). It
is thus necessary that the system of finite-dimensional distributions be compatible,
which is precisely the content of the Chapman-Kolmogorov equation.
Equation (1.3.3) describes the law of the process (Xt )t≥0 starting at X0 = x. If
the initial distribution of X0 is given by some other measure ν, then the law of
(X0 , Xt1 , . . . , Xtk ) is given by
\[ \mathbb{E}\big( f(X_0, X_{t_1}, \ldots, X_{t_k}) \big) = \int_{E \times \cdots \times E} f(y_0, y_1, \ldots, y_k) \, p_{t_k - t_{k-1}}(y_{k-1}, dy_k) \cdots p_{t_1}(y_0, dy_1) \, \nu(dy_0). \tag{1.3.4} \]
Given a Markov semigroup P = (Pt )t≥0 , we therefore have candidates for the
distribution of a Markov process (Xt )t≥0 starting at a point x ∈ E. This entails a
correspondence between Markov processes and Markov semigroups (even without
the continuity condition (vi)). However, as already mentioned, the preceding finite-
dimensional description is not enough in general to characterize the full law of the
Markov process and its regularity properties. Usually, it does not even describe the
joint laws of the processes starting from different initial values $x$ and $y$, as is the case
for example when solving stochastic differential equations. In fact, the process often
lives on a topological space, and one then looks for a process such that the
trajectories (the random maps $t \mapsto X_t(\omega)$ for $\omega \in \Omega$) are continuous, or right-continuous
with left limits (càdlàg). It requires extra work to construct these laws as probability
measures on the space of continuous (or càdlàg) maps on R+ with values in the state
space E. This question will not be addressed in this book, since most of the time the
concrete processes will be explicitly given, for example as solutions of stochastic
differential equations (see Sect. 1.10).
X associated with the semigroup P, the generator L will also be called the Markov
generator of X.
The linearity of the operators Pt , t ≥ 0, together with the semigroup property,
shows that L is the derivative of Pt at any time t > 0. Namely, for t, s > 0,
\[ \frac{1}{s} [P_{t+s} - P_t] = P_t \Big( \frac{1}{s} [P_s - \mathrm{Id}] \Big) = \frac{1}{s} [P_s - \mathrm{Id}] \, P_t. \]
To a Markov semigroup P = (Pt )t≥0 and its infinitesimal generator L, with (L2 (μ)-)
domain D(L), is naturally associated a bilinear form. Assume that we are given a
vector subspace A of the domain D(L) such that for every pair (f, g) of functions
in A, the product f g is in the domain D(L) (A is an algebra).
The definition of the carré du champ operator is clearly subordinate to the algebra
A. Such a class A will be mostly natural in a given context (for example smooth
functions on a manifold) and will be carefully described and discussed in Chap. 3.
1.4 Infinitesimal Generators and Carré du Champ Operators
The carré du champ operator $\Gamma$ (which is so named for reasons which will be made
clear below) will play a crucial role throughout this work. Numerous examples will
be discussed in this monograph, but it may already be useful to mention the
simple example of the Laplacian $L = \Delta$ on $\mathbb{R}^n$ giving rise to the standard carré du
champ operator $\Gamma(f,g) = \nabla f \cdot \nabla g$ (the usual scalar product of the gradients of $f$
and $g$) for smooth functions $f, g$ on $\mathbb{R}^n$.
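The Laplacian example can be checked numerically in dimension one. The sketch below assumes the standard definition $\Gamma(f,g) = \frac{1}{2}\big[ L(fg) - f\,Lg - g\,Lf \big]$, which is not reproduced in this excerpt, and approximates $L = d^2/dx^2$ by central differences. The test functions, the evaluation point and the step size are arbitrary illustrative choices.

```python
import math

def second_derivative(u, x, h=1e-3):
    """Central-difference approximation of u''(x)."""
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)

def carre_du_champ(f, g, x):
    """Gamma(f, g)(x) = (1/2) [L(fg) - f Lg - g Lf](x) with L = d^2/dx^2 (assumed definition)."""
    fg = lambda y: f(y) * g(y)
    return 0.5 * (second_derivative(fg, x)
                  - f(x) * second_derivative(g, x)
                  - g(x) * second_derivative(f, x))

f = math.sin
g = lambda y: y ** 3
x = 0.7

gamma_fg = carre_du_champ(f, g, x)
expected = math.cos(x) * 3.0 * x ** 2   # f'(x) g'(x), the scalar product of the gradients
gamma_ff = carre_du_champ(f, f, x)      # should approximate cos(x)^2, hence be positive
```

Up to discretization error, `gamma_fg` matches $f'(x) g'(x)$, in line with $\Gamma(f,g) = \nabla f \cdot \nabla g$ for the Laplacian.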
If $f \in \mathcal{A}$, since $P_t(f^2) \geq (P_t f)^2$ for every $t \geq 0$ by (1.2.1), in the limit as $t \to 0$,
$L(f^2) \geq 2 f \, Lf$. It follows that the carré du champ operator is positive on $\mathcal{A}$ in the
sense that
\[ \Gamma(f, f) \geq 0, \quad f \in \mathcal{A}. \tag{1.4.2} \]
This is a fundamental property of Markov semigroups. To somewhat lighten the
notation, we set $\Gamma(f) = \Gamma(f, f)$, $f \in \mathcal{A}$. Observe also that by bilinearity, (1.4.2)
immediately yields the Cauchy-Schwarz inequality
But now, if $\phi(P_t f)$ is in the domain D(L), then the integral on the right-hand side
of the preceding inequality is zero since μ is invariant. Therefore $t \mapsto \int_E \phi(P_t f)\,d\mu$ is decreasing
so that
\[
\int_E \phi(P_t f)\,d\mu \le \int_E \phi(f)\,d\mu
\]
for every t ≥ 0. In particular, for the convex function φ(r) = |r|, and with f ≥ 0,
\[
\int_E |P_t f|\,d\mu \le \int_E |f|\,d\mu = \int_E f\,d\mu = \int_E P_t f\,d\mu. \tag{1.4.5}
\]
\[
L\phi(f) = \phi'(f)\,Lf + \phi''(f)\,\Gamma(f)
\]
From the abstract semigroup viewpoint, the stochastic representation of the mar-
tingale problem (1.4.7) indicates that, for every t ≥ 0,
\[
P_t = \mathrm{Id} + \int_0^t P_s L\,ds. \tag{1.4.9}
\]
The second formulation (1.4.8) on the other hand is concerned with the heat equa-
tion ∂t G = LG (G(0, x) = f (x)). To formally solve it, it suffices to consider the
exponential of the operator L as
\[
Q_t f = e^{tL} f = \sum_{k \ge 0} \frac{t^k}{k!}\,L^k f \tag{1.4.10}
\]
which clearly satisfies $\partial_t Q_t = L Q_t$, and thus $P_t = Q_t = e^{tL}$. Hence, the first
formulation (1.4.7) expresses that $\partial_t P_t = P_t L$ while the second (1.4.8) indicates that
∂t Pt = LPt . The fact that the operator L commutes with Pt actually tells us that
Pt is in a sense a function of L. This statement will be given a precise meaning in
various examples below, but this basic observation can serve as a guide throughout
this monograph. Note, however, that it is quite impossible, given the
representation (1.4.10) of $P_t = e^{tL}$, to analyze properties such as positivity preservation,
which is central to the analysis of Markov semigroups.
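In finite dimension, the series (1.4.10) can be summed directly. The following is a minimal sketch, assuming a hypothetical 3-state generator matrix (rows summing to zero, positive off-diagonal entries) chosen only for illustration. It checks numerically the semigroup property, the commutation of L with $P_t$, and the fact that the resulting matrix is stochastic, even though positivity is not apparent from the individual terms of the series.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(L, t, terms=60):
    """Truncated series sum_{k < terms} (t^k / k!) L^k, as in (1.4.10)."""
    n = len(L)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]              # current term t^k L^k / k!
    for k in range(1, terms):
        term = [[entry * t / k for entry in row] for row in mat_mul(term, L)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

# Hypothetical generator: rows sum to zero, off-diagonal entries are positive.
L = [[-2.0, 1.0, 1.0],
     [0.5, -1.0, 0.5],
     [1.0, 3.0, -4.0]]

P_half = mat_exp(L, 0.5)
P_one = mat_exp(L, 1.0)

print(max_diff(mat_mul(P_half, P_half), P_one))          # semigroup property
print(max_diff(mat_mul(L, P_one), mat_mul(P_one, L)))    # L commutes with P_t
print([sum(row) for row in P_one])                       # rows sum to one
```

The truncation at 60 terms is an arbitrary choice that is ample for this small matrix; it is not a general-purpose method for computing matrix exponentials.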
where Lx denotes the operator L acting on the x variable. This expresses that
∂t Pt f = LPt f.
But one may similarly consider the dual equation, called the Fokker-Planck equation
(or Kolmogorov forward equation), for t > 0,
where L∗ is the adjoint of L with respect to the reference measure m in the sense
that
\[
\int_E f\,L^* g\,dm = \int_E g\,Lf\,dm
\]
for suitable functions f, g (see Sect. A.2, p. 475). The latter identity (1.5.2) then
expresses that
∂t Pt f = Pt Lf.
This dual Fokker-Planck equation may be obtained, at least formally, admitting that
differentiation in t > 0 commutes with the integral in y, via the identities
\[
\int_E \partial_t p_t(x, y)\,f(y)\,dm(y) = \int_E p_t(x, y)\,Lf(y)\,dm(y)
= \int_E f(y)\,L^*_y\,p_t(x, y)\,dm(y)
\]
Markov processes. As before, denote by P = (Pt )t≥0 a Markov semigroup with state
space (E, F), invariant measure μ and infinitesimal generator L with L2 (μ)-domain
D(L). The associated Markov process is denoted by X = {Xtx ; t ≥ 0, x ∈ E}.
If the semigroup P = (Pt )t≥0 admits density kernels pt (x, y), t > 0,
(x, y) ∈ E × E, with respect to μ, then pt is a symmetric function on E × E.
If μ is a probability measure, the symmetry property for f = 1 shows that μ is
invariant. The converse is not true in general. Under reasonable minimal assump-
tions, an invariant measure is unique. Such a measure may or may not be reversible.
Hence Markov semigroups are divided into two classes, according to whether the in-
variant measure is reversible or not. In the fundamental examples studied in Chap. 2
we will learn how to distinguish between these two classes.
From a probabilistic point of view, the name reversible refers to reversibility
in time of the associated Markov process whenever the initial law is the invariant
measure. Indeed, from (1.2.3), we know that if the process starts from the invari-
ant distribution μ, then it keeps the same distribution for any time. Moreover, if
the measure is reversible, and if the initial distribution of X0 is μ, then for any
t > 0 and any partition $0 \le t_1 \le \cdots \le t_k \le t$ of the time interval [0, t], the law of
$(X_0, X_{t_1}, \ldots, X_{t_k}, X_t)$ is the same as the law of $(X_t, X_{t-t_1}, \ldots, X_{t-t_k}, X_0)$. This is
easily seen from (1.3.4) when the initial measure ν is μ since in this case the
measure $p_t(x, dy)\,\mu(dx)$ is symmetric in (x, y), and then, by an immediate induction,
the measures
\[
p_{t_k - t_{k-1}}(y_{k-1}, dy_k)\; p_{t_{k-1} - t_{k-2}}(y_{k-2}, dy_{k-1}) \cdots p_{t_1}(y_0, dy_1)\,\mu(dy_0)
\]
are invariant under the change $(y_0, \ldots, y_k) \mapsto (y_k, \ldots, y_0)$. Therefore, for any t > 0,
the law of the process $(X_s)_{0 \le s \le t}$ is the same as the law of the process $(X_{t-s})_{0 \le s \le t}$.
Hence, the law of the Markov process is “reversible in time”.
In terms of the generator L of the semigroup P, the reversibility property may be
expressed as
\[
\int_E f\,Lg\,d\mu = \int_E g\,Lf\,d\mu \tag{1.6.2}
\]
Lemma 1.6.2 (Rota’s Lemma) Let P = (Pt )t≥0 be a Markov semigroup symmetric
with respect to μ. Then, for any 1 < p < ∞, there exists a Cp > 0 such that for any
measurable function f : E → R,
\[
\Big\| \sup_{s \ge 0} P_s f \Big\|_p \le C_p\,\|f\|_p .
\]
Proof We sketch the proof in the case where μ is a probability measure (the general
case has to be adapted with some care). It suffices to assume that f is positive
and, by approximation, bounded. Let (Xt )t≥0 be the Markov process with initial
1.6 Symmetric Markov Semigroups 27
distribution μ associated with (Pt )t≥0 . For T > 0, consider the (positive) martingale
Mt = PT −t (f )(Xt ), t ∈ [0, T ] (cf. Sect. 1.4.3). By Doob’s maximal inequality,
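\[
\mathbb{E}\Big(\sup_{0 \le t \le T} M_t^p\Big) \le C_p^p\,\mathbb{E}\big(M_T^p\big)
\]
for some constant $C_p > 0$. But $M_T = f(X_T)$ and since (by invariance) the law of
$X_T$ is μ, $\mathbb{E}(M_T^p) = \|f\|_p^p$. Furthermore, conditional expectation is a contraction in
$L^p(\mathbb{P})$, so that
\[
\mathbb{E}\Big(\Big[\mathbb{E}\Big(\sup_{0 \le t \le T} M_t \,\Big|\, X_T\Big)\Big]^p\Big) \le C_p^p\,\|f\|_p^p .
\]
Now reversibility ensures that the law of $(X_t, X_T)$ is the same as that of $(X_{T-t}, X_0)$
so that $\mathbb{E}(M_t \mid X_T) = P_{2(T-t)} f(X_T)$. It follows that
\[
\mathbb{E}\Big(\sup_{0 \le t \le T} M_t \,\Big|\, X_T\Big) \ge \sup_{0 \le t \le T} \mathbb{E}(M_t \mid X_T) = \sup_{0 \le t \le T} P_{2(T-t)} f(X_T).
\]
In conclusion
\[
\Big\| \sup_{0 \le t \le 2T} P_t f \Big\|_p \le C_p\,\|f\|_p
\]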
When the measure μ is symmetric, the generator L and the semigroup $P = (P_t)_{t \ge 0}$
are completely described by the measure μ and the carré du champ operator Γ from
Definition 1.4.2. Indeed, for all functions f, g in the algebra $\mathcal{A} \subset D(L)$ on which Γ
is defined,
\[
\int_E \Gamma(f, g)\,d\mu = - \int_E f\,Lg\,d\mu. \tag{1.6.3}
\]
The operator −L is therefore a positive operator, and for such positive operators,
the theory of symmetric and self-adjoint operators is much easier, as presented in
Appendix A.
As the carré du champ operator Γ and the measure μ completely determine the
symmetric Markov generator L, we will mostly work throughout this monograph
with what will be called a Markov Triple (E, μ, Γ) consisting of a (σ-finite) measure
μ on a state space (E, F) and a carré du champ operator Γ (on some suitable class
$\mathcal{A}$ of functions on E). This framework will be carefully presented in Chap. 3. The
adoption of this framework is particularly justified in our investigation of functional
inequalities in later chapters. As a standard example, consider the Lebesgue measure
on the Borel sets of $\mathbb{R}^n$ and the carré du champ operator $\Gamma(f, g) = \nabla f \cdot \nabla g$ for
smooth functions f, g on $\mathbb{R}^n$. Further basic examples will be discussed in this and
the next chapter.
In the last part of this section, we collect a few observations on the Chapman-
Kolmogorov and Fokker-Planck heat equations in the symmetric context.
We first briefly revisit the Chapman-Kolmogorov and Fokker-Planck equations
for a symmetric Markov semigroup (Pt )t≥0 , and draw some useful conclusions.
Thus let μ be reversible for L. The operators Pt , t ≥ 0, are then symmetric in
L2 (μ) and often admit kernel densities with respect to μ, denoted pt (x, y), t > 0,
(x, y) ∈ E × E, which are symmetric in (x, y).
As discussed in Appendix A, Sect. A.2, p. 475, while symmetry and self-
adjointness might not coincide for unbounded operators, the generator L of a sym-
metric semigroup is self-adjoint, that is L = L∗ where the adjoint operator L∗ is
computed with respect to the invariant measure μ. Now, from the heat and Fokker-
Planck equations (1.5.1) and (1.5.2),
In particular
Lx pt (x, y) = Ly pt (x, y).
Recall that the notations $L_x$ and $L_y$ are to emphasize the action on the respective
variables x and y of the kernel $p_t(x, y)$. This is however not a simple consequence
of the symmetry property. For example, on $\mathbb{R}^2$, the function $h(x, y) = x^4 + y^4$ is
symmetric in (x, y), but $\partial_x^2 h \ne \partial_y^2 h$. On the other hand, on $\mathbb{R}^n$, all smooth functions
F of $|x - y|^2$ satisfy $\Delta_x F = \Delta_y F$. This will be the case for the heat kernels of the
standard Laplacian.
The second illustration is related to the trace of the semigroup. Namely, the
Chapman-Kolmogorov equation (1.3.2) indicates that the scalar product in L2 (μ)
1.7 Dirichlet Forms and Spectral Decompositions 29
of the functions z → ps (x, z) and z → pt (z, y) is pt+s (x, y). In particular, by the
Cauchy-Schwarz inequality, for all t, s > 0 and (x, y) ∈ E × E,
Note that this inequality, when applied for t = s (and after the change $t \mapsto t/2$),
indicates that, for every t > 0, the function $p_t(x, y)$ of (x, y) achieves its maximum
on the diagonal x = y. Moreover, in the symmetric case, and provided the following
expression makes sense, one has
\[
\int_E p_{2t}(x, x)\,d\mu(x) = \int_E \int_E p_t(x, y)^2\,d\mu(x)\,d\mu(y). \tag{1.6.4}
\]
When the latter is finite, the operator Pt is Hilbert-Schmidt (see Sect. A.6, p. 483,
for more information about Hilbert-Schmidt operators). In terms of the associated
Markov process X = {Xtx ; t ≥ 0, x ∈ E}, pt (x, x) is the probability that Xt returns
at time t > 0 to its initial point x (as for the Brownian bridge for example). Of
course, this naive representation is formal since in general this probability vanishes.
This basic formula however links the distribution of the so-called bridge process to
the trace of the semigroup.
for any f, g ∈ A.
The bilinear form E may actually be defined on a class of functions larger than
A. To this end, observe first that from (1.7.1), E(f ) = E(f, f ) may be defined for
any function f ∈ D(L). Also (and this type of argument will be used constantly
throughout this book), for any f ∈ L2 (μ) and any t > 0,
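\[
\partial_t \int_E (P_t f)^2\,d\mu = 2 \int_E P_t f\,LP_t f\,d\mu = -2\,\mathcal{E}(P_t f)
\]
and
\[
\begin{aligned}
\partial_t\,\mathcal{E}(P_t f) &= -\partial_t \int_E P_t f\,LP_t f\,d\mu \\
&= - \int_E (LP_t f)^2\,d\mu - \int_E P_t f\,L^2 P_t f\,d\mu \\
&= -2 \int_E (LP_t f)^2\,d\mu.
\end{aligned} \tag{1.7.2}
\]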
is decreasing in t > 0. But the latter may be rewritten by the reversibility property
as $\frac{1}{t} \int_E f (f - P_t f)\,d\mu$, and therefore, for f ∈ D(L), it converges to $\mathcal{E}(f)$ when t
tends to 0.
This observation allows us to extend the definition of E(f ) to a wider class of
functions, defining D(E) as the set of functions f ∈ L2 (μ) for which the quantity
\[
\frac{1}{t} \Big[\int_E f^2\,d\mu - \int_E (P_{t/2} f)^2\,d\mu\Big] = \frac{1}{t} \int_E f (f - P_t f)\,d\mu
\]
has a finite (decreasing) limit as t decreases to 0, and E(f ) to be this limit for
f ∈ D(E).
Definition 1.7.1 (Dirichlet form) For a symmetric Markov semigroup P = (Pt )t≥0
with reversible measure μ, the energy E(f ) is defined as the limit as t → 0 of
\[
\frac{1}{t} \int_E f (f - P_t f)\,d\mu
\]
for all functions f ∈ L2 (μ) for which this limit exists, defining in this way the
domain D(E). The Dirichlet form E(f, g) is defined by polarization for f and g in
the Dirichlet domain D(E), and E(f ) = E(f, f ).
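The definition can be tested on the simplest symmetric example. The following is a minimal sketch, assuming a two-point state space E = {0, 1} with the hypothetical generator L = [[-a, a], [a, -a]] and the uniform probability measure, for which $e^{tL}$ is known in closed form. The rate a and the test function f are illustrative values. The quotient $\frac{1}{t}\int_E f(f - P_t f)\,d\mu$ is evaluated for decreasing t and compared with $-\int_E f\,Lf\,d\mu$.

```python
import math

a = 1.5                      # jump rate (illustrative value)
mu = [0.5, 0.5]              # uniform measure, reversible for the symmetric generator
f = [2.0, -1.0]              # arbitrary test function on E = {0, 1}

def semigroup(t, h):
    """P_t h for L = [[-a, a], [a, -a]], using the closed form of exp(tL)."""
    decay = math.exp(-2.0 * a * t)
    stay, move = 0.5 * (1.0 + decay), 0.5 * (1.0 - decay)
    return [stay * h[0] + move * h[1], move * h[0] + stay * h[1]]

def energy_quotient(t, h):
    """(1/t) * integral of h (h - P_t h) with respect to mu."""
    Pth = semigroup(t, h)
    return sum(h[x] * (h[x] - Pth[x]) * mu[x] for x in range(2)) / t

def dirichlet_energy(h):
    """E(h) = - integral of h Lh with respect to mu."""
    Lh = [a * (h[1] - h[0]), a * (h[0] - h[1])]
    return -sum(h[x] * Lh[x] * mu[x] for x in range(2))

times = (1.0, 0.1, 0.01, 0.001)
quotients = [energy_quotient(t, f) for t in times]
print(quotients, dirichlet_energy(f))
```

In this example the quotients grow monotonically as t decreases and approach the value $\frac{a}{2}(f(0) - f(1))^2$ returned by `dirichlet_energy`.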
If f, g ∈ D(L),
\[
\mathcal{E}(f, g) = \int_E f (-Lg)\,d\mu = \int_E g (-Lf)\,d\mu,
\]
Hence D(L) ⊂ D(E) ⊂ L2 (μ). These objects will be further investigated in Chap. 3
where in particular D(E) will be constructed first and then D(L) will be extracted
from it.
\[
-L e_k = \lambda_k e_k, \quad k \in \mathbb{N}.
\]
In this case, by the integration by parts formula (1.6.3), for every integer k,
\[
\mathcal{E}(e_k) = \int_E \Gamma(e_k, e_k)\,d\mu = - \int_E e_k\,L e_k\,d\mu = \lambda_k \int_E e_k^2\,d\mu = \lambda_k .
\]
shows that
\[
P_t f(x) = \int_E f(y)\,p_t(x, y)\,d\mu(y)
\]
in accordance with the kernel representation (1.2.4). That the density kernels
pt (x, y) thus defined are positive is however not easy to see directly, even in simple
examples. From (1.6.4), the trace formula
\[
\int_E p_t(x, x)\,d\mu(x) = \sum_{k \in \mathbb{N}} e^{-t\lambda_k}, \quad t > 0, \tag{1.7.4}
\]
follows. Hence, information on the functions pt (x, x) yields estimates on the spec-
trum of L. Of course, there are numerous examples where the operator admits den-
sities without being Hilbert-Schmidt (the standard Euclidean heat semigroup being
the simplest example).
Contrary to the finite-dimensional case, in infinite dimension, symmetric or self-
adjoint operators L do not always admit an orthonormal basis of eigenvectors. This
is replaced in the general setting by a spectral decomposition. As described in
Sect. A.4, p. 478, the spectrum, which is the set of λ ∈ R such that L − λId is
not invertible, is actually divided into two parts, the discrete spectrum, which corre-
sponds to isolated eigenvalues associated with a finite-dimensional eigenspace, and
the essential spectrum. For example, the Laplace operator on Rn , Markov gener-
ator of the heat or Brownian semigroup, has an empty discrete spectrum. In ideal
settings, as described later in Chap. 4, the essential spectrum is entirely determined
by the behavior of L at infinity. This is in particular the case for Persson operators,
which are developed in Sect. 4.10.3, p. 227.
1.8 Ergodicity
In probability theory, ergodic properties usually relate to the long time behavior. In
the context of Markov processes $\{X_t^x ;\, t \ge 0,\, x \in E\}$, and when the invariant
measure μ is a probability measure, it is in general expected that quantities such as
\[
\frac{1}{t} \int_0^t f(X_s^x)\,ds
\]
converge almost surely as t → ∞ to $\int_E f\,d\mu$, whatever the starting point x and for
suitable functions f . Although such results may in general be deduced from the
analysis developed in this monograph (see for example Sect. 1.15.9), we adopt here
a less ambitious approach. Ergodicity in our context will relate to the convergence
1.9 Markov Chains 33
continuous paths), it is of interest to briefly test and motivate the preceding abstract
definitions and properties of Markov semigroups and their infinitesimal generators
on the concrete example of a Markov chain on a finite or countable state space. Fur-
thermore, many of the difficulties that will be encountered in general may already be
described in the context of Markov chains on a countable state space. Indeed, in this
simple setting, generators appear as matrices L for which the associated semigroup
is nothing else than $e^{tL}$, that is the usual exponential of matrices. In particular, it is
natural in this context to start from the Markov generator and define from it the as-
sociated semigroup. Most of the properties of general Markov semigroups and gen-
erators have an analogue in this context, and this setting therefore provides a rich
source of ideas and basic counterexamples. Moreover, both the finite and count-
able models also have great importance when one considers numerical estimates.
For example, when solving a partial differential equation such as the heat equation,
discretization of space is often considered. When discretizing differential operators
like those described in Sect. 1.11 below, the resulting operators are generators of
Markov processes on finite or countable spaces.
In the following, we (briefly) review for the example of Markov chains on a finite
or countable state space some of the objects and properties considered in the abstract
framework of the previous sections. We refer the reader to general references on
Markov chains for a more complete picture.
We begin with one of the simplest cases, a finite state space. Thus given a finite set E
(equipped with the σ -field F of all its subsets), consider, as domain of any Markov
generator L, the vector space of all real-valued functions on E (with dimension the
cardinality of E, say N). On the basis of the functions $f_x = 1_x$, x ∈ E, a Markov
generator L can be represented as a matrix $(L(x, y))_{(x,y) \in E \times E}$.
In this context, the (Markov) property L(1) = 0 expresses that, for every x ∈ E,
$\sum_{y \in E} L(x, y) = 0$. The carré du champ operator is given by
\[
\Gamma(f)(x) = \frac{1}{2} \sum_{y \in E} L(x, y)\,\big[f(x) - f(y)\big]^2, \quad x \in E.
\]
Hence, the carré du champ operator is positive if and only if all off-diagonal terms
of the matrix L are positive. A square matrix L = L(x, y), (x, y) ∈ E × E, with the
two properties $\sum_{y \in E} L(x, y) = 0$ for all x ∈ E, and L(x, y) ≥ 0 for all x ≠ y, is
called a Markov generator on the finite state space E.
The easiest way to produce such a Markov generator is to consider a stochas-
tic (or Markov) matrix P and to set L = λ(P − Id) for some λ > 0. The matrix
P = (P (x, y))(x,y)∈E×E actually represents the transition probabilities of a Markov
chain jumping at integer times from x to y with probability P (x, y). The continuous
time Markov process associated with P is obtained from the chain by replacing the
times between the jumps by independent exponential variables with parameter λ.
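The construction L = λ(P − Id) and the positivity of the carré du champ operator can be illustrated as follows. This is a minimal sketch, assuming a hypothetical 3-state stochastic matrix P, an arbitrary rate λ, and the convention $\Gamma(f) = \frac{1}{2}[L(f^2) - 2 f\,Lf]$ (hence the factor 1/2 in the jump formula). It compares the two expressions of Γ(f) entry by entry.

```python
lam = 2.0                                  # jump rate lambda (illustrative value)
P = [[0.0, 0.5, 0.5],                      # hypothetical stochastic matrix
     [0.25, 0.5, 0.25],
     [1.0, 0.0, 0.0]]
n = len(P)

# Markov generator L = lambda (P - Id).
L = [[lam * (P[x][y] - (1.0 if x == y else 0.0)) for y in range(n)] for x in range(n)]

def apply(L, f):
    """(Lf)(x) = sum_y L(x, y) f(y)."""
    return [sum(L[x][y] * f[y] for y in range(n)) for x in range(n)]

def gamma_from_definition(L, f):
    """Gamma(f) = (1/2) [L(f^2) - 2 f Lf] (assumed convention)."""
    Lf, Lf2 = apply(L, f), apply(L, [v * v for v in f])
    return [0.5 * (Lf2[x] - 2.0 * f[x] * Lf[x]) for x in range(n)]

def gamma_from_jumps(L, f):
    """Gamma(f)(x) = (1/2) sum_y L(x, y) (f(x) - f(y))^2."""
    return [0.5 * sum(L[x][y] * (f[x] - f[y]) ** 2 for y in range(n)) for x in range(n)]

f = [1.0, -2.0, 0.5]                       # arbitrary test function
print(gamma_from_definition(L, f))
print(gamma_from_jumps(L, f))
```

The agreement of the two lists relies only on the rows of L summing to zero; positivity of the result relies on the off-diagonal entries being positive.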
for all (x, y) ∈ E × E (also known as the detailed balance condition). Summing over
x this identity shows that μ is invariant, so that in this particular case, reversibility
implies invariance. Observe that when the matrix L is symmetric in the ordinary
sense, then the uniform measure on E is reversible for L. Moreover, when L is
irreducible, μ(x) > 0 for every x ∈ E, and the matrix
\[
R(x, y) = \sqrt{\mu(x)}\; L(x, y)\; \frac{1}{\sqrt{\mu(y)}}, \quad (x, y) \in E \times E,
\]
is symmetric in the usual sense. It can then be diagonalized with respect to an
orthonormal basis, which means that there is a basis of eigenvectors of L which
is orthonormal in $L^2(\mu)$. The eigenvalues are therefore negative, and if $e_k$, $\lambda_k$,
k = 0, . . . , N, denote respectively the eigenvectors and eigenvalues of −L, then,
for every k,
\[
e^{tL}(e_k) = e^{-\lambda_k t}\,e_k .
\]
If the vectors $e_k$ are normalized in $L^2(\mu)$, that is, if $\sum_{x \in E} e_k^2(x)\,\mu(x) = 1$, then
\[
\sum_{k=0}^{N} e^{-\lambda_k t}\,e_k(x)\,e_k(y)
\]
is positive for every pair (x, y) in E × E and every t ≥ 0 (this is not obvious).
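The detailed balance condition and the symmetrization of L can be checked on a small example. The following is a minimal sketch, assuming a hypothetical birth-death generator on E = {0, 1, 2, 3} with illustrative rates. The measure μ is obtained by solving detailed balance recursively, after which invariance and the symmetry of R are verified numerically.

```python
import math

# Illustrative birth-death rates on E = {0, 1, 2, 3} (hypothetical values).
birth = [1.0, 2.0, 0.5]   # rate of the jump x -> x + 1
death = [3.0, 1.0, 4.0]   # rate of the jump x + 1 -> x
n = len(birth) + 1

L = [[0.0] * n for _ in range(n)]
for x in range(n - 1):
    L[x][x + 1] = birth[x]
    L[x + 1][x] = death[x]
for x in range(n):
    L[x][x] = -sum(L[x])          # diagonal chosen so that rows sum to zero

# Detailed balance mu(x) L(x, x+1) = mu(x+1) L(x+1, x), solved recursively.
mu = [1.0]
for x in range(n - 1):
    mu.append(mu[x] * birth[x] / death[x])
total = sum(mu)
mu = [m / total for m in mu]      # normalized into a probability measure

# Symmetrized matrix R(x, y) = sqrt(mu(x)) L(x, y) / sqrt(mu(y)).
R = [[math.sqrt(mu[x]) * L[x][y] / math.sqrt(mu[y]) for y in range(n)] for x in range(n)]

invariance_defect = max(abs(sum(mu[x] * L[x][y] for x in range(n))) for y in range(n))
symmetry_defect = max(abs(R[x][y] - R[y][x]) for x in range(n) for y in range(n))
print(mu, invariance_defect, symmetry_defect)
```

Both defects vanish up to rounding, in line with the statement that reversibility implies invariance and that R is symmetric in the usual sense.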
Beyond the simplest case of finite spaces, the next case to consider is when the space
is countable. It can thus be assumed that E = N but we tend to prefer the abstract
notation E below, which is particularly justified when one has to take into account
the geometric aspects of the state space E. For example, random walks on Z and
Zn do not have the same behavior concerning recurrence and transience (that is,
whether or not the corresponding Markov process tends to infinity with time).
In the context of a countable state space E, a Markov semigroup P = (Pt )t≥0
is described by an “infinite matrix” of positive kernels (pt (x, y))(x,y)∈E×E , t ≥ 0,
such that for all t ≥ 0 and x ∈ E, and any positive function f on E,
\[
P_t f(x) = \sum_{y \in E} f(y)\,p_t(x, y).
\]
Solving such an equation requires some properties of the matrix L. Among them,
note that whenever x ≠ y, L(x, y) ≥ 0, and, for every x ∈ E, $\sum_{y \in E} L(x, y) = 0$.
As in the case of a finite space, the carré du champ operator is given by
\[
\Gamma(f)(x) = \frac{1}{2} \sum_{y \in E} L(x, y)\,\big[f(x) - f(y)\big]^2, \quad x \in E
\]
(on, again, say finitely supported functions), and an invariant measure μ for L is
characterized by
\[
\sum_{x \in E} \mu(x)\,L(x, y) = 0
\]
for every y ∈ E. Again, finding an invariant measure μ requires the solution of an
infinite-dimensional eigenvector problem. The measure μ is reversible if
\[
\mu(x)\,L(x, y) = \mu(y)\,L(y, x)
\]
for all (x, y) ∈ E × E (detailed balance property). Hence, in order for μ to be
reversible the ratios $\frac{L(x,y)}{L(y,x)}$ must be expressible in the form $\frac{R(x)}{R(y)}$ for some function R.
A reversible measure may then be easily defined directly from the generator L,
fixing the value at one point $x_0 \in E$, say $\mu(x_0) = 1$, and setting $\mu(x) = \frac{L(x_0, x)}{L(x, x_0)}$, x ∈ E.
If this measure is finite, it can be normalized into a probability measure. An invariant
measure will exist and is unique in the recurrent case. When the underlying Markov
chain is transient, there may exist many invariant measures.
It is also of interest to describe the paths of the Markov process (or family of
Markov processes) X = {Xtx ; t ≥ 0, x ∈ E} with generator L in these finite or count-
able contexts. Working directly in the countable state space setting, we introduce the
Markov matrix
\[
K(x, y) = - \frac{L(x, y)}{L(x, x)}, \quad x \ne y, \qquad K(x, x) = 0.
\]
We therefore exclude the possibility that L(x, x) = 0 for any x (such a point would
be a trap point, meaning that when the process arrives at this point, it stays there
forever). Look first at the discrete time Markov chain $(\hat{X}_k)_{k \in \mathbb{N}}$ with matrix K as
Markov kernel, that is the sequence of random variables with $\hat{X}_0 = x$ such that, for
all k ∈ N and (x, y) ∈ E × E,
\[
\mathbb{P}\big(\hat{X}_{k+1} = y \mid \hat{X}_k = x\big) = K(x, y).
\]
This process is known as the underlying Markov chain of the process X. Then the
associated Markov process X may be described in the following terms. For each k ∈ N,
if $T_k$ is the k-th jump time of $(X_t)_{t \ge 0}$, the sequence $(X_{T_k})_{k \in \mathbb{N}}$ is a Markov chain
which follows the same law as $(\hat{X}_k)_{k \in \mathbb{N}}$. The interval $T_{k+1} - T_k$ between the jumps
follows an exponential law with parameter $-L(X_{T_k}, X_{T_k})$, independently of what
happened before $T_k$. In other words, when the process arrives at some point x ∈ E,
it waits for an exponential time with parameter −L(x, x), and then jumps to y with
probability K(x, y).
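The description of the paths translates directly into a simulation. The following is a minimal sketch, assuming a hypothetical 3-state generator without trap points. The function names, the rates, the time horizon and the random seed are illustrative choices. At each step the process waits an exponential time with parameter −L(x, x) and then jumps according to the row K(x, ·).

```python
import random

# Illustrative generator on E = {0, 1, 2} (hypothetical rates, no trap point).
L = [[-2.0, 1.5, 0.5],
     [1.0, -1.0, 0.0],
     [2.0, 2.0, -4.0]]

def jump_kernel(L):
    """K(x, y) = -L(x, y) / L(x, x) for x != y, and K(x, x) = 0."""
    n = len(L)
    return [[0.0 if x == y else -L[x][y] / L[x][x] for y in range(n)] for x in range(n)]

def simulate(L, x0, horizon, rng):
    """Return the list of (jump time, new state) pairs up to the time horizon."""
    K = jump_kernel(L)
    states = list(range(len(L)))
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        t += rng.expovariate(-L[x][x])              # exponential holding time at x
        if t > horizon:
            return path
        x = rng.choices(states, weights=K[x])[0]    # jump with probabilities K(x, .)
        path.append((t, x))

K = jump_kernel(L)
path = simulate(L, x0=0, horizon=10.0, rng=random.Random(0))
print(len(path), path[:3])
```

On a finite state space the rates are bounded, so only finitely many jumps occur before the horizon and the loop terminates; the explosion phenomenon requires unbounded rates on a countable space.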
This rough description hides some serious difficulties. It may happen that the
underlying Markov chain $(\hat{X}_k)_{k \in \mathbb{N}}$ is transient (that is, goes to infinity when k → ∞)
and that the jumps are so quick (when L(x, x) goes to infinity) that the process so
described goes to infinity in finite time. The actual construction of the process given
the data contained in the matrix L requires further restrictions. We refer the reader
to the standard literature for a complete account of the topic.
These discrete models may often be thought of as approximations for diffusion
processes. As a basic example, consider the Markov process on $\mathbb{Z}/N \subset \mathbb{R}$, which
jumps from x to $x \pm \frac{1}{N}$ at exponential times with parameter $N^2$. Its generator may
be described as
\[
L_N f(x) = N^2 \Big[ f\Big(x + \frac{1}{N}\Big) + f\Big(x - \frac{1}{N}\Big) - 2 f(x) \Big],
\]
which clearly converges to $Lf(x) = f''(x)$ when N goes to infinity (at least along
smooth functions). In a naive way, this reflects Donsker’s theorem which describes
convergence of a suitable renormalization of the random walk on Z to Brownian
motion.
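The convergence of $L_N$ can be observed numerically. The following is a minimal sketch, assuming the test function f = sin and an arbitrary evaluation point, for which $f'' = -\sin$.

```python
import math

def L_N(f, x, N):
    """Discrete generator N^2 [f(x + 1/N) + f(x - 1/N) - 2 f(x)]."""
    h = 1.0 / N
    return N**2 * (f(x + h) + f(x - h) - 2.0 * f(x))

x0 = 0.3                                   # illustrative evaluation point
exact = -math.sin(x0)                      # second derivative of sin at x0
errors = [abs(L_N(math.sin, x0, N) - exact) for N in (10, 100, 1000)]
print(errors)
```

For a smooth function, a Taylor expansion suggests an error of order $1/N^2$, which matches the decay of the printed values.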
Diffusion semigroups are the main objects of investigation in this work, and will be
introduced in Sect. 1.11. They occur quite naturally in the probabilistic context of
stochastic differential equations and we present here some of the intuition behind
this description. The reader is referred to Appendix B for the necessary stochas-
tic calculus background and more precisely to Sect. B.4, p. 495, for the necessary
material on diffusion processes.
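\[
\begin{aligned}
df(X_t^x) &= \sum_{i,j=1}^{n} \sigma_i^j(X_t^x)\,\partial_j f(X_t^x)\,dB_t^i + Lf(X_t^x)\,dt \\
&= dM_t^f + Lf(X_t^x)\,dt, \quad X_0^x = x,
\end{aligned}
\]
where $(M_t^f)_{t \ge 0}$ is a local martingale (given as a stochastic integral) such that
$M_0^f = 0$ and
\[
Lf = \frac{1}{2} \sum_{i,j=1}^{n} \sum_{k=1}^{n} \sigma_k^i \sigma_k^j\,\partial_{ij}^2 f + \sum_{i=1}^{n} b^i\,\partial_i f. \tag{1.10.2}
\]
In other words, $(M_t^f)_{t \ge 0}$ solves the martingale problem (1.4.6) for L. Actually
$(M_t^f)_{t \ge 0}$ is a true martingale as soon as f and Lf are bounded functions, and in
this case, taking expectations, for all t ≥ 0 and $x \in \mathbb{R}^n$,
\[
\mathbb{E}\big(f(X_t^x)\big) = f(x) + \mathbb{E}\Big(\int_0^t Lf(X_s^x)\,ds\Big). \tag{1.10.3}
\]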
\[
g(V, V) = \sum_{i,j=1}^{n} g^{ij}(x)\,V_i V_j \ge 0 \quad \text{for all } V = (V_i)_{1 \le i \le n} \in \mathbb{R}^n .
\]
Very often, the operator L is given in the so-called Hörmander form. Recall (cf. Ap-
pendix C, Sect. C.1, p. 500) that a vector field on Rn or on an open set of Rn is a
first order operator f → Zf which may be written as
\[
Zf(x) = \sum_{i=1}^{n} Z^i(x)\,\partial_i f(x).
\]
The coefficients $Z^i(x)$ are the coordinates of Z in the base $(\partial_i)_{1 \le i \le n}$. If Z is a
vector field, then $Z^2 = Z \circ Z$ is a second order differential operator. Then, given a
collection $(Z_j)_{0 \le j \le d}$ of vector fields (d does not need to be the dimension of the
space, it may be smaller or larger), the operator
\[
L = \frac{1}{2} \sum_{j=1}^{d} Z_j^2 + Z_0 \tag{1.10.5}
\]
is a second order differential operator, semi-elliptic with no 0-order term. For ex-
ample, the ordinary Laplacian on Rn is given in this form, with d = n and Zj = ∂j
for 1 ≤ j ≤ n and Z0 = 0. At least locally, an operator L given by (1.10.4) may
always be represented in this Hörmander form, so that there is no loss of generality
in restricting to operators of this form. This Hörmander form is particularly useful
when analyzing the hypoelliptic properties of L (see Sect. 1.12 below), however it
is not unique.
For such an operator L in Hörmander form (1.10.5), to construct the associated
Markov process, say on $\mathbb{R}^n$, $X = \{X_t^x ;\, t \ge 0,\, x \in \mathbb{R}^n\}$ with generator L, it is enough
to choose d independent Brownian motions $(B_t^j)_{t \ge 0}$ in $\mathbb{R}^n$, 1 ≤ j ≤ d, and to solve
the stochastic differential equation
\[
dX_t^x = \sum_{j=1}^{d} Z_j(X_t^x) \circ dB_t^j + Z_0(X_t^x)\,dt, \quad X_0^x = x. \tag{1.10.6}
\]
Here, the notation $Z_j(X_t^x) \circ dB_t^j$ means that the standard Itô integral has been
replaced by the Stratonovich integral (Sect. B.4, p. 495, in Appendix B). The previous
notation has to be understood component-wise, that is, for any coordinate $X_t^{x,i}$,
1 ≤ i ≤ n, of $X_t^x$,
\[
dX_t^{x,i} = \sum_{j=1}^{d} Z_j^i(X_t^x) \circ dB_t^j + Z_0^i(X_t^x)\,dt.
\]
This change from Itô’s integral to Stratonovich’s integral may be viewed as a mere
change of notation, but it has the distinct advantage of having a simpler chain rule
formula than the Itô integral, and therefore leads to a direct representation of L in
the desired form. The link between this stochastic differential equation and the
associated Markov semigroup as a martingale problem (1.4.6) can be seen, as before,
by checking that $df(X_t^x) = dM_t^f + Lf(X_t^x)\,dt$ where $M_t^f$ is a martingale, and this
is again the result of a direct application of Itô’s formula for Stratonovich integrals.
\[
Lf = \sum_{i,j=1}^{n} g^{ij}\,\partial_{ij}^2 f + \sum_{i=1}^{n} b^i\,\partial_i f \tag{1.11.1}
\]
where $g = g(x) = (g^{ij}(x))_{1 \le i,j \le n}$ and $b = b(x) = (b^i(x))_{1 \le i \le n}$ are smooth,
respectively n × n symmetric matrix-valued and $\mathbb{R}^n$-valued, functions of x ∈ E. The fact
that L does not involve constant terms expresses the fact that L(1) = 0. The carré du
champ operator Γ (cf. Definition 1.4.2) takes the form, on smooth functions f, g,
\[
\Gamma(f, g) = \sum_{i,j=1}^{n} g^{ij}\,\partial_i f\,\partial_j g. \tag{1.11.2}
\]
1.11 Diffusion Semigroups and Operators 43
\[
\Gamma(f, g) = \sum_{j=1}^{d} Z_j f\; Z_j g,
\]
from which positivity is immediate. It is also plain from this expression that L is
elliptic as soon as, at any point x, the vector space spanned by the vectors Zj (x),
1 ≤ j ≤ d, is Rn . This requires in particular d ≥ n.
The continuous setting of the previous examples allows for a crucial diffusion,
or change of variables, property emphasized in the following basic definition. At
this point, it is more a property of the differential operators (1.11.1) but it will
make sense in a general framework (emphasized in Chap. 3) for Markov genera-
tors L acting on some class A of (smooth) functions. For such an operator, recall
the carré du champ operator Γ of Definition 1.4.2, as well as the shorthand notation
$\Gamma(f) = \Gamma(f, f)$.
The operator L is also said to satisfy the diffusion property or change of variables
formula. In the examples of second order differential operators L considered above,
one may for example choose sufficiently smooth functions f in this definition. Pre-
cise assumptions and descriptions of relevant classes of functions will be addressed
later in Chap. 3.
There is a similar, equivalent, change of variables formula for functions of several
variables,
\[
L\Psi(f_1, \ldots, f_k) = \sum_{i=1}^{k} \partial_i \Psi(f_1, \ldots, f_k)\,Lf_i
+ \sum_{i,j=1}^{k} \partial_{ij}^2 \Psi(f_1, \ldots, f_k)\,\Gamma(f_i, f_j) \tag{1.11.4}
\]
second order differential operator, or equivalently that Γ is, in each argument, a first
order differential operator. For example,
\[
\Gamma\big(\psi(f), g\big) = \psi'(f)\,\Gamma(f, g) \tag{1.11.5}
\]
\[
\nabla h(f) = \sum_{i,j=1}^{n} g^{ij}\,\partial_j h\,\partial_i f.
\]
Alternatively, $\nabla h(f) = \sum_{i=1}^{n} Z^i\,\partial_i f$ where $Z^i = \sum_{j=1}^{n} g^{ij}\,\partial_j h$, 1 ≤ i ≤ n.
The diffusion property of a given Markov generator L is closely related to
the continuity properties of the sample paths of the associated Markov process
X = {Xtx ; t ≥ 0, x ∈ Rn }. As discussed in Sect. 1.1, given any Markov semigroup,
the associated Markov process may be constructed so that for every “nice” func-
tion f (in the domain of L), the real-valued process (f (Xt ))t≥0 is right-continuous
with left-limits. What is hidden behind the diffusion property is that these processes
(f (Xt ))t≥0 are continuous in this case. The proof of this claim requires some work
in stochastic calculus (with respect to non-continuous processes). At least, the dif-
fusion property implies a kind of continuity in measure for functions of the domain
as described in Theorem 3.1.16, p. 136.
Observe that the diffusion operators L for which the second order terms are all
zero (that is, for which the carré du champ operator is identically zero) are far
from being uninteresting. They correspond to vector fields, that is, first order
differential operators. For such an operator, the associated semigroup $P = (P_t)_{t \ge 0}$
is of the form $P_t f(x) = f(X_t^x)$ where $X_t^x$ is the path following the vector field
starting at x. It is the analogue of the Markov processes described by solutions of
stochastic differential equations, but deterministic. If the vector field is of the form
$Zf(x) = \sum_{i=1}^{n} Z^i(x)\,\partial_i f(x)$, then $X_t^x$ is a point in $\mathbb{R}^n$ whose coordinates are the
solution of the differential system $X_0 = x$, $dX_t^i = Z^i(X_t)\,dt$. The law of the process
at time t is a Dirac mass at $X_t^x$ and randomness has disappeared. Randomness (or
diffusion) arises precisely from the second order terms in the differential operator.
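The deterministic case can be made concrete with a flow known in closed form. The following is a minimal sketch, assuming the hypothetical vector field $Zf(x) = -x f'(x)$ on $\mathbb{R}$, whose flow is $X_t^x = x e^{-t}$. The test function and the evaluation point are illustrative. The code checks the semigroup property of $P_t f(x) = f(X_t^x)$ and recovers Z as the derivative of $P_t$ at t = 0.

```python
import math

def flow(t, x):
    """Solution of dX_t = -X_t dt with X_0 = x (flow of the vector field Z = -x d/dx)."""
    return x * math.exp(-t)

def P(t, f):
    """Semigroup P_t f(x) = f(X_t^x) transporting f along the flow."""
    return lambda x: f(flow(t, x))

f = math.cos                                   # illustrative test function
Zf = lambda x: -x * (-math.sin(x))             # Zf(x) = -x f'(x) for f = cos

x0, h = 1.2, 1e-6
semigroup_defect = abs(P(0.3, P(0.4, f))(x0) - P(0.7, f)(x0))
generator_defect = abs((P(h, f)(x0) - f(x0)) / h - Zf(x0))
print(semigroup_defect, generator_defect)
```

Since the carré du champ operator of a vector field vanishes identically, no averaging is involved: $P_t$ simply composes f with the flow.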
Given a diffusion operator L as in (1.11.1), let us now describe its invariant measure
when it admits a (smooth, strictly positive) density w with respect to the Lebesgue
measure dx (on E = Rn or an open subset of Rn ). This density w is given by the
equation L∗ w = 0 where L∗ is the adjoint of L with respect to the Lebesgue measure
which may be explicitly described, on smooth functions f , by
\[
L^* f = \sum_{i,j=1}^{n} \partial_{ij}^2 \big(g^{ij} f\big) - \sum_{i=1}^{n} \partial_i \big(b^i f\big). \tag{1.11.6}
\]
For a different reference measure, it is necessary to take the adjoint with respect
to this new measure (for example, when the diffusion matrix is not the identity,
there is a better choice than the Lebesgue measure as will be seen below). The
invariant measure will usually be given in the further developments, and its existence
or finiteness will be clear from the context.
In general, the adjoint operator L∗ of (1.11.6) is a second order differential oper-
ator, but including a non-zero constant term (usually called the potential in quantum
mechanics and in the study of Schrödinger equations). In the simplest instances
$L = \Delta + Z$ where Δ is the usual Laplacian on $\mathbb{R}^n$ and $Z = (Z^i)_{1 \le i \le n}$ is a
vector field, with the Lebesgue measure as reference measure. Hence, by (1.11.6), on
smooth functions f, the adjoint operator $L^*$ with respect to the Lebesgue measure
is
\[
L^* f = \Delta f - Zf - \mathrm{div}(Z)\,f \tag{1.11.7}
\]
where $\mathrm{div}(Z) = \sum_{i=1}^{n} \partial_i Z^i$. Indeed, given smooth compactly supported functions
f, g (on $E = \mathbb{R}^n$ or an open subset of $\mathbb{R}^n$), $\int_E f\,\Delta g\,dx = \int_E g\,\Delta f\,dx$ whereas, by
integration by parts, for every 1 ≤ i ≤ n,
\[
\int_E f\,Z^i\,\partial_i g\,dx = - \int_E \partial_i\big(Z^i f\big)\,g\,dx = - \int_E g\,Z^i\,\partial_i f\,dx - \int_E g\,\big(\partial_i Z^i\big) f\,dx
\]
In the context of second order differential operators L of the form (1.11.1), invariant
measures μ are not easy to identify in general. For example, on an open subset
of Rn , the density w of μ with respect to the Lebesgue measure is a solution of
L∗ w = 0. But when the measure μ is reversible, things are a lot easier. For example,
on an open set in Rn , or on a manifold given any local system of coordinates, an
operator given in the form
\[
Lf = \frac{1}{w} \sum_{i,j=1}^{n} \partial_i \big(w\,g^{ij}\,\partial_j f\big), \tag{1.11.8}
\]
where w is smooth and strictly positive, is clearly (by integration by parts) symmet-
ric in L2 (μ) where dμ = wdx is the measure with density w with respect to the
Lebesgue measure. This is indeed the general case when the reversible measure has
a non-vanishing density w, and therefore operators of the form
\[
Lf = \sum_{i,j=1}^{n} g^{ij}\,\partial_{ij}^2 f + \sum_{j=1}^{n} \sum_{i=1}^{n} \big[\partial_i g^{ij} + g^{ij}\,\partial_i (\log w)\big]\,\partial_j f
\]
are exactly those operators with second order terms given by the matrix (g ij ) and
reversible measure dμ = wdx. Therefore, the knowledge of (g ij ) and (bi ) in the
representation (1.11.1) allows for a direct identification of the invariant measure
dμ = wdx when it is reversible. In this case, for f, g smooth and compactly sup-
ported,
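\[
\int_{\mathbb{R}^n} f\,Lg\,d\mu = - \int_{\mathbb{R}^n} \sum_{i,j=1}^{n} g^{ij}\,\partial_i f\,\partial_j g\,d\mu = - \int_{\mathbb{R}^n} \Gamma(f, g)\,d\mu. \tag{1.11.9}
\]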
the co-metric g which naturally enters into the investigation of Markov diffusion
operators. We thus privilege the co-metric g in the notation. As explained there, the
decomposition $L = \Delta_g + Z$ has the advantage of being preserved under a change of
coordinates.
Under this decomposition
\[ L = \Delta_g + Z, \]
The operator $f \mapsto \Gamma(\log w, f)$ is often identified with $\nabla \log w \cdot \nabla f$, and in this case $Z$ is called a gradient vector field. The invariant reversible measure is then typically written as
\[ d\mu = e^{-W} d\mu_g \]
with $w = e^{-W}$ and $\mu_g$ the Riemannian measure. The coordinate representation (1.11.8) takes the form
\[ L = \Delta_g - \Gamma(W, \cdot) = \Delta_g - \nabla W \cdot \nabla \tag{1.11.10} \]
(see Sect. C.5, p. 511, for further details). Both the coordinate representation of
(1.11.2) and the integration by parts formula (1.11.9) are similar in this context. The
latter decomposition of L (the canonical decomposition) is most useful when com-
puting iterated carré du champ operators in view of curvature-dimension conditions
(see Sect. 1.16 below) since it is mainly with the language of Riemannian geometry
that computations are made understandable.
When dealing with an operator $L = \Delta_g - \nabla W \cdot \nabla$ symmetric with respect to $d\mu = e^{-W} d\mu_g$ on a Riemannian manifold $(M, g)$ with Riemannian measure $\mu_g$, we sometimes speak of $M$ as a weighted Riemannian manifold. The basic example of a weighted measure $d\mu = e^{-W} dx$ on $\mathbb{R}^n$ is already of significant interest.
In summary, it is easier to describe the invariant measure in the reversible case,
since it is then given locally, while in the general case, it is obtained from a differential equation. For symmetric diffusions, the carré du champ operator $\Gamma$ determines the second order terms of the generator $L$ while the invariant measure $\mu$ determines the first order terms. (For comparison, note that such a characterization is not available in the discrete case, since for example on finite state spaces, the operator $\Gamma$ already describes the generator $L$.)
In the last part of this section, we briefly discuss variations and a few operations on
the preceding representations.
\[ \nabla^* V = -\sum_{i=1}^n \partial_i V_i + \sum_{i=1}^n V_i\,\partial_i W \]
so that
\[ L = -\nabla^* \nabla. \tag{1.11.11} \]
This representation immediately extends to the general Riemannian case, where
∇f is replaced by the 1-form df given in a local system of coordinates by
$(\partial_i f)_{1\le i\le n}$. The scalar product of 1-forms $\omega$ and $\omega'$ is then given in this system of coordinates by $\omega \cdot \omega' = \sum_{i,j=1}^n g^{ij}\,\omega_i\,\omega'_j$, so that again $L = -d^* d$, which may be
seen as an extension of (1.11.11).
When the operator is elliptic, df may be replaced by its gradient ∇f which is
now a vector field, where in a local system of coordinates,
\[ \nabla^i f = \sum_{j=1}^n g^{ij}\,\partial_j f, \quad 1 \le i \le n. \]
The scalar product on vectors $V$ and $V'$ is defined as $V \cdot V' = \sum_{i,j=1}^n g_{ij}\,V^i\,V'^j$ where $(g_{ij})$ is the inverse matrix of $(g^{ij})$. The adjoint $\nabla^*$ is then defined by its action on a vector field $Z$ as
\[ \int_E (\nabla^* Z)\,f\,d\mu = \int_E Z \cdot \nabla f\,d\mu \]
for any smooth compactly supported function f , and we still have L = −∇ ∗ ∇,
which also extends (1.11.11) to the Riemannian case. We refer to Appendix C for
further details.
We conclude this section with a brief comment on integration by parts in the
context of diffusion operators. Namely, observe that when the measure μ is only
invariant, and only for diffusion operators, the integration by parts formula (1.6.3)
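still holds when $g = \psi(f)$. Indeed, setting $\psi = \Psi'$, invariance yields
\[ 0 = \int_E L\Psi(f)\,d\mu = \int_E \psi(f)\,Lf\,d\mu + \int_E \psi'(f)\,\Gamma(f)\,d\mu, \]
that is, $\int_E \psi(f)(-Lf)\,d\mu = \int_E \Gamma(\psi(f), f)\,d\mu$, since $\Gamma(\psi(f), f) = \psi'(f)\,\Gamma(f)$ for a diffusion operator.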
In particular, this observation explains why, in dimension one (for the so-called
Sturm-Liouville operators, Sect. 2.6, p. 97), the analysis will be easier and invariant
measures will automatically be reversible.
1.12 Ellipticity and Hypo-ellipticity

It has already been mentioned in Sect. 1.10 that ellipticity or hypo-ellipticity play
an important role in the theory. In particular, under these conditions, solving a dif-
ferential (heat) equation of the form ∂t u = Lu with respect to a differential operator
L with initial condition u(0, x) = f (x) has regularizing properties: even when f is
only measurable, the solution u(t, x) is smooth for t > 0.
Hypo-ellipticity is a notion intermediate between semi-ellipticity and ellipticity.
Recall first the notions of semi-ellipticity and ellipticity for a diffusion operator L
\[ g(V, V) = \sum_{i,j=1}^n g^{ij}(x)\,V_i V_j \ge 0 \quad \text{for all } V = (V_i)_{1\le i\le n} \in \mathbb{R}^n. \tag{1.12.1} \]
We sometimes speak of uniform ellipticity whenever there exists a c > 0 such that
boundary conditions have to be imposed to ensure some kind of uniqueness (see, for
example, in dimension one, Sects. 2.4, p. 92, 2.5, p. 96, and 2.6, p. 97). The simplest
conditions (which however produce in general only sub-Markov semigroups) are
Dirichlet boundary conditions which amount to assuming that u(t, x) is 0 at the
boundary of the domain for any t > 0, or in probabilistic terms to consider the
associated Markov process killed at the boundary. Again, such restrictions imply
that it is impossible to solve the heat equation (1.12.4) locally.
To be more precise, if L is defined, say, on an open set O ⊂ Rn , and if we consider
a solution of the equation in O1 ⊂ O with the initial condition u0 say compactly
supported in O1 , the solution is not unique: any kind of boundary condition on any
\[ L = \sum_{j=1}^d Z_j^2 + Z_0 \]
where the Zj ’s, 0 ≤ j ≤ d, are smooth vector fields (which may always be achieved
at least locally). Consider then, at any point $x \in O$, the vector space $V_p$ generated by the vector fields $Z_j$ and their commutators up to some fixed order $p \ge 1$, that is
\[
\begin{aligned}
V_1 &= \mathrm{span}\{Z_j,\ 0 \le j \le d\},\\
V_2 &= \mathrm{span}\{Z_j,\ [Z_j, Z_k],\ 0 \le j, k \le d\},\\
&\ \,\vdots\\
V_p &= \mathrm{span}\{V_{p-1} \cup [Z_j, V],\ V \in V_{p-1},\ 0 \le j \le d\}.
\end{aligned}
\]
Then, one main conclusion of the Hörmander theory is that if there exists an integer $p$ such that, for any $x \in O$, $V_p = \mathbb{R}^n$, the operator $L$ is hypo-elliptic on $O$.
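The bracket condition can be explored numerically. The sketch below is only illustrative: the operator $L = \partial_x^2 + x\,\partial_y$ on $\mathbb{R}^2$ is an example of our choosing, and `jacobian`, `bracket` and `det2` are hypothetical helper names. It computes the commutator $[Z_1, Z_0]$ by finite differences and checks that, at a point where $Z_0$ vanishes, the fields $Z_1, Z_0$ alone do not span $\mathbb{R}^2$ while $Z_1$ and $[Z_1, Z_0]$ do.

```python
def jacobian(field, x, h=1e-5):
    """Central-difference Jacobian: J[i][k] = d field^i / d x^k at the point x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        xp = list(x); xm = list(x)
        xp[k] += h; xm[k] -= h
        fp, fm = field(xp), field(xm)
        for i in range(n):
            J[i][k] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def bracket(Z, W):
    """Lie bracket [Z, W]^i = sum_k (Z^k d_k W^i - W^k d_k Z^i)."""
    def ZW(x):
        JZ, JW = jacobian(Z, x), jacobian(W, x)
        z, w = Z(x), W(x)
        n = len(x)
        return [sum(z[k] * JW[i][k] - w[k] * JZ[i][k] for k in range(n))
                for i in range(n)]
    return ZW

# Hypothetical example on R^2 with coordinates (x, y): L = d_x^2 + x d_y.
Z1 = lambda p: [1.0, 0.0]      # Z_1 = d_x
Z0 = lambda p: [0.0, p[0]]     # Z_0 = x d_y

def det2(u, v):
    """Determinant of the 2 x 2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

point = [0.0, 0.3]             # a point on the line x = 0, where Z_0 vanishes
b = bracket(Z1, Z0)(point)     # should be close to d_y = (0, 1)
print(det2(Z1(point), Z0(point)), det2(Z1(point), b))
```

In this example the first determinant vanishes and the second does not, so the spanning condition holds with $p = 2$ but not with $p = 1$ along the line $x = 0$.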
Then, if there exists an integer $p$ such that for any $x \in O$, $\widehat{V}_p = \mathbb{R}^n$, the operator $L - \partial_t$ is hypo-elliptic on $(0, \infty) \times O$. In other words, under this condition, any solution of $\partial_t u = Lu$ on $[0, s] \times O$ is smooth on $(0, s) \times O$, whatever the behavior of $L$ outside $O$.
1.13 Domains
This section briefly discusses a few observations and hypotheses about domains of
Markov semigroups and their infinitesimal generators as they arose in the preceding
sections. It is only aimed at giving a flavor of the necessary framework and will
be completed in Chap. 3, Sect. 3.3, p. 151, in which precise hypotheses on the
various classes of functions necessary to develop the Markov semigroup calculus
are carefully discussed.
Throughout this chapter, we are dealing, on some state space (E, F), with a
symmetric Markov semigroup P = (Pt )t≥0 with infinitesimal generator L with
L2 (μ)-domain D(L) where μ is the invariant and reversible measure. As discussed
earlier in Sect. 1.4, to define conversely the semigroup P from its generator L, it is
necessary to know L on a core of its domain D(L). Usually however, the generator
is only given on some set A0 of functions. Some simple criteria to ensure that a
dense vector subspace of functions is a core (in particular in the symmetric case of
most interest) are actually available. In Sect. 3.2, p. 137, we show for example that
on a complete Riemannian manifold (in particular, on open sets in Euclidean spaces
with a suitably chosen metric), $C^\infty$ functions with compact support always form a core for operators of the form $L = \Delta_g - \nabla W \cdot \nabla$ for some smooth potential $W$
on the manifold. Further examples will be discussed later. In general, when dealing
with second order operators L with smooth coefficients on an open set in Rn or on a
manifold as given by (1.11.1), every solution of the (heat) equation ∂t f = Lf will
be smooth (as is the case when solving stochastic differential equations, as already
mentioned in Sect. 1.12). The solution is actually as smooth as the initial data. If the
operator is smooth and elliptic (or only hypo-elliptic), then even the regularity on
the initial data is not required, as discussed in the previous section. We could then
work on the class of C ∞ functions, but unfortunately it is not included in L2 (μ),
unless the manifold is compact. These observations lead us to consider an extended
algebra A ⊃ A0 on which the operator L is defined as an extension together with
the associated carré du champ operator $\Gamma$, regardless of integrability properties. In
the regular instances, this class A will be the class of smooth C ∞ functions.
Given a vector space A0 of real-valued functions f on E, a variety of hypotheses
may be considered in order to conveniently work with P and L on A0 . A general
set of conditions is that A0 is contained in all Lp (μ)-spaces, 1 ≤ p < ∞, is dense
in the domain D(L) of L (that is, is a core of the domain), and is stable under
products and compositions with C ∞ functions vanishing at 0. In particular, it is an
algebra. In addition, it may be imposed that whenever f, g ∈ A0 , then f Pt g ∈ A0
for every t ≥ 0, or even that A0 is stable under the action of Pt . Observe that under
the latter assumption, A0 is automatically a core of the domain. The first hypothesis
is formulated to include the example of the set A0 of smooth compactly supported
functions on a complete manifold, while the second covers compact manifolds (but
also non-compact instances such as, for example, the space of Schwartz functions
for the Laplacian on Rn ). The algebra A on which the generator and its carré du
champ operator may naturally be considered, corresponding to the class of smooth
functions, may be constructed as an extension of A0 .
In some situations, it is not necessary to assume stability by composition with
smooth functions (as for the Ornstein-Uhlenbeck semigroup if one wishes to work
only with polynomials, cf. Sect. 2.7.1, p. 103). Actually, there are numerous inter-
mediate hypotheses in order to suitably cover one example or another, and it would
be rather difficult to cover all the cases in this way. We often choose to work with the
maximal hypothesis (of stability by the semigroup), leaving to the reader to adapt
proofs and arguments to more specific or complicated cases. Section 3.3, p. 151,
supplies the necessary arguments to fully justify this reduction, and describes in
more detail the different technical conditions required on a set of functions on which
L is given in order to ensure that the described extension can be achieved.
1.14 Summary of Hypotheses (Markov Semigroup)

H1. The state space (E, F, μ) is assumed to be a good measure space in the sense that there is a countable family of sets which generates F (up to sets of μ-measure 0), and that both the
measure decomposition theorem and the bi-measure theorem apply (cf. Sect. 1.2).
To make things simpler, and since it is unlikely that the reader would ever need
more, it may actually always be assumed that F is the completion with respect to
the measure μ of the Borel σ -field for a topology on E which makes E a com-
plete separable metric space (a so-called Polish space). However, the topology will
not be used. Every Lp (μ)-space (1 ≤ p < ∞) is therefore separable. Functions,
always real-valued, are classes of functions for the μ-almost everywhere equal-
ity, and equalities and inequalities such as f ≤ g are always understood to hold
μ-almost everywhere. (See p. 7.)
H2. On a good measure space (E, F, μ), a Markov semigroup P = (Pt )t≥0 is a
family of operators with the following properties (see pp. 9 and 11):
(i) For every t ≥ 0, Pt is a linear operator sending bounded measurable functions
on (E, F) into bounded measurable functions.
(ii) P0 = Id, the identity operator (initial condition).
(iii) Pt (1) = 1, where 1 is the constant function equal to 1 (mass conservation).
(iv) If f ≥ 0, then Pt f ≥ 0 (positivity preserving).
(v) For every t, s ≥ 0, Pt+s = Pt ◦ Ps (semigroup property).
(vi) For every f ∈ L2 (μ), Pt f converges to f in L2 (μ) as t → 0 (continuity
property).
(vii) For every p, 1 ≤ p < ∞, the operators Pt , t ≥ 0, extend to every Lp (μ) as
bounded (contraction) operators.
H3. The measure $\mu$ is invariant for the semigroup $P = (P_t)_{t\ge 0}$ in the sense that for any $f \in L^1(\mu)$, $\int_E P_t f\,d\mu = \int_E f\,d\mu$ (Definition 1.2.1). (vii) is then automatic and the operators $P_t$, $t \ge 0$, are contractions on every $L^p(\mu)$, $1 \le p \le \infty$.
H4. The operators Pt , t ≥ 0, are represented by Markov kernels
\[ P_t f(x) = \int_E f(y)\,p_t(x, dy), \quad x \in E, \]
where pt (x, dy), t ≥ 0, is a family of probability kernels on E (that is, for fixed
x ∈ E, a probability measure on (E, F) such that for any A ∈ F , x → pt (x, A) is
measurable) (see p. 12).
H5. The Markov process (or family of processes) $X = \{X_t^x ;\ t \ge 0,\ x \in E\}$ on a probability space $(\Omega, \Sigma, \mathbb{P})$ associated with the Markov semigroup $P = (P_t)_{t\ge 0}$ satisfies, on bounded or positive measurable functions $f : E \to \mathbb{R}$, the conditional formula
\[ P_t f(x) = \mathbb{E}\bigl(f(X_t^x)\bigr) = \mathbb{E}\bigl(f(X_t) \mid X_0 = x\bigr), \quad t \ge 0,\ x \in E. \]
H6. The Markov semigroup P = (Pt )t≥0 is symmetric with respect to the measure
μ, or μ is reversible for P = (Pt )t≥0 , in the sense that for any pair (f, g) of func-
tions in L2 (μ),
\[ \int_E f\,P_t g\,d\mu = \int_E g\,P_t f\,d\mu \]
(Definition 1.6.1).
H7. The domain D(L) of the infinitesimal generator L of the semigroup P = (Pt )t≥0
is the (linear) space of functions f ∈ L2 (μ) such that the limit
\[ \lim_{t \to 0} \frac{1}{t}\,(P_t f - f) \]
exists in L2 (μ). This limit, denoted by Lf , defines the infinitesimal (Markov) gen-
erator (or operator) L. The domain D(L) is always a dense subspace of L2 (μ)
(cf. Definition 1.4.1). On this domain, the semigroup (Pt )t≥0 solves the heat equa-
tion (with respect to L),
∂t Pt = LPt = Pt L.
For a symmetric semigroup, the generator L is self-adjoint on its domain.
H8. The domain $D(\mathcal{E})$ of the Dirichlet form $\mathcal{E}$ is the space of functions $f$ in $L^2(\mu)$ such that the limit
\[ \lim_{t \to 0} \frac{1}{t} \int_E f\,(f - P_t f)\,d\mu \]
exists. This limit, denoted by $\mathcal{E}(f, f) = \mathcal{E}(f)$, defines the Dirichlet form $\mathcal{E}$ of the semigroup $P = (P_t)_{t\ge 0}$ with domain $D(\mathcal{E})$. When $f \in D(L)$,
\[ \mathcal{E}(f) = \int_E f\,(-Lf)\,d\mu \]
Various hypotheses may hold on the vector space A of functions on which this
carré du champ operator is defined (see Sect. 1.13).
H10. The generator $L$ satisfies the diffusion property which states that for any family $(f_1, \dots, f_k)$ of functions in $\mathcal{A}$ on which the carré du champ operator $\Gamma$ is defined, and any smooth function $\Psi : \mathbb{R}^k \to \mathbb{R}$ such that $\Psi(f_1, \dots, f_k)$ belongs to $D(L)$,
\[ L\Psi(f_1, \dots, f_k) = \sum_{i=1}^k \partial_i \Psi(f)\,Lf_i + \sum_{i,j=1}^k \partial^2_{ij} \Psi(f)\,\Gamma(f_i, f_j). \]
Equivalently, the carré du champ operator $\Gamma$ is, in each argument, a first order differential operator. The diffusion property translates the fact that the trajectories of
the associated Markov process X = {Xtx ; t ≥ 0, x ∈ E} are in some sense continu-
ous (cf. Sect. 1.11).
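The diffusion property in H10 can be tested numerically in the simplest case $k = 1$. The following is a minimal sketch under assumptions of ours: a one-dimensional operator $Lu = a u'' + b u'$ with illustrative coefficients, for which $\Gamma(u) = a\,(u')^2$, and the hypothetical test functions $f = \sin$ and $\Psi(u) = u^3$.

```python
import math

a = lambda x: 1.0 + x * x          # second order coefficient (illustrative)
b = lambda x: -x                   # first order coefficient (illustrative)

def L(u, x, h=1e-3):
    """Apply L u = a u'' + b u' at x using central finite differences."""
    d1 = (u(x + h) - u(x - h)) / (2.0 * h)
    d2 = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return a(x) * d2 + b(x) * d1

def Gamma(u, x, h=1e-3):
    """Carre du champ of L in dimension one: Gamma(u) = a (u')^2."""
    d1 = (u(x + h) - u(x - h)) / (2.0 * h)
    return a(x) * d1 * d1

f     = math.sin
Psi   = lambda u: u ** 3
dPsi  = lambda u: 3.0 * u ** 2
d2Psi = lambda u: 6.0 * u

def chain_rule_gap(x):
    """|L(Psi(f)) - (Psi'(f) Lf + Psi''(f) Gamma(f))| at the point x."""
    lhs = L(lambda y: Psi(f(y)), x)
    rhs = dPsi(f(x)) * L(f, x) + d2Psi(f(x)) * Gamma(f, x)
    return abs(lhs - rhs)

gaps = [chain_rule_gap(x) for x in (-1.0, 0.2, 0.7, 1.5)]
print(gaps)
```

The printed gaps should be of the order of the finite-difference error only, which is consistent with the change of variables formula holding exactly for second order differential operators without zero order term.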
1.15 Working with Markov Semigroups

Given a Markov semigroup $(P_t)_{t\ge 0}$, or its associated Markov process $(X_t)_{t\ge 0}$, on a state space $E$, a bi-measurable bijection $B$ between $E$ and another state space $\hat{E}$ yields a Markov process $(B(X_t))_{t\ge 0}$ on $\hat{E}$, with Markov semigroup
\[ \hat{P}_t f = P_t(f \circ B) \circ B^{-1}, \quad t \ge 0, \]
not really different from the original one. The properties of semigroups which we
would like to investigate should actually be invariant under such transformations.
There are indeed intrinsic quantities in a given Markov generator which are inde-
pendent of the choice of such representations. Properties of semigroups are often
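\[ Lf = \sum_{i,j=1}^n g^{ij}(x)\,\partial^2_{ij} f + \sum_{i=1}^n b^i(x)\,\partial_i f \]
\[ \hat{L} f = \sum_{i,j=1}^n \hat{g}^{ij}(y)\,\partial^2_{ij} f + \sum_{i=1}^n \hat{b}^i(y)\,\partial_i f \]
where
\[ \hat{g}^{ij}(y) = \sum_{k,\ell=1}^n J^i_k(y)\,J^j_\ell(y)\,g^{k\ell}\bigl(B^{-1}(y)\bigr) \]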
The formula for $\hat{b}$ is however a bit more complicated since it involves both $b$ and the derivatives of the $g^{ij}$'s. There are expressions which are better behaved than others (such as the matrix $g = (g^{ij}(x))$ or the vector fields, and more generally tensors, see below) under a change of variables.
It is important here to notice that, from this point of view, the first order part $\sum_{i=1}^n b^i \partial_i$ of $L$ is not a vector field. However, in the decomposition (1.11.10) $L = \Delta_g + Z$, the term $Z$ is a vector field. What does not behave properly in this expression is the second order term $\sum_{i,j=1}^n g^{ij}(x)\,\partial^2_{ij}$, which may appear as $\sum_{i,j=1}^n \hat{g}^{ij}(y)\,\partial^2_{ij} + \sum_{i=1}^n \hat{b}^i \partial_i$ after a change of coordinates. For example, if $\Delta = \partial_x^2 + \partial_y^2$ is the usual Laplacian on $\mathbb{R}^2$ for which $g = \mathrm{Id}$ and $b = 0$, in polar coordinates it takes the form
\[ \partial_r^2 + \frac{1}{r^2}\,\partial_\theta^2 + \frac{1}{r}\,\partial_r. \]
In particular $b = 0$ in the first form but $b \ne 0$ in the second, so that neither the vector $b$, nor the first order operator $\sum_{i=1}^n b^i \partial_i$, are intrinsic.
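The polar form of the Laplacian, including the first order term $\frac{1}{r}\partial_r$ that appears only after the change of coordinates, can be confirmed by finite differences. This is a sketch only: the test function `u_cart` and the helpers `d1`, `d2` are hypothetical choices of ours.

```python
import math

def u_cart(x, y):
    """Illustrative test function; its Laplacian is exactly 2 y."""
    return x * x * y + math.exp(x) * math.sin(y)

def u_polar(r, t):
    """The same function expressed in polar coordinates (r, theta)."""
    return u_cart(r * math.cos(t), r * math.sin(t))

def d2(func, p, k, h=1e-3):
    """Second partial derivative of func in coordinate k (central differences)."""
    up = list(p); dn = list(p)
    up[k] += h; dn[k] -= h
    return (func(*up) - 2.0 * func(*p) + func(*dn)) / (h * h)

def d1(func, p, k, h=1e-3):
    """First partial derivative of func in coordinate k (central differences)."""
    up = list(p); dn = list(p)
    up[k] += h; dn[k] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

r, t = 1.3, 0.7
x, y = r * math.cos(t), r * math.sin(t)

cartesian = d2(u_cart, [x, y], 0) + d2(u_cart, [x, y], 1)
polar = (d2(u_polar, [r, t], 0)
         + d2(u_polar, [r, t], 1) / (r * r)
         + d1(u_polar, [r, t], 0) / r)
print(cartesian, polar)
```

Dropping the last term `d1(...) / r` from `polar` breaks the agreement, which is one concrete way to see that the first order coefficients $b$ are not intrinsic.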
As already mentioned, many ideas in the study of diffusion generators are
taken from Riemannian geometry, in which precisely invariants under the action
of changes of coordinates are investigated. This is in particular the case for ellip-
tic operators, for which the diffusion matrix g = g(x) = (g ij (x))1≤i,j ≤n is, at every
point x, symmetric positive-definite (see (1.12.2)). A Riemannian metric is thus nat-
urally associated with such an elliptic operator, and there is a way to work with and
to identify invariants under changes of coordinates in this geometry. The natural
framework for differential operators and invariants under changes of coordinates is
the setting of differentiable manifolds. The reader will find in Appendix C the basic
geometric notions necessary for the further developments.
Moreover, quantities invariant under a change of coordinates, called intrinsic,
will be of use in the analysis of semigroups. Among these quantities, some are
easier to manipulate, as mentioned above, namely tensors (cf. Sect. C.3, p. 504, for
a complete description of tensor quantities). In a change of coordinates, a tensor is multiplied by the Jacobian matrix as many times as there are top indices in the tensor and by its inverse as many times as there are bottom indices. In the definition of a diffusion operator, $g = (g^{ij})_{1\le i,j\le n}$ is a tensor (twice contravariant) but $b = (b^i)_{1\le i\le n}$ is not.
The time change from t to ct (c > 0) on a Markov semigroup P = (Pt )t≥0 clearly
corresponds to a change of infinitesimal generator from L to c L. We investigate
here changes of L to c(x)L for some function c(x) that also correspond to time
changes, although of more complicated forms, but which have a natural probabilistic
interpretation.
Observe to start with that only positive functions $c(x)$ have to be considered since the carré du champ operator associated to $cL$ is $c\,\Gamma$, which has to be positive.
For the discussion here, we further restrict to strictly positive bounded functions
0 < a ≤ c(x) ≤ b < ∞ and to diffusion operators L (so that the associated Markov
processes X are sample continuous). While perhaps not the most interesting situa-
tion, once the basic principle is well understood in this case, one can easily adapt it
to the general case.
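Now, given the Markov process $(X_t)_{t\ge 0}$ (on some probability space $(\Omega, \Sigma, \mathbb{P})$) with generator $L$, starting at some (fixed) point $x \in E$, set for every $t \ge 0$,
\[ C_t = \int_0^t \frac{1}{c(X_s)}\,ds, \]
and let $A_t$ denote the inverse function of $t \mapsto C_t$ (so that $C_{A_t} = t$). Since the process $(C_t)_{t\ge 0}$ is adapted to the natural filtration $(\mathcal{F}_t)_{t\ge 0}$ of $(X_t)_{t\ge 0}$, $A_t$ is, for every $t \ge 0$, a stopping time of this filtration. The process $\hat{X}_t = X_{A_t}$, $t \ge 0$, is a Markov process (due to the strong Markov property applied to the stopping times $A_t$). For any function $f$ in the domain of $L$, by Itô's formula,
\[ df(X_t) = dM_t^f + Lf(X_t)\,dt, \]
and, by composition,
\[ df(X_{A_t}) = A'_t \bigl[ dM^f_{A_t} + Lf(X_{A_t})\,dt \bigr]. \]
Since $A'_t = c(X_{A_t})$, this may be rewritten as
\[ df(\hat{X}_t) = d\hat{M}_t^f + c(\hat{X}_t)\,Lf(\hat{X}_t)\,dt. \]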
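As a consistency check, the case of a constant function $c(x) \equiv c > 0$ can be worked out by hand. The sketch below assumes, as the final formula suggests, that $A_t$ is the inverse of $t \mapsto C_t$.

```latex
% Constant time change: c(x) \equiv c > 0, with A_t assumed to be the inverse of C_t.
\[
C_t = \int_0^t \frac{ds}{c} = \frac{t}{c},
\qquad
A_t = c\,t,
\qquad
\hat{X}_t = X_{ct},
\]
\[
df(\hat{X}_t) = d\hat{M}^f_t + c\,Lf(\hat{X}_t)\,dt .
\]
```

The drift term is that of the generator $cL$, in agreement with the remark that the time change from $t$ to $ct$ replaces $L$ by $cL$.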
1.15.3 Products
This operation will often be referred to in this book as the tensorization procedure. Given two independent families of Markov processes $X = \{X_t^x ;\ t \ge 0,\ x \in E\}$ and $Y = \{Y_t^y ;\ t \ge 0,\ y \in F\}$ on possibly different state spaces $E$ and $F$, the pair $(X_t^x, Y_t^y)$, $t \ge 0$, $(x, y) \in E \times F$, is again a Markov process on the product space $E \times F$. If $P = (P_t)_{t\ge 0}$ and $Q = (Q_t)_{t\ge 0}$ denote their respective semigroups, the product semigroup $R = P \otimes Q$ is given by the tensor products $R_t = P_t \otimes Q_t$, $t \ge 0$. More precisely, the product semigroup $(R_t)_{t\ge 0}$ acts on products of functions $h(x, y) = f(x)g(y)$, $f : E \to \mathbb{R}$, $g : F \to \mathbb{R}$, as
\[ R_t h(x, y) = P_t f(x)\,Q_t g(y). \]
The semigroup $(R_t)_{t\ge 0}$ may also be defined via the kernels of the representation (1.2.4) as
\[ r_t\bigl((x, y), (dz, du)\bigr) = p_t(x, dz) \otimes q_t(y, du) \]
for every t ≥ 0 and (x, y) ∈ E × F (with the obvious notation). In case the ker-
nels of (Pt )t≥0 and (Qt )t≥0 admit respective densities pt (x, z) and qt (y, u), t > 0,
Quite often, given a semigroup P = (Pt )t≥0 on a state space E with Markov gen-
erator L and associated Markov process X = {Xtx ; t ≥ 0, x ∈ E}, there is a map
h : E → F such that for every t ≥ 0 and every bounded measurable function
g : F → R, Pt (g ◦ h) is a function of h, call it (Qt g) ◦ h. In this case, the fam-
ily of operators Q = (Qt )t≥0 is again a Markov semigroup, with state space F , and
the push-forward process $(h(X_t))_{t\ge 0}$ is a Markov process with semigroup $Q$. In order for this property to hold, it is enough that $L(g \circ h) = (\hat{L} g) \circ h$ for sufficiently many suitable functions $g : F \to \mathbb{R}$ where $\hat{L}$ is an operator on $F$. The semigroup $Q = (Q_t)_{t\ge 0}$ then has Markov generator $\hat{L}$. This new semigroup $Q$ and generator $\hat{L}$ may then be considered as the quotient semigroup and generator of $P = (P_t)_{t\ge 0}$ and $L$ respectively via the projection map $h$. In this setting, if the semigroup $P$ has invariant measure $\mu$, then the image semigroup $Q$ has invariant measure $\nu$ where $\nu$ is the image measure of $\mu$ induced by the map $h$. This procedure may sometimes be used to identify image measures, in particular when the operator $\hat{L}$ is easy to identify.
Such a situation occurs in particular for diffusion operators when h is real-valued.
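In this case, it suffices that $Lh$ be a function of $h$, as well as $\Gamma(h)$. Indeed, if $Lh = b(h)$ and $\Gamma(h) = a(h)$, then by the diffusion property (1.11.3) and for a sufficiently regular function $g : \mathbb{R} \to \mathbb{R}$,
\[ L(g \circ h) = g'(h)\,Lh + g''(h)\,\Gamma(h) = b(h)\,g'(h) + a(h)\,g''(h) \]
so that
\[ \hat{L} g = a(x)\,g'' + b(x)\,g'. \]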
The same method applies when h is vector-valued via a multi-dimensional change
of variables formula.
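A concrete instance of this projection is the radial map $h(x) = |x|$ for the standard Laplacian on $\mathbb{R}^n$, for which $\Gamma(h) = |\nabla h|^2 = 1$ and $Lh = (n-1)/|x|$. The sketch below, with an illustrative function `g` and a hypothetical finite-difference helper `laplacian`, checks numerically that $\Delta(g \circ h) = g''(h) + \frac{n-1}{h}\,g'(h)$ at a point of $\mathbb{R}^3$.

```python
import math

def laplacian(u, p, h=1e-3):
    """Standard Laplacian of u at the point p in R^n by central differences."""
    total = 0.0
    for k in range(len(p)):
        up = list(p); dn = list(p)
        up[k] += h; dn[k] -= h
        total += (u(up) - 2.0 * u(p) + u(dn)) / (h * h)
    return total

norm = lambda p: math.sqrt(sum(c * c for c in p))

# Illustrative one-variable function g(r) = r^2 exp(-r) and its derivatives.
g   = lambda r: math.exp(-r) * r ** 2
dg  = lambda r: math.exp(-r) * (2.0 * r - r ** 2)
d2g = lambda r: math.exp(-r) * (2.0 - 4.0 * r + r ** 2)

p = [0.4, -0.9, 1.1]                 # a point of R^3 away from the origin
n, r = len(p), norm(p)

lhs = laplacian(lambda q: g(norm(q)), p)        # L(g o h) with h = |x|
rhs = d2g(r) + (n - 1) / r * dg(r)              # a(h) g''(h) + b(h) g'(h)
print(lhs, rhs)
```

Here $a \equiv 1$ and $b(r) = (n-1)/r$, so the projected operator is $g'' + \frac{n-1}{r}\,g'$ on the half-line.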
Another situation which often occurs is where there are vector fields
Zj , 1 ≤ j ≤ k, commuting with L, and such that a function f of the domain D(L) is
of the form g ◦ h if and only if Zj f = 0, 1 ≤ j ≤ k. The space F is then viewed as the
quotient of E by the action of these vector fields and the function h is the projection
on this quotient space. When there is just one such vector field Z, F is then simply a
parametrization of the integral curves of this vector field. If Zj = Z commutes with
L, then Z also commutes with Pt for every t ≥ 0, and if Zf = 0, then ZPt f = 0.
Indeed, taking the derivative in s ∈ [0, t] of Ps Z(Pt−s f ) yields
Ps (L Z − Z L)Pt−s f = 0.
Therefore, identifying the value at 0 with the value at t, Pt Zf = ZPt f . Hence the
semigroup (Pt )t≥0 preserves the functions vanishing on the vector field Z. More-
over, if Z is a vector field as above, let (Tt )t≥0 be the flow solution of the differen-
tial equation ∂t Tt f = ZTt f (called the exponential of the vector field Z). The flow
(Tt )t≥0 is simply the semigroup—in fact here a group, since there is no need for the
restriction t ≥ 0—with generator Z. Now, if Z with exponential (Tt )t≥0 commutes
with the generator L with semigroup (Pt )t≥0 , then Pt and Tt commute for every
t ≥ 0 (simply repeat the previous procedure by taking the derivative of Tu Pt Ts−u f
for u ∈ [0, s]).
As an illustration, a (smooth) function $f$ in $\mathbb{R}^n$ is radial if and only if for all infinitesimal rotations $\Theta_{ij} = x^i \partial_j - x^j \partial_i$, $1 \le i, j \le n$, $\Theta_{ij} f = 0$. It is clear that these vector fields $\Theta_{ij}$ commute with the (standard) Laplace operator $\Delta$ on $\mathbb{R}^n$.
Thus the heat semigroup (Pt )t≥0 preserves radial functions, and a new semigroup
may then be constructed by projection on the quotient space R+ , called the Bessel
semigroup (see Sect. 2.4.2, p. 94). Similarly, f only depends on (x1 , . . . , xn−1 ) if
and only if ∂n f = 0, and the image of the Brownian motion in Rn is again a Markov
process in Rn−1 (in this case simply Brownian motion in Rn−1 ). This principle
will be further illustrated in the next chapter in the case of the heat or Brownian
semigroup in Euclidean space, but also on the sphere or on the hyperbolic space in
which the Laplace operator admits a number of symmetries (see Sects. 2.2 and 2.3,
pp. 81 and 88).
Another way to form a semigroup on some image space F can be illustrated
using the carré du champ operator $\Gamma$ and the reversible measure $\mu$ of the semigroup $(P_t)_{t\ge 0}$ (that is, on the Markov Triple structure $(E, \mu, \Gamma)$, which will be emphasized later). Namely, let $\nu$ be the image measure of $\mu$ by a measurable map $h : E \to F$, and define a new carré du champ operator $\Gamma_1$ on functions $g$ on $F$ by
\[ \Gamma_1(g)(y) = \mathbb{E}\bigl(\Gamma(g(h)) \mid h = y\bigr), \quad y \in F, \]
the conditional expectation being taken under the law μ. (It is possible to give a
meaning to this expression even when $\mu$ is not a probability measure via a decomposition of $\mu$ with respect to $\nu$.) It is easily checked that $\Gamma_1$ is indeed a carré du champ operator, and if $\Gamma$ is of diffusion-type, then so is $\Gamma_1$. In contrast to the previous procedure, this construction does not require any special properties of the
function h with respect to L. However, the resulting generator is in general difficult
to compute explicitly. This approach will be used repeatedly throughout this work
when dealing with a Lipschitz map h : E → R to transfer functional inequalities
or concentration tail estimates of the space (E, μ) onto R (equipped with the im-
age measure), and will be the basis for most tail estimates for Lipschitz functions
discussed in the following chapters.
Adding a vector field is a simple analytic operation which however, from the probabilistic point of view, is somewhat delicate. It takes the form of the so-called Girsanov transformation. The basic principle is outlined here only for diffusion processes, and even for simplicity for processes on $E = \mathbb{R}^n$ with generator given in the Hörmander form (1.10.5)
\[ L = \frac{1}{2} \sum_{j=1}^d Z_j^2 + Z_0 \]
\[ Z = \frac{1}{2} \sum_{j=1}^d a^j(x)\,Z_j \]
where the $a^j$'s, $1 \le j \le d$, are smooth and bounded functions (beware that $Z_0$ is excluded in the previous sum). The new operator is $\hat{L} = L + Z$, and the task is to compare the semigroup $(\hat{P}_t)_{t\ge 0}$ with generator $\hat{L}$ with the semigroup $(P_t)_{t\ge 0}$ with generator $L$. From a probabilistic viewpoint, the idea is to consider the law of the process $(X_t)_{t\ge 0}$ as a measure on the space of continuous functions on $\mathbb{R}_+$ with values in $\mathbb{R}^n$. Then, the law of the process $(\hat{X}_t)_{t\ge 0}$ with generator $\hat{L}$ has a density with respect to the law of $(X_t)_{t\ge 0}$ given, at each time $t$, by
\[ N_t = \exp\Bigl( \sum_{j=1}^d \int_0^t a^j(X_s)\,dB_s^j - \frac{1}{2} \int_0^t \sum_{j=1}^d (a^j)^2(X_s)\,ds \Bigr), \]
where $(B^j)_{1\le j\le d}$ are the independent Brownian motions in $\mathbb{R}^n$ which appear in the stochastic differential equation (1.10.6). For a (sketch of a) proof, observe that the process $(N_t)_{t\ge 0}$ thus defined is a martingale by Itô's formula and that
\[ d\bigl(f(X_t)N_t\bigr) = f(X_t)\,dN_t + N_t \Bigl[ dM_t^f + Lf(X_t)\,dt + \frac{1}{2} \sum_{i,j=1}^d \partial_j f(X_t)\,a^i(X_t)\,Z_i^j\,dt \Bigr]. \]
Therefore,
\[ \mathbb{E}_x\Bigl( N_t \Bigl[ f(X_t) - \int_0^t \hat{L} f(X_s)\,ds \Bigr] \Bigr) = f(x) \]
where the notation $\mathbb{E}_x$ denotes the expectation conditional on $X_0 = x$. This shows that multiplying the law of $X_t$ by the density $N_t$ solves the martingale problem (1.4.6) associated to $\hat{L}$. In fact, defining for every suitable function $f$, every $t \ge 0$ and $x \in \mathbb{R}^n$,
\[ \hat{P}_t f(x) = \mathbb{E}_x\bigl(f(X_t)\,N_t\bigr), \]
we have $\partial_t \hat{P}_t f = \hat{P}_t \hat{L} f$ so that $(\hat{P}_t)_{t\ge 0}$ is indeed the semigroup with generator $\hat{L}$. There are of course many different ways to reach this conclusion. Observe that this conclusion requires the vector field $Z$ to be a linear combination of the vector fields $Z_j$, $1 \le j \le d$. This condition is not restrictive when the operator $L$ is elliptic, but becomes a serious restriction when it is not.
It is often the case that one has to look at differential operators of the form
\[ \hat{L} f = Lf + Vf, \]
V ≤ 0 (and this will remain the case for every non-constant potential V ≤ 0). These
sub-Markov semigroups share a lot of common features with the semigroups killed
at the boundary such as those described later in Sect. 2.4, p. 92, and in Sect. 2.6,
p. 97. For a general (but bounded) potential $V$, the easiest way to represent $(\hat{P}_t)_{t\ge 0}$ starting from $(P_t)_{t\ge 0}$ is to use the same trick as for the Girsanov transformation, that is to describe $(\hat{P}_t)_{t\ge 0}$ from the law of the Markov process $(X_t)_{t\ge 0}$ with generator $L$. This is done this time with the help of the Feynman-Kac formula. Working again for simplicity on $E = \mathbb{R}^n$, if
\[ A_t = \exp\Bigl( \int_0^t V(X_s)\,ds \Bigr), \quad t \ge 0, \]
then $\hat{P}_t f(x) = \mathbb{E}_x\bigl(f(X_t)\,A_t\bigr)$. This is much easier to see than for the Girsanov transformation, since in this case
\[ d\bigl(f(X_t)A_t\bigr) = A_t \bigl[ dM_t^f + Lf(X_t)\,dt \bigr] + f(X_t)\,A_t\,V(X_t)\,dt, \]
and hence $\partial_t \hat{P}_t f = \hat{P}_t \hat{L} f$.
If the starting operator $L$ is symmetric with respect to some measure $\mu$, so is the new operator $\hat{L}$. But this time, even for a probability measure $\mu$, the constant function 1 is no longer an eigenvector. If $V \le 0$, then the spectrum still lies in $(-\infty, 0)$. Very often, it is in fact in $(-\infty, \lambda_0)$ for some $\lambda_0 < 0$. When this lower bound of the spectrum of $-\hat{L}$ is in fact an eigenvalue, it corresponds to (at least) one eigenvector $U_0$ which is positive. This eigenvector is called the fundamental state (or ground state) of the system (corresponding to the Perron-Frobenius eigenvector in the finite Markov chain setting of Sect. 1.9). When it is strictly positive everywhere, then the transformation of $\hat{P}_t$, $t \ge 0$, into
\[ R_t f = \frac{1}{U_0}\,e^{-\lambda_0 t}\,\hat{P}_t(U_0 f), \quad t \ge 0, \]
satisfies $R_t(1) = 1$, and therefore defines a new Markov semigroup (often called the ground state transform of the previous one). Hence, the study of such sub-Markov semigroups may be reduced to the study of Markov semigroups provided that the ground state $U_0$ may be identified, which is not an easy task in general (see also Sect. 1.15.8 below for more details).
Removing a gradient field $\Gamma(W, \cdot)$ ($= \nabla W \cdot \nabla$) is not just adding the vector field with opposite sign. The transformation described here is based on the same principle as the h-transform and turns out to be very useful. Thus assume that $L_1$ is a diffusion generator written in the form
\[ L_1 f = Lf - \Gamma(W, f) \]
and set
\[ W_1 = -\frac{LW}{2} + \frac{\Gamma(W)}{4}. \]
Then, with $M$ the operator of multiplication by $e^{W/2}$,
\[ M^{-1} L_1 M = L - W_1. \]
The two operators $L_1$ and $L - W_1$ are conjugate to each other. In particular, for the respective associated semigroups $(P_t)_{t\ge 0}$ and $(Q_t)_{t\ge 0}$,
\[ Q_t f = e^{-W/2}\,P_t\bigl(e^{W/2} f\bigr), \qquad P_t f = e^{W/2}\,Q_t\bigl(e^{-W/2} f\bigr), \quad t \ge 0. \]
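The conjugation identity can be verified directly from the diffusion property. The derivation below is a sketch under the assumption, consistent with the formulas for $Q_t$ and $P_t$ above, that $M$ acts as multiplication by $e^{W/2}$.

```latex
% Assumptions for this sketch: M f = e^{W/2} f, and L is a diffusion operator, so that
% L(uv) = u\,Lv + v\,Lu + 2\Gamma(u,v) and L(e^{W/2}) = e^{W/2}\bigl(\tfrac{LW}{2} + \tfrac{\Gamma(W)}{4}\bigr).
\begin{align*}
L\bigl(e^{W/2} f\bigr)
  &= e^{W/2}\Bigl[ Lf + \Gamma(W, f) + \Bigl(\tfrac{LW}{2} + \tfrac{\Gamma(W)}{4}\Bigr) f \Bigr],\\
\Gamma\bigl(W, e^{W/2} f\bigr)
  &= e^{W/2}\Bigl[ \Gamma(W, f) + \tfrac{\Gamma(W)}{2}\, f \Bigr],\\
e^{-W/2}\, L_1\bigl(e^{W/2} f\bigr)
  &= Lf + \Bigl(\tfrac{LW}{2} - \tfrac{\Gamma(W)}{4}\Bigr) f
   = Lf - W_1 f .
\end{align*}
```

The first order terms $\Gamma(W, f)$ cancel, which is precisely what removes the gradient field at the cost of the zero order potential $W_1$.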
Moreover, if Pt has kernel densities pt (x, y), t > 0, (x, y) ∈ E × E, with respect to
some measure m, that is
\[ P_t f(x) = \int_E f(y)\,p_t(x, y)\,dm(y), \]
1.15.8 h-Transform
\[ P_t^h f = \frac{1}{h}\,P_t(f h), \quad t \ge 0, \]
on a suitable class of functions $f$. If $P_t$ is represented by a kernel $p_t(x, dy)$ as in (1.2.4), then $P_t^h$ is represented by the kernel
\[ \frac{h(y)}{h(x)}\,p_t(x, dy). \]
\[ L^h f = \frac{1}{h}\,L(h f). \]
As an illustration, the h-transform can be used to describe the Fokker-Planck operator (see Sect. 1.5) in the symmetric setting. For example, if $L = \Delta_g - \nabla W \cdot \nabla$ on a weighted Riemannian manifold with Laplace-Beltrami operator $\Delta_g$ and Riemannian measure $\mu_g$, then for $h = e^W$, $L^h = L^*$ where $L^*$ is the adjoint of $L$ in $L^2(\mu_g)$.
But this is not the main use of the h-transform. The aim is to continue working with the generator of a Markov semigroup, and $L^h$ is actually such a generator if $h$ is harmonic in the sense that $Lh = 0$. In this case, and provided $L$ is a diffusion operator according to Sect. 1.11,
\[ L^h f = Lf + 2\,\Gamma(\log h, f) \]
associated process is known as the h-process associated to the original Markov pro-
cess with semigroup P.
It is beyond the scope of this short section to develop the full theory of
h-processes (which is indeed quite complicated and for which we refer the reader to
the appropriate literature). However, we may at least give a flavor of it by compar-
ison with Markov chains (cf. Sect. 1.9). Indeed, consider a Markov chain (Xk )k∈N
(on a finite set E, say) with transition matrix P = (P (x, y))(x,y)∈E×E , and assume
for simplicity that for any (x, y) ∈ E × E, P (x, y) > 0. The quantity P (x, y) rep-
resents the probability that the chain will jump from x to y. Now restrict the matrix
$P(x, y)$ to $(x, y) \in A \times A$ where $A \subset E$, and denote by $P^A$ this restricted square matrix. The (unique strictly positive) Perron-Frobenius eigenvector $U$ of $P^A$ satisfies
\[ \sum_{y \in A} P^A(x, y)\,U(y) = \lambda\,U(x) \]
\[ Q^A(x, y) = \frac{1}{\lambda\,U(x)}\,P^A(x, y)\,U(y), \quad (x, y) \in A \times A. \]
Given $x \in A$, for $k < \ell$, consider the law of $(X_1, \dots, X_k)$ conditioned to the event $\{X_i \in A,\ 1 \le i \le \ell\}$. This is not the law of a Markov chain, but it becomes so if $\ell \to \infty$. In the limit, this law is the law of the Markov chain with transition matrix $Q^A$. This appears as a consequence of some elementary (but rather tedious)
direct computations, which, in this case, can be made explicit from the matrix
P . The h-transform is then simply the transfer of this principle to the continuous
time case, in the larger setting of Markov semigroups. Notice that in this case the
h-transform is the same operation as the ground phase state transformation de-
scribed in Sect. 1.15.6.
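
The Markov chain construction above can be tried out numerically. The following is a minimal sketch under stated assumptions: the 3-state matrix `P`, the subset `A` and the helper names `restrict`, `perron_frobenius` and `h_transform` are hypothetical choices made for illustration only, and the Perron-Frobenius pair (λ, U) is approximated by plain power iteration. The point to observe is that the transformed matrix $Q^A$ has strictly positive entries and rows summing to 1, so that it is again a Markov transition matrix.

```python
# Hypothetical 3-state chain with strictly positive transition matrix;
# all numerical values are illustrative assumptions.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]
A = [0, 1]  # the subset A of E in which the chain is conditioned to stay


def restrict(matrix, subset):
    """Restricted square matrix P^A = (P(x, y)) for x, y in A."""
    return [[matrix[x][y] for y in subset] for x in subset]


def perron_frobenius(matrix, iterations=500):
    """Power iteration for the positive eigenvector U and its eigenvalue."""
    size = len(matrix)
    U = [1.0 / size] * size
    lam = 0.0
    for _ in range(iterations):
        V = [sum(matrix[x][y] * U[y] for y in range(size)) for x in range(size)]
        lam = sum(V)  # U is normalized to sum 1, so sum(V) estimates lambda
        U = [v / lam for v in V]
    return lam, U


def h_transform(matrix, lam, U):
    """Q^A(x, y) = P^A(x, y) U(y) / (lam U(x))."""
    size = len(matrix)
    return [[matrix[x][y] * U[y] / (lam * U[x]) for y in range(size)]
            for x in range(size)]


PA = restrict(P, A)
lam, U = perron_frobenius(PA)
QA = h_transform(PA, lam, U)
row_sums = [sum(row) for row in QA]  # each row of Q^A should sum to 1
```

For this particular choice, the restricted matrix has trace 1 and determinant 0.19, so that λ = (1 + √0.24)/2, which gives a simple independent check of the iteration.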
1.15.9 Subordination
Subordination is another way to produce new semigroups from a given one, through
time averaging. Let (Pt )t≥0 be a Markov semigroup on E with generator L. For any
probability measure ν on R+ , define a new Markov operator by setting
\[ P_\nu = \int_0^\infty P_t \, d\nu(t). \]
law ν, independent of the process (Xt )t≥0 . Then Pν is the law of the random variable
XT in the sense that for all bounded measurable functions f on E, and all x ∈ E,
\[ P_\nu f(x) = \mathbb{E}\big( f(X_T) \mid X_0 = x \big). \]
Therefore, while (Pt )t≥0 describes the solution of the parabolic heat equation as-
sociated to the generator L, the semigroup (Qs )s≥0 is the solution of the extended
elliptic equation ∂s² + L = 0 on R+ × E (elliptic at least when L itself is elliptic).
Again, the preceding operation has a clear probabilistic interpretation. The
measure $\nu_s^{1/2}$ arising in (1.15.1) is precisely the law of the hitting time
T = inf{t > 0 ; Bt = 0} for a one-dimensional Brownian motion (Bt )t≥0 starting
from B0 = s > 0. Indeed, it is enough to observe that, for α > 0, the function
$F(t, x) = e^{-\sqrt{\alpha}\, x - \alpha t}$, t ≥ 0, x ∈ R, satisfies $(\partial_t + \partial_x^2) F = 0$ and is bounded on R+ .
Therefore F (t, Bt ) is a martingale on [0, T ] such that E(F (T , BT )) = E(F (0, s))
(since B0 = s) from which it follows that
\[ \mathbb{E}_s\big( e^{-\alpha T} \big) = e^{-\sqrt{\alpha}\, s} \]
which is the known Laplace transform of the probability measure $\nu_s^{1/2}$.
The preceding representation of the solutions of the elliptic equation
$(\partial_s^2 + L)H = 0$ from the semigroup (Qs )s≥0 and the law of some random variable
on R+ may be easily extended to more general equations of the form
\[ \big( \partial_s^2 + \alpha(s)\,\partial_s + L \big) H = 0, \qquad H(0, x) = f, \]
with the help of the hitting time $\inf\{t > 0,\ \hat B_t = 0\}$ where $(\hat B_t)_{t \ge 0}$ is the diffusion on
R+ with generator $\partial_t^2 + \alpha(t)\,\partial_t$ and initial value $\hat B_0 = s$. At least for H bounded, one
may consider to this end the martingale $H(\hat B_t, X_t)$, t ≥ 0, with (Xt )t≥0 the Markov
process with generator L assumed to be independent of $(\hat B_t)_{t \ge 0}$. Then, by the
martingale property, we use the fact that $\mathbb{E}_{(s,0)}(H(\hat B_0, X_0)) = \mathbb{E}_{(s,0)}(H(\hat B_T, X_T))$. Of
course, this construction yields a new semigroup only when α is constant. For
example, later in Sect. 2.2, p. 81, a slight modification of the 1/2-stable subordinator
will be considered, defined for α > 0 by $d\nu_s^{\alpha,1/2}(t) = e^{\alpha s - \alpha^2 t}\, d\nu_s^{1/2}(t)$ and satisfying
\[ \int_0^\infty e^{-\lambda t}\, d\nu_s^{\alpha,1/2}(t) = e^{-s(\sqrt{\lambda + \alpha^2} - \alpha)}, \qquad \lambda \in \mathbb{R}_+. \tag{1.15.2} \]
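
The two Laplace transforms above lend themselves to a quick numerical sanity check. The sketch below rests on an assumption that is not restated in this section: the measure $\nu_s^{1/2}$ is taken to have the density $s\,(2\sqrt{\pi}\,t^{3/2})^{-1} e^{-s^2/4t}$ on (0, ∞), which is the classical law of the hitting time of 0 for the diffusion with generator $\partial_x^2$ started at s. The function name `tilted_laplace_transform`, the substitution t = u², the truncation level and the test values are illustrative choices; the truncation requires λ + α² to be bounded away from 0.

```python
import math


def tilted_laplace_transform(lam, s, alpha=0.0, u_max=12.0, steps=40000):
    """Approximate int_0^inf e^{-lam t} e^{alpha s - alpha^2 t} d nu_s^{1/2}(t).

    Assumes the density s / (2 sqrt(pi) t^{3/2}) * exp(-s^2 / (4 t)) and uses
    the substitution t = u^2 followed by a midpoint rule on (0, u_max].
    Taking alpha = 0 gives the plain Laplace transform of nu_s^{1/2}.
    """
    h = u_max / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        exponent = -(lam + alpha * alpha) * u * u - s * s / (4.0 * u * u)
        total += math.exp(exponent) / (u * u)
    return math.exp(alpha * s) * s / math.sqrt(math.pi) * total * h


s = 1.5
plain = tilted_laplace_transform(2.0, s)              # compare with exp(-sqrt(2) s)
tilted = tilted_laplace_transform(1.3, s, alpha=0.7)  # compare with (1.15.2)
plain_expected = math.exp(-math.sqrt(2.0) * s)
tilted_expected = math.exp(-s * (math.sqrt(1.3 + 0.7 ** 2) - 0.7))
```

Both numerical values should agree with the closed-form expressions up to the quadrature error, which illustrates that the exponential tilt by $e^{\alpha s - \alpha^2 t}$ merely shifts the Laplace variable from λ to λ + α².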
To conclude this section, we note that the subordination procedure may be de-
veloped for lots of different representations of the operator L. These usually make
sense only in some specific situations or on some restricted classes of functions. For
example, the Riesz potentials $(-L)^{-\alpha}$ may be represented for α > 0 as
\[ \gamma_\alpha^{-1} \int_0^\infty t^{\alpha - 1}\, P_t \, dt \]
where $\gamma_\alpha = \int_0^\infty t^{\alpha - 1} e^{-t}\, dt$. This operator may be bounded from some Lp -space
into another Lq -space as is the case, for example, for the heat semigroup, but is only
defined on the space of mean-zero functions in the finite measure case in general.
Although these representations should be handled with care in general, they provide
useful norm bounds and positivity preserving properties of the associated operators.
This integral does not make sense in general although it may be defined under mini-
mal reasonable conditions. Furthermore, it has a clear meaning in specific instances.
For example, as will be discussed later in Chap. 4, as soon as the invariant measure
μ of (Pt )t≥0 is finite (and normalized into a probability measure) and satisfies a
Poincaré inequality, it holds that for some constant λ > 0,
\[ \|P_t f\|_2 \le e^{-\lambda t}\, \|f\|_2, \qquad t \ge 0, \]
$\frac{1}{t} \int_0^t f(X_s)\, ds$ converges to 0 (μ-almost surely) as t → ∞.
1.16.1 Γ2 Operator
call from Definition 1.4.2 the carré du champ operator Γ defined on a suitable
algebra A of functions in the L2 (μ)-domain D(L) of L by
\[ \Gamma(f, g) = \frac{1}{2}\big( L(fg) - f\,Lg - g\,Lf \big), \qquad (f, g) \in \mathcal{A} \times \mathcal{A}. \]
The idea (taken from a Riemannian viewpoint, as explained below) is to formally
repeat this definition, replacing the product operation by Γ, to define a new operator
Γ2 (iterated carré du champ operator) as
\[ \Gamma_2(f, g) = \frac{1}{2}\big( L\,\Gamma(f, g) - \Gamma(f, Lg) - \Gamma(Lf, g) \big), \tag{1.16.1} \]
for any pair (f, g) of functions such that the various terms on the right-hand side
are well-defined. As for the carré du champ operator Γ, we often write more simply
Γ2 (f ) = Γ2 (f, f ).
By symmetry of L with respect to the measure μ, and as long as f , Lf , g, Lg
and Γ(f, g) are in the L2 (μ)-domain of L, and if moreover Γ(f, g) is in the L1 (μ)-
domain, the integration by parts formula for the Γ2 operator reads as
\[ \int_E \Gamma_2(f, g)\, d\mu = \int_E (Lf)(Lg)\, d\mu. \tag{1.16.2} \]
It might also be useful to record at this stage that the diffusion property for L or
Γ (Definition 1.11.1) leads to a change of variables formula for the Γ2 operator. For
example, using (1.11.3) and (1.11.5), if ψ : R → R is smooth enough, an elementary
calculation yields that
\[ \Gamma_2\big(\psi(f)\big) = \psi'(f)^2\, \Gamma_2(f) + \psi'(f)\,\psi''(f)\, \Gamma\big(f, \Gamma(f)\big) + \psi''(f)^2\, \Gamma(f)^2. \tag{1.16.3} \]
Note that this formula is presented in (C.6.7), p. 516, as a consequence of the
standard differential calculus rules in differentiable manifolds whereas the diffusion
property from Definition 1.11.1 emphasizes here a more intrinsic and efficient
calculus, called Γ-calculus (or Γ2-calculus). The power of this calculus will be
demonstrated at length throughout this work.
Some care has to be taken in the definition of Γ and Γ2 concerning the choice of
the class A of functions. This issue will be discussed in Chap. 3. For the examples
considered in this short introduction, say on E = Rn or on a manifold, A may be
taken for simplicity to be the class of smooth functions with compact support. If
Δ is the standard Laplacian on Rn , and if f : Rn → R is smooth, then
where Ric(L) is a symmetric tensor defined from the Ricci tensor Ricg of the
Riemannian (co-) metric g by Ric(L) = Ricg +∇∇W . On Rn with the flat Euclidean
metric, and for the usual Laplacian Δ, this would simply be Ric(L) = ∇∇W and
thus
\[ \Gamma_2(f) = |\nabla\nabla f|^2 + \nabla\nabla W(\nabla f, \nabla f) \tag{1.16.5} \]
on smooth functions f : Rn → R.
As is clear from these examples, unlike the carré du champ operator Γ, the new
operator Γ2 is not always positive. For Laplacians Δg on Riemannian manifolds (M, g),
the Γ2 operator is an expression of the Bochner-Lichnerowicz formula (1.16.4)
(cf. Theorem C.3.3, p. 509) and is positive if (and only if) the Ricci curvature Ricg
of the manifold is positive. More precisely, for an elliptic operator L, there is
always a function ρ(x) such that for every smooth function f and at every point x
in the space, Γ2 (f ) ≥ ρ(x) Γ(f ). For a Laplacian on a Riemannian manifold, the
best possible function ρ(x) in such an inequality is precisely the infimum of the
Ricci tensor (that is, at any point, the smallest eigenvalue of some symmetric matrix
evaluated from the coefficients of L). For non-elliptic operators, there is no such
function in general.
Many differential inequalities may be seen as consequences of an inequality of
the form
\[ \Gamma_2(f) \ge \rho\, \Gamma(f) \tag{1.16.6} \]
for some ρ ∈ R and all f ’s in A (or in the respective domains of Γ and Γ2 ). Such an
inequality will be called a curvature condition CD(ρ, ∞). To explain the meaning
of the second parameter ∞, we turn to a more general definition.
n is greater than or equal to the (topological) dimension of the manifold (that is, the
example from which the inequality actually takes its name). For example, the usual
Laplacian Δ on Rn satisfies the condition CD(0, n) (since |∇∇f |² ≥ (1/n)(Δf )²).
But this is not necessarily true for other operators. For example, the Ornstein-
Uhlenbeck operator investigated in Sect. 2.7.1, p. 103, satisfies CD(1, ∞) on any
(finite-dimensional) state space, but does not satisfy CD(ρ, n) for any finite n.
As presented in Sect. 1.11.3, a general elliptic differential operator L on a
manifold with dimension n is uniquely decomposed as L = Δg + Z where Δg is the
Laplacian associated to a Riemannian (co-) metric g and Z is a vector field. Then L
satisfies a curvature-dimension condition CD(ρ, m) if and only if m ≥ n and,
setting ∇S Z to be the symmetric covariant derivative of Z in the metric g (that is the
symmetrized tensor ∇Z),
\[ \mathrm{Ric}_g - \nabla_S Z \ge \rho\, g + \frac{1}{m - n}\, Z \otimes Z. \tag{1.16.7} \]
Note in particular that m = n only when Z = 0 so that, in this sense, Laplacians are
operators with minimal dimension among all elliptic operators. Note that here the
curvature condition CD(ρ, ∞) boils down to Ricg −∇S Z ≥ ρ g. In particular, when
Z = −∇W · ∇, as will normally be the case throughout this book, (1.16.7) reads as
\[ a' \ge \rho + \frac{a^2}{n - 1}. \tag{1.16.9} \]
Such one-dimensional operators will be illustrated in model examples in Chap. 2.
Owing to their simplicity, they will prove to be useful when testing our intuition
regarding similar convexity properties for more general operators on more general
spaces.
M. Fukushima (see also [91, 190, 294, 431]). Rota’s Lemma (Lemma 1.6.2) may be
found in [196].
Ergodic properties refer to a variety of behaviors as time goes to infinity and only
a very specific feature is examined in Sect. 1.8 and throughout this book.
Elements on Markov chains (Sect. 1.9) on finite and countable spaces may be
found in [309, 333, 357].
Complete accounts on stochastic differential equations and diffusion processes
and semigroups, as outlined in Sects. 1.10 and 1.11, are [50, 152, 252, 255, 263,
350, 358]. Martingale problems and the interplay between analysis and probability
theory is exposed in [256, 393].
Hypo-ellipticity is one major topic of the Hörmander theory [247, 248] briefly
discussed in Sect. 1.12.
The relevant aspects on domains of infinitesimal operators are briefly presented in
Sect. 1.13 following the standard material and references on the subject (see above
and also Appendix A and Chap. 3).
Section 1.15 gathers a variety of tools in the investigation of Markov semigroups,
generators and processes. Some of them are part of the folklore and not always
explicitly stated in standard references. Girsanov (Sect. 1.15.5) and Feynman-Kac
(Sect. 1.15.6) formulas are presented in standard references on stochastic calculus
such as [252, 350, 358, 393]. Potential theory and h-transforms as mentioned in
Sect. 1.15.8 are deeply investigated in [164] (see also [153]) where the reader will
find a comprehensive account of the interplay between potential theory and proba-
bility theory. More on subordinators (Sect. 1.15.9) may be found in [15, 64, 65, 379]
and in the references therein. The basics on Riesz potentials may be found for ex-
ample in [164, 342, 388].
The Γ2 operator and the associated notion of a curvature-dimension condition
were introduced in the early contributions [24, 36] on the basis of the Bochner-
Lichnerowicz formula in Riemannian geometry towards the study of logarithmic
Sobolev inequalities (see Chap. 5) and Riesz transforms (cf. [26]). Curvature or
curvature-dimension conditions in terms of the Γ2 operator are sometimes referred
to as the “Γ2 criterion”.
Chapter 2
Model Examples
This chapter is devoted to some basic model examples, which will serve as a guide
throughout this book. These model examples will also provide an opportunity to
illustrate some of the ideas, definitions and properties of Markov semigroups, op-
erators and processes introduced in the first chapter. Moreover, they will help to
set up the framework for the investigation of more general Markov semigroups and
generators as will be achieved in the next chapter.
The two simplest examples of diffusion semigroups and generators (at least
among those for which the semigroup is explicitly known) are the (Euclidean) heat
semigroup and the Ornstein-Uhlenbeck semigroup, with associated Brownian
motion and Ornstein-Uhlenbeck process. There are of course other fundamental
examples, as models or references for comparison, for which there are however in
general no explicit formulas for the semigroups so that they are only described in terms
of their generators. We present here some of these models, with a special focus on
the underlying Laplacian or diffusion generator. With respect to Chap. 1, most of the
Markov semigroups presented in this chapter will indeed be introduced by their gen-
erators (defined on classes of smooth functions). The examples considered here will
actually present an opportunity to discuss the existence and uniqueness of (symmet-
ric) semigroups with given generators (emphasized in the preceding chapter as the
essential self-adjointness issue). Complete justifications are developed in the next
chapter together with the description of the relevant classes of functions.
The chapter starts with the three geometric models of the heat semigroup on the
Euclidean space, the sphere and the hyperbolic space. For each model, we care-
fully describe the geometric framework and present the natural Laplacian giving
rise to the associated heat, or Brownian, semigroup. Sections 2.4 and 2.5 discuss
the heat semigroup with Neumann, Dirichlet or periodic conditions on R or on an
interval of R. More general Sturm-Liouville operators on an interval of the real line
are examined next, and are illustrated in Sect. 2.7 by the examples of the Ornstein-
Uhlenbeck or Hermite, Laguerre and Jacobi operators. These important examples
are the three models of one-dimensional diffusion operators which may be diag-
onalized with respect to a basis of orthogonal polynomials, thus leading to deep
analytic connections. Following the first chapter, the exposition often combines the
various analytic, probabilistic and geometric viewpoints.
From the operator-theoretic viewpoint developed in the first chapter, the heat or
Brownian semigroup in the Euclidean space Rn is the Markov diffusion semigroup
with infinitesimal generator the usual Laplacian Δ = ΔRn on Rn defined on smooth
functions f : Rn → R by
\[ \Delta f = \Delta_{\mathbb{R}^n} f = \sum_{i=1}^n \partial_i^2 f \]
and with invariant and reversible measure the Lebesgue measure dx. The carré du
champ operator Γ is simply given by
\[ \Gamma(f, g) = \nabla f \cdot \nabla g = \sum_{i=1}^n \partial_i f\, \partial_i g \]
\[ p_t(x, y) = \frac{1}{(4\pi t)^{n/2}}\, e^{-|x - y|^2/4t}, \qquad t > 0, \ (x, y) \in \mathbb{R}^n \times \mathbb{R}^n. \tag{2.1.1} \]
These kernels classically solve the (parabolic) heat equation ∂t pt = Δpt (where
Δ acts either on x or y). The semigroup (Pt )t≥0 may be described in probabilistic
2.1 Euclidean Heat Semigroup 79
As such, the semigroup (Pt )t≥0 is a Markov semigroup in the sense of (i)–(v)
of Definition 1.2.2, p. 12. To check the continuity assumption (vi), that is, for any
f ∈ L2 (dx), Pt f → f in L2 (dx) as t → 0, start with a smooth and compactly
supported function f and extend the result by density.
The semigroup P = (Pt )t≥0 is thus called the heat or Brownian semigroup on Rn .
It is one of the few examples for which an explicit description of the transition prob-
abilities is available. The Chapman-Kolmogorov equation (1.3.2), p. 17, appears as
a consequence of the fact that the sum of two independent Gaussian vectors in Rn
with respective covariance matrices 2t Id and 2s Id is a Gaussian vector with
covariance 2(t + s) Id. Similarly, the dual Fokker-Planck equation (1.5.2), p. 24, is easily
checked with the observation that, for R = |x − y|², ΔR = 2n and Γ(R) = 4R (on
either x or y).
Note that the constant function 1 is not in the L2 (dx)-domain of Δ since it is not
square integrable with respect to the Lebesgue measure dx. As observed in Sect. 1.4, p. 18,
the equation Pt (1) = 1 is not a direct consequence of the fact that Δ(1) = 0. We
come back to this issue later when dealing with completion and boundary values.
Among the further useful observations on the Euclidean heat semigroup (Pt )t≥0 ,
observe that the operators Pt are not Hilbert-Schmidt. Indeed, for every t > 0 and
x ∈ Rn , $\int_{\mathbb{R}^n} p_t(x, y)^2\, dy = \frac{1}{(8\pi t)^{n/2}}$ and hence
\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} p_t(x, y)^2\, dx\, dy = \infty \]
(see Sect. A.6, p. 483). This is again a consequence of the fact that the reversible
measure (the Lebesgue measure) is infinite. Actually, there are no square integrable
eigenvectors of the Laplacian Δ. Indeed, the eigenvectors are typically of the form
$y \mapsto e^{i x \cdot y}$, which are never integrable.
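
Two of the properties of the Gaussian kernels (2.1.1) mentioned above, the Chapman-Kolmogorov equation and the value of the integral of the squared kernel, can be checked numerically in dimension n = 1. The following is only a sketch: the helper names `heat_kernel` and `integrate`, the integration window and the chosen values of t, s, x, y are illustrative assumptions, and the integrals are approximated by a midpoint rule on a large bounded interval.

```python
import math


def heat_kernel(t, x, y):
    """Kernel (2.1.1) on the real line (n = 1), for the generator d^2/dx^2."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def integrate(func, lower=-40.0, upper=40.0, steps=80000):
    """Midpoint rule on [lower, upper]; adequate for rapidly decaying integrands."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (k + 0.5) * h) for k in range(steps))


t, s, x, y = 0.7, 1.1, 0.3, -0.8

# Chapman-Kolmogorov: integrating p_t(x, z) p_s(z, y) in z should give p_{t+s}(x, y).
chapman_kolmogorov = integrate(lambda z: heat_kernel(t, x, z) * heat_kernel(s, z, y))

# Integral of the squared kernel in y: expected to be (8 pi t)^(-1/2), whatever x is.
square_mass = integrate(lambda z: heat_kernel(t, x, z) ** 2)
```

Since `square_mass` does not depend on x, integrating it once more against the (infinite) Lebesgue measure in x diverges, which is the failure of the Hilbert-Schmidt property described above.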
On the geometric side, one can immediately verify that the Γ2 operator, (1.16.1),
p. 71, of the standard Laplacian Δ on Rn is given on smooth functions f : Rn → R
by
\[ \Gamma_2(f) = \Gamma_2(f, f) = |\nabla\nabla f|^2 = \sum_{i,j=1}^n \big( \partial_{ij}^2 f \big)^2. \]
Since by the Cauchy-Schwarz inequality $\sum_{i,j=1}^n (\partial_{ij}^2 f)^2 \ge \frac{1}{n} \big( \sum_{i=1}^n \partial_i^2 f \big)^2$, for
every smooth f ,
\[ \Gamma_2(f) \ge \frac{1}{n}\, (\Delta f)^2 \]
so that the Laplace operator Δ on Rn satisfies the curvature-dimension condition
CD(0, n) of Definition 1.16.1, p. 72 (with optimal values).
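
At a fixed point, the inequality behind CD(0, n) only involves the Hessian matrix of f: the sum of its squared entries dominates the square of its trace divided by n. The short sketch below samples random symmetric matrices playing the role of Hessians and evaluates the gap between the two sides; the function name `gamma2_and_bound`, the dimension n = 4 and the sampling scheme are illustrative assumptions. The identity matrix (the Hessian of |x|²/2) gives equality, which is one way to see that the dimension n cannot be improved.

```python
import random


def gamma2_and_bound(hessian):
    """Return (|Hess f|^2, (trace Hess f)^2 / n) for a symmetric n x n matrix."""
    n = len(hessian)
    frobenius_sq = sum(hessian[i][j] ** 2 for i in range(n) for j in range(n))
    laplacian = sum(hessian[i][i] for i in range(n))
    return frobenius_sq, laplacian ** 2 / n


random.seed(0)
n = 4
gaps = []
for _ in range(1000):
    upper = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # Symmetrize by copying the upper triangle, as for a matrix of second derivatives.
    hessian = [[upper[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]
    gamma2, bound = gamma2_and_bound(hessian)
    gaps.append(gamma2 - bound)

identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
identity_gamma2, identity_bound = gamma2_and_bound(identity)
identity_gap = identity_gamma2 - identity_bound  # equality case
```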
Using this example, we next illustrate some of the semigroup tools and properties
described in Chap. 1. We start with the commutation relations with suitable vector
fields as described in Sect. 1.15.4, p. 60. Brownian motion B = (Bt )t≥0 is clearly
translation invariant. The process starting at x is the process starting at 0
translated by x. This property actually illustrates the commutation of the Laplacian
with translations. A translation in the direction u ∈ Rn is expressed on functions as
f (x) → Tt f (x) = f (x + tu), t ≥ 0, x ∈ Rn . As t ≥ 0 evolves, the family (Tt f )t≥0
is nothing else but the solution of the differential equation ∂t Tt f = ZTt f where Z
is the constant vector field $Zf = \sum_{i=1}^n u_i\, \partial_i f$. The vector field Z clearly commutes
with Δ, and thus for the heat semigroup P = (Pt )t≥0 we have Pt (Ts f ) = Ts Pt f for
all choices of t and s (≥ 0).
A similar observation may be applied to rotations. The Laplacian Δ indeed also
commutes with the operators Θij = xi ∂j − xj ∂i , 1 ≤ i, j ≤ n. In particular, this
commutation justifies the invariance by rotation of the law of Brownian motion
B = (Bt )t≥0 when x = 0. Namely, if $r_{ij}^t$ denotes the rotation of angle t in the 2-plane
(ei , ej ), then for any f and x ∈ Rn , $f(r_{ij}^t x) = R_t f(x)$ where $R_t = e^{t\Theta_{ij}}$ is the
(semi-) group generated by the vector field Θij . Since Θij vanishes at the origin,
and hence 0 is left-invariant under Rt , the commutation between Δ and Θij ensures
that for any bounded measurable function f , Ps Rt f (0) = Rt Ps f (0) = Ps f (0),
justifying the rotational invariance of Brownian motion. Of course, there is no need to
use such a complicated argument to observe the obvious fact that the (Gaussian) law
of Bt is rotation invariant when the origin is 0. This presentation however empha-
sizes how to use similar arguments in more involved instances, when for example
the law of the associated Markov process is not explicitly known.
The law of the Brownian motion B = (Bt )t≥0 has of course several other remark-
able properties. In particular, for every t ≥ 0, the law of Bt starting from 0 is also the
law of $t^{1/2} B_1$. This again is a commutation property, this time with the vector field
$Df(x) = \sum_{i=1}^n x_i\, \partial_i f(x)$, the exponential of which is the dilation semigroup
$f(x) \mapsto D_t f(x) = f(e^t x)$, for which
\[ [\Delta, D] = \Delta D - D \Delta = 2\Delta. \]
(where Cn > 0 is the suitable normalizing constant). For any bounded measurable
function f on Rn , the Cauchy kernels yield the harmonic extension H (t, x) of f on
2.2 Spherical Heat Semigroup
This section is devoted to the analogue of the heat semigroup on the standard sphere
Sn (in Rn+1 ). This is one of the most useful model spaces of a compact manifold without
boundary, moreover of strictly positive (constant) curvature. It is closely related to
the Jacobi semigroups described below in Sect. 2.7.4.
To get a clear picture, it is important to first provide suitable geometric descrip-
tions of the state space in order to address the various objects under investigation,
such as the generator and carré du champ operator. The spherical Laplacian,
denoted by ΔSn below, may be defined as the Laplace-Beltrami operator on the
Riemannian manifold Sn as introduced in Sect. C.3, p. 504. It is invariant and symmetric
with respect to the unique probability measure on Sn , denoted σn , which is
invariant under rotations in Rn+1 . As an embedded (in Rn+1 ) manifold, representations of
the sphere Sn in explicit charts may actually be provided. We present two main
representations, the orthogonal projection representation and the stereographic pro-
jection representation. For each representation, we provide an explicit description
of the spherical Laplacian, the invariant measure and the associated carré du champ
operator. Below, we use the standard unit sphere Sn in Rn+1 , but simple scaling
arguments cover spheres of arbitrary radius. Recall also that, according to the expo-
sition in Appendix C, emphasis is placed on the (Riemannian) co-metric rather than
the usual metric.
(where dx is the Lebesgue measure on the unit ball B of Rn and cn > 0 is the
normalization constant which ensures that μ is a probability measure). This mea-
sure μ is therefore, up to some constant, the image of σn under the orthogonal
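projection from Sn onto B. The Laplacian ΔSn itself in this chart takes the form
(compare (1.11.8), p. 46)
\[ \Delta_{\mathbb{S}^n} = \frac{1}{w} \sum_{i,j=1}^n \partial_i \big( w\, g^{ij}\, \partial_j \big) = \sum_{i,j=1}^n \big( \delta^{ij} - x^i x^j \big)\, \partial_{ij}^2 - n \sum_{i=1}^n x^i\, \partial_i. \tag{2.2.1} \]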
Via the chain rule formula, if f is the restriction to the sphere of a smooth function
$f(x^1, \ldots, x^{n+1})$ defined in Rn+1 , then ΔSn f is the restriction to the sphere of the
quantity
\[ \sum_{i,j=1}^{n+1} \big( \delta^{ij} - x^i x^j \big)\, \partial_{ij}^2 f - n \sum_{i=1}^{n+1} x^i\, \partial_i f. \tag{2.2.2} \]
In other words, (2.2.1) is in fact valid in Rn+1 , and not only in the local system of
coordinates. Hence, for explicit computations, it is not necessary to replace one of
the coordinates, say $x^{n+1}$, by $\pm\sqrt{1 - \sum_{i=1}^n (x^i)^2}$.
From the representation (2.2.2) of ΔSn as an operator on Rn+1 , it is not
immediate that the associated Markov process lives on the unit sphere. However, the
characterization of ΔSn may be changed slightly into an operator L in Rn+1
satisfying
\[ L(x^i) = -n\, x^i, \qquad \Gamma(x^i, x^j) = \delta^{ij}\, |x|^2 - x^i x^j, \qquad 1 \le i, j \le n + 1. \]
This operator obviously coincides with ΔSn on the unit sphere. Moreover, it may
be observed that for this new operator L(|x|²) = 0 and Γ(|x|²) = 0. Then, it is quite
immediate that for the diffusion process $(X_t^x)_{t \ge 0}$ with generator L, $|X_t^x|^2$ is constant,
and therefore that the process stays forever on the sphere it started from.
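
The two identities L(|x|²) = 0 and Γ(|x|²) = 0 can be confirmed by a direct evaluation at an arbitrary point. The sketch below assumes that L acts as a diffusion operator in the sense of Sect. 1.11, that is, Lf = Σ Γ(x^i, x^j) ∂²ij f + Σ L(x^i) ∂i f and Γ(f) = Σ Γ(x^i, x^j) ∂i f ∂j f; the function name `extended_sphere_operator` and the sample point are hypothetical choices for illustration.

```python
def extended_sphere_operator(x, gradient, hessian):
    """Evaluate (L f, Gamma(f)) at a point x of R^{n+1}.

    Uses L(x^i) = -n x^i and Gamma(x^i, x^j) = delta^{ij} |x|^2 - x^i x^j,
    with f described at x by its gradient and Hessian.
    """
    dim = len(x)  # dim = n + 1
    n = dim - 1
    r2 = sum(c * c for c in x)
    coeff = [[(r2 if i == j else 0.0) - x[i] * x[j] for j in range(dim)]
             for i in range(dim)]
    second_order = sum(coeff[i][j] * hessian[i][j]
                       for i in range(dim) for j in range(dim))
    first_order = -n * sum(x[i] * gradient[i] for i in range(dim))
    gamma_f = sum(coeff[i][j] * gradient[i] * gradient[j]
                  for i in range(dim) for j in range(dim))
    return second_order + first_order, gamma_f


x = [0.3, -1.2, 0.5, 2.0]  # arbitrary point of R^4, so that n = 3
dim = len(x)

# f = |x|^2 has gradient 2x and Hessian 2 Id.
grad_r2 = [2.0 * c for c in x]
hess_r2 = [[2.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
L_r2, gamma_r2 = extended_sphere_operator(x, grad_r2, hess_r2)

# f = x^1 (index 0 here) recovers the defining relations of L.
grad_x1 = [1.0, 0.0, 0.0, 0.0]
hess_x1 = [[0.0] * dim for _ in range(dim)]
L_x1, gamma_x1 = extended_sphere_operator(x, grad_x1, hess_x1)
```

Both `L_r2` and `gamma_r2` vanish up to rounding, whatever the point x, which is the computation underlying the fact that |X_t^x|² stays constant.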
Both the carré du champ operator Γ and the invariant measure μ of the spherical
Laplacian are invariant under the action of the rotations of Rn+1 . This is easily
seen for rotations with vertical axis. Consider the first order differential operators in
Rn+1 , Θij = x i ∂j − x j ∂i , 1 ≤ i, j ≤ n + 1, already introduced in Sect. 2.1. Then
[ΔSn , Θij ] = 0. Indeed, a smooth function f is rotationally invariant if and only if
Θij f = 0 for every i, j . As already mentioned in the preceding section, the vector
field Θij generates a (semi)-group of rotations in the plane (ei , ej ). Since every
rotation is a composition of planar rotations in orthogonal planes, the claim follows.
\[ \Delta_{\mathbb{S}^n} = \frac{(1 + |x|^2)^2}{4}\, \Delta - \frac{n - 2}{2}\, \big( 1 + |x|^2 \big) \sum_{i=1}^n x^i\, \partial_i \tag{2.2.3} \]
where Δ is the Euclidean Laplacian in Rn . We refer the reader to Sect. 2.3 for
more details about the stereographic projection, viewed there as the restriction to
the sphere of an inversion in Rn+1 .
The preceding orthogonal (2.2.1) and stereographic (2.2.3) projection represen-
tations of the Laplacian on the sphere Sn , regarded as a sub-manifold of Rn+1 , do
keep the same metric and are thus equivalent. One of the interesting properties of
the stereographic projection is that, in this chart, the Euclidean and spherical met-
rics are proportional, in other words conformally equivalent. Generally speaking, a
conformal map from a Riemannian manifold M into itself is a map under which
the metric g of M is transformed to c(x)g for some strictly positive function c(x).
Two metrics g1 and g2 on a given manifold M are conformally equivalent when
g2 (x) = c(x)g1 (x) for some strictly positive function c(x). Conformal maps and
conformally equivalent metrics will play a crucial role in the study of Sobolev in-
equalities on Euclidean space and on the sphere (cf. Sect. 6.9, p. 313). Furthermore,
from a more analytical point of view, uniform ellipticity (that is, the existence of a
c > 0 such that (g ij ) ≥ c(δ ij ) in the sense of symmetric matrices, cf. (1.12.3), p. 50)
is not satisfied in the orthogonal projection representation at the boundary of the
unit ball (the equator of the sphere) whereas it is satisfied there in the stereographic
representation. Conversely, the reverse inequality (g ij ) ≤ c(δ ij ) holds in the projec-
tion representation in the neighborhood of the south-pole but does not hold in the
stereographic projection. Therefore, care must be taken over such properties which
are not invariant by changes of coordinates, and thus are not intrinsic.
There are still many other ways to consider the Laplace operator on the sphere
Sn and we briefly describe below some further examples (these, however, are not
really used later).
First, for Sn embedded into Rn+1 in the usual way, extend any smooth function
f on Sn to a function $\hat f$ defined on a neighborhood of Sn in Rn+1 , which is indepen-
\[ \Delta_{\mathbb{R}^{n+1}} = \partial_r^2 + \frac{n}{r}\, \partial_r + \frac{1}{r^2}\, \Delta_{\mathbb{S}^n}. \tag{2.2.4} \]
There is still another representation, more intrinsic in view of Lie group actions.
Consider the sphere Sn as the quotient space SO(n + 1)/SO(n), where SO(n + 1)
is the special orthogonal group in Rn+1 (SO(n) is then regarded as the subgroup of
SO(n + 1) which leaves the point (1, 0, . . . , 0) invariant). Recall the infinitesimal
rotations in Rn+1 , Θij = x i ∂j − x j ∂i , 1 ≤ i, j ≤ n + 1. These vector fields preserve
functions which are independent of the radius (since they commute with the
operator $\sum_{i=1}^{n+1} x^i\, \partial_i$). Now, for a function f on Sn given as the restriction to Sn of a
smooth function defined in a neighborhood of Sn in Rn+1 , the operator ΔSn may be
represented as
\[ \Delta_{\mathbb{S}^n} f = \sum_{1 \le i < j \le n+1} \Theta_{ij}^2 f. \tag{2.2.5} \]
In this representation, observe that, as for the usual Laplace operator on Rn , ΔSn is
given as the sum of squares of vector fields which commute with it, although they
do not commute with each other. Moreover, many more vectors than the dimension
of the space have to be used. This is an example of a Casimir operator on a homo-
geneous space. Such operators play a fundamental role in the analysis of compact
Lie groups.
According to the general theory presented in Chap. 3, the spherical Laplacian, con-
sidered for example on the class of smooth (C ∞ ) functions on Sn , defines the gen-
erator of the so-called spherical heat or Brownian semigroup (Pt )t≥0 on the sphere
Sn . The Markov process associated with this operator is called the spherical Brow-
nian motion. The spherical heat semigroup admits kernel densities pt (x, y), t > 0,
(x, y) ∈ Sn × Sn , with respect to the invariant measure σn (cf. Definition 1.2.4,
p. 14). Since the semigroup commutes with rotations on the sphere, and therefore
with any rotation R, the heat kernels satisfy pt (x, y) = pt (Rx, Ry) (t > 0). But for
any two pairs (x, y) and (x′, y′) of points on the sphere, there exists a rotation R
such that (x′, y′) = (Rx, Ry) if and only if x · y = x′ · y′. Using the intrinsic distance
on the sphere defined from the Laplacian (see Sect. C.4, p. 509, or (3.3.9), p. 166),
which in this case is d(x, y) = arccos(x · y), (x, y) ∈ Sn × Sn , pt (x, y) may be ex-
While there are no simple expressions for heat kernels on spheres, the latter ker-
nel qt (x, y) is on the other hand quite simple and is given by the celebrated Poisson
formula. Indeed, it is classical that the harmonic extension H to the unit ball in Rn+1
of a function f : Sn → R may be represented as
\[ H(z) = \int_{\mathbb{S}^n} \frac{1 - |z|^2}{|z - y|^{n+1}}\, f(y)\, d\sigma_n(y). \tag{2.2.6} \]
This formula is quite easy to check. Observe first that the map
\[ z \mapsto \frac{1 - |z|^2}{|z - y|^{n+1}} \]
is harmonic in the open unit ball B. Furthermore, for any z ∈ B, the measure
$\nu(z, dy) = \frac{1 - |z|^2}{|z - y|^{n+1}}\, d\sigma_n(y)$ is a probability measure on Sn . This is a consequence
of the harmonic property since the function $m(z) = \int_{\mathbb{S}^n} \nu(z, dy)$ is harmonic in B,
constant on any sphere of radius r < 1 (due to the rotation invariance of σn ) and
satisfies m(0) = 1, so that the constant value on spheres of radius r is 1. It remains
to observe that, as z → z0 ∈ Sn , ν(z, dy) converges to the Dirac mass at z0 . Then,
the function $\int_{\mathbb{S}^n} f(y)\, \nu(z, dy)$ is harmonic on the ball, and for f continuous on Sn ,
converges to f (z0 ) when z converges to z0 ∈ Sn . This is exactly what is expected
from the Poisson kernel, proving (2.2.6).
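
The Poisson formula (2.2.6) is easy to test on the circle S¹ in R² (the case n = 1), where σ1 is the normalized arc-length measure and the exponent n + 1 equals 2. In the sketch below, the function name `poisson_average`, the interior point z and the test functions are illustrative assumptions; the integral over the circle is approximated by an equispaced rule, which is very accurate for smooth periodic integrands.

```python
import math


def poisson_average(f, z, points=2000):
    """Approximate the integral of f(y) nu(z, dy) over the unit circle (n = 1)."""
    z_norm_sq = z[0] ** 2 + z[1] ** 2
    total = 0.0
    for k in range(points):
        theta = 2.0 * math.pi * k / points
        y = (math.cos(theta), math.sin(theta))
        dist_sq = (z[0] - y[0]) ** 2 + (z[1] - y[1]) ** 2  # |z - y|^(n+1), n = 1
        total += (1.0 - z_norm_sq) / dist_sq * f(y)
    return total / points


z = (0.4, -0.3)  # a point of the open unit disc

total_mass = poisson_average(lambda y: 1.0, z)                     # mass of nu(z, .)
linear_value = poisson_average(lambda y: y[0], z)                  # extension of y^1
quadratic_value = poisson_average(lambda y: y[0] ** 2 - y[1] ** 2, z)
```

Here `total_mass` should equal 1, while the other two values should reproduce the harmonic polynomials z¹ and (z¹)² − (z²)² at the point z, in agreement with the uniqueness of the harmonic extension.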
According to the Poisson formula, in polar coordinates (r = e−t x, x ∈ Sn ), the
density kernel qt (x, y) with respect to σn of the measure ν(z, dy) may be written
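\[ q_t(x, y) = \frac{1 - e^{-2t}}{|e^{-t} x - y|^{n+1}}. \]
Writing the harmonic extension of f as Qt f , the subordination representation
expresses that, as a semigroup,
\[ Q_t = \exp\bigg( -t \bigg( \sqrt{ -\Delta_{\mathbb{S}^n} + \frac{(n-1)^2}{4} } - \frac{n-1}{2} \bigg) \bigg), \qquad t \ge 0. \]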
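
The semigroup formula can be compared with the kernel qt on eigenfunctions. The sketch below relies on a classical fact that is assumed here rather than taken from this section: spherical harmonics of degree k on Sn are eigenfunctions of −ΔSn with eigenvalue k(k + n − 1), so that the exponent in Qt reduces to k. On the circle (n = 1) this is tested directly against the kernel, using θ ↦ cos(kθ) as eigenfunction; the names `poisson_exponent` and `apply_Q_on_circle` and the numerical values are hypothetical.

```python
import math


def poisson_exponent(k, n):
    """sqrt(k(k + n - 1) + (n - 1)^2 / 4) - (n - 1) / 2, expected to equal k."""
    return math.sqrt(k * (k + n - 1) + (n - 1) ** 2 / 4.0) - (n - 1) / 2.0


def apply_Q_on_circle(f, t, phi, points=4000):
    """(Q_t f)(x) at x = (cos phi, sin phi) on S^1, using the kernel q_t with n = 1.

    The function f is given in terms of the angle theta of the point y.
    """
    r = math.exp(-t)
    total = 0.0
    for j in range(points):
        theta = 2.0 * math.pi * j / points
        dist_sq = r * r - 2.0 * r * math.cos(phi - theta) + 1.0  # |e^{-t} x - y|^2
        total += (1.0 - r * r) / dist_sq * f(theta)
    return total / points


k, t, phi = 3, 0.5, 0.9
numerical = apply_Q_on_circle(lambda theta: math.cos(k * theta), t, phi)
predicted = math.exp(-t * poisson_exponent(k, 1)) * math.cos(k * phi)
```

The agreement of `numerical` and `predicted` reflects the fact that Qt multiplies a degree-k harmonic by e^{−kt}, which is exactly the factor r^k of its harmonic extension at radius r = e^{−t}.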
In the last part of this section, we describe the geometric and curvature features of
the sphere and its Laplacian. As the sphere Sn of dimension n is of constant
curvature n − 1, the Laplace operator ΔSn satisfies the curvature-dimension condition
CD(n − 1, n)
\[ \Gamma_2(f) \ge (n - 1)\, \Gamma(f) + \frac{1}{n}\, \big( \Delta_{\mathbb{S}^n} f \big)^2 \]
(for all f , say, in the algebra A of smooth functions) from Definition 1.16.1, p. 72,
with optimal values (cf. Sect. C.6, p. 513). Such a curvature-dimension condition is
similar to those of the one-dimensional Jacobi operators investigated in Sect. 2.7.4
\[ -\sum_{i=1}^{n_0} x^i\, \partial_i f = \Gamma\Big( \log \sqrt{1 - |x|^2},\, f \Big). \]
But now, the function $U = \sqrt{1 - |x|^2}$ is nothing else than the restriction to the
upper half-sphere of the first coordinate. It is therefore an eigenfunction associated to the
first eigenvalue, so that the operator Sn0 can be decomposed as
\[ \Delta_{\mathbb{S}^{n_0}} + (n - n_0)\, \nabla \log U \]
\[ \mathrm{Ric} - \nabla_S Z - \frac{1}{n - n_0}\, Z \otimes Z = (n - 1)\, \mathrm{Id}. \]
Comparing with (1.16.7), p. 73, observe that there is equality between tensors. Thus,
here we have found a fundamental example of operators satisfying the curvature-
dimension condition CD(n − 1, n) in an optimal way but for which n > n0 is not the
topological dimension of the state space. The latter operator will be further analyzed
in Sect. 6.9, p. 313, in connection with Sobolev-type inequalities.
The third model example is the heat semigroup on hyperbolic space. In the core of
this book, this model will not be used as often as the preceding models in Euclidean
and spherical spaces. So its description here will be somewhat more sketchy. We
start again with a geometric description of the underlying state space. Various rep-
resentations of the hyperbolic metric are available and here we emphasize two of
them.
2.3 Hyperbolic Heat Semigroup 89
on E = Rn−1 × (0, ∞). As in the preceding section, we use here upper indices
in accordance with the Riemannian geometry convention, putting emphasis on the
co-metric. It is not necessary here to change charts since this one covers at once
the entire manifold. The resulting manifold, called the hyperbolic space Hn (in the
upper half-space representation), is not compact. In this representation, the carré du
champ operator may be written as
\[ \Gamma(f) = (x^n)^2\, |\nabla f|^2 \]
and the invariant reversible measure is $d\mu(x) = (x^n)^{-n}\, dx$ (beware of the notation
here, $x^n$ is the n-th coordinate and $(x^n)^{-n}$ denotes its −n power). Here, as usual,
|∇f |² is the carré du champ operator of a smooth function f on Rn for the usual
Laplacian ΔRn and dx is the Lebesgue measure. The associated hyperbolic
Laplacian, denoted ΔHn , is given in this representation by
\[ \Delta_{\mathbb{H}^n} = (x^n)^2\, \Delta_{\mathbb{R}^n} - (n - 2)\, x^n\, \partial_n. \tag{2.3.1} \]
It should be pointed out that the hyperbolic metric degenerates at the boundary
$\{x^n = 0\}$ of the upper half-space $E = \mathbb{R}^{n-1} \times (0, \infty)$. However, the description
(2.3.1) is enough to define a unique symmetric semigroup with infinitesimal
generator $\Delta_{\mathbb{H}^n}$. Following the developments in Chap. 3, for an elliptic operator on
a manifold, knowledge of its action on the class of smooth compactly supported
functions is enough to describe a unique symmetric semigroup with this operator as
infinitesimal generator (this is the issue of essential self-adjointness) as soon as the
manifold is complete (see Proposition 3.2.1, p. 142, and Corollary 3.2.2, p. 143).
Now the distance on $E$ induced by $\Delta_{\mathbb{H}^n}$ (see Sect. C.4, p. 509) is the standard hyperbolic
Riemannian metric on $E$, and this metric is complete (the boundary is at an
infinite distance with respect to it). Therefore, the operator $\Delta_{\mathbb{H}^n}$ is essentially
self-adjoint on $E$ and is the infinitesimal generator of a unique symmetric semigroup
called the hyperbolic heat or Brownian semigroup. Moreover the operator satisfies
the curvature-dimension condition $CD(-(n - 1), n)$ (see below), hence this unique
semigroup is indeed Markov.
From a probabilistic viewpoint, the heat semigroup on Hn is the Markov semi-
group (up to the probabilistic normalization of the Laplacian) associated with the
process solving the stochastic differential equation
(in Stratonovich form). It is easily seen that this process does not reach the bound-
ary {x n = 0} in finite time and therefore that the semigroup is indeed Markov. The
Markov process (Xt )t≥0 thus constructed is called hyperbolic Brownian motion.
The hyperbolic heat semigroup admits density kernels $p_t(x, y)$, $t > 0$,
$(x, y) \in \mathbb{H}^n \times \mathbb{H}^n$, with respect to the invariant measure. They are expressed in terms
of the intrinsic distance associated with the hyperbolic Laplacian $\Delta_{\mathbb{H}^n}$ (see Sect. C.4,
p. 509 or (3.3.9), p. 166). In the upper half-space representation $E = \mathbb{R}^{n-1} \times (0, \infty)$,
the distance between $(x_1, y_1)$ and $(x_2, y_2)$ is given by
\[ \cosh^{-1}\left( \frac{|x_1 - x_2|^2 + y_1^2 + y_2^2}{2\,y_1 y_2} \right). \]
and
e−nt
kn+2 (t, d) = ∂d kn (t, d). (2.3.2)
2π sinh(d)
The expressions take a different form according as n is even or odd (and are sim-
pler for odd n’s) while becoming increasingly more complicated as n increases.
A similar recurrence formula for the heat kernels on spheres will be given below
in (2.7.15) after the appropriate analysis of the corresponding Jacobi operators and
their expansions in Jacobi orthogonal polynomials.
As for the sphere, the hyperbolic metric is conformally equivalent to the Eu-
clidean metric. But now the invariant measure is infinite. The Riemann curvature
tensor of the metric is constant, as is the Ricci curvature, which is equal to −(n − 1).
(We have actually described in these three sections the only three metrics and spaces
with this property, the Euclidean space, the sphere and the hyperbolic space.) It may
be checked directly from the definition of $\Delta_{\mathbb{H}^n}$ that it satisfies a curvature-dimension
condition $CD(-(n - 1), n)$ in the sense of Definition 1.16.1, p. 72.
The Laplace operator $\Delta_{\mathbb{H}^n}$ on hyperbolic space is invariant under rotations
around the axis of the last coordinate vector $e_n$, as well as under translations parallel
to the hyperplane $\{x^n = 0\}$. According to Sect. 1.15.4, p. 60, the associated
Markov semigroup leaves invariant the set of functions depending only on the last
coordinate, and gives rise along this coordinate to a one-dimensional semigroup
on $(0, \infty)$ with generator $x^2 \partial_x^2 - (n - 2)\,x\,\partial_x$. After a change of variable setting
$x = e^y$, the latter operator takes the simpler form $\partial_y^2 - (n - 1)\,\partial_y$. In particular
$\Gamma(y) = 1$, the function $y$ (which may be seen as the distance from infinity) having
gradient 1. By (1.16.9), p. 73, this one-dimensional operator still satisfies the
$CD(-(n - 1), n)$ condition, and moreover there is equality in the differential inequality
(1.16.9), which characterizes this condition in dimension one. Therefore,
the geometric properties of the Laplace operator of hyperbolic space may be recovered
from one-dimensional projections. Moreover, from the latter description, the
function $h(x) = (x^n)^{n-1}$ satisfies $\Delta_{\mathbb{H}^n} h = 0$ and therefore defines a harmonic positive
function. The measure $h\,d\mu$ is invariant (although not reversible) for $\Delta_{\mathbb{H}^n}$, and
in particular there exists in this way at least one invariant measure different from the
reversible one (many such invariant measures exist, this is just one of them).
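The one-dimensional reduction above can be checked on monomials. The following Python sketch is only an illustration (the function names are hypothetical and not part of the text). It relies on the elementary fact that $x^2 \partial_x^2 - (n-2)\,x\,\partial_x$ maps $x^p$ to $p(p - n + 1)\,x^p$, while $\partial_y^2 - (n-1)\,\partial_y$ maps $e^{qy}$ to $q(q - n + 1)\,e^{qy}$, so that both forms of the operator have the same symbol and the only harmonic monomials are the constants and $x^{n-1}$.

```python
from fractions import Fraction


def radial_symbol(n, p):
    """Return c such that (x^2 d^2/dx^2 - (n - 2) x d/dx) x^p = c * x^p."""
    p = Fraction(p)
    return p * (p - 1) - (n - 2) * p


def log_symbol(n, q):
    """Return c such that (d^2/dy^2 - (n - 1) d/dy) e^{q y} = c * e^{q y}."""
    q = Fraction(q)
    return q * q - (n - 1) * q


def harmonic_exponents(n, candidates):
    """Exponents p among the candidates for which x^p is harmonic."""
    return [p for p in candidates if radial_symbol(n, p) == 0]


if __name__ == "__main__":
    for n in (2, 3, 5):
        # The exponent n - 1 corresponds to the harmonic function (x^n)^(n-1).
        print(n, harmonic_exponents(n, range(0, 8)))
```

Working with exact fractions avoids any rounding issue; the check is purely algebraic and says nothing about domains or boundary behavior.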
Like the sphere Sn , the hyperbolic space Hn admits another representation on the
open unit ball B of Rn . In order to describe this, we return to the stereographic
projection, but this time from Sn−1 to the hyperplane {x n = 0} identified with Rn−1 .
This transformation is an inversion. In the Euclidean space $\mathbb{R}^n$, the inversion with
center $x_0 \in \mathbb{R}^n$ and radius $r > 0$ is the map which associates to each $x \neq x_0$ its
inversion $x' \in \mathbb{R}^n$ defined by the condition that $x - x_0$ and $x' - x_0$ are proportional
and $|x - x_0|\,|x' - x_0| = r^2$. Analytically, it is given by
\[ \varphi_{x_0,r} : x \mapsto x_0 + r^2\,\frac{x - x_0}{|x - x_0|^2}. \]
The sphere with center $x_0$ and radius $r$ is clearly stable under this transformation
(which is equal to the identity on it). This sphere is called the sphere of the inversion
$\varphi_{x_0,r}$. The inversion is only defined for $x \neq x_0$ and it is an involution ($\varphi_{x_0,r}^2 = \mathrm{Id}$). Actually,
the inversion $\varphi_{x_0,r}$ sends every sphere not containing $x_0$ to a sphere, and every sphere
containing $x_0$ to a hyperplane not containing $x_0$. Similarly, it transforms every hyperplane
containing $x_0$ into a hyperplane containing $x_0$, and every hyperplane not
containing $x_0$ into a sphere. Moreover, $\varphi_{x_0,r}$ preserves all the spheres orthogonal to
the sphere of inversion (two spheres are orthogonal if at any intersection point $x$,
the radii joining $x$ to the centers of the spheres are orthogonal).
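The analytic formula for the inversion is easy to experiment with. The short Python sketch below is an illustration only (the function name `inversion` and the sample points are assumptions made for the example). It evaluates $x_0 + r^2 (x - x_0)/|x - x_0|^2$ and can be used to observe that the map is an involution, that it fixes the sphere of inversion, and that a sphere through the center is sent into a hyperplane.

```python
import math


def inversion(x, center, radius):
    """Inversion x -> x0 + r^2 (x - x0) / |x - x0|^2, defined for x != x0."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    squared_norm = sum(d * d for d in diff)
    if squared_norm == 0.0:
        raise ValueError("the inversion is not defined at its center")
    scale = radius ** 2 / squared_norm
    return [ci + scale * d for ci, d in zip(center, diff)]


if __name__ == "__main__":
    # Illustrative choice: center at the north pole of the unit sphere of R^3.
    north = [0.0, 0.0, 1.0]
    r = math.sqrt(2.0)
    point = [0.6, 0.0, 0.8]            # a point of the unit sphere
    image = inversion(point, north, r)
    print(image)                        # last coordinate is (numerically) zero
    print(inversion(image, north, r))   # applying the map twice returns the point
```

With center $(0, 0, 1)$ and radius $\sqrt{2}$, the point $(0.6, 0, 0.8)$ of the unit sphere is sent to $(3, 0, 0)$, which lies in the horizontal plane.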
Now, the stereographic projection with pole $N$ (north-pole) is actually the restriction
to the unit sphere $S^{n-1}$ of an inversion with center $N$ and radius $\sqrt{2}$ (this
inversion clearly preserves the intersection of the unit sphere with the horizontal
hyperplane). But it also sends the upper half-space $\{x_n > 0\}$ onto the unit ball of
$\mathbb{R}^n$. Via this transformation, the hyperbolic Laplacian $\Delta_{\mathbb{H}^n}$ of (2.3.1) becomes an
operator on the open unit ball $B$ with carré du champ operator
\[ \Gamma(f) = \frac{1}{4}\,\big(1 - |x|^2\big)^2\,|\nabla f|^2 \]
and reversible measure $c_n (1 - |x|^2)^{-n}\,dx$. Actually, the inversions are conformal
maps of the Euclidean space, and it is not so surprising that the image under an inversion
of a metric conformally equivalent to the Euclidean metric is again conformally
equivalent to the Euclidean metric. In this system of coordinates, the operator
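\[ \Delta_{\mathbb{H}^n} = \frac{1}{4}\,\big(1 - |x|^2\big)^2\,\Delta_{\mathbb{R}^n} + \frac{n - 2}{2}\,\big(1 - |x|^2\big) \sum_{i=1}^n x^i\,\partial_i. \tag{2.3.3} \]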
opportunity to introduce the Neumann and Dirichlet boundary conditions, which are
discussed more generally in Sect. 2.6, to address the issue of self-adjointness and to
provide intuitive probabilistic descriptions in terms of reflected and killed Brownian
motion.
Consider the operator $Lf = f''$ on $(0, \infty)$ acting on the set $C_c^\infty(0, \infty)$ of smooth
and compactly supported functions $f$ on $(0, \infty)$. As this operator is symmetric with
respect to the Lebesgue measure $dx$, we may look for a symmetric semigroup for
which $C_c^\infty(0, \infty)$ is included in the domain of its generator and for which this generator
coincides with $L$ on this class of functions. Note that while we considered
$L$ on $(0, \infty)$, we will see that the associated Markov process may in fact live in
$\mathbb{R}_+ = [0, \infty)$ and that we will have to consider the natural state space on which the
semigroup lives to be $[0, \infty)$ instead of $(0, \infty)$. Whether we regard the state space to
be $(0, \infty)$ or $[0, \infty)$ is more a matter of taste, and in any case the boundary behavior
will have to be examined. This observation is relevant for most examples studied
here and in the next sections.
A semigroup with generator $L$ in this setting is not unique. Indeed, a bounded
measurable function $f$ defined on $(0, \infty)$ may be extended in at least two different
ways to the whole real line $\mathbb{R}$. It may actually be extended to a symmetric
function $\hat f$ (that is $\hat f(-x) = \hat f(x)$) or to an anti-symmetric function $\check f$ (that is
$\check f(-x) = -\check f(x)$); its value at 0 does not matter. Then, if $(P_t)_{t \ge 0}$ is the heat semigroup
on $\mathbb{R}$, $P_t \hat f$ is symmetric, while $P_t \check f$ is anti-symmetric. Setting $P_t^N f = P_t \hat f$
and $P_t^D f = P_t \check f$, taking the restriction to $\mathbb{R}_+ = [0, \infty)$ yields two different semigroups
$(P_t^N)_{t \ge 0}$ and $(P_t^D)_{t \ge 0}$ on $\mathbb{R}_+$. It is easily seen that both semigroups are
symmetric with respect to the Lebesgue measure, and that they are positivity preserving.
$(P_t^N)_{t \ge 0}$ and $(P_t^D)_{t \ge 0}$ admit simple kernel densities via the standard heat
kernel (2.1.1) (on the line) given, for $t > 0$ and $(x, y) \in (0, \infty) \times (0, \infty)$, by
1
ptN (x, y) = pt (x, y) + pt (x, −y)
2
and
1
ptD (x, y) = pt (x, y) − pt (x, −y) .
2
The semigroup $(P_t^N)_{t \ge 0}$ is Markov while $(P_t^D)_{t \ge 0}$ is only sub-Markov ($P_t^D(1) \le 1$).
For any function $f$ on $(0, \infty)$, both functions $u(t, x) = P_t^{N,D} f(x)$ solve the heat
equation $\partial_t u = Lu$ on $\mathbb{R}_+ \times (0, \infty)$, and thus both semigroups have $L$ as infinitesimal
generator on $C_c^\infty(0, \infty)$. In conclusion, $C_c^\infty(0, \infty)$ is not a core of the domain of $L$, or
equivalently $L$ is not essentially self-adjoint on this set (cf. Sect. A.5, p. 481, Sect. 1.6,
p. 24, and Sect. 1.12, p. 49).
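The two extensions can be explored numerically. The Python sketch below is an illustration under stated assumptions: the heat kernel on the line is taken to be $(4\pi t)^{-1/2} e^{-(x-y)^2/4t}$ (the normalization for the generator $f''$), the integral of the even or odd extension over $\mathbb{R}$ is folded back onto $(0, \infty)$, and the half-line is truncated at a hypothetical `cutoff`. It evaluates $P_t^N f$ and $P_t^D f$ by a plain trapezoid rule.

```python
import math


def heat_kernel(t, x, y):
    """Heat kernel of d^2/dx^2 on the line w.r.t. Lebesgue measure (assumed form)."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def trapezoid(g, a, b, steps):
    """Composite trapezoid rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h


def neumann(f, t, x, cutoff=30.0, steps=20000):
    """P_t applied to the even extension of f, evaluated at x >= 0."""
    return trapezoid(
        lambda y: (heat_kernel(t, x, y) + heat_kernel(t, x, -y)) * f(y),
        0.0, cutoff, steps)


def dirichlet(f, t, x, cutoff=30.0, steps=20000):
    """P_t applied to the odd extension of f, evaluated at x >= 0."""
    return trapezoid(
        lambda y: (heat_kernel(t, x, y) - heat_kernel(t, x, -y)) * f(y),
        0.0, cutoff, steps)


if __name__ == "__main__":
    one = lambda y: 1.0
    print(neumann(one, 0.5, 0.7))     # mass is conserved
    print(dirichlet(one, 0.5, 0.7))   # mass is lost at the boundary
```

Under the assumed normalization, the odd extension of the constant function 1 gives $\mathrm{erf}\big(x / (2\sqrt{t})\big)$, which is strictly less than 1, in line with the sub-Markov property.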
For any bounded measurable function $f$ on $(0, \infty)$, and for any $t > 0$, $P_t^N f$ is
a smooth function, symmetric, so that its derivative vanishes at $x = 0$. The semigroup
$(P_t^N)_{t \ge 0}$ is called the semigroup with infinitesimal generator $L$ and Neumann
boundary conditions (derivatives vanish at the boundary). On the other hand, $P_t^D f$
is also smooth, anti-symmetric and therefore vanishes at $x = 0$. The semigroup
$(P_t^D)_{t \ge 0}$ is then called the semigroup with infinitesimal generator $L$ and Dirichlet
boundary conditions (functions vanish at the boundary).
The Neumann and Dirichlet semigroups have clear probabilistic descriptions in
terms of the associated Brownian motion. Letting $\{B_t^x;\ t \ge 0,\ x \in \mathbb{R}\}$ be Brownian
motion with speed 2 on the line, it is easily shown that for every suitable function
$f : (0, \infty) \to \mathbb{R}$, and every $t \ge 0$, $x \in \mathbb{R}_+$,
\[ P_t^N f(x) = \mathbb{E}\big( f(|B_t^x|) \big). \]
preserves the class of radial functions. As described in Sect. 1.15.4, p. 60, if $r$ denotes
the function $|x|$, then $\Delta(r^2) = 2n$ while $\Gamma(r^2) = 4r^2$. Hence, for any smooth
function $f$ compactly supported in $(0, \infty)$
\[ \Delta f(r) = f''(r) + \frac{n - 1}{r}\,f'(r). \]
(Compare with (2.2.4).) The image operator on $(0, \infty)$ is thus $Lf = f'' + \frac{n-1}{x} f'$.
The family of operators
\[ L_{B_\kappa} f = f'' + \frac{\kappa}{x}\,f' \]
on $(0, \infty)$ is known as the family of Bessel operators with parameter $\kappa$. Here we
restrict our attention to the case $\kappa \ge 0$. The associated reversible measure is $x^\kappa\,dx$
on $(0, \infty)$. By means of the harmonic function $h_\kappa(x) = x^{1-\kappa}$ ($= \log x$ when $\kappa = 1$),
it is easily seen that the Markov process with generator $L_{B_\kappa}$ starting from $x > 0$
never reaches the boundary 0 as soon as $\kappa \ge 1$. A more precise explanation of why
this is the case will be given in Sect. 2.6 below.
Now, one may naively think that if a process does not reach the boundary, then
its generator is essentially self-adjoint, since different symmetric extensions correspond
to various boundary behaviors of the process. The above example of $L_{B_2}$
shows that this is not the case. Indeed, performing the h-transform procedure with
$h(x) = x$ for the semigroups $(P_t^N)_{t \ge 0}$ and $(P_t^D)_{t \ge 0}$ of respectively reflected and
killed Brownian motions on $(0, \infty)$ yields two semigroups with generator $L_{B_2}$ on
$(0, \infty)$, while the process driven by the associated stochastic differential equation
does not reach the boundary. (Observe that the h-transform of the killed Brownian
motion produces a Markov semigroup, while the transform of the reflected
Brownian motion produces a semigroup $(P_t)_{t \ge 0}$ which only satisfies $P_t(1) \ge 1$.)
Moreover, the h-transform of $L_{B_\kappa}$ yields the Bessel operator $L_{B_{2-\kappa}}$. Hence, following
Proposition 2.4.1 below, $L_{B_\kappa}$ is essentially self-adjoint as soon as $\kappa > 2$. Since
h-transforms preserve such properties, $L_{B_\kappa}$ is also self-adjoint in the range $\kappa < 0$.
In this setting, the associated process has a drift which pushes it to the boundary so
strongly that, once it has reached 0, there is no way to return to the open set $(0, \infty)$.
This kind of duality between $L_{B_\kappa}$ and $L_{B_{2-\kappa}}$ will also be observed for Laguerre
semigroups (Sect. 2.7.3) with similar conclusions.
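The duality between the parameters $\kappa$ and $2 - \kappa$ can be verified by hand on monomials, and the following Python sketch (an illustration with hypothetical function names) does exactly that with exact rational arithmetic. It uses that $f'' + \frac{\kappa}{x} f'$ maps $x^p$ to $p(p - 1 + \kappa)\,x^{p-2}$, and computes the h-transform $f \mapsto h^{-1} L(hf)$ with the harmonic function $h(x) = x^{1-\kappa}$. For $\kappa = 1$ the code uses the constant function rather than $\log x$, in which case the transform is trivial.

```python
from fractions import Fraction


def bessel_on_monomial(kappa, p):
    """Return c such that (f'' + (kappa / x) f') = c * x^(p - 2) for f = x^p."""
    kappa, p = Fraction(kappa), Fraction(p)
    return p * (p - 1) + kappa * p


def h_transform_on_monomial(kappa, p):
    """Return c such that h^{-1} L(h x^p) = c * x^(p - 2), with h = x^(1 - kappa)."""
    kappa, p = Fraction(kappa), Fraction(p)
    # h * x^p = x^(p + 1 - kappa); dividing by h afterwards restores the exponent p - 2.
    return bessel_on_monomial(kappa, p + 1 - kappa)


if __name__ == "__main__":
    for kappa in (0, Fraction(1, 2), 3):
        # x^(1 - kappa) is harmonic for the Bessel operator with parameter kappa.
        print(kappa, bessel_on_monomial(kappa, 1 - Fraction(kappa)))
        # The transformed operator acts on x^p like the Bessel operator with 2 - kappa.
        print([h_transform_on_monomial(kappa, p) == bessel_on_monomial(2 - Fraction(kappa), p)
               for p in range(-2, 4)])
```

In particular, $\kappa = 0$ and $h(x) = x$ give the parameter 2, which is the case of reflected and killed Brownian motion discussed above.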
The value which appears as a limiting case for essential self-adjointness for $\kappa \ge 0$
is thus $\kappa = 2$. The following, somewhat more general, Proposition 2.4.1 provides a
useful criterion to ensure that a given generator $L$ on the half-line is essentially self-adjoint,
that is, the space $C_c^\infty(0, \infty)$ of $C^\infty$ compactly supported functions in $(0, \infty)$
is dense in the domain $\mathcal{D}(L)$ of $L$. As announced, it shows in particular that, as soon
as $\kappa > 2$, the Bessel generator $L_{B_\kappa}$ defined on $C_c^\infty(0, \infty)$ is essentially self-adjoint.
Proof We briefly outline the arguments. The fact that $L$ is symmetric with respect
to $d\mu = e^A dx$ is immediate (see Sect. 2.6). Remove then the gradient in $L$ according
to the technique described in Sect. 1.15.7, p. 65. The problem is reduced to
proving that if $K = \frac{a'}{2} + \frac{a^2}{4}$ then the operator $L_1 f = f'' - Kf$ is essentially
self-adjoint on $(0, \infty)$ with respect to the Lebesgue measure. To this end, according
to Proposition A.5.3, p. 482, it is enough to show that for some $\lambda \in \mathbb{R}$, the equation
$f'' = (\lambda + K)f$ (understood in the distributional sense) has no solution in
$L^2(dx) = L^2((0, \infty), dx)$ except 0. By the hypothesis, $\lambda$ may be chosen so that
$\lambda + K > K_0(x)$ for some $\varepsilon > 0$ where $K_0(x) = \frac{\varepsilon}{x^2}$. Any solution $f$ on $(0, \infty)$ of
$f'' = (K + \lambda)f$ is as smooth as $K$. Assuming that $f$ is not identically 0, up to a
sign change, let $f(x_0) > 0$ for some $x_0 > 0$. Now, if $f'(x_0) > 0$, it is easy to see
from $f'' \ge K_0(x)f$ that $f$ is increasing on $(x_0, \infty)$, and is therefore convex on this
interval. Being convex it grows at least linearly at infinity and therefore is not in
$L^2(dx)$. On the other hand, if $f'(x_0) < 0$, from standard arguments, $f$ is bounded
from below by the solution $f_0$ of $f_0'' = K_0 f_0$ which has the same value and same
derivative at $x_0$. To verify that $f$ is not in $L^2(dx)$, it is therefore enough to show that
$f_0^2$ is not integrable near 0. But the solutions of $f_0'' = K_0 f_0$ are linear combinations
of $x^{\alpha_1}$ and $x^{\alpha_2}$ where $\alpha_1$ and $\alpha_2$ are solutions of $\alpha(\alpha - 1) = \varepsilon$. Since $f_0'(x_0) < 0$,
$f_0$ behaves like $\beta x^{-\alpha_1}$ near the origin, with $\beta > 0$ and $-2\alpha_1 = 1 + \sqrt{1 + 4\varepsilon}$. The
conclusion then easily follows and the proposition is established.
Then, this semigroup has the Lebesgue measure on $S^1$ as invariant measure. All
these considerations show that the above semigroup $(P_t^p)_{t \ge 0}$ on $S^1$ is just the heat
2.6 Sturm-Liouville Semigroups on an Interval 97
\[ p_t^p(x, y) = \sum_{k \in \mathbb{Z}} p_t(x, y + k), \quad t > 0,\ (x, y) \in [0, 1] \times [0, 1]. \]
Unfortunately, this representation as a sum of a series does not admit a closed form.
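Although the series has no closed form, it converges extremely fast and is easy to evaluate. The Python sketch below is an illustration under assumptions: the kernel on the line is taken to be $(4\pi t)^{-1/2} e^{-(x-y)^2/4t}$, and the series is truncated to a hypothetical number of terms `terms`. It computes the periodized kernel and its total mass over one period.

```python
import math


def heat_kernel(t, x, y):
    """Heat kernel of d^2/dx^2 on the line w.r.t. Lebesgue measure (assumed form)."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def periodic_heat_kernel(t, x, y, terms=20):
    """Truncated sum over k in Z of p_t(x, y + k)."""
    return sum(heat_kernel(t, x, y + k) for k in range(-terms, terms + 1))


def total_mass(t, x, steps=400):
    """Trapezoid approximation of the integral of the periodized kernel over [0, 1]."""
    h = 1.0 / steps
    values = [periodic_heat_kernel(t, x, k * h) for k in range(steps + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


if __name__ == "__main__":
    t, x = 0.05, 0.3
    print(total_mass(t, x))                        # close to 1
    print(periodic_heat_kernel(t, x, 0.9),
          periodic_heat_kernel(t, 0.9, x))         # symmetric in (x, y)
```

The truncation is harmless for moderate values of $t$ since the neglected terms are of order $e^{-k^2/4t}$.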
The preceding analysis may be pushed a bit further on the basis of the properties
of the standard heat semigroup $(P_t)_{t \ge 0}$ on the real line to further illustrate the
Neumann and Dirichlet boundary conditions emphasized in the previous section.
Indeed, any bounded measurable function $f$ on $[0, 1]$ may be extended by symmetry
to $[-1, +1]$, and then by periodicity with period 2. The resulting function $\hat f$
is invariant under the symmetries about 0 and 1. For every $t \ge 0$, $P_t \hat f$ is a function
on $\mathbb{R}$ which shares the same symmetries. Since a smooth function on $\mathbb{R}$ which
is invariant under those symmetries has zero derivatives at 0 and 1, the resulting
semigroup on $[0, 1]$ corresponds to the Neumann boundary conditions. In the
same way, any function $f$ defined on $[0, 1]$ may be extended by anti-symmetry at
0 and then by periodicity. The resulting function $\check f$ is anti-symmetric about $x = 1$,
and so is $P_t \check f$. The associated semigroup on $[0, 1]$ then corresponds to the Dirichlet
boundary conditions.
To fix the ideas, we begin with the case of a bounded interval of the real line R, say
[−1, +1]. Choose a generator L with smooth coefficients of the form
\[ Lf = a\,f'' + b\,f' \tag{2.6.1} \]
where a and b are smooth on [−1, +1]. A natural class of functions f on which L
acts is the family of smooth functions on [−1, +1] (equivalently the restrictions to
[−1, +1] of smooth functions defined on a neighborhood of [−1, +1]). Although it
is of interest and often necessary to consider coefficients which are unbounded at the
boundary of the interval (as for example in Sect. 2.4 or Sect. 2.7.4), we assume here
for simplicity of exposition that a and b are bounded on the whole closed interval
$[-1, +1]$. An immediate computation shows that the carré du champ operator $\Gamma$
associated with the generator $L$ of (2.6.1) is given, on smooth functions $f$ and $g$, by
\[ \Gamma(f, g) = a\,f' g'. \]
d2 d2 α d
2
= α2 2 − · .
dy dx α dx
In the new variable $y$, the operator $L$ thus takes the form $\partial_y^2 + c(y)\,\partial_y$ and the carré
du champ operator is the usual $\Gamma(f) = (f')^2$. Observe also that if $\alpha$ vanishes at one
of the boundaries of $[-1, +1]$, the new interval on which $y$ varies may be unbounded
(although we assumed here for simplicity that this is not the case). For further purposes,
note that this change of variables is actually only possible in dimension one
(since there is only one possible metric up to a multiplicative function). Another
feature of dimension one is that the invariant measure is always reversible as well
as explicit (since every vector field is a gradient).
After this change of variable, we may again rescale the state space to the interval
$[-1, +1]$ and switch back to the $x$ variable, so that the operator now takes the form
\[ Lf = f'' + c(x)\,f' \tag{2.6.2} \]
where $c$ is a smooth function on $[-1, +1]$. In this form, the invariant measure is
given, up to a multiplicative constant, by its density
\[ w(x) = \exp\left( \int_{x_0}^x c(y)\,dy \right), \]
the initial point $x_0$ being a normalization constant for the underlying Markov process.
Denote by $\mu$ the probability measure with (normalized) density $w$ with respect
to the Lebesgue measure (if the coefficients are all bounded, then the invariant measure
is indeed finite).
(where we recall that $d\mu = w\,dx$). But for bounded (smooth) functions $f$ and $g$, the
integration by parts formula indicates that
\[ \int_{-1}^{+1} f\,Lg\,d\mu = -\int_{-1}^{+1} \Gamma(f, g)\,d\mu + \big[ f g' w \big]_{-1}^{+1} \]
and
\[ 0 = \int_{-1}^{+1} (f\,Lg - g\,Lf)\,d\mu = \big[ (f g' - f' g)\,w \big]_{-1}^{+1}. \]
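The integration by parts formula can be tested numerically on a concrete example. The Python sketch below is only an illustration: the drift $c(x) = -x$, the corresponding unnormalized density $w(x) = e^{-x^2/2}$ and the test functions are assumptions chosen for the example, and the integrals are computed with Simpson's rule. For functions whose derivatives vanish at $\pm 1$ the boundary terms disappear and the operator is symmetric, whereas a pair of functions without this property shows the boundary contribution.

```python
import math


def c(x):
    """Hypothetical drift coefficient of L f = f'' + c f'."""
    return -x


def w(x):
    """Unnormalized invariant density exp(int_0^x c(y) dy) for the drift above."""
    return math.exp(-0.5 * x * x)


def simpson(g, a, b, steps=2000):
    """Composite Simpson rule (steps must be even)."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0


def pairing(f, g):
    """Integral over [-1, 1] of f * Lg * w; f and g are (value, f', f'') triples."""
    return simpson(lambda x: f[0](x) * (g[2](x) + c(x) * g[1](x)) * w(x), -1.0, 1.0)


def energy(f, g):
    """Integral over [-1, 1] of Gamma(f, g) * w = f' g' w."""
    return simpson(lambda x: f[1](x) * g[1](x) * w(x), -1.0, 1.0)


PI = math.pi
# Two functions with vanishing derivative at -1 and +1.
F = (lambda x: math.cos(PI * x), lambda x: -PI * math.sin(PI * x),
     lambda x: -PI ** 2 * math.cos(PI * x))
G = (lambda x: math.cos(2 * PI * x), lambda x: -2 * PI * math.sin(2 * PI * x),
     lambda x: -4 * PI ** 2 * math.cos(2 * PI * x))
# A pair for which the boundary term does not vanish.
ONE = (lambda x: 1.0, lambda x: 0.0, lambda x: 0.0)
SQUARE = (lambda x: x * x, lambda x: 2.0 * x, lambda x: 2.0)

if __name__ == "__main__":
    print(pairing(F, G), pairing(G, F), -energy(F, G))   # three equal values
    print(pairing(ONE, SQUARE) - pairing(SQUARE, ONE))   # boundary term 4 * w(1)
```

Since the identities are linear in the measure, normalizing $w$ into a probability density would not change the comparison.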
The reasoning which identified this semigroup with the semigroup with Dirichlet
boundary conditions on the half-line (Sect. 2.4.1) may be repeated here with no
change.
The Neumann boundary conditions correspond to the case when the semigroup
(PtN )t≥0 is stable on functions with derivative equal to 0 on the boundary (in higher
dimensions, with normal derivative equal to 0 on the boundary), with associated re-
flected Markov processes. In this case, it is not enough to solve an ordinary stochas-
tic differential equation in order to describe the semigroup and it is necessary to add
to the equation a drift term called local time, which is a measure on R+ supported
by the set of times where the process is at the boundary, and which is singular with
respect to the Lebesgue measure. We will not go into these considerations here.
It is not easy to connect Neumann or Dirichlet conditions and processes reflected
or killed at the boundary. It is not so difficult to see that for the process killed at the
boundary, starting from a continuous function f vanishing at the boundary of O,
and provided the coefficients of the stochastic differential equation are smooth and
bounded, then PtD f also vanishes at the boundary of O. However, for the semigroup
associated with the reflected process, it is in general much more difficult to make
sure that whenever f has a derivative vanishing at the boundary of O, then the same
holds for PtN f . We do not investigate this question here, which would lead to a
further study of local times.
In the case of Sturm-Liouville operators on an interval of the real line, we sys-
tematically consider below and throughout this work the Neumann boundary condi-
tions, and thus the Neumann semigroup which we denote by (Pt )t≥0 for simplicity.
These conditions will indeed make it possible for the constant function 1 to belong
to the domain and to satisfy Pt (1) = 1.
The semigroup (Pt )t≥0 associated with a Sturm-Liouville operator L on [−1, +1]
as in (2.6.2) with smooth and bounded coefficients is always Hilbert-Schmidt in
L2 (μ) for the invariant probability measure μ (which is also reversible in this one-
dimensional case). This will be studied later, since the operator actually satisfies a
Sobolev-type inequality, and in this case the density kernels pt (x, y) will actually be
bounded for t > 0. Its spectrum is therefore discrete. To understand the structure of
the eigenvectors, one has to solve the second order differential equation $Lf = -\lambda f$.
But, for any $\lambda$, and for every choice of $f(-1)$ and $f'(-1)$, there is a unique solution
in the open interval $(-1, +1)$. For Neumann conditions, one has to impose
$f'(-1) = 0$. Then, there exists an infinite sequence of values of $\lambda$ for which the
solutions of this equation satisfy $f'(1) = 0$ (the value of $f(-1)$ does not matter and
can be fixed). This infinite sequence consists of the eigenvalues of $-L$ with Neumann
boundary conditions, and the associated eigenfunctions, suitably normalized,
form an orthonormal basis of L2 (μ). In particular, the constant function (equal to
1) is an eigenfunction with eigenvalue 0.
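The description of the Neumann eigenvalues just given is in effect a shooting procedure, which can be sketched in a few lines of Python. The code below is an illustration under assumptions, not the method of the text: it integrates $f'' + c(x) f' = -\lambda f$ from $x = -1$ with $f(-1) = 1$, $f'(-1) = 0$ by a classical Runge-Kutta scheme, and locates by bisection a value of $\lambda$ in a user-supplied bracket for which $f'(1) = 0$. The function names and the bracket are hypothetical.

```python
import math


def shoot(lam, c, steps=2000):
    """Integrate f'' + c(x) f' = -lam f on [-1, 1], f(-1) = 1, f'(-1) = 0; return f'(1)."""
    h = 2.0 / steps
    x, f, df = -1.0, 1.0, 0.0

    def rhs(x, f, df):
        # First order system: (f)' = df, (df)' = -lam f - c(x) df.
        return df, -lam * f - c(x) * df

    for _ in range(steps):
        k1 = rhs(x, f, df)
        k2 = rhs(x + h / 2, f + h / 2 * k1[0], df + h / 2 * k1[1])
        k3 = rhs(x + h / 2, f + h / 2 * k2[0], df + h / 2 * k2[1])
        k4 = rhs(x + h, f + h * k3[0], df + h * k3[1])
        f += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        df += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return df


def neumann_eigenvalue(c, lo, hi, tol=1e-10):
    """Bisection on lam in [lo, hi] for the condition f'(1) = 0."""
    value_lo = shoot(lo, c)
    if value_lo * shoot(hi, c) > 0:
        raise ValueError("f'(1) does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        value_mid = shoot(mid, c)
        if value_lo * value_mid <= 0:
            hi = mid
        else:
            lo, value_lo = mid, value_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Without drift the solutions are cos(sqrt(lam) (x + 1)), so the first
    # non-zero Neumann eigenvalue on [-1, 1] is pi^2 / 4.
    print(neumann_eigenvalue(lambda x: 0.0, 1.0, 4.0), math.pi ** 2 / 4)
```

For a general drift the bracket has to be chosen by inspecting the sign changes of $\lambda \mapsto f'(1)$; the drift-free case is used here only because its eigenvalues are known in closed form.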
Denote by (λk )k∈N the sequence of eigenvalues of −L as just described, and
by (fk )k∈N the sequence of the corresponding eigenvectors (normalized in L2 (μ)).
Then the kernel densities of (Pt )t≥0 with respect to the invariant measure μ are
However, very few cases yield explicit expressions for λk and fk , and hence of this
heat kernel. The later three sections devoted to the Hermite, Laguerre and Jacobi
operators are such instances.
For many interesting examples, it often happens that the drift coefficient is singular
at the boundary, with a boundary repulsion force on the process. There is a nice way
to determine whether the associated Markov process X = {Xtx ; t ≥ 0, x ∈ [−1, +1]}
reaches the boundary or not in terms of harmonic functions, that is, solutions of
$Lh = 0$. If $L$ is of the form (2.6.2), so that $Lh = h'' + c(x)\,h'$, then the harmonic
functions $h$ are solutions of $\frac{h''}{h'} = -c(x)$, namely
\[ h(x) = \int_a^x \exp\left( -\int_b^y c(r)\,dr \right) dy. \]
Since the density $w$ of the invariant measure $\mu$ satisfies $\frac{w'}{w} = c$, a harmonic function
$h$ thus satisfies $h' = \frac{1}{w}$. In terms of $X$, assuming that $(X_t)_{t \ge 0}$ starts from an interior
point (say 0), then $(h(X_t))_{t \ge 0}$ is a local martingale. If $T_u$ and $T_v$ are the hitting
times of $u, v \in (-1, +1)$ respectively, and if $T_{u,v} = T_u \wedge T_v$ is the hitting time of the
boundary of $[u, v] \subset (-1, 1)$, we get, since $h$ is harmonic,
\[ \mathbb{E}\big( h(X_{T_{u,v}}) \big) = h(0) = h(u)\,\mathbb{P}(T_u < T_v) + h(v)\,\mathbb{P}(T_v < T_u). \tag{2.6.5} \]
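Identity (2.6.5) determines the exit probabilities once the harmonic function is known. The Python sketch below is an illustration under assumptions: it takes $a = b = 0$ in the formula for $h$ (so that $h(0) = 0$), requires the user to supply a primitive $C(y) = \int_0^y c(r)\,dr$ of the drift, assumes $u < 0 < v$, and uses that the two probabilities in (2.6.5) add up to 1. The function names are hypothetical.

```python
import math


def simpson(g, a, b, steps=2000):
    """Composite Simpson rule (steps must be even); also valid when b < a."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0


def scale_function(drift_primitive, x):
    """h(x) = int_0^x exp(-C(y)) dy, where C(y) = int_0^y c(r) dr is supplied."""
    return simpson(lambda y: math.exp(-drift_primitive(y)), 0.0, x)


def prob_hit_u_first(drift_primitive, u, v):
    """P(T_u < T_v) for the process started at 0, u < 0 < v, solved from (2.6.5)."""
    h_u = scale_function(drift_primitive, u)
    h_v = scale_function(drift_primitive, v)
    return h_v / (h_v - h_u)   # uses h(0) = 0


if __name__ == "__main__":
    no_drift = lambda y: 0.0
    print(prob_hit_u_first(no_drift, -0.5, 0.25))      # h is linear: 0.25 / 0.75
    inward = lambda y: -0.5 * y * y                    # primitive of c(x) = -x
    print(prob_hit_u_first(inward, -0.5, 0.5))         # symmetric case
```

Without drift the harmonic function is linear and the classical gambler's ruin probabilities are recovered.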
as x → −1. The process never hits the boundary, if α ≥ 1, and it reaches it when
α < 1.
As already pointed out earlier for Brownian motion on the half-line, not reaching
the boundary is not the same as being essentially self-adjoint. Indeed, when the drift
is singular, the definition of the generator L on the open interval may be enough
to fully describe the Markov semigroup and process. In analytical terms, the C ∞
compactly supported functions on the open interval (−1, +1) form a core of the
domain, and thus the operator is essentially self-adjoint. (In particular, whenever L
is self-adjoint, there is no need to care about the boundary conditions when solving
the heat equation.) However, this is usually not that easy to prove. The following
proposition provides a sufficient criterion. It is similar to Proposition 2.4.1 above.
\[ c'(x) + \frac{c^2(x)}{2} \ge C_1 \min\big( (1 + x)^{-2}, (1 - x)^{-2} \big) - C_2. \]
Observe that when $c(x) \sim \frac{\alpha_-}{1 + x}$ at $x = -1$ and $c(x) \sim -\frac{\alpha_+}{1 - x}$ at $x = +1$, then
the above condition requires that $\min(\alpha_-, \alpha_+) > 2$, whereas the condition for the
process not to reach the boundary requires $\min(\alpha_-, \alpha_+) \ge 1$. Such arguments are of
course very specific to dimension one, since explicit computations can be performed
in this case. In higher dimensions, the idea is often to mimic the one-dimensional
case by using comparison arguments.
On the real line $\mathbb{R}$, a probability measure $\mu$ with exponential moments (that is, such
that $\int_{\mathbb{R}} e^{\alpha|x|}\,d\mu(x) < \infty$ for some $\alpha > 0$) has a more or less canonical Hilbertian basis
for the space $L^2(\mu)$ consisting of orthogonal polynomials, since polynomials are
then dense in $L^2(\mu)$. Such a basis is obtained by orthogonalization of the sequence
of polynomials $x^k$, $k \in \mathbb{N}$, in the scalar product of $L^2(\mu)$. Up to normalization and
change of sign, these orthogonal polynomials are unique, and in the following we always
choose them so that they are normalized in $L^2(\mu)$ with strictly positive leading
coefficient.
There are only a few cases of orthogonal polynomials which are also eigenvectors
of diffusion operators. In dimension one, there are only, up to affine transformations,
the Hermite, Laguerre and Jacobi polynomials. This is why we give special atten-
tion to these polynomials in the following sub-sections. Moreover, these examples
belong to the few cases for which there is a complete description of the sequences
of eigenvalues and eigenvectors for a Sturm-Liouville operator. There is a huge lit-
erature on orthogonal polynomials, especially on these three main families. The
following only outlines what will be useful and relevant for the rest of the book.
As will become clear in the following, the Hermite, Laguerre and Jacobi opera-
tors analyzed here may be considered on various classes of functions, C ∞ functions,
C ∞ rapidly decreasing functions, or even polynomials. Again, Chap. 3 will present
the suitable framework and conditions covering such examples.
2.7 Diffusion Semigroups Associated with Orthogonal Polynomials 103
(2π)n/2
that is μ is the standard Gaussian measure on Rn (with mean zero and covariance
the identity matrix). We may also characterize the Ornstein-Uhlenbeck operator L
as the unique operator such that
where (Bt )t≥0 is a standard Brownian motion (in Rn ) starting at the origin. This
process is indeed the solution of the stochastic differential equation
\[ dX_t = \sqrt{2}\,dB_t - X_t\,dt, \quad X_0 = x, \]
It is not difficult to show that L cannot satisfy any CD(ρ, m) condition where
ρ ∈ R and m < ∞, expressing that in a sense the operator is intrinsically infinite-
dimensional (independently of the dimension of the state space). The curvature con-
dition CD(1, ∞) may also be described by the commutation property on the semi-
group
\[ \nabla P_t f = e^{-t}\,P_t(\nabla f) \tag{2.7.5} \]
for every t ≥ 0 and f sufficiently regular. This property may be verified directly
from the representation (2.7.3). Alternatively, starting with the commutation relation
[L, ∇] = L∇ − ∇L = ∇,
which is linked to the curvature condition CD(1, ∞) in Theorem 3.3.18, p. 163 (see
also Sect. 4.7, p. 206, and Sect. 5.5, p. 257, below).
As announced, of particular interest are the eigenvectors of the Ornstein-
Uhlenbeck generator which are described in terms of the Hermite polynomials
orthogonal with respect to the Gaussian measure. Here we begin with dimension
one, so that $\mu$ denotes below the standard Gaussian measure on $\mathbb{R}$ (satisfying in particular
$\int_{\mathbb{R}} e^{\alpha|x|}\,d\mu(x) < \infty$ for some, or even every, $\alpha > 0$). For every integer $k$, the
Ornstein-Uhlenbeck operator L sends the space Pk of polynomials of degree less
than or equal to k into itself. If the finite-dimensional vector space Pk is equipped
with the scalar product of L2 (μ), L defines a symmetric operator in a Euclidean
space. It may be diagonalized with respect to an orthonormal basis. By recurrence
over k, one then constructs a sequence of orthonormal polynomials in L2 (μ) which
are eigenvectors of L. Comparing the step from k to k + 1, the polynomial added
at k + 1 is none other than the polynomial of degree k + 1 orthogonal to Pk . The
sequence of orthonormal polynomials (Hk )k∈N obtained in this way is therefore
the orthonormal polynomials for the Gaussian measure μ known as the Hermite
polynomials. These polynomials are unique provided their signs are prescribed, and
by convention will be taken so that the coefficient of the term of highest degree is
strictly positive.
For each k ∈ N, Hk is of degree k and is an eigenvector of −L, that is
−L Hk = λk Hk
for some positive number λk . Inspecting the term of highest degree shows that
λk = k (H0 is a constant). Alternatively, the sequence (Hk )k∈N of Hermite poly-
nomials may be defined via the generating series
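\[ e^{sx - s^2/2} = \sum_{k \in \mathbb{N}} \frac{s^k}{\sqrt{k!}}\,H_k(x), \quad x \in \mathbb{R},\ s \in \mathbb{R}, \]
and it is then an exercise to check using this representation that $-LH_k = kH_k$ for
every $k$. There are many other useful representations of the Hermite polynomials
$H_k$, $k \in \mathbb{N}$, for instance
\[ H_k(x) = \frac{1}{\sqrt{k!}}\,\mathbb{E}\big( (x + iG)^k \big) = \frac{1}{\sqrt{k!}} \int_{\mathbb{R}} (x + iy)^k\,d\mu(y), \quad x \in \mathbb{R} \]
(observe that $\int_{\mathbb{R}} e^{s(x + iy)}\,d\mu(y) = e^{sx - s^2/2}$, $s \in \mathbb{R}$). For example $H_0(x) = 1$,
$H_1(x) = x$, $H_2(x) = \frac{1}{\sqrt{2}}(x^2 - 1)$, and so forth. One may also deduce in this way the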
transition densities (2.7.4) from the general expansion (2.6.4).
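The construction of the Hermite polynomials by orthogonalization can be reproduced directly. The Python sketch below is an illustration (the function names are hypothetical): polynomials are stored as coefficient lists, the $L^2(\mu)$ scalar product is computed from the Gaussian moments $\mathbb{E}(G^{2m}) = (2m - 1)!!$, the monomials are orthonormalized by the Gram-Schmidt procedure, and the one-dimensional Ornstein-Uhlenbeck operator is taken to be $Lf = f'' - x f'$.

```python
import math


def gaussian_moment(m):
    """E[G^m] for a standard normal G: 0 if m is odd, (m - 1)!! if m is even."""
    if m % 2:
        return 0.0
    result = 1.0
    for j in range(1, m, 2):
        result *= j
    return result


def inner(p, q):
    """L^2(mu) scalar product of two polynomials given by coefficient lists."""
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))


def hermite(kmax):
    """Orthonormal polynomials H_0, ..., H_kmax by Gram-Schmidt on 1, x, x^2, ..."""
    basis = []
    for k in range(kmax + 1):
        p = [0.0] * k + [1.0]                      # the monomial x^k
        for h in basis:
            coeff = inner(p, h)
            p = [a - coeff * (h[i] if i < len(h) else 0.0) for i, a in enumerate(p)]
        norm = math.sqrt(inner(p, p))
        basis.append([a / norm for a in p])        # leading coefficient stays positive
    return basis


def ou_generator(p):
    """Coefficients of L p = p'' - x p' for a polynomial p."""
    out = [0.0] * len(p)
    for j, a in enumerate(p):
        out[j] -= j * a
        if j >= 2:
            out[j - 2] += j * (j - 1) * a
    return out


if __name__ == "__main__":
    H = hermite(4)
    print(H[2])                 # coefficients of (x^2 - 1) / sqrt(2)
    print(ou_generator(H[3]))   # equals -3 times the coefficients of H_3
    print(H[3])
```

Floating point Gram-Schmidt is adequate for small degrees; for high degrees a three-term recurrence would be numerically preferable.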
so that the functions in the $L^2(\mu)$-domain of $L$ are the functions $f$ with decomposition
(2.7.6) such that the series describing $Lf$ is in $L^2(\mu)$, that is, such
that $\sum_{k \in \mathbb{N}} k^2 a_k^2 < \infty$. This condition therefore requires rapidly decreasing coefficients
$a_k$. Translating this condition into the regularity properties of a given function
$f \in L^2(\mu)$ requires more care. The latter is actually equivalent to saying that
$f \in C^1$ and that its second derivative in the distributional sense is a function such
that $f'' - xf'$ is in $L^2(\mu)$.
For a function f decomposed in L2 (μ) as (2.7.6), the Dirichlet form of Defini-
tion 1.7.1, p. 30, is given by
\[ \mathcal{E}(f) = \sum_{k \in \mathbb{N}} k\,a_k^2 \]
(use that $\mathcal{E}(f) = \int f(-Lf)\,d\mu$). Hence functions with $\mathcal{E}(f) < \infty$ are a priori less
regular than functions in the domain (if one agrees that the regularity of a function
is described in terms of the decay of its L2 (μ)-coefficients).
Finally, for any function f ∈ L2 (μ), Pt f is, for every t > 0, in the domain D(L)
and
\[ \|LP_t f\|_2^2 \le \frac{C}{t^2}\,\|f\|_2^2. \]
To prove this assertion, note that the function $re^{-r}$ is bounded on $r \in \mathbb{R}_+$, say by
$C$, so that for every $k \in \mathbb{N}$ and $t > 0$, $k^2 e^{-2kt} \le \frac{C}{t^2}$. Then compute the norm of
LPt f as above in its spectral decomposition. This property is actually not particu-
lar to the Ornstein-Uhlenbeck operator, and the same argument may be applied to
any symmetric semigroup provided the spectral representation may be used. Hence,
a symmetric semigroup is always smoothing in the L2 -sense (which is however
weak). Below, stronger regularizing properties will be investigated, depending on
the ellipticity of the generator. This comment emphasizes more the difference with
Of course, such an identity also follows from a standard integration by parts on the
Gaussian density. While this standard argument may be iterated in various ways, the
Hermite polynomials as eigenvectors of $L$ provide an efficient systematic tool. For
example, using similarly that $L(|x|^2 - n) = -2(|x|^2 - n)$ yields that
\[ \int_{\mathbb{R}^n} |x|^2 f\,d\mu = n \int_{\mathbb{R}^n} f\,d\mu + \int_{\mathbb{R}^n} \Delta f\,d\mu \tag{2.7.8} \]
and so on.
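In dimension one, identity (2.7.8) reads $\int x^2 f\,d\mu = \int f\,d\mu + \int f''\,d\mu$ and can be checked exactly on polynomials. The Python sketch below is an illustration restricted to this one-dimensional polynomial case (the function names are hypothetical); it only uses the Gaussian moments $\mathbb{E}(G^{2m}) = (2m - 1)!!$ and integer arithmetic.

```python
def gaussian_moment(m):
    """E[G^m] for a standard normal G: 0 if m is odd, (m - 1)!! if m is even."""
    if m % 2:
        return 0
    result = 1
    for j in range(1, m, 2):
        result *= j
    return result


def integrate(p):
    """Integral of a polynomial (coefficient list) against the standard Gaussian measure."""
    return sum(a * gaussian_moment(i) for i, a in enumerate(p))


def times_x_squared(p):
    """Coefficients of x^2 * p(x)."""
    return [0, 0] + list(p)


def second_derivative(p):
    """Coefficients of p''."""
    return [j * (j - 1) * a for j, a in enumerate(p)][2:]


if __name__ == "__main__":
    f = [0, 0, 0, 0, 1]                      # f(x) = x^4
    left = integrate(times_x_squared(f))     # E[G^6]
    right = integrate(f) + integrate(second_derivative(f))   # E[G^4] + 12 E[G^2]
    print(left, right)
```

For $f(x) = x^4$ both sides equal 15, since $\mathbb{E}(G^6) = 15$, $\mathbb{E}(G^4) = 3$ and $\mathbb{E}(12\,G^2) = 12$.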
The Ornstein-Uhlenbeck operator is related to another famous and well-studied
operator, the harmonic oscillator in Rn , given on smooth functions f by
\[ Hf = \Delta f - |x|^2 f. \tag{2.7.9} \]
This is an operator with potential, symmetric with respect to the Lebesgue measure,
and corresponding to the simplest model of quantum mechanics. Denote by $(K_t)_{t \ge 0}$
its associated semigroup. Observing that $H(U_0) = -n\,U_0$ where $U_0 = e^{-|x|^2/2}$, the
ground state transformation described in Sect. 1.15.6, p. 63, may be performed. Thus
consider the semigroup
\[ R_t f = e^{nt}\,\frac{1}{U_0}\,K_t(U_0 f), \quad t \ge 0, \]
with generator given on smooth functions $f$ by
\[ Lf = nf + \frac{1}{U_0}\,H(U_0 f). \]
108 2 Model Examples
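Now
\[ H(U_0 f) = -n\,U_0 f + U_0\,\Delta f + 2\,\nabla U_0 \cdot \nabla f \]
so that
\[ Lf = \Delta f + 2\,\nabla \log U_0 \cdot \nabla f . \]
Since $\nabla \log U_0 = -x$, this is the operator $\Delta - 2x \cdot \nabla$, that is, the Ornstein-Uhlenbeck
operator $\Delta - x \cdot \nabla$ up to the scaling $x \mapsto \sqrt{2}\,x$ and a factor of 2. The transformation
$f \mapsto U_0 f$ carries the analysis of $H$ into the analysis of the Ornstein-Uhlenbeck
operator in terms of Hermite polynomials.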
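The ground state computation above can be checked numerically in dimension $n = 1$. The following sketch evaluates $nf + U_0^{-1} H(U_0 f)$ by central finite differences and compares it with $f'' + 2(\log U_0)' f' = f'' - 2x f'$. The test function, the step size and the sample points are arbitrary choices made for this illustration.

```python
import math

def U0(x):
    """Ground state U0(x) = exp(-x^2/2) in dimension 1."""
    return math.exp(-x * x / 2.0)

def f(x):
    return x ** 3 - 2.0 * x + 1.0

def df(x):
    return 3.0 * x ** 2 - 2.0

def d2f(x):
    return 6.0 * x

def second_derivative(g, x, h=1e-3):
    """Central finite-difference approximation of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

def H(g, x):
    """Harmonic oscillator (2.7.9) in dimension 1: Hg = g'' - x^2 g."""
    return second_derivative(g, x) - x * x * g(x)

def transformed_generator(x, n=1):
    """n f + H(U0 f) / U0 evaluated at x."""
    return n * f(x) + H(lambda y: U0(y) * f(y), x) / U0(x)

def expected(x):
    """f'' + 2 (log U0)' f' with (log U0)'(x) = -x."""
    return d2f(x) - 2.0 * x * df(x)

points = [-1.5, -0.5, 0.0, 0.7, 2.0]
max_gap = max(abs(transformed_generator(x) - expected(x)) for x in points)
print(max_gap)
```

The printed gap is only a finite-difference error, so it should be small but not exactly zero.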
Various functional inequalities in this book are concerned with the dependence of
constants with respect to dimension. Independence with respect to dimension then
usually allows for infinite-dimensional extensions. Actually, in most cases, infinite-
dimensional functional inequalities appear as limits of finite-dimensional inequal-
ities with constants controlled in terms of dimension. Since we will not discuss
many infinite-dimensional examples, it is of interest to describe at least one genuine
infinite-dimensional example, the infinite-dimensional Ornstein-Uhlenbeck opera-
tor. Nevertheless, the functional inequalities for Gaussian measures emphasized in
the future chapters will be restricted to their (dimension-free) finite-dimensional
statements. Infinite-dimensional versions may be reached either by a limiting argu-
ment on the finite-dimensional results, or by suitable adaptations of the formalism
and of the main principles of proofs as they can be drawn from the material dis-
cussed here.
As presented in the finite-dimensional case in the previous section, the Ornstein-Uhlenbeck
operator in $\mathbb{R}^n$ may be described as the diffusion operator $L$ such that
$Lx_i = -x_i$ and $\Gamma(x_i, x_j) = \delta_{ij}$ where $x_i$, $1 \le i \le n$, are linear forms corresponding
to an orthonormal basis. Moreover, these coordinates $x_i$ define independent standard
normal variables under the invariant measure. In a somewhat more abstract way,
$L$ may be described by its action on linear forms $\ell(x) = \ell \cdot x$, $x \in \mathbb{R}^n$, as
\[ L(\ell) = -\ell, \qquad \Gamma(\ell, \ell') = \ell \cdot \ell', \]
thus independent of the chosen orthonormal basis. This discussion provides an opportunity
to mention that if $\mu$ is a centered Gaussian measure on $\mathbb{R}^n$ with non-singular
covariance matrix $Q$, then it is the invariant measure of the Ornstein-Uhlenbeck generator
\[ \sum_{i,j=1}^n M^{ij}\,\partial^2_{ij} - \sum_{i=1}^n (MQ^{-1}x)^i\,\partial_i \]
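ducing kernel Hilbert space $\mathcal{H}$ associated with the Wiener measure $\mu$. Indeed, if
$h(t) = \sum_{i=1}^p a_i (t_i \wedge t)$, $t \in [0, 1]$, $h$ defines an absolutely continuous function in
$C_0([0, 1])$ such that
\[ |h|_{\mathcal{H}}^2 = \int_0^1 \dot{h}^2(t)\,dt = \mathbb{E}\Big(\Big(\sum_{i=1}^p a_i B_{t_i}\Big)^2\Big) = \mathbb{E}\Big(\Big(\int_0^1 \dot{h}(t)\,dB_t\Big)^2\Big). \]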
The Hilbert space $\mathcal{H}$ is identified with the set of absolutely continuous elements
$h$ of $C_0([0, 1])$ such that $\int_0^1 \dot{h}^2(t)\,dt < \infty$, traditionally called the Cameron-Martin
Hilbert space (of the Wiener measure $\mu$).
For any orthonormal sequence $(e_i)_{i \ge 1}$ in $L^2([0, 1], dt)$, the Wiener integrals (random
variables on the canonical space)
\[ X_i(w) = \int_0^1 e_i(s)\,dB_s(w), \quad i \ge 1, \]
are independent standard normal random variables (see Sect. B.1, p. 487). Note that
when the functions $e_i$ have one derivative in $L^2([0, 1], dt)$, Itô's formula indicates
that
\[ X_i(w) = e_i(1)\,w(1) - \int_0^1 e_i'(s)\,w(s)\,ds. \]
The sequence $(X_i)_{i \ge 1}$ defines a set of orthonormal coordinates on the canonical
space such that, if $v_i(t) = \int_0^t e_i(s)\,ds$, $t \in [0, 1]$, $i \ge 1$ (elements of $\mathcal{H}$), the
series $\sum_{i \ge 1} v_i X_i$ is almost everywhere convergent with distribution $\mu$ (although
$\sum_{i \ge 1} X_i^2 = \infty$ $\mu$-almost everywhere).
On $L^2(\mu)$ with state space $E = C_0([0, 1])$ and probability measure the Wiener
measure $\mu$, consider the algebra $\mathcal{A}_0$ of functions $f : C_0([0, 1]) \to \mathbb{R}$ which are polynomials
in a finite number of such $X_i$'s, that is $f(w) = P_k(X_1(w), \ldots, X_n(w))$. The
Ornstein-Uhlenbeck operator $L$ may then be defined as acting on such $f$'s by
\[ Lf = (L P_k)\big(X_1(w), \ldots, X_n(w)\big) \]
This representation may be verified directly for functions f ∈ A0 since it then boils
down to the corresponding representation in Rn . Observe that A0 is stable under
the action of (Pt )t≥0 , and therefore is a core for L. The representation is therefore
independent of the chosen base (ei )i≥1 of L2 ([0, 1], dt).
The CD(1, ∞) curvature property of the finite-dimensional Ornstein-Uhlenbeck
operator immediately extends to the infinite-dimensional setting. Furthermore, since
the operator is an infinite sum of one-dimensional Ornstein-Uhlenbeck operators,
its eigenvectors and eigenvalues are quite easy to describe. The eigenvalues of −L
are the positive integers. However, the corresponding eigenspaces are now infinite-
dimensional. For example, the eigenspace associated with the eigenvalue λ1 = 1 is
the space of linear functionals. The infinite-dimensional Ornstein-Uhlenbeck oper-
ator is thus an example of an operator with a purely point spectrum but which is not
discrete in the terminology of Sect. A.4, p. 478, in Appendix A.
\[ L_\alpha f = x f'' + (\alpha - x) f', \tag{2.7.10} \]
where α > 0. Of course, any linear change x → ax (a > 0) would modify the co-
efficients a little, the chosen normalization being justified below. Its invariant and
reversible measure is the gamma distribution on R+
self-adjoint on $(0, \infty)$ as soon as $\alpha > \frac{3}{2}$. Comparing with the case of the hyperbolic
Laplacian on Rn−1 × (0, ∞) (Sect. 2.3) shows that this issue really depends on the
completeness of the space for the natural distance induced by the operator L as
described in (3.3.9), p. 166. In the Laguerre case, the distance from the boundary
{x = 0} to any point in (0, ∞) is finite, while in the case of the hyperbolic Laplacian
it is infinite.
As in the Ornstein-Uhlenbeck case, the Laguerre operator Lα may be diagonal-
ized with respect to a basis of orthogonal polynomials, namely the Laguerre or-
thogonal polynomials. Indeed, as before, Lα preserves the finite-dimensional space
of polynomial functions with degree less than k, and is symmetric on it when en-
dowed with the Euclidean structure inherited from L2 (μα ). Therefore there exists
an orthonormal basis (Lk )k∈N of L2 (μα ) consisting of polynomials Lk of degree k
which are eigenvectors of Lα . After normalization, this sequence is unique and de-
fines the Laguerre orthonormal polynomials associated with μα . As usual L0 = 1,
and the eigenvalues are computed just by looking at the highest degree term so that
−Lα Lk = k Lk
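The eigenvalue computation can be made concrete on coefficient lists. In the monomial basis, the operator $x f'' + (\alpha - x) f'$ of (2.7.10) is triangular with diagonal entries $-k$, so an eigenpolynomial of degree $k$ is obtained by back-substitution. The sketch below builds monic (not normalized) eigenpolynomials; the function names are hypothetical and chosen for this illustration only.

```python
from fractions import Fraction

def laguerre_operator(poly, alpha):
    """Apply f -> x f'' + (alpha - x) f' to a coefficient list (index = degree)."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 1:
            out[j - 1] += c * j * (j - 1 + alpha)   # from x f'' and alpha f'
        out[j] -= c * j                             # from -x f'
    return out

def laguerre_eigenpolynomial(k, alpha):
    """Monic polynomial p of degree k with x p'' + (alpha - x) p' = -k p,
    computed by back-substitution on the coefficients."""
    p = [Fraction(0)] * (k + 1)
    p[k] = Fraction(1)
    for j in range(k - 1, -1, -1):
        p[j] = -Fraction(j + 1) * (j + alpha) * p[j + 1] / (k - j)
    return p

alpha = Fraction(5, 2)
for k in range(5):
    p = laguerre_eigenpolynomial(k, alpha)
    image = laguerre_operator(p, alpha)
    print(k, image == [-k * c for c in p])
```

These polynomials agree with the Laguerre orthonormal polynomials only up to normalization, which the sketch does not compute.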
where $\sigma_{n-1}$ is the uniform measure on the sphere $\mathbb{S}^{n-1} \subset \mathbb{R}^n$. This function may be
written as $R_k\big(\frac{|x|^2}{2}\big)$ where $R_k$ is a further polynomial of degree $k$. Then $R_k$ satisfies
which may be compared to the corresponding formula (2.7.5) for the Ornstein-Uhlenbeck
semigroup. Similarly, the h-transform
\[ x^{\alpha-1} L_\alpha\, x^{1-\alpha} f = (L_{2-\alpha} + \alpha - 1) f \]
translates into
\[ x^{\alpha-1} P_t^{\alpha}\, x^{1-\alpha} f = e^{t(\alpha-1)} P_t^{2-\alpha} f \]
which exhibits a duality between $L_\alpha$ and $L_{2-\alpha}$ ($0 < \alpha < 2$) very similar to the duality
observed for Bessel semigroups or generators (see Sect. 2.4.2).
Products of Lα on Rn+ may be considered similarly.
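The h-transform identity at the level of generators, $x^{\alpha-1} L_\alpha (x^{1-\alpha} f) = (L_{2-\alpha} + \alpha - 1) f$, can be tested numerically. The sketch below applies $L_\alpha$ by central finite differences to $x^{1-\alpha} f$ and compares the result with the right-hand side computed from exact derivatives of a test polynomial. The test function, step size and sample points are assumptions of this illustration.

```python
def f(x):
    return x ** 3 - 2.0 * x + 1.0

def df(x):
    return 3.0 * x ** 2 - 2.0

def d2f(x):
    return 6.0 * x

def laguerre(g, x, alpha, h=1e-3):
    """Finite-difference approximation of x g'' + (alpha - x) g' at x > 0."""
    d1 = (g(x + h) - g(x - h)) / (2.0 * h)
    d2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
    return x * d2 + (alpha - x) * d1

def lhs(x, alpha):
    """x^(alpha-1) L_alpha (x^(1-alpha) f) evaluated at x."""
    return x ** (alpha - 1.0) * laguerre(lambda y: y ** (1.0 - alpha) * f(y), x, alpha)

def rhs(x, alpha):
    """(L_{2-alpha} + alpha - 1) f evaluated at x, using exact derivatives of f."""
    return x * d2f(x) + (2.0 - alpha - x) * df(x) + (alpha - 1.0) * f(x)

def max_gap(alpha, points=(0.5, 1.0, 2.0, 3.0)):
    return max(abs(lhs(x, alpha) - rhs(x, alpha)) for x in points)

print(max_gap(0.5), max_gap(1.5))
```

Both values of $\alpha$ lie in the range $0 < \alpha < 2$ where the duality is stated; the sample points stay away from the boundary $x = 0$ where $x^{1-\alpha}$ may be singular.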
The last family of Sturm-Liouville operators presented here is the one associated
with the so-called Jacobi polynomials. As mentioned earlier, it plays an important
role in the analysis of the Laplacian on spheres. It is, after the Ornstein-Uhlenbeck
and Laguerre operators, the third family of one-dimensional diffusion operators
which may be diagonalized with respect to a family of orthogonal polynomials.
It is a remarkable fact that these three operators are the only operators which have
this property.
The Jacobi operator on E = [−1, +1] acts on smooth functions f on [−1, +1]
(or restrictions to [−1, +1] of smooth functions on a neighborhood of [−1, +1]) by
\[ L_{\alpha,\beta} f = (1 - x^2) f'' - \big[(\alpha + \beta)x + \alpha - \beta\big] f', \tag{2.7.11} \]
where $\alpha, \beta > 0$. Its invariant and reversible measure is the beta distribution on
$[-1, +1]$
\[ d\mu_{\alpha,\beta}(x) = C_{\alpha,\beta}\,(1 - x)^{\alpha-1} (1 + x)^{\beta-1}\,dx \]
where $C_{\alpha,\beta} > 0$ is a suitable normalization constant. The carré du champ operator is
given on smooth functions $f$ by $\Gamma(f) = (1 - x^2) f'^2$. When $\alpha = \beta$, we sometimes
speak of symmetric Jacobi operators (although all Jacobi operators are symmetric
with respect to the invariant measure).
While the Jacobi operators (2.7.11) may be considered for any values of α, β > 0,
the boundary issues are more easily described after a suitable change of variable.
Setting $x = \cos\theta$, we obtain in the new variable $\theta$ an operator on $[0, \pi]$ which reads
\[ \partial_\theta^2 + \frac{(\alpha + \beta - 1)\cos\theta + \alpha - \beta}{\sin\theta}\,\partial_\theta. \tag{2.7.12} \]
According to Proposition 2.6.1, this operator is self-adjoint on the interval $(0, \pi)$
(and hence the Jacobi operator $L_{\alpha,\beta}$ is self-adjoint on the interval $(-1, +1)$) as soon
as $\min(\alpha, \beta) > \frac{3}{2}$ while the associated Markov process does not reach the boundary
as soon as $\min(\alpha, \beta) \ge 1$.
As in the case of the Ornstein-Uhlenbeck operator, Lα,β sends the family of poly-
nomials of degree less than or equal to k into itself, and thus admits a basis of
eigenvectors consisting of polynomials. These polynomials are the orthogonal poly-
nomials associated with the measure μα,β , called the Jacobi polynomials (Jk )k∈N
(with implicit parameters α, β). Jacobi polynomials may be described by generating
series (see (2.7.14) below in the symmetric case).
The Ornstein-Uhlenbeck generator can be seen as a limit of Jacobi operators.
Indeed, for $\alpha = \beta = \frac{n}{2}$, and after changing $x$ into $\frac{x}{\sqrt{n}}$, $\frac{1}{n} L_{\alpha,\beta}$ converges as $n \to \infty$
to the (one-dimensional) Ornstein-Uhlenbeck operator $L_{OU}$. (In fact, the convergence
is much more comprehensive: the measures, the eigenvectors and the eigenvalues
converge, and the Markov semigroups themselves converge, together with
the finite-dimensional laws of the associated Markov processes.) In the same way,
if we translate $[-1, +1]$ to $[0, 2]$, and then change $x$ into $\frac{x}{n}$ and $\alpha$ into $n$ (with $\beta$
fixed), then $\frac{1}{n} L_{n,\beta}$ converges to the Laguerre operator $L_{2\beta}$ with the same strength
of convergence as described in the remark above.
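The first limit can be observed directly on polynomials. With $\alpha = \beta = \frac{n}{2}$ and $y = \sqrt{n}\,x$, the rescaled operator $\frac{1}{n} L_{\alpha,\beta}$ reads $(1 - \frac{y^2}{n})\partial_y^2 - y\,\partial_y$ in the variable $y$, which differs from $f'' - y f'$ by a term of order $\frac{1}{n}$. The sketch below checks this on coefficient lists; it treats only the Ornstein-Uhlenbeck limit, and the helper names are hypothetical.

```python
import math

def jacobi_operator(poly, alpha, beta):
    """Apply f -> (1 - x^2) f'' - [(alpha + beta) x + alpha - beta] f' to a coefficient list."""
    out = [0.0] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        if j >= 1:
            out[j - 1] -= c * j * (alpha - beta)
        out[j] -= c * j * (j - 1 + alpha + beta)
    return out

def ou_operator(poly):
    """Apply f -> f'' - y f' to a coefficient list."""
    out = [0.0] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        out[j] -= c * j
    return out

def rescaled_jacobi(poly, n):
    """(1/n) L_{n/2,n/2} applied to x -> f(sqrt(n) x), rewritten in the variable y = sqrt(n) x."""
    s = math.sqrt(n)
    g = [c * s ** j for j, c in enumerate(poly)]           # g(x) = f(sqrt(n) x)
    Lg = jacobi_operator(g, n / 2.0, n / 2.0)
    return [d / (n * s ** j) for j, d in enumerate(Lg)]    # back to the variable y

def gap(poly, n):
    """Largest coefficient difference between the rescaled Jacobi and Ornstein-Uhlenbeck images."""
    return max(abs(a - b) for a, b in zip(rescaled_jacobi(poly, n), ou_operator(poly)))

f = [0.0, 1.0, -3.0, 0.0, 2.0]      # f(y) = y - 3 y^2 + 2 y^4
for n in (10, 100, 1000):
    print(n, gap(f, n))
```

For this test polynomial the gap is governed by the coefficients of $y^2 f''$ divided by $n$, so it decays like $\frac{1}{n}$.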
The eigenvalues of the Jacobi operators are easy to describe by considering the
action of Lα,β on the highest degree terms of the orthogonal polynomials. The eigen-
values of −Lα,β are given by the sequence
λk = k(k + α + β − 1), k ∈ N.
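The eigenvalue formula can be verified on coefficient lists. Since $L_{\alpha,\beta}$ of (2.7.11) maps $x^k$ to $-k(k + \alpha + \beta - 1)x^k$ plus lower order terms, a monic eigenpolynomial of degree $k$ follows by back-substitution. The sketch below does this in exact arithmetic; the polynomials it produces are not normalized in $L^2(\mu_{\alpha,\beta})$, and the function names are assumptions of this illustration.

```python
from fractions import Fraction

def jacobi_operator(poly, alpha, beta):
    """Apply f -> (1 - x^2) f'' - [(alpha + beta) x + alpha - beta] f' to a coefficient list."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        if j >= 1:
            out[j - 1] -= c * j * (alpha - beta)
        out[j] -= c * j * (j - 1 + alpha + beta)
    return out

def eigenvalue(k, alpha, beta):
    """lambda_k = k (k + alpha + beta - 1)."""
    return k * (k + alpha + beta - 1)

def jacobi_eigenpolynomial(k, alpha, beta):
    """Monic polynomial p of degree k with L p = -lambda_k p, by back-substitution."""
    p = [Fraction(0)] * (k + 3)          # two padding zeros simplify the recursion
    p[k] = Fraction(1)
    lam_k = eigenvalue(k, alpha, beta)
    for j in range(k - 1, -1, -1):
        numerator = (j + 1) * (alpha - beta) * p[j + 1] - (j + 2) * (j + 1) * p[j + 2]
        p[j] = numerator / (lam_k - eigenvalue(j, alpha, beta))
    return p[: k + 1]

alpha, beta = Fraction(3, 2), Fraction(5, 2)
for k in range(5):
    p = jacobi_eigenpolynomial(k, alpha, beta)
    lam = eigenvalue(k, alpha, beta)
    print(k, lam, jacobi_operator(p, alpha, beta) == [-lam * c for c in p])
```

The division in the recursion is well defined because $\lambda_k - \lambda_j > 0$ for $j < k$ whenever $\alpha + \beta > 0$.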
There is an analogy with the eigenvalues of the spherical Laplacian $\Delta_{\mathbb{S}^n}$, which
should come as no surprise. When $n$ is an integer ($n \ge 1$), and $\alpha = \beta = \frac{n}{2}$, the Jacobi
operator $L_{\alpha,\beta}$ may indeed be seen as an image of $\Delta_{\mathbb{S}^n}$ via its action on functions
depending only on one coordinate in $\mathbb{R}^{n+1}$, that is the functions which are invariant
under rotations leaving one point fixed (for example $(1, 0, \ldots, 0)$). These functions
are called zonal. This may easily be seen via (2.2.2) acting on functions depending
only on $x^1$. Hence, if $J_k$ is a Jacobi polynomial of degree $k$, that is, an eigenvector
for $L_{\alpha,\alpha}$ for $\alpha = \frac{n}{2}$, then the restriction to $\mathbb{S}^n$ of the function $J_k(x^1)$ defined on $\mathbb{R}^{n+1}$
is an eigenvector for $\Delta_{\mathbb{S}^n}$.
This interpretation of the Jacobi operator $L_{\alpha,\alpha}$ as the projection of $\Delta_{\mathbb{S}^n}$ on
zonal functions leads to the fact that it satisfies the curvature-dimension condition
$CD(2\alpha - 1, 2\alpha)$ for $\alpha = \frac{n}{2}$ (which may of course be checked directly from the
representation (2.7.12) and the general formula (1.16.9), p. 73). Observe furthermore
that (1.16.9) (which characterizes the curvature-dimension condition) is in fact an
equality, as it was for the radial part of the hyperbolic Laplace operator. Therefore
the symmetric Jacobi operator is a very good one-dimensional candidate for testing
properties related to curvature-dimension.
For non-symmetric Jacobi operators ($\alpha \ne \beta$), there is also a representation
from spheres at least when $\alpha$ and $\beta$ are half-integers $\alpha = \frac{n}{2}$, $\beta = \frac{p}{2}$. The
Jacobi operator is then interpreted as a quotient of the sphere of dimension
$n + p - 1$ (up to a factor of 4). The eigenvalues of degree $k$ correspond to
eigenfunctions of degree $2k$ on the sphere, as may be seen from the relation
$4k(k + \alpha + \beta - 1) = 2k(2k + n + p - 2)$. Of course, this observation in itself is
not enough to claim that the generator comes from the sphere models, but it is a
strong indication. To really understand the construction, let the spherical Laplacian
on $\mathbb{S}^{n+p-1} \subset \mathbb{R}^{n+p}$ act on functions of $y = x_1^2 + \cdots + x_p^2$, producing in this
way an operator defined on functions of $y \in [0, 1]$. After the change of variable
$y \mapsto 2y - 1$ towards the interval $[-1, +1]$, one obtains (up to a factor of 4) the Jacobi
operator $L_{\alpha,\beta}$ with parameters $\alpha = \frac{n}{2}$, $\beta = \frac{p}{2}$. It is then seen to satisfy the
curvature-dimension condition $CD\big(\frac{n+p-2}{4}, n + p - 1\big)$, this property being easily
verified even when $n$ and $p$ are no longer integers. However, in this dissymmetric
situation, there are many other curvature-dimension conditions which are not immediately
comparable with one another. For example, for $n = p$, the first interpretation
leads to an optimal condition $CD(n - 1, n)$ for $L_{n/2,n/2}$ whereas the second one
leads to $CD\big(\frac{n-1}{2}, 2n - 1\big)$.
From this geometric interpretation, Ornstein-Uhlenbeck or Laguerre operators
may also be viewed as limits of spherical operators when the dimension $n$ goes
to infinity on spheres scaled to have radius $\sqrt{n}$. As a significant illustration, and a
source of inspiration for many results on the (infinite-dimensional) curvature condition
$CD(\rho, \infty)$, (normalized) uniform measures on spheres of dimension $n$ and
radius $\sqrt{n}$ converge as $n \to \infty$, when projected on $k$ coordinates, to the standard
Gaussian measure on $\mathbb{R}^k$. As mentioned above, a similar convergence holds at
the level of the operators and semigroups. This observation confirms the infinite-dimensional
character of the Ornstein-Uhlenbeck operator (as well as the Laguerre
operator) from the curvature-dimensional condition viewpoint.
The semigroup with generator $L_{\alpha,\beta}$ is called the Jacobi semigroup, denoted by
$(P_t^{\alpha,\beta})_{t \ge 0}$. Like the Ornstein-Uhlenbeck and Laguerre semigroups, the Jacobi semigroup
may actually be defined explicitly from its action on the basis of Jacobi orthogonal
polynomials (and coincides similarly with the semigroup of the unique essentially
self-adjoint extension in the range $\min(\alpha, \beta) > \frac{3}{2}$). The Jacobi semigroup
$(P_t^{\alpha,\beta})_{t \ge 0}$ admits kernel densities $p_t^{\alpha,\beta}(x, y)$, $t > 0$, $(x, y) \in [-1, +1] \times [-1, +1]$,
with respect to the invariant measure.
The family of Jacobi semigroups also satisfies some remarkable relations. Observe
for example that
\[ \partial_x L_{\alpha,\beta} = \big(L_{\alpha+1,\beta+1} - (\alpha + \beta)\,\mathrm{Id}\big)\,\partial_x . \]
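The commutation relation above is an identity between differential operators, so it can be checked exactly on polynomials. The sketch below applies both sides to a coefficient list with rational parameters; the test polynomial and the parameter values are arbitrary choices for this illustration.

```python
from fractions import Fraction

def derivative(poly):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(poly)][1:]

def jacobi_operator(poly, alpha, beta):
    """Apply f -> (1 - x^2) f'' - [(alpha + beta) x + alpha - beta] f' to a coefficient list."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        if j >= 1:
            out[j - 1] -= c * j * (alpha - beta)
        out[j] -= c * j * (j - 1 + alpha + beta)
    return out

def left_side(poly, alpha, beta):
    """d/dx applied to L_{alpha,beta} f."""
    return derivative(jacobi_operator(poly, alpha, beta))

def right_side(poly, alpha, beta):
    """(L_{alpha+1,beta+1} - (alpha + beta) Id) applied to f'."""
    dp = derivative(poly)
    shifted = jacobi_operator(dp, alpha + 1, beta + 1)
    return [a - (alpha + beta) * b for a, b in zip(shifted, dp)]

alpha, beta = Fraction(3, 2), Fraction(1, 2)
p = [Fraction(c) for c in (3, -1, 4, 1, -5, 9)]
print(left_side(p, alpha, beta) == right_side(p, alpha, beta))
```

Because the relation is linear in $f$, checking it on a polynomial with generic coefficients of each degree exercises every monomial up to that degree.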
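This identity translates on the associated semigroups $(P_t^{\alpha,\beta})_{t \ge 0}$ as
\[ \partial_x P_t^{\alpha,\beta} = e^{-(\alpha+\beta)t}\, P_t^{\alpha+1,\beta+1}\, \partial_x \]
and therefore also leads to a simple relation between $\partial_x J_k^{\alpha,\beta}$ and $J_{k-1}^{\alpha+1,\beta+1}$. In
turn, these relations connect the kernels $p_t^{\alpha,\beta}(x, y)$ and $p_t^{\alpha+1,\beta+1}(x, y)$. Indeed, for
smooth functions $f$ compactly supported in $(-1, +1)$, writing
\[ \partial_x \int_{[-1,+1]} p_t^{\alpha,\beta}(x, y)\, f(y)\, d\mu_{\alpha,\beta}(y)
 = e^{-(\alpha+\beta)t} \int_{[-1,+1]} p_t^{\alpha+1,\beta+1}(x, y)\, \partial_y f(y)\, d\mu_{\alpha+1,\beta+1}(y) \]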
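where
$C_{\alpha,\beta} > 0$ is the normalization constant of the beta distribution $\mu_{\alpha,\beta}$. Now,
since $\int_{[-1,+1]} L_{\alpha,\beta}(x)\,d\mu_{\alpha,\beta} = \int_{[-1,+1]} L_{\alpha,\beta}(x^2)\,d\mu_{\alpha,\beta} = 0$,
\[ \frac{C_{\alpha,\beta}}{C_{\alpha+1,\beta+1}} = \int_{[-1,+1]} (1 - x^2)\,d\mu_{\alpha,\beta}
 = 1 - \frac{1}{\alpha+\beta+1}\Big(1 + \frac{(\alpha-\beta)^2}{\alpha+\beta}\Big), \]
providing an explicit formula linking $p_t^{\alpha+1,\beta+1}$ to $p_t^{\alpha,\beta}$. This formula takes in
particular a simpler form for $y = \pm 1$ and $\alpha = \beta$, namely
\[ p_t^{\alpha+1,\alpha+1}(x, \pm 1) = \pm\,\frac{e^{2\alpha t}}{2(\alpha+1)(2\alpha+1)}\,\partial_x p_t^{\alpha,\alpha}(x, \pm 1). \tag{2.7.13} \]
This formula will be translated below in (2.7.15) into an identity for heat kernels on
the sphere.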
The representation of Jacobi operators from the spherical Laplacian when the pa-
rameters are half-integers leads to nice formulas for the family of Jacobi orthogonal
polynomials via the Poisson formula (2.2.6). Let us for example illustrate the picture
when $\alpha = \beta = \frac{n}{2}$, $n \ge 1$. Any function $f(x_1)$ defined on $[-1, +1]$ may be lifted to
a zonal function $\hat{f}$ on the sphere by $\hat{f}(x) = f(x \cdot e_1)$ where $e_1 = (1, 0, \ldots, 0) \in \mathbb{S}^n$
(the variable $x_1$ being regarded as the first coordinate of a point $x \in \mathbb{S}^n$).
Then expand $f \in L^2(\mu_{\alpha,\alpha})$ with respect to the basis of Jacobi polynomials $(J_k)_{k \in \mathbb{N}}$,
say $f(x_1) = \sum_{k \in \mathbb{N}} a_k J_k(x_1)$. Since $\hat{J}_k$ is the restriction to $\mathbb{S}^n$ of a homogeneous
harmonic polynomial of degree $k$, $F_k(X) = r^k \hat{J}_k(x)$ is harmonic in $\mathbb{R}^{n+1}$, where
$X = rx$, $x \in \mathbb{S}^n$ (polar coordinates). Then $Q_r f(x_1) = \sum_{k \in \mathbb{N}} a_k r^k J_k(x_1)$, $0 \le r \le 1$,
is such that $\widehat{Q_r f}$ is the harmonic extension to the unit ball of $\hat{f}$ (read in polar
coordinates). This construction is represented by the kernel
\[ Q_r f(x_1) = \int_{[-1,+1]} q_r(x_1, y_1)\, f(y_1)\, d\mu_{\alpha,\alpha}(y_1) \]
where $q_r(x_1, y_1) = \sum_{k \in \mathbb{N}} r^k J_k(x_1) J_k(y_1)$. On the other hand, by the Poisson
formula (2.2.6),
\[ \widehat{Q_r f}(x) = \int_{\mathbb{S}^n} \frac{1 - r^2}{|rx - y|^{n+1}}\, \hat{f}(y)\, d\sigma_n(y). \]
Choosing $x = e_1$ in the previous formula leads by identification to
\[ \sum_{k \in \mathbb{N}} r^k J_k(1) J_k(y_1) = \frac{1 - r^2}{(1 + r^2 - 2 r y_1)^{(n+1)/2}} \tag{2.7.14} \]
which is known as the generating function for the symmetric Jacobi polynomials.
The general formula for the Poisson kernel is however a bit more complicated.
It requires the representation of a point on $\mathbb{S}^n$ as $(x_1, \sqrt{1 - x_1^2}\,\hat{x})$, where $\hat{x} \in \mathbb{S}^{n-1}$,
which then leads to
\[ \sum_{k \in \mathbb{N}} r^k J_k(x_1) J_k(y_1)
 = \int_{[-1,+1]} \frac{1 - r^2}{\big[1 + r^2 - 2 r x_1 y_1 + 2 r s \sqrt{(1 - x_1^2)(1 - y_1^2)}\,\big]^{\frac{n+1}{2}}}\, d\mu_{\alpha-1,\alpha-1}(s). \]
Extending this formula to the general case where $n$ is no longer an integer is then
easy and amounts to observing that both sides are solutions of the differential equation
$\big(\partial_r^2 + \frac{n}{r}\,\partial_r + \frac{1}{r^2}\,L_{\alpha,\alpha}\big)F = 0$, the remaining argument being similar.
The case of non-symmetric Jacobi operators may essentially be treated similarly
(at the expense of more complicated expressions), starting from the half-integer case
for the geometric picture and then extending to arbitrary parameters.
Conversely, the interpretation of the Jacobi operator as the spherical Laplacian
acting on functions which depend only on the first coordinate (that is, functions
depending only on the distance from (1, 0, . . . , 0) on the sphere) leads to useful
information about the sphere $\mathbb{S}^n$ itself. First, the image measure of the uniform measure
$\sigma_n$ on $\mathbb{S}^n$ via the map $x = (x_1, \ldots, x_{n+1}) \mapsto x_1$ is the measure $\mu_{n/2,n/2}$, and its
image measure via $(x_1, \ldots, x_{n+1}) \mapsto 2\sum_{i=1}^p x_i^2 - 1$ is $\mu_{(n+1-p)/2,\,p/2}$. Moreover,
the spherical heat kernels $p_t(x, y)$, which, as we saw in Sect. 2.2, depend only on
the distance $d(x, y)$ between $x, y \in \mathbb{S}^n$, may be expressed in terms of the kernel
densities associated with the Jacobi operators. More precisely, if
\[ p_t(x, y) = s_n\big(t, d(x, y)\big) = \hat{s}_n(t, x \cdot y), \quad t > 0, \ (x, y) \in \mathbb{S}^n \times \mathbb{S}^n, \]
then $\hat{s}_n(t, u) = p_t^{n/2,n/2}(1, u)$ where $p_t^{n/2,n/2}(x, y)$ is the kernel associated with the
Jacobi operator $L_{n/2,n/2}$. Now, the recurrence formula (2.7.13) leads to a recurrence
formula for $\hat{s}_n$, namely
\[ \hat{s}_{n+2}(t, u) = \frac{e^{nt}}{(n + 1)(n + 2)}\,\partial_u \hat{s}_n(t, u), \tag{2.7.15} \]
very similar to the corresponding formula for the hyperbolic space given in (2.3.2).
The aim of this chapter is to provide a general framework for the investigation of
symmetric Markov diffusion semigroups and operators. It is based on the early ob-
servations of Chap. 1 and on the investigation of the model examples in Chap. 2.
This chapter develops all the fundamental properties which will justify the com-
putations in the following chapters in the most convenient framework. It is not in-
tended to be read linearly, and could be skipped on first reading. The interested
reader should come back to this material when specific technical details have to be
understood (such as, for example, the use of curvature conditions towards gradient
bounds, which lies at the heart of much of the subsequent analysis). A summary of
the framework is presented at the end of the chapter.
The various questions addressed here are actually somewhat delicate. While the
chapter aims at describing a general abstract setup, it clearly cannot encompass all
the instances of interest. Moreover, each specific example often requires an inde-
pendent analysis with suitable assumptions and hypotheses. Therefore, we do not
try to be maximally general at each level, instead emphasizing the tools and meth-
ods which may then be suitably adapted to the cases of interest. These issues in
particular concern the classes of functions on which the various operators and semi-
groups are analyzed. Again, we present the main features of the classes of functions
required to develop the investigation rather than carefully describing each example.
In particular, the exposition freely emphasizes the formal and inspiring arguments
of the $\Gamma$-calculus and heat flow monotonicity on which most of the following functional
inequalities will be based. Nevertheless, on the basis of the analysis of diffusion
operators on complete manifolds in Sect. 3.2, Sect. 3.3 develops a complete
and self-contained framework fully justifying the various results and conclusions
In Chap. 1, we investigated the analysis of Markov semigroups starting from the
knowledge of the semigroup itself (sometimes in a concrete instance), and intro-
duced the infinitesimal generator together with its domain as a tool to describe it.
In this chapter, we adopt the reverse point of view, analyzing the semigroup based
on the properties of an operator (a second order differential operator), or rather its
carré du champ operator $\Gamma$, leading to the basic Markov Triple structure $(E, \mu, \Gamma)$
which will form the natural environment for future investigations.
The operator and carré du champ operator are initially defined on some set A0 of
functions (typically the set of smooth compactly supported functions). The analysis
of differential diffusion operators on smooth manifolds leads us to consider a larger
class A (playing the role of smooth functions without support conditions), and to
precisely analyze the relationships between the two classes A0 and A, in particu-
lar with regard to integration by parts formulas. Throughout, when working with
classes of (real-valued) functions (rather than points), it should be kept in mind that
equalities and inequalities between functions should always be understood to hold
almost everywhere (with respect to the underlying invariant measure).
The first section describes the main setting consisting of a triple $(E, \mu, \Gamma)$ with
state space $E$, $\sigma$-finite measure $\mu$, carré du champ operator $\Gamma$ acting on an algebra
A0 of functions, and the diffusion property in force throughout this work. This al-
lows us to describe what we call a Standard Markov Triple, which is a convenient
setting allowing us to develop the formalism of Markov semigroups towards func-
tional inequalities and convergence to equilibrium. Section 3.2 presents, for the con-
crete example of second order differential operators on smooth complete connected
manifolds, some of the basic issues, properties and hypotheses needed to work with
diffusion operators and semigroups on natural classes of functions. This analysis
anticipates the further introduction of the extended function algebra A, which en-
ables the translation into the general framework of some central features such as
connexity, completeness, weak hypo-ellipticity etc. With the tool of essential self-
adjointness at the center of the construction, Sect. 3.3 then emphasizes the relevant
hypotheses and properties of Full Markov Triples, covering in particular the exam-
ples illustrated in Chaps. 1 and 2. Section 3.4 summarizes the various hypotheses
and definitions, introducing the Full Markov Triple as the convenient framework in
which most results of the book are presented. The reader may easily refer to it while
progressing through the subsequent chapters.
The following definition describes the initial setting of investigation. Recall the no-
tion of a good measurable space (p. 7).
3.1 Markov Triples 121
for some matrix a(x, y), (x, y) ∈ E × E. Then, if μ(x) > 0 for any x ∈ E, (3.1.1)
is translated into the identity, valid for every pair (x, y) ∈ E × E,
at 0. Note that, on the other hand, it is also necessary to assert that $\Gamma(f, 1) = 0$, so
that $\Gamma$ will have to be extended later to a wider class containing constant functions.
Remark 3.1.2 Working with classes of functions rather than points, it should be
emphasized that the preceding setting already ensures that the algebra $\mathcal{A}_0$ is rich
enough in a measurable sense. For example, if $f \in L^1(\mu)$ and if $\int_E g f\,d\mu \ge 0$ for
every positive $g \in \mathcal{A}_0$, then $f \ge 0$ ($\mu$-almost everywhere). Indeed, given $A \in \mathcal{F}$
with $\mu(A) < \infty$, there exists a sequence $(f_k)_{k \in \mathbb{N}}$ in $\mathcal{A}_0$ converging to $\mathbb{1}_A$ in $L^1(\mu)$,
and therefore, up to the choice of a subsequence, $\mu$-almost everywhere. By the stability
by composition with functions vanishing at 0, $g_k = 2 f_k^2 (1 + f_k^2)^{-1}$, $k \in \mathbb{N}$,
forms a sequence of functions in $\mathcal{A}_0$, uniformly bounded from
above and converg-
The central diffusion hypothesis will be in force in most of this book. While pre-
sented for the generator in Sect. 1.11, p. 42, it is defined here equivalently on the
carré du champ operator.
\[ \Gamma\big(\Psi(f_1, \ldots, f_k), g\big) = \sum_{i=1}^k \partial_i \Psi(f_1, \ldots, f_k)\,\Gamma(f_i, g). \tag{3.1.2} \]
Recall that equalities between functions (in $\mathcal{A}_0$ here) are understood $\mu$-almost
everywhere. In other words, $\Gamma(f, g)$ is a derivation on each of its arguments. Restricted
to polynomial functions $\Psi$, the preceding boils down to the chain rule
for any choice of f, g, h in A0 . Under the diffusion hypothesis, it is clear that the
fundamental identity (3.1.1) is satisfied.
On the basis of some of the material of Chap. 1, we now describe how the preceding
framework actually includes the more usual description in terms of semigroups and
generators. This discussion will lead us to enrich the underlying algebra A0 with
several properties which, in addition to the diffusion property, will then form the
basic (diffusion) Markov Triple structure. The various properties of A0 presented
below will be summarized in Sect. 3.4.
Given a triple $(E, \mu, \Gamma)$ as in Definition 3.1.1 together with the diffusion property
from Definition 3.1.2, the associated symmetric operator $L$ may then be defined, on
$\mathcal{A}_0$, by the integration by parts formula,
\[ \int_E g\,Lf\,d\mu = -\int_E \Gamma(f, g)\,d\mu \tag{3.1.4} \]
for every $f, g \in \mathcal{A}_0$ (cf. (1.6.3), p. 27). In order to uniquely define the generator $L$
on $\mathcal{A}_0$ from (3.1.4), it is necessary to assume that for any $f \in \mathcal{A}_0$, there exists a
finite constant $C(f)$ such that for any $g \in \mathcal{A}_0$,
\[ \Big| \int_E \Gamma(f, g)\,d\mu \Big| \le C(f)\,\|g\|_2 \tag{3.1.5} \]
\[ L(fg) = f\,Lg + g\,Lf + 2\,\Gamma(f, g) \]
for all f, g ∈ A0 .
It is also necessary to impose that
\[ \int_E Lf\,d\mu = 0 \]
for any $f \in \mathcal{A}_0$, which was identified as the invariance (or stationarity) of the measure
$\mu$ in Chap. 1. Of course, if constant functions belong to $\mathcal{A}_0$, this is automatically
true due to integration by parts (since then $\Gamma(f, 1) = 0$ thanks to the chain rule). It
is however not wise in general to assume that constant functions belong to $\mathcal{A}_0$. This
property will be automatic once the richer class $\mathcal{A}$ has been introduced in Sect. 3.3
below. By symmetry of $\Gamma$, it is clear that the invariant measure $\mu$ is reversible for $L$
in the sense that
\[ \int_E g\,Lf\,d\mu = \int_E f\,Lg\,d\mu \tag{3.1.7} \]
for every pair $(f, g) \in \mathcal{A}_0 \times \mathcal{A}_0$.
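To see the integration by parts formula (3.1.4) and the reversibility property (3.1.7) at work, one may take the one-dimensional Ornstein-Uhlenbeck operator $Lf = f'' - x f'$ of Chap. 2, with $\Gamma(f, g) = f' g'$ and $\mu$ the standard Gaussian measure, acting on polynomials. This choice of model, and all helper names below, are assumptions of the sketch rather than part of the abstract setting.

```python
from fractions import Fraction

def gaussian_moment(k):
    """k-th moment of the standard Gaussian measure: 0 if k is odd, (k-1)!! if k is even."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def integrate(poly):
    """Integral of a polynomial (coefficient list) against the standard Gaussian measure."""
    return sum(Fraction(c) * gaussian_moment(k) for k, c in enumerate(poly))

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(poly):
    return [k * c for k, c in enumerate(poly)][1:]

def ou_operator(poly):
    """Apply f -> f'' - x f' to a coefficient list."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        out[j] -= c * j
    return out

f = [1, 2, 0, -1]      # f(x) = 1 + 2x - x^3
g = [0, 1, 3]          # g(x) = x + 3x^2

g_Lf = integrate(multiply(g, ou_operator(f)))                     # integral of g Lf
f_Lg = integrate(multiply(f, ou_operator(g)))                     # integral of f Lg
minus_gamma = -integrate(multiply(derivative(f), derivative(g)))  # minus integral of Gamma(f, g)
print(g_Lf, f_Lg, minus_gamma)
```

The three printed numbers coincide, which is the content of (3.1.4) and (3.1.7) for this pair of functions.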
124 3 Symmetric Markov Diffusion Operators
\[ L\big(\Psi(f_1, \ldots, f_k)\big) = \sum_{i=1}^k \partial_i \Psi(f_1, \ldots, f_k)\,Lf_i
 + \sum_{i,j=1}^k \partial^2_{ij} \Psi(f_1, \ldots, f_k)\,\Gamma(f_i, f_j) \tag{3.1.8} \]
(for every $f$ in $\mathcal{A}_0$). Now, by the integration by parts formula (3.1.4), this amounts
to
\[ \int_E \Gamma\big(\psi(f), g\big)\,d\mu = \int_E \Gamma\big(f, \psi'(f)\,g\big)\,d\mu - \int_E g\,\psi''(f)\,\Gamma(f)\,d\mu \]
for every $g \in \mathcal{A}_0$, which appears as an immediate consequence of the diffusion
property for $\Gamma$. Conversely, starting from a generator $L$, the change of variables
formula (3.1.9) for $\psi(r) = r^2$ yields in return the carré du champ operator in the
form
\[ 2\,\Gamma(f) = L(f^2) - 2 f\,Lf. \]
Everything is therefore coherent.
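The recovery of $\Gamma$ from $L$ can be illustrated on the model operators of Chap. 2. The sketch below computes $\frac{1}{2}\big(L(f^2) - 2 f\,Lf\big)$ on a polynomial $f$ for the one-dimensional Ornstein-Uhlenbeck operator and for the Jacobi operator (2.7.11), and compares the result with $f'^2$ and $(1 - x^2) f'^2$ respectively. The function names and the test polynomial are hypothetical choices for this illustration.

```python
from fractions import Fraction

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(poly):
    return [k * c for k, c in enumerate(poly)][1:]

def trim(poly):
    """Drop trailing zero coefficients so that polynomials can be compared."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def ou_operator(poly):
    """f -> f'' - x f'."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        out[j] -= c * j
    return out

def jacobi_operator(poly, alpha, beta):
    """f -> (1 - x^2) f'' - [(alpha + beta) x + alpha - beta] f'."""
    out = [Fraction(0)] * len(poly)
    for j, c in enumerate(poly):
        if j >= 2:
            out[j - 2] += c * j * (j - 1)
        if j >= 1:
            out[j - 1] -= c * j * (alpha - beta)
        out[j] -= c * j * (j - 1 + alpha + beta)
    return out

def carre_du_champ(L, f):
    """Gamma(f) = (L(f^2) - 2 f Lf) / 2 as a coefficient list."""
    L_f2 = L(multiply(f, f))
    f_Lf = multiply(f, L(f))
    return trim([(a - 2 * b) / 2 for a, b in zip(L_f2, f_Lf)])

f = [Fraction(c) for c in (2, -1, 0, 4)]          # f(x) = 2 - x + 4x^3
df_squared = multiply(derivative(f), derivative(f))

gamma_ou = carre_du_champ(ou_operator, f)
gamma_jacobi = carre_du_champ(lambda p: jacobi_operator(p, Fraction(3, 2), Fraction(1, 2)), f)
print(gamma_ou == trim(df_squared))
print(gamma_jacobi == trim(multiply([1, 0, -1], df_squared)))
```

In both cases the first order part of $L$ drops out of the computation, which is why the Jacobi parameters do not appear in the resulting $\Gamma$.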
Remark 3.1.4 It is perhaps worth deciphering the relationship between the change
of variables formulas (3.1.2) and (3.1.8) for $\Gamma$ and $L$ respectively, and the mere chain
rule (3.1.3), as given on p. 43.
On a commutative algebra such as $\mathcal{A}_0$, a first order differential operator (with
no 0 order term), also often called a derivation or a vector field, is a linear operator
$Z : \mathcal{A}_0 \to \mathcal{A}_0$ satisfying the chain rule $Z(fg) = f\,Zg + g\,Zf$. An operator
$X : \mathcal{A}_0 \to \mathcal{A}_0$ is a second order differential operator if its associated carré du champ
operator $\Gamma_X(f, g) = \frac{1}{2}\big[X(fg) - f\,Xg - g\,Xf\big]$ is a first order operator in each argument.
(One could define in this way differential operators of any order.) A first
order operator is then simply a second order operator with vanishing carré du champ
operator. Now, it is quite easy to see that, for any first order operator $Z$, the change
of variables formula $Z\big(\Psi(f_1, \ldots, f_k)\big) = \sum_{i=1}^k \partial_i \Psi(f_1, \ldots, f_k)\,Zf_i$ is valid for any
polynomial function $\Psi$ with $\Psi(0) = 0$ (extended without this restriction provided
$\mathcal{A}_0$ contains a unit $1$ for which $Z(1) = 0$). In the same way, the change of variables
formula (3.1.8) is also valid for any second order differential operator $X$ with
polynomial functions $\Psi$. Therefore, the change of variables formula is simply the
extension of the chain rule from polynomials to smooth functions. With these definitions,
$L$ is a second order differential operator, and $\Gamma$ a first order operator in each
of its arguments. Moreover, the $\Gamma_2$ operator (Definition 3.3.12 below) is a second
order differential operator in each of its arguments, and the Hessian operator $H(f)(g, h)$
(Definition 3.3.13 below) is of second order in $f$ and of first order in $g$ and $h$.
Given a triple $(E, \mu, \Gamma)$ as in Definition 3.1.1 with algebra $\mathcal{A}_0$ satisfying the diffusion
property and the preceding requirements, the next step is to understand on
which natural domain the (diffusion) operator $L$, given on $\mathcal{A}_0$, may be extended.
The first task is to construct the Dirichlet form $\mathcal{E}$ and its domain $\mathcal{D}(\mathcal{E})$ by completion
from $\mathcal{A}_0$, and then to build the domain $\mathcal{D}(L)$ of the generator $L$ from this Dirichlet
domain. This procedure yields a self-adjoint operator $L$ with $L^2(\mu)$-domain $\mathcal{D}(L)$,
from which the semigroup $\mathbf{P} = (P_t)_{t \ge 0}$ with infinitesimal generator $L$ and invariant
and reversible measure $\mu$ is then constructed following the lines presented in
Appendix A.
To start with, recall the Dirichlet form $\mathcal{E}$ introduced in Definition 1.7.1, p. 30, as
\[ \mathcal{E}(f, g) = \int_E \Gamma(f, g)\,d\mu = -\int_E f\,Lg\,d\mu, \quad (f, g) \in \mathcal{A}_0 \times \mathcal{A}_0. \tag{3.1.10} \]
Also write $\mathcal{E}(f)$ for $\mathcal{E}(f, f)$. Note that by the Cauchy-Schwarz inequality (both for
the quadratic form $\Gamma$ and in $L^2(\mu)$),
\[ \mathcal{E}(f, g) \le \mathcal{E}(f)^{1/2}\,\mathcal{E}(g)^{1/2}. \]
The Dirichlet form $\mathcal{E}$ is a priori defined only on $\mathcal{A}_0 \times \mathcal{A}_0$ but may actually be extended
to a wider class, the so-called Dirichlet domain $\mathcal{D}(\mathcal{E})$. In fact, if $\mathcal{A}_0$ is endowed
with the Dirichlet norm
\[ \|f\|_{\mathcal{E}} = \big[\|f\|_2^2 + \mathcal{E}(f)\big]^{1/2}, \tag{3.1.11} \]
we may take the completion of $\mathcal{A}_0$ with respect to this norm to turn it into a Hilbert
space embedded in $L^2(\mu)$. For this to be possible, the form must be closable, that
is, if a sequence $(f_k)_{k \in \mathbb{N}}$ in $\mathcal{A}_0$ is such that $\|f_k\|_2 \to 0$ and if $\mathcal{E}(f_k - f_\ell) \to 0$
when $k, \ell \to \infty$, then $\mathcal{E}(f_k) \to 0$. But this is actually automatic with the help of the
operator $L$. Indeed, whenever $k > \ell$,
\[ \mathcal{E}(f_k) = -\int_E f_k\,Lf_\ell\,d\mu + \mathcal{E}(f_k, f_k - f_\ell) \]
so that the sequence $(\Gamma(f_k)^{1/2})_{k \in \mathbb{N}}$ is a Cauchy sequence in $L^2(\mu)$. This allows us
to define in a proper way $\Gamma(f)$, and then $\Gamma(f, g)$ for $f, g \in \mathcal{D}(\mathcal{E})$, as an element of
$L^1(\mu)$ such that $\mathcal{E}(f) = \int_E \Gamma(f)\,d\mu$.
At this stage, it is certainly worth making the following observation.
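Proposition 3.1.5 If $f \in \mathcal{D}(\mathcal{E})$ and $h \in \mathcal{A}_0$, then $hf \in \mathcal{D}(\mathcal{E})$ (in other words, $\mathcal{D}(\mathcal{E})$
is an $\mathcal{A}_0$-module). More precisely, whenever a sequence $(f_k)_{k \in \mathbb{N}}$ in $\mathcal{A}_0$ converges in
$\mathcal{D}(\mathcal{E})$ to $f$, then the sequence $(h f_k)_{k \in \mathbb{N}}$ converges in $\mathcal{D}(\mathcal{E})$ to $hf$. Moreover, for any
$f, g$ in $\mathcal{D}(\mathcal{E})$ and any $h \in \mathcal{A}_0$, $\Gamma(hf, g) = h\,\Gamma(f, g) + f\,\Gamma(h, g)$.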
This inequality applied to an approximating sequence yields the result. The second
claim of the proposition is achieved in the same way.
a semigroup of operators on a Banach space (cf. Sect. 1.4, p. 18, and Sect. A.1,
p. 473). On D(L), the semigroup (Pt )t≥0 with generator L solves the heat equation
∂t Pt = LPt = Pt L
(and accordingly is often called the heat semigroup or heat flow with respect to L).
It now remains to check the positivity preserving and Markov properties of
(Pt )t≥0 , which will be addressed in the next sub-sections.
That the associated semigroup $(P_t)_{t \ge 0}$ is positivity preserving (i.e. $P_t f \ge 0$ whenever
$f \ge 0$) is not at all clear from the above construction. The proof given
via (1.4.5), p. 21, is purely formal and cannot be turned into a real proof without further
hypotheses on the triple $(E, \mu, \Gamma)$, thus we require another approach. Indeed,
in the general theory of Dirichlet forms, in order for $(P_t)_{t \ge 0}$ (constructed from $\mathcal{E}$)
to be positivity preserving it suffices that the Dirichlet domain $\mathcal{D}(\mathcal{E})$ is stable under
the maps $f \mapsto \psi(f)$, where $\psi : \mathbb{R} \to \mathbb{R}$ is a smooth function such that $\psi(0) = 0$
and $|\psi'| \le 1$, and moreover that for such $\psi$'s,
\[ \mathcal{E}\big(\psi(f)\big) \le \mathcal{E}(f). \tag{3.1.12} \]
which is clear from spectral decomposition (cf. Appendix A), and therefore, in
order to prove that Pt f ≤ 1, it is enough to show that g = λRλ (f ) ≤ 1. Now,
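$\lambda g = Lg + \lambda f$, and from the preceding, for every smooth $\psi$ with $\psi(0) = 0$ and
$|\psi'| \le 1$,
\[ \lambda \int_E g \big(g - \psi(g)\big)\, d\mu = \int_E Lg \big(g - \psi(g)\big)\, d\mu + \lambda \int_E f \big(g - \psi(g)\big)\, d\mu \le \lambda \int_E f \big(g - \psi(g)\big)\, d\mu. \]
Thus
\[ \int_E (f - g)\big(g - \psi(g)\big)\, d\mu \ge 0. \]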
This property, established for any such smooth function ψ, easily extends to the
function ψ(r) = r ∧ 1, r ∈ R, from which
\[ \int_{\{g > 1\}} (f - g)(g - 1)\, d\mu \ge 0. \]
Since f ≤ 1, it follows that μ(g > 1) = 0. By homogeneity, for any a > 0, when-
ever f ≤ a, we have Pt f ≤ a, and therefore if f ≤ 0, then Pt f ≤ 0 which is the
announced positivity property. The argument ensures in the same way that the as-
sociated semigroup is sub-Markov (that is Pt (1) ≤ 1, where 1 is understood in the
sense of Definition 3.1.1).
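Now, restricting to the case of diffusion operators, (3.1.12) easily holds. Indeed,
it is not hard to show that if $\psi$ is smooth with $\psi(0) = 0$ and $\psi'$ bounded,
and if $f \in \mathcal{D}(\mathcal{E})$, then $\psi(f) \in \mathcal{D}(\mathcal{E})$ and $\Gamma(\psi(f)) = \psi'(f)^2\, \Gamma(f)$. To see why
$\psi(f) \in \mathcal{D}(\mathcal{E})$, for a sequence $(f_k)_{k\in\mathbb{N}}$ in $\mathcal{A}_0$ converging to $f \in \mathcal{D}(\mathcal{E})$ in the Dirichlet
norm and $\mu$-almost everywhere, and for $k, \ell \in \mathbb{N}$, we have
\[ \mathcal{E}\big(\psi(f_k) - \psi(f_\ell)\big) \le \int_E \Big[ \psi'(f_k)^2\, \Gamma(f_k - f_\ell) + \big|\psi'(f_k)^2 - \psi'(f_\ell)^2\big|\, \Gamma(f_k)
 + 2\big|\psi'(f_k)\big|\, \big|\psi'(f_k) - \psi'(f_\ell)\big|\, \Gamma(f_k)^{1/2}\, \Gamma(f_\ell)^{1/2} \Big] d\mu. \]
Since the sequence $(\Gamma(f_k))_{k\in\mathbb{N}}$ is uniformly integrable, it follows that $(\psi(f_k))_{k\in\mathbb{N}}$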
is a Cauchy sequence in D(E) and therefore convergent. Hence (3.1.12) is satisfied,
as announced.
Recall that the semigroup just constructed is moreover sub-Markov (that is, it
is positivity preserving and satisfies Pt (1) ≤ 1) rather than Markov (Pt (1) = 1).
Criteria for the latter to hold (Definition 3.1.12) will be presented later.
The preceding construction is the so-called Friedrichs self-adjoint extension of
the operator L initially given on A0 as described in Sect. A.3, p. 477, for which
A0 ⊂ D(L) ⊂ D(E).
Basically, the domain $\mathcal{D}(\mathcal{E})$ of the Dirichlet form is the $L^2(\mu)$-domain of $(-L)^{1/2}$.
As already emphasized in Sect. 1.4, p. 18, and Sect. 1.6, p. 24, and above, the con-
struction of the Dirichlet form E with domain D(E) and of the self-adjoint operator
L with L2 (μ)-domain D(L) yields a Markov or sub-Markov semigroup P = (Pt )t≥0
on (E, F) with infinitesimal generator L and invariant reversible measure μ. The
semigroup viewpoint has several advantages and actually allows us to identify fur-
ther properties of interest.
Consider, for example, for a given function f in L2 (μ) and t > 0, the expression
\[ \frac{1}{t} \int_E f (f - P_t f)\, d\mu = \frac{1}{t} \Big( \int_E f^2\, d\mu - \int_E (P_{t/2} f)^2\, d\mu \Big). \tag{3.1.13} \]
By (1.7.2), p. 30, this expression is decreasing in t. The domain D(E) of the Dirich-
let form may then be described equivalently as the set of functions f ∈ L2 (μ) for
which the limit of this expression is finite when t → 0. If f is in the L2 (μ)-domain
$\mathcal{D}(L)$ of the generator L with semigroup $P = (P_t)_{t\ge 0}$, it is also in the Dirichlet
domain $\mathcal{D}(\mathcal{E})$ by the integration by parts formula $\mathcal{E}(f) = -\int_E f\, Lf\, d\mu$.
The expression (3.1.13) may moreover be rewritten using the kernels pt (x, dy)
of the representation (1.2.4), p. 12, as
\[ \frac{1}{2t} \int_E \int_E \big[ f(x) - f(y) \big]^2\, p_t(x, dy)\, d\mu(x). \tag{3.1.14} \]
With respect to (3.1.12), the latter therefore also covers contractions ψ which are not
necessarily smooth. Moreover, using Fatou’s lemma, whenever a sequence (fk )k∈N
in D(E) converges in L2 (μ) to f ∈ D(E), then, since the expression (3.1.14) is
decreasing in t,
\[ \mathcal{E}(f) \le \liminf_{k \to \infty} \mathcal{E}(f_k). \tag{3.1.16} \]
This result captures the fact that $\Gamma(f) = 0$ as long as f remains constant. This will
be further developed in Chap. 8 as a weak co-area formula.
In this language, the contraction property (3.1.15) along the Dirichlet form may
be translated into a contraction property for $\Gamma$ itself (even without the diffusion
assumption). Indeed, for $f \in \mathcal{A}_0$ and $t_k \to 0$, $\mu$-almost everywhere in $x \in E$,
\[ \Gamma(f)(x) = \lim_{k \to \infty} \frac{1}{2 t_k} \Big[ P_{t_k}(f^2)(x) - \big(P_{t_k} f\big)^2(x) \Big]. \]
But, if $P_t f(x) = \int_E f(y)\, p_t(x, dy)$, this last expression may be written as
\[ \frac{1}{4 t_k} \int_E \int_E \big[ f(y_1) - f(y_2) \big]^2\, p_{t_k}(x, dy_1)\, p_{t_k}(x, dy_2), \]
and is therefore clearly decreasing under the action of a contraction $\psi$. Hence, for
any $\psi : \mathbb{R} \to \mathbb{R}$ such that $|\psi(r) - \psi(s)| \le |r - s|$, $r, s \in \mathbb{R}$, and every $f \in \mathcal{A}_0$,
\[ \Gamma\big(\psi(f)\big) \le \Gamma(f), \tag{3.1.18} \]
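The mechanism behind this contraction property can be checked numerically on a toy case. The sketch below assumes a finite state space, so that the kernel $p_t(x, dy)$ at a fixed point x becomes a probability vector `weights` (a hypothetical choice, as are the sample `values`); it verifies that the double-sum expression equals a variance and that it can only decrease under the 1-Lipschitz map $\psi(r) = r \wedge 1$.

```python
from typing import Sequence


def half_mean_square_gap(values: Sequence[float], weights: Sequence[float]) -> float:
    """(1/2) * sum_{i,j} (v_i - v_j)^2 w_i w_j, the discrete analogue of the double integral."""
    return 0.5 * sum(
        (vi - vj) ** 2 * wi * wj
        for vi, wi in zip(values, weights)
        for vj, wj in zip(values, weights)
    )


def variance(values: Sequence[float], weights: Sequence[float]) -> float:
    """sum_i v_i^2 w_i - (sum_i v_i w_i)^2, the analogue of P(f^2) - (P f)^2."""
    mean = sum(v * w for v, w in zip(values, weights))
    second_moment = sum(v * v * w for v, w in zip(values, weights))
    return second_moment - mean ** 2


def clip(r: float) -> float:
    """The contraction psi(r) = min(r, 1), which is 1-Lipschitz."""
    return min(r, 1.0)


# Hypothetical data: a probability vector and the values of f on four points.
weights = [0.1, 0.2, 0.3, 0.4]
values = [-2.0, 0.5, 1.5, 3.0]
clipped = [clip(v) for v in values]

print(half_mean_square_gap(values, weights), variance(values, weights))
print(half_mean_square_gap(clipped, weights))
```

Each squared gap $(\psi(v_i) - \psi(v_j))^2$ is bounded by $(v_i - v_j)^2$, so the inequality holds term by term; this is the finite version of the argument above, not a proof in the general setting.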
Therefore $\mathcal{E}(P_t f) \le \frac{1}{2et} \int_E f^2\, d\mu$ for $t > 0$.
More or less, all items in this proposition are immediate from the definition of the
spectral decomposition. (iii) is a simple consequence of the bound $r^2 e^{-r} \le 4 e^{-2}$ on
$\mathbb{R}_+$, (iv) follows from the bound $r e^{-r} \le e^{-1}$, while (v), which follows from the fact
that $t \mapsto \lambda e^{-2\lambda t}$ is decreasing, may be seen more directly using that
\[ \partial_t\, \mathcal{E}(P_t f) = -\partial_t \int_E P_t f \, L P_t f \, d\mu = -2 \int_E (L P_t f)^2\, d\mu \le 0. \]
It may finally be pointed out that Proposition 3.1.6 takes a much simpler form
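in the case of a discrete spectrum. Indeed, in this case there exists a sequence
$(E_k)_{k\in\mathbb{N}}$ of orthogonal closed subspaces of $L^2(\mu)$ with $\bigoplus_{k\in\mathbb{N}} E_k = L^2(\mu)$ which
are eigenspaces of L associated with the eigenvalues $-\lambda_k$, $\lambda_k \ge 0$, $k \in \mathbb{N}$. When a
function $f \in L^2(\mu)$ is decomposed into $f = \sum_{k\in\mathbb{N}} f_k$, then
\[ -Lf = \sum_{k\in\mathbb{N}} \lambda_k f_k, \qquad P_t f = \sum_{k\in\mathbb{N}} e^{-\lambda_k t} f_k, \qquad \mathcal{E}(f) = \sum_{k\in\mathbb{N}} \lambda_k \|f_k\|_2^2 \]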
and so on.
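The discrete-spectrum formulas lend themselves to a quick numerical sanity check. The following is a minimal sketch, assuming a finite list of eigenvalues `lams` and coefficients `coeffs` (both hypothetical, with `coeffs[k]` playing the role of the signed norm of $f_k$ along a unit eigenvector); it evaluates $\mathcal{E}(P_t f)$, the bound $\frac{1}{2et}\|f\|_2^2$, and the difference quotient (3.1.13).

```python
import math
from typing import List, Sequence


def semigroup_coeffs(lams: Sequence[float], coeffs: Sequence[float], t: float) -> List[float]:
    """Coefficients of P_t f: the k-th component is damped by exp(-lambda_k * t)."""
    return [math.exp(-lam * t) * c for lam, c in zip(lams, coeffs)]


def dirichlet_energy(lams: Sequence[float], coeffs: Sequence[float]) -> float:
    """E(f) = sum_k lambda_k * ||f_k||_2^2, with ||f_k||_2 = |c_k|."""
    return sum(lam * c * c for lam, c in zip(lams, coeffs))


def l2_norm_sq(coeffs: Sequence[float]) -> float:
    """||f||_2^2 = sum_k c_k^2 by orthogonality of the eigenspaces."""
    return sum(c * c for c in coeffs)


def difference_quotient(lams: Sequence[float], coeffs: Sequence[float], t: float) -> float:
    """(1/t) * integral of f (f - P_t f) = sum_k c_k^2 (1 - exp(-lambda_k t)) / t."""
    return sum(c * c * (-math.expm1(-lam * t)) / t for lam, c in zip(lams, coeffs))


# Hypothetical spectrum of -L and coefficients of f.
lams = [0.0, 1.0, 2.5, 7.0]
coeffs = [0.3, -1.2, 0.8, 2.0]

for t in (0.1, 0.5, 2.0):
    energy_t = dirichlet_energy(lams, semigroup_coeffs(lams, coeffs, t))
    print(t, energy_t, l2_norm_sq(coeffs) / (2 * math.e * t), difference_quotient(lams, coeffs, t))
```

In this finite model the difference quotient decreases in t and approaches $\mathcal{E}(f)$ as $t \to 0$, in line with the description of the Dirichlet domain through (3.1.13).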
Remark 3.1.7 Using the same arguments as in Proposition 3.1.6, it may be shown
that if $f \in L^2(\mu)$, then $P_t f \in \mathcal{D}(L^k)$, $t > 0$, for any $k \in \mathbb{N}$. Therefore, at the $L^2(\mu)$
level, even in the absence of any hypo-ellipticity condition, the semigroup is smoothing
as an effect of symmetry only. Moreover, if $f \in \mathcal{D}(\mathcal{E})$, then $P_t f$ converges to f
in $\mathcal{D}(\mathcal{E})$ as $t \to 0$. This is also true if $f \in \mathcal{D}(L)$ as $\bigcap_k \mathcal{D}(L^k)$ is dense in $\mathcal{D}(L^p)$ for
any $p \in \mathbb{N}$, again by means of the smoothing properties of the semigroup.
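for every $(f, g) \in \mathcal{A}_0 \times \mathcal{A}_0$. The domain $\mathcal{D}(\mathcal{E})$ of the Dirichlet form $\mathcal{E}$ is the completion
of $\mathcal{A}_0$ with respect to the norm $\|f\|_{\mathcal{E}} = [\|f\|_2^2 + \mathcal{E}(f)]^{1/2}$, and $\mathcal{E}$ is extended
to $\mathcal{D}(\mathcal{E})$ by continuity together with the carré du champ operator $\Gamma$. The symmetric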
semigroup of contractions P = (Pt )t≥0 with infinitesimal generator L defined on its
L2 (μ)-domain D(L) is positivity preserving but not necessarily Markov (in general
it is only sub-Markov, i.e. Pt (1) ≤ 1).
The following addresses the question of essential self-adjointness which will turn
out to be central in the abstract construction developed in Sect. 3.3 below, in partic-
ular concerning mass conservation. Besides the operator L constructed so far, it is
necessary to introduce an adjoint operator L∗ , where the duality refers both to the
measure μ and the algebra A0 .
In this situation, L∗ f is the unique element of L2 (μ) such that for any g ∈ A0 ,
\[ \int_E f \, Lg \, d\mu = \int_E g \, L^* f \, d\mu. \]
In order to ensure the ESA property, extra conditions are needed. The next sec-
tion will demonstrate how this property holds for second order elliptic differential
operators on smooth complete connected manifolds where A0 is the class of smooth
compactly supported functions. Section 3.3 will address this issue in the present ab-
stract framework.
Examples in Chap. 2 show that there may be many different self-adjoint extensions
$\hat{L}$ of a given operator L defined on $\mathcal{A}_0$, and conditions for the ESA property
to hold may depend on the behavior of the measure at the boundary of the domain
(Sect. 2.4, p. 92, and Sect. 2.6, p. 97). In general, for various self-adjoint extensions
$\hat{L}$, their domains $\mathcal{D}(\hat{L})$ are not comparable. To each self-adjoint extension $\hat{L}$
corresponds a Dirichlet form with domain $\mathcal{D}(\mathcal{E})$ and an associated semigroup (not
necessarily Markov), using the procedure described by (3.1.13) below. In contrast
to the different domains $\mathcal{D}(\hat{L})$ however, the associated Dirichlet domains $\mathcal{D}(\mathcal{E})$ may
be compared. The domain $\mathcal{D}(\mathcal{E})$ of the Friedrichs extension that we constructed is
the minimal one, and corresponds in practice to operators with the Dirichlet bound-
ary conditions. There is also a maximal extension corresponding to the Neumann
boundary conditions. In the context developed here, namely if the operator is essen-
tially self-adjoint, the minimal and maximal extensions coincide and there is only
one Dirichlet domain (referred to as the unique extension).
fied. For instance, Sect. 2.6, p. 97, gives examples of operators which are essentially
self-adjoint but nevertheless not conservative.
These observations lead to the following two definitions in the setting of a
Diffusion Markov Triple $(E, \mu, \Gamma)$ with associated operator L and semigroup
P = (Pt )t≥0 .
To verify the equivalence of the two possible definitions, note that if f ∈ D(L) is
such that $Lf = 0$, then $\mathcal{E}(f) = 0$ and therefore $\Gamma(f) = 0$. Conversely, if $f \in \mathcal{D}(\mathcal{E})$
is such that $\Gamma(f) = 0$, then for any $h \in \mathcal{D}(\mathcal{E})$, $\mathcal{E}(f, h) = 0$, and therefore f belongs
to D(L) with Lf = 0.
Observe that since Pt is defined on any bounded measurable function, the def-
inition makes sense (and is in fact of great importance) even when μ(E) = ∞.
Section 3.2.4 below presents a useful criterion ensuring mass conservation for sec-
ond order differential operators on a complete Riemannian manifold. Section 3.3
further develops the abstract framework to deduce this property from essential self-
adjointness and curvature conditions. On the other hand, when solving a stochastic
differential equation, it is sometimes easy to directly check mass conservation. In
particular, we presented in the second chapter (Sect. 2.6, p. 97) criteria in dimension
one which are almost necessary and sufficient conditions for this to happen.
From the preceding, the following proposition is immediate.
Remark 3.1.14 It may be observed that under the ergodicity and the ESA properties,
the set $\{Lh \,;\, h \in \mathcal{A}_0\}$ is dense in the space $L^2_0(\mu)$ of functions in $L^2(\mu)$
which are orthogonal to constants (that is $L^2(\mu)$ itself when $\mu(E) = \infty$). Indeed,
if $f \in L^2(\mu)$ is such that, for any $h \in \mathcal{A}_0$, $\int_E f \, Lh \, d\mu = 0$, then $f \in \mathcal{D}(L^*)$ with
$L^* f = 0$. From the ESA property, $f \in \mathcal{D}(L)$ and $Lf = 0$. Ergodicity allows us to
conclude that f is constant. It is then 0 when $\mu(E) = \infty$. When $\mu(E) = 1$, since
for any $h \in \mathcal{A}_0$, $Lh \in L^2_0(\mu)$, the same conclusion follows working in $L^2_0(\mu)$ instead
of $L^2(\mu)$.
We close this section with two further properties which are relevant in this context.
Even though there is no topology on the state space E, functions in D(E) share
some minimal regularity as a consequence of the diffusion property. In particular
they satisfy a measurable form of the intermediate value theorem.
Proof Assume that $\mu(f \in [c, d]) = 0$ and choose $\psi : \mathbb{R} \to \mathbb{R}$ smooth and constant
outside the interval $[c, d]$ with $\psi(a) \ne \psi(b)$, such that $\psi'(f) = 0$. Since by the
diffusion property
\[ \int_E \Gamma\big(\psi(f)\big)\, d\mu = \int_E \psi'(f)^2\, \Gamma(f)\, d\mu = 0, \]
The slicing procedure described in the next result will prove to be particularly
useful.
The proof of this result is an easy consequence of (3.1.16) and the change
of variable formula. Indeed, using some smooth approximation of the function
$r \mapsto (r - a_k)_+ \wedge (a_{k+1} - a_k)$, and (3.1.16), it is easily seen that
\[ \mathcal{E}(f_k) \le \int_{\{f \in (a_k, a_{k+1})\}} \Gamma(f)\, d\mu. \]
An important feature used here is that $\Gamma(f) = 0$ on the set $\{f = a\}$, which is a
consequence of (3.1.17).
what follows, the manifold could of course be $\mathbb{R}^n$, or some open set in $\mathbb{R}^n$. Indeed,
as soon as the coefficients $g^{ij}(x)$ in (3.2.1) below are not constant, the relevant
quantities do not differ much in $\mathbb{R}^n$ or in a general manifold. This section makes
significant use of the material presented in Appendices A and C.
In the manifold context, here and throughout, the algebra A0 consists of the
smooth C ∞ compactly supported functions. The degree of smoothness may actu-
ally depend on the smoothness of the manifold, however it will be at least C 2 . For
simplicity, we only consider C ∞ manifolds and function algebras, for which the
generic terminology “smooth” will be used. More general instances may be adapted
accordingly.
We start with some general observations. As already pointed out in the last sec-
tion, when dealing with diffusion operators L and semigroups (Pt )t≥0 on manifolds,
classes of functions which are larger than the class A0 , such as the class of smooth
functions without special growth conditions, may be used. It is then necessary to be
cautious when using integration by parts. Indeed, as discussed in Sect. 1.6, p. 24,
for a symmetric diffusion operator L on a smooth manifold, the integration by parts
formula holds as soon as f and g are smooth and compactly supported, and it is
actually enough that one of the functions be compactly supported. If neither of the
smooth functions f and g is compactly supported, the validity of the formula re-
quires some knowledge on the behavior of the functions f and g at infinity depend-
ing on the underlying (reversible) measure μ. The statement that a function f is
in the domain D(L) of L hides two different aspects, a regularity assumption, and
a growth condition. Moreover, in the presence of boundaries, there are additional
requirements at the boundary (for example, normal derivatives at the boundary have
to vanish in the case of Neumann boundary conditions).
Similar comments hold for the associated semigroup P = (Pt )t≥0 . Indeed, when
the manifold is not compact, Pt f , t > 0, would never be compactly supported even
if f is. But the smoothness of Pt f for smooth compactly supported functions f is
in general a consequence of two types of criteria: when the semigroup is given as the
law of the solution of a stochastic differential equation with global properties such as
non-explosion (see Sect. B.4, p. 495), or from local considerations as a consequence
of ellipticity or hypo-ellipticity (see Sect. 1.12, p. 49).
It is therefore important to determine a setting in which the integration by parts
formula (3.1.4) and other properties hold beyond the preceding class $\mathcal{A}_0$. For example,
in the core of the analysis, one is often led to consider quantities such as
$\Gamma(P_t f)$. If we restrict ourselves to smooth functions f in the $L^2(\mu)$-domain of L,
$\Gamma(P_t f)$ is certainly not in general in $L^2(\mu)$. Furthermore, unless L may be defined
on any smooth function, the meaning of $L\,\Gamma(P_t f)$ is not entirely clear. What would
then guarantee for a function such as $g = \Gamma(P_t f)$ that $\partial_t P_t g = P_t(Lg)$? Indeed,
coming back to the basic examples of symmetric diffusions with density kernels as
presented in the preceding chapters, so that for $t > 0$ and $x \in E$,
\[ P_t f(x) = \int_E f(y)\, p_t(x, y)\, d\mu(y) \]
and $L_x p_t(x, y) = L_y p_t(x, y)$, then, assuming that derivation and integration commute,
\[ \partial_t P_t f(x) = \int_E f(y)\, L_x p_t(x, y)\, d\mu(y) = \int_E f(y)\, L_y p_t(x, y)\, d\mu(y) = \int_E Lf(y)\, p_t(x, y)\, d\mu(y). \]
But the last step here is in fact integration by parts and requires some a priori knowl-
edge on the behavior with respect to μ of both pt (x, y) and f (y) at infinity.
As mentioned above, this discussion may also be illustrated in probabilistic terms
on the diffusion semigroups given by solutions of stochastic differential equations
as in Sect. 1.10, p. 38. From the representation (1.1.1), p. 8, of the semigroup in
terms of the expectation $\mathbb{E}(f(X_t^x))$, it is clear that as soon as the associated Markov
process $X = \{X_t^x \,;\, t \ge 0, x \in E\}$ is continuous and f and Lf are bounded and continuous,
then $\partial_t P_t f = P_t Lf$. Fortunately, with a minimum of assumptions on the
state space emphasized below, this property still holds for smooth functions f in
L2 (μ) as soon as Lf is also in L2 (μ).
Motivated by this discussion, the two main questions and properties which should
be addressed consistently are therefore the uniqueness of the symmetric extension
of L given the algebra A0 and the mass conservation (Markov) property Pt (1) = 1.
The uniqueness or essential self-adjointness question (Definition 3.1.10), that is
whether A0 is a core for the operator L (cf. Sect. A.5, p. 481), is an important issue
since, when satisfied, any formula valid in A0 and continuous under the topology
of the domain D(L) would still hold on this domain. The Markov property, on the
other hand, allows for the correct probabilistic setting and interpretations.
The aim of this section is to establish these two properties in the context of sec-
ond order diffusion operators on a smooth complete connected manifold M, for the
class A0 of smooth (C ∞ ) compactly supported functions. Therefore, throughout this
section, we deal with a second order differential operator L acting in a local system
of coordinates on functions f : M → R in A0 as
\[ Lf = \sum_{i,j=1}^n g^{ij}\, \partial^2_{ij} f + \sum_{i=1}^n b^i\, \partial_i f \tag{3.2.1} \]
and clearly L and $\Gamma$ satisfy the diffusion property of Definition 3.1.3. The associated
semigroup P = (Pt )t≥0 with generator L as constructed from the Dirichlet form
in Sect. 3.1 may already be assumed to be sub-Markov (Pt (1) ≤ 1).
The exposition here in the manifold case points towards the general setting de-
veloped in the next section. We try to identify the various properties used on the
manifold and the differential operator in order to express them directly in terms of
the operator and its carré du champ operator. The hypotheses put forward in Sect. 3.3
fully rely on this main example. The three basic properties required to achieve this
program and uniqueness of the symmetric extension are connexity, completeness
and (weak-) hypo-ellipticity, which are discussed in the following subsections.
3.2.1 Connexity
Recalling that $\Gamma(f)$ stands for $|\nabla f|^2$ (for Laplacians on manifolds, for example), a
minimum requirement is that if $f : M \to \mathbb{R}$ is constant then $\Gamma(f) = 0$. This is an
easy consequence of the diffusion property (3.1.8). Conversely, it is also a natural
requirement that if a function f satisfies $\Gamma(f) = 0$, then it is constant. As illustrated
next, this property actually hides the connexity of the underlying state space together
with some intrinsic form of hypo-ellipticity.
In the context of finite Markov chains (see Sect. 1.9, p. 33), connexity is equiv-
alent to the fact that there is only one recurrence class, or in other words that the
Markov chain has a strictly positive probability of moving from any point to any
other point in a finite number of steps. In some abstract situation, connexity to-
gether with the diffusion property implies some minimal continuity for functions in
the Dirichlet space. It shows in particular that the diffusion property may never hold
on a discrete space.
If a diffusion operator L is elliptic in the sense of (1.12.2), p. 50, a function
$f : M \to \mathbb{R}$ satisfying $\Gamma(f) = 0$ is locally constant. As announced, that locally
constant functions are constant entails connexity of the state space. For example,
if M is just two copies of $\mathbb{R}$, that is $M = \mathbb{R} \times \{-1, +1\}$ and if, for $\varepsilon = \pm 1$,
$Lf(x, \varepsilon) = \partial_x^2 f(x, \varepsilon)$, then it is clear that any function f satisfying $\Gamma(f) = 0$ is
constant on each copy of R (each connected component of M) but may not be glob-
ally constant.
The picture may be even more complicated. Consider, on $\mathbb{R}^2$, the operator
$Lf = \partial_x^2 f$ for which $\Gamma(f) = (\partial_x f)^2$ so that the connexity property does not hold.
Note that L is not hypo-elliptic in the sense described in Sect. 1.12, p. 49. How-
ever, in this context, the situation is rather simple. The variable y is just a parame-
ter, and the associated Markov process (Xt , Yt )t≥0 starting from (x, y) amounts to
(Xt , y)t≥0 where (Xt )t≥0 is a one-dimensional Brownian motion (with speed 2)
starting from x. The state space may thus be restricted to R × {y} and L may
be considered as an elliptic operator on R satisfying the connexity condition.
Let now $Lf = \partial_x^2 f + x\, \partial_y f$ for which, still, $\Gamma(f) = (\partial_x f)^2$ so that the operator
is still non-connected. But now, the associated Markov process is of the form
$(X_t, y + \int_0^t X_s\, ds)_{t \ge 0}$ and no longer preserves the second component. It is therefore
not possible to restrict here the analysis to R × {y}, and the connexity property does
not hold although the operator is now hypo-elliptic.
It may be thought that connexity is related to ellipticity. The following example
indicates that it is not the case. On R3 , consider the two vector fields
\[ Z_1 = \partial_x - \frac{y}{2}\, \partial_z, \qquad Z_2 = \partial_y + \frac{x}{2}\, \partial_z, \tag{3.2.3} \]
and the operator $L = Z_1^2 + Z_2^2$. This operator is not elliptic, but connected. Indeed,
$\Gamma(f) = (Z_1 f)^2 + (Z_2 f)^2$, and a smooth function f which satisfies $\Gamma(f) = 0$ is
such that $Z_1 f = Z_2 f = 0$. Now the commutator $[Z_1, Z_2] = Z_3$ is $Z_3 f = \partial_z f$ so
that f also satisfies Z3 f = 0 and is therefore constant. This model is classically
known as the Heisenberg (hypo-elliptic) operator.
Connexity will play a fundamental role in many functional inequalities below (re-
lated to tightness), and will be used to establish convergence to equilibrium (ergod-
icity). An abstract presentation of the connexity property will be further developed
in Sect. 3.3.
integration by parts being justified by the fact that ζ is compactly supported. To-
gether with the chain rule formula,
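\[ 0 \le \int_M f^2 \zeta^2\, d\mu = - \int_M \zeta^2\, \Gamma(f)\, d\mu - 2 \int_M f \zeta\, \Gamma(f, \zeta)\, d\mu. \]
As a consequence,
\[ \int_M \zeta^2\, \Gamma(f)\, d\mu \le 2 \int_M |f \zeta|\, \Gamma(f)^{1/2}\, \Gamma(\zeta)^{1/2}\, d\mu. \]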
Proposition 3.2.1 above ensures that the hypo-elliptic diffusion operator L from
(3.2.1) on a smooth complete connected manifold M is essentially self-adjoint
and is therefore the infinitesimal generator in L2 (μ) of a symmetric semigroup
P = (Pt )t≥0 as discussed in Appendix A. While the Dirichlet form construction
of Sect. 3.1 already ensures that (Pt )t≥0 may be assumed sub-Markov, one central
question is to determine whether or not Pt (1) = 1 (for every t ≥ 0). As already
pointed out, depending on the context, this property is variously referred to as mass
conservation, Markov, stochastic completeness or non-explosion. Indeed, even in
the absence of boundary, it may happen that the associated Markov process $(X_t^x)_{t \ge 0}$
goes to infinity in finite time. Denoting by T this explosion time, if the manifold M
is complete, there is just one symmetric semigroup $(P_t)_{t \ge 0}$ with generator L, and in
this case, $P_t f(x) = \mathbb{E}(f(X_t^x)\, \mathbf{1}_{\{t < T\}})$ so that the Markov property $P_t(1) = 1$ does
not hold when T < ∞. As developed in this section, the validity of the mass conser-
vation or Markov property in a smooth manifold setting is closely related to gradient
bounds. The mass conservation property will require more information on the set A0
than the mere completeness of the manifold and makes use of the extended algebra
A of smooth functions on the manifold M.
A useful criterion towards mass conservation is the use of the curvature condi-
tion CD(ρ, ∞) of Definition 1.16.1, p. 72. This investigation will raise in addition
further issues concerning the algebras $\mathcal{A}_0$ and $\mathcal{A}$. Recall to start with the $\Gamma_2$ operator
from (1.16.1), p. 71, defined for $f \in \mathcal{A}_0$ by
\[ \Gamma_2(f) = \Gamma_2(f, f) = \frac{1}{2} \big[ L\, \Gamma(f) - 2\, \Gamma(f, Lf) \big]. \]
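The Riemannian content of the $\Gamma$ and $\Gamma_2$ operators is emphasized in Sect. C.6,
p. 513, as $\Gamma(f) = |\nabla f|^2$ and
\[ \Gamma_2(f) = |\nabla\nabla f|^2 + (\mathrm{Ric}_g - \nabla_S Z)(\nabla f, \nabla f) \]
where $\mathrm{Ric}_g$ is the Ricci tensor and $\nabla_S Z$ the symmetric part of $\nabla Z$. As for L and
$\Gamma$, the $\Gamma_2$ operator may be extended in the manifold case to the class $\mathcal{A}$ of smooth
functions. Recall then the curvature condition $CD(\rho, \infty)$, $\rho \in \mathbb{R}$,
\[ \Gamma_2(f) \ge \rho\, \Gamma(f) \]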
(for all f ∈ A thus) which suitably extends the notion of Ricci curvature lower
bound on Riemannian manifolds.
One fundamental result with respect to the curvature condition CD(ρ, ∞) is the
following gradient bound or commutation between the actions of the semigroup
and the carré du champ operator. Simple examples are the Brownian and Ornstein-
Uhlenbeck semigroups in Rn with ρ = 0 or ρ = 1 respectively (cf. e.g. (2.7.5),
p. 104). The operator L (from (3.2.1)) is assumed to be elliptic in the sense
of (1.12.2), p. 50, since in general no curvature bounds may hold for non-elliptic
operators.
Theorem 3.2.3 (Gradient bound) Let L be an elliptic diffusion operator with semi-
group P = (Pt )t≥0 , symmetric with respect to μ, on a smooth complete connected
manifold M. If L satisfies the curvature condition CD(ρ, ∞) for some ρ ∈ R, then,
for every f ∈ A0 and every t ≥ 0,
\[ \Gamma(P_t f) \le e^{-2\rho t}\, P_t\big(\Gamma(f)\big). \]
It is not difficult to see that, conversely, the gradient bound of Theorem 3.2.3
for every t ≥ 0 implies in return the curvature condition CD(ρ, ∞) (see Corol-
lary 3.3.19 below or Theorem 4.7.2, p. 209, in the context of local Poincaré inequal-
ities). Below, we present a formal and easy proof of Theorem 3.2.3, and then we
address the question of how this formal argument can be justified.
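Before turning to the proof, the gradient bound can be illustrated on the one-dimensional Ornstein-Uhlenbeck semigroup, for which $\rho = 1$. The sketch below rests on assumptions that are not spelled out in this section: the generator is taken as $Lf = f'' - x f'$, Mehler's formula gives $P_t f(x) = e^{-2t} x^2 + (1 - e^{-2t})$ for the test function $f(y) = y^2$, and $\Gamma(f) = (f')^2$.

```python
import math

RHO = 1.0  # curvature constant assumed for the Ornstein-Uhlenbeck operator L f = f'' - x f'


def pt_square(x: float, t: float) -> float:
    """P_t f(x) for f(y) = y^2, via Mehler's formula: e^{-2t} x^2 + (1 - e^{-2t})."""
    return math.exp(-2 * t) * x * x + (1 - math.exp(-2 * t))


def gamma_of_pt_f(x: float, t: float) -> float:
    """Gamma(P_t f)(x) = (d/dx P_t f(x))^2 = (2 e^{-2t} x)^2."""
    return (2 * math.exp(-2 * t) * x) ** 2


def pt_of_gamma_f(x: float, t: float) -> float:
    """P_t(Gamma(f))(x), where Gamma(f)(y) = (2y)^2 = 4 y^2."""
    return 4 * pt_square(x, t)


for x in (-2.0, 0.0, 0.5, 3.0):
    for t in (0.0, 0.1, 1.0, 5.0):
        lhs = gamma_of_pt_f(x, t)
        rhs = math.exp(-2 * RHO * t) * pt_of_gamma_f(x, t)
        print(f"x={x:+.1f} t={t:.1f}  Gamma(P_t f)={lhs:.6f}  bound={rhs:.6f}")
```

For this f the gap between the two sides equals $4 e^{-2t}(1 - e^{-2t}) \ge 0$, so the inequality is strict for $t > 0$ and becomes an equality at $t = 0$.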
The formal proof is very elementary, and relies on the basic interpolation argu-
ment along the semigroup extensively developed throughout this work. Consider,
for f ∈ A0 and fixed t > 0, the function defined by
\[ \Lambda(s) = P_s\big( \Gamma(P_{t-s} f) \big), \quad s \in [0, t] \]
The first term comes from the derivative of Ps and the heat equation, the second
one from the derivative of $P_{t-s} f$ applied to the quadratic form $\Gamma$. Rewriting this
identity with $g = P_{t-s} f$,
\[ \Lambda'(s) = P_s\big( L\, \Gamma(g) - 2\, \Gamma(g, Lg) \big) = 2 P_s\big( \Gamma_2(g) \big). \]
It is worthwhile to observe that this identity is the analogue at the level of the $\Gamma_2$
operator of the Duhamel formula (3.1.20) for $\Gamma$. Now the $CD(\rho, \infty)$ condition
simply reads $\Lambda' \ge 2\rho \Lambda$, from which the conclusion immediately follows.
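The last step of this formal argument is a Gronwall-type integration, which may be spelled out as follows. This is a sketch at the same formal level as the argument above, writing $\Lambda(s) = P_s(\Gamma(P_{t-s} f))$ so that $\Lambda(0) = \Gamma(P_t f)$ and $\Lambda(t) = P_t(\Gamma(f))$.

```latex
\begin{align*}
\frac{d}{ds}\Big( e^{-2\rho s}\, \Lambda(s) \Big)
  &= e^{-2\rho s}\, \big( \Lambda'(s) - 2\rho\, \Lambda(s) \big) \ge 0,
  \qquad s \in [0, t], \\
\Lambda(0) &\le e^{-2\rho t}\, \Lambda(t), \\
\Gamma(P_t f) &\le e^{-2\rho t}\, P_t\big( \Gamma(f) \big).
\end{align*}
```

The second line follows from the first by comparing the values of the non-decreasing function $s \mapsto e^{-2\rho s} \Lambda(s)$ at $s = 0$ and $s = t$.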
Now, we have to wonder why and when such a proof could be justified, that is,
for which classes of functions f the preceding argument may be developed. The
operator Ps acts a priori on bounded functions, or on functions which are in some
Lp (μ)-space, but there is no guarantee that differentiation and commutation of L
and $P_s$ are valid for those functions. More precisely, the first identities to be understood
are $L P_s(\Gamma(g)) = P_s(L\, \Gamma(g))$ as well as differentiation of $\Gamma(P_{t-s} f)$ under
$P_s$ (which consists, as already mentioned earlier, of an integration by parts together
with the derivation of an integral, and hence requires some justification). Of course,
working with an operator L with smooth coefficients on a compact smooth manifold,
the preceding is fully justified for sufficiently smooth functions. Indeed, in this
case, if f is smooth, so is $P_s f$, and so are $g = P_{t-s} f$ and $\Gamma(g)$. They are moreover
bounded, and in the domain of L, so that the conclusion follows. However,
we would like to assert that this result is true for a generic operator L with smooth
coefficients on a (a priori non-compact) manifold. If f is smooth and compactly
supported, $g = P_{t-s} f$ and $\Gamma(g)$ are still smooth, but no longer compactly supported
(it is not too hard to see that, under the $CD(\rho, \infty)$ hypothesis, the kernel measures
$p_t(x, dy)$ have strictly positive densities everywhere for any x). Therefore care has
to be taken here when considering $P_s(\Gamma(g))$.
Before moving on to a precise analysis of the arguments which justify the ingre-
dients used in the proof of Theorem 3.2.3, let us mention a reinforced form of it
which we shall also use extensively.
Theorem 3.2.4 (Strong gradient bound) In the setting of Theorem 3.2.3, if L satis-
fies the curvature condition CD(ρ, ∞) for some ρ ∈ R, then, for every f ∈ A0 and
every t ≥ 0,
\[ \sqrt{\Gamma(P_t f)} \le e^{-\rho t}\, P_t\big( \sqrt{\Gamma(f)} \big). \]
Using the change of variables formula (diffusion property) for L, this may be rewritten as
\[ P_s\bigg( \Gamma(P_{t-s} f)^{-1/2} \bigg[ \Gamma_2(P_{t-s} f) - \frac{\Gamma\big(\Gamma(P_{t-s} f)\big)}{4\, \Gamma(P_{t-s} f)} \bigg] \bigg). \]
From (C.6.4), p. 515, the $CD(\rho, \infty)$ curvature hypothesis actually yields the reinforced
inequality
\[ 4\, \Gamma(g) \big[ \Gamma_2(g) - \rho\, \Gamma(g) \big] \ge \Gamma\big( \Gamma(g) \big) \tag{3.2.4} \]
(for any g in $\mathcal{A}$). Applying this inequality to $g = P_{t-s} f$ then yields the differential
inequality $\Lambda' \ge \rho \Lambda$, from which the conclusion follows again.
The following now tries to fully justify the proof of the gradient bounds (The-
orems 3.2.3 and 3.2.4) in the smooth context of this section. According to the dis-
cussion in Sect. 1.11.3, p. 46, the elliptic operator L from (3.2.1) takes the form
$L = \Delta_g + Z$ where $\Delta_g$ is the Laplace-Beltrami operator on M associated with the
(co-) metric $g = (g^{ij})$ and Z is a smooth gradient field $Zf = -\nabla W \cdot \nabla f$. The
invariant measure of L is given by $d\mu = e^{-W} d\mu_g$ where $d\mu_g$ is the Riemannian
measure on $(M, g)$. Recall the class $\mathcal{A}_0$ of smooth ($C^\infty$) compactly supported functions
on M, and the class $\mathcal{A}$ of smooth ($C^\infty$) functions. The carré du champ operator
$\Gamma$ is simply given by $\Gamma(f) = |\nabla f|^2$ (on $\mathcal{A}_0$ or $\mathcal{A}$). Recall that, according to Sect. C.6,
p. 513, the curvature condition CD(ρ, ∞) expresses, when Z = 0, a lower bound
on the Ricci curvature of the manifold.
which is valid for all functions f ∈ A0 . The first step is to extend this property to
smooth functions of the L2 (μ)-domain D(L) of L. In fact, the following consider-
increases the complexity of the integration by parts formulas. Some of the difficul-
ties are resolved by observing that
which follows from the explicit description of the $\Gamma_2$ operator in this Riemannian
context (cf. (C.5.3), p. 512).
To start with, note that for f ∈ A0 , from (3.2.5),
\[ \|\nabla\nabla f\|_2^2 = \int_M |\nabla\nabla f|^2\, d\mu \le -\rho \int_M \Gamma(f)\, d\mu + \int_M (Lf)^2\, d\mu \]
It therefore converges to some symmetric tensor K. The limit does not depend on
the approximating sequence, and is thus denoted $K_f$. If $f \in \mathcal{A}_0$, then $K_f = \nabla\nabla f$.
This remains true for any smooth function. To see why this is, it is enough to observe
that given $h \in \mathcal{A}_0$,
\[ \nabla\nabla(hf) = h\, K_f + \nabla h \otimes \nabla f + \nabla f \otimes \nabla h + f\, \nabla\nabla h, \]
from which the identification $K_f = \nabla\nabla f$ follows. Then the bound (3.2.6) extends
to functions f in the L2 (μ)-domain of L by density.
Having made these preliminary observations, we can now begin the proof of
Theorem 3.2.4. Choose a function f ∈ A0 and fix ε > 0. For t > 0, consider the
function in s ∈ [0, t] given by
\[ \Lambda(s) = \big( e^{-2\rho s}\, \Gamma(P_{t-s} f) + \varepsilon \big)^{1/2}. \]
\[ L\Lambda + \partial_s \Lambda \ge 0. \tag{3.2.7} \]
for every $s \in [0, t]$. From (3.2.6), $|\nabla\nabla P_{t-s} f|$ is in $L^2(\mu)$ with $L^2(\mu)$-norm bounded
from above by $C_1 \|f\|_{\mathcal{D}(L)}$ for some $C_1 > 0$ depending only on $\rho$.
Choose now two auxiliary positive functions ξ, ζ : M → R, smooth and com-
pactly supported. Let, for s ≥ 0,
\[ G(s) = \int_M \xi\, P_s\big( \zeta \Lambda(s) \big)\, d\mu = \int_M \zeta\, \Lambda(s)\, P_s \xi \, d\mu. \]
This being true for any positive $\xi \in \mathcal{A}_0$, it finally yields $P_t(\Lambda(t)) \ge \Lambda(0)$. It remains
to let $\varepsilon$ go to 0 to get the desired inequality. The strong gradient bound of
Theorem 3.2.4 is established.
There is therefore a huge difference between the formal proof of Theorem 3.2.4
given earlier and the actual, more honest proof, that justifies in practice the formal
computations. For example, the same result would also be true on a compact mani-
fold with convex boundary (with the Neumann conditions at the boundary), but yet
another proof would be required.
In Sect. 3.3, in the abstract framework of a Markov Triple, the preceding gradi-
ent bounds will be extended from the complete case to the essentially self adjoint
case. However, this extension requires a much more refined analysis of the relations
between the classes A0 and A.
Theorem 3.2.4 may be strengthened by letting $\rho$ vary, i.e. by assuming the existence
of a (smooth) function $\rho : M \to \mathbb{R}$ such that $\Gamma_2(f) \ge \rho\, \Gamma(f)$ at any point x
and for every $f \in \mathcal{A}$. In a Riemannian setting with $L = \Delta_g - \nabla W \cdot \nabla$, the best function
$\rho = \rho(x)$ for which the preceding is satisfied is the minimal eigenvalue at the
point x of the symmetric tensor $\mathrm{Ric} + \nabla\nabla W$. The next proposition is the announced
extension, together with its probabilistic interpretation.
Proposition 3.2.5 In the setting of Theorem 3.2.3 under the curvature condition
CD(ρ, ∞) for some ρ = ρ(x) bounded from below by c ∈ R, for any f ∈ A0 and
t ≥ 0,
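\[ \sqrt{\Gamma(P_t f)} = |\nabla P_t f| \le \widehat{P}_t\big( |\nabla f| \big) \]
where $(\widehat{P}_t)_{t \ge 0}$ is the semigroup with generator $L - \rho\, \mathrm{Id}$. In particular, if
$\{X_t^x \,;\, t \ge 0, x \in M\}$ denotes the Markov process with generator L, for any $t \ge 0$
and $x \in M$,
\[ |\nabla P_t f|(x) \le \mathbb{E}_x\Big( |\nabla f|(X_t)\, e^{-\int_0^t \rho(X_s)\, ds} \Big). \tag{3.2.9} \]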
The proof of this proposition closely follows the previous proof of Theorem 3.2.4 working now with
\[ \Lambda(s) = P_s\bigl(\Gamma(P_{t-s} f)^{1/2}\bigr), \quad s \in [0, t]. \]
Since ρ(x) ≥ c for every x ∈ M, we already have an a priori bound √Γ(Pt f) ≤ e^{−ct} Pt(√Γ(f)) which justifies the various derivatives of Λ. Then, it remains to apply the Feynman-Kac formula of Sect. 1.15.6, p. 63.
On the basis of Theorems 3.2.3 and 3.2.4 of the last sub-section, we now consider
the mass conservation property Pt (1) = 1 under a CD(ρ, ∞) curvature condition
for some ρ ∈ R. Formally, conservation of mass reflects the fact that L(1) = 0.
However, when the measure μ is infinite, the constant function 1 cannot be in the
domain D(L).
150 3 Symmetric Markov Diffusion Operators
Proof To make sense of the statement, recall that Pt (1) is understood here as the
(increasing) limit of Pt ζk as k → ∞ where (ζk )k≥1 is any sequence of positive
functions in A0 increasing to 1.
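Choose two positive functions ξ and ζ in A0, and consider, for t ≥ 0, the quantity ∫_M (Pt ζ − ζ) ξ dμ. By integration by parts,
\[
\begin{aligned}
\left| \int_M (P_t \zeta - \zeta)\, \xi \, d\mu \right|
  &= \left| \int_0^t \!\! \int_M \xi \, L P_s \zeta \, d\mu \, ds \right| \\
  &\le \int_0^t \!\! \int_M \bigl|\Gamma(\xi, P_s \zeta)\bigr| \, d\mu \, ds \\
  &\le \bigl\| \sqrt{\Gamma(\xi)} \bigr\|_1 \, \bigl\| \sqrt{\Gamma(\zeta)} \bigr\|_\infty \int_0^t e^{-\rho s} \, ds
\end{aligned}
\tag{3.2.10}
\]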
where the last step uses the curvature condition via Theorem 3.2.4. Now replace ζ by the terms ζk, k ≥ 1, of the sequence which is used to characterize completeness of the manifold M. Since √Γ(ζk) ≤ 1/k, in the limit as k → ∞,
\[ \int_M \bigl(P_t(1) - 1\bigr)\, \xi \, d\mu = 0, \]
Replacing as before ζ by the smoothing functions ζk, as k → ∞, and whenever μ(E) = ∞, it follows that ∫_M ξ dμ = 0 for any ξ ∈ A0, yielding a contradiction. Theorem 3.2.7 is proved.
This section attempts to describe a general setting for the investigation of symmetric
Markov diffusion operators and semigroups, which will be convenient for the anal-
ysis of various questions related to functional inequalities, convergence to equilib-
rium, heat kernel bounds and so on. Although we do not want to restrict ourselves to
the case of smooth manifolds, in view of several important applications (in particular
in infinite dimension), we are now in a position to extract from the previous analysis
in manifolds the main features of the spaces of smooth compactly supported (A0 )
and smooth (A) functions used there. The abstract formalism put forward here is
a self-consistent framework in which the intuition behind the Γ-calculus and semigroup monotonicity may easily be developed, without having to take too much care
over the various families of functions involved in the analysis. In particular, this set-
ting is not intended for an investigation in itself. Applications will develop the ideas
emphasized through this formalism rather than the technical properties which have
to be investigated somewhat case by case.
Recall also that we do not want to impose a topology on the basic measure space
(E, F, μ) on which the Markov operators act, or on the space in which the Markov
processes live. Indeed, every notion should be invariant under measurable bijections
of the space (although such generality may in practice be quite useless). In partic-
ular, equalities and inequalities between functions have to be understood μ-almost
everywhere. The various classes of (real-valued) functions involved in the investi-
gation actually aim at replacing a pointwise analysis by a measurable one. In partic-
ular, the classes should be rich enough to develop a relevant calculus, leading below
to the necessary hypotheses in this regard (not always transparent). In applications
of interest, such as in a smooth manifold context, the various relevant inequalities
usually do hold pointwise.
The following thus describes a suitable framework enabling us to smoothly develop the Γ-calculus at the root of this investigation. The next Sect. 3.4 contains a summary of the various definitions and hypotheses. The starting point for the analysis is the underlying algebra A0 of a symmetric Diffusion Markov Triple (E, μ, Γ) (Definition 3.1.8). As already pointed out, this class A0 is in practice the class of smooth (C∞) functions with compact support. On A0 is given the carré du champ operator Γ. From the carré du champ operator is constructed the generator L together with the Dirichlet form E defined on its domain D(E). Many properties of functions defined on A0 automatically extend to the Dirichlet space D(E). The semigroup P = (Pt)t≥0 with infinitesimal generator L in L2(μ) is symmetric with respect to μ and sub-Markov (positivity preserving and such that Pt(1) ≤ 1 for every t ≥ 0).
A Diffusion Markov Triple together with the additional ergodicity and mass con-
servation properties yields the notion of a Standard Markov Triple defined in Defi-
nition 3.1.15.
But in the initial setting of a Diffusion Markov Triple (E, μ, Γ), many properties (such as essential self-adjointness (ESA), ergodicity, mass conservation or gradient bounds as developed in the smooth case) may or may not hold. Criteria for these properties to hold actually require the introduction of a new larger class A of functions, viewed as an extension of A0, on which the operator L is defined as an extension together with the associated Γ operator, regardless of integrability properties. In the smooth manifold case, this class A is the class of smooth (C∞) functions. The following develops the abstract construction of such a class A. The ESA property will appear as the cornerstone of the construction. Indeed, while positivity of the carré du champ operator Γ is the main ingredient of the extension results, at the second order of the Γ₂ operator, curvature conditions together with essential self-adjointness allow for a parallel treatment.
holds true.
(viii) For every f ∈ A0 and every t ≥ 0, Pt f ∈ A.
Remark 3.3.2 It might be worthwhile to mention again that the ideal and stability properties of the extended algebra A provide an alternative understanding of the density of A0. For example, the non-negativity of Γ on A0 may be extended to A in the following way. Letting f ∈ A and g ∈ A0, g ≥ 0, by the chain rule, for every integer k ≥ 1,
\[ 0 \le \Gamma(f^k g) = k^2 f^{2k-2} g^2\, \Gamma(f) + 2k f^{2k-1} g\, \Gamma(f, g) + f^{2k}\, \Gamma(g). \]
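The expansion used in Remark 3.3.2 can be checked directly. The following is only a sketch: it assumes the bilinearity of Γ, the Leibniz rule Γ(uv, w) = uΓ(v, w) + vΓ(u, w) and the chain rule Γ(f^k, w) = k f^{k−1} Γ(f, w), both consequences of the diffusion property. The auxiliary letters u, v, w are introduced here for illustration only.

```latex
\begin{align*}
\Gamma(f^k g)
  &= g^2\,\Gamma(f^k) + 2 f^k g\,\Gamma(f^k, g) + f^{2k}\,\Gamma(g) \\
  &= g^2 \bigl(k f^{k-1}\bigr)^2\,\Gamma(f)
     + 2 f^k g \cdot k f^{k-1}\,\Gamma(f, g) + f^{2k}\,\Gamma(g) \\
  &= k^2 f^{2k-2} g^2\,\Gamma(f) + 2k f^{2k-1} g\,\Gamma(f, g) + f^{2k}\,\Gamma(g).
\end{align*}
```

Since f^k g belongs to A0 by the ideal property, the left-hand side is non-negative, which is the starting point of the extension argument.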
the stability of A0 under (Pt )t≥0 . This is of course the most desirable situation (and
is valid, for example, on compact manifolds) and many properties (such as the ESA
property) are then automatic. It requires the measure μ to be finite. Relying on this
hypothesis actually allows us to favor intuition without taking too much care over
technical details.
There are also cases where A0 may be assumed to be stable under (Pt)t≥0 without containing the constant functions. For instance, in Rn and for the standard Laplace operator Δ, or the Ornstein-Uhlenbeck operator, the Schwartz space of rapidly decreasing functions together with all their derivatives is such an example. With some extra work (depending on our knowledge of the model), this may
also be achieved in a fairly large setting. But it is unlikely that this will hold on a
non-compact manifold only under the curvature condition, which is at the heart of
the analysis developed in this monograph. For the Ornstein-Uhlenbeck, Laguerre
and Jacobi operators of Chap. 2, it is in general better to work with the algebra A0
of polynomials in order to take advantage of the particular structure of the associ-
ated semigroups. The functions in A0 are then no longer stable under composition
with smooth functions. The reader may easily adapt the various hypotheses to those
particular cases.
Remark 3.3.3 When the measure μ is finite and the semigroup (Pt)t≥0 is conservative, constant functions belong to D(L). Then A0 may be replaced by the larger class A0^const = {f + c ; f ∈ A0, c ∈ R}, extending with Γ(f, 1) = 0 and L(1) = 0. In the same spirit, the change of variables formulas (3.1.2) and (3.1.8) easily extend to functions f in A0^const without the restriction that Ψ(0) = 0. However, care has to be taken not to apply the ideal property to A0^const (since if it applies we would be back to A = A0). Indeed, even the integration by parts property ∫_E g Lf dμ = −∫_E Γ(f, g) dμ does not in general make sense for f ∈ A when g is constant. It is also worth observing that the properties Γ(f, 1) = 0 and L(1) = 0 are direct consequences of the change of variables formula applied with Ψ(r) = 1, r ∈ R. In particular, integration by parts (3.3.1) entails the invariance of μ as ∫_E Lf dμ = 0 for any f ∈ A0. When the underlying invariant measure is finite, it will also be convenient to deal in the later chapters with the family A0^const+ = {f + c ; f ∈ A0, f ≥ 0, c > 0}.
Remark 3.3.4 The ideal property (i) of Definition 3.3.1 is convenient for the analy-
sis on non-compact Riemannian manifolds, but would not be suitable in some other
settings. For example, in many models of statistical mechanics, where an infinite
number of variables is considered, A0 would be the space of (smooth compactly
supported) functions depending on a finite number of coordinates. Then, A would
be the space of smooth functions depending on an infinite number of variables
and the ideal property does not hold. Fortunately, in many infinite-dimensional in-
stances, this difficulty is easily overcome. For example, for the infinite-dimensional
Ornstein-Uhlenbeck semigroup described in Sect. 2.7.2, p. 108, the algebra A would
be the space of smooth functions depending on a finite number of coordinates, and
the stability under (Pt )t≥0 is valid due to the fact that the different components
3.3 Heart of Darkness 155
3.3.2 Domains
At this point, there are two definitions of the operators L and Γ, one on D(L) (or D(E)) and one on the extended algebra A. The first task is to verify that they actually coincide. Recall the adjoint L∗ of L (with respect to μ and A0) as presented in Definition 3.1.9.
Proposition 3.3.5
(i) If f ∈ A ∩ D(L∗ ), then L∗ f = Lf .
(ii) If f ∈ A ∩ D(E), both definitions of Γ, in D(E) and in A, coincide.
(iii) If f ∈ A ∩ D(E) and Lf ∈ L2 (μ), then f ∈ D(L).
But from the ideal property kf ∈ A0, so that Γ(kf, h) = Γ_A(kf, h), and similarly f Γ(h, k) = f Γ_A(h, k), hence equality holds. The same argument then allows us to pass from Γ(f, h) with f ∈ A ∩ D(E) and h ∈ A0 to Γ(f, g) with f and g in A ∩ D(E).
For the last point (iii), for every h ∈ A0,
\[ \bigl|\mathcal{E}(h, f)\bigr| = \left| \int_E h \, Lf \, d\mu \right| \le \|Lf\|_2 \, \|h\|_2, \]
Proof (i) is immediate since under the hypotheses f ∈ D(L∗ ) (by the very definition
of D(L∗ )) and by the ESA property D(L) = D(L∗ ).
The linear map f thus extends to a continuous linear form on the Hilbert space
D(E),
and may be represented as E(g, h) for some g ∈ D(E). Then for any h ∈ A0, ∫_E (f − g) Lh dμ = 0, and therefore f − g ∈ D(L∗) (with L∗(f − g) = 0). Since D(L∗) = D(L) by the ESA assumption, f may be written as the sum of an element of D(L) and an element of D(E), hence is in D(E). The proof is complete.
Having presented the extended algebra A and some of its properties, we next de-
scribe the connexity, weak hypo-ellipticity and completeness properties based on
the picture provided by the smooth manifold case of the previous section. Thus in
the following, the triple (E, μ, Γ) denotes a Diffusion Markov Triple and A the
extended algebra of Definition 3.3.1.
It should be mentioned here that with respect to the ergodicity property of Defi-
nition 3.1.11, connexity is a local property for functions in A whereas for ergodicity
the property holds for functions in D(E).
The second definition describes weak hypo-ellipticity in this context.
Very often (for example in the case of manifolds with hypo-elliptic operators,
as in Sect. 1.12, p. 49), it is also true that for any bounded measurable function
f and any t > 0, Pt f ∈ A. However, this property will not be used below, and
passing from weak hypo-ellipticity to this seemingly stronger form would require
extra assumptions on the algebra A.
The third definition is that of completeness in this context, following the char-
acterization of Proposition C.4.1, p. 511, in Appendix C, as extensively used in the
previous section on Riemannian manifolds.
To establish the gradient bounds described in Sect. 3.2 in the smooth Riemannian
manifold setting, we investigate curvature conditions in this abstract context. To
this end, we first define the Γ₂ operator, already introduced in (1.16.1), p. 71, on the
algebra A, and define in the same way Hessians.
for (f, g) ∈ A × A. As for the carré du champ operator Γ, we often write more simply Γ₂(f) = Γ₂(f, f).
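\[ H(f)(g, h) = \frac{1}{2} \Bigl[ \Gamma\bigl(g, \Gamma(f, h)\bigr) + \Gamma\bigl(h, \Gamma(f, g)\bigr) - \Gamma\bigl(f, \Gamma(g, h)\bigr) \Bigr]. \tag{3.3.3} \]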
Such Hessians appear in the chain rule formula for Γ₂ ((C.6.6), p. 516) in the form
\[ \Gamma_2(hf, g) = h\, \Gamma_2(f, g) + f\, \Gamma_2(h, g) + 2 H(g)(h, f). \tag{3.3.4} \]
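Formula (3.3.4) may be recovered by a direct computation. The sketch below assumes the usual definition Γ₂(u, g) = ½[LΓ(u, g) − Γ(u, Lg) − Γ(Lu, g)] from (1.16.1), the Leibniz rule for Γ, the product rule L(hf) = hLf + fLh + 2Γ(h, f) of a diffusion operator, and the Hessian H as in (3.3.3). The letter u is an auxiliary name used here only for illustration.

```latex
\begin{align*}
L\,\Gamma(hf, g)
  &= L\bigl(h\,\Gamma(f, g) + f\,\Gamma(h, g)\bigr) \\
  &= h\,L\Gamma(f, g) + \Gamma(f, g)\,Lh + 2\,\Gamma\bigl(h, \Gamma(f, g)\bigr) \\
  &\quad + f\,L\Gamma(h, g) + \Gamma(h, g)\,Lf + 2\,\Gamma\bigl(f, \Gamma(h, g)\bigr), \\
\Gamma(hf, Lg)
  &= h\,\Gamma(f, Lg) + f\,\Gamma(h, Lg), \\
\Gamma\bigl(L(hf), g\bigr)
  &= h\,\Gamma(Lf, g) + Lf\,\Gamma(h, g) + f\,\Gamma(Lh, g) + Lh\,\Gamma(f, g)
     + 2\,\Gamma\bigl(\Gamma(h, f), g\bigr).
\end{align*}
% Subtracting the last two identities from the first one,
% the terms carrying Lh and Lf cancel:
\begin{align*}
2\,\Gamma_2(hf, g)
  &= 2h\,\Gamma_2(f, g) + 2f\,\Gamma_2(h, g) \\
  &\quad + 2\Bigl[\Gamma\bigl(h, \Gamma(g, f)\bigr) + \Gamma\bigl(f, \Gamma(g, h)\bigr)
     - \Gamma\bigl(g, \Gamma(h, f)\bigr)\Bigr] \\
  &= 2h\,\Gamma_2(f, g) + 2f\,\Gamma_2(h, g) + 4\,H(g)(h, f).
\end{align*}
```

Dividing by 2 gives the stated chain rule.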
Many formulas involving the operator Γ twice may also be seen as Hessians, such as for example
\[ \Gamma\bigl(h, \Gamma(f)\bigr) = 2 H(f)(f, h), \qquad \Gamma\bigl(f, \Gamma(f, h)\bigr) = H(h)(f, f) + H(f)(f, h). \]
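Furthermore, the integration by parts formula, valid as soon as one of the functions f, g, h belongs to A0, reads
\[
\begin{aligned}
\int_E H(f)(g, h)\, d\mu
  &= \frac{1}{2} \int_E f \bigl[ L\Gamma(g, h) + \Gamma(h, Lg) + \Gamma(g, Lh) + 2\, Lg\, Lh \bigr] d\mu \\
  &= \frac{1}{2} \int_E g \bigl[ \Gamma(f, Lh) - \Gamma(h, Lf) - L\Gamma(f, h) \bigr] d\mu.
\end{aligned}
\tag{3.3.5}
\]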
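The first equality in (3.3.5) follows from repeated integration by parts. As a sketch, assume for simplicity that f ∈ A0, and use the rule ∫_E Γ(u, w) dμ = −∫_E u Lw dμ together with its consequence ∫_E u Γ(f, h) dμ = −∫_E f [u Lh + Γ(u, h)] dμ, obtained by applying the rule to the product uf. The letters u, w are auxiliary names introduced here.

```latex
\begin{align*}
2 \int_E H(f)(g, h)\, d\mu
  &= \int_E \Bigl[\Gamma\bigl(g, \Gamma(f, h)\bigr) + \Gamma\bigl(h, \Gamma(f, g)\bigr)
     - \Gamma\bigl(f, \Gamma(g, h)\bigr)\Bigr]\, d\mu \\
  &= - \int_E Lg\,\Gamma(f, h)\, d\mu - \int_E Lh\,\Gamma(f, g)\, d\mu
     + \int_E f\, L\Gamma(g, h)\, d\mu \\
  &= \int_E f \bigl[ Lg\,Lh + \Gamma(h, Lg) \bigr] d\mu
     + \int_E f \bigl[ Lg\,Lh + \Gamma(g, Lh) \bigr] d\mu
     + \int_E f\, L\Gamma(g, h)\, d\mu \\
  &= \int_E f \bigl[ L\Gamma(g, h) + \Gamma(h, Lg) + \Gamma(g, Lh) + 2\,Lg\,Lh \bigr] d\mu.
\end{align*}
```

The second equality is obtained in the same way by moving all derivatives off g instead of f.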
These chain rules extend to change of variables formulas for smooth functions as a consequence of the formula for the operator Γ (see Remark 3.1.4).
The curvature-dimension condition is defined using the Γ₂-operator, as in Sect. 1.16, p. 70.
Remark 3.3.15 In the same way as the positivity of Γ extends from A0 to A via the ideal property, the change of variables formula and condition (ii) of Definition 3.3.1 (see Remark 3.3.2) may be used to extend the CD(ρ, n) inequality. The argument is slightly more involved due to the second order terms in the change of variables formula for Γ₂. Let us illustrate the principle on the simpler curvature inequality CD(ρ, ∞). For a given function f ∈ A, it is enough to show that for any positive function h ∈ A0, hK(f) ≥ 0 where K(f) = K(f, f) = Γ₂(f) − ρ Γ(f). From the positivity of K on A0 and the Cauchy-Schwarz inequality (with respect to the positive quadratic form K), K(hf, hf) − 2f K(hf, h) + f² K(h, h) ≥ 0. Let then ψ : R → R be smooth such that ψ(0) = 0 and apply the latter to ψ(h). By the change of variables formula, after some simplification,
\[ \psi^2(h)\, K(f) + 4\, \psi(h) \psi'(h)\, H(f)(h, f) + 2\, \psi'^2(h) \bigl[\Gamma(f)\, \Gamma(h) + \Gamma(h, f)^2\bigr] \ge 0. \]
Now fix ε > 0 and choose ψ (smooth) such that ψ(0) = 0, and ψ(r) = 1 and ψ′(r) = 0 when r ≥ ε. It follows that K(f) ≥ 0 on {h ≥ ε}, and it remains to let ε → 0 to obtain the result.
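The simplification invoked in Remark 3.3.15 can be made explicit. The sketch below assumes K(u, v) = Γ₂(u, v) − ρ Γ(u, v), the product rule K(hf, g) = hK(f, g) + fK(h, g) + 2H(g)(h, f) coming from (3.3.4) and the Leibniz rule for Γ, and the Hessian product rule H(hf)(a, b) = hH(f)(a, b) + fH(h)(a, b) + Γ(h, a)Γ(f, b) + Γ(h, b)Γ(f, a), which follows from (3.3.3). The letters u, v, a, b are auxiliary names.

```latex
\begin{align*}
K(hf, hf)
  &= h^2 K(f) + 2hf\,K(f, h) + f^2 K(h)
     + 4h\,H(f)(h, f) + 4f\,H(h)(h, f) \\
  &\quad + 2\bigl[\Gamma(f)\,\Gamma(h) + \Gamma(h, f)^2\bigr], \\
2f\,K(hf, h)
  &= 2hf\,K(f, h) + 2f^2 K(h) + 4f\,H(h)(h, f), \\
K(hf, hf) - 2f\,K(hf, h) + f^2 K(h)
  &= h^2 K(f) + 4h\,H(f)(h, f)
     + 2\bigl[\Gamma(f)\,\Gamma(h) + \Gamma(h, f)^2\bigr].
\end{align*}
% Replacing h by \psi(h), with \Gamma(\psi(h), \cdot) = \psi'(h)\,\Gamma(h, \cdot)
% and H(f)(\psi(h), f) = \psi'(h)\,H(f)(h, f):
\begin{align*}
\psi(h)^2 K(f) + 4\,\psi(h)\psi'(h)\,H(f)(h, f)
  + 2\,\psi'(h)^2 \bigl[\Gamma(f)\,\Gamma(h) + \Gamma(h, f)^2\bigr] \ge 0.
\end{align*}
```

On the set {h ≥ ε} the choice ψ = 1, ψ′ = 0 removes every term except K(f).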
3.3.5 Extensions
In order to develop the proof of the strong gradient bound (Theorem 3.2.4) in this abstract setting, it is necessary to extend the Γ₂ and the Hessian operators from A0 to D(L), in the same way as Γ extends from A0 to D(E). In a sense, the approach here parallels at the second order what has been developed in Sect. 3.1 at the first order. At this second order level, positivity of Γ has to be replaced by the CD(ρ, ∞) curvature condition together with essential self-adjointness (ESA). This goal is achieved in the following proposition.
Proposition 3.3.16 Assume the ESA assumption and the CD(ρ, ∞) condition for
some ρ ∈ R.
(i) The Γ₂ operator extends to a continuous bilinear operator on D(L) satisfying
\[ \int_E \Gamma_2(f)\, d\mu = \int_E (Lf)^2\, d\mu \tag{3.3.8} \]
(vii) For every h ∈ A0 and f ∈ D(L), hf ∈ D(L) and, for any g ∈ D(L),
Proof In Sect. 3.1, the bilinear operator Γ was extended from A0 to D(E) using that, on A0, E(f) = ∫_E Γ(f) dμ, Γ(f) ≥ 0 and A0 is dense in D(E) with respect to the D(E)-topology. Under the CD(ρ, ∞) and ESA conditions, the same procedure may be performed on the bilinear positive map Γ₂(f) − ρ Γ(f) using the density of A0 in D(L). The basic ingredient here is the formula ∫_E Γ₂(f) dμ = ∫_E (Lf)² dμ which holds for any f ∈ A0. The extension of the CD(ρ, ∞) condition to D(L) is then straightforward. From (3.3.7), the extension of H(f)(g, h) follows the same lines, and the change of variables formula (3.3.4) passes to the limit.
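The basic identity for Γ₂ on A0 is a two-line computation. The sketch assumes the definition Γ₂(f) = ½ LΓ(f) − Γ(f, Lf), the invariance ∫_E Lu dμ = 0 for u ∈ A0 and the integration by parts formula ∫_E Γ(f, g) dμ = −∫_E f Lg dμ; the letter u is an auxiliary name.

```latex
\begin{align*}
\int_E \Gamma_2(f)\, d\mu
  &= \frac{1}{2} \int_E L\Gamma(f)\, d\mu - \int_E \Gamma(f, Lf)\, d\mu \\
  &= 0 + \int_E (Lf)^2\, d\mu .
\end{align*}
```

In particular ∫_E [Γ₂(f) − ρ Γ(f)] dμ = ∫_E [(Lf)² + ρ f Lf] dμ, a quantity controlled by the graph norm of L, which is why density of A0 in D(L) is the relevant notion here.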
The next step, as was the case for the operator Γ (Propositions 3.3.5 and 3.3.6), is to identify the two definitions of Γ₂ and the Hessian (on D(L) and on A), and
moreover to extend the integration by parts formula (3.3.8) to A. The following
proposition fulfills this task.
Proposition 3.3.17 Assume the ESA assumption and the CD(ρ, ∞) condition for some ρ ∈ R. If f ∈ A ∩ D(L), the two definitions of Γ₂(f) (on D(L) and on A) coincide. The same is true for the Hessian H(f)(g, h) whenever f ∈ A ∩ D(L), g ∈ A ∩ D(E) and h ∈ A0.
Moreover, if f ∈ A ∩ D(L) then Γ₂(f) ∈ L1(μ) and Γ(f) ∈ L1(μ), in which case
\[ \int_E \Gamma_2(f)\, d\mu = \int_E (Lf)^2\, d\mu. \]
Conversely, and under the additional ergodicity property, for f ∈ L2(μ) ∩ A, if both Γ₂(f) and Γ(f) belong to L1(μ), then f ∈ D(L).
By the integration by parts formula (3.3.5), this expression is reduced to an integral of the form ∫_E f E(g, h, k) dμ where E(g, h, k) (even if tedious to write down explicitly) belongs to A0. Now similarly ∫_E k H_A(f)(g, h) dμ = ∫_E f E(g, h, k) dμ with the same expression E(g, h, k). Then, changing f into a sequence (f_ℓ)_{ℓ∈N} of functions in A0 converging to f ∈ D(L), the identity of the two integrals follows in the limit. Fixing now f ∈ D(L) and h ∈ A0, and replacing g ∈ A0 by g ∈ D(E), the same procedure (using the other chain rule for H and the other integration by parts formula (3.3.5)) yields the announced coincidence of H and H_A. The identity of the Γ₂ operators follows along the same lines.
We already know from Sect. 3.1 that if f ∈ D(L), then Lf ∈ L2(μ) (by definition) and Γ(f) ∈ L1(μ). Moreover, in D(L), ∫_E Γ₂(f) dμ = ∫_E (Lf)² dμ and
Since K is a positive bilinear form, it holds that K(f, h)² ≤ K(f, f) K(h, h), which implies for any h ∈ D(L), |Λ_f(h)| ≤ C(f) ‖h‖_{D(L)}. Then, there exists some g ∈ D(L) such that Λ_f(h) = ∫_E [Lg Lh + gh] dμ. But it is easily seen from the spectral decomposition that in fact g = Lk, where k = (L² + Id)^{−1}(L + ρ Id)f. Again from the spectral decomposition, the operators
\[ (L^2 + \mathrm{Id})^{-1} (L + \rho\, \mathrm{Id}) \quad \text{and} \quad (L^2 + \mathrm{Id})^{-1} (L^2 + \rho L) \]
Thanks to the ergodicity and ESA properties, the linear form Λ_f can be extended to L2(μ), which shows that Lf + ρf ∈ L2(μ), and therefore Lf ∈ L2(μ). The proof is complete.
Note from the preceding proof that whenever ρ ≥ 0, for a function f ∈ A ∩ L2(μ) such that Γ₂(f) ∈ L1(μ), the condition Γ(f) ∈ L1(μ) is not needed to ensure that Lf ∈ L2(μ).
On the basis of the previous curvature conditions, we now address the announced
gradient bounds (or commutation between Pt and Γ) which were developed in the
smooth manifold case in Sect. 3.2.3 under the hypotheses of connexity and com-
pleteness. The point here is that the result will be extended in the abstract frame-
work to the case when the underlying operator is essentially self-adjoint. As we
have seen in Proposition 3.3.11, essential self-adjointness holds under connexity,
weak hypo-ellipticity and completeness, so that the approach here goes further than
the manifold case.
Although the gradient bound Γ(Pt f) ≤ e^{−2ρt} Pt(Γ(f)) seems easier to obtain than the strong bound, the main difficulty is that we have no a priori control on the L2(μ)-norm of it, in contrast to that of √Γ(Pt f). Unfortunately, this leads to some extra complications. One main result is emphasized in the following theorem and its corollary.
Although the corollary seems a weaker result than the theorem (by Jensen’s in-
equality for Pt , which is sub-Markov), we will see later (cf. Theorem 4.7.2, p. 209)
that they are in fact both equivalent to the curvature CD(ρ, ∞) condition. Actually,
taking the derivative at t = 0 of Γ(Pt f) ≤ e^{−2ρt} Pt(Γ(f)) yields
Lemma 3.3.20 Under the ESA and CD(ρ, ∞) assumptions, for every f ∈ A0, and any t ≥ 0 and ε > 0, √(Γ(Pt f) + ε²) − ε ∈ D(E).
Proof Set Gε = √(Γ(Pt f) + ε²) − ε. Since Pt f ∈ A, by composition with a smooth function, Gε ∈ A, and from the ESA property, it is enough to prove that Gε ∈ L2(μ) and Γ(Gε) ∈ L1(μ) (Proposition 3.3.6 (ii)). The first claim follows since we have √(r + ε²) − ε ≤ √r for any r ≥ 0 and ∫_E Γ(Pt f) dμ = E(Pt f) ≤ E(f) < ∞. The second claim is a consequence of the bounds
\[ \Gamma(G_\varepsilon) = \frac{\Gamma\bigl(\Gamma(P_t f)\bigr)}{4\bigl(\Gamma(P_t f) + \varepsilon^2\bigr)} \le \Gamma_2(P_t f) - \rho\, \Gamma(P_t f), \]
Lemma 3.3.21 Under the ESA and CD(0, ∞) assumptions, for any f ∈ A ∩ L2(μ) such that Lf ∈ D(E) and √(Γ(f) + ε²) − ε ∈ D(E) ∩ A for some ε > 0, and for any h positive and bounded in D(E),
\[ \mathcal{E}\Bigl(h, \sqrt{\Gamma(f) + \varepsilon^2} - \varepsilon\Bigr) + \int_E h\, \frac{\Gamma(f, Lf)}{\sqrt{\Gamma(f) + \varepsilon^2}}\, d\mu \le 0. \]
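Proof Using the same notation Gε = √(Γ(f) + ε²) − ε, for h ∈ A0
\[
\begin{aligned}
\mathcal{E}(h, G_\varepsilon) &= - \int_E h\, L G_\varepsilon\, d\mu \\
  &= - \int_E h \left[ \frac{4\, \Gamma_2(f) \bigl(\Gamma(f) + \varepsilon^2\bigr) - \Gamma\bigl(\Gamma(f)\bigr)}{4 \bigl(\Gamma(f) + \varepsilon^2\bigr)^{3/2}} + \frac{\Gamma(f, Lf)}{\sqrt{\Gamma(f) + \varepsilon^2}} \right] d\mu.
\end{aligned}
\]
Applying once again the extended Γ₂ inequality (3.2.4) the announced inequality follows in this case. It may then be extended to any h ∈ D(E) by density.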
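The expression for LGε used in this proof comes from the change of variables formula L(ψ(u)) = ψ′(u) Lu + ψ″(u) Γ(u) applied to u = Γ(f). As a sketch, with the auxiliary function ψ(r) = √(r + ε²) − ε (a name introduced here for illustration) and the identity LΓ(f) = 2Γ₂(f) + 2Γ(f, Lf), which is the definition of Γ₂ rearranged:

```latex
\begin{align*}
\psi'(r) &= \frac{1}{2\sqrt{r + \varepsilon^2}}, \qquad
\psi''(r) = -\frac{1}{4\,(r + \varepsilon^2)^{3/2}}, \\
L G_\varepsilon
  &= \frac{2\,\Gamma_2(f) + 2\,\Gamma(f, Lf)}{2\sqrt{\Gamma(f) + \varepsilon^2}}
     - \frac{\Gamma\bigl(\Gamma(f)\bigr)}{4\,\bigl(\Gamma(f) + \varepsilon^2\bigr)^{3/2}} \\
  &= \frac{4\,\Gamma_2(f)\,\bigl(\Gamma(f) + \varepsilon^2\bigr) - \Gamma\bigl(\Gamma(f)\bigr)}
          {4\,\bigl(\Gamma(f) + \varepsilon^2\bigr)^{3/2}}
     + \frac{\Gamma(f, Lf)}{\sqrt{\Gamma(f) + \varepsilon^2}} .
\end{align*}
```

Under CD(0, ∞) in the reinforced form Γ(Γ(f)) ≤ 4 Γ(f) Γ₂(f), the first numerator is bounded below by 4ε² Γ₂(f) ≥ 0, which gives the sign needed in the lemma.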
Proof of Theorem 3.3.18 We only prove the theorem when ρ = 0, the extension to the general case being straightforward. Fix ε > 0, f ∈ A0 and h ≥ 0 in A0. For t > 0 fixed, consider
\[ \Lambda(s) = \int_E h\, P_s G_\varepsilon\, d\mu = \int_E G_\varepsilon\, P_s h\, d\mu, \quad s \in [0, t], \]
with the notation Gε = √(Γ(Pt−s f) + ε²) − ε as above. The task will be to show that Λ is increasing. From Lemma 3.3.20, Gε ∈ D(E). By spectral analysis, for every G ∈ D(E), ∂s ∫_E G Ps h dμ = −E(Ps h, G). On the other hand,
\[ \partial_s G_\varepsilon = - \frac{\Gamma(P_{t-s} f, P_{t-s} Lf)}{G_\varepsilon + \varepsilon} \]
while
For any f ∈ A0, Γ(Pt f) ∈ L1(μ) showing that ‖∂s Gε‖₁ + ‖∂s² Gε‖₁ ≤ C for some constant C depending only on f ∈ A0. Since Pt h is bounded, differentiation of Λ is then justified and, by the usual change of variables formula,
\[ \Lambda'(s) = - \mathcal{E}(P_s h, G_\varepsilon) - \int_E P_s h\, \frac{\Gamma(P_{t-s} f, P_{t-s} Lf)}{\bigl(\Gamma(P_{t-s} f) + \varepsilon^2\bigr)^{1/2}}\, d\mu. \]
This is positive from Lemma 3.3.21. Hence Λ(t) ≥ Λ(0) for any positive h ∈ A0 and any ε > 0, leading to the conclusion. The theorem is established.
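For the reader's convenience, here is how the monotonicity Λ(t) ≥ Λ(0) leads to the strong gradient bound in the case ρ = 0. This is a sketch that uses only the definition of Λ(s) as the integral of h Ps Gε with Gε = √(Γ(Pt−s f) + ε²) − ε, evaluated at the two endpoints s = t and s = 0.

```latex
\begin{align*}
\Lambda(t) &= \int_E h\, P_t\Bigl(\sqrt{\Gamma(f) + \varepsilon^2} - \varepsilon\Bigr)\, d\mu, \qquad
\Lambda(0) = \int_E h\, \Bigl(\sqrt{\Gamma(P_t f) + \varepsilon^2} - \varepsilon\Bigr)\, d\mu, \\
\Lambda(t) \ge \Lambda(0) \ \text{for all } h \ge 0 \text{ in } \mathcal{A}_0
  &\;\Longrightarrow\;
  \sqrt{\Gamma(P_t f) + \varepsilon^2} - \varepsilon
  \le P_t\Bigl(\sqrt{\Gamma(f) + \varepsilon^2} - \varepsilon\Bigr), \\
\varepsilon \to 0
  &\;\Longrightarrow\;
  \sqrt{\Gamma(P_t f)} \le P_t\bigl(\sqrt{\Gamma(f)}\bigr).
\end{align*}
```

The inequalities hold μ-almost everywhere, in line with the convention of this section.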
(This type of local Poincaré inequality will be extensively examined in Sect. 4.7, p. 206.) In particular, whenever (f_ℓ)_{ℓ∈N} (in A0) converges to f in Lp(μ) for some p > 2, then Γ(Pt f_ℓ) → Γ(Pt f) in Lq(μ) for any q < p/2 and any t > 0.
Remark 3.3.22 The gradient bounds of Theorem 3.3.18 will play a major role in
the forthcoming chapters. In particular cases, similar results may be reached under
weaker hypotheses. For example, when solving stochastic differential equations, the
route proposed in Proposition 3.2.5 provides such bounds with stochastic calculus
tools. Similar results may be obtained for semigroups on bounded domains with
Neumann boundary conditions, provided there is enough information about the ge-
ometry of the boundary. Such an example would however never be covered by the
arguments presented here, due to the lack of the ESA property in this context. The
conclusion of Corollary 3.3.19 is still valid in non-diffusion instances (for example
for Markov chains on a finite space). In some examples, such as the Heisenberg
model (cf. (3.2.3)), the gradient bounds also hold up to a constant K > 1 as
\[ \sqrt{\Gamma(P_t f)} \le K e^{-\rho t}\, P_t\bigl(\sqrt{\Gamma(f)}\bigr) \]
With the preceding gradient bounds, Theorem 3.2.6 of the previous section on
the mass conservation property extends to the present abstract setting. Recall that
the semigroup P = (Pt )t≥0 is conservative if Pt (1) = 1 for every t ≥ 0. As men-
tioned earlier, the mass conservation property is also known as the Markov, stochas-
tic completeness or non-explosion property. The following statement also covers
Theorem 3.2.7 in this abstract framework and is established similarly.
the esssup running over all bounded functions f in A such that Γ(f) ≤ 1. (Here,
the esssup is defined as the least measurable function on (E × E, F ⊗ F) which is
larger than f (x) − f (y) for μ ⊗ μ-almost every (x, y) ∈ E × E over the given class
of functions f .) The choice of an esssup (which thus rules out sets of μ-measure 0)
instead of a mere supremum is due to the fact that we are considering only classes
of functions. This distance, often called the intrinsic distance (although depending
on the underlying algebra A), coincides with the Riemannian distance in the case
of diffusions on a manifold with elliptic coefficients (see Sect. C.4, p. 509), and
also in some hypo-elliptic cases with the so-called Carnot-Carathéodory distance.
The diameter is defined as the L∞ (E × E, μ ⊗ μ)-norm of the distance function
d(x, y), (x, y) ∈ E × E. We make no claim that (3.3.9) effectively defines a distance
(beyond the triangle inequality which is obviously satisfied), and moreover that this
distance has anything to do with the completeness hypothesis. This would require
many more hypotheses on A and A0 (and would certainly be useless). In particular,
in some infinite-dimensional settings such as the Ornstein-Uhlenbeck example of
Sect. 2.7.2, p. 108, this distance is almost everywhere infinite. However, it will play
a key role in many functional inequalities or estimates on heat kernels in particular
by means of the associated Lipschitz functions. Distance to a measurable set A as d(x, A) = esssup f(x), x ∈ E, where the esssup runs over all bounded functions f in A such that Γ(f) ≤ 1 and f 1_A = 0, may be considered similarly. In the above mentioned example of the infinite-dimensional Ornstein-Uhlenbeck semigroup, this distance is μ-almost everywhere finite as soon as μ(A) > 0. It will not be used below but any result on Lipschitz functions would also apply to these distances from sets.
Definition 3.3.25 (Full Markov Triple) A Full Markov Triple (E, μ, Γ) is a Standard Markov Triple (thus with algebra A0 and extended algebra A) satisfying more-
over the connexity and ESA properties.
For a Compact Markov Triple, A0 is stable under (Pt )t≥0 and therefore automat-
ically dense in D(L) with respect to its topology.
In this last section, we summarize the previous investigation and present the typical
framework for the analysis of symmetric Markov diffusion operators in which we
will be working in most parts of this monograph. Although various results might
hold in broader settings, without in particular the symmetry or diffusion assumptions, we will always work with a Standard or Full Markov Triple (E, μ, Γ) as presented in Definitions 3.1.15 and 3.3.25 with underlying algebra A0 and extended algebra A respectively (the properties of which are recalled below). This framework includes the Dirichlet form E with domain D(E) defined by E(f) = ∫_E Γ(f) dμ, the associated diffusion operator L with L2(μ)-domain D(L), infinitesimal generator of the Markov semigroup P = (Pt)t≥0 (with invariant and reversible measure μ), and associated Markov process, or family of processes, X = {X_t^x ; t ≥ 0, x ∈ E}. The semigroup P = (Pt)t≥0 may be represented according to (1.2.4), p. 12, by probability kernels as
\[ P_t f(x) = \int_E f(y)\, p_t(x, dy), \quad t \ge 0, \ x \in E. \]
The kernels pt(x, dy) describe the distribution at time t of the Markov process X_t^x
starting at x. Functions are understood as classes of functions, and equalities and
inequalities between them hold μ-almost everywhere.
Throughout, we adopt the Full Markov Triple structure as a convenient framework which allows us to describe the main results and freely develop the central ideas and principles of the Γ-calculus and heat flow monotonicity.
Standard Markov Triples, or even just Diffusion Markov Triples, suffice for many
statements. Similarly, some results only require a number of specific properties of
the Full Markov Triple definition, but for convenience we stick to the latter. If there
is any need to distinguish between Full and Standard Triples, generally speaking
Full Markov Triples are necessary as soon as gradient bounds, the Γ₂ operator and
curvature-dimension conditions enter into play, otherwise Standard Markov Triples
may be used.
As mentioned earlier, it is sometimes appropriate to work in the Compact Markov
Triple setting (Definition 3.3.26), although the conclusions will actually hold in the
full case.
Finally, for simplicity of exposition, we mostly use the reduced terminology
“Markov Triple”, assuming that Full Markov Triple is meant (or only Standard
Markov Triple if this is enough for the purpose of the given results according to
the preceding remarks).
We present below a short synthesis of the different objects of interest and hy-
potheses described in this first part, to which the reader may refer while progressing
through the following chapters.
3.4 Summary of Hypotheses (Markov Triple) 169
\[ \Gamma\bigl(\Psi(f_1, \ldots, f_k), g\bigr) = \sum_{i=1}^{k} \partial_i \Psi(f_1, \ldots, f_k)\, \Gamma(f_i, g). \tag{3.4.1} \]
D4. For every f in A0, there exists a finite constant C(f) such that for every g ∈ A0,
\[ \left| \int_E \Gamma(f, g)\, d\mu \right| \le C(f)\, \|g\|_2. \]
The Dirichlet form E is defined for every (f, g) ∈ A0 × A0 by
\[ \mathcal{E}(f, g) = \int_E \Gamma(f, g)\, d\mu. \]
E(f, f) is abbreviated as E(f). The domain D(E) of the Dirichlet form E is the completion of A0 with respect to the norm ‖f‖_E = [‖f‖₂² + E(f)]^{1/2}. The Dirichlet form E is extended to D(E) by continuity together with the carré du champ operator Γ.
D5. L is a linear operator on A0 defined by, and satisfying, the integration by parts formula
\[ \int_E g\, Lf\, d\mu = - \int_E \Gamma(f, g)\, d\mu \]
for all f, g ∈ A0. The change of variables formula (3.1.8) for L is a consequence of the change of variables formula (3.4.1) for Γ.
D6. For the operator L defined in D5, L(A0 ) ⊂ A0 .
D7. The domain D(L) of the operator L is defined as the set of f ∈ D(E) for which there exists a finite constant C(f) such that for any g ∈ D(E)
\[ \bigl|\mathcal{E}(f, g)\bigr| \le C(f)\, \|g\|_2. \]
On D(L), L is extended via the integration by parts formula for every g ∈ D(E). L defined on D(L) is always self-adjoint.
Adjoint Operator (Definition 3.1.9) The domain D(L∗) is the set of functions f ∈ L2(μ) such that there exists a finite constant C(f) for which, for every g ∈ A0,
\[ \left| \int_E f\, Lg\, d\mu \right| \le C(f)\, \|g\|_2. \]
On this domain, the adjoint operator L∗ is defined by integration by parts: for any g ∈ A0,
\[ \int_E f\, Lg\, d\mu = \int_E g\, L^* f\, d\mu. \]
It holds that D(L) ⊂ D(L∗) and L∗ is an extension of L.
A Full Markov Triple is a Standard Markov Triple for which there is an extended
algebra A ⊃ A0 of functions, with no requirements of integrability for elements of
A, satisfying the following requirements (Definition 3.3.1).
F1. Whenever f ∈ A and h ∈ A0 , hf ∈ A0 (ideal property).
holds true.
F8. For every f ∈ A0 , and every t ≥ 0, Pt f ∈ A.
F9. The Markov Triple is connected in the sense that if f ∈ A, Γ(f) = 0 implies
that f is constant (Definition 3.3.7).
F10. The ESA property holds (Definition 3.1.10).
A Compact Markov Triple is a Full Markov Triple such that A = A0 . The ESA
property is then automatic.
3.4.5 Miscellaneous
Algebra A0^const (Remark 3.3.3) When μ is a probability measure, A0 may be replaced by A0^const = {f + c ; f ∈ A0, c ∈ R}, extending by defining Γ(f, 1) = 0 and L(1) = 0. The change of variables formulas (3.1.2) and (3.1.8) extend to functions f ∈ A0^const without the restriction that Ψ(0) = 0. The ideal property does not apply to A0^const (in which case A = A0). We also sometimes work with A0^const+ = {f + c ; f ∈ A0, f ≥ 0, c > 0}.
\[ \Gamma\bigl(\Gamma(f)\bigr) \le 4 \bigl[\Gamma_2(f) - \rho\, \Gamma(f)\bigr]\, \Gamma(f). \]
This chapter collects and formalizes a selection of definitions and properties ex-
tracted from Chaps. 1 and 2. In particular, several references relevant to this chapter
may already be found there.
The abstract framework of a Markov Triple (E, μ, Γ) emphasized in Sects. 3.1
and 3.3 is inspired by the Dirichlet form theory as developed in [91, 189, 190, 294].
It essentially summarizes and extends the early description put forward in the lecture
notes [26] (see also [27, 28]). The intermediate value Theorem 3.1.16 is recorded
in [39]. The slicing argument of Proposition 3.1.17 is part of the folklore on Dirichlet
forms and capacities (cf. [91, 303] etc.).
Section 3.2 emphasizes some fundamental regularity results in the manifold set-
ting. The uniqueness Proposition 3.2.1 and its Corollary 3.2.2 are suitable extrac-
tions from [247, 248]. In this manifold setting, completion and self-adjointness have
been deeply investigated by R. Strichartz [390]. Gradient bounds and commu-
tation properties between gradient and semigroup (Theorems 3.2.3 and 3.2.4), of
fundamental importance throughout this work, appeared in the probabilistic con-
text via the Bismut representation formula of Proposition 3.2.5 [69, 171, 172]
(see [251, 391]). The Γ2 approach to the gradient bounds originates in [36] and
has been promoted in [22, 24] (see also [25]) in connection with the analysis of the
boundedness of Riesz transforms. The mass conservation Theorem 3.2.6 under a
lower bound on the Ricci curvature essentially goes back to S.-T. Yau [443] (see ear-
lier [191]). It has been widely studied and extended under Ricci or volume growths
by numerous authors, and a detailed history is presented for example in [215, 218].
The monograph [217] by A. Grigor’yan is a comprehensive investigation of heat
kernel bounds on Riemannian manifolds and metric spaces in which the reader will
find complete references and historical developments.
The general setting put forward in Sect. 3.3 is largely inspired by the early con-
tribution [36] and already outlined in the lecture notes [26] (see also [27, 28]). The
complete and self-contained exposition developed here is new, and emphasizes in
particular essential self-adjointness as a critical tool in this context. The intrinsic
distance is also a classical feature of the theory of Dirichlet forms [91, 189, 190].
See also [243, 395, 396]. On gradient bounds for the hypo-elliptic Heisenberg model
(cf. Remark 3.3.22), see [30, 51, 282, 283, 425, 446].
Part II
Three Model Functional Inequalities
Chapter 4
Poincaré Inequalities
This chapter investigates the first important family of functional inequalities for
Markov semigroups, the Poincaré or spectral gap inequalities. These will provide
the first results towards convergence to equilibrium, and illustrate, at a mild and
accessible level, some of the basic ideas and techniques on Markov semigroups
and functional inequalities developed throughout this monograph, at the interplay
between analysis, probability theory and geometry.
Following the conclusions of Chap. 3 and the summary in Sect. 3.4, p. 168, it is
most convenient to present the results of this chapter, as in most parts of the book,
in the Full Markov Triple framework. Furthermore, according to the convention set
forth there, we use the terminology “Markov Triple” for Full Markov Triple. Results
with specific hypotheses will be clearly indicated.
More general settings may be considered for the analysis of spectral gap inequal-
ities. In order to state the Poincaré inequalities, it is actually enough to deal with the
minimal structure (E, μ, Γ) with finite invariant measure μ, even without the diffusion
property. Standard Markov Triples suffice for many of the results described
below, Full Markov Triples being necessary when dealing with inequalities involv-
ing gradient bounds and curvature conditions.
Recall that the Markov Triple setting includes a triple (E, μ, Γ) with state space
E, invariant reversible measure μ and carré du champ operator Γ acting on an algebra
of bounded measurable functions A0 . The setting involves the Dirichlet form
$$
\mathcal{E}(f) = \int_E \Gamma(f)\,d\mu
$$
with domain D(E) and the associated diffusion operator L with domain D(L), gen-
erator of the Markov semigroup P = (Pt )t≥0 . Most inequalities are usually stated
and established for functions in D(E). In general, they are established in fact only
for the functions of the algebra A0 , which by construction is dense in the Dirichlet
domain D(E). The extended algebra A ⊃ A0 allows for the Γ-calculus and gradient
bounds. In concrete examples, A0 plays the role of the set of smooth compactly sup-
ported functions while A represents smooth functions with no restriction on integra-
bility and support. The associated Markov semigroup P = (Pt )t≥0 with infinitesimal
The Poincaré, or spectral gap, inequality is the simplest inequality which quantifies
ergodicity and controls convergence to equilibrium of the semigroup P = (Pt )t≥0
towards the invariant measure μ (in other words, the convergence of the kernels
pt (x, dy), x ∈ E, as t → ∞, towards dμ(y)). Since the kernels pt (x, dy) describe
the distribution at time t of the associated Markov process $X_t^x$ starting at x, this
convergence is equivalent to the convergence of $X_t^x$ towards equilibrium.
On the basis of the example of the Ornstein-Uhlenbeck semigroup (Sect. 4.1),
Sect. 4.2 introduces the formal definition of Poincaré or spectral gap inequalities
in the context of a Markov Triple (E, μ, Γ) as above. Exponential decay in L2 (μ)
along the semigroup is one of the first main equivalent descriptions of Poincaré
inequalities. Tensorization properties are presented next. The example of the ex-
ponential measure on the line is discussed in Sect. 4.4, together with exponential
integrability properties of Lipschitz functions under a Poincaré inequality. The next
section studies Poincaré inequalities for measures on the real line or on an inter-
val of the real line. Section 4.5.1 first presents a characterization of measures on
the line satisfying a Poincaré inequality. Then, in Sect. 4.5.2, Poincaré inequalities
on an interval with respect to Neumann, Dirichlet and periodic boundary condi-
tions are discussed. Section 4.6 presents the Lyapunov function method of obtain-
ing Poincaré inequalities. Local Poincaré inequalities for heat kernel measures un-
der curvature conditions are investigated next in Sect. 4.7 via the basic semigroup
interpolation scheme. Global Poincaré inequalities for the invariant measure under
curvature-dimension conditions are developed in Sect. 4.8. Further inequalities of
Brascamp-Lieb-type are presented in Sect. 4.9. Finally, Sect. 4.10 is a somewhat
in depth investigation of the description of the bottom of the spectrum and essen-
tial spectrum of Markov generators together with criteria of general interest which
establish the discreteness of spectra via the notion of a Persson operator.
In this case,
$$
P_t f = \sum_{k \in \mathbb{N}} e^{-kt}\,a_k H_k, \quad t \ge 0,
$$
for every t ≥ 0. Since both terms of this inequality are equal at t = 0, we can perform
a first order Taylor expansion at t = 0 of both sides, for any function f in the domain
D(L). More precisely, since Pt f = f + t Lf + o(t), we get that
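$$
\int_{\mathbb{R}} (P_t f)^2\,d\mu = \int_{\mathbb{R}} f^2\,d\mu + 2t \int_{\mathbb{R}} f\,Lf\,d\mu + o(t)
= \int_{\mathbb{R}} f^2\,d\mu - 2t\,\mathcal{E}(f) + o(t)
$$
and, clearly,
$$
e^{-2t} \int_{\mathbb{R}} f^2\,d\mu = \int_{\mathbb{R}} f^2\,d\mu - 2t \int_{\mathbb{R}} f^2\,d\mu + o(t).
$$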
Hence $\int_{\mathbb{R}} f^2\,d\mu \le \mathcal{E}(f)$. Since $\mathcal{E}(f + c) = \mathcal{E}(f)$ for any real c, it follows, after
centering, that for any function f ∈ D(L), and thus by extension for every f in the
Dirichlet domain D(E) of the Ornstein-Uhlenbeck operator,
$$
\int_{\mathbb{R}} f^2\,d\mu - \Bigl(\int_{\mathbb{R}} f\,d\mu\Bigr)^2 \le \mathcal{E}(f). \tag{4.1.1}
$$
This inequality (4.1.1) (or rather family of inequalities) may be read in two ways.
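First, for smooth functions f , by construction of the carré du champ operator Γ,
$$
\int_{\mathbb{R}} f^2\,d\mu - \Bigl(\int_{\mathbb{R}} f\,d\mu\Bigr)^2 \le \mathcal{E}(f) = \int_{\mathbb{R}} \Gamma(f)\,d\mu = \int_{\mathbb{R}} f'^2\,d\mu.
$$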
On the other hand, (4.1.1) expresses equivalently that the first non-trivial eigenvalue
of (the opposite generator) −L is greater than or equal to 1. Indeed, if f
is non-constant such that −Lf = λf for some λ (≥ 0), then $\int_{\mathbb{R}} f\,d\mu = 0$ and
$\int_{\mathbb{R}} \Gamma(f)\,d\mu = \int_{\mathbb{R}} f(-Lf)\,d\mu = \lambda \int_{\mathbb{R}} f^2\,d\mu$ so that λ ≥ 1. Inequality (4.1.1) will
be called the Poincaré, or spectral gap, inequality for the Gaussian measure μ (with
respect to the carré du champ operator Γ or the Dirichlet form E).
Note that (4.1.1) may be established directly from the spectral decomposition in
Hermite polynomials since
$$
\mathcal{E}(f) = \sum_{k \ge 1} k\,a_k^2 \ge \sum_{k \ge 1} a_k^2 = \int_{\mathbb{R}} f^2\,d\mu - \Bigl(\int_{\mathbb{R}} f\,d\mu\Bigr)^2,
$$
where, again, the last equality is a consequence of the fact that $a_0 = \int_{\mathbb{R}} f\,d\mu$. Note
also that (4.1.1) is optimal in the sense that functions of the form f = aH0 + bH1 =
a + bx for a, b ∈ R achieve equality in (4.1.1), thus describing the class of extremal
functions for this inequality.
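As a numerical illustration of (4.1.1) and of its equality case, the following sketch evaluates both sides for polynomial test functions by Gauss-Hermite quadrature. It is only an illustration under stated assumptions: the helper name `gaussian_variance_and_energy` and the two test polynomials are hypothetical choices, not taken from the text, and the quadrature rule is exact for the polynomial integrands used here.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite rule for the weight exp(-x^2/2), normalized to a probability measure.
nodes, weights = hermegauss(30)
weights = weights / weights.sum()


def gaussian_variance_and_energy(poly):
    """Return (Var_mu(f), integral of f'^2 dmu) for a polynomial f, mu = N(0, 1)."""
    f = poly(nodes)
    df = poly.deriv()(nodes)
    mean = np.dot(weights, f)
    variance = np.dot(weights, f**2) - mean**2
    energy = np.dot(weights, df**2)
    return float(variance), float(energy)


# Hypothetical test functions: an affine one (expected extremal) and a cubic one.
affine = Polynomial([2.0, -3.0])          # f(x) = 2 - 3x
cubic = Polynomial([1.0, 0.0, 1.0, 1.0])  # f(x) = 1 + x^2 + x^3

var_affine, energy_affine = gaussian_variance_and_energy(affine)
var_cubic, energy_cubic = gaussian_variance_and_energy(cubic)
print("affine:", var_affine, energy_affine)
print("cubic: ", var_cubic, energy_cubic)
```

For the affine function the two sides should coincide up to rounding, while for the cubic the variance should be strictly smaller than the energy.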
Observe that similar arguments on multiple Hermite polynomials (or tensorization
tools as will be developed in Sect. 4.3 below) yield a Poincaré or
spectral gap inequality for the standard Gaussian probability measure
$d\mu(x) = (2\pi)^{-n/2} e^{-|x|^2/2}\,dx$ on Rn with carré du champ operator $\Gamma(f) = |\nabla f|^2$ and
Dirichlet form $\mathcal{E}(f) = \int_{\mathbb{R}^n} |\nabla f|^2\,d\mu$.
Proposition 4.1.1 (Poincaré inequality for the Gaussian measure) Let μ be the
standard Gaussian measure on the Borel sets of Rn . For every function f : Rn → R
in the Dirichlet domain D(E),
$$
\int_{\mathbb{R}^n} f^2\,d\mu - \Bigl(\int_{\mathbb{R}^n} f\,d\mu\Bigr)^2 \le \mathcal{E}(f) = \int_{\mathbb{R}^n} |\nabla f|^2\,d\mu.
$$
4.2 Poincaré Inequalities
For a probability measure ν on a measurable space (E, F), we define the variance
of a function f in L2 (ν) as
$$
\mathrm{Var}_\nu(f) = \int_E f^2\,d\nu - \Bigl(\int_E f\,d\nu\Bigr)^2. \tag{4.2.1}
$$
Varμ (f ) ≤ C E(f ).
The best constant C > 0 for which such an inequality holds is sometimes referred
to as the Poincaré constant (of the Markov Triple).
It is enough to state such a Poincaré inequality for a family of functions f which
is dense in the domain D(E) of the Dirichlet form E. This principle will be used
almost automatically when establishing Poincaré (and more general functional)
inequalities, and typically functions from the underlying algebra A0 will be used for
this purpose.
The terminology between Poincaré and spectral gap inequality fluctuates somewhat.
We often say that the probability measure μ satisfies a Poincaré inequality
P (C) with constant C (with respect to the carré du champ operator Γ or the Dirichlet
form E). For example, by Proposition 4.1.1, the standard Gaussian measure μ on
Rn satisfies a Poincaré inequality P (1) with respect to the standard Dirichlet form
Remark 4.2.2 A function with zero variance is constant. Therefore, one impor-
tant consequence of a spectral gap inequality for the invariant measure μ is that
if E(f ) = 0, then f is (μ-almost everywhere) constant. In particular, such a con-
clusion reinforces the ergodicity property of Definition 3.1.11, p. 135. It is also a
stronger form of connectedness (Definition 3.3.7, p. 156). Functional inequalities with
this property will be called tight. A general principle, to be developed throughout
this book, is that tight inequalities have something to say about convergence to equi-
librium (cf. Sect. 1.8, p. 32).
Actually, as will be developed later for other functional inequalities, one can
consider non-tight Poincaré inequalities such as
$$
\int_E f^2\,d\mu \le a \Bigl(\int_E f\,d\mu\Bigr)^2 + C\,\mathcal{E}(f), \quad f \in \mathcal{D}(\mathcal{E}).
$$
This would not be of much help here, at least for probability measures, since this
inequality applied to the constant function f = 1 shows that a ≥ 1. Then,
applied to $g = f - \int_E f\,d\mu$, we are back to the case a = 1. On the other hand, the
Nash-type inequalities investigated later in Sect. 7.4.1, p. 364,
$$
\int_E f^2\,d\mu \le a \Bigl(\int_E |f|\,d\mu\Bigr)^2 + C\,\mathcal{E}(f), \quad f \in \mathcal{D}(\mathcal{E}),
$$
Lemma 4.2.3 Under the Poincaré inequality P (C), for a function f ∈ D(E) such
that f = 0 outside a set A ∈ F with μ(A) < 1, it holds that
$$
\int_A f^2\,d\mu \le \frac{C}{1 - \mu(A)}\,\mathcal{E}(f).
$$
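One way to see where the factor 1 − μ(A) comes from is the following short computation. It is a sketch that assumes only the Cauchy-Schwarz inequality and P (C), and it is not claimed to reproduce the argument of the source.

```latex
\Bigl(\int_E f\,d\mu\Bigr)^2
  = \Bigl(\int_E f\,\mathbf{1}_A\,d\mu\Bigr)^2
  \le \mu(A)\int_A f^2\,d\mu,
\qquad\text{hence}\qquad
\bigl(1-\mu(A)\bigr)\int_A f^2\,d\mu
  \le \mathrm{Var}_\mu(f)
  \le C\,\mathcal{E}(f).
```

Dividing by 1 − μ(A) > 0 then gives the bound of the lemma.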
for functions f ∈ D(E) with support in A as in the previous lemma may hold in
various contexts. For example, the so-called Faber-Krahn inequalities, in the con-
text of Nash inequalities, belong to this family (see Remark 6.2.4, p. 284, and Re-
mark 8.2.2, p. 399).
(iii) For every function f ∈ L2 (μ), there exists a constant c(f ) > 0 (possibly de-
pending on f ) such that, for every t ≥ 0,
for every f ∈ L2 (μ) and t > 0. Recall that according to Proposition 3.1.6 and
Remark 3.1.7, p. 131 and p. 132, Pt , t > 0, maps L2 (μ) into $\mathcal{D}(L^k)$ for every
integer k. Assume then for simplicity that f has mean zero. Since by invariance,
$\int_E P_t f\,d\mu = \int_E f\,d\mu = 0$,
$$
\Lambda(t) = \mathrm{Var}_\mu(P_t f) = \int_E (P_t f)^2\,d\mu, \quad t \ge 0.
$$
which yields (4.2.2). The Poincaré inequality P (C) therefore translates into the
differential inequality $\Lambda(t) \le -\frac{C}{2}\,\Lambda'(t)$, t > 0, which amounts to saying that the
function $e^{2t/C}\,\Lambda(t)$ is decreasing in t ≥ 0. The conclusion follows by comparing the
values of this function at t and at t = 0.
To prove the equivalence of the third statement (iii), it is convenient to record the
following lemma of independent interest.
Lemma 4.2.6 For any non-zero function f ∈ L2 (μ), the map $t \mapsto \log(\|P_t f\|_2)$ is
convex on R+ .
The proof is immediate. Indeed, if $\Lambda(t) = \int_E (P_t f)^2\,d\mu$, t > 0, as before,
$\Lambda'(t) = 2\int_E P_t f\,LP_t f\,d\mu$ and
$$
\Lambda''(t) = 4 \int_E (LP_t f)^2\,d\mu.
$$
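The remaining step can be sketched as follows, writing Λ(t) = ∫_E (P_t f)² dμ for the function considered in the proof and assuming Λ(t) > 0 on the range of t considered. This is a sketch of the standard Cauchy-Schwarz argument, not a quotation of the source.

```latex
\Lambda'(t)^2
  = 4\Bigl(\int_E P_t f\, LP_t f\,d\mu\Bigr)^2
  \le 4\int_E (P_t f)^2\,d\mu \int_E (LP_t f)^2\,d\mu
  = \Lambda(t)\,\Lambda''(t),
\qquad
(\log \Lambda)'' = \frac{\Lambda\,\Lambda'' - \Lambda'^2}{\Lambda^2} \ge 0 .
```

Since log ‖P_t f‖_2 = ½ log Λ(t), convexity of log Λ is the convexity asserted in the lemma.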
The elementary proof follows the same pattern. For a function f ∈ D(E), the derivative
of $\int_E (f - P_t f)^2\,d\mu$ in t for t > 0 is equal, by symmetry and integration by
parts, to
$$
-2 \int_E (f - P_t f)\,LP_t f\,d\mu = 2\,\mathcal{E}(P_{t/2} f) - 2\,\mathcal{E}(P_t f) \le 2\,\mathcal{E}(f),
$$
Proposition 4.2.7 (Bounded perturbation) Assume that the Markov Triple (E, μ, Γ)
satisfies a Poincaré inequality P (C). Let μ1 be a probability measure with density
h with respect to μ such that $\frac{1}{b} \le h \le b$ for some constant b > 0. Then μ1 satisfies
$P(b^3 C)$ (with respect to Γ).
With the somewhat more refined Lemma 5.1.7, p. 240, below, $b^3$ may be replaced
by $b^2$ in the preceding. An even more precise statement is that if $h = e^k$, then $b^2$ may
be replaced by $e^{\mathrm{osc}(k)}$ where osc(k) = sup k − inf k is the oscillation of the function
k. (As usual, sup and inf stand here respectively for esssup and essinf with respect
to the measure μ.)
These first elementary observations already yield Poincaré inequalities for measures
μ on R or Rn with a bounded density with respect to the standard Gaussian
measure, and for Dirichlet forms, on the line say, of the form $\int_{\mathbb{R}} a(x)\,f'^2(x)\,d\mu(x)$
where the function a(x) is bounded from below by a strictly positive constant.
4.3 Tensorization of Poincaré Inequalities
Proof As a first indication of the result, consider the elementary case when the
corresponding (opposite) generators −L1 and −L2 have discrete spectra, $(\lambda^1_k)_{k \in \mathbb{N}}$ and
$(\lambda^2_\ell)_{\ell \in \mathbb{N}}$. The (opposite) generator −(L1 ⊕ L2 ) on the product space E1 × E2 has
spectrum $(\lambda^1_k + \lambda^2_\ell)_{k,\ell \in \mathbb{N}}$. Its smallest strictly positive eigenvalue is thus the minimum
of the smallest strictly positive eigenvalues of $(\lambda^1_k)_{k \in \mathbb{N}}$ and $(\lambda^2_\ell)_{\ell \in \mathbb{N}}$. The conclusion
follows in this case with $C_1 = \frac{1}{\lambda^1_1}$ and $C_2 = \frac{1}{\lambda^2_1}$.
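The spectral picture of a product can be checked on a small finite-state analogue. The sketch below is an illustration only and is not a diffusion example: it assumes two symmetric nearest-neighbour generators on paths (hypothetical helper `path_generator`), reversible for the uniform measure, and forms −(L1 ⊕ L2) as a Kronecker sum.

```python
import numpy as np


def path_generator(n, rate=1.0):
    """Opposite generator -L of a nearest-neighbour walk on a path with n sites.

    The matrix is symmetric, so the uniform measure is reversible.
    """
    minus_l = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    minus_l[0, 0] = minus_l[-1, -1] = 1.0  # reflecting end points
    return rate * minus_l


def spectral_gap(minus_l):
    """Smallest strictly positive eigenvalue (the eigenvalue 0 is simple here)."""
    return float(np.linalg.eigvalsh(minus_l)[1])


minus_l1 = path_generator(4)
minus_l2 = path_generator(6, rate=3.0)
# -(L1 + L2) on the product space acts as a Kronecker sum of the two factors.
minus_l_product = np.kron(minus_l1, np.eye(6)) + np.kron(np.eye(4), minus_l2)

gap1, gap2 = spectral_gap(minus_l1), spectral_gap(minus_l2)
gap_product = spectral_gap(minus_l_product)
poincare_constant = 1.0 / gap_product
print(gap1, gap2, gap_product, poincare_constant)
```

In this toy setting the product gap should equal min(gap1, gap2), so that the Poincaré constant of the product is max(1/gap1, 1/gap2).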
The general case is not much harder. The main point is to understand that the
Dirichlet form E on the product (E1 × E2 , μ1 ⊗ μ2 , Γ1 ⊕ Γ2 ) is given, on suitable
functions f : E1 × E2 → R, by
$$
\mathcal{E}(f) = \int_{E_2} \mathcal{E}_1(f)\,d\mu_2 + \int_{E_1} \mathcal{E}_2(f)\,d\mu_1.
$$
Here E1 is the Dirichlet form on E1 applied to the first variable (with the second
one being fixed), and similarly E2 is the Dirichlet form on E2 applied to the second
variable (with the first one being fixed). First we need the following lemma.
Proof Rather than giving a formal proof, let us sketch the main idea. Dirichlet forms
are bilinear and positive, sending functions from the Dirichlet spaces into R+ . Such
quadratic forms are convex, and if φ : R → R is convex, then by Jensen’s inequality
(under suitable integrability properties),
$$
\varphi\Bigl(\int_{E_2} F\,d\mu_2\Bigr) \le \int_{E_2} \varphi(F)\,d\mu_2.
$$
This inequality applied to F (x2 ) = f (·, x2 ) then yields the announced claim.
By the Poincaré inequality for μ1 , the first term on the right-hand side of this identity
is bounded from above by $C_1\,\mathcal{E}_1(g)$, and by Lemma 4.3.2, $\mathcal{E}_1(g) \le \int_{E_2} \mathcal{E}_1(f)\,d\mu_2$.
By the Poincaré inequality for μ2 for every fixed x1 ∈ E1 , the second term on the
right-hand side is bounded from above by $C_2 \int_{E_1} \mathcal{E}_2(f)\,d\mu_1$. Therefore
$$
\mathrm{Var}_{\mu_1 \otimes \mu_2}(f) \le C_1 \int_{E_2} \mathcal{E}_1(f)\,d\mu_2 + C_2 \int_{E_1} \mathcal{E}_2(f)\,d\mu_1 \le \max(C_1, C_2)\,\mathcal{E}(f).
$$
Remark 4.3.3 A somewhat different route may actually be used to establish Propo-
sition 4.3.1. On the product space E = E1 × E2 equipped with the product proba-
bility measure μ = μ1 ⊗ μ2 , consider, for every t ≥ 0 and x ∈ E, the probability
measure νt (dx, dy) = pt (x, dy)μ(dx) where pt (x, dy) denotes the kernel of the
semigroup (Pt )t≥0 (on E). As for the variance, it is easily checked that
$$
\int_E \int_E \bigl(f(x) - f(y)\bigr)^2\,\nu_t(dx, dy) = 2\Bigl(\int_E f^2\,d\mu - \int_E f\,P_t f\,d\mu\Bigr).
$$
(for every t ≥ 0). Since it is clear that $\nu_t = \nu_t^1 \otimes \nu_t^2$ corresponds to the tensor product
of the respective semigroups on E1 and E2 , the preceding inequality is immediately
stable under products.
The tensorization Proposition 4.3.1 may for example be used immediately to re-
cover the Poincaré inequality of Proposition 4.1.1 for the standard Gaussian measure
on Rn from its one-dimensional counterpart (4.1.1). Similarly, the Poincaré inequal-
ity for the exponential measure investigated in the next section may be tensorized to
multiple coordinates.
a Markov theory setting). We illustrate this strategy here with the example of the
exponential measure on the real line (and positive half-line), generalized later in
Sect. 4.6. On the basis of this example, we shall derive some integrability properties
of Lipschitz functions and measure concentration under a Poincaré inequality.
(the right-hand side describing the carré du champ operator and Dirichlet form as-
sociated to the Laguerre operator). The spectrum here is discrete (equal to N) and
the Poincaré inequality (4.4.1) expresses the gap between 0 and the first non-zero
eigenvalue 1.
However, one may also investigate the Poincaré inequality for the exponential
measure with respect to the standard carré du champ operator $\Gamma(f) = f'^2$ and its
associated Dirichlet form E, which is the content of the next proposition.
Proposition 4.4.1 Let $d\mu(x) = e^{-x}\,dx$ on the positive half-line R+ . For every function
f : R+ → R in the Dirichlet domain D(E) associated to the carré du champ
operator $\Gamma(f) = f'^2$,
$$
\mathrm{Var}_\mu(f) \le 4 \int_{\mathbb{R}_+} f'^2\,d\mu. \tag{4.4.2}
$$
It should be pointed out that the Dirichlet domain in this case generates a sub-
Markov semigroup so that Proposition 4.4.1 is not really in the realm of Poincaré
inequalities. Moreover the essential self-adjointness property (Definition 3.1.10,
p. 134) is certainly not satisfied here (cf. Sect. 2.4, p. 92).
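Inequality (4.4.2) can be probed numerically on polynomial test functions. The following sketch is illustrative only: it uses Gauss-Laguerre quadrature for the weight e^{−x} on the half-line, the helper name `variance_energy_ratio` is hypothetical, and monomials are chosen merely because the quadrature is exact for them.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.laguerre import laggauss

# Gauss-Laguerre rule for the weight exp(-x) on [0, infinity); weights sum to 1.
nodes, weights = laggauss(20)


def variance_energy_ratio(poly):
    """Return Var_mu(f) / integral of f'^2 dmu for a polynomial f, dmu = exp(-x) dx."""
    f = poly(nodes)
    df = poly.deriv()(nodes)
    mean = np.dot(weights, f)
    variance = np.dot(weights, f**2) - mean**2
    energy = np.dot(weights, df**2)
    return float(variance / energy)


# Ratios for the monomials x, x^2, x^3, x^4 (hypothetical test family).
ratios = {k: variance_energy_ratio(Polynomial.basis(k)) for k in (1, 2, 3, 4)}
for degree, ratio in ratios.items():
    print(f"f(x) = x^{degree}: Var / energy = {ratio:.6f}")
```

Each ratio should stay below the constant 4 of (4.4.2), and for this family the ratios increase with the degree.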
The conclusion follows by applying the Cauchy-Schwarz inequality to the last inte-
gral.
We next somewhat broaden the spirit of the preceding proof. For this discus-
sion, let us consider the symmetric exponential measure on the whole real line,
$d\mu(x) = \frac{1}{2}\,e^{-|x|}\,dx$. The density of μ with respect to the Lebesgue measure is not
quite smooth and, if necessary, the potential x → |x| may be replaced by a smooth
function with the same behavior at infinity. However, we work with the given den-
sity since the argument which is developed is simpler in this case. The associated
Markov generator is $Lf = f'' - \mathrm{sign}(x)\,f'$, where sign(x) denotes the sign of x,
with carré du champ operator $\Gamma(f) = f'^2$. Its spectrum is not discrete. This is actually
not due to the non-smooth coefficient, but rather to the behavior of |x| at infinity,
so that the decay of μ is not strong enough. For example, it is easily seen that as
soon as $\lambda \ge \frac{1}{4}$, there are no solutions to $f'' - f' = -\lambda f$ which are square integrable
on (0, ∞).
Nevertheless, the operator L admits a Poincaré inequality (and thus a gap in the
spectrum) as can be seen from the following argument directly inspired by the proof
of Proposition 4.4.1. What follows is a bit formal, but may be easily justified. Con-
sider the function h(x) = |x|, x ∈ R. Then Lh = 2δ0 − 1 where δ0 is the Dirac mass
at 0, so that integration against it with a (smooth) function f such that f (0) = 0
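gives 0. Let f be such a function, and write
$$
\int_{\mathbb{R}} f^2\,d\mu = -\int_{\mathbb{R}} f^2\,Lh\,d\mu = \int_{\mathbb{R}} \Gamma(f^2, h)\,d\mu = 2 \int_{\mathbb{R}} f\,\Gamma(f, h)\,d\mu.
$$
Now Γ(h) = 1 so that $|\Gamma(f, h)| \le \Gamma(f)^{1/2}$. Hence, after an application of the
Cauchy-Schwarz inequality as in the proof of Proposition 4.4.1,
$$
\int_{\mathbb{R}} f^2\,d\mu \le 2 \Bigl(\int_{\mathbb{R}} f^2\,d\mu\Bigr)^{1/2} \Bigl(\int_{\mathbb{R}} \Gamma(f)\,d\mu\Bigr)^{1/2},
$$
and therefore
$$
\mathrm{Var}_\mu(f) \le 4 \int_{\mathbb{R}} \Gamma(f)\,d\mu = 4 \int_{\mathbb{R}} f'^2\,d\mu \tag{4.4.3}
$$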
which is the announced Poincaré inequality for the symmetric exponential measure
μ on R. It is worth pointing out that the significant component of the argument is
that −Lh is bounded from below by a strictly positive constant away from 0 (outside
a compact set containing 0 would have been enough) and that Γ(h) is bounded from
above. This principle will be amplified in Sect. 4.6 below, which deals with the
so-called Lyapunov functions.
One miracle of the previous argument is that the constant 4 in Proposition 4.4.1
and (4.4.3) is optimal. To see why, we examine another consequence of Poincaré
inequalities, namely the integrability properties of Lipschitz functions. Recall the
notion of a Lipschitz function f from Definition 3.3.24, p. 166, with Lipschitz coefficient
$\|f\|_{\mathrm{Lip}} = \|\Gamma(f)\|_\infty^{1/2}$. A function f is said to be 1-Lipschitz if $\|f\|_{\mathrm{Lip}} \le 1$.
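A direct computation, different from the route through exponential integrability followed here, also suggests why the constant 4 cannot be improved in (4.4.2). The sketch below takes dμ = e^{−x} dx on the positive half-line and assumes that the exponentials f_s(x) = e^{sx}, 0 < s < 1/2, are admissible test functions (for instance after truncation), which is not checked here.

```latex
\int_{\mathbb{R}_+} f_s^2\,d\mu = \frac{1}{1-2s}, \qquad
\int_{\mathbb{R}_+} f_s\,d\mu = \frac{1}{1-s}, \qquad
\int_{\mathbb{R}_+} f_s'^2\,d\mu = \frac{s^2}{1-2s},
\qquad\text{so that}\qquad
\frac{\mathrm{Var}_\mu(f_s)}{\int_{\mathbb{R}_+} f_s'^2\,d\mu}
  = \frac{1}{(1-s)^2} \;\longrightarrow\; 4 \quad (s \uparrow \tfrac12).
```

The ratio stays below 4 for every s < 1/2 and approaches it in the limit, so no constant smaller than 4 can hold for all such functions.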
Under a Poincaré inequality for a measure μ, Lipschitz functions are exponentially
integrable. This is the content of the following statement.
$$
\int_E e^{sf}\,d\mu \le e^{s \int_E f\,d\mu} \prod_{\ell = 0}^{\infty} \Bigl(1 - \frac{C s^2}{4^{\ell + 1}}\Bigr)^{-2^\ell}. \tag{4.4.4}
$$
It is enough to establish this inequality for a bounded function f (indeed one can
work with fk = (f ∧ k) ∨ (−k), k ∈ N, for which Γ(fk ) ≤ 1 since Γ(ψ(f )) ≤ Γ(f )
for any contraction ψ : R → R, and then let k → ∞ by Fatou’s lemma). Apply the
Poincaré inequality P (C) to $g = e^{sf/2}$, s ∈ R. Since
$$
\mathcal{E}(g) = \frac{s^2}{4} \int_E e^{sf}\,\Gamma(f)\,d\mu \le \frac{s^2}{4} \int_E e^{sf}\,d\mu,
$$
4.4 The Example of the Exponential Measure, and Exponential Integrability
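$$
\Bigl(1 - \frac{C s^2}{4}\Bigr) Z(s) \le Z\Bigl(\frac{s}{2}\Bigr)^2.
$$
If $1 - \frac{C s^2}{4} > 0$, then
$$
Z(s) \le \Bigl(1 - \frac{C s^2}{4}\Bigr)^{-1} Z\Bigl(\frac{s}{2}\Bigr)^2.
$$
Iterating the procedure by replacing s by $\frac{s}{2}$ yields the claim since
$\lim_{\ell \to \infty} Z\bigl(\frac{s}{2^\ell}\bigr)^{2^\ell} = e^{s \int_E f\,d\mu}$.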
To reach the full conclusion of the statement, it remains to show that every
1-Lipschitz function f is indeed integrable (provided it is finite μ-almost everywhere).
To this end, apply the Poincaré inequality to fk = |f | ∧ k for every k ∈ N.
In particular, since Γ(fk ) ≤ Γ(|f |) ≤ Γ(f ) ≤ 1,
$$
\int_E \Bigl(f_k - \int_E f_k\,d\mu\Bigr)^2 d\mu \le c_1
$$
Let c2 > 0 be large enough so that $\mu(|f| \ge c_2) < \frac{1}{2}$ (since f is μ-almost everywhere
finite). Hence, $\mu(|f_k| \ge c_2) < \frac{1}{2}$ for every k. Now, if two measurable sets
S1 and S2 are such that $\mu(S_1) \ge \frac{1}{2}$ and $\mu(S_2) > \frac{1}{2}$, then μ(S1 ∩ S2 ) > 0, and in
particular S1 ∩ S2 ≠ ∅. Therefore, for every k ∈ N, there exists an x ∈ E such that
$|f_k(x) - \int_E f_k\,d\mu| < \sqrt{2 c_1}$ and |fk (x)| < c2 , and hence
$$
\int_E f_k\,d\mu < \sqrt{2 c_1} + c_2.
$$
$$
\int_E e^{s (f - \int_E f\,d\mu)}\,d\mu \le \frac{2 + s\sqrt{C}}{2 - s\sqrt{C}}. \tag{4.4.5}
$$
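To get a feeling for the sizes involved, one may compare numerically the infinite product Π_{ℓ≥0} (1 − Cs²/4^{ℓ+1})^{−2^ℓ} appearing in (4.4.4) with the closed form (2 + s√C)/(2 − s√C) of (4.4.5), for s < 2/√C. The sketch below is an illustration only: the function names are hypothetical, the product is truncated after finitely many factors, and the value C = 4 (the constant of the exponential measure) is just one choice.

```python
import math


def product_bound(s, c, terms=60):
    """Truncated product prod_{l < terms} (1 - c s^2 / 4^(l+1))^(-2^l)."""
    if c * s * s >= 4.0:
        raise ValueError("the product requires s < 2 / sqrt(C)")
    log_total = 0.0
    for level in range(terms):
        # log1p keeps accuracy when c s^2 / 4^(level+1) is tiny.
        log_total -= (2.0 ** level) * math.log1p(-c * s * s / 4.0 ** (level + 1))
    return math.exp(log_total)


def closed_form_bound(s, c):
    """The quantity (2 + s sqrt(C)) / (2 - s sqrt(C))."""
    root = s * math.sqrt(c)
    return (2.0 + root) / (2.0 - root)


poincare_constant = 4.0  # hypothetical choice, e.g. the exponential measure
comparison = [
    (s, product_bound(s, poincare_constant), closed_form_bound(s, poincare_constant))
    for s in (0.1, 0.5, 0.9, 0.99)
]
for s, product, closed in comparison:
    print(f"s={s:4.2f}  product={product:12.6f}  closed form={closed:12.6f}")
```

On these sample values the truncated product stays below the closed form, and both blow up as s approaches 2/√C.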
By the Poincaré inequality P (C) for μ and the change of variables formula,
$$
\mathrm{Var}_\mu(g \circ F) \le C \int_E \Gamma(g \circ F)\,d\mu
= C \int_E g'^2 \circ F\;\Gamma(F)\,d\mu
\le C \int_E g'^2 \circ F\,d\mu = C \int_{\mathbb{R}} g'^2\,d\mu_F,
$$
$$
F(x) = \frac{1}{\sqrt{n}} \bigl(f(x_1) + \cdots + f(x_n)\bigr), \quad x = (x_1, \ldots, x_n) \in \mathbb{R}^n,
$$
$$
\mathbb{P}\bigl(F(Z) \ge \mathbb{E}(F(Z)) + r\bigr) \le 3\,e^{-r/\sqrt{C}}
$$
Remark 4.4.5 One may wonder how far the exponential integrability of Proposi-
tion 4.4.2, or the tail estimate (4.4.6), are from a Poincaré inequality. The Mucken-
houpt characterization presented in the next section clearly indicates that they are
not sufficient in general to entail a Poincaré inequality. However, under a curvature
condition CD(0, ∞), the concentration bound (4.4.6) surprisingly turns out to
imply in return a Poincaré inequality P (C′), which moreover has proportional constants.
This will be analyzed in Sect. 8.7, p. 425, on the basis of the local heat kernel
inequalities of Sect. 4.7 below (in particular their reverse forms).
This section describes a criterion for a measure μ on the real line R to satisfy a
Poincaré inequality with respect to the usual carré du champ operator $\Gamma(f) = f'^2$
(recall that it is always possible to reduce to this case by a suitable change of vari-
ables, cf. Sect. 2.6, p. 97). This criterion, going back to the work of B. Muckenhoupt,
yields a useful necessary and sufficient condition for the Poincaré inequality to hold
(although it is not always completely straightforward to check). We only state it for
measures absolutely continuous with respect to the Lebesgue measure (see however
below). The Rn case will be treated in Chap. 8 in terms of more general measure-
capacity inequalities (which, however, are less useful in practice).
and
$$
B_- = \sup_{x < m}\; \mu\bigl((-\infty, x]\bigr) \int_x^m \frac{1}{p(t)}\,dt.
$$
Then, a necessary and sufficient condition in order that μ satisfies a Poincaré
inequality P (C) is that B = max(B+ , B− ) < ∞. Moreover, B ≤ 2C ≤ 8B where C is
the Poincaré constant of μ.
It should be mentioned that whenever B < ∞, the median is unique, since the
measure of any interval included in the support is strictly positive. A careful look
at the criterion indicates that in order for it to be satisfied, $\frac{1}{p}$ has to be locally
integrable on the support of μ with respect to the Lebesgue measure. Hence the
measure μ should not have “holes” in its support. For example, it is immediately
checked that whenever p = 0 on some interval [x0 , x1 ] with m ≤ x0 < x1 such that
μ([x1 , +∞)) > 0, or when at some point x0 , $p(x) \sim |x - x_0|$, then the condition is not
satisfied. It is a good exercise to check that the measure on R with density $c_\alpha e^{-|x|^\alpha}$
with respect to the Lebesgue measure satisfies the Muckenhoupt criterion if and
only if α ≥ 1.
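As a quick worked instance of the criterion, consider the symmetric exponential measure with density p(t) = ½ e^{−|t|} and median m = 0. The computation below is a sketch which assumes that B+ is defined symmetrically to B−, namely B+ = sup_{x>m} μ([x, +∞)) ∫_m^x dt/p(t), as the proof of the theorem suggests.

```latex
\mu\bigl([x,+\infty)\bigr) = \tfrac12\,e^{-x}, \qquad
\int_0^x \frac{dt}{p(t)} = 2\,\bigl(e^{x}-1\bigr), \qquad
B_+ = \sup_{x>0}\,\bigl(1 - e^{-x}\bigr) = 1 = B_- .
```

Hence B = 1, and the two-sided estimate B ≤ 2C ≤ 8B gives 1/2 ≤ C ≤ 4, in agreement with the constant 4 obtained in (4.4.3).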
It is not necessary for a probability measure μ to have a density with respect
to the Lebesgue measure in order to satisfy a Poincaré inequality (use for example
the contraction Proposition 4.4.4 with a Lipschitz map which is constant on some
interval of positive measure). In fact, Theorem 4.5.1 may be extended and stated
without any change by taking p to be the density of the absolutely continuous part
of μ. For simplicity, and to make the exposition of the proof a little easier, we only
deal with measures with densities, and we assume that the suprema in B+ and B−
are taken over points x in the smallest closed interval which carries μ. The general
case is left to the reader.
4.5 Poincaré Inequalities on the Real Line
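Proof As in the proof of the Poincaré inequality for the exponential measure, it is
enough to bound from above
$$
\int_m^\infty \bigl(f(x) - f(m)\bigr)^2\,d\mu \quad \text{and} \quad \int_{-\infty}^m \bigl(f(x) - f(m)\bigr)^2\,d\mu
$$
by $C\,\mathcal{E}(f) = C \int_{\mathbb{R}} f'^2\,d\mu$. By the Cauchy-Schwarz inequality, for every x > m,
$$
\bigl(f(x) - f(m)\bigr)^2 = \Bigl(\int_m^x f'(t)\,dt\Bigr)^2
\le \int_{\mathbb{R}} f'^2(t)\,p(t)\,dt \int_m^x \frac{1}{p(t)}\,dt = \mathcal{E}(f)\,h_+(x).
$$
Therefore,
$$
\int_m^\infty \bigl(f(x) - f(m)\bigr)^2\,d\mu \le \mathcal{E}(f) \int_m^\infty h_+(x)\,p(x)\,dx.
$$
Together with a similar treatment for the second integral, we immediately obtain a
Poincaré inequality with constant
$$
C = \max\Bigl(\int_m^\infty h_+(x)\,p(x)\,dx,\ \int_{-\infty}^m h_-(x)\,p(x)\,dx\Bigr).
$$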
Proof of Theorem 4.5.1 First, the condition B < ∞ is sufficient for a Poincaré
inequality P (C) to hold. To establish this claim, we proceed as in the proof of
Proposition 4.5.2 to bound $[f(x) - f(m)]^2$ for every x > m. However, we need a
somewhat sharper bound. Set $g(x) = h_+(x)$ and $k(x) = \sqrt{g(x)}\,p(x)$, x > m. Then, by
the Cauchy-Schwarz inequality,
$$
\bigl(f(x) - f(m)\bigr)^2 = \Bigl(\int_m^x f'(t)\,dt\Bigr)^2 \le \int_m^x f'^2(t)\,k(t)\,dt \int_m^x \frac{1}{k(t)}\,dt.
$$
$\int_m^x \frac{1}{k(u)}\,du = 2\sqrt{g(x)}$. The definition of B+ indicates that, for every x > m,
$$
g(x) \le \frac{B_+}{\mu([x, +\infty))}
$$
$$
\frac{p(x)}{\sqrt{\mu([x, +\infty))}} = -2\,r'(x).
$$
But
$$
k(t)\,r(t) = p(t) \sqrt{\mu\bigl([t, +\infty)\bigr)\,h_+(t)} \le \sqrt{B_+}\;p(t),
$$
so that finally
$$
\int_m^\infty \bigl(f(x) - f(m)\bigr)^2\,p(x)\,dx \le 4 B_+ \int_m^\infty f'^2(t)\,p(t)\,dt.
$$
The term $\int_{-\infty}^m p(x)\,[f(x) - f(m)]^2\,dx$ is treated similarly with B− so that the measure
μ indeed satisfies a Poincaré inequality with constant C = 4B.
Turning to the converse, the idea is to apply the Poincaré inequality P (C) to
suitable test functions f . We first replace the Poincaré inequality with the easiest
consequence of it given by Lemma 4.2.3 with A = [m, ∞) where m is the median
of μ (hence $\mu(A) \le \frac{1}{2}$ and $\frac{1}{1 - \mu(A)} \le 2$). Therefore, under P (C), for any function
f ∈ D(E) supported on [m, ∞),
$$
\int_{\mathbb{R}} f^2\,d\mu \le 2C \int_{\mathbb{R}} f'^2\,d\mu.
$$
Then take a function g ∈ L2 (μ), and apply the latter to $f(x) = \int_m^x g(t)\,dt$, x > m. It
follows that
$$
\int_m^\infty \Bigl(\int_m^x g(t)\,dt\Bigr)^2 d\mu(x) \le 2C \int_m^\infty g^2(x)\,d\mu(x).
$$
Fix ε > 0 and r > m, and apply this inequality to the function
$$
g(x) = \frac{1}{p(x) + \varepsilon}\,\mathbf{1}_{[m,r]}(x), \quad x \in \mathbb{R}.
$$
The left-hand side of the previous inequality is therefore bounded from below by
$$
\int_r^\infty \Bigl(\int_m^x g(t)\,dt\Bigr)^2 p(x)\,dx = \Bigl(\int_m^r \frac{1}{p(t) + \varepsilon}\,dt\Bigr)^2 \mu\bigl([r, \infty)\bigr),
$$
strictly positive and bounded density. To understand the meaning of such inequali-
ties (which will be further explored in Sect. 4.10), consider for simplicity a bounded
interval I ⊂ R, with the usual carré du champ operator $\Gamma(f) = f'^2$ and a reference
probability measure μ with a smooth density w bounded from below by some constant
a > 0. As described in Sect. 2.6, p. 97, this situation is related to the analysis
of the Sturm-Liouville operator $Lf = f'' + \frac{w'}{w}\,f'$ acting on smooth functions f on
an interval I ⊂ R. The following statement holds.
Proposition 4.5.3 For the Sturm-Liouville operator $Lf = f'' + \frac{w'}{w}\,f'$ acting on
smooth functions on a bounded interval I ⊂ R where w is smooth and bounded from
below by a constant a > 0, with invariant probability measure dμ = w dx, there is
a constant C > 0 such that, for any smooth function f compactly supported in the
interval I ,
$$
\int_I f^2\,d\mu \le C \int_I f'^2\,d\mu. \tag{4.5.1}
$$
The best constant C in this inequality is $\frac{1}{\lambda_0}$, where λ0 is the lowest eigenvalue of
(the opposite generator) −L, with L the Sturm-Liouville operator with Dirichlet
boundary conditions (functions which vanish at the boundary).
Proof From the analysis performed in Sect. 2.6, p. 97, the operator L defined on the
class A0 of smooth and compactly supported functions in the interval I extends to a
self-adjoint operator L (the Friedrichs extension) which is such that A0 is dense in
the Dirichlet domain D(E) of this extension, corresponding to the Dirichlet bound-
ary conditions. Moreover, the spectrum of (the opposite generator) −L is discrete,
consisting of an infinite increasing sequence of positive eigenvalues (λk )k∈N associ-
ated to eigenvectors (fk )k∈N which form an orthonormal basis of L2 (μ). It requires
a bit of further analysis (not developed here) to see that in fact λ0 is strictly pos-
itive, simple, and corresponds to a positive eigenvector f0 , strictly positive in the
interior of I (λ0 and f0 are the analogues of the Perron-Frobenius eigenvalue and
eigenvector in the context of finite Markov chains, see Sect. 1.9, p. 33).
Now, the standard spectral decomposition picture applies. Namely, if f ∈ L2 (μ)
Therefore,
$$
\mathcal{E}(f) = -\int_I f\,Lf\,d\mu = \sum_{k \in \mathbb{N}} a_k^2\,\lambda_k \ge \lambda_0 \sum_{k \in \mathbb{N}} a_k^2 = \lambda_0 \int_I f^2\,d\mu.
$$
Inequality (4.5.1) of Proposition 4.5.3 follows with $C = \frac{1}{\lambda_0}$, where $\frac{1}{\lambda_0}$ is seen to
be the best possible constant (since f0 saturates the inequality). On the other hand, if (4.5.1)
holds for all f ∈ A0 , then it extends to every function in the Dirichlet domain D(E)
by density. The proof of the proposition is therefore complete.
The best constant C in this inequality is $\frac{1}{\lambda_1}$, where −λ1 is the first non-zero eigenvalue
of the Sturm-Liouville operator L with Neumann boundary conditions (functions
whose derivatives vanish at the boundary).
Proof This relies on the same analysis as that given in Proposition 4.5.3. The only
point to observe is that one may approximate a smooth function f defined in a
neighborhood of I by a sequence of smooth functions with derivatives which vanish
in a neighborhood of the boundaries, in such a way that the corresponding quantities
in the inequality (4.5.2) converge to the quantities for f . At a technical level, the idea
is to first approximate the derivatives. The details are left to the reader.
It is worth mentioning that there are cases when (4.5.2) is valid only for func-
tions which are compactly supported in I . As is illustrated by the next statement,
this situation corresponds to a spectral gap of the operator with periodic bound-
ary conditions. The next statement actually investigates more precisely the different
values of the constants in the preceding two propositions with the example of the
Lebesgue measure on the interval I = [0, 1].
The inequality is optimal and the sharp constant is attained by the function
f (x) = sin(πx).
The inequality is optimal and the sharp constant is attained by the function
f (x) = cos(πx).
(iii) For any smooth function f : [0, 1] → R such that f (0) = f (1),
$$
\mathrm{Var}_{dx}(f) \le \frac{1}{4\pi^2} \int_{[0,1]} f'^2\,dx. \tag{4.5.5}
$$
The inequality is optimal and the sharp constant is attained by functions of the
form f (x) = a cos(2πx) + b sin(2πx) + c where a, b, c are real constants.
Item (i) corresponds to the bottom of the spectrum with Dirichlet boundary con-
ditions (Proposition 4.5.3) while (ii) corresponds to the spectral gap for Neumann
boundary conditions (Proposition 4.5.4). Property (iii) corresponds to an estimate
on the spectral gap for periodic functions and is usually referred to as Wirtinger’s
inequality in the literature (see the Notes and References).
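The three spectral values behind (i), (ii) and (iii) can be approximated with finite differences. The sketch below is illustrative only: the three matrix constructions are standard discretizations of −d²/dx² on [0, 1] chosen here as an assumption (they are not taken from the text), and the expected constants 1/π² for the Dirichlet and Neumann cases are those suggested by the extremal functions sin(πx) and cos(πx).

```python
import numpy as np


def second_difference(n):
    """Unscaled tridiagonal matrix with 2 on the diagonal and -1 off the diagonal."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def dirichlet_laplacian(n):
    """-d^2/dx^2 on (0, 1) with zero boundary values, n interior grid points."""
    h = 1.0 / (n + 1)
    return second_difference(n) / h**2


def neumann_laplacian(n):
    """-d^2/dx^2 on [0, 1] with zero boundary derivative, n cells of width 1/n."""
    h = 1.0 / n
    lap = second_difference(n)
    lap[0, 0] = lap[-1, -1] = 1.0
    return lap / h**2


def periodic_laplacian(n):
    """-d^2/dx^2 on the circle of length 1, n grid points."""
    h = 1.0 / n
    lap = second_difference(n)
    lap[0, -1] = lap[-1, 0] = -1.0
    return lap / h**2


n = 400
lambda0_dirichlet = float(np.linalg.eigvalsh(dirichlet_laplacian(n))[0])
lambda1_neumann = float(np.linalg.eigvalsh(neumann_laplacian(n))[1])
lambda1_periodic = float(np.linalg.eigvalsh(periodic_laplacian(n))[1])

print("Dirichlet constant:", 1.0 / lambda0_dirichlet, " vs 1/pi^2 =", 1.0 / np.pi**2)
print("Neumann constant:  ", 1.0 / lambda1_neumann, " vs 1/pi^2 =", 1.0 / np.pi**2)
print("Periodic constant: ", 1.0 / lambda1_periodic, " vs 1/(4 pi^2) =", 0.25 / np.pi**2)
```

For the Neumann and periodic matrices the index 1 is used because the eigenvalue 0 (constant functions) comes first; in the Dirichlet case the lowest eigenvalue is already strictly positive.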
Proof The assertion (iii) is proved directly using a Fourier decomposition of a periodic
function similar to the Hermite expansion in Sect. 4.1. Namely if
$$
f(x) = a_0 + \sum_{k \ge 1} \bigl[a_k \cos(2\pi k x) + b_k \sin(2\pi k x)\bigr]
$$
(take if necessary a finite sum to start with), then assuming that $a_0 = \int_{[0,1]} f\,dx = 0$,
$$
\int_{[0,1]} f^2\,dx = \frac{1}{2} \sum_{k \ge 1} \bigl(a_k^2 + b_k^2\bigr) \quad \text{and} \quad \int_{[0,1]} f'^2\,dx = 2\pi^2 \sum_{k \ge 1} k^2 \bigl(a_k^2 + b_k^2\bigr).
$$
The conclusion immediately follows. Note that any smooth function f on [0, 1]
such that f (0) = f (1) may be extended to a continuous 1-periodic function. While
it may happen that this extension is not C 1 , it may be approximated by a se-
quence of smooth periodic functions such that each term in the inequality con-
verges to the corresponding one for f . Therefore, the optimal (Poincaré) constant
in (4.5.5) is also the optimal constant in the Poincaré inequality on the interval [0, 1]
over the class of smooth 1-periodic functions. The second set (ii) of inequalities,
without any boundary condition, appears as a consequence of (iii) by symmetrization
and periodization (for $f : [0, 1] \to \mathbb{R}$ arbitrary, define $g : [-1, +1] \to \mathbb{R}$ by
$g(x) = f(x)$ for $x \in [0, +1]$, $g(x) = f(-x)$ for $x \in [-1, 0]$, and apply (iii) to g on
the interval $[-1, +1]$ after re-scaling, which is legitimate since $g(-1) = g(+1)$).
Finally (i) is a consequence of (iii) by anti-symmetrization and periodization (for
$f : [0, 1] \to \mathbb{R}$ such that $f(0) = f(1) = 0$, define $g : [-1, +1] \to \mathbb{R}$ by
$g(x) = f(x)$ for $x \in [0, +1]$, $g(x) = -f(-x)$ for $x \in [-1, 0]$).
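As a numerical illustration of (iii) and of the Fourier computation used in its proof, the following Python sketch compares both sides of (4.5.5) for two trigonometric polynomials. The helper names `integrate` and `wirtinger_sides` and the midpoint quadrature are illustrative choices, not part of the text.

```python
import math

def integrate(g, n=20000):
    """Midpoint rule for the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def wirtinger_sides(f, df):
    """Return (Var_dx(f), (1 / (4 pi^2)) * integral of f'^2) on [0, 1]."""
    mean = integrate(f)
    variance = integrate(lambda x: (f(x) - mean) ** 2)
    energy = integrate(lambda x: df(x) ** 2)
    return variance, energy / (4.0 * math.pi ** 2)

TWO_PI = 2.0 * math.pi

# Extremal case f = a cos(2 pi x) + b sin(2 pi x) + c: both sides coincide.
a, b, c = 3.0, -2.0, 5.0
extremal = wirtinger_sides(
    lambda x: a * math.cos(TWO_PI * x) + b * math.sin(TWO_PI * x) + c,
    lambda x: TWO_PI * (-a * math.sin(TWO_PI * x) + b * math.cos(TWO_PI * x)),
)

# Adding a mode of frequency 2 makes the inequality strict.
strict = wirtinger_sides(
    lambda x: math.cos(TWO_PI * x) + math.sin(2 * TWO_PI * x),
    lambda x: TWO_PI * (-math.sin(TWO_PI * x) + 2 * math.cos(2 * TWO_PI * x)),
)
```

In the first case both sides equal $(a^2 + b^2)/2$, in agreement with the two Fourier identities of the proof; in the second the factor $k^2 = 4$ on the higher mode opens a gap between the two sides.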
4.6 The Lyapunov Function Method
The main principle is the following. In general, on Rn for example, it is not that
difficult to reach Poincaré inequalities on compact subsets, balls for example, by
comparison with known measures or gradients. Accordingly, we say that a Markov
Triple (E, μ, Γ), where μ is not necessarily a probability measure, with underlying
algebra $\mathcal{A}_0$, satisfies a local Poincaré inequality on a measurable subset K with
$0 < \mu(K) < \infty$ of the state space E if for some constant $C_K > 0$ and every function
f in $\mathcal{A}_0$,
\[
\int_K (f - m_K)^2 \, d\mu \le C_K \int_K \Gamma(f) \, d\mu \tag{4.6.1}
\]
where $m_K = \frac{1}{\mu(K)} \int_K f \, d\mu$. In other words, this is the Poincaré inequality with
respect to the carré du champ operator Γ for functions restricted to K and with respect
to the measure μ restricted to K and normalized to be a probability measure. Actually,
only a much weaker form of (4.6.1) will be used below, namely that for some
(measurable) set $L \supset K$ and any $f \in \mathcal{A}_0$,
\[
\int_K (f - m_K)^2 \, d\mu \le C_{K,L} \int_L \Gamma(f) \, d\mu \tag{4.6.2}
\]
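for some m = m(f) to be chosen later. Using the Lyapunov function J, write
\[
\int_E (f - m)^2 \, d\mu \le - \frac{1}{\lambda} \int_E \frac{LJ}{J} \, (f - m)^2 \, d\mu
+ b \int_K (f - m)^2 \, d\mu. \tag{4.6.3}
\]
With the choice $m = m_K$, the second term on the right-hand side
is then bounded from above thanks to (4.6.2) by $b \, C_{K,L} \int_L \Gamma(f) \, d\mu$. In order to deal
with the first term, we show that for every $g \in \mathcal{A}_0$,
\[
- \int_E \frac{LJ}{J} \, g^2 \, d\mu \le \int_E \Gamma(g) \, d\mu. \tag{4.6.4}
\]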
The proof is then completed provided (4.6.4) and (4.6.5) may be applied to
g = f − m, for which Γ(g) = Γ(f). However, g is not in general in $\mathcal{A}_0$ even if
f is (recall that A0 stands in general for the set of smooth compactly supported
functions). To overcome this difficulty, use the completeness property and replace
g by g ζk , k ∈ N, where ζk are the functions used to define completeness and then
pass to the limit as k → ∞. In this way, Theorem 4.6.2 is established.
One may wonder whether the condition J ≥ 1 has indeed been used in the pre-
ceding proof. First, since the hypotheses in Theorem 4.6.2 are unchanged when J
is replaced by cJ , the preceding condition essentially amounts to the fact that J
is bounded from below by a strictly positive constant. This condition then ensures
that if we divide by J the resulting function $g^2/J$ will be in $\mathcal{A}$. Moreover, the
completeness assumption is used only to justify the integration by parts formula (4.6.5).
This formula may hold without it, for example as soon as $L(\log J)$ and $\Gamma(\log J)$ are
in $L^1(\mu)$ and $\int_E L(\log J) \, d\mu = 0$. Following Proposition 3.3.6, p. 156, it would be
enough for this purpose that $L(\log J) \in L^2(\mu)$.
The preceding Lyapunov criterion (Theorem 4.6.2) admits a number of vari-
ations, under somewhat different hypotheses and conclusions. Note for example
that it may be applied to the standard Gaussian measure on $\mathbb{R}^n$ with the choice of
$J = 1 + |x|^2$ and K some Euclidean ball. The resulting Poincaré constant is however
far from optimal (in particular it depends on the dimension). The criterion also
applies to the exponential measure with for example $J = e^{c|x|}$ for some sufficiently
small c > 0. Comparing with the previous (and optimal) proof of the Poincaré
inequality for the exponential measure given in Proposition 4.4.1, setting $J = e^{ch}$, the
hypothesis on J may be deduced from a uniform bound on Γ(h), an upper bound on
−Lh outside a compact set and a proper choice of the parameter c. The Lyapunov
hypothesis on J is then more general, although the proof given earlier, following
the same lines, was more precise.
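For the Gaussian example just mentioned, the computation behind the choice $J = 1 + |x|^2$ can be made explicit. The sketch below assumes the Ornstein-Uhlenbeck generator $L = \Delta - x \cdot \nabla$ associated with the standard Gaussian measure, for which $LJ = 2n - 2|x|^2$, and assumes that the Lyapunov hypothesis is used in the form $-LJ/J \ge \lambda$ outside a ball K (the precise statement of Theorem 4.6.2 should be consulted). The function names are illustrative.

```python
import math

def lyapunov_ratio(r, n):
    """-LJ/J at |x| = r for J = 1 + |x|^2 and L = Laplacian - x.grad on R^n.

    Here LJ = 2n - 2|x|^2, so that -LJ/J = (2 r^2 - 2n) / (1 + r^2).
    """
    return (2.0 * r * r - 2.0 * n) / (1.0 + r * r)

def ball_radius(lam, n):
    """Smallest R such that -LJ/J >= lam whenever |x| >= R (needs 0 < lam < 2)."""
    if not 0.0 < lam < 2.0:
        raise ValueError("lam must lie in (0, 2)")
    return math.sqrt((2.0 * n + lam) / (2.0 - lam))

# Radius of the ball K for lam = 1 in several dimensions.
radii = {n: ball_radius(1.0, n) for n in (1, 10, 100)}
```

The radius of K grows like $\sqrt{n}$, which is consistent with the dimension dependence of the resulting constant noted above.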
A nice and powerful illustration of the Lyapunov function method is the following
statement for log-concave measures. A probability measure $d\mu = e^{-W} dx$ on the
Borel sets of $\mathbb{R}^n$ such that W is a smooth convex function is called log-concave.
function in the preceding sense amounts to showing that for a convex function W
such that $\int_{\mathbb{R}^n} e^{-W} dx < \infty$,
\[
\liminf_{|x| \to \infty} \frac{x \cdot \nabla W}{|x|} > 0,
\]
Note for further purposes that the dependence of the Poincaré constant on the
dimension n provided by the preceding proof is quite poor.
Theorem 4.6.3 may be applied to the uniform measure on a convex body in Rn .
Indeed let K be convex and compact with non-empty interior in Rn and consider the
convex function W (or rather a smooth approximation of it) defined by W (x) = 1 if
x ∈ K and W(x) = ∞ otherwise. The uniform normalized Lebesgue measure $\mu_K$
on K thus satisfies a Poincaré inequality
\[
\mathrm{Var}_{\mu_K}(f) \le C \int_K |\nabla f|^2 \, d\mu_K \tag{4.6.6}
\]
for some C > 0 and all smooth functions $f : K \to \mathbb{R}$. The classical Payne-Weinberger
Theorem, relying on a refined geometric analysis of the spectral decomposition of the
Laplace operator on K, describes the optimal estimate
\[
C \le \frac{D^2}{\pi^2} \tag{4.6.7}
\]
in terms (only) of the diameter D of K. This inequality is optimal (equality holds for
an interval in dimension one, and the bound is approached in higher dimension by thin
convex domains collapsing onto a segment), and includes in particular (4.5.4) in
dimension one.
Even if (4.6.7) is optimal, the constant depends on the diameter and this estimate
cannot be used to prove a Poincaré inequality for general log-concave measures
(Theorem 4.6.3). Observe furthermore that (4.6.7) is false in general for non-convex
domains. A compact connected domain K in Rn still satisfies a Poincaré inequal-
ity (4.6.6) but the constant may depend on more than only the diameter (think for
example of a dumbbell).
\[
C_{K \cup L} \le \frac{\mu(K \cup L)}{\mu(K \cap L)} \, \max(C_K, C_L).
\]
Here, the Poincaré inequalities (4.6.1) are assumed to hold on some suitable class
A1 of functions on E, not necessarily A0 . The proof relies on the following lemma,
which is of independent interest.
\[
\frac{1}{1-b} \, (x + z)^2 + \frac{1}{1-a} \, (y + z)^2 = \frac{y^2}{1-b} + \frac{x^2}{1-a}.
\]
After some tedious details, it appears that the supremum of $\frac{y^2}{1-b} + \frac{x^2}{1-a}$ on the set
$\frac{x^2}{a} + \frac{(x+y)^2}{1-a-b} + \frac{y^2}{b} \le 1$ is bounded from above by $a + b = 1 - \mu(K \cap L)$ which yields
the claim. This bound is actually not optimal, but the optimal bound (given a and b)
has a less pleasant form. The argument is similar if either a = 0 or b = 0.
for all functions f (in some class $\mathcal{A}_1$). To this end, assuming $\int_{K \cup L} f \, d\mu = 0$, simply
write
\[
\int_{K \cup L} f^2 \, d\mu \le \int_K f^2 \, d\mu + \int_L f^2 \, d\mu
\]
and use the definition of $C_K$ and $C_L$ together with the lemma.
4.7 Local Poincaré Inequalities
In the framework of this chapter, this section deals with Poincaré inequalities not for
the invariant measure μ but for the heat kernel measures $p_t(x, dy)$, t ≥ 0, x ∈ E, of
the Markov semigroup $P = (P_t)_{t \ge 0}$ (from the representation
\[
P_t f(x) = \int_E f(y) \, p_t(x, dy)
\]
of (1.2.4), p. 12) with respect to the carré du champ operator Γ. These inequalities
along the semigroup will be called local with reference to the (initial) point x and
the time variable t. (In particular, there is no need here to speak of an invariant or
reversible measure.) These are important illustrations of semigroup monotonicity
based on the Duhamel interpolation principle.
The inequalities investigated here are close in spirit to the local Poincaré inequal-
ities (4.6.1) where the measure restricted to a set K is replaced by the heat kernel
measure pt (x, dy) around a point x (highly concentrated around this point, at least
for small t’s), but play a different and fundamental role. Indeed, as will be shown
here, these inequalities are in general easier to reach, and provide a lot of important
information on the behavior of the semigroup. They are moreover a powerful tool
in the study of partial differential equations.
The local heat kernel inequalities will be presented and established for functions
in the algebra $\mathcal{A}_0$ of the underlying Markov Triple (E, μ, Γ). Depending on the
context, they may then be extended to any function in $\mathcal{D}(\mathcal{E})$ or $\mathcal{D}(L)$, or even to any
bounded measurable function. These extensions shall not be discussed below and
are left to the reader.
To better introduce these ideas, we first start with the example of the Ornstein-
Uhlenbeck semigroup $P = (P_t)_{t \ge 0}$ presented in Sect. 2.7.1, p. 103, whose generator,
on $\mathbb{R}^n$, is given by $Lf = \Delta f - x \cdot \nabla f$ with carré du champ operator $\Gamma(f) = |\nabla f|^2$
(on an algebra $\mathcal{A}_0$ of smooth functions). In this simple and concrete example, both
the semigroup $(P_t)_{t \ge 0}$ and the kernels $p_t(x, dy)$ are explicit. In particular, the
representation (2.7.3), p. 104, yields the commutation property $\nabla P_t f = e^{-t} P_t(\nabla f)$
and therefore
\[
\Gamma(P_t f) \le e^{-2t} P_t\big(\Gamma(f)\big) \tag{4.7.1}
\]
(where we used the fact that, by convexity, $(P_t h)^2 \le P_t(h^2)$).
Hence the gradient of $P_t$ decays exponentially in t. This property has interesting
consequences via the Duhamel-type formula (3.1.21), p. 131. Let us recall the
principle. For fixed t > 0, and for a function f in $\mathcal{A}_0$, consider
\[
\Lambda(s) = P_s\big((P_{t-s} f)^2\big), \quad s \in [0, t]
\]
(as usual at almost any point x ∈ E, omitted everywhere). Taking the derivative of
Λ yields
\[
\Lambda'(s) = P_s\big(L (P_{t-s} f)^2 - 2 P_{t-s} f \, L P_{t-s} f\big) = 2 P_s\big(\Gamma(P_{t-s} f)\big).
\]
On the basis of this interpolation formula, recall the commutation relation (4.7.1)
\[
\Gamma(P_{t-s} f) \le e^{-2(t-s)} P_{t-s}\big(\Gamma(f)\big).
\]
As a consequence, since $P_t(f^2) - (P_t f)^2 = \Lambda(t) - \Lambda(0) = \int_0^t \Lambda'(s) \, ds$,
\[
P_t(f^2) - (P_t f)^2 \le 2 P_t\big(\Gamma(f)\big) \int_0^t e^{-2(t-s)} \, ds
= \big(1 - e^{-2t}\big) P_t\big(\Gamma(f)\big). \tag{4.7.2}
\]
This inequality is a Poincaré inequality P(C), with constant $C = 1 - e^{-2t}$, not for
the invariant (Gaussian) measure μ, but for the heat kernel measure $p_t(x, dy)$ of the
representation $P_t f(x) = \int_{\mathbb{R}^n} f(y) \, p_t(x, dy)$, for any x ∈ E. However, as t → ∞,
we recover the Poincaré inequality of Proposition 4.1.1 since pt (x, dy) converges
to μ(dy). Recall that the initial point x is implicit in (4.7.2) (as is the case for all of
the local inequalities considered here).
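The inequality (4.7.2) can be checked numerically in dimension one. The sketch below assumes the Mehler form of the Ornstein-Uhlenbeck semigroup, $P_t f(x) = \mathbb{E} f(e^{-t} x + \sqrt{1 - e^{-2t}} \, Z)$ with Z standard normal (this is how the representation (2.7.3) is understood here), and uses a plain midpoint quadrature. All function names are illustrative.

```python
import math

def gauss_expect(h, lo=-8.0, hi=8.0, n=4000):
    """Approximate E[h(Z)], Z standard normal, by the midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * step
        total += h(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

def ou_local_poincare(f, df, x, t):
    """Return both sides of (4.7.2) at the point x, in dimension one.

    Left:  P_t(f^2)(x) - (P_t f(x))^2
    Right: (1 - exp(-2t)) * P_t(f'^2)(x)
    """
    m = math.exp(-t) * x                       # mean of the kernel p_t(x, dy)
    s = math.sqrt(1.0 - math.exp(-2.0 * t))    # its standard deviation
    ptf = gauss_expect(lambda z: f(m + s * z))
    ptf2 = gauss_expect(lambda z: f(m + s * z) ** 2)
    pt_gamma = gauss_expect(lambda z: df(m + s * z) ** 2)
    return ptf2 - ptf ** 2, (1.0 - math.exp(-2.0 * t)) * pt_gamma

# Example with f(y) = y^2 at x = 1, t = 1/2.
lhs, rhs = ou_local_poincare(lambda y: y * y, lambda y: 2.0 * y, x=1.0, t=0.5)
```

For $f(y) = y^2$ both sides are available in closed form: with $m = e^{-t} x$ and $\sigma^2 = 1 - e^{-2t}$, the left-hand side is $4 m^2 \sigma^2 + 2 \sigma^4$ and the right-hand side is $4 m^2 \sigma^2 + 4 \sigma^4$.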
The preceding argument via (4.7.1) may also be used to establish a converse
inequality. Indeed, by (4.7.1) again, for every s ≥ 0 and every g ∈ A0 ,
\[
P_s\big(\Gamma(g)\big) \ge e^{2s} \, \Gamma(P_s g).
\]
As for (4.7.2), and its reverse form (4.7.3), we deduce that, for every t ≥ 0,
\[
2t \, \Gamma(P_t f) \le P_t(f^2) - (P_t f)^2 \le 2t \, P_t\big(\Gamma(f)\big). \tag{4.7.4}
\]
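A worked instance may help to read (4.7.4). It is only a sketch under the assumption that (4.7.4) refers to the heat semigroup on $\mathbb{R}$ with generator $L = \Delta$, so that $p_t(x, dy)$ is the Gaussian measure with mean x and variance 2t, applied to the test function $f(y) = y^2$.

```latex
% Assumption: heat semigroup on R with generator L = \Delta, that is
% P_t f(x) = E f(x + \sqrt{2t} Z) with Z standard normal, and \Gamma(f) = f'^2.
% Test function: f(y) = y^2.
\begin{align*}
P_t f(x) &= x^2 + 2t, \qquad P_t(f^2)(x) = x^4 + 12\, t x^2 + 12\, t^2, \\
P_t(f^2) - (P_t f)^2 &= 8\, t x^2 + 8\, t^2, \\
2t \, \Gamma(P_t f) &= 8\, t x^2, \qquad 2t \, P_t\big(\Gamma(f)\big) = 8\, t x^2 + 16\, t^2 .
\end{align*}
```

The three quantities are thus ordered as $8tx^2 \le 8tx^2 + 8t^2 \le 8tx^2 + 16t^2$, with gaps of order $t^2$ on both sides.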
Remark 4.7.1 Since the kernels pt (x, dy) are Gaussian measures, the preceding
Poincaré inequalities (4.7.4) and (4.7.2) are just variations (with different means and
variances) of the Poincaré inequality for the standard Gaussian measure (Proposi-
tion 4.1.1) applied to functions y → f (x + ty). The approach is however completely
different, emphasizing interpolation along the semigroup, the commutation relations
and the connection with the $\Gamma_2$ operator and curvature conditions.
\[
P_t(f^2) - (P_t f)^2 \ge \frac{e^{2\rho t} - 1}{\rho} \, \Gamma(P_t f). \tag{4.7.6}
\]
Proof As mentioned before, the implication from (i) to (ii) is the content of the gra-
dient bound of Corollary 3.3.19, p. 163. The proof from the gradient bound (ii) to
the local Poincaré inequalities (iii) and (iv) has been discussed above for the Brow-
nian and Ornstein-Uhlenbeck semigroups and is similar here for any value of ρ. In
the same way, the local Poincaré inequality (iii) and its reverse form (iv) yield in
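the limit as t → 0 the curvature condition CD(ρ, ∞). Indeed, using a second order
Taylor expansion
\[
P_t h = h + t \, Lh + \frac{t^2}{2} \, L^2 h + o(t^2)
\]
on (iii) for example, the left-hand side is given by
\[
t \, L(f^2) + \frac{t^2}{2} \, L^2(f^2) - t^2 (Lf)^2 - 2t \, f Lf - t^2 f L^2 f + o(t^2)
\]
while the right-hand side reads
\[
2t \, \Gamma(f) - 2\rho \, t^2 \, \Gamma(f) + 2 t^2 \, L\Gamma(f) + o(t^2).
\]
Since $L(f^2) - 2 f Lf = 2\Gamma(f)$, the terms of order t cancel, and the terms of order
$t^2$ yield
\[
\frac{1}{2} \, L^2(f^2) - (Lf)^2 - f L^2 f \le -2\rho \, \Gamma(f) + 2 \, L\Gamma(f).
\]
Developing further the chain rule, the left-hand side is equal to $L\Gamma(f) + 2\Gamma(f, Lf)$,
and we obtain $\Gamma_2(f) \ge \rho \, \Gamma(f)$, that is, the curvature condition CD(ρ, ∞).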
Note that in general little is known about the heat kernel measures pt (x, dy),
t ≥ 0, x ∈ E. However, under a curvature condition, Theorem 4.7.2 ensures that
they satisfy Poincaré-type inequalities, and thus for example, by Proposition 4.4.2,
have exponential tails. On the other hand, as mentioned in the Ornstein-Uhlenbeck
example, the reverse Poincaré inequalities (4.7.6) of Theorem 4.7.2 provide a useful
quantitative regularization property in the form of
\[
\Gamma(P_t f) \le \frac{\rho}{e^{2\rho t} - 1} \tag{4.7.7}
\]
for any bounded measurable function f such that |f| ≤ 1 and any t > 0 (cf. Proposition
8.6.1, p. 422, for a somewhat sharper bound). In particular, $\Gamma(P_t f)^{1/2} = O(t^{-1/2})$
as t → 0.
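The small-time behaviour of the bound (4.7.7) is easy to tabulate. The following sketch only evaluates the right-hand side $\rho / (e^{2\rho t} - 1)$; the function name and the treatment of ρ = 0 as the limiting value 1/(2t) are illustrative conventions.

```python
import math

def gradient_bound(rho, t):
    """Right-hand side rho / (exp(2 rho t) - 1) of (4.7.7), for t > 0.

    The value at rho = 0 is understood as the limit 1 / (2 t).
    """
    if t <= 0.0:
        raise ValueError("t must be positive")
    if rho == 0.0:
        return 1.0 / (2.0 * t)
    # expm1 keeps precision when 2 * rho * t is small.
    return rho / math.expm1(2.0 * rho * t)

# 2t * bound tends to 1 as t -> 0, i.e. the bound behaves like 1 / (2t).
small_t = [(t, 2.0 * t * gradient_bound(1.0, t)) for t in (1e-1, 1e-2, 1e-3)]
```

The bound therefore blows up like 1/(2t) for small t whatever the sign of ρ, while for ρ > 0 it decays exponentially as t grows.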
Remark 4.7.3 It is worth mentioning that the implications from (ii) to (iii) and
(iv) of Theorem 4.7.2 work similarly if (ii) only holds up to some constant K ≥ 1,
that is, for every t ≥ 0 and all functions f in $\mathcal{A}_0$,
\[
\Gamma(P_t f) \le K \, e^{-2\rho t} \, P_t\big(\Gamma(f)\big).
\]
The proof is entirely similar, and (iii), for example, then simply reads
\[
P_t(f^2) - (P_t f)^2 \le \frac{K (1 - e^{-2\rho t})}{\rho} \, P_t\big(\Gamma(f)\big).
\]
Remark 4.7.4 The preceding proof shows that each of the assertions of Theorem
4.7.2 is equivalent to the statement that there exist a $t_0 > 0$ and a function
$c(t) = 2t - 2\rho \, t^2 + o(t^2)$ such that, for any $t \in (0, t_0)$ and any f in $\mathcal{A}_0$,
\[
P_t(f^2) - (P_t f)^2 \le c(t) \, P_t\big(\Gamma(f)\big)
\]
(respectively
\[
P_t(f^2) - (P_t f)^2 \ge c(t) \, \Gamma(P_t f)
\]
with $c(t) = 2t + 2\rho \, t^2 + o(t^2)$).
4.8 Poincaré Inequalities Under a Curvature-Dimension Condition
The preceding section describes Poincaré inequalities under the curvature condition
CD(ρ, ∞) for the heat kernel measures $p_t(x, dy)$ of a Markov semigroup $(P_t)_{t \ge 0}$.
In the well behaved cases, typically when ρ > 0 (as for the Ornstein-Uhlenbeck
semigroup discussed above), these inequalities extend as t → ∞ to inequalities for
the invariant measure μ as a basic instance of ergodicity as emphasized earlier in
Sect. 1.8, p. 32. In particular, as discussed there, only the minimal connexity
condition (if Γ(f) = 0 then f is μ-almost everywhere constant) is required for this to
be the case. Moreover, it has been shown in Theorem 3.3.23, p. 165, that whenever
ρ > 0, the invariant measure μ is always finite (and thus a probability measure
by normalization) and ergodicity then ensures that $P_\infty f = \int_E f \, d\mu$ for every f in
$L^2(\mu)$. Under this assumption, the local Poincaré inequalities (4.7.5) for $p_t(x, dy)$
of Theorem 4.7.2 extend as t → ∞ to a Poincaré inequality $P(\frac{1}{\rho})$ for the invariant
(probability) measure μ. The following statement summarizes this consequence.
Although it is formally contained in the stronger Theorem 4.8.4 below (corresponding
to n = ∞), it is worth stating independently.
Proposition 4.8.1 (Poincaré inequality under CD(ρ, ∞)) Under the curvature
condition CD(ρ, ∞) with ρ > 0, the Markov Triple (E, μ, Γ) satisfies a Poincaré
inequality P(C) with constant $C = \frac{1}{\rho}$. That is, for every function $f \in \mathcal{D}(\mathcal{E})$,
\[
\mathrm{Var}_\mu(f) \le \frac{1}{\rho} \, \mathcal{E}(f).
\]
In other words, the Markov Triple consisting of $\mathbb{R}^n$ with the measure $d\mu = e^{-W} dx$
for which ∇∇W ≥ ρ Id and the standard carré du champ operator therefore satisfies
the curvature condition CD(ρ, ∞). We present this conclusion as an independent
statement. It will also appear as a consequence of the stronger Brascamp-Lieb in-
equality below (Theorem 4.9.1).
A similar statement, with the same proof, holds on a weighted Riemannian manifold
(M, g) for the operator $L = \Delta_g - \nabla W \cdot \nabla$ with the reversible (probability)
measure μ having density $e^{-W}$ with respect to the Riemannian measure under the
curvature condition $\mathrm{Ric}(L) = \mathrm{Ric}_g + \nabla\nabla W \ge \rho \, g$ with ρ > 0 (cf. Sect. 1.16, p. 70,
and Sect. C.6, p. 513).
The main purpose of this section is to investigate Poincaré inequalities for
a Markov Triple (E, μ, Γ) under the stronger curvature-dimension condition
CD(ρ, n) of Definition 3.3.14, p. 159, for some ρ > 0 and some finite dimension
n ≥ 1. Recall that this condition expresses that
\[
\Gamma_2(f) \ge \rho \, \Gamma(f) + \frac{1}{n} \, (Lf)^2
\]
for all $f \in \mathcal{A}_0$ (or $\mathcal{A}$). In particular, it will be observed how the (finite) dimension
will improve upon the preceding Poincaré constant in Proposition 4.8.1.
To this end, we start from the CD(ρ, n) condition for ρ > 0 and n > 1 integrated
with respect to μ to get that, for every f in $\mathcal{A}_0$,
\[
\int_E \Gamma_2(f) \, d\mu \ge \rho \int_E \Gamma(f) \, d\mu + \frac{1}{n} \int_E (Lf)^2 \, d\mu.
\]
Now, by integration by parts and the construction of the operators Γ and $\Gamma_2$
(cf. Sect. 3.3, p. 151),
\[
\int_E \Gamma(f) \, d\mu = \int_E f (-Lf) \, d\mu
\quad \text{and} \quad
\int_E \Gamma_2(f) \, d\mu = \int_E (Lf)^2 \, d\mu,
\]
$\frac{\rho n}{n-1}$ of the spectral gap is optimal since these two models both satisfy a curvature-
dimension condition CD(n − 1, n), yielding a Poincaré constant equal to $\frac{1}{n}$ which
corresponds to the inverse of the first non-trivial eigenvalue. In particular,
Lichnerowicz's Theorem (4.8.1) compares the first non-trivial eigenvalue of the Laplacian
on a (compact) n-dimensional Riemannian manifold having a strictly positive lower
bound on the Ricci curvature with that of the sphere with the same dimension and
curvature. Obata's Theorem expresses conversely that when there is equality in (4.8.1),
the manifold is isometric to this sphere. Note that the result also holds on $S^1$ (n = 1)
by Proposition 4.5.5 via a simple dilation from [0, 1] to [0, 2π]. But this may appear
as a simple coincidence, since the condition CD(0, m), m ≥ 1, does not imply
a spectral gap inequality in general.
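The dependence of the spectral gap bound on the dimension can be tabulated with exact arithmetic. The sketch below encodes only the two constants discussed in this section, $C = \frac{1}{\rho}$ under CD(ρ, ∞) and $C = \frac{n-1}{\rho n}$ under CD(ρ, n); the function names and the convention `n=None` for the dimension-free case are illustrative.

```python
from fractions import Fraction

def poincare_constant(rho, n=None):
    """Poincare constant under CD(rho, n): (n - 1) / (rho n), or 1 / rho if n is None."""
    rho = Fraction(rho)
    if rho <= 0:
        raise ValueError("rho must be positive")
    if n is None:
        return 1 / rho
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    return Fraction(n - 1, n) / rho

def spectral_gap_bound(rho, n=None):
    """Lower bound rho n / (n - 1) (or rho) on the first non-trivial eigenvalue of -L."""
    return 1 / poincare_constant(rho, n)

# Models satisfying CD(n - 1, n): the bound on the spectral gap equals n.
sphere_gaps = {n: spectral_gap_bound(n - 1, n) for n in range(2, 6)}
```

For a fixed ρ the finite-dimensional constant is always strictly smaller than $\frac{1}{\rho}$, and the two agree in the limit n → ∞.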
The preceding spectral argument implicitly assumes that the spectrum of L is dis-
crete. (We will actually establish later in Corollary 6.8.2, p. 306, that this is indeed
the case under these hypotheses.) We next provide a less formal approach relying
on semigroup tools and heat flow monotonicity. Along these lines, we start with a
useful dual description of the Poincaré inequality (also implicit from the previous
argument) which develops the heat flow argument via the second derivative of the
variance along the semigroup, or equivalently the first derivative of the Dirichlet
form, giving rise to the $\Gamma_2$ operator.
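Proof For $\Lambda(t) = \int_E (P_t f)^2 \, d\mu$, t ≥ 0,
\[
\Lambda'(t) = -2 \int_E \Gamma(P_t f) \, d\mu
\quad \text{and} \quad
\Lambda''(t) = 4 \int_E (L P_t f)^2 \, d\mu.
\]
Under (4.8.2), $\Lambda''(t) \ge - \frac{2}{C} \, \Lambda'(t)$, t ≥ 0, which integrates into the Poincaré
inequality P(C) since
\[
\mathrm{Var}_\mu(f) = - \int_0^\infty \Lambda'(t) \, dt \le \frac{C}{2} \int_0^\infty \Lambda''(t) \, dt
= - \frac{C}{2} \, \Lambda'(0) = C \int_E \Gamma(f) \, d\mu.
\]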
by the Cauchy-Schwarz inequality, and the conclusion follows from the application
of P (C).
Theorem 4.8.4 (Poincaré inequality under CD(ρ, n)) Under the curvature-dimension
condition CD(ρ, n), ρ > 0, n > 1, the Markov Triple (E, μ, Γ) satisfies a
Poincaré inequality P(C) with constant $C = \frac{n-1}{\rho n}$.
4.9 Brascamp-Lieb Inequalities
for every smooth compactly supported function f on $\mathbb{R}^n$, where $(\nabla\nabla W)^{-1}$ is the
inverse of the Hessian of W.
\[
(\nabla\nabla W)^{-1}(\nabla f, \nabla f) \le \frac{1}{\rho} \, |\nabla f|^2
\]
so that the Brascamp-Lieb inequality improves upon Corollary 4.8.2 in this context.
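The improvement can be observed numerically in dimension one. The sketch below assumes the Brascamp-Lieb inequality in the form $\mathrm{Var}_\mu(f) \le \int (\nabla\nabla W)^{-1}(\nabla f, \nabla f) \, d\mu$ for $d\mu = e^{-W} dx / Z$ with W strictly convex, and applies it to a hypothetical potential $W(x) = x^4/4 + x^2/2$ (so that $W'' \ge \rho = 1$) and to $f(x) = x$, which is not compactly supported and is used only as an illustration. Function names and the quadrature are illustrative.

```python
import math

def integrate(g, lo=-6.0, hi=6.0, n=6000):
    """Midpoint rule on [lo, hi]; the weight exp(-W) is negligible outside."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))

def brascamp_lieb_1d(W, d2W, f, df):
    """Return (Var_mu(f), integral of f'^2 / W'' dmu) for dmu = exp(-W) dx / Z."""
    Z = integrate(lambda x: math.exp(-W(x)))
    mean = integrate(lambda x: f(x) * math.exp(-W(x))) / Z
    var = integrate(lambda x: (f(x) - mean) ** 2 * math.exp(-W(x))) / Z
    bl = integrate(lambda x: df(x) ** 2 / d2W(x) * math.exp(-W(x))) / Z
    return var, bl

# Hypothetical potential W(x) = x^4/4 + x^2/2, with W''(x) = 3x^2 + 1 >= 1.
var, bl = brascamp_lieb_1d(
    lambda x: x ** 4 / 4 + x ** 2 / 2,
    lambda x: 3 * x ** 2 + 1,
    lambda x: x,
    lambda x: 1.0,
)
rho_bound = 1.0  # (1 / rho) * integral of f'^2 dmu with rho = 1 and f' = 1
```

In this example the three quantities are ordered as `var <= bl <= rho_bound`, the middle term being the Brascamp-Lieb bound and the last one the bound given by the constant $\frac{1}{\rho}$.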
To address the proof of Theorem 4.9.1, the following general lemma of indepen-
dent interest will be useful.
Proof It is enough to prove (4.9.1) for $u = A_2 v$ with $v \in \mathcal{D}(A_2)$. Then the
inequality boils down to $\langle v, A_2 v \rangle \le \langle A_1^{-1} A_2 v, A_2 v \rangle$. Now, it is easily checked that
$\mathcal{D}(A_2) \subset \mathcal{D}(A_1)$, and setting $B = A_2 - A_1$, the latter then reads $\langle A_1^{-1} B v, A_2 v \rangle \ge 0$.
In other words,
\[
\langle A_1^{-1} B v, B v \rangle + \langle B v, v \rangle \ge 0
\]
which is immediate since $A_1$ and B are positive.
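Proof of Theorem 4.9.1 The proof is only sketched. It uses in particular the
extension of the operator $L = \Delta - \nabla W \cdot \nabla$ and its associated semigroup $(P_t)_{t \ge 0}$ to
vector-valued functions, and everything will ultimately rely on the commutation
property, easily checked on smooth functions f,
\[
\nabla L f = (\vec{L} - M) \nabla f, \tag{4.9.2}
\]
where $M = \nabla\nabla W$ and $\vec{L}$ denotes the extension of L to vector-valued functions,
\[
\vec{L}(f_1, \ldots, f_n) = (L f_1, \ldots, L f_n).
\]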
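The operator $\vec{L}$ is symmetric in H, and for $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in
H,
\[
\langle -\vec{L} u, v \rangle = \int_{\mathbb{R}^n} \nabla u \cdot \nabla v \, d\mu
= \int_{\mathbb{R}^n} \sum_{i=1}^n \nabla u_i \cdot \nabla v_i \, d\mu
\]
for every u ∈ H.
Now let $f \in \mathcal{A}_0$, where $\mathcal{A}_0$ is the set of smooth compactly supported functions
on $\mathbb{R}^n$. Set
\[
g_\varepsilon = (\varepsilon - L)^{-1/2} f = c \int_0^\infty e^{-\varepsilon s} \, s^{-1/2} \, P_s f \, ds
\]
for which
\[
\int_{\mathbb{R}^n} f^2 \, d\mu = \varepsilon \int_{\mathbb{R}^n} g_\varepsilon^2 \, d\mu + \int_{\mathbb{R}^n} |\nabla g_\varepsilon|^2 \, d\mu.
\]
(Here, the operator ∇ has to be properly extended from $\mathcal{A}_0$ to $\mathcal{D}(\mathcal{E})$ as an operator
with values in H, but this is immediate.) Moreover, for $f \in \mathcal{A}_0$, the identity
$\nabla L f = (\vec{L} - M) \nabla f$ from (4.9.2) extends to
\[
\nabla g_\varepsilon = \nabla (\varepsilon - L)^{-1/2} f = \big(\varepsilon + M - \vec{L}\big)^{-1/2} \nabla f.
\]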
\[
\Gamma_2(f) = |\nabla\nabla f|^2 + \mathrm{Ric}(L)(\nabla f, \nabla f),
\]
with $\mathrm{Ric}(L) = \mathrm{Ric}_g + \nabla\nabla W$. The next statement is then the analogue of the
Brascamp-Lieb inequality in this context.
Theorem 4.9.3 In this Riemannian setting, if the tensor Ric(L) is strictly positive
everywhere, denoting by $\mathrm{Ric}(L)^{-1}$ its inverse tensor, then for every smooth compactly
supported function f on M,
\[
\mathrm{Var}_\mu(f) \le \int_M \mathrm{Ric}(L)^{-1}(\nabla f, \nabla f) \, d\mu.
\]
The proof is rather similar to the Euclidean case of Theorem 4.9.1, replacing all
vectors u by 1-forms w and the operator L by the operator $\vec{L}$ given in a local system
of coordinates (and with Einstein's summation notation) by
\[
\vec{L} w_j = \nabla^i \nabla_i w_j - \nabla^i (W) \cdot \nabla_i w_j .
\]
This inequality may then be related to Proposition 3.2.5, p. 149, and the
semigroup $(\widetilde{P}_t)_{t \ge 0}$ with generator $\widetilde{L} = L - \rho \, \mathrm{Id}$. Indeed, by the development
$f = - \int_0^\infty L P_t f \, dt$ of a smooth mean-zero function f and integration by parts,
\[
\mathrm{Var}_\mu(f) = \int_0^\infty \int_M \nabla f \cdot \nabla P_t f \, d\mu \, dt.
\]
from which (4.9.4) follows by an analysis similar to the one developed for Theo-
rem 4.9.1.
The gradient commutation bound (4.9.5) actually has an even more precise form
of a probabilistic nature, which may prove useful in a wide variety of contexts,
although it is not so easy to express. Namely, starting from the commutation
formula (4.9.2), the semigroup $(Q_t)_{t \ge 0}$ with generator $\vec{L} - M$ acting on vector-valued
functions (or on 1-forms in the manifold case) satisfies the (exact) formula
\[
\nabla P_t f = Q_t \nabla f, \quad t \ge 0
\]
(on suitable functions $f : M \to \mathbb{R}$). Therefore, (4.9.5) amounts to $|Q_t u| \le \widetilde{P}_t(|u|)$
for vector-valued functions u. A direct approach to this proceeds via the Feynman-
Kac-type probabilistic representations as described in Sect. 1.15.6, p. 63. Let us
briefly illustrate the principle in the flat Euclidean case $M = \mathbb{R}^n$ with
$L = \Delta - \nabla W \cdot \nabla$, corresponding to Theorem 4.9.1, borrowing the notation from
its statement and proof.
Indeed, the semigroup $(Q_t)_{t \ge 0}$ admits a nice probabilistic interpretation in terms
of the Markov process $\{X_t^x \, ; t \ge 0, x \in \mathbb{R}^n\}$ with semigroup $(P_t)_{t \ge 0}$ and generator L.
Consider its vector-valued extension (with generator $\vec{L}$) given on a vector-valued
function $u = (f_1, \ldots, f_n)$ by
\[
\vec{P}_t u(x) = \mathbb{E}_x\big(u(X_t)\big) = \big(\mathbb{E}_x(f_1(X_t)), \ldots, \mathbb{E}_x(f_n(X_t))\big),
\quad t \ge 0, \ x \in \mathbb{R}^n.
\]
The vector-valued version of the Feynman-Kac formula of Sect. 1.15.6, p. 63, then
takes the form
\[
Q_t u(x) = \mathbb{E}_x\big(A_t \, u(X_t)\big) \tag{4.9.6}
\]
where now $A_t$, t ≥ 0, is a (random) matrix, being a solution of the ordinary
differential equation $\partial_t A_t = -A_t M(X_t)$, with $A_0 = \mathrm{Id}$. (Note that this is not in general
$\exp(-\int_0^t M(X_s) \, ds)$ since the matrices $M(X_s)$ do not commute, and $A_t$ is not even
symmetric in general even if the matrices $M(X_s)$ are.) By a standard
lemma from ordinary differential equation theory, if for any s ≥ 0,
$\langle M(X_s) u, u \rangle \ge \rho(X_s) |u|^2$, then the matrix $A_t$ satisfies
\[
|A_t u| \le e^{-\int_0^t \rho(X_s) \, ds} \, |u|.
\]
4.10 Further Spectral Inequalities
In the last section of this chapter, we investigate some further spectral properties
related to symmetric Markov semigroups. Essentially, Poincaré inequalities concern
Markov semigroups which are symmetric with respect to a given probability mea-
sure μ. In this context, the constant function 1 is always an eigenvector associated
with the 0 eigenvalue. Hence, Poincaré inequalities explore the gap between 0 and
the rest of the spectrum, and more precisely describe when the spectrum of −L is in-
cluded in {0} ∪ [λ1 , ∞) for some λ1 > 0. But it may happen that the spectrum itself
is included in [λ0 , ∞) for some λ0 > 0. This is in general the case for sub-Markov
semigroups, such as those related to Dirichlet boundary conditions on compact sets,
but it may also happen when the measure μ is infinite. These questions are addressed
in Sect. 4.10.1.
Furthermore, it is often of great interest to know when the spectrum is discrete,
that is when the essential spectrum is empty. Recall (cf. Appendix A) that this prop-
erty conceals several different facts. First that the spectrum is purely punctual, and
that to every point in the spectrum corresponds at least one non-zero eigenfunc-
tion (which is in L2 (μ) by definition). Second, that every point λ in the spec-
trum is isolated, that is, there exists an ε > 0 such that no point of the spectrum
lies in (λ − ε, λ + ε). Moreover, for any point in the spectrum, the correspond-
ing eigenspace is finite-dimensional. Section 4.10.2 investigates in this regard the
bottom of the essential spectrum, providing simple criteria related to Nash-type in-
equalities (further developed in Chap. 7).
In many situations, the essential spectrum depends only on “what happens at
infinity”. To give a precise meaning to this statement, we describe in Sect. 4.10.3 the
notion of a Persson operator, which leads once again to useful criteria for emptiness
of the essential spectrum.
We refer to Appendix A, and more precisely to Sect. A.4, p. 478, for the various
notions and definitions of the spectrum of a positive self-adjoint operator. The
first two sub-sections below are concerned with the minimal Markov triple structure
(E, μ, Γ) as introduced in the first part of Sect. 3.1, p. 120, and focus on the
self-adjoint operator L associated with the triple.
As already mentioned, Poincaré inequalities make sense only when the refer-
ence measure μ is finite (in fact a probability measure), and when the semigroup
$P = (P_t)_{t \ge 0}$ is Markov. It could nevertheless be the case, either for infinite
measures or for sub-Markov semigroups, that the following inequality
\[
\int_E f^2 \, d\mu \le C \, \mathcal{E}(f) \tag{4.10.1}
\]
holds for some C > 0 and some suitable class of functions f on E. One such example
has already appeared in Lemma 4.2.3 above as a consequence of a Poincaré inequality
on a larger set. When such an inequality holds for a class of functions f which
is dense in the Dirichlet domain $\mathcal{D}(\mathcal{E})$ of the given Dirichlet form $\mathcal{E}$ (and therefore
holds for every function f in it), it amounts to saying that the spectrum of the
underlying (opposite) generator −L lies in the interval $[\frac{1}{C}, \infty)$. Here, L is understood
as the minimal extension of the operator defined from the Dirichlet form E, that is,
its Friedrichs extension as described in Sect. 3.1, p. 120.
In the case of a bounded domain (in Rn or a manifold), this extension corre-
sponds to the Dirichlet boundary conditions. The connection between the spectrum
and (4.10.1) has been illustrated for Sturm-Liouville operators on an interval of the
real line in Sect. 4.5.2, giving rise to a useful connection between eigenvalues with
Dirichlet or Neumann boundary conditions. This section provides further material
on the spectrum of a diffusion operator.
From a spectral analysis point of view, the best constant C in (4.10.1) is $\frac{1}{\lambda_0}$ where
λ0 is the bottom of the spectrum of −L (which is different from the spectral gap
related to Poincaré inequalities). This observation applies only when 1 is no longer
an eigenvector for −L which may indeed happen in a wide variety of situations:
either the measure μ is infinite, or the boundary conditions imposed on the domain
D(L) prevent 1 from belonging to it, in particular in the case of Dirichlet boundary
conditions.
\[
T = \inf \{ t \ge 0 \, ; X_t \notin O \},
\]
then for all $\lambda < \lambda_0$, $\mathbb{E}_x(e^{\lambda T}) < \infty$ while for all $\lambda \ge \lambda_0$, $\mathbb{E}_x(e^{\lambda T}) = \infty$. To understand
this result, the basic observation is that the bottom $\lambda_0$ of the spectrum of −Δ
corresponds to some strictly positive eigenvector $f_0$ which vanishes at the boundary (this
strictly positive eigenvector corresponds to the ground state for Schrödinger operators
and to the Perron-Frobenius eigenvector in the case of finite Markov chains).
Then $e^{\lambda_0 t} f_0(X_t)$, t ≥ 0, defines a martingale, and since $f_0$ vanishes at the boundary,
the finiteness of $\mathbb{E}_x(e^{\lambda_0 T})$ would yield a contradiction. Hence $\mathbb{E}_x(e^{\lambda T}) = \infty$
for every $\lambda \ge \lambda_0$. On the other hand, for any $\lambda < \lambda_0$, there exists a strictly positive
eigenvector solution of $-Lf = \lambda f$ (not vanishing identically at the boundary),
which may be thought of as the bottom eigenvector on some enlarged set. The same
reasoning with the martingale $e^{\lambda t} f(X_t)$, t ≥ 0, leads to the fact that $\mathbb{E}_x(e^{\lambda T})$ must
be finite. This type of argument may be extended to more general settings, replacing
the Brownian motion by the Markov process with generator L.
Proposition 4.10.2 For the hyperbolic space $\mathbb{H}^n$ of dimension n, and for any
smooth compactly supported function f on $\mathbb{H}^n$,
\[
\int_{\mathbb{H}^n} f^2 \, d\mu \le \frac{4}{(n-1)^2} \, \mathcal{E}(f).
\]
Proof The argument is similar to the proof of the Poincaré inequality for the
exponential measure (Sect. 4.4.1). In the chosen upper half-space representation, the
hyperbolic Laplacian may be explicitly represented as
\[
Lf = x_n^2 \, \Delta f - (n-2) \, x_n \, \partial_n f.
\]
Now, the coordinate function $x_n$ satisfies $L(x_n) = -(n-2) x_n$ and $\Gamma(x_n) = x_n^2$.
Therefore, if f is a smooth and compactly supported function on $\mathbb{H}^n$,
\[
\int_{\mathbb{H}^n} f^2 \, d\mu = \frac{1}{n-2} \int_{\mathbb{H}^n} (-L x_n) \, x_n^{-1} f^2 \, d\mu
= \frac{1}{n-2} \int_{\mathbb{H}^n} \Gamma\big(x_n, x_n^{-1} f^2\big) \, d\mu.
\]
Now,
\begin{align*}
\Gamma\big(x_n, x_n^{-1} f^2\big) &= - f^2 x_n^{-2} \, \Gamma(x_n) + 2 f x_n^{-1} \, \Gamma(x_n, f) \\
&= - f^2 + 2 f x_n^{-1} \, \Gamma(x_n, f) \\
&\le - f^2 + 2 |f| \, \Gamma(f)^{1/2}
\end{align*}
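Proposition 4.10.3 Assume that $\sigma_{ess}(-L) \subset [\sigma_0, \infty)$ for some $\sigma_0 > 0$. Then, for
every $r > \frac{1}{\sigma_0}$, there exists a finite-dimensional vector subspace $F \subset L^2(\mu)$ such
that, for any $f \in \mathcal{D}(\mathcal{E})$,
\[
\int_E f^2 \, d\mu \le r \, \mathcal{E}(f) + \int_E (\Pi_F f)^2 \, d\mu \tag{4.10.2}
\]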
Proof The first part is straightforward. Indeed, given $r > \frac{1}{\sigma_0}$, let F be the direct
sum of all the eigenspaces corresponding to eigenvalues $\lambda < \frac{1}{r}$. By definition
of $\sigma_{ess}(-L)$, there is only a finite number of such eigenvalues and the subspace
F is finite-dimensional. Let $f \in \mathcal{D}(\mathcal{E})$, and set for simplicity $g = \Pi_F(f)$,
$h = f - \Pi_F(f)$. Clearly $\mathcal{E}(g, h) = 0$ and $\mathcal{E}(f) = \mathcal{E}(g) + \mathcal{E}(h) \ge \mathcal{E}(h)$. On the other
hand, by construction, $\mathcal{E}(h) \ge \frac{1}{r} \int_E h^2 \, d\mu$. It follows that
\[
\int_E f^2 \, d\mu = \int_E g^2 \, d\mu + \int_E h^2 \, d\mu
\le \int_E g^2 \, d\mu + r \, \mathcal{E}(h)
\le \int_E g^2 \, d\mu + r \, \mathcal{E}(f)
\]
which is (4.10.2). Conversely, under (4.10.2) for some r > 0 and some subspace F,
assume that there exists a $\lambda \in \sigma_{ess}(-L)$ such that $\lambda < \frac{1}{r}$. Recall from Weyl's criterion
(Theorem A.4.4, p. 481) that for any ε > 0, there exists an orthonormal sequence
$(f_k)_{k \in \mathbb{N}}$ such that $\| \lambda f_k + L f_k \|_2 \le \varepsilon$ for every k. Since $f_k$ has norm 1 in $L^2(\mu)$,
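Proposition 4.10.4 If the spectrum of −L is discrete (i.e. $\sigma_{ess}(-L) = \emptyset$), there exist
a positive function $w \in L^2(\mu)$ and a map $\beta : (0, \infty) \to (0, \infty)$ such that, for any
$r \in (0, \infty)$ and any $f \in \mathcal{D}(\mathcal{E})$,
\[
\int_E f^2 \, d\mu \le r \, \mathcal{E}(f) + \beta(r) \Big( \int_E |f| w \, d\mu \Big)^2. \tag{4.10.3}
\]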
Converses of this proposition will be studied below when the semigroup with
Markov generator L has a density kernel (Theorem 4.10.5) or for Persson operators
(Theorem 4.10.8).
Proof For any r > 0, denote by $F_r$ the vector space spanned by all the eigenvectors
with eigenvalues less than $\frac{1}{r}$ and by n(r) its dimension. Let $(f_k)_{k \in \mathbb{N}}$ be a sequence
of eigenvectors such that, for any r, $(f_0, \ldots, f_{n(r)})$ defines an orthonormal basis of
$F_r$. Let $(\alpha_k)_{k \in \mathbb{N}}$ be a decreasing sequence of strictly positive real numbers such that
$\sum_{k \in \mathbb{N}} \alpha_k < \infty$. Set $w = \sum_{k \in \mathbb{N}} \alpha_k |f_k|$. The function w belongs to $L^2(\mu)$ and, for
\[
|c_k| \le \frac{1}{\alpha_{n(r)}} \int_E |f| w \, d\mu.
\]
Since $\| \Pi_{F_r}(f) \|_2^2 = \sum_{k=0}^{n(r)} c_k^2$, it follows that
$\| \Pi_{F_r}(f) \|_2^2 \le n(r) \, \alpha_{n(r)}^{-2} \big( \int_E |f| w \, d\mu \big)^2$. It then suffices
to choose $\beta(r) = n(r) \, \alpha_{n(r)}^{-2}$, r > 0, and to follow the proof of Proposition
4.10.3 above. Inequality (4.10.4) is obtained with $\Phi(r) = \inf_{s > 0} [rs + \beta(s)]$,
which is clearly increasing and concave and satisfies $\lim_{r \to \infty} \frac{\Phi(r)}{r} = 0$. Conversely,
if (4.10.4) holds for some Φ, it may be assumed to be $C^1$, therefore satisfying
$\lim_{r \to \infty} \Phi'(r) = 0$ (namely, replace Φ by $\hat{\Phi}(r) = \frac{2}{r} \int_0^r \Phi(s) \, ds$ which is still
increasing and concave and satisfies $\frac{1}{2} \hat{\Phi}(r) \le \Phi(r) \le \hat{\Phi}(r)$). The version (4.10.3)
then follows from the family of inequalities (concavity)
Note that it is not required here that ∫_E ∫_E p²(x, y) dμ(x) dμ(y) < ∞, which would
imply that P is Hilbert-Schmidt and therefore compact (cf. Sect. A.6, p. 483).
Theorem 4.10.5 Assume that for some r > 0 there exists a positive function
w ∈ L²(μ) such that for any f ∈ D(E),
\[
\int_E f^2 \, d\mu \le r\,\mathcal{E}(f) + \left( \int_E |f|\, w \, d\mu \right)^{2}. \tag{4.10.5}
\]
Assume moreover that there exists a t > 0 for which P_t has an L²-density kernel
with respect to μ. Then the essential spectrum σess(−L) of −L lies in [1/r, ∞).
In particular, if there exists a positive function w ∈ L²(μ) such that for every
r > 0, there exists a β(r) > 0 such that for all f ∈ D(E),
\[
\int_E f^2 \, d\mu \le r\,\mathcal{E}(f) + \beta(r) \left( \int_E |f|\, w \, d\mu \right)^{2},
\]
and if P_t has an L²-density kernel with respect to μ for some t > 0, then the
spectrum of −L is discrete.
Proof The second part of the statement follows from the first part by applying it
to the weight β(r)1/2 w (for any r > 0). We thus concentrate on the first assertion.
Let G_w = {g ∈ L²(μ) ; |g| ≤ w}. First observe that for an operator P with a
positive density kernel in L²(μ), the closure of P(G_w) is compact. Indeed,
given any sequence (g_k)_{k∈N} in G_w, thus bounded in L²(μ), there exists a
subsequence (g_{k_ℓ})_{ℓ∈N} which converges weakly, say to some function g ∈ L²(μ). Then,
P having a density kernel, P g_{k_ℓ} converges μ-almost everywhere to P g, and
moreover |P g_{k_ℓ}| ≤ P w ∈ L²(μ). From the dominated convergence Theorem, P g_{k_ℓ} then
converges to P g in L²(μ) as ℓ → ∞.
Now (4.10.5) applied to P_t f yields, with K the closure of P_t(G_w),
\[
\| P_t f \|_2^2 \le r\,\mathcal{E}(P_t f) + \left( \sup_{g \in K} \int_E f g \, d\mu \right)^{2}. \tag{4.10.6}
\]
From this point, we may proceed as in the proof of Proposition 4.10.3 and choose
by contradiction λ < 1/r such that λ ∈ σess(−L). For any ε > 0, there exists an
orthonormal sequence (f_k)_{k∈N} ⊂ D(L) such that ‖λf_k + Lf_k‖₂ ≤ ε for every k ∈ N.
Looking at the derivative in s of e^{−2λ(t−s)} ‖P_s f_k‖₂², and recalling that ∫_E f_k² dμ = 1,
it follows that for every k ∈ N,
\[
\left| \, \| P_t f_k \|_2^2 - e^{-2\lambda t} \right| \le 2t \sqrt{\varepsilon}.
\]
Combined with |E(P_t f_k) − λ‖P_t f_k‖₂²| ≤ √ε, it follows that
\[
\left| \mathcal{E}(P_t f_k) - \lambda e^{-2\lambda t} \right| \le (1 + 2\lambda t) \sqrt{\varepsilon}.
\]
Now, since (f_k)_{k∈N} converges weakly to 0 as k → ∞, for any compact set
K ⊂ L²(μ),
\[
\lim_{k \to \infty} \, \sup_{g \in K} \int_E f_k \, g \, d\mu = 0.
\]
In Theorem 4.10.5, the hypothesis that Pt has a density kernel may be replaced by
the fact that there exists a sequence (P_ℓ)_{ℓ∈N} of operators with density kernels (not
necessarily positive) such that, in operator norm, lim_{ℓ→∞} ‖P_t − P_ℓ‖ = 0. Indeed,
this assumption still ensures that P_t(G_w) is relatively compact.
In practice, Theorem 4.10.5 is not that useful since to establish that Pt has an L2 (μ)
density kernel, it is often necessary to prove first that it is Hilbert-Schmidt, and
therefore compact, which already implies that −L has a discrete spectrum. In the
following we develop some further techniques which will enable us to reach the
same conclusion. Actually, to complete the picture, the reader should be aware of
the huge difference that may exist, in the analysis of Markov semigroups, between
finite and infinite dimension. In particular, in reasonable situations, the essential
spectrum only depends on what happens at infinity. To allow such an analysis, we
introduce the notion of a Persson operator.
In concrete instances, (Ak )k∈N will typically be any increasing sequence of com-
pact subsets of E (in finite dimension), or even just balls. Observe that since the
sequence (Ak )k∈N is increasing, the sequence (λk )k∈N is increasing too. As a conse-
quence, for a Persson operator with respect to a sequence (Ak )k∈N , to assert that the
spectrum is discrete it is enough to prove that the corresponding increasing sequence
(λk )k∈N converges to ∞.
Given a sequence (A_k)_{k∈N} as in Definition 4.10.6, introduce for each k ∈ N the
space L²(A_k, μ) of all measurable functions f : E → R such that ∫_{A_k} f² dμ < ∞
with associated norm ‖f‖_{2,A_k}. Define furthermore
\[
H^1(A_k) = \left\{ f \in \mathcal{D}(\mathcal{E}) \cap L^2(A_k, \mu) \; ; \; \int_{A_k} \Gamma(f) \, d\mu < \infty \right\}.
\]
Proposition 4.10.7 For a sequence (Ak )k∈N as above, assume that for any k ∈ N,
there exists a function ψ_k ∈ D(E) with values in [0, 1] such that ψ_k = 0 on A_k,
ψ_k = 1 on A_{k+1}^c, and such that Γ(ψ_k) is bounded. Assume furthermore that the
embedding from H1 (Ak ) into L2 (Ak , μ) is compact for each k. Then the operator
L is Persson with respect to the sequence (Ak )k∈N .
Therefore ∫_E g² dμ ≤ 2ε, so that ∫_E h² dμ ≥ 1 − 2ε. Now,
\[
\mathcal{E}(f) = \mathcal{E}(g) + \mathcal{E}(h) \ge \mathcal{E}(h) \ge \lambda \int_E h^2 \, d\mu \ge \lambda (1 - 2\varepsilon).
\]
This being valid for any f ∈ D(E) with support in A_k^c, we get λ(1 − 2ε) ≤ λ_k, and
the announced inequality λess ≤ σ follows.
Turning to the converse inequality λess ≥ σ , let λ belong to the essential spec-
trum σess . Applying again Weyl’s criterion (Theorem A.4.4, p. 481), for any ε > 0,
there exists an orthonormal sequence (f_ℓ)_{ℓ∈N} in D(L) such that ‖λf_ℓ + Lf_ℓ‖₂ ≤ ε
for every ℓ. As in previous similar proofs, |E(f_ℓ) − λ| ≤ ε, and therefore the
sequence (E(f_ℓ))_{ℓ∈N} is bounded together with the sequence (‖f_ℓ‖₂)_{ℓ∈N}. Hence, for
any k ∈ N, the sequence (f_ℓ)_{ℓ∈N} is also bounded in H¹(A_k), so that by the
compactness assumption, there exists a subsequence which converges in L²(A_k, μ). As the
sequence is orthonormal in L²(μ), it converges weakly to 0 and therefore the limit
of this subsequence is 0. This may be done for any subsequence of (f_ℓ)_{ℓ∈N}, which
shows that (f_ℓ)_{ℓ∈N} converges to 0 in L²(A_k, μ) for any k.
Now fix k ∈ N and consider, according to the hypotheses, ψk with values in [0, 1]
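vanishing on A_k, equal to 1 on A_{k+1}^c, and with bounded gradient. For every ℓ, let
g_ℓ = ψ_k f_ℓ. Then
\[
\mathcal{E}(g_\ell) = \int_E \psi_k^2 \, \Gamma(f_\ell) \, d\mu + 2 \int_E \psi_k f_\ell \, \Gamma(f_\ell, \psi_k) \, d\mu + \int_E f_\ell^2 \, \Gamma(\psi_k) \, d\mu.
\]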
E(g ) ≤ λ + ε + ε .
On the other hand, ∫_E (1 − ψ_k)² f_ℓ² dμ ≤ ∫_{A_{k+1}} f_ℓ² dμ and thus lim_{ℓ→∞} ∫_E g_ℓ² dμ = 1.
Therefore, as ℓ → ∞, λ_k ≤ λ + ε and the announced inequality follows. The proof
of Proposition 4.10.7 is complete.
The use of the Persson hypothesis may be combined with the weighted Nash
inequalities of Theorem 4.10.5 to reach the following main characterization.
Proof Denote by (Ak )k∈N a sequence for which L is Persson. The task is to bound
from below the corresponding values λ_k, k ∈ N, defined in Definition 4.10.6. To this
end, let r > 0 be fixed, and choose k = k(r) such that β(r) ∫_{A_k^c} w² dμ ≤ 1/2. Then, by
the hypothesis applied to a function f with support in A_k^c, and the Cauchy-Schwarz
inequality,
\[
\int_E f^2 \, d\mu \le r\,\mathcal{E}(f) + \beta(r) \left( \int_{A_k^c} |f|\, w \, d\mu \right)^{2} \le r\,\mathcal{E}(f) + \frac{1}{2} \int_E f^2 \, d\mu,
\]
whence λ_{k(r)} ≥ 1/(2r). Since r > 0 is arbitrary, the increasing sequence (λ_k)_{k∈N}
converges to ∞. The converse is achieved via Proposition 4.10.4. The proof is therefore
complete.
Proof We first deal with the case of R^n, for which we already know that L is
essentially self-adjoint thanks to Corollary 3.2.2, p. 143. Recall that here dμ = e^{−W} dx.
Choose as the sequence (Ak )k∈N the sequence of balls centered at the origin with
for some c_k → ∞. Changing f into g e^{W/2}, and after integration by parts (justified
since g is compactly supported), this amounts to proving that for any g smooth and
compactly supported in A_k^c,
\[
\int_{\mathbb{R}^n} \left( |\nabla g|^2 + R\, g^2 \right) dx \ge c_k \int_{\mathbb{R}^n} g^2 \, dx
\]
Due to their interplay with spectral properties, Poincaré inequalities form a vast
subject. They have been investigated and studied in a wide variety of settings, in
particular in the analysis of partial differential equations, and under various names
in the literature. This chapter only focuses on the aspects of Poincaré inequalities
emphasized in this book, namely functional inequalities, convergence to equilibrium
and heat kernel bounds. In particular, we do not address Poincaré inequalities on
balls (local Poincaré inequalities), doubling properties and the related analysis on
metric spaces. For these topics, we refer, for example, to [9, 13, 125, 217, 230, 236,
376, 397, 422] and to the references therein.
Poincaré inequalities also sometimes refer to the comparison of the L2 -norm of
a function with its gradient (in the form (4.10.1)) for functions vanishing at the
boundary of some domain. This form is related to the Dirichlet eigenvalue problem
in partial differential equations.
The Poincaré inequalities investigated in this chapter, and throughout this book,
mainly deal with Neumann boundary conditions comparing the L2 -norm of a func-
tion with mean zero with the one of its gradient. In this form, they may probably
be traced back to the contributions [348, 349] where H. Poincaré used duplication
((4.2.4)) and a clever change of variables to establish the inequality on bounded
convex domains in Rn . They are sometimes called Poincaré-Wirtinger inequalities
in reference to W. Wirtinger who seemingly analyzed functions on the circle (corre-
sponding to (4.5.5)) using a spectral decomposition along the trigonometric system.
W. Wirtinger, who is mentioned in the book [70], also considered the extension on
the sphere using spherical harmonics. The term “Poincaré inequality” first appears
in the book by R. Courant and D. Hilbert [140]. See [8, 300] for a short historical
introduction to Poincaré inequalities in the context of partial differential equations.
We refer in addition to [317] for a comprehensive survey of inequalities that in-
volve a relationship between a function and its derivatives or integrals, including in
particular Poincaré inequalities.
The name of Poincaré inequality has been used for numerous inequalities of the
same form or which are obtained similarly. In particular, the Poincaré inequality for
the Gaussian measure (Proposition 4.1.1) may be traced back to the early 30s in the
physics literature along with the Hermite eigenfunction expansion. It appears later
in papers of J. Nash [323], H. Chernoff [130], L. Chen [127] and in many places
today.
The main classical exponential decay Theorem 4.2.5 in Sect. 4.2.2 is a standard
result in Fourier-type analysis, and a fundamental part of the theory of Poincaré
inequalities. Lemma 4.2.6 may be found in [139].
The stability by bounded perturbations and tensorization (Sect. 4.3) properties of
Poincaré inequalities form part of the folklore of the subject.
The Poincaré inequality for the exponential measure of Proposition 4.4.1 is
pointed out in [403]. Its interpretation as a Lyapunov criterion has been empha-
sized in, for example, [29] (see below). Exponential integrability under a Poincaré
inequality goes back to the works [89] and [222]. It was revived and improved in
the corresponding investigation under logarithmic Sobolev inequalities (see the next
chapter) by S. Aida and D. Stroock [7], L. Gross and O. Rothaus [227] and others
(cf. [276]). The proof presented here is extracted from [79]. More on applications to
measure concentration may be found in [90, 276, 278].
The Muckenhoupt criterion presented in Sect. 4.5.1 also has a long history start-
ing with early papers by G. Hardy [231, 232], later developed in [321, 405, 410].
In particular, B. Muckenhoupt presented in [321] a characterization with respect to
two measures on R+ (see also Chap. 8 in this regard). More details on those re-
sults can be found in [269]. In the form of Theorem 4.5.1, the criterion is recalled
in [77] by comparison with the criterion for logarithmic Sobolev inequalities (see
the next chapter). See also [14] for further details. More precise comparisons be-
tween the Muckenhoupt functional and the optimal Poincaré constant are developed
in [310]. The Poincaré inequalities on a bounded interval of the real line presented
in Sect. 4.5.2 are more or less classical and may be found in various places in the
literature, including [203].
The Lyapunov method discussed in Sect. 4.6 has its origin in the Markov chain
context (cf. e.g. [309]) where it has been used as a tool towards convergence to
After Poincaré inequalities, logarithmic Sobolev inequalities are amongst the most
studied functional inequalities for semigroups. Indeed, they contain much more in-
formation than Poincaré inequalities, and are at the same time sufficiently general
to be available in numerous cases of interest, in particular in infinite dimension (as
limits of Sobolev inequalities on finite-dimensional spaces). Moreover, they entail
remarkable smoothing properties of the semigroup in the form of hypercontractivity.
The structure of this chapter is quite similar to the preceding one on Poincaré
inequalities. In particular, the setting is that of a Full Markov Triple, abbreviated
as “Markov Triple”, (E, μ, Γ) with associated Dirichlet form E, infinitesimal
generator L, Markov semigroup P = (P_t)_{t≥0} and underlying function algebras A_0 and
A (cf. Sect. 3.4, p. 168). Logarithmic Sobolev inequalities under the invariant
measure only concern finite (normalized) measures μ. It should be mentioned that
logarithmic Sobolev inequalities involve entropy and Fisher information, which deal
with strictly positive functions. It will therefore be convenient to deal with the class
A_0^{const+} of Remark 3.3.3, p. 154, consisting of functions which are sums of a positive
function in A_0 and a strictly positive constant.
Once again, several definitions and properties make sense for more general triples
(E, μ, Γ), and the Standard Markov Triple assumption suffices for a number of
results. Contrary to the preceding chapter on Poincaré inequalities, the diffusion
property can however not be discarded.
The first section introduces the basic definition of a logarithmic Sobolev inequal-
ity together with its first properties. Section 5.2 presents the exponential decay in en-
tropy and the fundamental equivalence between logarithmic Sobolev inequality and
hypercontractivity. The next sections discuss integrability properties of eigenvectors
and of Lipschitz functions under a logarithmic Sobolev inequality and present, as in
the case of Poincaré inequalities, a criterion for measures on the real line to satisfy
a logarithmic Sobolev inequality (for the usual gradient). Sections 5.5 and 5.7 deal
with curvature conditions, first for the local logarithmic Sobolev inequalities for
heat kernel measures, then for the invariant measure with additional dimensional in-
formation. Local hypercontractivity and some applications of the local logarithmic
Sobolev inequalities towards heat kernel bounds are further presented. Section 5.6
whenever
When D = 0, the logarithmic Sobolev inequality will be called tight, and will be
denoted by LS(C). When D > 0, the logarithmic Sobolev inequality LS(C, D) is
called defective.
The best constant C > 0 for which such an inequality LS(C) holds is sometimes
referred to as the logarithmic Sobolev constant (of the Markov Triple). Note that
it is part of the information contained in the logarithmic Sobolev inequality that
provided f ∈ D(E), then ∫_E f² log(1 + f²) dμ < ∞, that is f belongs to the Orlicz
space L² log L(μ) (observe further that L² log L(μ) ⊃ L^p(μ) for every p > 2 for
the comparison with Sobolev inequalities in the next chapter). As usual, it is enough
to state (and prove) such inequalities for a family of functions f which is dense in
the Dirichlet domain D(E) (typically the algebra A0 ).
We often simply say that the probability measure μ satisfies a logarithmic
Sobolev inequality LS(C, D) or LS(C) (with respect to the underlying carré du
champ operator Γ or Dirichlet form E). The normalization constant 2 is chosen
for further comparisons. By the homogeneity property of entropy, the logarithmic
Sobolev inequality of Definition 5.1.1 is homogeneous (invariant under a change of
f to cf). The tight logarithmic Sobolev inequality LS(C) implies that if E(f) = 0,
then f is constant. This observation is in agreement with the corresponding
property for Poincaré inequalities (cf. Remark 4.2.2, p. 182, and the link with
connectedness in Sect. 3.2.1, p. 140).
The logarithmic Sobolev inequality LS(C, D) is often presented equivalently
for √f, f ≥ 0 (see Remark 5.1.2 below), and then takes the form, by the chain rule
formula,
\[
\mathrm{Ent}_\mu(f) \le \frac{C}{2} \int_E \frac{\Gamma(f)}{f} \, d\mu + D \int_E f \, d\mu. \tag{5.1.4}
\]
Since
Remark 5.1.2 The meaning of I_μ(f) where the function f may vanish should be
clarified. It actually makes perfect sense when f = g² for any g ∈ A_0, or even
makes sense for any smooth strictly positive function f as the integral ∫_E Γ(f)/f dμ, where
the latter integral may or may not be convergent. When f is only positive, observe
that I_μ(f) = lim_{ε↘0} I_μ(f + ε). Therefore when dealing with such quantities, it is
often enough to consider I_μ(f) for f in A_0^{const+}. Throughout this section, inequalities
involving a Fisher-type expression will usually be established first for functions in
this class and then extended to more general functions taking limits. This procedure
will often be implicit.
The prototypical logarithmic Sobolev inequality is that for the standard Gaus-
sian measure on R or Rn for which LS(C) holds with C = 1, independently of
the dimension n of the underlying state space. (The value C = 1 justifies the nor-
malization chosen in Definition 5.1.1. This normalization will also be convenient
for the comparison with Poincaré inequalities below.) However, while the Poincaré
inequality for the Gaussian measure may be established by a simple spectral expan-
sion along the Hermite polynomials, the proof of the logarithmic Sobolev inequality
is more delicate, and is thus postponed until later in the chapter (Proposition 5.5.1).
The name logarithmic Sobolev inequality describes a weak form of the Sobolev
inequalities studied in Chap. 6. However, while Sobolev inequalities will typically
hold only in finite-dimensional spaces, logarithmic Sobolev inequalities may be
investigated in infinite-dimensional contexts (in probability theory, statistical me-
chanics, statistics, and in numerous settings dealing with an infinite number of co-
ordinates), where they provide useful bounds and controls. It will be shown later
how they really appear as limits of Sobolev inequalities as the dimension tends to
infinity (see Remark 6.2.6, p. 285, and Sect. 6.5, p. 291). Logarithmic Sobolev in-
equalities are also of interest in finite dimension as will be discussed in the chapter
and throughout the book.
The tight logarithmic Sobolev inequality LS(C) is stronger than the Poincaré in-
equality P (C) (Definition 4.2.1, p. 181). This is the content of the following propo-
sition, which furthermore shows that a logarithmic Sobolev inequality LS(C, D)
together with a Poincaré inequality actually yield a tight logarithmic Sobolev in-
equality.
Proof For the first assertion, apply LS(C) to f = 1 + εg where g ∈ D(E) with
∫_E g dμ = 0. As ε → 0, it is not difficult to check by a Taylor expansion that
\[
\mathrm{Ent}_\mu\!\left(f^2\right) = 2\varepsilon^2 \int_E g^2 \, d\mu + o\!\left(\varepsilon^2\right)
\]
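The second-order expansion of the entropy can be checked numerically on a toy example. The following is only a sketch under stated assumptions: the two-point space with the uniform measure and the function g = (+1, −1) are illustrative choices that do not come from the text, and the helper names are hypothetical.

```python
import math

def entropy_two_point(u, v):
    """Ent_mu(F) for a positive function F = (u, v) under the uniform
    probability measure on a two-point space (illustrative setting)."""
    mean = 0.5 * (u + v)
    return 0.5 * (u * math.log(u) + v * math.log(v)) - mean * math.log(mean)

def taylor_ratio(eps):
    # f = 1 + eps * g with g = (+1, -1): g has mean zero and the integral of g^2 is 1,
    # so the expansion predicts Ent(f^2) / (2 * eps^2) -> 1 as eps -> 0.
    ent = entropy_two_point((1 + eps) ** 2, (1 - eps) ** 2)
    return ent / (2 * eps ** 2)

if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, taylor_ratio(eps))
```

The printed ratios should approach 1 as ε decreases, in line with the o(ε²) remainder.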
Lemma
\[
\mathrm{Ent}_\mu\!\left((f + a)^2\right) \le \mathrm{Ent}_\mu\!\left(f^2\right) + 2 \int_E f^2 \, d\mu.
\]
Proof With f and a fixed, the aim is to verify that the function
\[
\psi(r) = \mathrm{Ent}_\mu\!\left((rf + a)^2\right) - \mathrm{Ent}_\mu\!\left((rf)^2\right) - 2 \int_E (rf)^2 \, d\mu, \quad r \in \mathbb{R},
\]
On this basis, the proof of the second assertion of Proposition 5.1.3 is immediate.
Indeed, by the logarithmic Sobolev inequality LS(C, D) applied to f̂, and (5.1.8),
\[
\mathrm{Ent}_\mu\!\left(f^2\right) \le 2C\,\mathcal{E}(f) + D \int_E \hat{f}^2 \, d\mu + 2\,\mathrm{Var}_\mu(f),
\]
Remark 5.1.5 Examples will be provided below for which the logarithmic Sobolev
constant is strictly larger than the (finite) Poincaré constant (or is even infinite). In
this respect, observe that whenever the smallest (non-trivial) eigenvalue λ₁ of −L
(provided it exists) admits an eigenfunction h such that ∫_E h³ dμ ≠ 0, then the
Taylor expansion to the next term around h used in the proof of Proposition 5.1.3 yields
such an instance for which the logarithmic Sobolev constant of μ is necessarily
strictly larger than the spectral gap constant (equal to 1/λ₁).
Proposition 5.1.6 (Bounded perturbation) Assume that the Markov Triple (E, μ, Γ)
satisfies a logarithmic Sobolev inequality LS(C, D). Let μ₁ be a probability
measure with density h with respect to μ such that 1/b ≤ h ≤ b for some constant b > 0.
Then μ₁ satisfies LS(b²C, b²D) (with respect to Γ).
The important issue here, for both the proof of the lemma and its application, is
that φ(s) − φ(r) − φ′(r)(s − r) ≥ 0, r, s ∈ I, by convexity. With s = ∫_E f dμ, the
with C₁ = (2 − p)C/p and D₁ = (2 − p)D/p. Conversely, if (5.1.9) holds for some p ∈ [1, 2)
and every f ∈ D(E), then, for constants c(p) > 0 and d(p) > 0 depending only on
p, LS(C, D) holds with C = c(p)C₁ and D = c(p)D₁ + d(p).
Proof We first show that LS(C, D) implies (5.1.9). The argument only relies
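on the rephrasing of Hölder’s inequality as the convexity of the map
r ↦ φ(r) = log(‖f‖_{1/r}), r ∈ (0, 1] (for f ≠ 0). Therefore, for 1 ≤ p < 2,
\[
\frac{\phi\!\left(\frac{1}{p}\right) - \phi\!\left(\frac{1}{2}\right)}{\frac{1}{p} - \frac{1}{2}} \ge \phi'\!\left(\frac{1}{2}\right).
\]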
2
\[
\frac{1}{3} \le \sum_{k \in \mathbb{Z}} \alpha_k \le \frac{4}{3}.
\]
In particular α_k ≤ 4/3 for every k. Now, for every k ∈ Z,
\[
2^k \, \mathbb{1}_{N_{k+1}} \le f_k \le 2^k \, \mathbb{1}_{N_k},
\]
so that, letting β_k = ‖f_k‖₂², α_{k+1}/4 ≤ β_k ≤ α_k and, for 1 ≤ p < 2,
where we used that α_k^{2/p} ≤ (3/4)^{1−(2/p)} α_k. Now apply (5.1.9) to f_k for every k ∈ Z
to get that
to get that
αk
k βk ≤ c(p) βk log + C1 E(fk ) + D1 + d(p) βk
βk
for constants c(p), d(p) > 0 only depending on p. Decomposing f again as
f = Σ_{k∈Z} 1_{N_k \ N_{k+1}} f shows similarly (after some details) that
\[
\int_E f^2 \log f^2 \, d\mu \le 3 \log 4 \sum_{k \in \mathbb{Z}} k\, \alpha_k + 8.
\]
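By Jensen’s inequality for the concave function log and the fact that α_{k+1}/4 ≤ β_k ≤ α_k
for every k,
\[
\sum_{k \in \mathbb{Z}} \beta_k \log \frac{\alpha_k}{\beta_k}
\le \sum_{k \in \mathbb{Z}} \beta_k \, \log \frac{\sum_{k \in \mathbb{Z}} \alpha_k}{\sum_{k \in \mathbb{Z}} \beta_k}
\le \frac{4}{3} \log 4
\]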
In the preceding proof, going from one form of the inequality to the other and
back does not preserve the constants. While the proof shows that the tight loga-
rithmic Sobolev inequality LS(C) implies D1 = 0, conversely (5.1.9) with D1 = 0
in the preceding proof does not yield back LS(C) for some C > 0. However, a
Taylor expansion f = 1 + εg in (5.1.9) with D₁ = 0 yields a Poincaré inequality
P(C₁/(2 − p)), so that, together with Proposition 5.1.3, we actually reach a tight
logarithmic Sobolev inequality in this case.
Like Poincaré inequalities (cf. Sect. 4.2.2, p. 183), logarithmic Sobolev inequal-
ities have something to tell about convergence to equilibrium of the semigroup
P = (Pt )t≥0 towards the invariant (probability) measure μ. In addition, logarith-
mic Sobolev inequalities ensure smoothing properties of the semigroup in the form
of hypercontractivity.
The following statement is the analogue for entropy of the exponential decay in
L2 (μ) produced by a Poincaré inequality (Theorem 4.2.5, p. 183). Furthermore, the
proof of this result exhibits the fundamental relation between decay of the entropy
along the semigroup and the Fisher information (5.1.6).
for every t ≥ 0.
It is not obvious a priori that the convergence in this statement is stronger than the
convergence in variance as produced by a Poincaré inequality. One point is that the
convergence holds for a larger class of functions. For example, if μ is the standard
Gaussian distribution on the real line (for which LS(C) holds with C = 1) and if
f(x) = e^{cx²/2}, x ∈ R, c ∈ (0, 1), then Var_μ(f) = ∞ if c ≥ 1/2 whereas
\[
\mathrm{Ent}_\mu(f) = \frac{1}{2} \left( c\,(1 - c)^{-3/2} + (1 - c)^{-1/2} \log(1 - c) \right)
\]
for every c ∈ (0, 1).
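The closed form for the entropy of f(x) = e^{cx²/2} under the standard Gaussian measure can be compared with a direct quadrature. This is a minimal sketch, not part of the original argument: the composite Simpson rule, the truncation of the real line to [−40, 40] and the function names are illustrative assumptions.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def entropy_numeric(c, cutoff=40.0):
    """Ent_mu(f) for f(x) = exp(c x^2 / 2), mu standard Gaussian, 0 < c < 1."""
    norm = 1.0 / math.sqrt(2 * math.pi)
    # f(x) times the Gaussian density, written as one exponential to avoid overflow
    weight = lambda x: norm * math.exp(-(1 - c) * x * x / 2)
    mass = simpson(weight, -cutoff, cutoff)                       # integral of f
    f_log_f = simpson(lambda x: (c * x * x / 2) * weight(x), -cutoff, cutoff)
    return f_log_f - mass * math.log(mass)

def entropy_closed_form(c):
    """The expression displayed in the text."""
    return 0.5 * (c * (1 - c) ** -1.5 + (1 - c) ** -0.5 * math.log(1 - c))

if __name__ == "__main__":
    for c in (0.2, 0.5, 0.8):
        print(c, entropy_numeric(c), entropy_closed_form(c))
```

Note that the entropy stays finite for c in [1/2, 1), a range in which the variance of f is infinite.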
Therefore, under a control of H(ν0 | μ), Theorem 5.2.1 implies the stronger conver-
gence of νt towards μ in total variation.
For the sake of completeness, here is a brief proof of (5.2.2) (which shares
some features with the proof of Lemma 5.1.4 above). If f ≥ 0 denotes the
Radon-Nikodym derivative dν/dμ, then, thanks to the identity
∫_E |1 − dν/dμ| dμ = 2‖μ − ν‖_TV, (5.2.2) translates into
\[
\left( \int_E |1 - f| \, d\mu \right)^{2} \le 2\, \mathrm{Ent}_\mu(f).
\]
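As an illustration of this form of the inequality, one may test it on the simplest possible example. The sketch below assumes a two-point space with the uniform measure and densities f = (1 + a, 1 − a), |a| ≤ 1; this setting and the function names are hypothetical choices made for the check and are not taken from the text.

```python
import math

def l1_distance_squared(a):
    """(integral of |1 - f| dmu)^2 for f = (1 + a, 1 - a) under the uniform
    measure on two points; the integral equals |a|."""
    return (0.5 * abs(a) + 0.5 * abs(a)) ** 2

def entropy(a):
    """Ent_mu(f) for the same density; the integral of f is 1, so the entropy
    is just the integral of f log f (with 0 log 0 = 0)."""
    values = (1 + a, 1 - a)
    return 0.5 * sum(x * math.log(x) for x in values if x > 0)

def pinsker_holds(a, tol=1e-12):
    return l1_distance_squared(a) <= 2 * entropy(a) + tol

if __name__ == "__main__":
    for a in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(a, l1_distance_squared(a), 2 * entropy(a), pinsker_holds(a))
```

For small a both sides behave like a², which reflects the fact that the constant 2 cannot be improved in general.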
\[
\frac{d}{dt} \, \mathrm{Ent}_\mu(P_t f) = - I_\mu(P_t f). \tag{5.2.3}
\]
The proof is immediate. Indeed, by the heat equation and integration by parts,
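for any f ∈ A_0^{const+} to start with, and then extending to f ∈ D(E),
\[
\begin{aligned}
\frac{d}{dt} \, \mathrm{Ent}_\mu(P_t f)
&= \int_E (1 + \log P_t f) \, L P_t f \, d\mu \\
&= - \int_E \Gamma(P_t f, 1 + \log P_t f) \, d\mu \\
&= - \int_E \frac{\Gamma(P_t f)}{P_t f} \, d\mu = - I_\mu(P_t f).
\end{aligned}
\]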
This basic relation between the time derivative of entropy and the Fisher informa-
tion in particular tells us that entropy is decreasing along the flow of the semigroup,
a property often referred to as the Boltzmann (H -) Theorem.
On the basis of de Bruijn’s identity, the logarithmic Sobolev inequality LS(C) in
its Fisher information formulation (5.1.4)
\[
\mathrm{Ent}_\mu(g) \le \frac{C}{2} \, I_\mu(g)
\]
translates into the differential inequality Λ(t) ≤ −(C/2) Λ′(t), t > 0, for
Λ(t) = Ent_μ(P_t f), from which the inequality of Theorem 5.2.1 easily follows.
Conversely, differentiating this inequality
at t = 0 yields the logarithmic Sobolev inequality by the same argument.
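The Fisher information form of the inequality can be illustrated on the standard Gaussian measure on the real line, for which the text states that LS(1) holds. The following sketch is an illustration under assumptions that are not in the text: the test functions f_b(x) = exp(bx − b²/2), the usual carré du champ Γ(f) = f′², and a Simpson quadrature on [−30, 30]. For these functions a direct computation gives Ent(f_b) = b²/2 and I(f_b) = b², so the inequality with C = 1 is an equality.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def gauss_density(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def entropy_and_fisher(b, cutoff=30.0):
    """Entropy and Fisher information of f_b(x) = exp(b x - b^2/2) under the
    standard Gaussian measure, both computed by quadrature."""
    f = lambda x: math.exp(b * x - b * b / 2)
    # The integral of f_b is 1, so Ent(f_b) is the integral of f_b log f_b.
    ent = simpson(lambda x: f(x) * (b * x - b * b / 2) * gauss_density(x),
                  -cutoff, cutoff)
    # f_b' = b f_b, hence Gamma(f_b) / f_b = b^2 f_b.
    fisher = simpson(lambda x: b * b * f(x) * gauss_density(x), -cutoff, cutoff)
    return ent, fisher

if __name__ == "__main__":
    for b in (0.5, 1.0, 2.0):
        ent, fisher = entropy_and_fisher(b)
        print(b, ent, 0.5 * fisher)
```

The two printed columns agree up to quadrature error, which shows that the constant C = 1 cannot be lowered for the Gaussian measure.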
5.2.2 Hypercontractivity
The evolution of the semigroup P = (Pt )t≥0 under a logarithmic Sobolev inequal-
ity may be described in another, new, formulation, at the origin of the notion of
logarithmic Sobolev inequality, known as the hypercontractivity property. This is a
smoothing property that exactly translates the logarithmic Sobolev inequality. The
next theorem presents this fundamental equivalence.
The second assertion of the theorem is thus called hypercontractivity of the semi-
group (Pt )t≥0 . Note that M(t) = 0 under a tight logarithmic Sobolev inequality
(D = 0) so that LS(C) holds if and only if for every t > 0 and f ∈ L^p(μ),
\[
\| P_t f \|_q \le \| f \|_p
\]
for (some or any) 1 < p < q < ∞ such that e^{2t/C} ≥ (q − 1)/(p − 1). What Theorem 5.2.3
tells us is that, under a logarithmic Sobolev inequality, not only are the operators
Pt contractions in all Lp (μ)-spaces, but they improve integrability for large t (in
particular they are contractions from Lp (μ) to Lq (μ) when D = 0). This may be
shown to be optimal for some examples, such as the Ornstein-Uhlenbeck semigroup
(see below). Note that the theorem does not tell us anything in the case where p = 1.
Proof Once the correct starting point has been identified, the proof is rather simple.
To show that LS(C, D) implies hypercontractivity, we use the traditional strategy
of differentiating a suitable functional. Below, f is a function in A_0^{const+} (the result
being extended to functions in L^p(μ) by standard density arguments). A first
observation, already used earlier in this chapter, is that the derivative of the norm ‖·‖_q
along its parameter q gives rise to the entropy. Namely,
\[
\partial_q \| f \|_q^q = \partial_q \int_E f^q \, d\mu
= \int_E f^q \log f \, d\mu
= \frac{1}{q} \left( \mathrm{Ent}_\mu\!\left(f^q\right) + \int_E f^q \, d\mu \, \log \int_E f^q \, d\mu \right).
\]
On the other hand, for every fixed q > 1, by integration by parts and the diffusion
property,
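\[
\begin{aligned}
\partial_t \int_E (P_t f)^q \, d\mu &= q \int_E (P_t f)^{q-1} \, L P_t f \, d\mu \\
&= - q (q - 1) \int_E (P_t f)^{q-2} \, \Gamma(P_t f) \, d\mu.
\end{aligned}
\]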
On the basis of these three observations, the proof may easily be completed. For
f in A_0^{const+} as before, consider
\[
\Lambda = \Lambda(t, q) = \int_E (P_t f)^q \, d\mu, \quad t \ge 0, \; q > 1.
\]
\[
\partial_q \Lambda \le - \frac{C}{2(q - 1)} \, \partial_t \Lambda + \frac{D}{q} \, \Lambda + \frac{1}{q} \, \Lambda \log \Lambda.
\]
For the given expressions of q(t) and M(t), it follows immediately from this
inequality that H(t) = (1/q(t)) log(Λ(t, q(t))) − M(t) is decreasing in t ≥ 0.
Hypercontractivity is then simply the inequality H(t) ≤ H(0).
This shows that the logarithmic Sobolev inequality LS(C, D) implies hypercon-
tractivity for any p > 1. Conversely, start from the hypercontractivity inequality
for some p > 1. Taking the derivative at t = 0 gives back (5.2.4) with q = p which
amounts to the logarithmic Sobolev inequality LS(C, D) after changing f into
f^{2/p}. The proof is complete.
Remark 5.2.4 The above proof may easily be modified to yield a reverse hypercon-
tractivity property. Indeed, choosing a tight logarithmic Sobolev inequality LS(C)
for simplicity, LS(C) holds if and only if for every t > 0 and every strictly positive
measurable function f : E → R,
\[
\| P_t f \|_q \ge \| f \|_p
\]
for (some or any) −∞ < q < p < 1 such that e^{2t/C} = (q − 1)/(p − 1).
\[
\| P_{t_0} f \|_q \le M \, \| f \|_2
\]
for all f ∈ L²(μ), then a logarithmic Sobolev inequality LS(C, D) holds (with
C = 2q t₀/(q − 2) and D = 2q log M/(q − 2)).
This theorem may be shown to follow from a general relationship between en-
tropy and energy along the semigroup, which is an analogue of (4.2.3), p. 184.
Proposition 5.2.6 (Cattiaux’s inequality) For every function f in D(E) and every
t ≥ 0,
\[
\int_E f^2 \log f^2 \, d\mu \le 2t\, \mathcal{E}(f) + \int_E f^2 \log (P_t f)^2 \, d\mu.
\]
Proof We only establish the first assertion, the second one being proved in the same
way. As usual, it is convenient to work with f ∈ A_0^{const+} and to extend to D(E) by
density. As in many of the earlier proofs, consider a functional along the semigroup
given here by Λ(t) = ∫_E f² log P_t f dμ, t ≥ 0. Then, by the integration by parts and
diffusion properties,
\[
\Lambda'(t) = \int_E f^2 \, \frac{L P_t f}{P_t f} \, d\mu
= \int_E \left( - \frac{2f}{P_t f} \, \Gamma(f, P_t f) + \frac{f^2}{(P_t f)^2} \, \Gamma(P_t f) \right) d\mu.
\]
\[
- \frac{2f}{P_t f} \, \Gamma(f, P_t f) + \frac{f^2}{(P_t f)^2} \, \Gamma(P_t f) \ge - \Gamma(f)
\]
so that Λ′(t) ≥ −∫_E Γ(f) dμ = −E(f) for every t. Integrating from 0 to t yields
the result.
Therefore
\[
\left( 1 - \frac{2}{q} \right) \int_E f^2 \log f^2 \, d\mu \le 2t\, \mathcal{E}(f) + \log \| P_t f \|_q^2,
\]
Note that Theorem 5.2.5 (with possibly different bounds) may be shown to follow
alternatively from Lemma 4.2.6, p. 184, and Proposition 5.1.8 (use by duality that
‖P_t f‖₂ ≤ M‖f‖_{q*} where q* is the conjugate exponent of q).
To conclude this section, we observe that, like Poincaré inequalities (Sect. 4.3,
p. 185), logarithmic Sobolev inequalities are stable under products. This dimension-
free feature is one fundamental aspect justifying the importance of logarithmic
Sobolev inequalities in infinite-dimensional contexts. It technically allows for the
extension of logarithmic Sobolev bounds on finite-dimensional space to infinite di-
mension (such as, for example, the extension from finite-dimensional to infinite
dimensional Gaussian measures).
Proof A quick proof may be provided via the equivalence with hypercontractivity
(Theorem 5.2.3). Indeed, if (P_t^1)_{t≥0} and (P_t^2)_{t≥0} denote the respective Markov
semigroups, and if P_t^i is bounded from L^p(μ_i) into L^q(μ_i) with norm e^{M_i} for t
such that e^{2t/C_i} ≥ (q − 1)/(p − 1), i = 1, 2, it is immediate that P_t^1 ⊗ P_t^2 is bounded from
L^p(μ₁ ⊗ μ₂) into L^q(μ₁ ⊗ μ₂) with norm bounded from above by e^{M₁+M₂} for
e^{2t/max(C₁,C₂)} ≥ (q − 1)/(p − 1).
The proposition may also be proved in the same way as Proposition 4.3.1, p. 185,
with a traditional scheme which is often used when dealing with correlations in
statistical mechanics models for example. Indeed, using the notation therein,
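\[
\begin{aligned}
\mathrm{Ent}_{\mu_1 \otimes \mu_2}\!\left(f^2\right) = \mathrm{Ent}_{\mu_1}\!\left(g^2\right)
+ \int_{E_1} \bigg( & \int_{E_2} f^2(x_1, x_2) \log f^2(x_1, x_2) \, d\mu_2(x_2) \\
& - \int_{E_2} f^2(x_1, x_2) \, d\mu_2(x_2) \, \log \int_{E_2} f^2(x_1, x_2) \, d\mu_2(x_2) \bigg) d\mu_1(x_1)
\end{aligned}
\]
that E₁(g) ≤ ∫_{E₂} E₁(f) dμ₂, the conclusion easily follows. However, the latter
requires more here in the form of the change of variables formula for carré du champ
operators and the Cauchy-Schwarz inequality yielding that
\[
\Gamma_1(g)(x_1) \le \int_{E_2} \Gamma_1(f)(x_1, x_2) \, d\mu_2(x_2).
\]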
be shown below (Proposition 5.5.1) to satisfy LS(1) (with respect to the standard
carré du champ operator). Therefore, by Theorem 5.2.3, the associated
Ornstein-Uhlenbeck semigroup P = (P_t)_{t≥0} (as presented in Sect. 2.7.1, p. 103) is
hypercontractive in the sense that for every 1 < p < q < ∞ and t > 0 such that
e^{2t} ≥ (q − 1)/(p − 1), and every f ∈ L^p(μ),
\[
\| P_t f \|_q \le \| f \|_p. \tag{5.3.1}
\]
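To get a feeling for (5.3.1), one may test it on the first Hermite polynomial f(x) = x on the real line. The sketch below relies on two standard facts that are not restated in this passage and are therefore assumptions of the example: the Ornstein-Uhlenbeck semigroup acts on this eigenfunction by P_t f = e^{−t} f, and the absolute moments of the standard Gaussian are E|X|^p = 2^{p/2} Γ((p + 1)/2)/√π. The function names are illustrative.

```python
import math

def gaussian_abs_moment(p):
    """E|X|^p for X standard Gaussian: 2^(p/2) * Gamma((p + 1)/2) / sqrt(pi)."""
    return 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

def lp_norm_of_x(p):
    """L^p(mu)-norm of f(x) = x under the standard Gaussian measure."""
    return gaussian_abs_moment(p) ** (1 / p)

def critical_time(p, q):
    """Smallest t with e^{2t} >= (q - 1)/(p - 1), for 1 < p < q."""
    return 0.5 * math.log((q - 1) / (p - 1))

def hypercontractive_sides(p, q):
    """Return (||P_t f||_q, ||f||_p) at the critical time for f(x) = x,
    assuming P_t f = e^{-t} f for this eigenfunction."""
    t = critical_time(p, q)
    return math.exp(-t) * lp_norm_of_x(q), lp_norm_of_x(p)

if __name__ == "__main__":
    for p, q in ((2, 4), (1.5, 3), (2, 10), (3, 3.5)):
        print(p, q, hypercontractive_sides(p, q))
```

For p = 2 and q = 4 the left-hand side equals 3^{−1/4}, which is below ‖f‖₂ = 1, as predicted.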
Eigenfunctions of the Ornstein-Uhlenbeck operator are described by Hermite
polynomials. Recall from Sect. 2.7.1, p. 103, that the Hermite polynomials (Hk )k∈N
form an orthogonal basis of L2 (μ) on the real line. Similarly, the polynomials
form an orthogonal basis of L2 (μ) on Rn . Now let (H )∈Lk be a given family of
such polynomials with the property that k1 + · · · + kn = k for some fixed k ∈ N, and
set
Q= a H .
∈Lk
k∈N q k∈N
At the expense of further technical arguments not developed here, the exponential
integrability result ∫_E e^{σ²f²/2} dμ < ∞ for every σ² < 1/C still holds under a
defective logarithmic Sobolev inequality LS(C, D).
With respect to Poincaré inequalities (Proposition 4.4.2, p. 190), squares of Lip-
schitz functions are here exponentially integrable (and not just the Lipschitz func-
tions themselves). In particular, the exponential measure, which satisfies a Poincaré
inequality, cannot satisfy a logarithmic Sobolev inequality (for the standard carré
du champ operator). Note that the integrability level in Proposition 5.4.1 is optimal
since the standard Gaussian measure satisfies LS(1) (cf. Proposition 5.5.1 below),
and (5.4.1) is sharp in this example with f linear.
Proof The proof is perhaps even simpler than the corresponding one for Poincaré
inequalities. We start with (5.4.1), and similarly restrict ourselves to bounded Lipschitz
functions in $\mathcal{D}(\mathcal{E})$ (or $\mathcal{A}$), the general case being reached through truncation and
approximation (cf. Remark 4.4.3, p. 190).
Therefore, given a bounded 1-Lipschitz
function $f \in \mathcal{D}(\mathcal{E})$, set for $s \in \mathbb{R}$, $Z(s) = \int_E e^{sf} d\mu$. The aim is to apply LS(C) to
$e^{sf/2}$ for every $s$. Towards this goal, observe that
\[
Z'(s) = \int_E f e^{sf}\, d\mu
\]
while
\[
\mathrm{Ent}_\mu\big(e^{sf}\big) = s \int_E f e^{sf}\, d\mu - Z(s) \log Z(s) = s Z'(s) - Z(s) \log Z(s).
\]
Integrating (5.4.1) along the centered Gaussian measure with variance $\sigma^2$ in the
$s$ variable yields, by Fubini's Theorem, that
\[
\int_E e^{\sigma^2 f^2/2}\, d\mu \le \frac{1}{\sqrt{1 - C\sigma^2}}\,
\exp\bigg( \frac{\sigma^2}{2(1 - C\sigma^2)} \Big( \int_E f\, d\mu \Big)^2 \bigg)
\]
for every $\sigma^2 < \frac{1}{C}$. Hence the first claim of the proposition holds and the proof is
complete.
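The bound obtained in the proof can be tested numerically in the simplest case. The sketch below assumes the standard Gaussian measure on the real line (which satisfies LS(1), so $C = 1$) and compares both sides for the 1-Lipschitz functions $f(x) = x$ and $f(x) = |x|$; the quadrature routine and all names are illustrative choices, not taken from the text.

```python
import math

def gaussian_average(func, half_width=12.0, n=24000):
    """Composite Simpson approximation of int func d(gamma), gamma = N(0, 1)."""
    h = 2.0 * half_width / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * func(x) * math.exp(-x * x / 2.0)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)

def square_exponential_bound(sigma2, C, mean_f):
    """Right-hand side of the bound in the proof, valid for sigma2 < 1 / C."""
    scale = 1.0 - C * sigma2
    return math.exp(sigma2 * mean_f ** 2 / (2.0 * scale)) / math.sqrt(scale)

sigma2 = 0.5
# f(x) = x and f(x) = |x| have the same square, hence the same left-hand side
lhs = gaussian_average(lambda x: math.exp(sigma2 * x * x / 2.0))
rhs_linear = square_exponential_bound(sigma2, C=1.0, mean_f=0.0)
rhs_abs = square_exponential_bound(sigma2, C=1.0, mean_f=math.sqrt(2.0 / math.pi))
```

In this example the linear function attains equality (both sides equal $(1 - \sigma^2)^{-1/2}$), while $f(x) = |x|$, whose mean is $\sqrt{2/\pi}$, leaves a strict gap.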
Proposition 5.4.1 admits a simple and useful variation in the form of moment
estimates. Namely, if $R(p) = \|f\|_p^2$, $p \ge 2$, for a function $f \in L^p(\mu)$ with
\[
\frac{2}{p^2}\, R(p)^{1-(p/2)}\, \mathrm{Ent}_\mu\big(f^p\big).
\]
If f is Lipschitz, then
for every p ≥ 2. By a series expansion, the growth in p amounts to the same level
of integrability as that in Proposition 5.4.1.
The next statement is an analogue of Proposition 4.4.4, p. 192, in the setting of
logarithmic Sobolev inequalities. The proof is entirely similar.
As in Sect. 4.4.3, p. 192, for Poincaré inequalities, Proposition 5.4.1 may be refor-
mulated in the form of a Gaussian measure concentration property for the measure
μ satisfying a logarithmic Sobolev inequality LS(C) for some C > 0. Indeed, for
every Lipschitz function f and every r ≥ 0,
\[
\mu\Big( f \ge \int_E f\, d\mu + r \Big) \le e^{-r^2/2C\|f\|_{\mathrm{Lip}}^2} . \tag{5.4.2}
\]
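For a concrete instance of (5.4.2), one may take the standard Gaussian measure on the real line ($C = 1$) and the 1-Lipschitz function $f(x) = x$, for which the left-hand side is the Gaussian tail. The Python sketch below makes these assumptions explicit; the helper names are hypothetical.

```python
import math

def gaussian_upper_tail(r: float) -> float:
    """mu(f >= r) for f(x) = x under the standard Gaussian measure on R."""
    return 0.5 * math.erfc(r / math.sqrt(2.0))

def concentration_bound(r: float, C: float = 1.0, lip: float = 1.0) -> float:
    """Right-hand side of (5.4.2): exp(-r^2 / (2 C ||f||_Lip^2))."""
    return math.exp(-r * r / (2.0 * C * lip * lip))

radii = [0.0, 0.5, 1.0, 2.0, 4.0]
# each entry is (r, exact tail, bound from (5.4.2))
checks = [(r, gaussian_upper_tail(r), concentration_bound(r)) for r in radii]
```

The comparison shows that the Gaussian rate $e^{-r^2/2}$ in (5.4.2) is of the correct order in this example, the exact tail being smaller only by a polynomial factor in $r$.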
Remark 5.4.4 In analogy with Remark 4.4.5, p. 193, in the context of Poincaré
inequalities, one may wonder how far the concentration bounds of the Herbst ar-
gument (Proposition 5.4.1 and (5.4.2)) are from the logarithmic Sobolev inequal-
ity LS(C) (or LS(C, D)). Again, the Muckenhoupt characterization below (The-
orem 5.4.5) indicates that they are not sufficient in general to entail a logarithmic
Sobolev inequality. However, under a curvature condition CD(0, ∞), they do imply
a logarithmic Sobolev inequality with moreover proportional constants. This result
will be presented in Sect. 8.7, p. 425, together with the corresponding result for
Poincaré inequalities on the basis of the local heat kernel inequalities of Sect. 5.5
below (in particular the reverse forms).
We conclude this section with an analogue of the Muckenhoupt criterion for log-
arithmic Sobolev inequalities. As for Poincaré inequalities (Sect. 4.5.1, p. 194),
it is possible to characterize measures μ on the real line R satisfying a logarith-
mic Sobolev inequality LS(C) with respect to the usual carré du champ operator
$\Gamma(f) = f'^2$. The next statement is an analogue of Theorem 4.5.1, p. 194, with a
similar although more involved proof not presented here. As for the correspond-
ing equivalence for Poincaré inequalities, this result appears as the one-dimensional
version of more general relationships between capacities and measures discussed in
Chap. 8.
for the logarithmic Sobolev inequality if and only if α ≥ 2. Recall that the corre-
sponding criterion for the Poincaré inequality yields in this case α ≥ 1, thus distin-
guishing clearly the two families of inequalities. As a further example of interest, the
spectral decomposition of the Laguerre operator in Sect. 2.7.3, p. 111, shows that
the exponential measure μ on R+ has spectral gap 1 for the carré du champ operator
$\Gamma(f) = x f'^2$. Since the Laguerre operator is of curvature $CD(\frac12, \infty)$, according to
Proposition 5.7.1 below, μ satisfies a logarithmic Sobolev inequality with constant
$C = 2$. This constant is optimal since the 1-Lipschitz map $f(x) = 2\sqrt{x}$ (with respect
to $\Gamma$) is such that $\int_{\mathbb{R}_+} e^{\sigma^2 f^2/2}\, d\mu < \infty$ for every $\sigma^2 < \frac12$ and nothing better
(Proposition 5.4.1).
5.5 Local Logarithmic Sobolev Inequalities 257
Following the treatment of Poincaré inequalities in Sect. 4.7, p. 206, of the pre-
vious chapter, this section investigates logarithmic Sobolev inequalities under the
semigroup P = (Pt )t≥0 , that is with respect to the heat kernel measures of the rep-
resentation (1.2.4), p. 12,
\[
P_t f(x) = \int_E f(y)\, p_t(x, dy), \quad t \ge 0, \ x \in E.
\]
The approach further develops the main principle of heat flow monotonicity using
the Duhamel interpolation formula, here in the context of entropy.
Before addressing the general framework, we first consider the Ornstein-
Uhlenbeck example leading to the logarithmic Sobolev inequalities for Gaussian
measure. The section closes with a version of hypercontractivity for heat kernel
measures and some heat kernel bounds. The local heat kernel inequalities will be
presented and established for functions in the algebra $\mathcal{A}_0$ of the Markov Triple
$(E, \mu, \Gamma)$, and we technically work with the class $\mathcal{A}_0^{\mathrm{const}+}$.
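As for the local Poincaré inequalities in Sect. 4.7, p. 206, consider now, for $t > 0$
and $f \in \mathcal{A}_0^{\mathrm{const}+}$,
\[
\Lambda(s) = P_s\big( \psi(P_{t-s} f) \big), \quad s \in [0, t],
\]
where $\psi(r) = r \log r$, $r \in \mathbb{R}_+$. Then, setting $g = P_{t-s} f$, by the heat equation and
the change of variables formula,
\[
\Lambda'(s) = P_s\big( L \psi(g) - \psi'(g)\, Lg \big) = P_s\big( \psi''(g)\, \Gamma(g) \big) = P_s\Big( \frac{\Gamma(g)}{g} \Big).
\]
Now, by (5.5.1),
\[
\Gamma(g) = |\nabla P_{t-s} f|^2 \le e^{-2(t-s)} \big( P_{t-s}(|\nabla f|) \big)^2 .
\]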
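Since $p_{t-s}(x, dy)$ is a probability measure, by the Cauchy-Schwarz inequality, for
every $h$,
\[
(P_{t-s} h)^2 \le P_{t-s} f \; P_{t-s}\Big( \frac{h^2}{f} \Big).
\]
With $h = |\nabla f|$, this amounts to
\[
\frac{\big( P_{t-s}(|\nabla f|) \big)^2}{P_{t-s} f} \le P_{t-s}\Big( \frac{|\nabla f|^2}{f} \Big) = P_{t-s}\Big( \frac{\Gamma(f)}{f} \Big).
\]
As a consequence,
\[
\Lambda'(s) \le e^{-2(t-s)} P_s\Big( P_{t-s}\Big( \frac{\Gamma(f)}{f} \Big) \Big) = e^{-2(t-s)} P_t\Big( \frac{\Gamma(f)}{f} \Big).
\]
By definition of $\Lambda(s)$, $s \in [0, t]$, it follows that, for every $t \ge 0$,
\[
P_t(f \log f) - P_t f \log P_t f \le \frac{1 - e^{-2t}}{2}\, P_t\Big( \frac{\Gamma(f)}{f} \Big). \tag{5.5.2}
\]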
It should be emphasized that with respect to the corresponding Poincaré inequal-
ity (4.7.2), p. 207, the logarithmic Sobolev inequality (5.5.2) makes use of the com-
mutation (5.5.1) rather than only (4.7.1), p. 207 (strong gradient bound versus gra-
dient bound in the terminology of Chap. 3). This aspect will be further developed in
Theorem 5.5.2 below.
The preceding inequality is thus a (tight) logarithmic Sobolev inequality
$LS(1 - e^{-2t})$ for the heat kernel measures $p_t(x, dy)$, with a constant independent
of the initial point x (implicit throughout the argument), which converges to 1 as
t → ∞. Since the kernels pt (x, dy) converge to the standard Gaussian measure μ,
as seen from (2.7.3), p. 104, we therefore conclude that the latter satisfies LS(1).
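The local inequality (5.5.2) can be probed numerically for a specific function. The sketch below assumes the one-dimensional Mehler representation, in which $p_t(x, dy)$ is the Gaussian law with mean $e^{-t}x$ and variance $1 - e^{-2t}$, and uses the test function $f(y) = 1 + y^2$; the quadrature and the names are illustrative assumptions, not part of the text.

```python
import math

def ou_average(func, x, t, n=4000):
    """P_t func(x) for the Ornstein-Uhlenbeck semigroup via Simpson's rule.

    Assumes the Mehler representation: p_t(x, dy) is N(e^{-t} x, 1 - e^{-2t}).
    """
    mean = math.exp(-t) * x
    std = math.sqrt(1.0 - math.exp(-2.0 * t))
    h = 24.0 / n  # integrate over 12 standard deviations on each side
    total = 0.0
    for i in range(n + 1):
        z = -12.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * func(mean + std * z) * math.exp(-z * z / 2.0)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)

def local_lsi_sides(f, df, x, t):
    """Return (left-hand side, right-hand side) of (5.5.2) at the point x."""
    ptf = ou_average(f, x, t)
    entropy = ou_average(lambda y: f(y) * math.log(f(y)), x, t) - ptf * math.log(ptf)
    fisher = ou_average(lambda y: df(y) ** 2 / f(y), x, t)
    return entropy, 0.5 * (1.0 - math.exp(-2.0 * t)) * fisher

f = lambda y: 1.0 + y * y   # strictly positive test function
df = lambda y: 2.0 * y      # its derivative, so Gamma(f) = df^2
sides = [local_lsi_sides(f, df, x, t) for x in (-1.0, 0.0, 2.0) for t in (0.1, 0.5, 2.0)]
```

Each pair in `sides` should satisfy entropy side ≤ Fisher side, uniformly in the starting point $x$, in line with the remark that the constant $1 - e^{-2t}$ does not depend on $x$.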
Proposition 5.5.1 (Logarithmic Sobolev inequality for the Gaussian measure) The
standard Gaussian measure μ on the Borel sets of $\mathbb{R}^n$ satisfies LS(1). In other
words, for every function $f : \mathbb{R}^n \to \mathbb{R}$ in the Dirichlet space of the standard carré
du champ operator $\Gamma(f) = |\nabla f|^2$,
\[
\mathrm{Ent}_\mu(f^2) \le 2\, \mathcal{E}(f) = 2 \int_{\mathbb{R}^n} |\nabla f|^2\, d\mu .
\]
This proposition states the logarithmic Sobolev inequality for the centered Gaus-
sian measure μ on Rn with identity as covariance matrix (sometimes called the
Gaussian logarithmic Sobolev inequality). If μ is a centered Gaussian measure on
Rn with covariance matrix Q, a simple change of variables shows that for every
smooth function f on Rn ,
\[
\mathrm{Ent}_\mu(f^2) \le 2 \int_{\mathbb{R}^n} Q \nabla f \cdot \nabla f\, d\mu .
\]
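The change of variables may be spelled out as follows. This is only a sketch: the matrix $A$ with $A A^{\top} = Q$ and the symbol $\gamma$ for the standard Gaussian measure are notation introduced here for illustration.

```latex
% gamma: standard Gaussian measure on R^n; A: any matrix with A A^T = Q,
% so that mu is the image of gamma under the linear map x -> A x.
\begin{align*}
\mathrm{Ent}_\mu(f^2)
  &= \mathrm{Ent}_\gamma\big((f \circ A)^2\big) \\
  &\le 2 \int_{\mathbb{R}^n} \big|\nabla (f \circ A)\big|^2 \, d\gamma
     && \text{(Proposition 5.5.1 applied to } f \circ A\text{)} \\
  &= 2 \int_{\mathbb{R}^n} \big| A^{\top} (\nabla f) \circ A \big|^2 \, d\gamma \\
  &= 2 \int_{\mathbb{R}^n} \big( Q \nabla f \cdot \nabla f \big) \circ A \; d\gamma
   = 2 \int_{\mathbb{R}^n} Q \nabla f \cdot \nabla f \, d\mu .
\end{align*}
```

The only computation used is $|A^{\top} v|^2 = A A^{\top} v \cdot v = Q v \cdot v$.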
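\[
P_t(f \log f) - P_t f \log P_t f \ge \frac{e^{2t} - 1}{2} \cdot \frac{\Gamma(P_t f)}{P_t f} \tag{5.5.3}
\]
for every $f$ positive in $\mathcal{A}_0$ (where the right-hand side is understood, according to
Remark 5.1.2, as $\lim_{\varepsilon \searrow 0} \frac{\Gamma(P_t f)}{P_t f + \varepsilon}$).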
In a further similarity with the chapter on Poincaré inequalities, the same families
of inequalities may be established for the Euclidean (Brownian) heat semigroup
P = (Pt )t≥0 on Rn . Recall that in this case |∇Pt f | ≤ Pt (|∇f |). The corresponding
local inequalities for the heat kernel measures then take the form
\[
t\, \frac{\Gamma(P_t f)}{P_t f} \le P_t(f \log f) - P_t f \log P_t f \le t\, P_t\Big( \frac{\Gamma(f)}{f} \Big).
\]
Since at time $t = \frac12$, the distribution of the heat semigroup $P_t$ is the standard Gaussian
measure, the second (right-hand) inequality recovers Proposition 5.5.1.
The preceding analysis may be performed for any diffusion Markov semigroup
P = (Pt )t≥0 under the curvature condition CD(ρ, ∞) of Definition 3.3.14, p. 159.
With respect to Theorem 4.7.2, p. 209, dealing with local Poincaré inequalities, the
condition CD(ρ, ∞) is enriched by several new (stronger) equivalences, in partic-
ular the reinforced curvature condition (ii) and the strong gradient bound (iii).
\[
P_t(f \log f) - P_t f \log P_t f \ge \frac{e^{2\rho t} - 1}{2\rho}\, \frac{\Gamma(P_t f)}{P_t f} . \tag{5.5.6}
\]
When $\rho = 0$, the quantities $\frac{1 - e^{-2\rho t}}{2\rho}$ and $\frac{e^{2\rho t} - 1}{2\rho}$ have to be replaced by $t$ (as is
the case for the heat semigroup on $\mathbb{R}^n$ of curvature $CD(0, \infty)$). Also recall from
Remark 5.1.2 that the Fisher-type expressions such as $P_t\big( \frac{\Gamma(f)}{f} \big)$ are understood as
$\lim_{\varepsilon \searrow 0} P_t\big( \frac{\Gamma(f)}{f + \varepsilon} \big)$. The inequalities in Theorem 5.5.2 extend to positive functions
f in A following the various extension procedures described in Sect. 3.3, p. 151.
Moreover, as for the corresponding Poincaré inequality (4.7.6), p. 210, the reverse
logarithmic Sobolev inequality (5.5.6) (for t > 0) extends to measurable positive
functions.
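When the constants of the local inequalities are evaluated numerically, the convention at $\rho = 0$ has to be built in, and small values of $\rho t$ are best handled with `math.expm1` to avoid cancellation. The helpers below are a minimal sketch; their names are hypothetical.

```python
import math

def forward_constant(rho: float, t: float) -> float:
    """(1 - e^{-2 rho t}) / (2 rho), read as t when rho = 0."""
    if rho == 0.0:
        return t
    return -math.expm1(-2.0 * rho * t) / (2.0 * rho)

def reverse_constant(rho: float, t: float) -> float:
    """(e^{2 rho t} - 1) / (2 rho), read as t when rho = 0."""
    if rho == 0.0:
        return t
    return math.expm1(2.0 * rho * t) / (2.0 * rho)
```

For $\rho > 0$ one has `forward_constant(rho, t) <= t <= reverse_constant(rho, t)`, and both constants agree with $t$ to first order as $t \to 0$.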
Proof We refer to Sect. C.6, p. 513, in Appendix C for the equivalence between
the CD(ρ, ∞) condition and (ii) already used in Theorem 3.3.18, p. 163, and at the
heart of the (strong) gradient bound (iii). As explained there, this apparent reinforcement
of the curvature condition only relies on the chain rule formulas for $L$ and $\Gamma$,
that is, the diffusion property. The proof from the gradient bound (iii) to the local
logarithmic Sobolev inequality (iv) and its reverse form (v) is exactly the same as the
one presented above for the Ornstein-Uhlenbeck and Brownian semigroups (with
the only modification being the constant ρ). To complete the circle of equivalences,
proceed as usual to the asymptotics as $t \to 0$ of the local logarithmic Sobolev inequality
or the reverse inequality yielding the curvature condition $CD(\rho, \infty)$. More
simply, use that these (local) logarithmic Sobolev inequalities imply the correspond-
ing (local) Poincaré inequalities (by Theorem 5.1.3) and the corresponding result for
Poincaré inequalities (Theorem 4.7.2, p. 209). The proof is therefore complete.
The proof is entirely similar, and (5.5.5) for example, then simply reads
\[
P_t(f \log f) - P_t f \log P_t f \le \frac{K(1 - e^{-2\rho t})}{2\rho}\, P_t\Big( \frac{\Gamma(f)}{f} \Big).
\]
Similarly, the analogue of Remark 4.7.4, p. 211, indicates that all the assertions
of Theorem 5.5.2 are also equivalent to saying that there exist t0 > 0 and a function
$c(t) = t - \rho t^2 + o(t^2)$ such that, for any $t \in (0, t_0)$ and any positive $f$ in $\mathcal{A}_0$,
\[
P_t(f \log f) - P_t f \log P_t f \le c(t)\, P_t\Big( \frac{\Gamma(f)}{f} \Big),
\]
respectively
\[
P_t(f \log f) - P_t f \log P_t f \ge c(t)\, \frac{\Gamma(P_t f)}{P_t f} .
\]
While the proof of Theorem 5.5.2 is based on the reinforced curvature condition
in terms of the commutation (5.5.4) of $P_t$ and $\sqrt{\Gamma}$, the standard definition of
$CD(\rho, \infty)$ is actually enough to reach the local logarithmic Sobolev inequalities of
Theorem 5.5.2 (as for Poincaré inequalities). It is worth emphasizing the argument
here since the principle will be used again in subsequent developments. The following
proposition is the key technical step towards this goal, illustrating the role of the
$\Gamma_2$ operator in the heat flow monotonicity principle. It will allow us later to reach
stronger conclusions such as Sobolev-type inequalities as well as Harnack-type in-
equalities.
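Then,
\[
\Lambda'(s) = P_s\big( P_{t-s} f \; \Gamma(\log P_{t-s} f) \big)
\]
and
\[
\Lambda''(s) = 2 P_s\big( P_{t-s} f \; \Gamma_2(\log P_{t-s} f) \big).
\]
Now, on the basis of these identities, under the curvature condition $CD(\rho, \infty)$,
that is $\Gamma_2(f) \ge \rho\, \Gamma(f)$ for every $f \in \mathcal{A}_0$ (or $\mathcal{A}$), it follows that $\Lambda'' \ge 2\rho\, \Lambda'$. The
local inequalities of Theorem 5.5.2 are then easily derived in this alternative way.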
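One way to carry out this derivation is sketched below. It uses only the differential inequality between the first two derivatives of $\Lambda$, together with the endpoint values $\Lambda'(0) = \Gamma(P_t f)/P_t f$ and $\Lambda'(t) = P_t(\Gamma(f)/f)$ read off from the first identity.

```latex
% \Lambda'' \ge 2\rho \Lambda' says that s -> e^{-2 \rho s} \Lambda'(s) is increasing on [0,t], hence
\begin{align*}
e^{2\rho s}\, \Lambda'(0) \;\le\; \Lambda'(s) \;\le\; e^{-2\rho (t-s)}\, \Lambda'(t),
\qquad s \in [0,t].
\end{align*}
% Integrating in s over [0,t], with \Lambda(t) - \Lambda(0) = P_t(f \log f) - P_t f \log P_t f:
\begin{align*}
\frac{e^{2\rho t} - 1}{2\rho}\, \frac{\Gamma(P_t f)}{P_t f}
\;\le\; P_t(f \log f) - P_t f \log P_t f
\;\le\; \frac{1 - e^{-2\rho t}}{2\rho}\, P_t\Big( \frac{\Gamma(f)}{f} \Big).
\end{align*}
```

The left-hand bound is the reverse inequality (5.5.6) and the right-hand bound is the local logarithmic Sobolev inequality, with both constants read as $t$ when $\rho = 0$.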
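The second derivative is more specific to the function $\psi(r) = r \log r$. We have,
with $\varphi = \psi''$, again by the heat equation,
\[
\Lambda''(s) = P_s\big( L\big( \varphi(g)\, \Gamma(g) \big) - \varphi'(g)\, Lg\, \Gamma(g) - 2 \varphi(g)\, \Gamma(g, Lg) \big),
\]
where the expression under $P_s$, denoted $E(g)$, may be written by the change of variables
formula (3.1.9), p. 124, for $L$ as
\[
E(g) = 2 \varphi(g)\, \Gamma_2(g) + 2 \varphi'(g)\, \Gamma\big( g, \Gamma(g) \big) + \varphi''(g)\, \Gamma(g)^2 .
\]
For the choice of $\psi(r) = r \log r$, and thus $\varphi(r) = \frac{1}{r}$, this expression may be directly
compared to the change of variables formula (3.3.2), p. 158, for the $\Gamma_2$ operator
(consequence of the diffusion property of $L$), that is
\[
\Gamma_2\big( \psi_1(g) \big) = \psi_1'(g)^2\, \Gamma_2(g) + \psi_1'(g) \psi_1''(g)\, \Gamma\big( g, \Gamma(g) \big) + \psi_1''(g)^2\, \Gamma(g)^2 .
\]
With $\psi_1(r) = \log r$, $E(g) = 2 g\, \Gamma_2(\log g)$ from which the conclusion follows.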
Remark 5.5.4 The functions $\psi(r) = r^2$ ($r \in \mathbb{R}$) and $\psi(r) = r \log r$ ($r \in \mathbb{R}_+$) are the
only ones for which the second derivative of $\Lambda(s) = P_s(\psi(P_{t-s} f))$, $s \in [0, t]$, takes
such a simple form. For $\psi(r) = r^2$, even the subsequent derivatives have nice expressions
yielding the iterated Gamma operators $\Gamma_3$, $\Gamma_4$, ... (constructed by the same
rule as $\Gamma_2$). This is no longer the case for the entropy $\psi(r) = r \log r$ where the third
derivative already has a complicated form. Nevertheless, analogues of the preceding
local inequalities may be obtained by replacing $r^2$ or $r \log r$ by $r^\alpha$, $1 < \alpha < 2$ (or
even more general functions as in Proposition 7.6.1, p. 383, below). Such inequal-
ities (modified a little) are of interest for heat kernel decays as well as tail bounds
between exponential and Gaussian behaviour (cf. Chap. 7).
Under a logarithmic Sobolev inequality, the Markov semigroup P = (Pt )t≥0 is hy-
percontractive (Theorem 5.2.3). The same argument may be developed under a lo-
cal logarithmic Sobolev inequality yielding hypercontractive characterizations of
the curvature bound CD(ρ, ∞). One advantage of this description is that it ex-
tends to positive measurable functions, and is therefore stable under various kinds
of semigroup convergences. Furthermore, as shown by its proof, Theorem 5.5.5 pro-
vides a semigroup characterization of the curvature condition CD(ρ, ∞) without
any derivation argument.
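\[
\frac{q - 1}{p - 1} = \frac{e^{2\rho t} - 1}{e^{2\rho s} - 1}, \tag{5.5.7}
\]
\[
\Big( P_s\big( (P_{t-s} f)^q \big) \Big)^{1/q} \le \Big( P_t\big( f^p \big) \Big)^{1/p} \tag{5.5.8}
\]
for all positive measurable functions $f$.
As usual, when $\rho = 0$, then $\frac{1 - e^{-2\rho t}}{2\rho}$ has to be replaced by $t$ and $\frac{e^{2\rho t} - 1}{e^{2\rho s} - 1}$
by $\frac{t}{s}$.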
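The relation (5.5.7) between the exponents is easy to evaluate explicitly. The following Python helper is a small sketch (the function name is hypothetical) that solves (5.5.7) for $q$ given $p$, $s$, $t$ and $\rho$, including the convention at $\rho = 0$.

```python
import math

def local_hypercontractive_exponent(p: float, s: float, t: float, rho: float) -> float:
    """Solve (5.5.7) for q, given p > 1 and 0 < s <= t."""
    if rho == 0.0:
        ratio = t / s  # limiting value of the ratio as rho -> 0
    else:
        ratio = math.expm1(2.0 * rho * t) / math.expm1(2.0 * rho * s)
    return 1.0 + (p - 1.0) * ratio

q_same_time = local_hypercontractive_exponent(2.0, 1.0, 1.0, rho=1.0)
q_brownian = local_hypercontractive_exponent(2.0, 0.5, 1.0, rho=0.0)
q_large_time = local_hypercontractive_exponent(2.0, 20.0 - 0.3, 20.0, rho=1.0)
```

With $s = t$ the exponent is unchanged ($q = p$), for $\rho = 0$ the gain is governed by $t/s$, and for $\rho > 0$, $s = t - u$ and large $t$ the exponent approaches $1 + (p - 1) e^{2\rho u}$.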
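Proof Theorem 5.5.2 indicates that (i) is equivalent to the local logarithmic Sobolev
inequality (5.5.5). Assume then that (5.5.5) holds. Let $f$ be in $\mathcal{A}_0^{\mathrm{const}+}$ and consider
\[
\Lambda(s) = \Big( P_s\big( (P_{t-s} f)^{q(s)} \big) \Big)^{1/q(s)}, \quad s \in [0, t],
\]
for a differentiable function $q = q(s) > 1$. With $g = P_{t-s} f$,
\[
\frac{q^2}{q'}\, \Lambda^{q-1} \Lambda'(s) = \mathrm{Ent}_{p_s}\big( g^q \big) + \frac{q^2 (q - 1)}{q'}\, P_s\big( g^{q-2}\, \Gamma(g) \big)
\le q^2 \bigg( \frac{1 - e^{-2\rho s}}{2\rho} + \frac{q - 1}{q'} \bigg) P_s\big( g^{q-2}\, \Gamma(g) \big)
\]
by the local logarithmic Sobolev inequality (5.5.5). Now, if $q > 1$ satisfies
$\frac{1 - e^{-2\rho s}}{2\rho} + \frac{q - 1}{q'} = 0$, that is,
\[
\frac{q(s) - 1}{q(t) - 1} = \frac{e^{2\rho t} - 1}{e^{2\rho s} - 1},
\]
then $\Lambda$ is increasing which amounts to (5.5.8) (with $q(s) = q$ and $q(t) = p$).
Conversely, assuming (ii), a first order Taylor expansion (with $t$ fixed) as $p = 2$,
$q = 2(1 + \varepsilon)$ and $s = t(1 - \alpha \varepsilon) + o(\varepsilon)$ with $\alpha = \frac{1 - e^{-2\rho t}}{\rho t}$ as $\varepsilon \to 0$ (for which (5.5.7)
is satisfied in the limit), yields the local logarithmic Sobolev inequality (5.5.5) which
is equivalent to $CD(\rho, \infty)$. Inequality (5.5.8) for functions in $\mathcal{A}_0^{\mathrm{const}+}$ extends by
density to positive measurable functions. The proof of Theorem 5.5.5 is complete.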
t > 0, 0 < s ≤ t, −∞ < q < p < 1 satisfying (5.5.7) and f strictly positive, may be
added to the equivalences of Theorem 5.5.5.
It should be observed that Theorem 5.5.5 is not a direct consequence of the hy-
percontractivity Theorem 5.2.3 applied to the local logarithmic Sobolev inequalities,
which would correspond to the semigroup with the same carré du champ operator $\Gamma$
and reversible measure $p_t(x, dy)$. On the other hand, Theorem 5.5.5 applied with
$s = t - u$ yields as $t \to \infty$ with $u$ fixed the hypercontractive bound $\|P_u f\|_q \le \|f\|_p$
of Theorem 5.2.3.
This comment implies that (5.5.8) of Theorem 5.5.5 is optimal for the Ornstein-
Uhlenbeck semigroup in the sense that ρ = 1 is the optimal parameter such
that (5.5.8) holds. For the Brownian semigroup (ρ = 0), Theorem 6.7.7, p. 304,
will indicate how this inequality becomes optimal with Gaussian extremal functions
and parameters depending upon the dimension.
The application discussed here concerns some cheap but rough heat kernel bounds
which may be produced from the exponential integrability of heat kernel measures.
Assume that the Markov semigroup P = (Pt )t≥0 is such that Pt , t > 0, admits a
density kernel pt (x, y) with respect to the invariant measure μ as
\[
P_t f(x) = \int_E f(y)\, p_t(x, y)\, d\mu(y), \quad t > 0, \ x \in E,
\]
Assume now a uniform bound on the kernel pt (x, z) ≤ K(t) (such bounds will
be discussed later in Chaps. 6 and 7). Furthermore, assume that we are in a setting in
which the local logarithmic Sobolev inequalities (5.5.5) of Theorem 5.5.2 hold for
the measures pt (·, z)dμ(z). Therefore, together with the exponential integrability
result for the Lipschitz distance function (Proposition 5.4.1), there exists a c(t) > 0
such that
\[
\int_E p_t(x, z)\, e^{4 d(x, z)^2 / c(t)}\, d\mu(z) < \infty
\]
5.6 Infinite-Dimensional Harnack Inequalities 265
where K (t) > 0 only depends on t (and the choice of c(t) in accordance with
geometric features). This bound emphasizes, at a mild level, the standard Gaussian
bounds which may be expected on heat kernels. Such bounds will be addressed later
in Chap. 7. It should be added that the only features used in the previous argument
are the triangle inequality for the distance function d and the fact that z → d(x, z)
is 1-Lipschitz.
This section is concerned with Harnack-type inequalities under the curvature condi-
tion CD(ρ, ∞) and with various applications of the heat kernel logarithmic Sobolev
inequalities of the previous section.
The main result is a Harnack-type inequality, a consequence of the gradient
bounds under curvature conditions (cf. Theorem 3.3.18, p. 163, or (iii) of Theo-
rem 5.5.2). We state it for convenience in the setting of smooth complete connected
Riemannian manifolds (in order to freely speak of geodesics joining two points)
although the Markov Triple version may also be reached by different means (see
below). The algebra A0 then stands for the family of smooth compactly supported
functions.
Theorem 5.6.1 (Wang's Harnack inequality) Let $P = (P_t)_{t \ge 0}$ be a Markov semigroup
with infinitesimal generator $L = \Delta_g - \nabla W \cdot \nabla$ on a complete connected
Riemannian manifold $(M, g)$. The following are equivalent.
(i) The curvature condition $CD(\rho, \infty)$ holds for some $\rho \in \mathbb{R}$.
(ii) For every positive measurable function $f$ on $M$, every $t > 0$, every $x, y \in M$
and every $\alpha > 1$,
\[
(P_t f)^\alpha (x) \le P_t\big( f^\alpha \big)(y)\, \exp\bigg( \frac{\alpha \rho\, d(x, y)^2}{2 (\alpha - 1)(e^{2\rho t} - 1)} \bigg). \tag{5.6.1}
\]
Proof We prove the implication from (i) to (ii) for $\alpha = 2$ and $\rho = 0$. Hence, the
inequality to establish is, for a positive smooth function $f$,
\[
(P_t f)^2 (x) \le P_t\big( f^2 \big)(y)\, e^{d^2/2t}
\]
where $d = d(x, y)$. The idea of the proof relies on a variation of the usual interpolation
scheme by joining $x$ and $y$ by a constant speed geodesic $(x_s)_{s \in [0, t]}$ and then
analyzing the function
\[
\Lambda(s) = P_s\big( (P_{t-s} f)^2 \big)(x_s), \quad s \in [0, t].
\]
Integrating this differential inequality immediately yields the conclusion. The general
case, $\alpha > 1$ and $\rho \ne 0$, is similar, the only difference is that we have to choose
a geodesic $(x_s)_{s \in [0, 1]}$ with a non-constant speed.
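The case $\alpha = 2$, $\rho = 0$ can be checked by hand on the real line for exponential functions. The Python sketch below assumes the heat semigroup generated by the Laplacian on $\mathbb{R}$, that is $P_t f(x) = \mathbb{E} f(x + \sqrt{2t}\, Z)$ with $Z$ standard normal, and $f(y) = e^{ay}$; these choices and the function names are illustrative assumptions.

```python
import math

def log_heat_exp(a: float, x: float, t: float) -> float:
    """log P_t f(x) for f(y) = exp(a y), with P_t generated by the Laplacian on R.

    Assumes P_t f(x) = E f(x + sqrt(2 t) Z), so that P_t f(x) = exp(a x + a^2 t).
    """
    return a * x + a * a * t

def harnack_slack(a: float, x: float, y: float, t: float) -> float:
    """log(right-hand side) - log(left-hand side) for alpha = 2 and rho = 0."""
    d = abs(x - y)
    lhs = 2.0 * log_heat_exp(a, x, t)                       # log (P_t f)^2 (x)
    rhs = log_heat_exp(2.0 * a, y, t) + d * d / (2.0 * t)   # log P_t(f^2)(y) + d^2 / 2t
    return rhs - lhs

slacks = [harnack_slack(a, x, y, t)
          for a in (-1.0, 0.5, 2.0) for x in (-2.0, 0.0, 3.0)
          for y in (-1.0, 1.0) for t in (0.1, 1.0, 5.0)]
extremal = harnack_slack(1.0, 2.0, 0.0, 1.0)  # here d = 2 a t
```

The slack equals $\big( \sqrt{2t}\, |a| \mp d/\sqrt{2t} \big)^2 \ge 0$ and vanishes when $x - y = 2at$, which suggests that the exponent $d^2/2t$ cannot be improved in this example.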
For the converse implication from (ii) to (i), assume again for simplicity that
ρ = 0. Let α = 1 + ε and let yε be the exponential map starting from x with initial
tangent vector v. A Taylor expansion (as ε goes to 0) of (5.6.1) then yields at the
point x ∈ M,
\[
P_t(f \log f) - P_t f \log P_t f \ge - v \cdot \nabla P_t(f) - \frac{|v|^2}{4t}\, P_t f .
\]
Choosing $v = -2t\, \frac{\nabla P_t f}{P_t f}$ yields the logarithmic Sobolev inequality in its reverse
form (5.5.6) which is equivalent to the curvature condition CD(ρ, ∞). The proof is
therefore complete.
\[
P_t(\log f)(x) \le \log P_t f(y) + \frac{\rho\, d(x, y)^2}{2 (e^{2\rho t} - 1)} \tag{5.6.2}
\]
for any strictly positive bounded measurable function $f$ on $M$, any $t > 0$ and any
$x, y \in M$. This inequality may be proved alternatively by letting $\alpha \to \infty$ in (5.6.1)
(after changing $f$ into $f^{1/\alpha}$).
The latter log-Harnack inequality has a simple interesting consequence for lower
bounds on the kernel density pt (x, y) of Pt with respect to μ, which we assume
to be a probability density. Namely, applying it to f (z) = pt (x, z) + ε, z ∈ M, and
letting ε → 0, it implies that for any x, y ∈ M,
\[
\int_M p_t(x, z) \log p_t(x, z)\, d\mu(z) \le \log p_{2t}(x, y) + \frac{\rho\, d(x, y)^2}{2 (e^{2\rho t} - 1)} .
\]
By Jensen's inequality for the convex function $r \mapsto r \log r$, $r \in \mathbb{R}_+$, with respect
to μ, since $p_t(x, \cdot)$ is a probability density, the left-hand side of this inequality is
positive. Replacing $t$ by $\frac{t}{2}$, it follows that, for all $t > 0$ and $(x, y) \in M \times M$,
\[
p_t(x, y) \ge \exp\bigg( - \frac{\rho\, d(x, y)^2}{2 (e^{\rho t} - 1)} \bigg). \tag{5.6.3}
\]
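The lower bound (5.6.3) can be compared with an explicit kernel. The sketch below assumes the one-dimensional Ornstein-Uhlenbeck semigroup, which satisfies $CD(1, \infty)$, and the Mehler expression of its density with respect to the standard Gaussian measure; the function names and the grid are illustrative choices.

```python
import math

def mehler_density(x: float, y: float, t: float) -> float:
    """Density p_t(x, y) of the Ornstein-Uhlenbeck kernel w.r.t. the standard Gaussian on R.

    Assumes the Mehler kernel: N(e^{-t} x, 1 - e^{-2t}) divided by the N(0, 1) density.
    """
    c = math.exp(-t)
    var = 1.0 - c * c
    exponent = -(c * c * x * x - 2.0 * c * x * y + c * c * y * y) / (2.0 * var)
    return math.exp(exponent) / math.sqrt(var)

def kernel_lower_bound(x: float, y: float, t: float, rho: float = 1.0) -> float:
    """Right-hand side of (5.6.3): exp(-rho d^2 / (2 (e^{rho t} - 1)))."""
    return math.exp(-rho * (x - y) ** 2 / (2.0 * math.expm1(rho * t)))

grid = [-3.0, -1.0, 0.0, 0.5, 2.0]
times = [0.05, 0.5, 1.0, 3.0]
bound_holds = all(mehler_density(x, y, t) >= kernel_lower_bound(x, y, t)
                  for x in grid for y in grid for t in times)
```

On the diagonal the bound reads $p_t(x, x) \ge 1$, which is consistent with the Mehler density being at least $(1 - e^{-2t})^{-1/2}$ there.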
We conclude this section with a useful observation concerning the local logarith-
mic Sobolev inequalities of Theorem 5.5.2 and their relationships with the previ-
ous Harnack inequalities. It appears that the reverse logarithmic Sobolev inequali-
ties (5.5.6) of Theorem 5.5.2 provide rather precise information on the kernels as
t → 0, which in turn get close to the Harnack inequality (5.6.1) of Theorem 5.6.1.
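Consider a function $f \in \mathcal{A}_0^{\mathrm{const}+}$ such that $\varepsilon \le f \le 1$ for some $\varepsilon > 0$. Then
$P_t(f \log f) \le 0$ and thus (5.5.6) ensures that
\[
- P_t f \log P_t f \ge \frac{e^{2\rho t} - 1}{2\rho}\, \frac{\Gamma(P_t f)}{P_t f} .
\]
It then follows by elementary means that
\[
\Gamma\bigg( \sqrt{\log \frac{1}{P_t f}} \bigg) \le \frac{\rho}{2 (e^{2\rho t} - 1)} . \tag{5.6.4}
\]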
This gradient bound is an analogue of (4.7.7), p. 211, in the context of local Poincaré
inequalities.
With the aim of applying (5.6.4) to Harnack inequalities, we assume for simplic-
ity that ρ = 0. Then, by definition of the intrinsic distance (3.3.9), p. 166, for every
t > 0 and x, y ∈ E,
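\[
\sqrt{\log \frac{1}{P_t f(x)}} \le \sqrt{\log \frac{1}{P_t f(y)}} + \frac{d(x, y)}{2\sqrt{t}} .
\]
After some work, it may then be shown that for each $\varepsilon > 0$, there exists a $C(\varepsilon) > 0$
such that
\[
(P_t f)^2 (x) \le C(\varepsilon)\, P_t\big( f^2 \big)(y)\, e^{d(x, y)^2/(2 + \varepsilon)t},
\]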
which is as close as possible to (5.6.1) (for α = 2). Employing more refined ar-
guments of an isoperimetric nature, the optimal bound will be achieved using this
principle in Sect. 8.6, p. 421.
In other words (changing $f$ into $f^2$ and extending to the domain $\mathcal{D}(\mathcal{E})$), μ satisfies
$LS(\frac{1}{\rho})$. This limiting procedure requires the ergodicity properties described
in Sect. 1.8, p. 32, which hold in particular under connexity (cf. e.g. Sect. 3.2.1,
p. 140).
Although this logarithmic Sobolev inequality is contained in Theorem 5.7.4 be-
low (corresponding to n = ∞), it is worth stating it independently.
Proposition 5.7.1 (Logarithmic Sobolev inequality under CD(ρ, ∞)) Under the
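curvature condition $CD(\rho, \infty)$, $\rho > 0$, the Markov Triple $(E, \mu, \Gamma)$ satisfies a logarithmic
Sobolev inequality LS(C) with constant $C = \frac{1}{\rho}$. That is, for every function
$f \in \mathcal{D}(\mathcal{E})$,
\[
\mathrm{Ent}_\mu(f^2) \le \frac{2}{\rho}\, \mathcal{E}(f).
\]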
and
\[
\Lambda''(t) = 2 \int_E P_t f\; \Gamma_2(\log P_t f)\, d\mu
\]
where we recall the Fisher information $I_\mu$ from (5.1.6). Now, if $\Lambda''(t) \ge -\frac{2}{C}\, \Lambda'(t)$
for some $C > 0$, then $e^{2t/C} \Lambda'(t)$ is increasing in $t \ge 0$, and therefore
for every positive function $f$ in $\mathcal{A}$ which is bounded together with $\Gamma(f)$ and $Lf$.
of Proposition 4.8.3, p. 214, used for Poincaré inequalities. But whereas the latter
expresses a dual equivalent formulation of the Poincaré inequality, (5.7.5) turns out
to be strictly stronger in general than the logarithmic Sobolev inequality LS(C).
On the basis of the previous semigroup tools, we next investigate logarithmic
Sobolev inequalities under curvature-dimension conditions. As for Poincaré inequalities
(cf. Theorem 4.8.4, p. 215), the constant $C = \frac{1}{\rho}$ in Proposition 5.7.1 may
be improved under a curvature-dimension hypothesis CD(ρ, n) with n < ∞. Again
by Proposition 5.1.3, the next statement strengthens the Poincaré inequality from
Theorem 4.8.4, p. 215. The proof is a prototypical example of the power of the
change of variables formula for the $\Gamma_2$ operator (and several integral identities involving
$\Gamma_2$ in the proof below will turn out to be useful later in this monograph).
Theorem 5.7.4 (Logarithmic Sobolev inequality under CD(ρ, n)) Under the
curvature-dimension condition $CD(\rho, n)$, $\rho > 0$, $n > 1$, the Markov Triple $(E, \mu, \Gamma)$
satisfies a logarithmic Sobolev inequality LS(C) with constant $C = \frac{n-1}{\rho n}$.
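Proof The objective is to reach the criterion of Proposition 5.7.3. The main argument
relies on the chain rule formula (3.3.2), p. 158, for the $\Gamma_2$ operator used here
with $\psi(r) = e^{ar}$ for a real parameter $a$ so as to get, for $g \in \mathcal{A}_0$,
\[
\Gamma_2\big( e^{ag} \big) = a^2 e^{2ag} \Big( \Gamma_2(g) + a\, \Gamma\big( g, \Gamma(g) \big) + a^2\, \Gamma(g)^2 \Big). \tag{5.7.6}
\]
Note that $\Gamma_2(e^{ag})$ actually takes place in the extended algebra $\mathcal{A}$. Now on the other
hand, by Proposition 3.3.17, p. 161, which applies to $e^{ag}$ with $g \in \mathcal{A}_0$, and the
change of variables for $L$,
\[
\int_E \Gamma_2\big( e^{ag} \big)\, d\mu = \int_E \big( L e^{ag} \big)^2 d\mu
= a^2 \int_E e^{2ag} \big( Lg + a\, \Gamma(g) \big)^2 d\mu
= a^2 \int_E e^{2ag} \Big( (Lg)^2 + 2a\, Lg\, \Gamma(g) + a^2\, \Gamma(g)^2 \Big) d\mu .
\]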
5.7 Logarithmic Sobolev Inequalities Under a Curvature-Dimension Condition 271
Observe that since g ∈ A0 , all the terms in the latter integral are bounded and there-
fore integrable with respect to the finite measure μ. Integrating by parts,
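\[
\int_E e^{2ag}\, Lg\, \Gamma(g)\, d\mu = - \int_E e^{2ag}\, \Gamma\big( g, \Gamma(g) \big)\, d\mu - 2a \int_E e^{2ag}\, \Gamma(g)^2\, d\mu . \tag{5.7.7}
\]
After some elementary algebra, it follows from the preceding identities that
\[
\int_E e^{2ag} (Lg)^2\, d\mu = \int_E e^{2ag} \Big( \Gamma_2(g) + 3a\, \Gamma\big( g, \Gamma(g) \big) + 4a^2\, \Gamma(g)^2 \Big) d\mu . \tag{5.7.8}
\]
This identity turns out to be most useful, providing a way to classify many kinds of
quantities which are related to each other after integration by parts.
Now, by (5.7.6) again, the curvature-dimension condition $CD(\rho, n)$ applied to
$e^{ag}$ yields that
\[
\Gamma_2(g) + a\, \Gamma\big( g, \Gamma(g) \big) + a^2\, \Gamma(g)^2 \ge \rho\, \Gamma(g) + \frac{1}{n} \big( Lg + a\, \Gamma(g) \big)^2 .
\]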
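where in the last step, (5.7.7) (with $a = \frac12$) was used again. In the latter, replace now
Choosing $a = \frac{3}{2(n+2)}$ (for which the term $\int_E e^g\, \Gamma(g, \Gamma(g))\, d\mu$ disappears) yields
that
\[
\int_E e^g\, \Gamma_2(g)\, d\mu \ge \frac{\rho n}{n - 1} \int_E e^g\, \Gamma(g)\, d\mu + \frac{4n - 1}{4 (n + 2)^2} \int_E e^g\, \Gamma(g)^2\, d\mu \tag{5.7.11}
\]
and in particular
\[
\int_E e^g\, \Gamma_2(g)\, d\mu \ge \frac{\rho n}{n - 1} \int_E e^g\, \Gamma(g)\, d\mu . \tag{5.7.12}
\]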
This is almost the required (5.7.5) although for f = eg where g ∈ A0 . Now, by
Proposition 3.3.6, p. 156, and Lemma 3.3.16, p. 160, under the curvature condition,
(5.7.12) extends to functions $f \in \mathcal{A}$ which are bounded, together with $\Gamma(f)$ and
Lf (such functions belong to D(L)). The proof of Theorem 5.7.4 is complete.
Theorem 5.7.4 applies to the sphere $S^n$ in $\mathbb{R}^{n+1}$ (Sect. 2.2, p. 81), and for the
symmetric Jacobi operator with dimension $n$ (Sect. 2.7.4, p. 113), the lower bound
$\frac{\rho n}{n-1}$ of the logarithmic Sobolev constant being optimal since these two models
both satisfy a curvature-dimension condition $CD(n - 1, n)$ and their Poincaré constant
(equal to $\frac{1}{n}$) is already optimal. The case $n = 1$ corresponding to the one-dimensional
torus, for which $\rho = 0$, may easily be integrated into the picture. Indeed,
by Proposition 4.5.5, p. 199, the Poincaré inequality $P(1)$ holds in this case,
and by the dual formulation of Proposition 4.8.3, p. 214, for every $f$ in $\mathcal{D}(L)$,
\[
\int_E \Gamma_2(f)\, d\mu \ge \int_E \Gamma(f)\, d\mu .
\]
Therefore, by (5.7.13),
\[
\int_E e^g\, \Gamma_2(g)\, d\mu \ge \int_E e^g\, \Gamma(g)\, d\mu ,
\]
5.8 Notes and References 273
that is (5.7.12) with the convention that $\frac{\rho n}{n-1} = 1$. The logarithmic Sobolev inequality
with constant 1 (equal to the Poincaré constant) follows similarly. Of course,
simple integrations by parts on the torus may be used to provide a direct proof. The
constant is optimal since it is already optimal for the Poincaré inequality.
By analogy with the Poincaré inequalities of Proposition 4.5.5, p. 199, we may
emphasize the preceding conclusion for n = 1 on the unit interval (after scaling).
Krohn [384] (in the form of Theorem 5.2.5). Hyperboundedness in the sense of the
hypothesis of Theorem 5.2.5 implies a spectral gap, as conjectured in [384]. This
was recently established in [311] (see [432] for a prior weaker result), and consid-
erably improves upon Proposition 5.1.3 (although not quantitatively). The existence
and uniqueness of ground states for Schrödinger operators [223] was a further im-
portant step in the theory. The interplay between logarithmic Sobolev inequalities
and hypercontractivity has actually been quite rich and fruitful. The reference [145]
gives an account of some of the main aspects of this connection, while the re-
views [225] and [226] survey the developments of logarithmic Sobolev inequalities
over the last decades and collect references. Further general references on logarith-
mic Sobolev inequalities are [372] and [14] as an introduction, and [229, 238] with
an emphasis on statistical physics models (for bounded and unbounded spins).
The tightening Proposition 5.1.3 has been observed in [158] (on the basis of an
independent proof of (5.1.8)). Lemma 5.1.4 is due to O. Rothaus [370] as is Re-
mark 5.1.5 [367–369]. The stability by perturbation property, Proposition 5.1.6, is
due to R. Holley and D. Stroock [245]. The general proof presented here on the
basis of Lemma 5.1.7 is taken from [14] (see also [372]). Proposition 5.1.8 is part of
the slicing technique further developed in the context of Sobolev-type inequalities
in Chap. 6.
Theorem 5.2.1 goes back to the origin of entropy and to the Boltzmann
H-Theorem (see [86, 424]). The Pinsker-Csiszár-Kullback inequality (5.2.2) can
be found in numerous references including e.g. [14, 141, 426]. de Bruijn’s iden-
tity of Proposition 5.2.2 is recorded in [387] (see also [157]) for the Euclidean
heat semigroup and extensively developed in a more general framework in [36].
Exponential decays in uniform norms for models from statistical mechanics may
be developed with the same principle (see e.g. [229]). The main equivalence Theo-
rem 5.2.3 with hypercontractivity is due to L. Gross [224]. Reverse hypercontractiv-
ity (Remark 5.2.4) was first observed by C. Borell and S. Janson [88]. Theorem 5.2.5
goes back to [384]. Proposition 5.2.6 is due to P. Cattiaux [115], who gave a dif-
ferent proof. A main feature of logarithmic Sobolev inequalities, stability under
products, had already been emphasized and used in a critical way in [224]. The
dimension-free property is in particular a powerful argument in statistical mechan-
ics (cf. [229, 372]).
Integrability of Wiener chaos via hypercontractivity was emphasized by C. Borell
[88]. The discussion at the end of Sect. 5.3 on spectral features of hypercontractive
bounds is taken from [81].
The so-called Herbst argument from Proposition 5.4.1 in Sect. 5.4 goes back
to an unpublished letter (1975) of I. Herbst to L. Gross. The argument is presented
in [146], and emphasized later in the paper [6] which revived interest in the question
and gave rise to several subsequent contributions, in particular related to measure
concentration, synthesized in the notes [276] (to which we refer for more details).
See also [90] for applications to concentration inequalities. Proposition 5.4.2 is
taken from [7]. The Muckenhoupt criterion of Theorem 5.4.5 is due to S. Bobkov
and F. Götze [77] and is based on the corresponding result for Poincaré inequalities
(see the preceding chapter).
There are at least fifteen different proofs of the Gaussian logarithmic Sobolev
inequality (Proposition 5.5.1), some of them mentioned in this book, including the
proof of L. Gross [224] relying on the two-point space inequality, tensorization and
the central limit Theorem. The proof given here using heat flow monotonicity along
the Ornstein-Uhlenbeck semigroup goes back to the seminal paper [36] and is par-
tially inspired by the stochastic calculus proof of Nelson’s hypercontractivity by
J. Neveu [329]. A careful and detailed exposition of the argument in the language of
partial differential equations is given in [412] in the Gaussian case, and later in [16]
for strictly convex potentials in Rn . The paper [412] actually emphasized the impor-
tance of logarithmic Sobolev inequalities towards convergence to equilibrium in this
context. The logarithmic Sobolev inequality for Gaussian measures (as well as the
corresponding Poincaré inequality) has been extended to the infinite-dimensional
setting of the Wiener measure (cf. Sect. 2.7.2, p. 108), and further to Brownian
paths on a Riemannian manifold. With this task in mind, finite-dimensional approx-
imations may be used. Alternatively, the semigroup scheme may also be developed
directly in the infinite-dimensional context, in particular by means of the Clark-
Ocone probabilistic interpolation formula for functionals of the Brownian paths
(cf. [5, 105, 182, 250, 417] and [251] for a synthesis). The local heat kernel inequal-
ities of Sect. 5.5 have been analyzed in [27] (see also [28, 277, 394]) on the basis of
the ideas in [36]. The latter paper, which introduces the basic interpolation scheme
along the semigroup and the importance of the $\Gamma_2$ operator, also describes the
logarithmic Sobolev inequalities under the curvature-dimension condition of Sect. 5.7.
See [26] for a first synthesis. Remark 5.5.4 is developed in [274] for entropy. Local
hypercontractivity (Theorem 5.5.5) is taken from [31].
The Harnack inequalities for an infinite-dimensional diffusion operator from
Theorem 5.6.1 are due to F.-Y. Wang [428]. The log-Harnack inequality (5.6.2) can
be found in [76] as well as in [434] as a limit case of the previous Harnack inequal-
ities. The lower bound (5.6.3) is pointed out in [435]. Discussions and extensions
are developed in a series of papers by this author [431, 433, 435]. The useful ob-
servation (5.6.4) is due to M. Hino [242] and will be employed in a critical way in
Sect. 8.7, p. 425. Heat kernel bounds of the type (5.5.9) are surveyed in [217].
Logarithmic Sobolev inequalities under curvature-dimension condition
(Sect. 5.7) go back to [26, 36] (see also [371]). That the integral criterion from
Proposition 5.7.3 is not equivalent to the corresponding logarithmic Sobolev in-
equality has been pointed out by B. Helffer ([14, 238]). Proposition 5.7.5 on
logarithmic Sobolev inequalities on an interval may be found in [367, 368, 439]
and [173]. Hypercontractivity on the n-sphere via ultraspherical polynomials is
studied in [322].
Chapter 6
Sobolev Inequalities
Following our study of Poincaré and logarithmic Sobolev inequalities, this chap-
ter is devoted to the investigation of Sobolev inequalities. Sobolev inequalities play
a central role in analysis, providing in particular compact embeddings and tight
connections with heat kernel bounds. They are also deeply linked with the geometric
structure of the underlying state space through conformal invariance. Here
our study only covers a small fraction of the vast subject of Sobolev inequalities,
with a specific focus on the main theme of Markov diffusion operators and semi-
groups. As in the preceding chapters, we work here in the context of a Full Markov
Triple (“Markov Triple”) $(E, \mu, \Gamma)$ with Dirichlet form $\mathcal{E}$, infinitesimal generator $L$,
Markov semigroup $P = (P_t)_{t \ge 0}$ and underlying function algebras $\mathcal{A}_0$ and $\mathcal{A}$ as
summarized in Sect. 3.4, p. 168. Once again, several properties and results described in
this chapter remain valid in more general settings (in particular for Standard Markov
Triples), and the reader may adapt the statements and proofs if necessary.
The chapter starts with a brief exposition of the classical Sobolev inequalities on
the model spaces, namely the Euclidean, spherical and hyperbolic spaces. These
examples will both give hints and provide a better understanding of further de-
velopments. The next section investigates the various definitions of Sobolev-type
inequalities in the Markov Triple context, emphasizing in particular (logarithmic)
entropy-energy and Nash-type inequalities. These inequalities will be further inves-
tigated in Chap. 7. Section 6.3 presents the basic equivalence between Sobolev in-
equalities and (uniform) heat kernel bounds (ultracontractivity), and Sect. 6.4 high-
lights applications to compact embeddings. Sections 6.5 and 6.6 address issues on
tensorization properties of Sobolev-type inequalities and diameter, Lipschitz func-
tions and volume estimates. In particular, with respect to Poincaré and logarithmic
Sobolev inequalities, tensorization of Sobolev inequalities has to take into account
a dimensional parameter. Local inequalities under the semigroup $(P_t)_{t \ge 0}$ are
investigated next, providing in particular a heat flow approach to the celebrated Li-Yau
parabolic inequality. Section 6.8 establishes the sharp Sobolev inequality under the
curvature-dimension condition CD(ρ, n) (ρ > 0, n < ∞), covering the example
of the standard sphere. The subsequent section describes the conformal invariance
properties of Sobolev inequalities, and, as a consequence, the sharp Sobolev in-
equalities in the Euclidean and hyperbolic spaces on the basis of the inequality on
the sphere. Gagliardo-Nirenberg inequalities form a further family of equivalent
Sobolev-type inequalities, well-suited to non-linear porous medium and fast diffu-
sion equations. These topics, and their geometric counterparts, are presented and
detailed in Sects. 6.10 and 6.11, providing in particular a fast diffusion approach to
several Sobolev inequalities of interest.
The fundamental example of a Sobolev inequality takes place in the Euclidean space
$\mathbb{R}^n$ equipped with the Lebesgue measure $dx$ and the standard carré du champ
operator $\Gamma(f) = |\nabla f|^2$. The basic Sobolev inequality expresses here that for every
smooth compactly supported function $f$ on $\mathbb{R}^n$, $n > 2$ (and thus every $f$ in the
Dirichlet domain),
\[
\|f\|_p^2 \le C_n \big\| |\nabla f| \big\|_2^2 = C_n \int_{\mathbb{R}^n} |\nabla f|^2 \, dx \tag{6.1.1}
\]
where $p = \frac{2n}{n-2}$ and $C_n > 0$ is an explicit constant whose optimal value is known
(and will be explicitly calculated below in Theorem 6.9.4). The norms are of course
understood here with respect to the Lebesgue measure. For simplicity $\big\| |\nabla f| \big\|_2$ is
denoted $\|\nabla f\|_2$ below.
The exponent $p$ takes the value $\frac{2n}{n-2}$ ($n > 2$) for a very good reason. It is only for
this value that the Sobolev inequality (6.1.1) is invariant under dilation, that is by
the change of $f(x)$ into $f_s(x) = f(sx)$, $s > 0$. Indeed, $\|f_s\|_p = s^{-n/p} \|f\|_p$ while
$\|\nabla f_s\|_2^2 = s^{2-n} \|\nabla f\|_2^2$, and it is only for the exponent $p = \frac{2n}{n-2}$ that the inequality
can hold, any other value leading to a contradiction as $s \to 0$ or $\infty$. Dilations
actually play a central role in the analysis of Sobolev inequalities in Euclidean space as
will be illustrated throughout this chapter.
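The dilation argument can be written out in one line. The following is a minimal sketch of that computation, using only the two scaling identities just stated; nothing beyond (6.1.1) applied to $f_s$ is assumed.

```latex
\begin{align*}
\|f_s\|_p^2 &= s^{-2n/p}\,\|f\|_p^2,
\qquad
\|\nabla f_s\|_2^2 = s^{2-n}\,\|\nabla f\|_2^2, \\
\text{(6.1.1) applied to } f_s:\quad
\|f\|_p^2 &\le C_n\, s^{\,2-n+\frac{2n}{p}}\,\|\nabla f\|_2^2
\quad\text{for every } s>0, \\
2-n+\frac{2n}{p} = 0
&\iff p = \frac{2n}{n-2}.
\end{align*}
```

If the exponent of $s$ were not zero, letting $s \to 0$ or $s \to \infty$ would force $\|f\|_p = 0$ for every admissible $f$, which is the contradiction alluded to above.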
On the unit sphere $\mathbb{S}^n$ in $\mathbb{R}^{n+1}$, with the normalized uniform measure $\mu$ and the
spherical carré du champ operator $\Gamma$ as defined in Sect. 2.2, p. 81, there is also
a Sobolev inequality which takes a somewhat different form, but with the same
exponent $p = \frac{2n}{n-2}$. Namely, for every smooth function $f$ on $\mathbb{S}^n$ (in the Dirichlet
domain)
\[
\|f\|_p^2 \le \|f\|_2^2 + \frac{4}{n(n-2)} \int_{\mathbb{S}^n} \Gamma(f) \, d\mu. \tag{6.1.2}
\]
The norms denote here the norms in $L^r(\mu)$, $r \ge 1$. In this form, the Sobolev inequality
is closer to the Poincaré and logarithmic Sobolev inequalities discussed in the
preceding chapters, and may be established for large families of Markov Triples. In
the absence of dilations, the value of $p = \frac{2n}{n-2}$ is rather mysterious. By monotonicity
(Jensen’s inequality), different values of $2 \le p \le \frac{2n}{n-2}$ may actually be considered,
however not exceeding the critical value $\frac{2n}{n-2}$. Note also that (6.1.2) is tight
in the sense of Remark 4.2.2, p. 182, implying that the only functions $f$ such that
$\Gamma(f) = 0$ are constant.
Finally, there is also a Sobolev inequality on the hyperbolic space $\mathbb{H}^n$
(cf. Sect. 2.3, p. 88) which takes the form
\[
\|f\|_p^2 \le -A \, \|f\|_2^2 + C_n \int_{\mathbb{H}^n} \Gamma(f) \, d\mu \tag{6.1.3}
\]
The next definition introduces the notion of a Sobolev inequality in a general context.
On the basis of the model examples, the exponent $p$ will often take the form
$p = \frac{2n}{n-2}$ for some $n > 2$ (not always an integer), and we then speak of a Sobolev
inequality $S_n(A, C)$ of dimension $n$ ($> 2$) and constants $A \in \mathbb{R}$, $C > 0$. Under a
Sobolev inequality $S(p \, ; A, C)$, if $f \in \mathcal{D}(\mathcal{E})$ then $f \in L^p(\mu)$. As usual, it is enough
to state (and prove) such inequalities for a family of functions $f$ which is dense in
the Dirichlet domain $\mathcal{D}(\mathcal{E})$ (typically the algebra $\mathcal{A}_0$).
When the measure μ is finite and normalized into a probability measure, apply-
ing the Sobolev inequality from Definition 6.2.1 to f = 1 shows that A ≥ 1. (In
particular, a Sobolev inequality S(p ; A, C) with A = 0 (or A ≤ 0) can only hold if
the measure μ has infinite mass.) The inequality is called tight if A = 1, and is then
denoted by S(p ; C) (respectively Sn (C)) for simplicity. The best constant C > 0
for which such an inequality S(p ; C) or Sn (C) holds is sometimes referred to as
the Sobolev constant (of the Markov Triple).
As for logarithmic Sobolev inequalities, tightness only holds if there is a Poincaré
inequality (Definition 4.2.1, p. 181).
It should be noted that the application of the first assertion to the Sobolev
inequality (6.1.2) on the sphere $\mathbb{S}^n$ yields the Poincaré inequality $P(\frac{1}{n})$ with its sharp
constant (cf. Theorem 4.8.4, p. 215). In particular, both constants ($1$ and $\frac{4}{n(n-2)}$) in
the Sobolev inequality (6.1.2) are optimal.
which amounts to the Poincaré inequality $P(\frac{C}{p-2})$.
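The expansion behind this implication can be sketched as follows. This is the standard second-order computation rather than a quotation of the proof; it assumes that $\mu$ is a probability measure, that $g$ is bounded with $\int_E g \, d\mu = 0$, and that the tight inequality $\|f\|_p^2 \le \|f\|_2^2 + C \, \mathcal{E}(f)$ is applied to $f = 1 + \varepsilon g$.

```latex
\begin{align*}
\int_E (1+\varepsilon g)^p \, d\mu
  &= 1 + \frac{p(p-1)}{2}\,\varepsilon^2 \int_E g^2 \, d\mu + o(\varepsilon^2), \\
\|1+\varepsilon g\|_p^2
  &= 1 + (p-1)\,\varepsilon^2 \int_E g^2 \, d\mu + o(\varepsilon^2), \\
\|1+\varepsilon g\|_2^2
  &= 1 + \varepsilon^2 \int_E g^2 \, d\mu,
\qquad
\mathcal{E}(1+\varepsilon g) = \varepsilon^2\, \mathcal{E}(g), \\
\text{hence}\quad
(p-2) \int_E g^2 \, d\mu &\le C\, \mathcal{E}(g)
\quad\text{as } \varepsilon \to 0 .
\end{align*}
```

On the sphere, $p - 2 = \frac{4}{n-2}$ and $C = \frac{4}{n(n-2)}$, so that $\frac{C}{p-2} = \frac{1}{n}$, in agreement with the Poincaré constant quoted above.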
The second assertion relies on the analogue of (5.1.8), p. 240, stating that, for
$p > 2$ and every $f \in L^p(\mu)$,
\[
\|f\|_p^2 \le \Big( \int_E f \, d\mu \Big)^2 + (p-1) \, \|\hat f\|_p^2 \tag{6.2.1}
\]
which may be shown by elementary calculus, proving that $\psi''(r) \le 2(p-1) \|g\|_p^2$.
Using (6.2.1), apply $S(p \, ; A, C)$ to $\hat f$ (in $\mathcal{D}(\mathcal{E})$) and then the Poincaré inequality
$P(C')$ to get
\begin{align*}
\|f\|_p^2 &\le \Big( \int_E f \, d\mu \Big)^2 + (p-1) \big[ A \, \|\hat f\|_2^2 + C \, \mathcal{E}(f) \big] \\
&\le \Big( \int_E f \, d\mu \Big)^2 + (p-1) \big( A C' + C \big) \, \mathcal{E}(f).
\end{align*}
By Jensen’s inequality, this inequality is even better than the expected Sobolev
inequality $S(p \, ; (p-1)(AC' + C))$. The proof of the proposition is complete.
Proposition 6.2.3 For a Markov Triple $(E, \mu, \Gamma)$, the following implications hold.
(i) Under $S_n(A, C)$, $A \ge 0$, $C > 0$, for every function $f$ in $\mathcal{D}(\mathcal{E})$ with $\int_E f^2 \, d\mu = 1$,
\[
\mathrm{Ent}_\mu(f^2) \le \frac{n}{2} \log \big( A + C \, \mathcal{E}(f) \big). \tag{6.2.2}
\]
(ii) Under (6.2.2), for every function $f$ in $\mathcal{D}(\mathcal{E})$,
\[
\|f\|_2^{n+2} \le \big[ A \, \|f\|_2^2 + C \, \mathcal{E}(f) \big]^{n/2} \, \|f\|_1^2. \tag{6.2.3}
\]
(iii) Conversely, under (6.2.3), a Sobolev inequality $S_n(A_1, C_1)$ holds with
constants $A_1 \ge 0$ and $C_1 > 0$ only depending on $n$, $A$ and $C$. Moreover $A_1 = 0$
whenever $A = 0$.
\[
\|f\|_2^{n+2} \le C_n \, \|\nabla f\|_2^n \, \|f\|_1^2 \tag{6.2.4}
\]
where the sharp constant $C_n$ is known (see the Notes and References). While Nash
inequalities with $A < 0$ may be considered similarly, we stick for simplicity to the
case $A \ge 0$. By concavity, the logarithmic entropy-energy inequality (6.2.2) is
equivalent to the family of (defective) logarithmic Sobolev inequalities
\[
\mathrm{Ent}_\mu(f^2) \le \Phi'(r) \, \mathcal{E}(f) + \Psi(r), \quad r \in (0, \infty), \tag{6.2.5}
\]
for every $f$ in $\mathcal{D}(\mathcal{E})$ with $\int_E f^2 \, d\mu = 1$, where $\Phi(r) = \frac{n}{2} \log(A + Cr)$ and
$\Psi(r) = \Phi(r) - r \, \Phi'(r)$, $r \in (0, \infty)$. The classical logarithmic entropy-energy
inequality in $\mathbb{R}^n$ is discussed below in Proposition 6.2.5 as an equivalent form of the
logarithmic Sobolev inequality for Gaussian measures. Generalized entropy-energy
and Nash-type inequalities will be thoroughly studied in Chap. 7.
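The concavity step may be spelled out as follows. This is a sketch in which the letters $\Phi$ and $\Psi$ are notation assumed here for the concave function $r \mapsto \frac{n}{2}\log(A + Cr)$ appearing on the right-hand side of (6.2.2) and for its tangent intercept.

```latex
\begin{align*}
\Phi(x) &\le \Phi(r) + \Phi'(r)\,(x - r)
        = \Phi'(r)\, x + \Psi(r),
\qquad x, r \in (0,\infty), \\
\Phi'(r) &= \frac{nC}{2(A + Cr)},
\qquad
\Psi(r) = \frac{n}{2}\log(A + Cr) - \frac{nCr}{2(A + Cr)}, \\
\Phi(x) &= \inf_{r>0}\,\big[\Phi'(r)\, x + \Psi(r)\big]
\quad\text{(equality at } r = x\text{)} .
\end{align*}
```

Taking $x = \mathcal{E}(f)$ shows that the single inequality (6.2.2) and the whole family indexed by $r$ carry the same information.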
When $\mu$ is a probability, the logarithmic entropy-energy and Nash inequalities
(6.2.2) and (6.2.3) are called tight if $A = 1$. A tight logarithmic entropy-energy
inequality implies a logarithmic Sobolev inequality $LS(\frac{Cn}{4})$. Applied to $f = 1 + \varepsilon g$
with $\varepsilon \to 0$, the tight Nash inequality implies a Poincaré inequality $P(\frac{Cn}{2})$.
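Both constants can be checked by a short computation. The sketch below assumes the normalization in which $LS(C)$ means $\mathrm{Ent}_\mu(f^2) \le 2C\,\mathcal{E}(f)$ (consistent with the Gaussian inequality (6.2.9) being $LS(1)$), that $\int_E f^2\,d\mu = 1$ in the first line, and that $g$ is bounded with $\int_E g\,d\mu = 0$, so that $\|1+\varepsilon g\|_1 = 1$ for small $\varepsilon$.

```latex
\begin{align*}
\mathrm{Ent}_\mu(f^2)
  &\le \frac{n}{2}\log\big(1 + C\,\mathcal{E}(f)\big)
   \le \frac{nC}{2}\,\mathcal{E}(f)
   = 2\cdot\frac{Cn}{4}\,\mathcal{E}(f), \\[4pt]
\|1+\varepsilon g\|_2^{\,n+2}
  &= 1 + \frac{n+2}{2}\,\varepsilon^2\!\int_E g^2\,d\mu + o(\varepsilon^2), \\
\big[\|1+\varepsilon g\|_2^2 + C\,\mathcal{E}(1+\varepsilon g)\big]^{n/2}
  &= 1 + \frac{n}{2}\,\varepsilon^2\Big(\int_E g^2\,d\mu + C\,\mathcal{E}(g)\Big) + o(\varepsilon^2), \\
\text{hence}\quad
\int_E g^2\,d\mu &\le \frac{Cn}{2}\,\mathcal{E}(g).
\end{align*}
```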
Proposition 6.2.3 thus indicates that, up to constants (but we will however see
that optimal constants play a crucial role in a variety of problems of interest), the
preceding three inequalities are equivalent. Many further formulations may be con-
sidered here, as is clear from the proof, such as for example the Gagliardo-Nirenberg
inequalities presented in Sect. 6.10, the preceding being however amongst the most
used ones. Note that both the logarithmic entropy-energy and Nash inequalities
make sense for n ≥ 1, or even n > 0, while the Sobolev inequality requires n > 2.
We are left with the last implication, which is more involved and which re-
lies on the slicing technique already developed in Proposition 5.1.8, p. 241. It is
enough to establish the Sobolev inequality for a positive bounded function f in
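$\mathcal{D}(\mathcal{E})$. The idea is to apply the Nash inequality (6.2.3) to the sequence of functions
$f_k = (f - 2^k)_+ \wedge 2^k$, $k \in \mathbb{Z}$. Let $N_k = \{f > 2^k\}$, $k \in \mathbb{Z}$. Since
$2^k \, \mathbf{1}_{N_{k+1}} \le f_k \le 2^k \, \mathbf{1}_{N_k}$,
\[
2^{2k} \mu(N_{k+1}) \le \int_E f_k^2 \, d\mu \le 2^{2k} \mu(N_k)
\quad \text{and} \quad
\int_E f_k \, d\mu \le 2^k \mu(N_k)
\]
\[
\sum_{k \in \mathbb{Z}} \alpha_{k+1} \le 2^p \Big( \sum_{k \in \mathbb{Z}} \beta_k \Big)^{n/(n+2)} \Big( \sum_{k \in \mathbb{Z}} \alpha_k^2 \Big)^{2/(n+2)}.
\]
By the upper bound $\sum_{k \in \mathbb{Z}} \alpha_k^2 \le ( \sum_{k \in \mathbb{Z}} \alpha_k )^2$, it follows that
\[
\Big( \sum_{k \in \mathbb{Z}} \alpha_k \Big)^{(n-2)/n} \le 2^{2(n+2)/(n-2)} \sum_{k \in \mathbb{Z}} \beta_k.
\]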
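The algebra of this last step can be checked directly. In the sketch below, $S = \sum_{k} \alpha_k$ and $B = \sum_k \beta_k$ are shorthand introduced here, the sequences are assumed positive with $0 < S < \infty$, and the starting point is the inequality $\sum_k \alpha_{k+1} \le 2^p B^{n/(n+2)} (\sum_k \alpha_k^2)^{2/(n+2)}$ with $p = \frac{2n}{n-2}$.

```latex
\begin{align*}
S = \sum_{k\in\mathbb{Z}} \alpha_{k+1}
  &\le 2^{p}\, B^{\frac{n}{n+2}}\, \big(S^2\big)^{\frac{2}{n+2}}
   = 2^{p}\, B^{\frac{n}{n+2}}\, S^{\frac{4}{n+2}}, \\
S^{\frac{n-2}{n+2}} &\le 2^{p}\, B^{\frac{n}{n+2}}, \\
S^{\frac{n-2}{n}} &\le 2^{\frac{p(n+2)}{n}}\, B
   = 2^{\frac{2(n+2)}{n-2}}\, B .
\end{align*}
```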
Remark 6.2.4 As for Poincaré inequalities (Lemma 4.2.3, p. 182), the Cauchy-
Schwarz inequality may be used in the Nash inequality (6.2.3) to reach spectral
inequalities on functions in D(E) with support in some (measurable) subset B of E.
For example, whenever $\mu(B) \le (2A)^{-n/2}$, for any function $f \in \mathcal{D}(\mathcal{E})$ with support
in $B$,
\[
\int_B f^2 \, d\mu \le 2C \, \mu(B)^{2/n} \, \mathcal{E}(f).
\]
In the classical Euclidean case, corresponding to A = 0 (and μ(E) infinite), this
inequality is known as a Faber-Krahn inequality. It describes a lower bound on the
spectrum of the restriction of the operator L to B with Dirichlet boundary condi-
tions. (See also Remark 8.2.2, p. 399.) Using similar slicing techniques, this in-
equality, valid for any B ⊂ E, implies in turn the corresponding Sobolev inequality
with the same dimension (but different constants).
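The inequality of Remark 6.2.4 follows from the Nash inequality in a few lines. This is a sketch assuming $A \ge 0$, the Nash inequality in the form $\|f\|_2^{n+2} \le [A\|f\|_2^2 + C\,\mathcal{E}(f)]^{n/2}\,\|f\|_1^2$, and $f$ supported in $B$ with $\|f\|_2 > 0$.

```latex
\begin{align*}
\|f\|_1^2 &\le \mu(B)\,\|f\|_2^2
  && \text{(Cauchy-Schwarz on } B\text{)}, \\
\|f\|_2^{\,n+2} &\le \big[A\,\|f\|_2^2 + C\,\mathcal{E}(f)\big]^{n/2}\,\mu(B)\,\|f\|_2^2, \\
\|f\|_2^{2} &\le \mu(B)^{2/n}\,\big[A\,\|f\|_2^2 + C\,\mathcal{E}(f)\big], \\
\mu(B)^{2/n} A \le \tfrac12
  \;&\Longrightarrow\;
  \|f\|_2^2 \le 2C\,\mu(B)^{2/n}\,\mathcal{E}(f).
\end{align*}
```

The condition $\mu(B)^{2/n} A \le \frac12$ is exactly $\mu(B) \le (2A)^{-n/2}$, and it is void when $A = 0$.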
Proof Start from the logarithmic Sobolev inequality for the standard Gaussian measure
$d\mu(x) = (2\pi)^{-n/2} e^{-|x|^2/2} dx$ in $\mathbb{R}^n$ (Proposition 5.5.1, p. 258),
\[
\mathrm{Ent}_\mu(g^2) \le 2 \int_{\mathbb{R}^n} |\nabla g|^2 \, d\mu. \tag{6.2.9}
\]
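Integrate by parts the middle term of the integral on the right-hand side of the
preceding equality to get
\[
\int_{\mathbb{R}^n} x \cdot f \, \nabla f \, dx
= \frac{1}{4} \int_{\mathbb{R}^n} \nabla \big( |x|^2 \big) \cdot \nabla \big( f^2 \big) \, dx
= - \frac{1}{4} \int_{\mathbb{R}^n} f^2 \, \Delta \big( |x|^2 \big) \, dx.
\]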
Now change $f(x)$ into $f_s(x) = s^{n/2} f(sx)$, $x \in \mathbb{R}^n$, $s > 0$. Since
$\int_{\mathbb{R}^n} f_s^2 \, dx = \int_{\mathbb{R}^n} f^2 \, dx = 1$, $\mathcal{E}(f_s) = s^2 \, \mathcal{E}(f)$ and
\[
\mathrm{Ent}_{dx}(f_s^2) = \mathrm{Ent}_{dx}(f^2) + n \log s,
\]
applying the preceding inequality to $f_s$ for every $s > 0$, and optimizing ($s^2 = \frac{n}{4 \mathcal{E}(f)}$)
yields the conclusion. The constant is optimal since this proof clearly indicates
that (6.2.8) for the Lebesgue measure is actually equivalent to the logarithmic
Sobolev inequality (6.2.9) for the Gaussian measure $\mu$, which is sharp. Since the
exponential functions $e^{a \cdot x}$, $a \in \mathbb{R}^n$, $x \in \mathbb{R}^n$, are the extremal functions of the
Gaussian logarithmic Sobolev inequality, extremal functions of the Euclidean
logarithmic Sobolev inequality are given by Gaussian kernels $e^{a \cdot x - |x|^2/2\sigma^2}$, $a \in \mathbb{R}^n$, $\sigma > 0$,
$x \in \mathbb{R}^n$. Proposition 6.2.5 is established.
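The optimization in $s$ can be made explicit. The sketch below assumes that "the preceding inequality" is the one obtained from (6.2.9) with $g = (2\pi)^{n/4} f\, e^{|x|^2/4}$, namely $\mathrm{Ent}_{dx}(f^2) \le 2\,\mathcal{E}(f) - \frac{n}{2}\log(2\pi e^2)$ for $\int_{\mathbb{R}^n} f^2\,dx = 1$; this explicit form is an assumption of the sketch, since the intermediate display is not reproduced above.

```latex
\begin{align*}
\mathrm{Ent}_{dx}(f^2) + n\log s
  &\le 2 s^2\,\mathcal{E}(f) - \frac{n}{2}\log\big(2\pi e^2\big)
  \qquad (s>0), \\
\frac{d}{ds}\Big[2 s^2\,\mathcal{E}(f) - n\log s\Big]
  &= 4 s\,\mathcal{E}(f) - \frac{n}{s} = 0
  \iff s^2 = \frac{n}{4\,\mathcal{E}(f)}, \\
\mathrm{Ent}_{dx}(f^2)
  &\le \frac{n}{2} + \frac{n}{2}\log\Big(\frac{4\,\mathcal{E}(f)}{n}\Big)
       - \frac{n}{2}\log\big(2\pi e^2\big)
   = \frac{n}{2}\log\Big(\frac{2}{n\pi e}\,\mathcal{E}(f)\Big).
\end{align*}
```

The last expression is a logarithmic entropy-energy inequality of the type (6.2.2) with $A = 0$ and $C = \frac{2}{n\pi e}$.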
Note that the requirement n > 2 is only necessary for the Sobolev inequality
Sn (A, C) in (i) and not for the ultracontractive bounds (ii) and (iii). Actually, the
proof below will transit through Nash inequalities (6.2.3) justifying the extension to
any n > 0.
Proof Start with the equivalence between (ii) and (iii). Note first that if
$\|P_t\|_{1,2} \le K(t)$, then, by duality, $\|P_t\|_{2,\infty} \le K(t)$. Since $P_t = P_{t/2} \circ P_{t/2}$, it is
therefore bounded from $L^1(\mu)$ into $L^\infty(\mu)$ with norm at most $K^2(\frac{t}{2})$. This shows that
(ii) ⇒ (iii). The converse implication is obtained by the classical Riesz-Thorin
interpolation Theorem which asserts that, for an operator $P$, whenever $\|P\|_{p_1,q_1} \le K_1$
and $\|P\|_{p_2,q_2} \le K_2$, then for every $\theta \in [0, 1]$,
\[
\|P\|_{p_\theta, q_\theta} \le K_\theta = K_1^{\theta} K_2^{1-\theta}
\]
where
\[
\frac{1}{p_\theta} = \frac{\theta}{p_1} + \frac{1-\theta}{p_2}, \qquad
\frac{1}{q_\theta} = \frac{\theta}{q_1} + \frac{1-\theta}{q_2}.
\]
Applying this result with $p_1 = p_2 = 1$, $q_1 = \infty$, $q_2 = 1$, $K_1 = C \, t^{-n/2}$, $K_2 = 1$
(since $P_t$ is a contraction) and $\theta = \frac{1}{2}$ then yields the claim since $p_\theta = 1$, $q_\theta = 2$ and
$K_\theta = C^{1/2} \, t^{-n/4}$. The same argument shows that $\|P_t\|_{p,q} \le (C t^{-n/2})^{\frac{1}{p} - \frac{1}{q}}$, $t > 0$,
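where we set $\theta = \frac{2}{n+2}$. It follows that the function $e^{\lambda t} (1 - A \, \Lambda^{-r}(t))$, $t \ge 0$, is
decreasing, where
\[
\lambda = \frac{2Ar}{C} \quad \text{and} \quad r = \frac{\theta}{1-\theta} = \frac{2}{n}.
\]
Therefore, for every $t > 0$,
\[
\Lambda(t) \le \bigg( \frac{A}{1 - e^{-\lambda t} + \frac{A \, e^{-\lambda t}}{\Lambda(0)^r}} \bigg)^{1/r}
\le \bigg( \frac{A}{1 - e^{-\lambda t}} \bigg)^{n/2}. \tag{6.3.2}
\]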
If $A > 0$, the latter is bounded from above by $C' \, t^{-n/2}$ for some constant $C' > 0$ for
every $0 < t \le 1$. If $A = 0$, $\frac{A}{1 - e^{-\lambda t}}$ is replaced by its limit $\frac{C}{2rt} = \frac{Cn}{4t}$ (for all $t > 0$).
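The passage from the monotonicity statement to the bound (6.3.2) is elementary and may be sketched as follows. Here $\Lambda$ denotes the positive function of $t$ under study (the letter is assumed, as the symbol is not legible above), and only the fact that $t \mapsto e^{\lambda t}(1 - A\,\Lambda(t)^{-r})$ is decreasing, with $\lambda = \frac{2Ar}{C}$ and $r = \frac{2}{n}$, is used.

```latex
\begin{align*}
e^{\lambda t}\big(1 - A\,\Lambda(t)^{-r}\big)
  &\le 1 - A\,\Lambda(0)^{-r}, \\
A\,\Lambda(t)^{-r}
  &\ge 1 - e^{-\lambda t} + A\,e^{-\lambda t}\,\Lambda(0)^{-r}
   \;\ge\; 1 - e^{-\lambda t}, \\
\Lambda(t)
  &\le \Big(\frac{A}{1 - e^{-\lambda t}}\Big)^{1/r}
   = \Big(\frac{A}{1 - e^{-\lambda t}}\Big)^{n/2}, \\[4pt]
1 - e^{-\lambda t} &\ge (1 - e^{-\lambda})\, t \quad (0 \le t \le 1),
\qquad
\lim_{A\to 0}\frac{A}{1 - e^{-\lambda t}} = \frac{A}{\lambda t}\Big|_{\lambda = 2Ar/C} = \frac{C}{2rt}.
\end{align*}
```

The concavity bound on $1 - e^{-\lambda t}$ gives the $t^{-n/2}$ behaviour for $0 < t \le 1$ when $A > 0$, and the limit gives the value used when $A = 0$.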
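Next turn to the converse implication, that is from the decay (ii) to the Nash
inequality (6.2.3) equivalent to the Sobolev inequality (i). Using the same notation,
we only give the argument for $A > 0$ (the case $A = 0$ being similar). It
may be assumed, again for $f \ge 0$ such that $\int_E f \, d\mu = 1$, that for every $t > 0$,
$\Lambda(t) \le C (1 + t^{-n/2})$. By Lemma 4.2.6, p. 184, the function $\log \Lambda$ is convex.
Therefore, for every $\theta \in [0, 1]$, and every $t > 0$,
\[
\Lambda(t) \le \Lambda(0)^{1-\theta} \bigg[ C \bigg( 1 + \Big( \frac{t}{\theta} \Big)^{-n/2} \bigg) \bigg]^{\theta}.
\]
Choose then $\theta = \alpha t$ for some fixed $\alpha$ to be specified below, which can be achieved
as soon as $t$ is small enough. At $t = 0$, the two sides of the preceding inequality are
equal. A Taylor expansion at $t = 0$ then yields
\[
-2 \, \mathcal{E}(f) \le \Lambda(0) \Big[ -\alpha \log \Lambda(0) + \alpha \log \big( C ( 1 + \alpha^{n/2} ) \big) \Big].
\]
Choose $\alpha = \frac{\mathcal{E}(f)}{\Lambda(0)}$ so that
\[
\log \Lambda(0) \le 2 + \log \bigg( C \bigg( 1 + \Big( \frac{\mathcal{E}(f)}{\Lambda(0)} \Big)^{n/2} \bigg) \bigg)
\]
and hence
\[
\Lambda(0)^{1+(n/2)} \le K \big[ \Lambda(0)^{n/2} + \mathcal{E}(f)^{n/2} \big] \le K_1 \big[ \Lambda(0) + \mathcal{E}(f) \big]^{n/2},
\]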
Remark 6.3.2 The preceding proof does not attempt to make the constants sharp,
and of course the arguments may be tightened at some point. It is nevertheless pos-
sible to draw some (non-optimal) information concerning the relationships between
the various constants C in Theorem 6.3.1. For example, under the Sobolev inequal-
ity Sn (A, C) for some A ≥ 0, C > 0,
\[
\|P_t\|_{1,\infty} \le \frac{C}{t^{n/2}}
\]
In the last part of this section, we make a few observations and describe some
consequences of Theorem 6.3.1. In particular, we assume in the following that A ≥ 0
in a given Sobolev inequality $S_n(A, C)$. As a consequence of the Sobolev
inequality (6.1.1), one recovers the fact that the standard heat kernels on $\mathbb{R}^n$ are uniformly
bounded by $C \, t^{-n/2}$, $t > 0$, which is of course obvious from the explicit
representation (2.1.1), p. 78. A more careful analysis will show in Corollary 7.1.6, p. 354, that
the optimal bound $(4\pi t)^{-n/2}$ may actually be deduced from the Euclidean
logarithmic Sobolev inequality (6.2.8) with its sharp constant. In Chap. 7, further methods
producing ultracontractive bounds from Sobolev inequalities will be developed.
The first corollary describes other forms of ultracontractive bounds.
\[
\|P_t\|_{p,q} \le \frac{C}{t^{\frac{n}{2} ( \frac{1}{p} - \frac{1}{q} )}}
\]
Proof The first assertion was mentioned in the proof of Theorem 6.3.1 (as a
consequence of the Riesz-Thorin Theorem and the fact that $P_t$ is a contraction
on all $L^p(\mu)$-spaces). For the resolvent $R_\lambda$, use the integral representation
$R_\lambda = \int_0^\infty P_t \, e^{-\lambda t} dt$ from (A.1.2), p. 474. By the preceding, $\|P_t\|_{p,\infty} \le C \, t^{-n/2p}$,
$0 < t \le 1$. In addition, since the operators $P_t$, $t \ge 0$, are contractions in all
$L^p(\mu)$-spaces, the norm $\|P_t\|_{p,\infty}$ is decreasing in $t$, and is in particular bounded
for $t \in (1, \infty)$. Consequently, for some constant $C > 0$ only depending on $p$ and the
Sobolev constants,
\[
\|R_\lambda\|_{p,\infty} \le \int_0^\infty \|P_t\|_{p,\infty} \, e^{-\lambda t} dt
\le C \int_0^1 t^{-n/2p} \, dt + C \int_1^\infty e^{-\lambda t} dt.
\]
The claim follows for $p > \frac{n}{2}$. Together with a further interpolation, the same
argument leads to the corresponding assertion for $p < \frac{n}{2}$. Corollary 6.3.3 is proved.
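The role of the threshold $p > \frac{n}{2}$ is visible from the two elementary integrals in the last display. The following sketch simply evaluates them for $\lambda > 0$.

```latex
\begin{align*}
\int_0^1 t^{-n/2p}\,dt
  &= \frac{1}{1 - \frac{n}{2p}} = \frac{2p}{2p - n}
  \quad\text{if } \frac{n}{2p} < 1, \text{ i.e. } p > \frac{n}{2},
  \qquad (= +\infty \text{ otherwise}), \\
\int_1^\infty e^{-\lambda t}\,dt
  &= \frac{e^{-\lambda}}{\lambda}, \\
\|R_\lambda\|_{p,\infty}
  &\le C\Big(\frac{2p}{2p-n} + \frac{e^{-\lambda}}{\lambda}\Big)
  \quad\text{for } p > \frac{n}{2}.
\end{align*}
```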
When the measure μ is a probability measure and when the Sobolev inequality
is tight (that is Sn (1, C) holds), more precise bounds as t → ∞ may be obtained.
Recall that in this setting tightness is equivalent to a Poincaré inequality (Proposi-
tion 6.2.2).
Proof We prove the result with $T = 2$. For $t > 0$, consider the operator
$P_t^0 f = P_t f - \int_E f \, d\mu$ whose kernel is $p_t(x, y) - 1$. From the hypotheses,
$\|P_1^0\|_{1,\infty} \le C$ so that $\|P_1^0\|_{1,2} \le \sqrt{C}$ and $\|P_1^0\|_{2,\infty} \le \sqrt{C}$. On the other hand,
the Poincaré inequality indicates by Theorem 4.2.5, p. 183, that
\[
\| P_t^0 f \|_2 \le e^{-t/C_2} \, \|f\|_2 .
\]
Theorem 6.3.1 indicates that as soon as a Sobolev inequality holds, the semigroup
(Pt )t≥0 is ultracontractive, and may therefore be represented by a bounded density
kernel $p_t(x, y)$, $t > 0$, $(x, y) \in E \times E$, as
\[
P_t f(x) = \int_E f(y) \, p_t(x, y) \, d\mu(y), \quad t > 0, \; x \in E,
\]
with respect to the invariant measure $\mu$. When the measure $\mu$ is finite, the fact that $P_t$
may be represented by a bounded kernel $p_t(x, y)$ implies that it is Hilbert-Schmidt,
since then
\[
\int_E \int_E p_t^2(x, y) \, d\mu(x) \, d\mu(y) < \infty.
\]
Therefore, the operator Pt , t > 0, is compact (cf. Sect. A.6, p. 483). We summarize
these conclusions in a statement.
Corollary 6.4.1 Under a Sobolev inequality Sn (A, C) for a finite measure μ, the
operators Pt , t > 0, are Hilbert-Schmidt and the spectrum of the Markov generator
L is discrete.
It is a specific consequence of this result that the embedding from the Dirichlet
space $\mathcal{D}(\mathcal{E})$ into $L^2(\mu)$ is compact. Recall that $\mathcal{D}(\mathcal{E})$ is endowed with the norm
$\|f\|_{\mathcal{E}} = [\|f\|_2^2 + \mathcal{E}(f)]^{1/2}$.
Theorem 6.4.2 Under a Sobolev inequality Sn (A, C) for a finite measure μ, for
any sequence (fk )k∈N in D(E) bounded with respect to the norm · E , there is a
subsequence which converges in L2 (μ).
Note that under the Sobolev inequality $S_n(A, C)$, the convergence of the
subsequence $(f_{k_\ell})_{\ell \in \mathbb{N}}$ in the preceding proof also takes place in $L^q(\mu)$, $2 \le q < \frac{2n}{n-2}$.
While Theorem 6.4.2 applies to Sobolev inequalities on spheres $\mathbb{S}^n$ or compact
manifolds, it does not apply directly on $\mathbb{R}^n$ since the invariant Lebesgue measure $dx$
is not finite. The same arguments do however apply on bounded sets in $\mathbb{R}^n$, leading
to the classical Rellich-Kondrachov Theorem. Given a bounded open set $O$ in $\mathbb{R}^n$,
set, for a smooth and compactly supported function $f$ on $O$,
\[
\|f\|_{H^1(O)}^2 = \int_O f^2 \, dx + \int_O |\nabla f|^2 \, dx.
\]
Let $H^1(O)$ be the Hilbert completion of the space of smooth compactly supported
functions in $O$ with respect to this norm, considered as a subspace of $L^2(O, dx)$.
Proof Via the stereographic projection from the sphere Sn onto Rn described in
Sect. 2.2.2, p. 83, compactly supported functions in $O$ may be considered as
functions defined on $\mathbb{S}^n$. Moreover, $O$ being bounded, the norm $\|f\|_{H^1(O)}$ is
equivalent to the Dirichlet norm $\|f\|_{\mathcal{E}}$ on the sphere since on a bounded set $O$, both the
measures and the carré du champ operators are comparable for the usual Laplacian
and the Laplacian on the sphere. It then remains to apply Theorem 6.4.2 on the
sphere to extract, from any bounded sequence in H1 (O), a subsequence convergent
in L2 (O, dx).
Proof According to (6.2.5), the respective inequalities (6.2.2) are equivalent to the
families of (defective) logarithmic Sobolev inequalities
\[
\mathrm{Ent}_{\mu_i}(f^2) \le \Phi_i'(r_i) \, \mathcal{E}_i(f) + \Psi_i(r_i), \quad r_i \in (0, \infty),
\]
where
\[
\Phi_i'(r) = \frac{C}{2 \big( A + \frac{Cr}{n_i} \big)}
\]
and
\[
\Psi_i(r) = \Phi_i(r) - r \, \Phi_i'(r)
= \frac{n_i}{2} \bigg[ \log \Big( A + \frac{Cr}{n_i} \Big) - \frac{\frac{Cr}{n_i}}{A + \frac{Cr}{n_i}} \bigg].
\]
To add dimensions, choose $\frac{r_i}{n_i} = \frac{r}{n}$ for $r > 0$ so that
\[
\Phi_1'(r_1) = \Phi_2'(r_2) = \frac{C}{2 \big( A + \frac{Cr}{n} \big)}
\]
and
\[
\Psi_1(r_1) + \Psi_2(r_2) = \frac{n}{2} \bigg[ \log \Big( A + \frac{Cr}{n} \Big) - \frac{\frac{Cr}{n}}{A + \frac{Cr}{n}} \bigg].
\]
The conclusion then follows from the tensorization of logarithmic Sobolev
inequalities in Proposition 5.2.7, p. 249, and retrieves the logarithmic entropy-energy
inequality on the product space. Proposition 6.5.1 is established.
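The choice of parameters in this proof can be verified directly. The sketch below assumes $n = n_1 + n_2$ and writes $\Phi_i(r) = \frac{n_i}{2}\log(A + \frac{Cr}{n_i})$ and $\Phi(r) = \frac{n}{2}\log(A + \frac{Cr}{n})$; the letters $\Phi$, $\Psi$ are notation assumed for the illustration.

```latex
\begin{align*}
r_i = \frac{n_i}{n}\, r
  \;&\Longrightarrow\;
  A + \frac{C r_i}{n_i} = A + \frac{C r}{n},
  \qquad
  \Phi_i'(r_i) = \frac{C}{2\big(A + \frac{Cr}{n}\big)} = \Phi'(r), \\
\Psi_1(r_1) + \Psi_2(r_2)
  &= \frac{n_1 + n_2}{2}\Big[\log\Big(A + \frac{Cr}{n}\Big)
       - \frac{\frac{Cr}{n}}{A + \frac{Cr}{n}}\Big]
   = \Phi(r) - r\,\Phi'(r), \\
\Phi_1'(r_1)\,\mathcal{E}_1(f) + \Phi_2'(r_2)\,\mathcal{E}_2(f)
  &= \Phi'(r)\,\big[\mathcal{E}_1(f) + \mathcal{E}_2(f)\big].
\end{align*}
```

Both the slope and the intercept on the product therefore match those of the family attached to dimension $n_1 + n_2$, which is why the dimensions add.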
for every function $f$ on the $k$-fold product space $E^k$ such that $\int_{E^k} f^2 \, d\mu^{\otimes k} = 1$. This
is the procedure used in Remark 6.2.6 to reach the optimal Euclidean logarithmic
Sobolev inequality from the Sobolev inequality with sharp constants. If $A = 1$, as
$k \to \infty$, the latter implies (formally) a logarithmic Sobolev inequality $LS(\frac{Cn}{4})$ on
the infinite-dimensional product space. This is another illustration of the importance
of logarithmic Sobolev inequalities in infinite dimension (as a form of limiting or
dimension-free Sobolev inequalities).
That the diameter D is finite under a Sobolev inequality with respect to a prob-
ability measure μ may, for example, be achieved via logarithmic entropy-energy
inequalities (equivalent to Sobolev inequalities by Proposition 6.2.3), much in the
spirit of the Herbst argument in Sect. 5.4.1, p. 252.
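on $F(s) = \frac{1}{s} \log Z(s)$, with $F(0) = \int_E f \, d\mu$. Integrating, it follows that for every
$s \in \mathbb{R}$,
\[
\int_E e^{s (f - \int_E f d\mu)} \, d\mu \le \exp \bigg( s \int_0^s \frac{1}{u^2} \, \Phi \Big( \frac{u^2}{4} \Big) du \bigg).
\]
Setting $C = \int_0^\infty \frac{1}{u^2} \, \Phi ( \frac{u^2}{4} ) du$, which is finite by definition of $\Phi$, it easily follows as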
The bound produced by Proposition 6.6.1 is not sharp. For example, starting
from the Sobolev inequality (6.1.2) on the sphere Sn (with diameter π ), and then the
associated logarithmic entropy-energy inequality from Proposition 6.2.3, it may be
shown that D ≤ C for some (large) C > π . More refined methods not discussed in
this book (see the Notes and References) allow us to reach directly from the Sobolev
inequality (6.1.2) the optimal bound D ≤ π . This (more refined) argument actually
relies on the application of the Sobolev inequality to some suitable transformation
(related to extremal functions of the Sobolev inequality on the sphere) of a Lipschitz
function (and not just the exponential function as for entropy-energy inequalities).
Next, we concentrate on further aspects concerning Lipschitz functions under
Sobolev-type inequalities. One such aspect deals with volumes of balls.
holds and that $V(r_0) < \infty$ for some $r_0 > 0$. Then, provided that
\[
\limsup_{r \to 0} \frac{\log V(r)}{\log r} < \infty, \tag{6.6.1}
\]
Choosing for f the distance function to a given point, V (r) corresponds to the
volume of a ball in the metric space (E, d). The proposition thus indicates that,
provided the volume of the balls does not decay too rapidly to 0 as r → 0, the
volume of small balls is bounded from below by $c \, r^n$ where $n$ is the dimension
of the Sobolev inequality. This result is already coherent on $\mathbb{R}^n$. In particular, the
exponent n is minimal and determined by the volume of (small) balls. This property
explains why, for a diffusion operator with smooth coefficients on a Riemannian
manifold with dimension n, the Sobolev exponent is at least n.
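\[
\frac{1}{2} \, \mathbf{1}_{\{f \le \frac{r}{2}\}} \le g \le \mathbf{1}_{\{f \le r\}}
\quad \text{and} \quad
\Gamma(g) \le \frac{1}{r^2} \, \mathbf{1}_{\{f \le r\}},
\]
the inequality $S_n(A, C)$ yields, for some $C_1 > 0$,
\[
V \Big( \frac{r}{2} \Big)^{2/p} \le C_1 \Big( 1 + \frac{1}{r^2} \Big) V(r) \le \frac{2 C_1}{r^2} \, V(r)
\]
\[
V \Big( \frac{r_1}{2^k} \Big) \le \frac{a^{(p/2)^k}}{C_2} \, \frac{r_1^n}{2^{kn}}
\]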
The proof of the previous proposition actually shows more. Namely, if the volume
of small balls is not at least of the order $r^n$, then it has to decay at least
exponentially as $e^{-c \, r^{-b}}$ as $r \to 0$ for some $b, c > 0$.
As a first step, we are looking for inequalities which hold for Gaussian mea-
sures in finite dimension, precisely capturing the dimensional feature. To this
end, we begin with the Euclidean logarithmic Sobolev inequality of
Proposition 6.2.5. Recall also the standard Gaussian measure on the Borel sets of $\mathbb{R}^n$,
$d\mu(x) = (2\pi)^{-n/2} e^{-|x|^2/2} dx$.
for every smooth positive function $g$ on $\mathbb{R}^n$ with $\int_{\mathbb{R}^n} g \, d\mu = 1$ and such that $\Delta g$ and
$g \, \Delta(\log g)$ are well-defined and integrable with respect to $\mu$.
Comparing with the general setting, if $\mathcal{A}_0$ denotes the set of smooth compactly
supported functions on $\mathbb{R}^n$, Proposition 6.7.1 is established for functions in $\mathcal{A}_0^{\mathrm{const}+}$
(cf. Remark 3.3.3, p. 154), possibly extended then to larger classes of functions.
Note that (6.7.1) improves upon the standard logarithmic Sobolev inequality
(6.2.9) for the Gaussian measure $\mu$ since $\log(1 + r) \le r$, $r \ge 0$, and
\[
g \, \Delta(\log g) = \Delta g - g \, \big| \nabla(\log g) \big|^2 = \Delta g - \frac{|\nabla g|^2}{g}.
\]
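The identity for $g\,\Delta(\log g)$ is a direct application of the chain rule. A minimal verification for a smooth positive function $g$ on $\mathbb{R}^n$ reads:

```latex
\begin{align*}
\nabla(\log g) &= \frac{\nabla g}{g}, \\
\Delta(\log g) &= \nabla\cdot\Big(\frac{\nabla g}{g}\Big)
               = \frac{\Delta g}{g} - \frac{|\nabla g|^2}{g^2}, \\
g\,\Delta(\log g) &= \Delta g - \frac{|\nabla g|^2}{g}
                  = \Delta g - g\,\big|\nabla(\log g)\big|^2 .
\end{align*}
```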
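$\int_{\mathbb{R}^n} \hat g^2 \, d\mu = 1$. (Actually, in this concrete setting, the smooth approximations can
be made easily.) Then use the Euclidean logarithmic Sobolev inequality (6.2.8) in
which we change $f$ back to $(2\pi)^{-n/4} \, \hat g \, e^{-|x|^2/4}$. After integration by parts for the
in other words
\[
\int_{\mathbb{R}^n} \Delta g \, d\mu - \int_{\mathbb{R}^n} g \, \big| \nabla(\log g) \big|^2 d\mu < n \int_{\mathbb{R}^n} g \, d\mu.
\]
But, from the integration by parts formula (2.7.8), p. 107, which has already been
used earlier,
\[
\int_{\mathbb{R}^n} \Delta g \, d\mu = \int_{\mathbb{R}^n} g \, \big( |x|^2 - n \big) \, d\mu,
\]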
\[
\Gamma_2(\log f) \ge \frac{1}{n} \big( \Delta(\log f) \big)^2
\]
which is precisely the curvature-dimension condition $CD(0, n)$ for the Laplace
operator on $\mathbb{R}^n$ (applied to $\log f$). In other words, while the standard Gaussian
logarithmic Sobolev inequality (6.2.9) only captures by dilation the $CD(0, \infty)$ curvature
condition, the improved inequality (6.7.1) allows us to reach the dimension $n$. There
are few inequalities in Euclidean space with such a property. However, as soon as
one is found, it may usually be extended to a general framework.
The following theorem is one such illustration, providing a further example of
the semigroup interpolation scheme.
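\[
P_t(f \log f) - P_t f \log P_t f
\le t \, L P_t f + \frac{n}{2} \, P_t f \, \log \bigg( 1 - \frac{2t}{n} \, \frac{P_t(f \, L(\log f))}{P_t f} \bigg). \tag{6.7.4}
\]
\[
P_t(f \log f) - P_t f \log P_t f
\ge t \, L P_t f - \frac{n}{2} \, P_t f \, \log \Big( 1 + \frac{2t}{n} \, L(\log P_t f) \Big). \tag{6.7.5}
\]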
Proof Start with the most important step, namely the proof of (ii) under the
curvature-dimension CD(0, n) hypothesis (which will then imply (iii) and (iv)).
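As usual, set $\Lambda(s) = P_s(\phi(P_{t-s} f))$, $s \in [0, t]$, $t > 0$, where $\phi(r) = r \log r$, $r \in \mathbb{R}_+$,
and hence $f \in \mathcal{A}_0^{\mathrm{const}+}$. As already observed earlier in Proposition 5.5.3, p. 261,
\[
\Lambda'(s) = P_s \bigg( \frac{\Gamma(P_{t-s} f)}{P_{t-s} f} \bigg) = P_s \big( P_{t-s} f \, \Gamma(\log P_{t-s} f) \big)
\]
and
\[
\Lambda''(s) = 2 P_s \big( P_{t-s} f \, \Gamma_2(\log P_{t-s} f) \big).
\]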
so that the numerator of the same expression takes the form $[L P_t f - \Lambda'(s)]^2$. This
results in the differential inequality
\[
\Lambda''(s) \ge \frac{2 \, [L P_t f - \Lambda'(s)]^2}{n P_t f}, \quad s \in [0, t]. \tag{6.7.6}
\]
In this differential inequality, the terms $L P_t f$ and $P_t f$ behave as constants, so that
we are actually dealing with an inequality of the form $u' \ge (a + bu)^2$, with $u = \Lambda'$.
More precisely, setting $\alpha = \frac{2}{n P_t f}$, the differential inequality (6.7.6) leads to the
inequality, valid for any $0 \le u \le v \le t$,
\[
\big[ \Lambda'(v) - L P_t f \big] - \big[ \Lambda'(u) - L P_t f \big]
\ge \alpha (v - u) \big[ \Lambda'(u) - L P_t f \big] \big[ \Lambda'(v) - L P_t f \big], \tag{6.7.7}
\]
which holds regardless of the signs of $\Lambda'(v) - L P_t f$ and $\Lambda'(u) - L P_t f$. The
inequality (6.7.3) of (ii) is then a direct application of (6.7.7) with $u = 0$ and $v = t$.
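The passage from (6.7.6) to (6.7.7) is the classical comparison argument for a Riccati-type inequality. The sketch below uses the shorthand $w(s) = \Lambda'(s) - L P_t f$ (notation introduced here, with $\Lambda$ the interpolation function of the proof), so that (6.7.6) reads $w' \ge \alpha\, w^2$ with $\alpha = \frac{2}{n P_t f} > 0$; in particular $w$ is non-decreasing.

```latex
\begin{align*}
&\text{If } w \neq 0 \text{ on } [u,v]:\quad
  \frac{d}{ds}\Big(-\frac{1}{w(s)}\Big) = \frac{w'(s)}{w(s)^2} \ge \alpha
  \;\Longrightarrow\;
  \frac{1}{w(u)} - \frac{1}{w(v)} \ge \alpha\,(v-u), \\
&\text{and multiplying by } w(u)\,w(v) > 0:\quad
  w(v) - w(u) \ge \alpha\,(v-u)\,w(u)\,w(v). \\[4pt]
&\text{If } w(u) \le 0 \le w(v):\quad
  w(v) - w(u) \ge 0 \ge \alpha\,(v-u)\,w(u)\,w(v).
\end{align*}
```

Since $w$ is non-decreasing, either $w(u)$ and $w(v)$ have the same strict sign (and then $w$ does not vanish on $[u, v]$) or $w(u) \le 0 \le w(v)$, so the two cases cover all sign configurations.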
As already obtained in the proof of Theorem 5.5.2, p. 259, under the $CD(0, \infty)$
condition,
\[
\frac{\Gamma(P_t f)}{P_t f} \le P_t \bigg( \frac{\Gamma(f)}{f} \bigg),
\]
that is $P_t f \, L(\log P_t f) \ge P_t(f \, L(\log f))$. Combined with (6.7.3), this immediately
yields that
\[
1 + \frac{2t}{n} \, L(\log P_t f) > 0 \tag{6.7.8}
\]
(if $r \ge s(1 + \alpha t r)$ and $r \ge s$, then necessarily $1 + \alpha t r > 0$).
Furthermore, we may deduce from (6.7.7) that
\[
\frac{-1}{\alpha s - (\Lambda'(0) - L P_t f)^{-1}} \le \Lambda'(s) - L P_t f
\le \frac{1}{\alpha (t - s) + (\Lambda'(t) - L P_t f)^{-1}},
\]
from which the bounds (6.7.4) and (6.7.5) immediately follow by integration
between 0 and $t$.
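The integration step can be sketched as follows, again with the shorthand $w(s) = \Lambda'(s) - L P_t f$ and $\alpha = \frac{2}{n P_t f}$ (notation assumed for the illustration). It uses $\Lambda(0) = P_t f \log P_t f$, $\Lambda(t) = P_t(f \log f)$, $w(0) = -P_t f\, L(\log P_t f)$ and $w(t) = -P_t(f\, L(\log f))$, and the pointwise bounds $\frac{w(0)}{1 - \alpha s\, w(0)} \le w(s) \le \frac{w(t)}{1 + \alpha (t-s)\, w(t)}$, which follow from $w(v) - w(u) \ge \alpha (v-u)\, w(u)\, w(v)$ with $(u, v) = (0, s)$ and $(s, t)$.

```latex
\begin{align*}
\int_0^t w(s)\,ds
  &= \Lambda(t) - \Lambda(0) - t\,L P_t f
   = P_t(f\log f) - P_t f\,\log P_t f - t\,L P_t f, \\
\int_0^t \frac{w(0)}{1 - \alpha s\,w(0)}\,ds
  &= -\frac{1}{\alpha}\log\big(1 - \alpha t\,w(0)\big)
   = -\frac{n}{2}\,P_t f\,\log\Big(1 + \frac{2t}{n}\,L(\log P_t f)\Big), \\
\int_0^t \frac{w(t)}{1 + \alpha (t-s)\,w(t)}\,ds
  &= \frac{1}{\alpha}\log\big(1 + \alpha t\,w(t)\big)
   = \frac{n}{2}\,P_t f\,\log\Big(1 - \frac{2t}{n}\,\frac{P_t(f\,L(\log f))}{P_t f}\Big).
\end{align*}
```

Comparing the first line with the second gives the lower bound (6.7.5), and with the third the upper bound (6.7.4).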
Finally, any of (ii), (iii), (iv) yields CD(0, n) by a Taylor expansion at t = 0. The
proof is complete.
As for Proposition 5.5.3, p. 261, it is worth observing that (6.7.4) and (6.7.5) are
obtained directly from the curvature-dimension condition without the explicit use of
the gradient bound (3.3.18), p. 163, contrary to their weaker forms in Theorem 5.5.2,
p. 259. As such, this strategy of proof may be used in settings where gradient bounds
are lacking, for example in hypoelliptic settings where on the other hand extensions
of the curvature-dimension condition may be available.
Remark 6.7.4 If the condition $CD(0, n)$ is replaced by the more general $CD(\rho, n)$,
$\rho \in \mathbb{R}$, the differential inequality on $\Lambda$ becomes
\[
\Lambda''(s) \ge \frac{2 \, [L P_t f - \Lambda'(s)]^2}{n P_t f} + \rho \, \Lambda'(s), \quad s \in [0, t].
\]
This is not harder to solve, except that the corresponding result leads to
expressions involving trigonometric and logarithmic functions (depending on the sign
of $\rho$ and the values of $L P_t f$ and $P_t f$), and the final result is far less pleasant to
express. Fundamentally, such a differential equation (especially when $\rho > 0$) may
have no bounded solutions on $[0, t]$ whenever the coefficients (here $L P_t f$ and $P_t f$)
are not controlled in some proper way with respect to the initial or final values
$\Lambda'(0)$ and $\Lambda'(t)$. When $\rho = 0$, the Li-Yau inequality expresses precisely this
non-explosion property, and the equivalent form under $\rho \ne 0$ may similarly be
established.
The proof of Theorem 6.7.3, in particular, (6.7.8), includes the celebrated Li-
Yau parabolic inequality. According to Remark 6.7.4 above, versions under the
curvature-dimension CD(ρ, n) condition are also available.
\[
L(\log P_t f) > - \frac{n}{2t}.
\]
Alternatively,
\[
\frac{\Gamma(P_t f)}{(P_t f)^2} - \frac{L P_t f}{P_t f} < \frac{n}{2t}. \tag{6.7.9}
\]
The Li-Yau inequality is one of the main tools in establishing Harnack-type in-
equalities for solutions of heat equations and Gaussian heat kernel bounds with
the correct dependence on dimension (on say Riemannian manifolds). As part of
the family of logarithmic Sobolev inequalities for heat kernel measures, it may be
qualitatively compared to Wang’s (infinite-dimensional) Harnack inequality (The-
orem 5.6.1, p. 265). For simplicity, we state the Harnack inequality in the setting
of a diffusion operator $L = \Delta_g - \nabla W \cdot \nabla$ on a complete connected Riemannian
manifold $(M, g)$.
Corollary 6.7.6 (Harnack inequality under CD(0, n)) Under the curvature-dimension
condition CD(0, n), for every positive measurable function f on M, every x, y ∈ M
and every 0 < t < t + s,
\[
P_t f(x) \le P_{t+s} f(y) \Big(\frac{t+s}{t}\Big)^{n/2} e^{d(x,y)^2/4s}, \tag{6.7.10}
\]
where d(x, y) is the Riemannian distance from x to y. Conversely, the Harnack
inequality (6.7.10), which holds for every x, y ∈ M and every 0 < t < t + s, implies
in return the Li-Yau inequality (6.7.9).
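To see (6.7.10) at work in the simplest example, the sketch below assumes the Euclidean heat semigroup on the real line (n = 1, generator $d^2/dx^2$) and takes f to be a Dirac mass at a point z, so that $P_t f(x)$ is the Gaussian kernel $p_t(x, z)$. These choices, and the helper names, are assumptions for illustration. In this case the inequality reduces to $|y-z|^2/(t+s) \le |x-z|^2/t + |x-y|^2/s$, a form of the Cauchy-Schwarz inequality.

```python
import math

def heat_kernel(x, z, t):
    """Gaussian kernel of d^2/dx^2 on the real line: (4*pi*t)^(-1/2) exp(-(x-z)^2/(4t))."""
    return (4.0 * math.pi * t) ** (-0.5) * math.exp(-(x - z) ** 2 / (4.0 * t))

def harnack_rhs(x, y, z, t, s):
    """Right-hand side of (6.7.10) with n = 1 and f a Dirac mass at z."""
    return (heat_kernel(y, z, t + s) * ((t + s) / t) ** 0.5
            * math.exp((x - y) ** 2 / (4.0 * s)))

def harnack_holds(x, y, z, t, s):
    """Check p_t(x, z) <= RHS, with a tiny relative slack for floating-point rounding."""
    return heat_kernel(x, z, t) <= harnack_rhs(x, y, z, t, s) * (1.0 + 1e-12)
```

Equality occurs when x lies on the segment from z to y with $(x - z)/t = (y - x)/s$, for example z = 0, x = 1, y = 2, t = s = 1.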
For the proof, as in Theorem 5.6.1, p. 265, let $(x_u)_{u \in [t, t+s]}$ be a geodesic with
constant speed joining x to y and consider
\[
\Lambda(u) = u^{n/2}\, e^{u\,d(x,y)^2/4s^2}\, P_u f(x_u), \qquad u \in [t, t+s].
\]
Then the Li-Yau inequality tells us precisely that the derivative of $\log \Lambda$ is positive
on the interval [t, t + s], from which the claim follows. The converse statement is
established as the corresponding assertion in Theorem 5.6.1, p. 265.
Harnack inequalities as in Corollary 6.7.6 are traditionally used to produce sharp
off-diagonal Gaussian bounds on heat kernels of the form
comparison theorems in manifolds with positive Ricci curvature, the latter may then
be shown to imply uniform heat kernel bounds in the form of ultracontractivity,
and thus functional inequalities of Sobolev-type (cf. Sect. 6.3). Off-diagonal upper
bounds are achieved using the techniques presented in Sect. 7.2, p. 355, in the next
chapter. A typical result in this context is that
\[
p_t(x, y) \le \frac{C_{n,\varepsilon}}{V(x, \sqrt{t})^{1/2}\, V(y, \sqrt{t})^{1/2}}\; e^{-d(x,y)^2/(4+\varepsilon)t} \tag{6.7.11}
\]
for all t > 0, ε > 0, (x, y) ∈ M × M.
Lower bounds may be investigated similarly, and improve with a polynomial factor
the lower bound (5.6.3), p. 267. For example, Corollary 6.7.6, applied with the roles
of s and t exchanged and for f approaching a Dirac mass at x, yields that
\[
p_s(x, x) \le p_{t+s}(x, y) \Big(\frac{t+s}{s}\Big)^{n/2} e^{d(x,y)^2/4t}
\]
for all s > 0 and t > 0. Provided the local asymptotics
$\lim_{s \to 0} (4\pi s)^{n/2} p_s(x, x) = 1$ holds, it follows, letting s → 0, that
\[
p_t(x, y) \ge \frac{1}{(4\pi t)^{n/2}}\; e^{-d(x,y)^2/4t} \tag{6.7.12}
\]
for all t > 0 and (x, y) ∈ M × M.
These precise upper and lower bounds on heat kernels actually significantly im-
prove milder small time asymptotics at a logarithmic scale, which on the other
hand may be considered in greater generality. For example, the classical Laplace-
Varadhan small time asymptotics
where
\[
M = \Big(\frac{q_1 - 1}{u_1}\Big)^{1-(1/q_1)} \Big(\frac{q_2 - 1}{u_2}\Big)^{(1/q_2)-1}
\Big(\frac{q_1 u_2 - q_2 u_1}{q_2 - q_1}\Big)^{(1/q_2)-(1/q_1)}.
\]
Proof We proceed as in the proof of Theorem 5.5.5, p. 262, and only outline the
main steps. Starting from the curvature condition (i), let $f \in \mathcal{A}_0^{\mathrm{const}+}$
as usual, and consider
\[
\Lambda(s) = \big(P_u\big((P_{t-s} f)^q\big)\big)^{1/q}, \qquad s \in [0, t], \; t > 0,
\]
where now q : [0, t] → (1, ∞) and u : [0, t] → [0, ∞) are functions of s. Then, with
$g = P_{t-s} f$,
\[
\frac{q^2}{q'}\,\Lambda^{q-1}\Lambda' = \mathrm{Ent}_{p_u}(g^q)
+ \frac{q^2}{q'}\,(u' - 1)\,P_u\big(g^{q-1} L g\big)
+ \frac{q^2}{q'}\,u'(q - 1)\,P_u\big(g^{q-2}\,\Gamma(g)\big). \tag{6.7.15}
\]
Theorem 6.7.3 indicates that (i) is equivalent to the local logarithmic Sobolev
inequality (6.7.4). By linearization, the latter amounts to the family of inequalities
with parameter κ > 0,
\[
\mathrm{Ent}_{p_u}(f) \le u\,L P_u f + \frac{n}{2}\,(\kappa - 1 - \log \kappa)\,P_u f - u\,\kappa\,P_u\big(f\,L(\log f)\big).
\]
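Now apply these inequalities to $g^q$ and compare with (6.7.15). Choosing
q = q(s), u = u(s) and κ = κ(s) solving the system
\[
\begin{cases}
\dfrac{q}{q'}\,(u' - 1) + u\,(1 - \kappa) = 0 \\[2mm]
u\,(q - 1 + \kappa) + \dfrac{q}{q'}\,u'\,(q - 1) = 0
\end{cases} \tag{6.7.16}
\]
we get that $\frac{q^2}{q'}\,\frac{\Lambda'}{\Lambda} \le \frac{n}{2}\,A(\kappa)$ with
$A(\kappa) = \kappa - 1 - \log \kappa$, κ > 0. Assuming that q is decreasing, it follows
that $\Lambda(s) \le e^{n M_{s,t}/2}\,\Lambda(t)$ where
\[
M_{s,t} = -\int_s^t \frac{q'(r)}{q(r)^2}\,A\big(\kappa(r)\big)\,dr.
\]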
6.8 Sobolev Inequalities Under a Curvature-Dimension Condition 305
In other words,
\[
\big(P_{u(s)}\big((P_{t-s} f)^{q(s)}\big)\big)^{1/q(s)} \le e^{n M_{s,t}/2}\,\big(P_{u(t)}\big(f^{q(t)}\big)\big)^{1/q(t)}.
\]
In order to optimize the value of $M_{s,t}$, we choose q(r), u(r) and κ(r) on [s, t] of
the form
\[
q(r) = \frac{r + \gamma}{\alpha r + \beta}, \qquad u(r) = \alpha r + \beta, \qquad \kappa = \frac{1 - q}{1 - \alpha q},
\]
where α, β, γ are constants that we adjust to fit the values q(0) = q_2 and
q(t) = q_1, and for which u(0) = u_1 and u(t) = u_2 (q is therefore decreasing
since it is monotonic). It is easily checked that those functions solve the
system (6.7.16). The conclusion then follows after tedious computations. As for
Theorem 6.7.3, (ii) yields CD(0, n) by a Taylor expansion as u_1 goes to u_2 and
q_1 to q_2.
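The claim that these functions solve (6.7.16) can be checked numerically. The sketch below is only an illustration: the parameter values α, β, γ used in the usage note are arbitrary choices (not taken from the text), and the helper names are hypothetical. It evaluates the two left-hand sides of (6.7.16) using $q'(r) = (\beta - \alpha\gamma)/(\alpha r + \beta)^2$ and $u'(r) = \alpha$.

```python
def q_of(r, alpha, beta, gamma):
    """q(r) = (r + gamma) / (alpha r + beta)."""
    return (r + gamma) / (alpha * r + beta)

def u_of(r, alpha, beta):
    """u(r) = alpha r + beta."""
    return alpha * r + beta

def kappa_of(r, alpha, beta, gamma):
    """kappa = (1 - q) / (1 - alpha q), evaluated at q = q(r)."""
    q = q_of(r, alpha, beta, gamma)
    return (1.0 - q) / (1.0 - alpha * q)

def system_residuals(r, alpha, beta, gamma):
    """Left-hand sides of the two equations of (6.7.16); both should vanish."""
    q = q_of(r, alpha, beta, gamma)
    u = u_of(r, alpha, beta)
    kappa = kappa_of(r, alpha, beta, gamma)
    q_prime = (beta - alpha * gamma) / u ** 2  # derivative of q with respect to r
    u_prime = alpha                            # derivative of u with respect to r
    first = (q / q_prime) * (u_prime - 1.0) + u * (1.0 - kappa)
    second = u * (q - 1.0 + kappa) + (q / q_prime) * u_prime * (q - 1.0)
    return first, second
```

With, say, α = 0.5, β = 1 and γ = 3, the function q decreases from q(0) = 3, stays above 1 and κ(r) = r + 4 is positive, and both residuals vanish up to rounding.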
The inequalities (6.7.14) in Theorem 6.7.7 are optimal in the sense that if L is the
Laplacian in $\mathbb{R}^n$ (therefore satisfying CD(0, n)), these inequalities are equalities for
$f_\alpha(x) = e^{\alpha |x|^2}$, $x \in \mathbb{R}^n$, α ∈ R. There is also a version of
Theorem 6.7.7 in the range −∞ < q_2 < q_1 < 0 or 0 < q_2 < q_1 < 1.
It is not so easy to prove the Sobolev inequality directly. Again, a convenient tool is
to establish instead a logarithmic entropy-energy inequality (6.2.2). The following
theorem is a first main result in this direction. As usual, we consider a Markov Triple
$(E, \mu, \Gamma)$ under the curvature-dimension condition CD(ρ, n), ρ ∈ R, n ≥ 1,
\[
\Gamma_2(f) \ge \rho\,\Gamma(f) + \frac{1}{n}\,(Lf)^2
\]
for every $f \in \mathcal{A}_0$ (or $\mathcal{A}$) from Definition 3.3.14, p. 159. Recall that
since ρ > 0 in the statement below, the invariant measure μ is assumed to be finite and
normalized into a probability measure (cf. Theorem 3.3.23, p. 165).
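For orientation, here is a minimal sketch of what the condition says in the flat model, under the assumption (made here for illustration) that L is the Euclidean Laplacian on $\mathbb{R}^n$. In that case $\Gamma_2(f)$ is the squared Hilbert-Schmidt norm of the Hessian and $Lf$ is its trace, so CD(0, n) reduces to the Cauchy-Schwarz inequality $|H|^2 \ge (\mathrm{tr}\,H)^2/n$ for symmetric matrices H. The helper names are hypothetical.

```python
import random

def frobenius_norm_sq(h):
    """Sum of squared entries, standing for Gamma_2(f) = |Hess f|^2 in the flat case."""
    return sum(v * v for row in h for v in row)

def trace(h):
    """Trace of the matrix, standing for Lf = Laplacian of f."""
    return sum(h[i][i] for i in range(len(h)))

def random_symmetric(n, rng):
    """Random symmetric n x n matrix with entries in [-1, 1]."""
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            h[i][j] = h[j][i] = rng.uniform(-1.0, 1.0)
    return h

def cd_gap(h):
    """Gamma_2(f) - (Lf)^2 / n for a Hessian h; nonnegative under CD(0, n)."""
    n = len(h)
    return frobenius_norm_sq(h) - trace(h) ** 2 / n
```

The gap vanishes exactly for multiples of the identity matrix, which corresponds to quadratic functions such as $|x|^2$.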
We turn to the proof of Theorem 6.8.1 which is similar to the proof of the logarithmic
Sobolev inequality under the CD(ρ, ∞) condition (Theorem 5.7.4, p. 270).
Proof of Theorem 6.8.1 Working with $f = \sqrt{f_0}$, $f_0 \in \mathcal{A}_0^{\mathrm{const}+}$,
the inequality to be proved is
\[
\mathrm{Ent}_\mu(f) \le \frac{n}{2}\,\log\Big(1 + \frac{1}{\rho n} \int_E f\,\Gamma(\log f)\,d\mu\Big) \tag{6.8.1}
\]
If $\int_E e^g\,d\mu = 1$, by Jensen's inequality,
\[
\int_E e^g\,\Gamma(g)^2\,d\mu \ge \Big(\int_E e^g\,\Gamma(g)\,d\mu\Big)^2,
\]
and the method of proof of Theorem 6.8.1 can then be used in the same way to get
this time the differential inequality
\[
\Lambda'' \ge -\frac{2\rho n}{n-1}\,\Lambda' + \frac{4n-1}{2(n+2)^2}\,\Lambda'^2.
\]
In this way, we end up with the logarithmic entropy-energy inequality
\[
\mathrm{Ent}_\mu(f^2) \le \frac{p}{2}\,\log\Big(1 + \frac{4(n-1)}{p\,\rho\,n}\,\mathcal{E}(f)\Big)
\]
for every $f \in D(\mathcal{E})$ such that $\int_E f^2\,d\mu = 1$, where
$p = \frac{4(n+2)^2}{4n-1}$. This inequality is of weaker Sobolev dimension but yields
this time the sharp logarithmic Sobolev inequality.
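Two features of this inequality can be read off with a short computation: the exponent $p = 4(n+2)^2/(4n-1)$ is larger than n (hence the weaker Sobolev dimension), and bounding $\log(1+x) \le x$ turns the right-hand side into $\frac{2(n-1)}{\rho n}\,\mathcal{E}(f)$, which does not depend on p. The sketch below is an illustration with hypothetical function names; it does not reproduce the proof.

```python
import math
from fractions import Fraction

def sobolev_dimension_p(n):
    """The exponent p = 4(n+2)^2 / (4n - 1) of the entropy-energy inequality."""
    return Fraction(4 * (n + 2) ** 2, 4 * n - 1)

def entropy_energy_bound(energy, n, rho):
    """(p/2) * log(1 + 4(n-1) E / (p rho n)) for a given value E of the energy."""
    p = float(sobolev_dimension_p(n))
    return 0.5 * p * math.log1p(4.0 * (n - 1) * energy / (p * rho * n))

def linearized_bound(energy, n, rho):
    """The linear bound 2(n-1) E / (rho n) obtained from log(1 + x) <= x."""
    return 2.0 * (n - 1) * energy / (rho * n)
```

For small energies the two bounds agree to first order, which is how the logarithmic Sobolev inequality is recovered from the entropy-energy form.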
Via Proposition 6.2.3, Theorem 6.8.1 thus shows how the curvature-dimension con-
dition CD(ρ, n) with ρ > 0 and n < ∞ produces a Sobolev inequality Sn (A, C)
of dimension n. This Sobolev inequality may furthermore be tightened. However,
the constants given by this process are not sharp. Actually, the CD(ρ, n) condition,
ρ > 0, n < ∞, may indeed be shown to reach optimal Sobolev inequalities but using
a completely different technique of proof.
Theorem 6.8.3 (Sobolev inequality under CD(ρ, n)) Under the curvature-dimension
condition CD(ρ, n), ρ > 0, 2 < n < ∞, the Markov Triple $(E, \mu, \Gamma)$,
with μ a probability measure, satisfies a Sobolev inequality $S_n(C)$ with constant
$C = \frac{4(n-1)}{\rho\,n(n-2)}$. That is,
\[
\|f\|_p^2 \le \|f\|_2^2 + \frac{4}{n(n-2)} \cdot \frac{n-1}{\rho}\;\mathcal{E}(f) \tag{6.8.2}
\]
It is not known how to prove this theorem using (linear) heat flow monotonicity.
In fact, the semigroup proof of Theorem 6.8.1 (for example) may be adapted
to Sobolev inequalities up to some (rather mysterious) exponent $2 < p < \frac{2n}{n-2}$.
In Sect. 6.11, alternative evolution equations (fast diffusion equations) will be
developed in order to obtain Sobolev inequalities in the form of Gagliardo-Nirenberg
inequalities. Here, we use instead non-linear methods on extremal functions. We
first sketch the formal argument, and then provide the technical details
needed to complete the proof.
for all $f \in D(\mathcal{E})$ with C > 0 the optimal constant. If it exists, let f
be a positive non-constant extremal function for this inequality, normalized so that
$\int_E f^p\,d\mu = 1$, that is so that
\[
f^{p-1} = f - C\,Lf. \tag{6.8.3}
\]
Later, in the true proof, we shall show that such an extremal function is bounded
from above and below, justifying all the integrations below. Setting $f = e^g$, the
equation becomes
\[
e^{(p-2)g} = 1 - C\,\big(Lg + \Gamma(g)\big). \tag{6.8.4}
\]
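Then multiply both terms of the latter identity by $e^{bg} Lg$, b ∈ R, and integrate with
respect to μ. By integration by parts, it follows that
\[
\begin{aligned}
(p - 2 + b) \int_E e^{(p-2)g}\,e^{bg}\,\Gamma(g)\,d\mu
&= b \int_E e^{bg}\,\Gamma(g)\,d\mu + C \int_E e^{bg}\,(Lg)^2\,d\mu \\
&\quad - C \int_E e^{bg}\,\Gamma\big(g, \Gamma(g)\big)\,d\mu - C\,b \int_E e^{bg}\,\Gamma(g)^2\,d\mu.
\end{aligned}
\]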
The following makes use of various integral formulas on the $\Gamma_2$ operator developed
in the proof of Theorem 5.7.4, p. 270. For example, by (5.7.8), p. 271, with
2a = b,
\[
\int_E e^{bg}\,(Lg)^2\,d\mu = \int_E e^{bg}\,\Gamma_2(g)\,d\mu
+ \frac{3b}{2} \int_E e^{bg}\,\Gamma\big(g, \Gamma(g)\big)\,d\mu
+ b^2 \int_E e^{bg}\,\Gamma(g)^2\,d\mu.
\]
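Now replace $e^{(p-2)g}$ by its expression from (6.8.4) and integrate by parts again to
get, after some algebra,
\[
\begin{aligned}
\int_E e^{bg}\,\Gamma_2(g)\,d\mu
&= \frac{p-2}{C} \int_E e^{bg}\,\Gamma(g)\,d\mu
+ \Big(p - 1 - \frac{b}{2}\Big) \int_E e^{bg}\,\Gamma\big(g, \Gamma(g)\big)\,d\mu \\
&\quad + (p - 2)(b - 1) \int_E e^{bg}\,\Gamma(g)^2\,d\mu.
\end{aligned} \tag{6.8.5}
\]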
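On the other hand, arguing as for (5.7.10), p. 271, the CD(ρ, n) condition yields,
for every a ∈ R,
\[
\begin{aligned}
\int_E e^{bg}\,\Gamma_2(g)\,d\mu
&\ge \frac{\rho n}{n-1} \int_E e^{bg}\,\Gamma(g)\,d\mu
+ \frac{3b - 2(n+2)a}{2(n-1)} \int_E e^{bg}\,\Gamma\big(g, \Gamma(g)\big)\,d\mu \\
&\quad + \frac{b^2 - 2ab - (n-1)a^2}{n-1} \int_E e^{bg}\,\Gamma(g)^2\,d\mu.
\end{aligned} \tag{6.8.6}
\]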
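The parameters a and b are then chosen so that, in (6.8.5) and (6.8.6), the coefficients
of $\int_E e^{bg}\,\Gamma(g, \Gamma(g))\,d\mu$ and of $\int_E e^{bg}\,\Gamma(g)^2\,d\mu$
coincide. Recalling that $p = \frac{2n}{n-2}$, this is achieved by choosing
\[
a = \frac{b}{2} - \frac{n-1}{n-2} \qquad \text{and} \qquad b = \frac{2(n-3)}{n-2}.
\]
As a consequence, it follows that
\[
\frac{p-2}{C} \int_E e^{bg}\,\Gamma(g)\,d\mu \ge \frac{\rho n}{n-1} \int_E e^{bg}\,\Gamma(g)\,d\mu.
\]
Since $\int_E e^{bg}\,\Gamma(g)\,d\mu > 0$ (because f is assumed to be non-constant),
$\frac{p-2}{C} \ge \frac{\rho n}{n-1}$, or in other words,
\[
C \le \frac{4(n-1)}{\rho\,n(n-2)}.
\]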
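The algebra behind this choice of a and b is easy to verify with exact rational arithmetic. The following sketch (function names are illustrative) compares the coefficients of the two mixed terms in (6.8.5) and (6.8.6) for $p = 2n/(n-2)$, and recomputes the resulting bound on C.

```python
from fractions import Fraction

def exponents(n):
    """Return (p, a, b) with p = 2n/(n-2), b = 2(n-3)/(n-2), a = b/2 - (n-1)/(n-2)."""
    p = Fraction(2 * n, n - 2)
    b = Fraction(2 * (n - 3), n - 2)
    a = b / 2 - Fraction(n - 1, n - 2)
    return p, a, b

def coefficients_685(n):
    """Coefficients of the Gamma(g, Gamma(g)) and Gamma(g)^2 integrals in (6.8.5)."""
    p, _, b = exponents(n)
    return p - 1 - b / 2, (p - 2) * (b - 1)

def coefficients_686(n):
    """Coefficients of the same two integrals in (6.8.6)."""
    _, a, b = exponents(n)
    mixed = (3 * b - 2 * (n + 2) * a) / (2 * (n - 1))
    quartic = (b * b - 2 * a * b - (n - 1) * a * a) / (n - 1)
    return mixed, quartic

def constant_bound(n, rho):
    """Upper bound (p - 2)(n - 1)/(rho n) on C obtained by comparing the Gamma(g) terms."""
    p, _, _ = exponents(n)
    return (p - 2) * Fraction(n - 1) / (Fraction(rho) * n)
```

For every n > 2 the two pairs of coefficients agree (they equal $5/(n-2)$ and $4(n-4)/(n-2)^2$), and `constant_bound(n, rho)` equals $4(n-1)/(\rho n(n-2))$.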
This function f is therefore in the domain D(L) and satisfies the extremal function
equation
\[
f^{q-1} = (1 + \varepsilon)\,f - C(q, \varepsilon)\,Lf.
\]
Next, we make sure that f is bounded from above and below (by a strictly positive
constant). To this end, observe that the equation for extremal functions may be
rewritten as
\[
f = C(q, \varepsilon)^{-1} \Big(\frac{1+\varepsilon}{C(q, \varepsilon)}\,\mathrm{Id} - L\Big)^{-1} \big(f^{q-1}\big)
= C(q, \varepsilon)^{-1}\,R_\lambda\big(f^{q-1}\big),
\]
where $\lambda = \frac{1+\varepsilon}{C(q, \varepsilon)}$ and
$R_\lambda = (\lambda\,\mathrm{Id} - L)^{-1} = \int_0^\infty e^{-\lambda t} P_t\,dt$ is the resolvent operator.
For the upper bound, Corollary 6.3.3 already indicates that $R_\lambda(f^{q-1})$ is bounded
as soon as $f \in L^r(\mu)$ for some $r > (q-1)\frac{n}{2}$. But from the same argument, as
soon as $f \in L^r(\mu)$, then $f \in L^{r'}(\mu)$ for any $r' < r/a$ where
$a = (q-1) - \frac{2r}{n}$. Since a < 1 as soon as $r > (q-2)\frac{n}{2}$, by a simple
iterative procedure, it follows that f is bounded as soon as it belongs to
$L^{q_0}(\mu)$ for some $q_0 > (q-2)\frac{n}{2}$. Since in our case $f \in L^q(\mu)$ and
$q > (q-2)\frac{n}{2}$, it is indeed bounded. For the (strictly positive) lower
bound, Proposition 6.3.4 provides a lower bound on the kernel of $P_t$, and thus, say
for t ≥ 1, $P_t g \ge c \int_E g\,d\mu$ for some c > 0 and every positive function g.
This in turn implies that
\[
R_\lambda\big(f^{q-1}\big) \ge \int_1^\infty P_t\big(f^{q-1}\big)\,e^{-\lambda t}\,dt
\ge \frac{c\,e^{-\lambda}}{\lambda} \int_E f^{q-1}\,d\mu
\]
from which the claim easily follows. Putting together the various pieces, the proof
of Theorem 6.8.3 is complete.
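The iterative procedure for the upper bound can be simulated on the level of exponents. The sketch below is a hedged illustration, not part of the proof: starting from r = q, it repeatedly replaces r by a value slightly below $r/a$ with $a = (q-1) - 2r/n$, until r exceeds the threshold $(q-1)n/2$ beyond which $R_\lambda(f^{q-1})$ is bounded. The `slack` factor (how far below $r/a$ each step lands) and the function name are assumptions of this example.

```python
def bootstrap_exponents(q, n, slack=0.99, max_steps=100):
    """Follow the integrability exponents r -> slack * r / a, a = (q-1) - 2r/n.

    Returns the list of exponents visited and the threshold (q-1) n / 2.
    Assumes q > (q-2) n / 2, so that a < 1 at every step.
    """
    threshold = (q - 1) * n / 2.0
    r = float(q)
    history = [r]
    for _ in range(max_steps):
        if r > threshold:
            break  # f^(q-1) is integrable enough for the resolvent to be bounded
        a = (q - 1) - 2.0 * r / n
        # a > 0 here since r <= threshold; a == 0 only when r equals the threshold,
        # in which case every finite exponent is allowed and we jump past it.
        r = 2.0 * threshold if a == 0 else slack * r / a
        history.append(r)
    return history, threshold
```

For example, with n = 5 and q = 3 (below the critical value 10/3), the exponents go from 3 to about 3.71 and then to about 7.14, which exceeds the threshold 5 after two steps.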
Remark 6.8.4 The preceding proof similarly shows that, under the curvature-dimension
condition CD(ρ, n) with ρ > 0 and 2 < n < ∞, for every $f \in D(\mathcal{E})$,
\[
\frac{\|f\|_q^2 - \|f\|_2^2}{q - 2} \le \frac{n-1}{\rho n}\;\mathcal{E}(f) \tag{6.8.7}
\]
for every $2 \le q \le \frac{2n}{n-2}$. Such an inequality will be called later in Sect. 7.6,
p. 382, a Beckner-type inequality. In this form, it actually holds for every
$1 \le q \le \frac{2n}{n-2}$, the value q = 1 corresponding to the Poincaré inequality, the
value q = 2 to the logarithmic Sobolev inequality (in the limit), and $q = \frac{2n}{n-2}$ to
the Sobolev inequality. By Proposition 6.2.2, any inequality in this family implies the
Poincaré inequality. Since on the sphere $\mathbb{S}^n$, ρ = n − 1, the constant in (6.8.7) is
optimal for any q, and in particular Theorem 6.8.3 leads to the optimal Sobolev
inequality (6.1.2) in this case. When 1 ≤ n ≤ 2, inequality (6.8.7) may actually be
considered for any q ≥ 1 and implies, as q → ∞, Moser-Trudinger-type inequalities of the form
\[
\log \int_E e^f\,d\mu - \int_E f\,d\mu \le \frac{n-1}{2\rho n}\;\mathcal{E}(f), \qquad f \in D(\mathcal{E}). \tag{6.8.8}
\]
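A quick consistency check between (6.8.7) and (6.8.2): multiplying (6.8.7) through by q − 2 gives the constant $(q-2)(n-1)/(\rho n)$ in front of $\mathcal{E}(f)$, and at the critical exponent $q = 2n/(n-2)$ this should be the Sobolev constant $4(n-1)/(\rho n(n-2))$, which becomes $4/(n(n-2))$ on the sphere where ρ = n − 1. The sketch below uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def beckner_constant(q, n, rho):
    """Constant in front of E(f) after multiplying (6.8.7) by (q - 2)."""
    return (Fraction(q) - 2) * Fraction(n - 1) / (Fraction(rho) * n)

def critical_exponent(n):
    """The Sobolev exponent 2n / (n - 2)."""
    return Fraction(2 * n, n - 2)

def sobolev_constant(n, rho):
    """The constant 4(n-1) / (rho n (n-2)) of Theorem 6.8.3."""
    return Fraction(4 * (n - 1)) / (Fraction(rho) * n * (n - 2))
```

Calling `beckner_constant(critical_exponent(n), n, rho)` returns the same fraction as `sobolev_constant(n, rho)` for every n > 2.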
Remark 6.8.5 Under the condition CD(ρ, n), for ρ > 0, 2 < n < ∞, the family of
inequalities (6.8.7) in the previous remark describes optimal Poincaré, logarithmic
Sobolev and Sobolev inequalities. The constant in (6.8.7) may actually be slightly
improved together with the spectral gap whenever $q < \frac{2n}{n-2}$. Indeed denoting by $\lambda_1$
6.9 Conformal Invariance of Sobolev Inequalities 313
\[
\kappa = \frac{(n-1)^2}{(n+1)^2} \cdot \frac{\rho n}{n-1} + \frac{4n}{(n+1)^2}\,\lambda_1.
\]
Since $\lambda_1 \ge \frac{\rho n}{n-1}$ by (4.8.1), p. 213, these bounds improve upon (6.8.7).
They are obtained by a simple modification at the end of the (formal) proof of
Theorem 6.8.3, using that $\lambda_1 \int_E \Gamma(f)\,d\mu \le \int_E (Lf)^2\,d\mu$ according to
Proposition 4.8.3, p. 214. As a geometric consequence, together with Obata's Theorem,
an n-dimensional Riemannian manifold with Ricci curvature bounded from below by n − 1
for which κ = n for some $q < \frac{2n}{n-2}$ (in particular if the logarithmic Sobolev
constant is equal to $\frac{1}{n}$) is isometric to the n-sphere $\mathbb{S}^n$ (Obata's Theorem
tells us this is the case when $\lambda_1 = n$).
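The arithmetic behind these statements is short. Since $(n-1)^2 + 4n = (n+1)^2$, the quantity κ is a convex combination of $\rho n/(n-1)$ and $\lambda_1$, so it is at least $\rho n/(n-1)$ whenever $\lambda_1$ is, and on the sphere (ρ = n − 1, $\lambda_1 = n$) it equals n. The sketch below only checks this arithmetic; how κ enters the improved inequality is not restated here, and the function names are illustrative.

```python
from fractions import Fraction

def kappa(n, rho, lambda1):
    """kappa = (n-1)^2/(n+1)^2 * rho n/(n-1) + 4n/(n+1)^2 * lambda1."""
    weight_curvature = Fraction((n - 1) ** 2, (n + 1) ** 2)
    weight_gap = Fraction(4 * n, (n + 1) ** 2)
    return weight_curvature * Fraction(rho) * n / (n - 1) + weight_gap * Fraction(lambda1)

def lichnerowicz_bound(n, rho):
    """The lower bound rho n / (n - 1) on the spectral gap lambda1."""
    return Fraction(rho) * n / (n - 1)
```

For instance `kappa(n, n - 1, n)` returns n, and replacing `lambda1` by `lichnerowicz_bound(n, rho)` returns that bound itself.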
Conformal invariance may be presented in various ways. On the one hand, it may
be seen simply as the change on a Riemannian manifold (M, g) with dimension n
Definition 6.9.1 (Scalar curvature) The scalar curvature $\mathrm{sc}_g(x)$ at a point x ∈ M is
the trace of the Ricci tensor. In a local coordinate system,
\[
\mathrm{sc}_g(x) = \sum_{i,j=1}^{n} g^{ij}(x)\,\mathrm{Ric}_{ij}(x)
\]
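have to be adjusted), of $(E, \mu, \Gamma)$ is the set of all Markov Triples
$(E, c^{-n}\mu, c^2\Gamma)$ for all strictly positive functions $c \in \mathcal{A}$. An
n-conformal invariant of the n-conformal class of $(E, \mu, \Gamma)$ is a map
$S = S(\mu, \Gamma) : E \to \mathbb{R}$ depending only on μ and Γ such that, for any
function $c = e^\tau \in \mathcal{A}$,
\[
S\big(c^{-n}\mu, c^2\Gamma\big) = c^2 \Big[ S(\mu, \Gamma) + \frac{n-2}{2} \Big( L\tau - \frac{n-2}{2}\,\Gamma(\tau) \Big) \Big]. \tag{6.9.1}
\]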
for some C > 0, is invariant in the n-conformal class of $(E, \mu, \Gamma)$. That is, if it
holds for the pair $(\mu, \Gamma)$, then it holds with the same constant C for the pair
$(c^{-n}\mu, c^2\Gamma)$ for every strictly positive function $c \in \mathcal{A}$.
Proof The proof is immediate. It suffices to change f into $c^{(2-n)/2} f$ in the Sobolev
inequality and, after the use of (6.9.1), to integrate by parts the mixed term in the
change of variables formula for the operator Γ,
\[
\Gamma\big(c^{(2-n)/2} f\big) = c^{2-n} \Big[ \Gamma(f) - \frac{n-2}{2}\,\Gamma(f^2, \tau) + \frac{(n-2)^2}{4}\,f^2\,\Gamma(\tau) \Big].
\]
Note that by the change $f \mapsto c^{(2-n)/2} f$ used in Proposition 6.9.2, the generator
L associated with $(E, \mu, \Gamma)$ is transformed for $(E, c^{-n}\mu, c^2\Gamma)$ with
$c = e^\tau$ into
\[
\hat{L} f = c^2 \big[ Lf - (n-2)\,\Gamma(\tau, f) \big]. \tag{6.9.2}
\]
According to Proposition 6.9.2 above, the aim here will be to show that, in a Riemannian
setting, the scalar curvature is such an n-conformal invariant (thus satisfying (6.9.1)),
where n is the dimension of the manifold. We refer to Appendix C for the necessary
Riemannian background. In Riemannian geometry, when g is the (co-) metric and $d\mu_g$
the Riemannian volume element, the change of Γ into $c^2\Gamma$ and $d\mu_g$ into
$c^{-n} d\mu_g$ corresponds to a conformal change of the metric g into $c^2 g$.
the carré du champ operator takes the form
$\Gamma(f) = \sum_{i,j=1}^{n} g^{ij}\,\partial_i f\,\partial_j f$, the Riemannian measure
is given in local coordinates by $d\mu_g = \det(g)^{-1/2}\,dx$, where dx is the Lebesgue
measure in the coordinate system. (Observe that, contrary to our usual convention, it is
not normalized whenever it is finite.) The aim is to examine how the scalar curvature is
modified by such a conformal transformation. To this end we first study the Ricci
curvature. Two approaches are available: one via the $\Gamma_2$-calculus for the new
operator, and another in local coordinates. Both are heavy and somewhat cumbersome, but
it might be easier to work with the $\Gamma_2$ operator, which we do below.
While the case of a Laplacian will be our main illustrative example, the $\Gamma_2$
formulas which need to be developed are just as easy to describe in the general case of
a generator of the form $L = \Delta_g - \nabla W \cdot \nabla$ for some smooth potential W on a
Riemannian manifold (M, g) with symmetric invariant measure $d\mu = e^{-W} d\mu_g$. This
setting will furthermore be suited to our later investigation of non-geometric examples.
We thus discuss the $\Gamma_2$ operations below in this more general setting. In this
instance, recall from (1.16.4), p. 71 (cf. also Sect. C.6, p. 513), that the $\Gamma_2$
operator is given on smooth functions by
\[
\Gamma_2(f) = |\nabla\nabla f|^2 + \mathrm{Ric}(L)(\nabla f, \nabla f)
\]
where Ric(L) is a symmetric tensor defined from the Ricci tensor $\mathrm{Ric}_g$ of the
Riemannian manifold (M, g) by $\mathrm{Ric}(L) = \mathrm{Ric}_g + \nabla\nabla W$.
In this setting, the first step will be to compute, for any n and any smooth function
c, the Ricci tensor associated with $(c^{-n}\mu, c^2\Gamma)$ from the Ricci tensor associated
to $(\mu, \Gamma)$. Although most interesting formulas will be achieved when n is the
dimension of M, assume for the moment that it is distinct from this dimension, denoted
here by $n_0$. The computations below are performed as usual on the class $\mathcal{A}_0$ of
smooth compactly supported functions on M.
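Given c written in the form $c = e^\tau$, the $\Gamma_2$ operator associated with $c^2 L$ may be
expressed in terms of the $\Gamma_2$ operator of L as
\[
c^4 \Big[ \Gamma_2(f) + 2\,\Gamma\big(\tau, \Gamma(f)\big) + \big(L\tau + 2\,\Gamma(\tau)\big)\,\Gamma(f) - 2\,Lf\,\Gamma(f, \tau) \Big].
\]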
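According to Proposition 6.9.2, and for a generic n, the aim is to modify this formula
in terms of the operator $\hat{L}$ of (6.9.2), which thus requires us to replace $\Gamma_2(f)$ by
$\Gamma_2(f) + (n-2)\,\nabla\nabla\tau(\nabla f, \nabla f)$ and L by $L - (n-2)\,\Gamma(\tau, \cdot)$
in the preceding. Hence, denoting by $\hat{\Gamma}_2$ the $\Gamma_2$ operator of $\hat{L}$,
\[
\begin{aligned}
\hat{\Gamma}_2(f) = c^4 \Big[ & \Gamma_2(f) + (n-2)\,\nabla\nabla\tau(\nabla f, \nabla f) + 2\,\Gamma\big(\tau, \Gamma(f)\big) \\
& + \big(L\tau - (n-4)\,\Gamma(\tau)\big)\,\Gamma(f) - 2\,Lf\,\Gamma(f, \tau) + 2(n-2)\,\Gamma(f, \tau)^2 \Big].
\end{aligned}
\]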
In this formula, $n_0$ (the effective dimension of the manifold) appears as the norm,
in the space of symmetric matrices, of the identity matrix in the form of the term
$\Gamma(f, \tau)\,g$. The term $\Delta_g f$ appears as the scalar product, still in the space of
symmetric matrices, of $\nabla\nabla f$ and g, that is the trace of $\nabla\nabla f$. Lastly,
comparing the two formulas,
\[
\begin{aligned}
\mathrm{Ric}(\hat{L})(\nabla f, \nabla f) = c^4 \Big[ & \mathrm{Ric}(L)(\nabla f, \nabla f) + (n-2)\,\nabla\nabla\tau(\nabla f, \nabla f) \\
& + \big(L\tau - (n-2)\,\Gamma(\tau)\big)\,\Gamma(f) + 2\,\nabla W(f)\,\Gamma(f, \tau) \\
& + (2n - n_0 - 2)\,\Gamma(f, \tau)^2 \Big].
\end{aligned}
\]
In other words, in terms of Ricci tensors with lower indices, that is as symmetric
operators acting on the tangent space,
\[
\mathrm{Ric}(\hat{L}) = \mathrm{Ric}(L) + (n-2)\,\nabla\nabla\tau + \big(L\tau - (n-2)\,\Gamma(\tau)\big)\,g
+ 2\,\nabla W \odot \nabla\tau + (2n - n_0 - 2)\,\nabla\tau \odot \nabla\tau.
\]
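Taking the trace to reach scalar curvature, we get the following formula for the scalar
curvature $\mathrm{sc}_{\hat{g}}$ of $\hat{L}$ in terms of $\mathrm{sc}_g$,
\[
\mathrm{sc}_{\hat{g}} = c^2 \Big[ \mathrm{sc}_g + (n-1)\big( 2\,\Delta_g \tau - (n-2)\,\Gamma(\tau) \big) \Big].
\]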
Observe that when n is the dimension of the manifold (which is the case here), the
n-conformal class of a given Laplacian only contains Laplacians. Note furthermore
that, setting $f = e^{-(n-2)\tau/2}$ and as usual $p = \frac{2n}{n-2}$, n > 2, (6.9.3) can be
rewritten as
\[
f^{p-1}\,\mathrm{sc}_{\hat{g}} = f\,\mathrm{sc}_g - \frac{4(n-1)}{n-2}\,\Delta_g f. \tag{6.9.4}
\]
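Formula (6.9.4) can be tested on the stereographic picture of the sphere. The sketch below assumes, for illustration, that the base space is Euclidean $\mathbb{R}^n$ (scalar curvature 0), that the conformal factor is $c(x) = (1+|x|^2)/2$ so that $c^2 |\nabla f|^2$ is the spherical carré du champ in stereographic coordinates, and that the sphere has scalar curvature n(n − 1). With $f = c^{-(n-2)/2}$, (6.9.4) then reads $n(n-1)\,f^{p-1} = -\frac{4(n-1)}{n-2}\,\Delta f$, which is checked by finite differences on radial functions. The function names are hypothetical.

```python
def conformal_f(r, n):
    """f = c^(-(n-2)/2) with c = (1 + r^2)/2, as a function of r = |x|."""
    return ((1.0 + r * r) / 2.0) ** (-(n - 2) / 2.0)

def radial_laplacian(func, r, n, h=1e-4):
    """Finite-difference Laplacian in R^n of a radial function: f'' + (n-1) f'/r."""
    first = (func(r + h) - func(r - h)) / (2.0 * h)
    second = (func(r + h) - 2.0 * func(r) + func(r - h)) / (h * h)
    return second + (n - 1) * first / r

def residual_694(r, n):
    """Left side minus right side of (6.9.4) with sc_g = 0 and sc_hat = n(n-1)."""
    p = 2.0 * n / (n - 2)
    f = conformal_f(r, n)
    laplacian = radial_laplacian(lambda s: conformal_f(s, n), r, n)
    return f ** (p - 1) * n * (n - 1) + 4.0 * (n - 1) / (n - 2) * laplacian
```

The residual is zero up to discretization error at every radius r > 0, in line with the extremal function equation discussed later in this section.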
As a consequence of these developments and of Proposition 6.9.2, we may now
state the following main conclusion.
Theorem 6.9.4 (Optimal Sobolev inequalities on the model spaces) For the Riemannian
measures μ on the three model spaces $\mathbb{R}^n$, $\mathbb{S}^n$, $\mathbb{H}^n$ (not normalized on
$\mathbb{S}^n$), n > 2, the following optimal Sobolev inequalities with $p = \frac{2n}{n-2}$ hold
(for every function f in the respective Dirichlet domains $D(\mathcal{E})$).
(i) On the Euclidean space $\mathbb{R}^n$,
\[
\|f\|_p^2 \le \frac{4}{n(n-2)\,\omega_n^{2/n}}\;\mathcal{E}(f).
\]
Remark 6.9.5 For the Sobolev inequality in Rn (case (i)), the L2 (dx) norm of the
function f does not appear in the statement. However, the fact that f ∈ D(E) (and
therefore that f ∈ L2 (dx)) may not be removed without care since, for example,
the inequality cannot hold for a non-zero constant function. However, through stan-
dard localization and symmetrization procedures in Euclidean space, the Sobolev
inequality (i) may be extended to any smooth function f such that |∇f | ∈ L2 (dx)
and such that voln (|f | ≥ ε) < ∞ for any ε > 0.
Proof We start from the optimal Sobolev inequality on the sphere Sn established in
Theorem 6.8.3, rewritten here under the un-normalized uniform measure. Now, it
was observed in Sect. 2.2.2, p. 83, that in the stereographic projection, the metric of
the sphere is a conformal transformation of the metric of the Euclidean space Rn .
The scalar curvature of Rn is 0, and the resulting optimal Sobolev inequality on Rn
is thus obtained from Theorem 6.9.3. By the conformal invariance principle, these
two Sobolev inequalities are therefore equivalent. One can add to the equivalence
the Sobolev inequality on the hyperbolic space Hn since Hn may be identified with
Rn−1 × (0, ∞) with a metric given as a conformal transformation of the Euclidean
metric with constant scalar curvature −n(n − 1). The proof is complete.
A further important property and application drawn from Theorem 6.9.3 concerns
extremal functions. Assume that the two metrics given by Γ and $c^2\Gamma$ have
constant scalar curvature equal to n(n − 1). Then, the equation (6.9.4) of conformal
transformation of scalar curvature indicates that the function $f = c^{-(n-2)/2}$ satisfies
the equation
\[
f^{p-2} = 1 - \frac{4}{n(n-2)}\,\frac{Lf}{f},
\]
which is precisely Eq. (6.8.3) of extremal functions for the Sobolev inequality on the
sphere with constant $C = \frac{4}{n(n-2)}$. Now, there exist conformal maps on the sphere,
that is diffeomorphisms from $\mathbb{S}^n$ onto itself, such that the image carré du champ
operators are of the form $c^2\Gamma$. But under such a conformal map, the metric $c^2\Gamma$ is
nothing else but the metric Γ viewed under a change of coordinates. The new metric
has constant scalar curvature (since curvature is independent of the coordinates and
invariant under diffeomorphisms). These observations thus explain the existence of
non-constant extremal functions for the Sobolev inequality at the critical exponent
on the sphere. (It may be shown that there are no non-constant extremal functions
for the Sobolev inequality (6.8.7) below the critical exponent.)
It might be useful to briefly describe the conformal maps from Sn into itself. Let
Q be a point in Rn+1 exterior to the sphere Sn . To each I ∈ Sn associate the other
intersection point J between Sn and the line QI . The map I → J is then such a
conformal map as a consequence of the following three observations:
It should be observed that such functions are not in L2 (dx) when n = 3, 4. However,
they satisfy the equation for extremal functions and the Euclidean Sobolev inequal-
ity (i) applies to them thanks to Remark 6.9.5. These functions will appear naturally
when solving fast diffusion equations in Rn (see Sect. 6.11 below) justifying the
role played by those equations in connection with Sobolev inequalities.
Similarly, on the sphere $\mathbb{S}^n$, extremal functions for the optimal Sobolev inequality
((ii) in Theorem 6.9.4) may be written as
\[
f_{b,e}(x) = \big(\sigma + b\,(e \cdot x)\big)^{-(n-2)/2}, \tag{6.9.6}
\]
from which the conclusion follows with $\alpha = -\frac{b}{\sigma}$. Observe that, for n integer,
n > 2, this change of variable from x to y is the trace on zonal functions of the
conformal transformation of the sphere.
The conformal invariance result of Theorem 6.9.3 is not restricted to the Laplacian
case and to the case where n is the dimension of the manifold. In fact, other general
(and sometimes useful) forms of conformal invariance may be considered once a
good candidate to replace the scalar curvature is determined. In the last (and somewhat
technical) part of this section, we discuss a few relevant instances of interest
on the basis of the operations on $\Gamma_2$ already developed for Theorem 6.9.3.
In the following examples, we restrict ourselves to the case of elliptic operators
on some $n_0$-dimensional manifold M. Set
$\Gamma(f) = \sum_{i,j=1}^{n_0} g^{ij}\,\partial_i f\,\partial_j f$ where $g = (g^{ij})$ is elliptic,
and as usual denote respectively by $\Delta_g$, $\mu_g$ and $\mathrm{sc}_g$ the Laplace
operator, Riemannian measure and scalar curvature associated with the (co-) metric
g. Using this notation we look at measures μ which have a density $e^{-W}$ with respect
to $\mu_g$. The following non-geometric conformal invariance result illustrates a variety
of examples.
Proof It is enough to check that $S_\alpha(\mu, \Gamma)$ satisfies (6.9.1). The transformation
rule for $\mathrm{sc}_g$ under the change $(\mu_g, \Gamma) \mapsto (c^{-n_0}\mu_g, c^2\Gamma)$ was
already considered previously. It remains to observe that if the density of μ is $e^{-W}$
for $(\mu_g, \Gamma)$, the density of $c^{-n}\mu$ with respect to $c^{-n_0}\mu_g$ is equal to
$e^{-W} c^{n_0 - n} = e^{-W - (n - n_0)\tau}$. The computations are then straightforward
(although somewhat tedious).
In fact Proposition 6.9.6 applies for any n ≠ n_0, but in view of Sobolev inequalities,
it is only interesting for n > n_0. As alluded to above, restricting to
Laplace operators leads to the restriction of conformal transformations to n = n_0.
Indeed, the change of μ into $c^{-n}\mu$ changes a Riemannian measure into a measure
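du champ operator
\[
\Gamma(f) = \Big(\frac{1 + |x|^2}{2}\Big)^2\,|\nabla f|^2
\]
and measure
\[
d\nu(x) = \frac{2^n}{\omega_{n,n_0}} \cdot \frac{x_1^{\,n-n_0}}{(1 + |x|^2)^n}\,dx
\]
where $\omega_{n,n_0}$ is the normalizing constant (so that ν is a probability measure). The
resulting inequality takes the form, with $p = \frac{2n}{n-2}$,
\[
\Big( \int_{\mathbb{R}^{n_0-1} \times \mathbb{R}_+} |f|^p\,d\nu \Big)^{2/p}
\le \frac{4}{n(n-2)} \Big[ S_\alpha(\mu, \Gamma) \int_{\mathbb{R}^{n_0-1} \times \mathbb{R}_+} f^2\,d\nu
+ \int_{\mathbb{R}^{n_0-1} \times \mathbb{R}_+} \Gamma(f)\,d\nu \Big].
\]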
Apply the n-conformal invariance result to the function $c(x) = \frac{2}{1+|x|^2}$ to obtain
a new n-dimensional inequality on $\mathbb{R}^{n_0-1} \times \mathbb{R}_+$, but with the usual carré du
champ operator $|\nabla f|^2$ and the measure $x_1^{\,n-n_0}\,dx$. For this pair, and for the
same α, $S_\alpha = 0$ so that, finally
\[
\Big( \int_{\mathbb{R}^{n_0-1} \times \mathbb{R}_+} |f|^p\,\frac{x_1^{\,n-n_0}\,dx}{\omega_{n,n_0}} \Big)^{2/p}
\le \frac{4}{n(n-2)} \int_{\mathbb{R}^{n_0-1} \times \mathbb{R}_+} |\nabla f|^2\,\frac{x_1^{\,n-n_0}\,dx}{\omega_{n,n_0}}. \tag{6.9.8}
\]
This Sobolev inequality is precisely the announced one. In this non-geometric picture,
the quantity $S_\alpha$ behaves in the same way as the scalar curvature for Laplace
operators as it is constant and strictly positive on the sphere and vanishes on $\mathbb{R}^n$.
Further connections between Sobolev inequalities, extremal functions and met-
rics with constant scalar curvature may be developed along these lines. For example,
as already mentioned, there is a strong connection, through (6.9.4), between confor-
mal transformations of a space into itself which preserve constant scalar curvature
and extremal solutions of Sobolev inequalities. Furthermore, Sobolev inequalities
may be used to control functions c for which the metric associated with $c^2\Gamma$ has
constant scalar curvature.
ing the differential equation ∂t u = Lu. The sharp constants are however obtained by
some variant, nevertheless still dealing with the infinitesimal Markov generator L
of (Pt )t≥0 . As discussed in the preceding Sect. 6.9, together with conformal invari-
ance, the sharp Sobolev inequality on the sphere then yields the optimal Sobolev
inequalities in Euclidean and hyperbolic spaces. In this picture, the spherical case,
with its sharp constants, and the curvature-dimension condition, appear as critical.
However, this methodology has not yet produced any type of Sobolev inequal-
ity comparable to the Euclidean inequality directly from the curvature-dimension
condition CD(0, n), and similarly for the hyperbolic space under CD(ρ, n) with
ρ < 0. One explanation for this is the following. When ρ > 0, the invariant mea-
sure is finite (cf. Theorem 3.3.23, p. 165) and may thus naturally be normalized into
a probability measure. It is actually from this normalization that optimal constants
are produced. By conformal invariance, this normalization is also reflected in the
sharp Sobolev inequalities on Rn and Hn . However, when the invariant measure
has infinite mass, there is no natural normalization, and hence when the measure is
multiplied by a (strictly positive) constant, the constants in the Sobolev inequality
are modified. Therefore, there is no hope of directly obtaining Sobolev inequali-
ties under curvature-dimension conditions without any further information on the
measure.
In this section, we examine how to suitably modify Sobolev inequalities in Eu-
clidean space in order to obtain new families of inequalities which will be shown in
the next section to be accessible under the curvature-dimension condition CD(ρ, n)
for ρ ∈ R, via non-linear evolution equations. With this task in mind, we first ex-
tend the Sobolev and Nash inequalities to the more general family of Gagliardo-
Nirenberg inequalities, and then consider entropic formulations of the latter.
For simplicity, only A ≥ 0 is considered here although the formal definition in-
cludes A ∈ R. The Sobolev inequality Sn (A, C) of Definition 6.2.1 belongs to the
Gagliardo-Nirenberg family for θ = 1 or q = s, while the Nash inequality (6.2.3)
is achieved for q = 2, s = 1. As for the Nash inequality (cf. Proposition 6.2.3), any
inequality GNn (q, s ; A, C) is equivalent, up to constants, to the Sobolev inequality
Sn (A, C) with the same dimension n.
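Proposition 6.10.3 For any α > 0, α ≠ 1, such that θα < 1, the Gagliardo-Nirenberg
inequality $GN_n(q, s\,; A, C)$ is equivalent to the inequality
\[
\|f\|_q^{2\alpha} \le \alpha\theta \big[ A\,\|f\|_2^2 + C\,\mathcal{E}(f) \big] + (1 - \alpha\theta)\,\|f\|_s^{2\beta},
\qquad f \in D(\mathcal{E}) \cap L^s(\mu), \tag{6.10.2}
\]
where $\beta = \frac{\alpha(1-\theta)}{1-\theta\alpha}$.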
\[
r^{\alpha\theta} \le \alpha\theta\,r_0^{\alpha\theta-1}\,r + (1 - \alpha\theta)\,r_0^{\alpha\theta}.
\]
2γ
Choose then r0 = f s with γ = α−1
1−αθ . Conversely, replace f by cf and optimize
in c > 0.
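The elementary inequality used in this argument is the tangent-line bound for the concave function $r \mapsto r^{a}$ with exponent $a = \alpha\theta \in (0, 1)$: for all r, r_0 > 0, $r^a \le a\,r_0^{a-1}\,r + (1-a)\,r_0^a$, with equality at r = r_0. A minimal numerical illustration follows; the function names are hypothetical and `a` stands for the product αθ.

```python
def tangent_upper_bound(r, r0, a):
    """Value at r of the tangent line to r -> r^a at the point r0."""
    return a * r0 ** (a - 1.0) * r + (1.0 - a) * r0 ** a

def concavity_gap(r, r0, a):
    """Tangent line minus the function; nonnegative when 0 < a < 1."""
    return tangent_upper_bound(r, r0, a) - r ** a
```

The gap vanishes at r = r_0, which is why optimizing over r_0 loses nothing in the equivalence.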
Theorem 6.10.4 (Del Pino-Dolbeault Theorem) Let ν > n > 2, and set
\[
q = \frac{2\nu}{\nu - 2}, \qquad s = \frac{2(\nu - 1)}{\nu - 2}
\]
where $\frac{1}{q} = \frac{\theta}{p} + \frac{1-\theta}{s}$ and where the (optimal) constant C > 0 is
the one for which there is equality for the function
\[
f(x) = \big(1 + |x|^2\big)^{-(\nu-2)/2}, \qquad x \in \mathbb{R}^n.
\]
Proof Although the result is true as stated for ν ≥ n > 2, the method of proof developed
here only works for $\nu \ge n + \frac{1}{2}$, n > 2, to which we restrict below. It is however
a quite general strategy to deduce sharp inequalities from other optimal inequalities
provided there exist extremal functions. We first deal with the case $\nu = n + \frac{m}{2}$ for
some integer m ≥ 1 and start from the optimal Sobolev inequality on $\mathbb{R}^{n+m}$,
\[
|\nabla g|^2 = \frac{(n+m-2)^2}{4}\,\big(f + |y|^2\big)^{-(n+m)} \big[ |\nabla f|^2 + 4\,|y|^2 \big].
\]
Now, for a > 0,
\[
\int_{\mathbb{R}^m} \big(a + |y|^2\big)^{-(n+m)}\,dy = c_{n,m}\,a^{-n-(m/2)}
\]
and
\[
\int_{\mathbb{R}^m} |y|^2\,\big(a + |y|^2\big)^{-(n+m)}\,dy = d_{n,m}\,a^{-n-(m/2)+1}
\]
where $c_{n,m}, d_{n,m} > 0$ only depend on n, m. In this picture, the measure
$(1 + |x|^2)^{-(n+m)}\,dx$ is simply the image of the spherical measure of $\mathbb{R}^{n+m}$ (viewed
in stereographic projection from $\mathbb{S}^{n+m}$ onto $\mathbb{R}^{n+m}$, cf. Sect. 2.2.2, p. 83) on
$\mathbb{R}^n$ by orthogonal projection.
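The two scaling identities follow from the change of variables $y = \sqrt{a}\,z$. As an illustration, the sketch below checks them numerically in the one-dimensional case m = 1 with n = 2 (so the exponent n + m is 3), where the constants are known in closed form: $\int_{\mathbb{R}} (1+y^2)^{-3}\,dy = 3\pi/8$ and $\int_{\mathbb{R}} y^2 (1+y^2)^{-3}\,dy = \pi/8$. The choice of m and n, the quadrature scheme and the function name are assumptions of this example.

```python
import math

def radial_integral(a, k, weight_power=0, steps=20000):
    """Midpoint rule for the integral over R of |y|^weight_power * (a + y^2)^(-k) dy.

    Uses the substitution y = tan(theta), dy = (1 + y^2) d(theta), theta in (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        y = math.tan(theta)
        total += abs(y) ** weight_power * (a + y * y) ** (-k) * (1.0 + y * y)
    return total * h
```

With k = 3, the product `radial_integral(a, 3) * a ** 2.5` stays equal to 3π/8 for every a > 0, and `radial_integral(a, 3, 2) * a ** 1.5` stays equal to π/8, matching the exponents $-n-(m/2)$ and $-n-(m/2)+1$.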
The integrated Sobolev inequality therefore takes the form
\[
\Big( \int_{\mathbb{R}^n} f^{-n-(m/2)}\,dx \Big)^{2/p}
\le C_1 \int_{\mathbb{R}^n} f^{-n-(m/2)}\,|\nabla f|^2\,dx + C_2 \int_{\mathbb{R}^n} f^{-n-(m/2)+1}\,dx
\]
for constants $C_1, C_2 > 0$ only depending on n, m. The construction ensures that this
inequality is an equality for $f = 1 + |x|^2$ since in this case the function g is an
extremal function of the Sobolev inequality in $\mathbb{R}^{n+m}$. Changing f into
$f^{-4/(2n+m-4)}$ yields the inequality, with further constants $C_3, C_4 > 0$,
\[
\Big( \int_{\mathbb{R}^n} f^q\,dx \Big)^{2/p} \le C_3 \int_{\mathbb{R}^n} |\nabla f|^2\,dx + C_4 \int_{\mathbb{R}^n} f^s\,dx
\]
for the values q, s specified in the statement (recall that $\nu = n + \frac{m}{2}$), and for which
there is now equality for $f = (1 + |x|^2)^{-(\nu-2)/2}$. Changing f to cf and optimizing
in c > 0 yields the announced Gagliardo-Nirenberg inequality. The inequality is
optimal since it admits non-constant extremal functions.
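The exponent bookkeeping in the substitution $f \mapsto f^{-k}$, $k = 4/(2n+m-4)$, can be checked with exact fractions. With $\nu = n + m/2$, the power $f^{-\nu}$ becomes $f^{k\nu}$, the power $f^{-\nu+1}$ becomes $f^{k(\nu-1)}$, and the weight in front of $|\nabla f|^2$ becomes $f^{k\nu - 2k - 2}$ (up to the constant factor $k^2$). The sketch below, with an illustrative function name, confirms that these are $q$, $s$ and $0$ respectively.

```python
from fractions import Fraction

def exponent_bookkeeping(n, m):
    """Return (nu, q, s, gradient_power) after replacing f by f^(-k), k = 4/(2n+m-4)."""
    nu = Fraction(2 * n + m, 2)
    k = Fraction(4, 2 * n + m - 4)
    q = k * nu                            # power of f on the left-hand side
    s = k * (nu - 1)                      # power of f in the zero-order term
    gradient_power = k * nu - 2 * k - 2   # power of f left in front of |grad f|^2
    return nu, q, s, gradient_power
```

For every n > 2 and integer m ≥ 1 the function returns $q = 2\nu/(\nu-2)$, $s = 2(\nu-1)/(\nu-2)$ and a vanishing gradient power, as used in the proof.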
The preceding argument however only works when the parameter m is an integer.
To cover the general case, it should be observed that we only used functions of
r = |y| in $\mathbb{R}^m$ and the Sobolev inequality of dimension n + m in $\mathbb{R}^n \times \mathbb{R}_+$ for
the standard carré du champ operator and the measure $d\mu(x, r) = r^{m-1}\,dx\,dr$. As
presented in Sect. 6.9, this Sobolev inequality may be deduced from the spherical
case for m ≥ 1 via conformal transformations. The proof of Theorem 6.10.4 may
therefore be completed in this way.
ν − 1 −1/ν
H (r) = − r 1−1/ν and (r) = H (r) = − r , r ∈ (0, ∞).
ν
Fix b > 0 and let $v_\sigma = h_{\sigma,b}^{-\nu}$, σ > 0, where
$h_{\sigma,b} = h_{\sigma,b}(x) = \sigma^2 + b\,|x|^2$, $x \in \mathbb{R}^n$.
Let $f : \mathbb{R}^n \to \mathbb{R}_+$ be smooth and compactly supported and choose σ > 0 such that
(holding for every such function f). When ν = n, the latter amounts to the optimal
Sobolev inequality on $\mathbb{R}^n$.
Since the function H is (strictly) convex, the left-hand side of (6.10.3) is always
positive. It is zero only for f = vσ , and the inequality measures, in a certain entropic
sense, the distance from f to vσ .
Proof Fix b = 1 (the general case being similar), and to simplify the notation, set
h = hσ,1 . Using integration by parts on the right-hand side, (6.10.3) may be rewritten
6.10 Gagliardo-Nirenberg Inequalities 329
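as
\[
\begin{aligned}
& \int_{\mathbb{R}^n} f \Big( h - \frac{1}{4}\,|\nabla h|^2 \Big) dx + \frac{1}{\nu - 1} \int_{\mathbb{R}^n} h^{1-\nu}\,dx \\
& \qquad \le \frac{1}{\nu - 1} \int_{\mathbb{R}^n} f^{1-1/\nu} \Big( \nu - \frac{1}{2}\,\Delta h \Big) dx
+ \frac{1}{(\nu - 2)^2} \int_{\mathbb{R}^n} \big| \nabla f^{(\nu-2)/2\nu} \big|^2\,dx.
\end{aligned}
\]
On the other hand, $\int_{\mathbb{R}^n} f\,dx = \int_{\mathbb{R}^n} h^{-\nu}\,dx = C_{n,\nu}\,\sigma^{n-2\nu}$ where
$C_{n,\nu} > 0$ only depends on n and ν. Replacing the values of $\int_{\mathbb{R}^n} h^{1-\nu}\,dx$ and σ
by their values in terms of $\int_{\mathbb{R}^n} f\,dx$, we are left with
\[
C'_{n,\nu} \Big( \int_{\mathbb{R}^n} f\,dx \Big)^{(2\nu-n-2)/(2\nu-n)}
\le \frac{\nu - n}{\nu - 1} \int_{\mathbb{R}^n} f^{1-1/\nu}\,dx
+ \frac{1}{(\nu - 2)^2} \int_{\mathbb{R}^n} \big| \nabla f^{(\nu-2)/2\nu} \big|^2\,dx,
\]
where $C'_{n,\nu} > 0$ is another constant. Finally, changing f into $f^{2\nu/(\nu-2)}$ yields the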
This section will be a bit formal (and at the same time somewhat technical). While
we observed that linear heat equations do not allow us to reach sharp Sobolev in-
equalities, the aim here is to show that possibly non-linear evolution equations may
be used to establish optimal Sobolev inequalities under curvature-dimension condi-
tions CD(0, n) for example.
The method may be applied to more general situations and equations, but we only
concentrate here on the porous medium and fast diffusion equations, with a partic-
ular emphasis on the latter which is well-suited to both curvature-dimension con-
ditions and Sobolev-type inequalities. In addition, this fast diffusion equation plays
the same role with respect to Sobolev and curvature-dimension CD(ρ, n) inequal-
ities as the usual heat equation with respect to logarithmic Sobolev and curvature
CD(ρ, ∞) inequalities. Deeper relationships are actually underlying the picture.
The Ornstein-Uhlenbeck semigroup is a model case to test logarithmic Sobolev in-
equalities and curvature conditions. Indeed, under CD(ρ, ∞), the heat flow mono-
tonicity method produces in this case logarithmic Sobolev inequalities, both for the
heat semigroup (Pt )t≥0 and for the invariant measure μ (when ρ > 0) which are
optimal in the example of the Ornstein-Uhlenbeck model on Rn . As already ob-
served in Sect. 5.5, p. 257, this is fully coherent since for this model, on the one
side the heat kernels (that is the solutions of the heat equation starting from Dirac
masses) are Gaussian measures, for which the logarithmic Sobolev inequality is
optimal, and on the other, the logarithmic Sobolev inequality for any Gaussian mea-
sure after dilations and translations yields the logarithmic Sobolev inequality for
the Ornstein-Uhlenbeck semigroup, which in turn gives in the limit the curvature
condition CD(1, ∞). The same picture holds with the Euclidean heat semigroup in
place of the Ornstein-Uhlenbeck semigroup with respect to the curvature condition
CD(0, ∞). The question now is whether this full set of equivalences and mod-
els may be extended to Sobolev and curvature-dimension CD(ρ, n) inequalities
(for some finite dimension n). The models for these Sobolev inequalities are now
spheres, which may be seen via stereographic projections on the Euclidean space,
and for which the reference measures are Cauchy measures (cf. Sect. 2.2, p. 81).
Although the game with dilations and translations may no longer be played in this
context (and should be replaced by conformal transformations on the sphere), the
heat equation may still be replaced by the fast diffusion equation described below,
which, although non-linear, produces in Rn the required Cauchy measures when the
initial data are Dirac masses. While this method will not be entirely satisfactory, it
certainly shows that non-linear fast diffusion equations could play a similar role for
Sobolev inequalities as linear heat equations for logarithmic Sobolev inequalities
with respect to curvature-dimension CD(ρ, n) inequalities.
The main purpose of this section will be to establish the entropic version of the
Gagliardo-Nirenberg inequality of Proposition 6.10.6 (which requires some addi-
tional stabilizing function v) under a curvature-dimension condition CD(0, n). As
discussed in the previous section, this Gagliardo-Nirenberg inequality is equivalent
to the sharp Sobolev inequality in Rn . Recall that stating a Sobolev inequality for
an infinite measure requires a given normalization. Here, the choice of the func-
tion v in the general context will play the role of this normalization. In some sense,
the fast diffusion equation described below is a model case illustrating the inter-
play between entropy methods for evolution equations, functional inequalities and
curvature-dimension conditions. The case of CD(ρ, n) with ρ > 0 is briefly ad-
dressed next.
The porous medium and fast diffusion equations are evolution equations of the form,
say on Rn ,
\[
\partial_t u = \Delta u^m
\]
for u = u(t, x) = ut (x), t ≥ 0, x ∈ Rn . Porous medium corresponds to m > 1 and
fast diffusion to m < 1. It is not obvious that there exist solutions to such equa-
tions at any time, in particular when m < 1. Actually, in Rn and for small m, the
solution starting from an initial condition u0 > 0 vanishes in finite time. We do
not discuss here existence (or uniqueness) issues for these equations. Our aim is
rather to explain how they may be used to reach Sobolev or Gagliardo-Nirenberg
inequalities under curvature-dimension CD(ρ, n) conditions, exactly as the heat
equation is used to reach Poincaré or logarithmic Sobolev inequalities under curva-
ture conditions CD(ρ, ∞). Namely, via Proposition 6.10.6, functional inequalities
of Sobolev-type are related to evolution equations through the control of entropic
quantities. As was extensively developed in the preceding chapters, on the one hand,
functional inequalities may be used to prove exponential decay of entropy related
to the evolution, where the functional inequality is translated into a differential in-
equality linking the entropy and its time derivative along the evolution equation. On
the other hand, functional inequalities may be reached by deriving entropy twice,
and using curvature-dimension conditions to directly reach this exponential decay.
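The two directions just described can be summarized schematically. The following is only a sketch in generic notation: F stands for an entropy-type functional along the flow, I for minus its time derivative, and c > 0 for a constant; these symbols, and the normalization F(u_t) → 0 as t → ∞, are assumptions made for illustration.

```latex
% Direction 1: a functional inequality gives exponential decay of entropy.
% Assume F(u) \le \frac{1}{c}\, I(u) and \frac{d}{dt} F(u_t) = - I(u_t).
\[
\frac{d}{dt} F(u_t) = - I(u_t) \le - c\, F(u_t)
\quad\Longrightarrow\quad
F(u_t) \le e^{-ct}\, F(u_0).
\]
% Direction 2: a bound on the second derivative gives the functional inequality.
% Assume \frac{d}{dt} I(u_t) \le - c\, I(u_t) and F(u_t) \to 0 as t \to \infty.
\[
I(u_t) \le e^{-ct}\, I(u_0)
\quad\Longrightarrow\quad
F(u_0) = \int_0^\infty I(u_t)\,dt \le \frac{1}{c}\, I(u_0).
\]
```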
In what follows, we present a general framework to control entropy of general
evolution equations with stabilizing term v. The general form of the second differen-
tial of the entropy is quite heavy, and the comparison with the curvature-dimension
CD(ρ, n) condition leads to the rather complicated form of the general Proposition 6.11.6
below. However, a much simplified and surprising form occurs for the
fast diffusion equation with parameter m = 1 − 1/n, leading to the optimal form of the
Gagliardo-Nirenberg inequality under the CD(0, n) hypothesis. A slightly modified
version also leads directly to the optimal Sobolev inequality under the CD(ρ, n)
condition with ρ > 0, where no stabilizing term v is required.
We work below with a state space E that will either be Rn or a smooth manifold,
although most of what follows may be extended to more general settings at the
expense of several notational complications. Moreover, the various integration by
parts performed below, typically for a solution at time t > 0 of the associated fast
diffusion equation, should be carefully justified to turn this scheme into an actual
proof of functional inequalities in the context of general Full Markov Triples. Once
again, we will keep the exposition at a formal level in order to better emphasize the
main principle of proof.
Thus given a Markov Triple (E, μ, Γ), where $E = \mathbb{R}^n$ or a smooth (weighted)
Riemannian manifold, with associated diffusion operator L, and a function Φ at
least $C^2$ on $\mathbb{R}_+$, a first set of conclusions may be developed for the extended porous
medium and fast diffusion equations in the form of the non-linear equation
\[
\partial_t u = L\Psi(u), \qquad u_0 = f, \tag{6.11.1}
\]
where Ψ′(r) = rΦ′(r), so that the equation may be rewritten as $\partial_t u = -\nabla^*[u\,\nabla\Phi(u)]$.
In this form, the heat equation corresponds to Φ(r) = log r, r ∈ (0, ∞). A priori, the
behavior of u as t → ∞ is unknown, but if the interchange between differentiation
and integration is justified, we should at least have
\[
\partial_t \int_E u\,d\mu = \int_E \partial_t u\,d\mu = 0.
\]
On a compact Riemannian manifold, it may thus be expected that $u_t$ converges
at infinity towards $\int_E u\,d\mu$. On the other hand, in $\mathbb{R}^n$, the behavior of u will be
forced to a given asymptotic behavior by a suitable choice of function v (to be made
explicit below) by modifying the equation as
\[
\partial_t u = -\nabla^*\big[u\,\nabla\big(\Phi(u) - \Phi(v)\big)\big]. \tag{6.11.2}
\]
This function v may be considered as a stabilizing term and plays an important role
in the associated functional inequality. For example, for the standard heat equation
on $\mathbb{R}^n$, the choice for Φ(r) = log r would be $v = e^{-|x|^2/2}$ and the resulting equation
\[
\partial_t u = \Delta u + x\cdot\nabla u + n\,u = \Delta u + \nabla\cdot(u\,x),
\]
In order to quantify the previous convergence along the same lines as the corre-
sponding development for logarithmic Sobolev inequalities in Chap. 5, an analogue
of the entropy functional needs to be considered. The quantity that will replace en-
tropy in this case is given by
\[
F(u) = \int_E \big(H(u) - u\,\Phi(v)\big)\,d\mu
\]
where H on (0, ∞) is such that H′ = Φ. Note that for the heat equation, H(r)
is indeed r log r − r, close to the standard entropy. When H is convex (which
will be the case below for the fast diffusion equation with m = 1 − 1/n where
$H(r) = -n\,r^{1-1/n}$), then F(u) − F(v) ≥ 0, since
\[
F(u) - F(v) = \int_E \big[H(u) - H(v) - (u - v)\,H'(v)\big]\,d\mu.
\]
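As an illustration of the last identity, here is the heat equation case worked out, as a sketch assuming H(r) = r log r − r (so that H′(r) = log r) and positive functions u, v for which all integrals make sense.

```latex
% Heat equation case: H(r) = r\log r - r, H'(r) = \log r.
\[
H(u) - H(v) - (u - v)H'(v)
= u\log u - u - v\log v + v - (u - v)\log v
= u\log\frac{u}{v} - u + v .
\]
% If in addition \int_E u\,d\mu = \int_E v\,d\mu, the last two terms integrate to zero:
\[
F(u) - F(v) = \int_E u\log\frac{u}{v}\,d\mu ,
\]
% which is the relative entropy of u with respect to v.
```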
Proof We make use of the following simple integration by parts formula, which will
be used repeatedly below. Namely, for each smooth function h on E (for example
in the algebra A0 ),
\[
\int_E \partial_t u\; h\,d\mu = -\int_E u\,\Gamma(\xi, h)\,d\mu. \tag{6.11.3}
\]
which is the claim. On the basis of (6.11.3), it then suffices to observe that
\[
\frac{d}{dt}F(u) = \int_E \partial_t u\,\big(\Phi(u) - \Phi(v)\big)\,d\mu = \int_E (\partial_t u)\,\xi\,d\mu.
\]
Proposition 6.11.2 In $\mathbb{R}^n$, if $H(r) = -n\,r^{1-1/n}$, r ∈ (0, ∞), that is, for the equation
\[
\partial_t u = -(n-1)\,\nabla^*\big[u\,\nabla\big(u^{-1/n} - v_\sigma^{-1/n}\big)\big]
\]
with $c = \frac{4(n-1)}{b^2}$. In other words, $u_t$ converges exponentially towards $v_\sigma$ in the
entropic sense as t → ∞.
For the proof, it suffices to observe that Proposition 6.10.6 actually yields that
\[
\frac{d}{dt}\big[F(u_t) - F(v_\sigma)\big] \le -\frac{4(n-1)}{b^2}\,\big[F(u_t) - F(v_\sigma)\big].
\]
As for logarithmic Sobolev inequalities (cf. Theorem 5.2.1, p. 244), the entropy
decay of Proposition 6.11.2 implies in return the Gagliardo-Nirenberg inequalities
of Proposition 6.10.6 with ν = n in the limit t → 0.
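The passage from the entropy decay back to a functional inequality in the limit t → 0 may be sketched as follows. The notation G and I below is introduced only for this illustration: G(t) = F(u_t) − F(v_σ), and I(f) denotes minus the derivative of F(u_t) at t = 0, assuming this derivative exists.

```latex
% Assume the decay G(t) \le e^{-ct} G(0) for all t \ge 0, with equality at t = 0.
\[
\frac{G(t) - G(0)}{t} \le \frac{e^{-ct} - 1}{t}\, G(0)
\quad\xrightarrow[\;t \to 0\;]{}\quad
G'(0) \le -c\, G(0).
\]
% Since G'(0) = - I(f), this is the functional inequality
\[
c\,\big[F(f) - F(v_\sigma)\big] \le I(f).
\]
```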
We next turn to the second step of the analysis, namely computing, as for classical
entropy, the second order derivative of the functional F (that is, the derivative of the
generalized Fisher information Iμ,F ) along the fast diffusion evolution with the help
of the Γ2 operator. The curvature-dimension condition will then enter into play when
establishing functional inequalities along the lines developed for Poincaré and loga-
rithmic Sobolev inequalities in Sect. 4.8, p. 211, and Sect. 5.7, p. 268, respectively.
This second order differentiation will be performed for some general function Φ
along the evolution (6.11.2) assuming however that the state space E is a smooth
manifold M and that the operator L is elliptic (allowing for the use of the differential
calculus described in Sect. C.5, p. 511). Questions of existence, regularity and do-
mains for the solutions of the non-linear equations (6.11.2) are not discussed here. In
practice, for any such evolution equation related to Φ and v, with initial data u0 = f ,
it may be quite hard to justify the formal computations described below. In particular,
for fast diffusion equations, which is the central point of interest below, Φ(r) is
singular at r = 0 which causes serious difficulties. Since the point in what follows
is to use those non-linear evolution equations to obtain functional inequalities from
curvature-dimension conditions CD(ρ, n) together with geometric properties of the
function v involved in the equation, it is in general necessary to first approximate
the singular map Φ by some smooth approximation Φε . For this approximation,
all the computations and integrations by parts may be justified by some previous a
priori analysis so as to yield an approximated functional inequality which converges
in the limit ε → 0 to the expected result. We however do not enter into these quite
technical and tedious considerations here, assuming as mentioned above the regular-
ity of u and v and of related expressions justifying the various integration by parts
formulas. The aim of this study is rather to show that the modified fast diffusion
equation leads, through these formal computations, precisely to the entropic form
of the Sobolev or Gagliardo-Nirenberg inequalities described in Proposition 6.10.6.
The next lemma describes the announced second derivative operation on the
functional F .
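Lemma 6.11.3 Let R be the function $R(r) = \frac{r\,\Phi''(r)}{\Phi'(r)}$, r ∈ (0, ∞). Set S = −Φ(u),
ξ = Φ(u) − Φ(v), which are functions (depending on t) from E to R. Then, with
the generalized Fisher information Iμ,F of Proposition 6.11.1,
\[
\frac{d}{dt} I_{\mu,F}(u) = -\int_E u\,K\,d\mu
\]
where
\[
K = 2u\,\Phi'(u)\,\Gamma_2(\xi) + \Gamma\big(\xi, \Gamma(\xi)\big) - \big(R(u) + 2\big)\,\Gamma\big(S, \Gamma(\xi)\big)
- 2R(u)\,\Gamma\big(\xi, \Gamma(\xi, S)\big) + 2\,\frac{R(u) + 1 + u\,R'(u)}{u\,\Phi'(u)}\,\Gamma(\xi, S)^2 .
\]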
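Proof Again, the exchanges between differentiation and integration will not always
be fully justified (although care may be developed to this end). Since
$\partial_t \xi = \Phi'(u)\,\partial_t u$,
\[
\frac{d}{dt} I_{\mu,F}(u) = \int_E \Big[\partial_t u\;\Gamma(\xi) + 2u\,\Gamma\big(\Phi'(u)\,\partial_t u, \xi\big)\Big]\,d\mu.
\]
By integration by parts with respect to the carré du champ operator Γ, for suitable
functions f, g (say in the algebra A0 ),
\[
\int_E u\,\Gamma(f, g)\,d\mu = -\int_E f\,\big[u\,Lg + \Gamma(u, g)\big]\,d\mu.
\]
Therefore, the second term on the right-hand side of the preceding identity may be
rewritten as
\[
-2\int_E \partial_t u\;\Phi'(u)\,\big[u\,L\xi + \Gamma(u, \xi)\big]\,d\mu.
\]
From the evolution of u, we have with (6.11.3) that
\[
\frac{d}{dt} I_{\mu,F}(u) = -\int_E u\,R_1\,d\mu
\]
where
\[
R_1 = \Gamma\big(\xi, \Gamma(\xi)\big) + 2\,\Gamma\big(\xi, \Gamma(S, \xi)\big) - 2\,\Gamma\big(\xi, u\,\Phi'(u)\,L\xi\big).
\]
By the change of variables formula and the definition of the Γ2 operator,
\[
\begin{aligned}
\Gamma\big(\xi, u\,\Phi'(u)\,L\xi\big)
&= u\,\Phi'(u)\,\Gamma(\xi, L\xi) + \big(u\,\Phi''(u) + \Phi'(u)\big)\,L\xi\;\Gamma(\xi, u) \\
&= u\,\Phi'(u)\,\Gamma(\xi, L\xi) - \big(R(u) + 1\big)\,L\xi\;\Gamma(\xi, S) \\
&= u\,\Phi'(u)\Big(-\Gamma_2(\xi) + \frac{1}{2}\,L\Gamma(\xi)\Big) - \big(R(u) + 1\big)\,L\xi\;\Gamma(\xi, S).
\end{aligned}
\]
Furthermore,
\[
\begin{aligned}
\int_E u^2\,\Phi'(u)\,L\Gamma(\xi)\,d\mu
&= -\int_E \Gamma\big(u^2\,\Phi'(u), \Gamma(\xi)\big)\,d\mu \\
&= \int_E u\,\big(R(u) + 2\big)\,\Gamma\big(S, \Gamma(\xi)\big)\,d\mu.
\end{aligned}
\]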
On the basis of Lemma 6.11.3, the aim is now to compare the second derivative
of the entropy functional F to quantities arising from the CD(ρ, n) condition. The
situation is much simpler here in the fast diffusion case since then the various
quantities involved in the expression K are easier to handle. Indeed, the fast diffusion
equation corresponds to $\Psi(r) = r^m$, in which case $\Phi(r) = \frac{m}{m-1}\,r^{m-1}$, $H(r) = \frac{r^m}{m-1}$
and R = m − 2. Therefore K = Km in Lemma 6.11.3 is equal to
\[
K_m = 2(1-m)\,S\,\Gamma_2(\xi) + \Gamma\big(\xi, \Gamma(\xi)\big) - m\,\Gamma\big(S, \Gamma(\xi)\big)
+ 2(2-m)\,\Gamma\big(\xi, \Gamma(\xi, S)\big) - 2\,\frac{\Gamma(\xi, S)^2}{S}.
\]
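The reduction from K to Km can be checked term by term. The computation below is a sketch under the reading that the function entering R, S and ξ is $\Phi(r) = \frac{m}{m-1}r^{m-1}$; the naming of this function is an assumption, since the symbols are not fully legible in the formulas above.

```latex
% Derivatives of \Phi(r) = \frac{m}{m-1} r^{m-1}:
\[
\Phi'(r) = m\,r^{m-2}, \qquad \Phi''(r) = m(m-2)\,r^{m-3},
\qquad R(r) = \frac{r\,\Phi''(r)}{\Phi'(r)} = m - 2, \qquad R'(r) = 0.
\]
% Since S = -\Phi(u) = \frac{m}{1-m} u^{m-1}:
\[
u\,\Phi'(u) = m\,u^{m-1} = (1-m)\,S .
\]
% Coefficients of K in Lemma 6.11.3 then become
\[
2u\,\Phi'(u) = 2(1-m)S, \quad R + 2 = m, \quad -2R = 2(2-m), \quad
2\,\frac{R + 1 + u R'}{u\,\Phi'(u)} = \frac{2(m-1)}{(1-m)S} = -\frac{2}{S}.
\]
```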
This quantity Km may actually be described in more geometrical and tractable
terms. Set ζ = Φ(v) (and thus −S = ξ + ζ ), and X = ∇ξ , Y = ∇ζ , M = ∇∇ξ ,
P = ∇∇ζ . Using the notation M · N for the scalar product in the space of symmetric
tensors, that is in coordinates (and with Einstein’s summation notation),
\[
M \cdot N = M^{ij}\,N^{k\ell}\,g_{ik}\,g_{j\ell},
\]
and the notation X # Y for the symmetric tensor product of two vectors
$(X \# Y)^{ij} = \frac{1}{2}\,(X^i Y^j + X^j Y^i)$, we get that
\[
\begin{aligned}
\Gamma_2(\xi) &= |M|^2 + \mathrm{Ric}(L) \cdot X \# X, \\
\Gamma\big(S, \Gamma(\xi)\big) &= -2\,M \cdot (X + Y) \# X, \\
\Gamma\big(\xi, \Gamma(\xi)\big) &= 2\,M \cdot X \# X, \\
\Gamma\big(\xi, \Gamma(\xi, S)\big) &= -M \cdot X \# (X + Y) - (M + P) \cdot X \# X, \\
\Gamma(S, \xi) &= -X \cdot (X + Y).
\end{aligned}
\]
Then
\[
\frac{1}{2}\,K_m = (1-m)\,S\,\Gamma_2(\xi) + (m-1)\,M \cdot X \# (3X + 2Y)
+ (m-2)\,P \cdot (X \# X) - \frac{\big(X \cdot (X + Y)\big)^2}{S}.
\]
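The coefficients in the last display can be recovered by inserting the five tensor identities into Km and collecting terms. The bookkeeping below is only a sketch of that substitution, using the symmetry of M.

```latex
% Contributions to each type of term in K_m:
%   M \cdot (X \# X):  2 + 2m - 4(2-m) = 6(m-1)
%   M \cdot (X \# Y):  2m - 2(2-m)     = 4(m-1)
%   P \cdot (X \# X):  -2(2-m)         = 2(m-2)
\[
K_m = 2(1-m)\,S\,\Gamma_2(\xi) + 2(m-1)\,M \cdot X \# (3X + 2Y)
+ 2(m-2)\,P \cdot (X \# X) - \frac{2\,\big(X \cdot (X+Y)\big)^2}{S}.
\]
```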
We are now ready to use the curvature-dimension condition CD(ρ, n). We start
with an elementary computation which is a further application of the integration by
parts formula. Details are omitted.
As a consequence,
Proposition 6.11.5 Under the curvature-dimension condition CD(ρ, n), for every
smooth function w ≥ 0,
\[
\int_E w\,\Gamma_2(\xi)\,d\mu \ge \frac{\rho\,n}{n-1}\int_E w\,\Gamma(\xi)\,d\mu
+ \frac{1}{2(n-1)}\int_E \Gamma\big(w, \Gamma(\xi)\big)\,d\mu
+ \frac{1}{n-1}\int_E \Gamma\big(\xi, \Gamma(\xi, w)\big)\,d\mu.
\]
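One way to see where this inequality comes from is the following sketch. It assumes the CD(ρ, n) condition in the form $\Gamma_2(\xi) \ge \rho\,\Gamma(\xi) + \frac{1}{n}(L\xi)^2$, the integration by parts rule $\int_E g\,Lh\,d\mu = -\int_E \Gamma(g,h)\,d\mu$, and the definition $\Gamma_2(\xi) = \frac{1}{2}L\Gamma(\xi) - \Gamma(\xi, L\xi)$, with all functions regular enough for the manipulations to be legitimate.

```latex
% Step 1: rewrite the weighted integral of (L\xi)^2.
\[
\begin{aligned}
\int_E w\,(L\xi)^2\,d\mu
&= -\int_E \Gamma(w\,L\xi, \xi)\,d\mu
 = -\int_E L\xi\;\Gamma(w,\xi)\,d\mu - \int_E w\,\Gamma(\xi, L\xi)\,d\mu \\
&= \int_E \Gamma\big(\xi, \Gamma(\xi,w)\big)\,d\mu
 + \frac{1}{2}\int_E \Gamma\big(w, \Gamma(\xi)\big)\,d\mu
 + \int_E w\,\Gamma_2(\xi)\,d\mu .
\end{aligned}
\]
% Step 2: integrate CD(\rho,n) against w \ge 0 and insert Step 1.
\[
\Big(1 - \frac{1}{n}\Big)\int_E w\,\Gamma_2(\xi)\,d\mu
\ge \rho\int_E w\,\Gamma(\xi)\,d\mu
+ \frac{1}{n}\int_E \Gamma\big(\xi, \Gamma(\xi,w)\big)\,d\mu
+ \frac{1}{2n}\int_E \Gamma\big(w, \Gamma(\xi)\big)\,d\mu .
\]
% Step 3: multiply by n/(n-1) to obtain the inequality of the proposition.
```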
The next proposition summarizes the final relevant inequality on the derivative
of the generalized Fisher information.
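Proposition 6.11.6 Under the curvature-dimension condition CD(ρ, n), and for
$\Phi(r) = \frac{m}{m-1}\,r^{m-1}$,
\[
\begin{aligned}
-\frac{1}{2}\,\frac{d}{dt} I_{\mu,F}(u) \ge{}& \frac{\rho\,n}{n-1}\int_E m\,u^m\,\Gamma(\xi)\,d\mu \\
&+ \Big(\frac{m}{n-1} + m - 2\Big)\int_E u\,P \cdot (X \# X)\,d\mu \\
&+ \Big(\frac{m}{n-1} + m - 1\Big)\int_E u\,M \cdot X \# (3X + 2Y)\,d\mu \\
&+ \Big(\frac{1}{n-1} + \frac{m-1}{m}\Big)\int_E u^{2-m}\,\big(X \cdot (X + Y)\big)^2\,d\mu,
\end{aligned}
\]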
On the basis of the preceding proposition, we may now state the resulting func-
tional inequality under a curvature-dimension condition. For simplicity, we only
deal with the CD(0, n) hypothesis.
Theorem 6.11.7 In the preceding setting, under the CD(0, n) condition, for a
positive smooth function h such that ∇∇h ≥ a > 0, then, for $H(r) = -n\,r^{1-1/n}$,
H′(v) = −h, f > 0, and provided $\int_E v\,d\mu = \int_E f\,d\mu$, we have
\[
\int_E \big[H(f) - H(v) - (f - v)\,H'(v)\big]\,d\mu \le \frac{1}{2a}\int_E f\,\big|\nabla\big(H'(f) - H'(v)\big)\big|^2\,d\mu.
\]
Proof Note that by the sign choice, the function H is convex, so that the left-hand
side in the inequality of the theorem is positive. We stay at a somewhat informal
level since justification of each step is rather tedious. Nevertheless, the principle is
the following. Under the CD(0, n) condition, Proposition 6.11.6 shows that solving
the equation
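\[
\partial_t u = -\nabla^*\big[u\,\nabla\big(\Phi(u) - \Phi(v)\big)\big], \qquad u_0 = f,
\]
leads to the differential inequality
\[
\frac{d}{dt} I_{\mu,F}(u_t) \le -2a\,I_{\mu,F}(u_t).
\]
It follows from the latter that $I_{\mu,F}(u_t) \le e^{-2at}\,I_{\mu,F}(f)$ and therefore, since
$\frac{d}{dt}F(u_t) = -I_{\mu,F}(u_t)$,
\[
F(f) - \lim_{t\to\infty} F(u_t) = \int_0^\infty I_{\mu,F}(u_t)\,dt \le \frac{1}{2a}\,I_{\mu,F}(f).
\]
We are left to show that indeed $u_t \to v$ as t → ∞. A necessary condition is that
$\int_E v\,d\mu = \int_E f\,d\mu$. A more precise investigation of the entropy decay (and further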
technical bounds) indeed ensures that this is the case, so that the preceding amounts
to the entropic inequality of the theorem.
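The simplification behind this argument is that three of the four coefficients of Proposition 6.11.6 collapse for m = 1 − 1/n. The check below is a sketch; it assumes that $I_{\mu,F}(u) = \int_E u\,\Gamma(\xi)\,d\mu$ and that ζ = H′(v) = −h, so that P = ∇∇ζ = −∇∇h.

```latex
% Coefficients of Proposition 6.11.6 at m = 1 - 1/n (so m/(n-1) = 1/n) and \rho = 0:
\[
\frac{m}{n-1} + m - 2 = -1, \qquad
\frac{m}{n-1} + m - 1 = 0, \qquad
\frac{1}{n-1} + \frac{m-1}{m} = \frac{1}{n-1} - \frac{1}{n-1} = 0 .
\]
% Only the term in P survives. With \nabla\nabla h \ge a, i.e. -P \ge a:
\[
-\frac{1}{2}\,\frac{d}{dt} I_{\mu,F}(u)
\ge -\int_E u\,P \cdot (X \# X)\,d\mu
\ge a\int_E u\,|X|^2\,d\mu = a\,I_{\mu,F}(u).
\]
```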
Remark 6.11.8 It is somewhat strange that for the parameters of Theorem 6.10.4,
the Gagliardo-Nirenberg inequalities may be deduced either from the Sobolev inequality
in $\mathbb{R}^{n+m}$ with ν = n + m/2 relying on the CD(0, n + m) condition, or via
the fast diffusion equation from the CD(0, ν) condition of Rn (with the impres-
sion therefore of a loss of information in the CD(0, n) condition of Rn ). But, as
already observed throughout this chapter, this loss of information is actually recap-
tured by the action of the dilation group (which enters in a somewhat subtle way
in the equivalence between the entropic inequalities of Proposition 6.10.6 and the
which leads to
\[
\frac{d}{dt}\bigg(I_{\mu,F}(u) + \frac{\rho\,n(n-1)}{n-2}\int_E u^{1-2/n}\,d\mu\bigg) \le 0.
\]
Therefore,
\[
\int_E u_t^{1-2/n}\,d\mu \le \int_E u_0^{1-2/n}\,d\mu + \frac{n-2}{\rho\,n(n-1)}\,I_{\mu,F}(u_0).
\]
Changing $u_0$ into $f^{2n/(n-2)}$ leads to the tight Sobolev inequality with the sharp
constant of Theorem 6.8.3, thus providing an alternative approach.
and more applied topics. It is not within the scope of this short notice to present
exhaustive comments and historical background on Sobolev inequalities. One major
reference is the comprehensive monograph of V. Maz’ya [303] (a recent expansion
of the famous [302]) to which we refer for a complete account of the subject. The
contributions [1, 2, 96, 288, 413–415] are further relevant references. Related to the
geometric aspects emphasized in this work, we mention in addition the monographs
by T. Aubin [20, 21], E. Hebey [233, 234], and those by L. Saloff-Coste [376],
A. Grigor’yan [217] and P. Li [284] as well as the references therein.
The optimal constants in the Sobolev inequalities on Rn are due independently
to T. Aubin [18, 19] and G. Talenti [406] (see also [364] for dimension 3). The
case of the sphere is examined in [17] (see also [20]), while the hyperbolic case
is proved in [235] by similar means as described here, relying on [382]. The Nash
inequality in Rn was introduced by J. Nash [323], the optimal constant having been
computed by E. Carlen and M. Loss [110]. The history of the Euclidean logarithmic
Sobolev inequality (Proposition 6.2.5) is closely related to that of the Gaussian log-
arithmic Sobolev inequality (see Chap. 5), which in particular goes back to [387].
Proposition 6.2.5 linking the Euclidean and Gaussian forms has been explicitly or
implicitly used by many authors [54, 106, 437, 438]. Forms of the Euclidean loga-
rithmic Sobolev inequality were used by G. Perelman in his solution of the Poincaré
conjecture.
The equivalences of Proposition 6.2.3 in Sect. 6.2 between the various forms
of Sobolev inequalities classically appeared as consequences of ultracontractive
bounds equivalent to the corresponding functional inequalities (Sect. 6.3). The di-
rect approach presented here via the slicing decomposition was introduced in [35]
and [149]. See e.g. [376, 377] for more and for bibliographical accounts. The Eu-
clidean logarithmic Sobolev inequality regarded as a limit of the sharp Sobolev in-
equality (Remark 6.2.6) was pointed out in [58] (see also [57]).
The main ultracontractive Theorem 6.3.1 of Sect. 6.3 is the end result of a se-
ries of steps and results by numerous authors including, among others, [26, 108,
138, 144, 181, 319, 323, 421]. The proof presented here is due to Nash, using the
corresponding Nash inequality, following [108]. The other authors dealt with either
Sobolev or logarithmic Sobolev inequalities, illustrated here in Proposition 6.2.3,
or Gagliardo-Nirenberg inequalities. See [26, 144, 217, 376, 377, 422] for further
details and historical background.
The Rellich-Kondrachov Theorem 6.4.3 may be found in most classical refer-
ences such as [96, 179, 203].
Proposition 6.6.1 in Sect. 6.6 is due to L. Saloff-Coste [375] (the proof here
being taken from [275]). The sharp version of the diameter bound under a Sobolev
inequality (namely that D ≤ π whenever the Sobolev inequality with the constants
of the model space Sn holds) was obtained in [39], leading to the classical Bonnet-
Myers Theorem in this context. Proposition 6.6.2 on the volume growth is part of
the folklore and may be found, for example, in [114] (see also [217, 377]).
The celebrated Li-Yau parabolic inequality (Corollary 6.7.5) was established
in [285] by means of the maximum principle. The heat flow monotonicity proof
presented here via dimensional logarithmic Sobolev inequalities for heat kernels is
taken from [40]. More on Harnack inequalities for elliptic operators on Riemannian
manifolds and precise heat kernel bounds (of the form (6.7.11) and (6.7.12)) may be
found in [144, 217, 284, 378]. The small time asymptotics (6.7.13) go back to [420]
(see [332] for the final word). Local hypercontractivity (Theorem 6.7.7) first ap-
peared in [31]. Versions of the Li-Yau inequality in sub-Riemannian geometries by
means of heat flow monotonicity have been obtained in [51].
The sharp Sobolev inequality on the sphere (6.1.2) of T. Aubin [17, 20] was ex-
tended to Riemannian manifolds with a uniform strictly positive lower bound on
the Ricci curvature in [253] via the Lévy-Gromov isoperimetric comparison The-
orem (see the Notes and References of Chap. 8). The non-linear proof of Theo-
rem 6.8.3 developed in Sect. 6.8 is adapted from the corresponding analysis by
O. Rothaus [371] for logarithmic Sobolev inequalities. The method actually orig-
inates in [202] in a partial differential equation context (see later [66]). The fi-
nal step of the argument is reminiscent of the Nash-Moser iteration principle. The
proof is presented in [26], and further developed in [185] where the subsequent
Remarks 6.8.4 and 6.8.5 are emphasized, the second one again following [371].
On the sphere, these remarks go back to W. Beckner [56] who used sharp Hardy-
Littlewood-Sobolev inequalities for this purpose. See also [66, 160, 161]. Moser-
Trudinger inequalities appeared in [320, 416] as the limiting case n = 2 (actually
any n in the suitable version in higher dimension). See also [35] in the context of the
slicing technique of Proposition 6.2.3. On the sphere S2 , the sharp inequality (6.8.8)
is sometimes referred to as Onofri’s inequality [337] (see also [56]).
Section 6.9 on conformal invariance of Sobolev inequalities describes some of
the main ideas involved in the Yamabe program developed in particular by T. Aubin
[17, 20] and R. Schoen [380, 381] (see [21, 233, 234]). Its abstract formulation
in the context of this book is partly inspired by the paper [382] of R. Schoen and
S.-T. Yau. The formal equivalence between the classical Sobolev inequalities in Eu-
clidean, spherical and hyperbolic spaces may also be traced back to [382] and is em-
phasized in [235]. (Remark 6.9.5 on the (minimal) class of functions in the Sobolev
inequality in Euclidean space is part of the classical theory [288, 303, 449].) The
references [165, 234] present the (A, B) program which is devoted to the respective
sharpness of the constants C and A, in the notation used here, for Sobolev inequal-
ities in manifolds.
The Gagliardo-Nirenberg inequalities recalled in Sect. 6.10 were first put forward
in [192, 331]. See for instance [96, 303] for general recent references and histori-
cal background. Optimality and extremal functions of the sub-family of Gagliardo-
Nirenberg inequalities of Theorem 6.10.4 have been obtained by M. Del Pino and
J. Dolbeault [147] by an analytic study of the corresponding minimization prob-
lem. Theorem 6.10.4 has been generalized in [330] to a larger class of parameters
with arguments based on the mass transportation method of [137]. The reformula-
tion in [147] as an entropy decay along non-linear equations in Proposition 6.10.6
has been an important step in the understanding of the connections between porous
medium and fast diffusion equations, functional inequalities and convergence to
equilibrium. See also [111], and [424, 426] for links with mass transportation and
the references therein.
A comprehensive account of porous medium and fast diffusion equations is the
recent monograph [423] to which we refer for a complete background. On the basis
of the Dolbeault-Del Pino developments, Sect. 6.11, and Theorem 6.11.7, are es-
sentially due to J. Demange [155], who solved in particular all the regularity issues
necessary for the proof (see also [28, 156] and [161]). The last Remark 6.11.9 is
also due to him (unpublished). Fast diffusion has also been used to obtain Hardy-
Littlewood-Sobolev inequalities in [107], relying on the heat flow monotonicity ap-
proach to geometric Brascamp-Lieb inequalities (different from the approach de-
scribed in Sect. 4.9, p. 215) developed in [59, 109].
Part III
Related Functional, Isoperimetric
and Transportation Inequalities
Chapter 7
Generalized Functional Inequalities
Part II, devoted to Poincaré, logarithmic Sobolev and Sobolev inequalities, describes
how each of these families captures different features of the associated semigroup or
the invariant measure, in terms of convergence to equilibrium, estimates on the heat
kernels or tail behaviors of the invariant measure. There are many ways to describe
intermediate families of functional inequalities which are suited to a wide variety of
regimes as well as to more precise, or different, features. This section investigates
such families, restricting to three main examples, entropy-energy, generalized Nash
and weak Poincaré inequalities. Many other families have been developed in differ-
ent directions, each of them having its own interest. Some will be mentioned at the
end of the section.
In this chapter, we deal as usual with a Full Markov Triple (E, μ, Γ) as presented
in Sect. 3.4, p. 168, with Dirichlet form E, infinitesimal generator L, Markov
semigroup P = (Pt )t≥0 and underlying function algebras A0 and A. Actually, the
natural framework for the investigation here is that of the Standard Markov Triple,
and besides the section on off-diagonal heat kernel bounds for which the extended
algebra A is required, all the results may be stated in this framework. We neverthe-
less simply use the terminology “Markov Triple” to cover the various instances.
The first section describes the family of functional entropy-energy inequalities
governed by a growth function Φ, the example of the logarithmic function giving
rise to the logarithmic entropy-energy inequality of the previous chapter as an
equivalent form of the standard Sobolev inequality. The family of entropy-energy in-
equalities is in particular well-suited to heat kernel bounds by means of the method
developed for hypercontractivity under logarithmic Sobolev inequalities in Chap. 5.
Off-diagonal heat kernel estimates may be achieved in the same way (Sect. 7.2).
Several examples of both entropy-energy inequalities and their associated heat ker-
nel bounds are presented in Sect. 7.3. The next section investigates generalized Nash
inequalities on the basis of the example of the classical Nash inequality in Euclidean
space, as well as weighted Nash inequalities which form a further family of interest.
Again, heat kernel bounds and tail inequalities of various types may be obtained.
Weak Poincaré inequalities are studied in Sect. 7.5 as the main minimal tool to
tighten families of standard functional inequalities. Weak Poincaré inequalities may
be further studied in their own right from the viewpoint of heat kernel bounds and
tail estimates. Section 7.6 briefly describes related families of functional inequali-
ties of interest and their relationships with the previous ones. Various applications
of the tools investigated in this chapter are illustrated, for comparison, on the model
family of (probability) measures $c_\alpha\,e^{-|x|^\alpha}dx$, α > 0, on $\mathbb{R}^n$ with respect to the standard
carré du champ operator $\Gamma(f) = |\nabla f|^2$. Actually, to avoid technical regularity
issues of the potential $W(x) = |x|^\alpha$ (at the origin), it will be more convenient to deal
with the family
\[
d\mu_\alpha(x) = c_\alpha \exp\Big(-\big(1 + |x|^2\big)^{\alpha/2}\Big)\,dx \tag{7.0.1}
\]
(with $c_\alpha > 0$ the normalization constant) which behaves similarly at infinity. The
main results for this family in the one-dimensional case are summarized in Sect. 7.7
at the end of the chapter. The multi-dimensional picture is essentially the same with
constants depending on the dimension.
Finally, it is worth emphasizing that most inequalities investigated in this chapter
compare two functionals via a growth function Φ which will always be $C^1$ increasing
and concave from (0, ∞) to $\mathbb{R}$, $\mathbb{R}_+$ or (0, ∞). Although the $C^1$ hypothesis is not
strictly necessary (continuous should be enough), it will be convenient for a number
of technical steps. As already (briefly) mentioned in the proof of Proposition 4.10.4,
p. 224, a standard procedure in this regard, at least for positive growth functions, is
to replace Φ (concave increasing) by $\widetilde{\Phi}(r) = \frac{2}{r}\int_0^r \Phi(s)\,ds$ which is still increasing
and concave and satisfies $\frac{1}{2}\,\widetilde{\Phi}(r) \le \Phi(r) \le \widetilde{\Phi}(r)$. In several parts of this chapter,
sharp constants are indeed not an issue and focus is instead placed on growth orders.
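The two-sided comparison between a growth function and its averaged version can be verified directly. The sketch below assumes that Φ is positive, increasing and concave on (0, ∞), with limit Φ(0) ≥ 0 at the origin, and writes $\widetilde{\Phi}$ for the averaged function (this notation is chosen here for illustration).

```latex
% Upper bound on the average: \Phi is increasing.
\[
\frac{1}{r}\int_0^r \Phi(s)\,ds \le \Phi(r).
\]
% Lower bound on the average: concavity and \Phi(0) \ge 0 give
% \Phi(s) \ge \frac{s}{r}\Phi(r) + \Big(1 - \frac{s}{r}\Big)\Phi(0) \ge \frac{s}{r}\Phi(r), 0 \le s \le r.
\[
\frac{1}{r}\int_0^r \Phi(s)\,ds \ge \frac{\Phi(r)}{r^2}\int_0^r s\,ds = \frac{1}{2}\,\Phi(r).
\]
% Hence, with \widetilde{\Phi}(r) = \frac{2}{r}\int_0^r \Phi(s)\,ds:
\[
\frac{1}{2}\,\widetilde{\Phi}(r) \le \Phi(r) \le \widetilde{\Phi}(r).
\]
```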
This family compares the entropy and the energy of a function (in the Dirichlet
domain D(E)) via a $C^1$ increasing concave growth function Φ : (0, ∞) → R. A first
example of such an inequality is the logarithmic entropy-energy inequality (6.2.2)
of Proposition 6.2.3, p. 281, with growth function the logarithmic function. The
function Φ(r), r > 0, may have a finite limit at r = 0, denoted Φ(0), or converge to
−∞ as in the latter example. Recall that the entropy of a measure ν (not necessarily
finite) on a measurable space (E, F) has been defined at (5.1.1), p. 236, as
\[
\mathrm{Ent}_\nu(f) = \int_E f \log f\,d\nu - \int_E f\,d\nu\,\log\Big(\int_E f\,d\nu\Big)
\]
for all positive integrable functions f : E → R such that $\int_E f\,|\log f|\,d\nu < \infty$.
Observe that for smooth functions, $\beta' = -(\Phi')^{-1}$, and so there is a complete
equivalence between the two formulations (7.1.2) and (7.1.3). Whenever Φ′ is not
bijective, $(\Phi')^{-1}$ has to be replaced by its generalized inverse, where the generalized
The main interest in the family of entropy-energy inequalities EE(Φ) is the flexibility
in the choice of the growth function Φ. In particular, this family interpolates
between logarithmic Sobolev and Sobolev-type inequalities, and actually allows for
intermediate regimes, in particular of heat kernel bounds. The following first and
main result is an illustration of this principle. It is a direct extension of the hyper-
contractivity Theorem 5.2.3, p. 246, and is proved using the same Gross method.
If 1 ≤ p ≤ ∞, $p^*$ denotes its conjugate exponent, so that $\frac{1}{p} + \frac{1}{p^*} = 1$. The norms
are understood as usual in Lp (μ), 1 ≤ p ≤ ∞. Recall the operator norm notation
$\|\cdot\|_{p,q}$ from Lp (μ) into Lq (μ) from (6.3.1), p. 286.
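where
\[
\begin{cases}
t(\delta) = \displaystyle\int_{pp^*}^{qq^*} \Phi'(\delta r)\,\frac{dr}{4\sqrt{r(r-4)}}, \\[3mm]
m(\delta) = \displaystyle\int_{pp^*}^{qq^*} \Phi(\delta r)\,\frac{dr}{r\sqrt{r(r-4)}}.
\end{cases}
\tag{7.1.5}
\]
When 1 ≤ p ≤ q ≤ 2, the integrals $\int_{pp^*}^{qq^*}$ in (7.1.5) have to be replaced by
$\int_{qq^*}^{pp^*}$ while when 1 ≤ p ≤ 2 ≤ q ≤ ∞, they have to be replaced by
$\int_4^{pp^*} + \int_4^{qq^*}$.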
Remark 7.1.3 In Theorem 7.1.2, it is sometimes simpler to change r into
$s = \sqrt{1 - \frac{4}{r}}$ and δ into 4δ so that (7.1.5) becomes
\[
\begin{cases}
t(\delta) = \dfrac{1}{2}\displaystyle\int_{1-\frac{2}{p}}^{1-\frac{2}{q}} \Phi'\Big(\frac{\delta}{1-s^2}\Big)\,\frac{ds}{1-s^2}, \\[3mm]
m(\delta) = \dfrac{1}{2}\displaystyle\int_{1-\frac{2}{p}}^{1-\frac{2}{q}} \Phi\Big(\frac{\delta}{1-s^2}\Big)\,ds.
\end{cases}
\tag{7.1.6}
\]
In this form, the formulas are valid whatever the values of the parameters
1 ≤ p ≤ q ≤ ∞, as long as the integrals converge.
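The change of variables of this remark can be carried out explicitly. The following is a sketch for the range r ≥ 4, s ∈ [0, 1) (that is, p ≥ 2), using only elementary calculus on the weights appearing in (7.1.5).

```latex
% Substitution s = \sqrt{1 - 4/r}, i.e. r = 4/(1 - s^2).
\[
dr = \frac{8s}{(1-s^2)^2}\,ds, \qquad
\sqrt{r(r-4)} = \frac{4s}{1-s^2}.
\]
% The two weights of (7.1.5) therefore become
\[
\frac{dr}{4\sqrt{r(r-4)}} = \frac{ds}{2(1-s^2)}, \qquad
\frac{dr}{r\sqrt{r(r-4)}} = \frac{ds}{2}.
\]
% Endpoints: r = pp^* = p^2/(p-1) gives 1 - 4/r = (p-2)^2/p^2, hence s = 1 - 2/p.
% Argument: \delta r = 4\delta/(1-s^2), which reads \delta/(1-s^2) once 4\delta is renamed \delta.
```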
We make a few comments before turning to the proof of Theorem 7.1.2. First,
the result is compatible with the semigroup property. Namely, since $\|P_t \circ P_s\|_{p,q} \le \|P_t\|_{p,r}\,\|P_s\|_{r,q}$
when p ≤ r ≤ q, applying the conclusion separately on [0, t] and
on [0, s] yields the same bound as applying it directly on [0, t + s]. In particular, the
case p ≤ 2 ≤ q is a direct consequence of the separate results for the pairs (p, 2)
and (2, q). Moreover, thanks to the symmetry of the formulas under the change of
p and q into their conjugate exponents p ∗ and q ∗ , it is also compatible with the
symmetry property Pt p,q = Pt q ∗ ,p∗ . Therefore, everything reduces to the case
2 ≤ p ≤ q ≤ ∞.
The second set of observations actually illustrates the range of applications of Theorem 7.1.2. Note first that whenever lim inf_{r→∞} Φ′(r) > 0, as is the case for a logarithmic Sobolev inequality and hypercontractivity (cf. Theorem 5.2.3, p. 246), one may not reach q = ∞ in Theorem 7.1.2. There is actually a minimal value of t for which hypercontractivity applies. On the other hand, the conclusion of Theorem 7.1.2 may be applied to p = 2 and q = ∞, or p = 1 and q = ∞, as soon as r^{−1}Φ′(r) is integrable at infinity (which implies that r^{−2}Ψ(r) is also integrable at infinity). It applies in particular to the logarithmic entropy-energy inequality (6.2.2), p. 281, for which Φ(r) = (n/2) log(A + Cr), r ∈ (0, ∞), and yields in this case ultracontractive bounds of the form ‖P_t‖_{1,∞} ≤ C t^{−n/2} as described in Sect. 6.3, p. 286. It also shows that Φ(r) = A + Cr^α, r ∈ (0, ∞), gives ultracontractive bounds as soon as α < 1, from which it appears that the standard logarithmic Sobolev case is a limiting case for ultracontractivity. It may furthermore happen that lim_{r→∞} Φ′(r) = 0 while ∫^∞ Φ′(r) dr/r = ∞. In this case, the semigroup is said to be immediately hypercontractive in the sense that for any t > 0 and any 1 < p < q < ∞, P_t is a bounded operator from L^p(μ) into L^q(μ).
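As an illustration of the first observation, the logarithmic Sobolev case can be worked out explicitly from (7.1.5). This is only a sketch: it assumes that the conclusion of Theorem 7.1.2 reads ‖P_{t(δ)}‖_{p,q} ≤ e^{m(δ)}, takes 2 ≤ p ≤ q < ∞, and uses the growth function Φ(r) = 2Cr attached to LS(C); the variable u is introduced here only for the computation.

```latex
% Assumption: \Phi(r) = 2Cr, so \Phi'(r) = 2C and \Psi(r) = \Phi(r) - r\Phi'(r) = 0.
\begin{align*}
m(\delta) &= 0, \\
t(\delta) &= \int_{pp^*}^{qq^*} \frac{2C\,dr}{4\sqrt{r(r-4)}}
          = \frac{C}{2}\int_{p}^{q} \frac{du}{u-1}
          = \frac{C}{2}\,\log\frac{q-1}{p-1}
          \qquad \Big(r = \frac{u^2}{u-1},\ \frac{dr}{\sqrt{r(r-4)}} = \frac{du}{u-1}\Big).
\end{align*}
% Hence \|P_t\|_{p,q} \le 1 when e^{2t/C} = (q-1)/(p-1). The bound does not depend on \delta,
% and q = \infty cannot be reached since t would then be infinite.
```

The relation e^{2t/C} = (q − 1)/(p − 1) is the familiar hypercontractivity time of Gross, which is consistent with the statement that q = ∞ is out of reach in this case.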
Setting Λ(t) = ∫_E (P_t f)^{q(t)} dμ, t ≥ 0, and arguing as in the proof of Theorem 5.2.3,
\[
\Lambda'(t) = \frac{q'(t)}{q(t)}\Big[\mathrm{Ent}_\mu\big((P_t f)^{q(t)}\big) + \Lambda(t)\log\Lambda(t)\Big] - q(t)\big(q(t)-1\big)\,\mathcal{E}_{q(t)}(P_t f)
\]
where we set E_q(f) = ∫_E f^{q−2} Γ(f) dμ to ease the notation. Now, (7.1.2) applied to (P_t f)^{q/2} yields that for all r > 0,
\[
\Lambda' \le q\Big[\frac{q'}{4}\,\Phi'(r) - (q-1)\Big]\mathcal{E}_q(P_t f) + \frac{q'}{q}\,\Lambda\,\big[\Psi(r) + \log\Lambda\big].
\]
Next choose q ↦ r(q) ∈ (0, ∞) and take q = q(t) which solves the differential equation q′ Φ′(r(q)) = 4(q − 1). The preceding inequality then reads
\[
(\log\Lambda)' \le \frac{q'}{q}\,\big[\Psi(r(q)) + \log\Lambda\big],
\]
or equivalently
\[
\Big(\frac1q\log\Lambda\Big)' \le \frac{q'}{q^2}\,\Psi\big(r(q)\big).
\]
Integrating this differential inequality between 0 and t yields
\[
\log\Lambda(t)^{1/q(t)} \le \log\Lambda(0)^{1/q(0)} + \int_0^t \frac{q'}{q^2}\,\Psi\big(r(q)\big)\,ds.
\]
In other words, for any choice of the function q ↦ r(q), and using q as a variable instead of t, if
\[
t = \int_{q_0}^{q_1} \Phi'\big(r(q)\big)\,\frac{dq}{4(q-1)}
\]
with q_0 < q_1, then
\[
\|P_t\|_{q_0,q_1} \le \exp\Big(\int_{q_0}^{q_1} \Psi\big(r(q)\big)\,\frac{dq}{q^2}\Big).
\]
It remains to choose the map q ↦ r(q) in order to optimize this bound when q_0 = q(0), q_1 = q(t) and t are given. The computation shows that the optimal choice indeed does not depend on Φ, and is given by q ↦ r(q) = δq²/(q − 1). The final result is then obtained after a change of variable. Theorem 7.1.2 is established.
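The change of variable mentioned at the end of the proof can be sketched as follows. This is an illustrative computation under the assumption 2 ≤ q_0 ≤ q_1; the variable u = qq* is introduced here only for this purpose.

```latex
% With the optimal choice r(q) = \delta q^2/(q-1) = \delta\, qq^*, set u = qq^* = q^2/(q-1).
\begin{align*}
\frac{du}{dq} &= \frac{q(q-2)}{(q-1)^2}, \qquad
\sqrt{u(u-4)} = \frac{q(q-2)}{q-1}
\quad\Longrightarrow\quad \frac{dq}{q-1} = \frac{du}{\sqrt{u(u-4)}}, \\
t &= \int_{q_0}^{q_1} \Phi'\big(r(q)\big)\frac{dq}{4(q-1)}
   = \int_{q_0q_0^*}^{q_1q_1^*} \Phi'(\delta u)\,\frac{du}{4\sqrt{u(u-4)}}, \\
\int_{q_0}^{q_1} \Psi\big(r(q)\big)\frac{dq}{q^2}
  &= \int_{q_0q_0^*}^{q_1q_1^*} \Psi(\delta u)\,\frac{du}{u\sqrt{u(u-4)}}
  \qquad \Big(\frac{1}{q^2} = \frac{1}{u\,(q-1)}\Big).
\end{align*}
```

With q_0 = p and q_1 = q these are the two integrals of (7.1.5).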
Conversely, if for some t ∈ (a, ∞), a ≥ 0, ‖P_t‖_{1,∞} ≤ K(t), then EE(Φ) holds for Φ defined by
\[
\Phi(r) = \inf_{t>a}\big[\,2tr + \log K(t)\,\big], \quad r \in (0,\infty).
\]
Proof The first assertion follows from Theorem 7.1.2 with p = 1 and q = ∞, changing r into 4r and δ into 4δ. Conversely, as a consequence of the Riesz-Thorin Theorem, ‖P_t‖_{1,∞} ≤ K implies that ‖P_t‖_{2,∞} ≤ √K. By Proposition 5.2.6, p. 248, it follows that whenever ∫_E f² dμ = 1,
\[
\mathrm{Ent}_\mu(f^2) \le 2t\,\mathcal{E}(f) + \log K(t).
\]
The example of K(t) = B t^{−n/2}, t > 0, in Corollary 7.1.4 (as is the case under a Sobolev inequality, cf. Theorem 6.3.1, p. 286), yields a growth function of the form Φ(r) = (n/2) log(A + Cr), r ∈ (0, ∞), in the entropy-energy inequality EE(Φ). In general, the uniform bound on ‖P_t‖_{1,∞} reflects the behavior of Φ at infinity as t → 0, and the behavior of Φ at 0 as t → ∞. In this way, the next statement recovers from Corollary 7.1.4 the ultracontractive bounds under a Sobolev inequality of Theorem 6.3.1, p. 286, through entropy-energy inequalities.
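For the example K(t) = B t^{−n/2}, the infimum defining Φ can be computed in closed form. The following is a sketch under the assumption that the bound holds for all t > 0 (so that a = 0 and the infimum runs over t > 0).

```latex
% Assumption: K(t) = B\,t^{-n/2} for all t > 0 (that is, a = 0 in the corollary).
\begin{align*}
\Phi(r) &= \inf_{t>0}\Big[\,2tr + \log B - \frac n2 \log t\,\Big], \qquad
\frac{d}{dt}\Big[2tr - \frac n2\log t\Big] = 2r - \frac{n}{2t} = 0 \iff t = \frac{n}{4r}, \\
\Phi(r) &= \frac n2 + \log B - \frac n2\log\frac{n}{4r}
        = \frac n2 \log\Big(\frac{4e\,B^{2/n}}{n}\, r\Big).
\end{align*}
```

The function of t being minimized is convex, so the critical point is the minimizer, and the resulting Φ is of the announced logarithmic form (here with A = 0).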
Corollary 7.1.6 Under the Euclidean logarithmic Sobolev inequality EE(Φ) with Φ(r) = (n/2) log(2r/(nπe)), r ∈ (0, ∞),
\[
\|P_t\|_{1,\infty} \le \frac{1}{(4\pi t)^{n/2}}, \quad t > 0.
\]
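The constant in this corollary can be checked directly from the form (7.1.6) of the integrals. The computation below is a sketch, assuming that the conclusion of Theorem 7.1.2 reads ‖P_{t(δ)}‖_{1,∞} ≤ e^{m(δ)} for p = 1 and q = ∞; the shorthand c = 2/(nπe) is introduced only for readability.

```latex
% Assumptions: \Phi(r) = \frac n2 \log(cr) with c = 2/(n\pi e); p = 1, q = \infty in (7.1.6).
\begin{align*}
\Phi'(r) &= \frac{n}{2r}, \qquad \Psi(r) = \Phi(r) - r\Phi'(r) = \frac n2\log(cr) - \frac n2, \\
t(\delta) &= \frac12\int_{-1}^{1} \frac{n(1-s^2)}{2\delta}\,\frac{ds}{1-s^2} = \frac{n}{2\delta}, \\
m(\delta) &= \frac12\int_{-1}^{1}\Big[\frac n2\log\frac{c\,\delta}{1-s^2} - \frac n2\Big]ds
          = \frac n2\log(c\,\delta) - \frac n2 - \frac n4\int_{-1}^{1}\log(1-s^2)\,ds \\
          &= \frac n2\log(c\,\delta) + \frac n2 - n\log 2
          = \frac n2\log\frac{c\,e\,\delta}{4}
          \qquad \Big(\int_{-1}^{1}\log(1-s^2)\,ds = 4\log 2 - 4\Big).
\end{align*}
% With \delta = n/(2t): c e \delta / 4 = 1/(4\pi t), so e^{m} = (4\pi t)^{-n/2}.
```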
These intermediate bounds are again optimal on the standard heat semigroup on Rn .
Indeed, since this bound is optimal when p = 1 and q = ∞, thanks to the compati-
bility relations between the (p, q) norms and the semigroup property, any interme-
diate bound is also optimal.
When the measure μ is finite (and normalized into a probability measure), the growth function Φ in the definition of an entropy-energy inequality EE(Φ) takes positive values (since Ent_μ(f²) ≥ 0). Furthermore, as mentioned earlier, the existence of a Poincaré inequality is equivalent to the fact that Φ may be chosen so that Φ(r) ≤ Cr near 0. As is known from Corollary 6.4.1, p. 290, as soon as (P_t)_{t≥0} is ultracontractive and the measure is finite, the operators P_t, t > 0, are Hilbert-Schmidt and therefore the generator L has a discrete spectrum. In particular, a Poincaré inequality holds, and it may therefore be assumed that Φ(r) ≤ Cr for some constant C, or equivalently that Φ′(0) < ∞. Note also that, from the general Definition 7.1.1, tightness in EE(Φ) is achieved as soon as Φ(0) = 0. As a consequence of the discussion in Sect. 7.5 below (Corollary 7.5.7), when Φ(0) = 0 it is possible to choose (some other) growth function Φ such that Φ(r) ≤ Cr for some C > 0.
When Φ′(0) < ∞ and r^{−1}Φ′(r) is integrable at infinity, Corollary 7.1.4 actually yields bounds on the convergence to equilibrium. The following statement again follows from a precise analysis of t(δ) and m(δ) as δ → 0 in (7.1.7). The details are left to the reader.
7.2 Off-diagonal Heat Kernel Bounds
This section addresses the issue of off-diagonal estimates on the density kernels p_t(x, y), t > 0, (x, y) ∈ E × E, of a Markov semigroup (P_t)_{t≥0} (with respect to the invariant measure) under an entropy-energy inequality EE(Φ). Actually, as soon as uniform bounds are available, that is a control of p_t(x, x) or p_t(x, y) uniformly over (x, y) ∈ E × E (almost everywhere), there are in general also off-diagonal estimates taking into account the (intrinsic) distance d(x, y) between x and y associated with the Markov Triple (E, μ, Γ) as considered in (3.3.9), p. 166. Recall that the distance function (Sect. 3.3.7, p. 166) refers to the extended algebra A, justifying the Full Markov Triple assumption (cf. Sect. 3.4, p. 168).
Gaussian bounds of the type
(for constants C(t) > 0, c > 0) are of interest. Such heat kernel bounds were already
considered in Sect. 6.7, p. 296, under geometric features by means of Harnack-type
inequalities. In this section, a general method of providing such bounds is developed
under functional inequalities in entropy-energy form.
In the finite measure case, whenever C(t) < 2 in (7.2.1) (with this in mind, recall Proposition 6.3.4, p. 289), Proposition 1.2.6, p. 15, provides a uniform lower bound on p_{2t}, and therefore such heat kernel bounds can only hold if the diameter D = D(E, μ, Γ) of E with respect to the distance d (cf. (3.3.10), p. 167) is finite. In this respect, we first state a bound on the diameter under an entropy-energy inequality EE(Φ) for a suitable growth function Φ. It is obtained exactly as Proposition 6.6.1, p. 294, under logarithmic entropy-energy inequalities by means of the Herbst argument (Proposition 5.4.1, p. 252).
Recall that a standard logarithmic Sobolev inequality LS(C) with Φ(r) = 2Cr, r ∈ (0, ∞), does not yield any finite diameter (for example the standard Gaussian measure in R^n). As for Poincaré and logarithmic Sobolev inequalities, the Laplace transform bounds of Proposition 7.2.1 may be used towards tail estimates on μ(|f − ∫_E f dμ| ≥ r), r > 0, at rates reflected by the growth of Φ. Further, and stronger, estimates will be illustrated later in the context of generalized Nash inequalities.
The next theorem presents the announced off-diagonal estimates under an entropy-energy inequality EE(Φ). According to the previous discussion and Proposition 7.2.1, in the finite measure case the statement is restricted to the case when EE(Φ) implies a bounded diameter. The technique developed in the proof by a suitable transformation of the semigroup under a Lipschitz function turns out to be most efficient in many situations.
Then, for 0 < t ≤ T(d(x, y)), (x, y) ∈ E × E, the density kernels p_t(x, y), t > 0, of the semigroup (P_t)_{t≥0} satisfy
\[
\log p_t(x,y) \le H\big(t, d(x,y)\big)
\]
Observe that in the finite measure case, the diameter D is bounded from above
by U (0) according to Proposition 7.2.1 so that T (d) is well-defined on [0, D]. Al-
though the explicit bound put forward in Theorem 7.2.2 appears rather involved,
Corollary 7.2.3 below will actually show that it leads to the expected precise off-
diagonal bounds under Sobolev-type inequalities.
Proof For a function h ∈ A_0 such that Γ(h) ≤ δ where δ > 0 will be specified later, consider the new semigroup P_t^h f = e^{−h} P_t(e^h f), t ≥ 0. This semigroup is no longer Markov, but is still positivity preserving. It is symmetric with respect to the measure e^{2h} dμ, but this symmetry will actually not be used since we mostly work with the initial measure μ. The first task is to reach, under the EE(Φ) inequality, the bound
\[
\|P_t^h\|_{1,\infty} \le e^{m(t,\delta)} \tag{7.2.2}
\]
where m(t, δ) only depends on t and δ (and of course on the function Φ itself). Hence the density kernel p_t^h(x, y) with respect to μ is uniformly bounded from above by e^{m(t,δ)}. But, by construction, p_t^h(x, y) = p_t(x, y) e^{h(y)−h(x)} so that, for all t > 0 and (x, y) ∈ E × E,
\[
p_t(x,y) \le e^{m(t,\delta)+h(x)-h(y)}.
\]
Minimize then the previous bound on p_t(x, y) over all functions h such that Γ(h) ≤ δ to get that
\[
p_t(x,y) \le e^{m(t,\delta)-\sqrt{\delta}\,d(x,y)}. \tag{7.2.3}
\]
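The minimization over h leading to (7.2.3) is a scaling argument. It can be sketched as follows, assuming that the intrinsic distance of (3.3.9) is given by d(x, y) = esssup [h(x) − h(y)] over functions h with Γ(h) ≤ 1; the auxiliary function k is introduced only for illustration.

```latex
% Assumption: d(x,y) = \operatorname{esssup}\{ h(x) - h(y) : \Gamma(h) \le 1 \}, as in (3.3.9).
\begin{align*}
\inf_{\Gamma(h)\le\delta}\big[h(x)-h(y)\big]
 &= \sqrt{\delta}\,\inf_{\Gamma(k)\le 1}\big[k(x)-k(y)\big]
 && (h=\sqrt{\delta}\,k,\ \Gamma(h)=\delta\,\Gamma(k))\\
 &= -\sqrt{\delta}\,\sup_{\Gamma(k)\le 1}\big[k(y)-k(x)\big] = -\sqrt{\delta}\,d(x,y).
\end{align*}
```

The last equality uses the symmetry d(x, y) = d(y, x), which follows from replacing k by −k.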
For T > 0, choose then an increasing function t ↦ q(t) with q(0) = 1 and q(T) = ∞ to be specified later and consider on [0, T] the quantity Λ(t) = Λ(t, q(t)), t ≥ 0. The inequality EE(Φ) in the form (7.1.2) leads to the differential inequality on Λ,
\[
\Lambda' \le \Big[\frac{q'}{q}\,\Phi'(r) + \frac{|q-2|}{\alpha} - \frac{4(q-1)}{q}\Big]\int_E \Gamma\big(g^{q/2}\big)\,d\mu
 + \Big[\frac{q'}{q}\,\Psi(r) + \delta\big(q+\alpha|q-2|\big) + \frac{q'}{q}\log\Lambda\Big]\Lambda
\]
depending on the two parameters α > 0 and r > 0. These parameters may be chosen to depend upon q, and by further choosing q = q(t) so that the coefficient of ∫_E Γ(g^{q/2}) dμ in the previous inequality vanishes, we end up with
\[
\Big(\frac1q\log\Lambda\Big)' \le \frac{q'}{q^2}\,\Psi\big(r(q)\big) + \frac{\delta}{q}\,\big[q+\alpha(q)|q-2|\big]
\]
where
\[
\alpha(q) = \frac{q|q-2|}{4(q-1) - q'\,\Phi'(r(q))}. \tag{7.2.4}
\]
Setting
\[
\alpha = \frac{q|q-2|\,s}{4(q-1)(s-1)}
\]
where s = s(q) > 1 is a new function, (7.2.4) turns into
\[
dt = s(q)\,\frac{\Phi'(r(q))}{4(q-1)}\,dq
\]
and
\[
m(T) = \int_1^\infty \Big[\frac{4(q-1)}{q^2}\,\Psi\big(r(q)\big)
 + \delta\,s(q)\,\Phi'\big(r(q)\big)\Big(1+\frac{(q-2)^2\,s(q)}{4(q-1)(s(q)-1)}\Big)\Big]\frac{dq}{4(q-1)}.
\]
The final step is to choose optimal functions r(q) and s(q) which minimize the quantity m(T) when T > 0 is fixed. The optimal choice once again does not depend on Φ, and is given, for some parameter 0 < κ < 1, by
\[
s = s(q) = 1 + \sqrt{\frac{qq^*-4}{qq^*-4\kappa}}, \qquad r = r(q) = \delta(1-\kappa)\,\frac{qq^*\,s}{4(2-s)}
\]
or, with κ = 1 − τ/δ and v = δw,
\[
\begin{cases}
t = t(\delta,\tau) = \dfrac12\displaystyle\int_\tau^\infty \Phi'(v)\,\frac{dv}{\sqrt{v(v+\delta-\tau)}}, \\[2mm]
m = m(\delta,\tau) = \dfrac12\displaystyle\int_\tau^\infty \Phi(v)\Big(\sqrt{\delta}-\frac{\delta-\tau}{\sqrt{v+\delta-\tau}}\Big)\frac{dv}{v^{3/2}} + (\delta-\tau)\,t.
\end{cases}
\]
Turning back to (7.2.3), the minimum in δ of m(δ, τ) − √δ d when t(δ, τ) is fixed is obtained when
\[
d = \frac12\int_\tau^\infty \frac{\Phi(v)}{v^{3/2}}\,dv - \frac{\Phi(\tau)}{\sqrt{\tau}} = \int_\tau^\infty \frac{\Phi'(v)}{v^{1/2}}\,dv
\]
from which the theorem is a direct consequence. The proof is complete.
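The equality between the two expressions for d at the end of the proof is an integration by parts. The following sketch assumes that Φ(v)/√v tends to 0 at infinity, which is not stated explicitly in the text above.

```latex
% Assumption: \Phi(v)/\sqrt{v} \to 0 as v \to \infty, so that the boundary term at infinity vanishes.
\begin{align*}
\int_\tau^\infty \frac{\Phi'(v)}{v^{1/2}}\,dv
 &= \Big[\frac{\Phi(v)}{v^{1/2}}\Big]_{\tau}^{\infty} + \frac12\int_\tau^\infty \frac{\Phi(v)}{v^{3/2}}\,dv
  = \frac12\int_\tau^\infty \frac{\Phi(v)}{v^{3/2}}\,dv - \frac{\Phi(\tau)}{\sqrt{\tau}} .
\end{align*}
```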
(x, y) ∈ E × E,
\[
p_t(x,y) \le B\,t^{-n}\,e^{-d^2(x,y)/4t}.
\]
The details, a direct consequence of Theorem 7.2.2, are left to the reader. It is
worth mentioning that the preceding corollary achieves the optimal 4t in the expo-
nential factor, although the polynomial term is only of order t −n and not t −n/2 as in
the uniform estimate (Corollary 7.1.5). This is not a failure of the method. Indeed,
on a Riemannian manifold, outside the diagonal, pt (x, y) behaves when t → 0 in
a different way at the cut-locus of x. The correct exponent is t −(n+p)/2 , where p
is the dimension of the manifold of geodesics which go from x to y. For example,
p = n − 1 on two opposite points on an n-dimensional sphere.
The distance function d used to obtain the off-diagonal estimates in Theorem 7.2.2 is built from the generator L of the Markov semigroup (P_t)_{t≥0}. Other kinds of pseudo-distances may be considered in the same way, such as for example, the harmonic distance d^H(x, y) defined as
\[
d^H(x,y) = \operatorname{esssup}\big[h(x)-h(y)\big], \quad (x,y)\in E\times E, \tag{7.2.5}
\]
the esssup running over all functions h in A such that Lh = 0 and Γ(h) ≤ 1. Of course d^H = 0 on a compact manifold with respect to the Laplace-Beltrami operator since every harmonic function is then constant. But in R^n with the usual Laplace operator, d^H(x, y) = |x − y| since the linear functions are harmonic and attain the distance between x and y. With this definition, optimal heat kernel bounds may be achieved.
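The Euclidean claim d^H(x, y) = |x − y| can be made explicit as follows. This is a short sketch for E = R^n with L = Δ and Γ(h) = |∇h|²; the notation h_u for the linear functions is introduced here only for illustration.

```latex
% Setting: E = \mathbb{R}^n, L = \Delta, \Gamma(h) = |\nabla h|^2.
\begin{align*}
h_u(z) &= \langle u, z\rangle,\ |u| = 1: \qquad \Delta h_u = 0, \quad \Gamma(h_u) = |u|^2 = 1, \quad
h_u(x) - h_u(y) = \langle u, x-y\rangle, \\
\sup_{|u|=1}\langle u, x-y\rangle &= |x-y| \quad \text{(attained at } u = (x-y)/|x-y|\text{)}, \\
h(x) - h(y) &\le |x-y| \quad \text{whenever } |\nabla h| \le 1 \text{ (such an } h \text{ is 1-Lipschitz)}.
\end{align*}
```

The first two lines give d^H(x, y) ≥ |x − y| and the last one gives the reverse inequality.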
Proof We only sketch the proof, staying at a somewhat informal level. Following the proof of Theorem 7.2.2, change P_t f into P_t^γ f = e^{−γh} P_t(e^{γh} f) where now
then
q2
Entμ f q ≤ − (r) f q−1 Lγ f dμ + (r) + γ 2 f q dμ.
4(q − 1) E E
Hence, following the proof of Theorem 7.1.2, and with the same values of m and t,
\[
\|P_t^\gamma\|_{1,\infty} \le e^{m+\gamma^2 t},
\]
for every t > 0 and (x, y) ∈ E × E. It then remains to optimize first in h and then
in γ to get the desired result.
7.3 Examples
This section illustrates some of the results of the preceding sections, both at the level of the entropy-energy inequalities themselves and of their consequences for heat kernel bounds. We deal for simplicity with Markov Triples (E, μ, Γ) on the Euclidean space E = R^n with Γ(f) = |∇f|² and dμ = e^{−W} dx where W : R^n → R is a smooth potential.
The next proposition describes conditions on the growth of the potential W and its derivatives in order for an entropy-energy inequality EE(Φ) to hold for some function Φ.
In the preceding, the function c may of course be replaced by any larger function, and the best choice for Φ(r), r ∈ (0, ∞), is the concave envelope of the function
\[
\inf_{s>0}\Big[\frac n2(\log s - 1) + \frac{r}{\pi e s} + c(s)\Big]
\]
Proof Start from the Euclidean logarithmic Sobolev inequality (6.2.8) of Proposition 6.2.5, p. 284, which indicates that for any smooth compactly supported function g on R^n such that ∫_{R^n} g² dx = 1,
\[
\int_{\mathbb{R}^n} g^2\log g^2\,dx \le \frac n2\log\Big(\frac{2}{n\pi e}\int_{\mathbb{R}^n}|\nabla g|^2\,dx\Big).
\]
The above proof is not specifically tied to the Euclidean case and shows how to go from an entropy-energy inequality EE(Φ) on (E, μ, Γ) to a new one for (E, ν, Γ), where dν = e^{−W} dμ.
Proposition 7.3.1 may be applied in several instances of interest, including the model family μ_α of (7.0.1) described in the introduction of this chapter. Indeed, if W(x) = (1 + |x|²)^{α/2} with α > 2, it is easily checked that c(s) ≤ C s^{α/(α−2)}, s ≥ 1. Here and below, C > 0 depends on n and α, and may change from line to line. As a consequence, μ_α with α > 2 satisfies an EE(Φ) inequality with
\[
\Phi(r) = C\big(1 + r^{\alpha/(2\alpha-2)}\big), \quad r\in(0,\infty). \tag{7.3.1}
\]
(It may be checked furthermore that whenever α < 2, μ_α does not satisfy any EE(Φ) inequality, while μ_2 satisfies a logarithmic Sobolev inequality, that is EE(Φ) with Φ(r) = Cr, r ∈ (0, ∞).) According to Corollary 7.1.4, for 0 < t ≤ 1,
\[
\|P_t\|_{1,\infty} \le e^{C t^{-\alpha/(\alpha-2)}}.
\]
Conversely, this bound for some t > 0 shows that EE(Φ) holds with (7.3.1), so that the heat kernel bound has the right order of magnitude as t → 0. Finally, Proposition 7.2.1 easily shows that μ_α(|x| ≥ r) ≤ C e^{−r^α/C}, r > 0, confirming again the exponents of Φ.
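The exponent α/(α − 2) in the heat kernel bound can be traced back to the exponent of Φ in (7.3.1) by a scaling computation on (7.1.6). This is only a sketch, assuming that Corollary 7.1.4 amounts to using (7.1.6) with p = 1 and q = ∞; the abbreviations β, I_β and c are introduced here for readability.

```latex
% Assumptions: \Phi(r) = C(1 + r^\beta) with \beta = \alpha/(2\alpha-2) \in (1/2, 1) for \alpha > 2;
% p = 1, q = \infty in (7.1.6); I_\beta = \int_{-1}^{1}(1-s^2)^{-\beta}\,ds < \infty since \beta < 1.
\begin{align*}
\Phi'(r) &= C\beta\, r^{\beta-1}, \qquad \Psi(r) = C + C(1-\beta)\, r^{\beta}, \\
t(\delta) &= \frac{C\beta I_\beta}{2}\,\delta^{\beta-1}, \qquad
m(\delta) = C + \frac{C(1-\beta) I_\beta}{2}\,\delta^{\beta}, \\
m &= C + c\, t^{-\beta/(1-\beta)} \ \text{ for some } c > 0 \text{ depending on } C, \beta, \qquad
\frac{\beta}{1-\beta} = \frac{\alpha}{\alpha-2}.
\end{align*}
```

For 0 < t ≤ 1 the additive constant is absorbed into the power of t, which gives a bound of the stated form. The same computation shows why integrability requires β < 1.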
Along the same lines, if W(x) = |x|² [log(1 + |x|²)]^α, α > 1, we may choose c(s) = C(1 + e^{b s^{1/α}}), s > 0, and Φ(r) = C(1 + r log^{−α}(e + r)), r ∈ (0, ∞), from which
\[
\|P_t\|_{1,\infty} \le \exp\big(e^{C t^{-1/(\alpha-1)}}\big), \quad 0 < t \le 1.
\]
In the same spirit, it may be checked that for α = 2, the semigroup is not ultracontractive but immediately hypercontractive (cf. p. 351).
The same method applies on a compact interval of the real line. For example, on (0, 1) with W(x) = −m log x for small x > 0 and W(x) = −p log(1 − x) for x near to 1, m, p > 2, there exists an entropy-energy inequality EE(Φ) with
\[
\Phi(r) \le \frac n2\,(C + \log r)
\]
with κ = max(a, b). This last inequality is easily seen to be compatible with boundedness of the diameter. These examples indicate that the behavior of the function Φ at infinity reflects the behavior of the measure at the boundaries of the interval, the smaller the weight around the edges, the bigger the function Φ at infinity.
max(Φ, a_0). However, the latter function is no longer concave on (0, ∞). Provided that Ψ(r) = Φ(r) − rΦ′(r) → ∞ as r → ∞ (which is usually the case), Φ may be compensated by a linear part near the origin in order to make it concave.
As for the entropy-energy inequality EE(Φ), linearized versions of the Nash inequality N(Φ) may be considered. Whenever Φ′ is bijective (if it is not bijective, use its generalized inverse) and takes values in some interval (a, b), then the N(Φ) inequality is equivalent to the family of inequalities
\[
\int_E f^2\,d\mu \le s\,\mathcal{E}(f) + \beta(s), \quad s\in(a,b), \tag{7.4.1}
\]
Proof The proof is straightforward. Under P(C), for every f ∈ D(E) with ‖f‖_1 = 1,
\[
\int_E f^2\,d\mu \le C\,\mathcal{E}(f) + \Big(\int_E f\,d\mu\Big)^2 \le C\,\mathcal{E}(f) + 1.
\]
Conversely, apply N(Φ) with Φ(r) = 1 + Cr to 1 + εf where f is a bounded function in D(E). For ε > 0 small enough so that 1 + εf ≥ 0, N(Φ) boils down to
\[
1 + 2\varepsilon\int_E f\,d\mu + \varepsilon^2\int_E f^2\,d\mu \le \varepsilon^2 C\,\mathcal{E}(f) + 1 + 2\varepsilon\int_E f\,d\mu + \varepsilon^2\Big(\int_E f\,d\mu\Big)^2
\]
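The expansion used in the converse direction can be spelled out term by term. The sketch below assumes that μ is a probability measure and that N(Φ) is used in its homogeneous form, which is not restated in the text above.

```latex
% Assumption: N(\Phi) in homogeneous form, \int g^2 d\mu \le \|g\|_1^2\,\Phi(\mathcal{E}(g)/\|g\|_1^2),
% with \Phi(r) = 1 + Cr and \mu a probability measure.
\begin{align*}
g = 1+\varepsilon f \ge 0: \quad
\int_E g^2\,d\mu &= 1 + 2\varepsilon\int_E f\,d\mu + \varepsilon^2\int_E f^2\,d\mu, \\
\|g\|_1^2 &= 1 + 2\varepsilon\int_E f\,d\mu + \varepsilon^2\Big(\int_E f\,d\mu\Big)^2, \qquad
\mathcal{E}(g) = \varepsilon^2\,\mathcal{E}(f), \\
\int_E g^2\,d\mu \le C\,\mathcal{E}(g) + \|g\|_1^2
\ &\Longrightarrow\ \int_E f^2\,d\mu - \Big(\int_E f\,d\mu\Big)^2 \le C\,\mathcal{E}(f).
\end{align*}
```

After cancelling the common terms and dividing by ε², what remains is the Poincaré inequality for bounded functions f.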
Proof Recall as in the proof of Proposition 6.2.3, p. 281, that the convex function φ : r ↦ log(‖f‖_{1/r}), r ∈ (0, 1], is such that φ′(1/2) = −Ent_μ(f²)/‖f‖_2². Hence, by convexity,
\[
\log\frac{\|f\|_2^2}{\|f\|_1^2} \le \frac{\mathrm{Ent}_\mu(f^2)}{\|f\|_2^2}.
\]
b b
log b ≤ 1 + log .
a a
As for the classical Nash inequalities in the proof of Theorem 6.3.1, p. 286, the generalized Nash inequalities N(Φ) may be used to obtain heat kernel bounds and ultracontractivity provided Φ does not grow too fast at infinity. While easier to obtain than under an entropy-energy inequality EE(Φ), the bounds are in general less precise (and do not reveal any information on the ‖·‖_{p,q} norms). However, as described in the next section, the same method may provide non-uniform bounds on the heat kernel when the latter is not necessarily bounded. The next statement is a first illustration of this phenomenon.
In particular, the density kernel p_t(x, y), t > 0, (x, y) ∈ E × E, of P_t with respect to the invariant measure μ is uniformly bounded from above by K(t).
Observe from the hypotheses that since U(Φ(0)) = 0, ∫_0^∞ Φ′(r) dr/r = ∞, so that the function K is a decreasing bijection from (0, ∞) onto (Φ(0), ∞).
Proof The proof is very similar to that of the corresponding result for the classical Nash inequalities in Theorem 6.3.1, p. 286, so we shall only sketch it. For a positive function f in D(E) with ∫_E f dμ = 1, consider Λ(t) = ∫_E (P_t f)² dμ, t ≥ 0. The N(Φ) inequality turns into the differential inequality Λ ≤ Φ(−Λ′/2). Fix now t > 0. Since K(2t) ∈ [Φ(0), ∞), if Λ(t) ≤ Φ(0) there is nothing to prove. If not, since Λ is decreasing, Λ(s) > Φ(0) for s ∈ (0, t). Therefore, by definition of the inverse function U, Λ′(s) ≤ −2U(Λ(s)) for s ∈ (0, t). Introducing the decreasing function
\[
H(s) = \int_s^\infty \frac{du}{U(u)} = \int_{\Phi^{-1}(s)}^\infty \Phi'(r)\,\frac{dr}{r}, \quad s\in\big(\Phi(0),\infty\big),
\]
Generalized Nash inequalities may also be used to obtain tail estimates for Lip-
schitz functions. The following statement goes somewhat beyond Proposition 7.2.1
in case of entropy-energy inequalities.
Whenever ∫_0^{1/(2Φ(0))} [u³ Φ^{−1}(1/(2u))]^{−1/2} du < ∞, then f is bounded (μ-almost everywhere), and in particular the diameter of (E, μ, Γ) is finite.
In this statement, F is defined on (0, A), where A = ∫_0^{q_0} [u³ Φ^{−1}(1/(2u))]^{−1/2} du. When A < ∞, it is extended by 0 on [A, ∞).
Proof For r, s > 0, apply N(Φ) to g = (1/s)[(f − r)_+ ∧ s] where f ∈ D(E). Since 1_{{f≥r+s}} ≤ g ≤ 1_{{f≥r}} and Γ(g) ≤ s^{−2} 1_{{f≥r}}, using that Φ is concave, and therefore that u^{−1}Φ(u) is decreasing, it follows that
\[
q(r+s) \le q^2(r)\,\Phi\Big(\frac{1}{s^2\,q(r)}\Big)
\]
for any k ≥ 1. Using that r(q) is decreasing, it finally follows that for any q_1 < q
\[
\big(r(q_1)-r(q)\big)\log 2 \le \int_{q_1/2}^{q}\Big[u^3\,\Phi^{-1}\Big(\frac{1}{2u}\Big)\Big]^{-1/2}du
\]
To conclude this section, we illustrate some of the previous results again on the family μ_α defined in (7.0.1), in dimension one for simplicity. As mentioned earlier (p. 363), these measures μ_α with α ∈ [1, 2) on the real line do not satisfy any entropy-energy inequality (with respect to the standard carré du champ operator Γ(f) = f′²). However an entropy-energy inequality EE(Φ_1) holds when α ≥ 2 with Φ_1(r) = C(1 + r^{α/(2α−2)}), r ∈ (0, ∞), and in turn a generalized Nash inequality N(Φ_2) with Φ_2 ∼ r(log r)^{−2(α−1)/α} at infinity (Proposition 7.4.4). For this growth function, the tail estimates produced by Proposition 7.4.7 provide the correct behavior of μ_α.
Using arguments close to those developed for the Muckenhoupt criterion (cf. Sect. 4.5.1, p. 194), it is not difficult to see that this Nash inequality actually extends to every α ∈ [1, 2) (and is optimal according to the tail estimates of Proposition 7.4.7). We only sketch the method, which may be used in other, similar, instances. In what follows, the constant C > 0 depends only on α and may vary from place to place.
It is enough to deal with a smooth function f on R such that f(0) = 0 and ∫_R f² dμ_α = 1. We may moreover work separately on [0, ∞) and (−∞, 0], say [0, ∞) (thus assuming ∫_0^∞ f² dμ_α = 1). If p(x) = e^{−(1+x²)^{α/2}} is the density of μ_α (up to the normalizing constant), the only ingredient is actually the tail estimate μ_α([x, ∞)) ≤ C x^{1−α} p(x), x > 0. For some parameter a > 0, since f(0) = 0, write
\[
\int_0^\infty f^2\,d\mu_\alpha = \int_0^\infty f^2\,1_{\{|f|\le a\}}\,d\mu_\alpha
 + 2\int_0^\infty f(x)f'(x)\Big(\int_x^\infty 1_{\{|f|>a\}}\,d\mu_\alpha\Big)dx. \tag{7.4.2}
\]
The first integral is bounded above by a‖f‖_1, while for the second, use that
\[
\int_x^\infty 1_{\{|f|>a\}}\,d\mu_\alpha \le \min\big(\mu_\alpha([x,\infty)),\,a^{-2}\big) \le C\min\big(p(x)\,x^{1-\alpha},\,a^{-2}\big).
\]
If x_0 = x_0(a) is the point where x^{1−α} p(x) = a^{−2}, then the right-hand side in the latter inequality is bounded from above by C p(x)/(a² p(x_0)) on [0, x_0) and by C x_0^{1−α} p(x) on [x_0, ∞), hence bounded everywhere by C x_0^{1−α} p(x). By the Cauchy-Schwarz inequality, the second integral on the right-hand side of (7.4.2) is then bounded from above by C x_0^{1−α} E(f)^{1/2}, and therefore
\[
1 = \int_0^\infty f^2\,d\mu_\alpha \le a\,\|f\|_1 + C\,x_0^{1-\alpha}\,\mathcal{E}(f)^{1/2}.
\]
for every f ∈ D(E) with ‖f‖_1 = 1 and with Φ(r) ≤ C r (log r)^{−2(α−1)/α} for large r. The announced claim follows.
In this respect, this short section is concerned with an extension of the Nash inequalities by the introduction of a weight function which offers more flexibility and produces further heat kernel bounds. It deals with a Markov Triple (E, μ, Γ) where μ is a probability measure. Here, a weight function will be a positive (measurable) function w : E → R_+ such that P_t w ≤ L(t)w, t ≥ 0, for some increasing and positive function L on R_+. The next definition of a generalized weighted Nash inequality is entirely similar to the generalized Nash inequality provided the L^1(μ)-norm is weighted by w. It involves similarly a C^1 increasing and concave growth function Φ : (0, ∞) → R_+.
Definition 7.4.8 (Weighted Nash inequality) A Markov Triple (E, μ, Γ), with μ a probability measure, is said to satisfy a (generalized) weighted Nash inequality
7.4 Beyond Nash Inequalities 371
Such inequalities were already considered earlier in Sect. 4.10 (cf. (4.10.4), p. 224) in connection with emptiness of the essential spectrum. The extra condition P_t w ≤ L(t)w on the weight, together with special properties of the growth function Φ, will actually allow for quantitative estimates on the (discrete) spectrum, as well as pointwise estimates on the density kernels p_t(x, y).
The next statement is very similar to Theorem 7.4.5 and borrows from the latter the definition of K(t), t ≥ 0.
Proposition 7.4.9 Let (E, μ, Γ) be a Markov Triple satisfying a weighted Nash inequality N(Φ, w) with growth function Φ : (0, ∞) → R_+ such that r^{−1}Φ′(r) is integrable at infinity. Then (P_t)_{t≥0} has density kernels p_t(x, y), t > 0, (x, y) ∈ E × E, such that, for every t > 0 and (x, y) ∈ E × E,
\[
p_t(x,y) \le K(t)\,L\Big(\frac t2\Big)^2\,w(x)\,w(y), \tag{7.4.3}
\]
for every t > 0 and f such that ‖f‖_1 = 1. This leads to an ultracontractive bound on Q_{2t}, t > 0, and therefore to the uniform bound K(2t)L(t)² on its kernel with respect to ν. But if the semigroup (Q_t)_{t≥0} has a kernel bounded from above by K, then (P_t)_{t≥0} has a density kernel, with respect to μ, which is bounded from above by K w(x)w(y). The conclusion then easily follows and Proposition 7.4.9 is established.
Proposition 7.4.9 is most useful when w ∈ L²(μ), in which case the condition P_t w ≤ e^{ct} w, t ≥ 0, holds as soon as w ∈ D(L) and Lw ≤ cw. The function w is thus similar to the Lyapunov functions of Sect. 4.6, p. 201. Furthermore, since r^{−1}Φ(r), r > 0, is decreasing (as Φ is concave and increasing), the condition P_t w ≤ L(t)w, t ≥ 0, may be replaced by P_t w_1 ≤ L_1(t)w_1, t ≥ 0, for some new weight w_1 ≥ w. In particular, adding a constant, it may always be assumed that w is bounded from below by some strictly positive constant.
Corollary 7.4.10 Under the hypotheses of Proposition 7.4.9, and assuming furthermore that w ∈ L²(μ), the semigroup P = (P_t)_{t≥0} is Hilbert-Schmidt, and if (λ_k)_{k∈N} denotes the sequence of eigenvalues of −L,
\[
\sum_{k\in\mathbb{N}} e^{-\lambda_k t} \le K(t)\,L\Big(\frac t2\Big)^2\int_E w^2\,d\mu
\]
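The passage from the pointwise bound (7.4.3) to the trace bound of the corollary can be sketched in two lines. The computation assumes that P_t has a symmetric density kernel with respect to μ and that the spectrum of −L is discrete, as in the corollary.

```latex
% Assumption: P_t has a symmetric kernel p_t(x,y) with respect to \mu
% and -L has discrete spectrum (\lambda_k)_{k \in \mathbb{N}}.
\begin{align*}
\int_E\!\int_E p_t(x,y)^2\,d\mu(x)\,d\mu(y) &= \int_E p_{2t}(x,x)\,d\mu(x) = \sum_{k\in\mathbb{N}} e^{-2\lambda_k t}, \\
\sum_{k\in\mathbb{N}} e^{-\lambda_k t} = \int_E p_t(x,x)\,d\mu(x)
 &\le K(t)\,L\Big(\frac t2\Big)^2 \int_E w^2\,d\mu \qquad \text{by (7.4.3) with } y = x .
\end{align*}
```

The first line is the Hilbert-Schmidt norm of P_t, which is finite as soon as w ∈ L²(μ).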
Of course, since in general w is not bounded and goes to infinity at infinity, the
lower bound is only useful (for a given t > 0) on a compact subset of E × E, this
subset enlarging to the full space when t goes to infinity.
7.5 Weak Poincaré Inequalities 373
We conclude this section with some examples. Starting from a classical Nash inequality for a given operator L, it is not hard to deduce weighted Nash inequalities for the operator L − Γ(W, ·), that is to pass from a Nash inequality for some measure μ to a weighted Nash inequality for the (weighted) measure e^{−W} dμ. The procedure is similar to the technique developed for entropy-energy inequalities in Sect. 7.3. We briefly illustrate it in R^n for the Lebesgue measure dx and the usual Nash inequality from (6.2.4), p. 281 (the norms being understood with respect to dx). Considering dμ = e^{−W} dx, choose w = e^{W/2} as weight function. Changing then f into f e^{−W/2} in the above Nash inequality, after integration by parts, it follows that for every smooth compactly supported function f : R^n → R,
\[
\Big(\int_{\mathbb{R}^n} f^2\,d\mu\Big)^{(n+2)/n} \le C_n\Big[\int_{\mathbb{R}^n}|\nabla f|^2\,d\mu + \int_{\mathbb{R}^n} f^2 R\,d\mu\Big]\Big(\int_{\mathbb{R}^n}|f|\,w\,d\mu\Big)^{4/n} \tag{7.4.4}
\]
one may obtain weighted Nash inequalities with w(x) = p(x)1/2 (1 + x 2 )−β for any
β > 0. In this case, for constants C > 0 and δ ∈ (0, 1) depending on α and β, one
may choose Φ(r) = C(1 + r)^δ, r ∈ (0, ∞). Thanks to the introduction of the weight, the function Φ may be chosen to be similar to the one used for the classical Nash inequalities, even in the case α > 2 where ultracontractive bounds hold. But it also extends to non-ultracontractive settings corresponding to 1 < α ≤ 2. When α = 1 however, it is known that the spectrum of the associated operator is not discrete (cf. p. 189).
For simplicity of exposition, we deal here with a Markov Triple (E, μ, Γ) (Standard would be enough) where μ is a probability measure. Similar (useful) statements may be obtained in the sigma-finite case (in which case variances have to be replaced by L²(μ)-norms).
At a technical level, weak Poincaré inequalities control how a sequence (f_k)_{k∈N} of (suitable) functions converges to a constant in the weakest sense (in measure) when E(f_k) → 0. Since E(f) = E(f − a) for any a ∈ R, the control of E(f_k) cannot say anything about the control of this constant, justifying the investigation of a tool such as weak Poincaré inequalities.
Before presenting the notion of weak Poincaré inequality itself, it is useful to study, as a preliminary investigation, some aspects of convergence in measure. Recall that a sequence (f_k)_{k∈N} of real-valued measurable functions on (E, F, μ) converges in measure (or in probability, or in L^0(μ)) to a measurable function g, if, for
The first statement tries to characterize the oscillation of a given function under convergence in measure. Dealing with functions which are a priori not integrable, we work instead with medians. Given f measurable on (E, F), the set of m ∈ R such that μ(f ≥ m) ≥ 1/2 and μ(f ≤ m) ≥ 1/2 is a non-empty bounded interval (possibly reduced to one point). For simplicity, we agree to define the median m(f) of f to be the middle point of this interval. On the other hand, for a given measurable
Obviously Tμ (f ) ≤ Tμ∗ (f ) ≤ 1.
limk→∞ E fk dμ = 0.
Definition 7.5.2 (Weak Poincaré inequality) A Markov Triple (E, μ, Γ), with μ a probability measure, is said to satisfy a weak Poincaré inequality WP(!) with respect to a growth function ! : (0, ∞) → R_+ bounded by 1 and such that
Proof One direction is obvious thanks to Proposition 7.5.1. For the converse impli-
cation, consider
!(r) = sup Tμ (f ) ; E(f ) ≤ r , r ∈ (0, ∞).
! is clearly increasing and bounded (by 1), and the fact that limr→0 !(r) = 0 is a
direct consequence of the hypothesis. Indeed, if not, there exist ε > 0 and a sequence
(fk )k∈N in D(E) such that limk→∞ E(fk ) = 0 and Tμ (fk ) ≥ ε for every k. But now,
each fk may be translated so that m(fk ) = 0, thus contradicting the hypothesis.
Corollary 7.5.4 A tight Nash inequality N(Φ) implies a weak Poincaré inequality. The same holds with a tight entropy-energy inequality EE(Φ).
Weak Poincaré inequalities are aimed at tightening functional inequalities. The following is an illustration of the principle in the context of (linearized) Nash inequalities. It may be applied similarly to most functional inequalities investigated in this monograph.
We start with a technical proposition describing the main tightening step.
Proposition 7.5.6 Let (E, μ, Γ) be a Markov Triple such that, for constants s_1, β_1 > 0,
\[
\|f\|_2^2 \le s_1\,\mathcal{E}(f) + \beta_1\,\|f\|_1^2 \tag{7.5.3}
\]
for every f ∈ D(E). Assume furthermore that for some s_2 > 0 and 0 < γ_2 < 1/(β_1 + 1), and any f ∈ D(E) bounded by 1,
To deduce this corollary from Proposition 7.5.6, simply observe that by linearization, both a generalized Nash or entropy-energy inequality imply (7.5.3), while (7.5.4) holds under a weak Poincaré inequality since in its linearized form (7.5.2), lim_{s→∞} γ(s) = 0. Concerning the second part of the statement, by Corollary 7.5.4, a weak Poincaré inequality holds under a tight Nash inequality N(Φ_1) or a tight EE(Φ_2) inequality, and therefore also a Poincaré inequality. Then Φ_1 may be chosen such that Φ_1(r) ≤ Cr + 1 (by Proposition 7.4.2) and Φ_2(r) ≤ Cr (from the discussion p. 350), r > 0, for some constant C > 0.
\[
\int_E f_R^2\,d\mu \le \frac{1}{16R^2} + s_2\,\mathcal{E}(f) + \gamma_2 R^2,
\]
\[
1 \le (s_1+s_2)\,\mathcal{E}(f) + \frac{\beta_1+1}{16R^2} + \gamma_2 R^2 + \frac12
 \le (s_1+s_2)\,\mathcal{E}(f) + \frac12\sqrt{\gamma_2(\beta_1+1)} + \frac12
\]
after optimizing in R > 0. Hence 1 − √(γ_2(β_1 + 1)) ≤ 2(s_1 + s_2) E(f) which amounts to the Poincaré inequality P(C) since ∫_E f dμ = 0 and ∫_E f² dμ = 1. The proof is complete.
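The optimization in R used in the last step is elementary and can be written out as follows. The letters a and b are introduced here only for the computation.

```latex
% Elementary fact: for a, b > 0, \min_{R>0}\,(a R^{-2} + b R^2) = 2\sqrt{ab}, attained at R^2 = \sqrt{a/b}.
\begin{align*}
a = \frac{\beta_1+1}{16},\quad b = \gamma_2: \qquad
\min_{R>0}\Big[\frac{\beta_1+1}{16R^2} + \gamma_2 R^2\Big]
 &= 2\sqrt{\frac{\gamma_2(\beta_1+1)}{16}} = \frac12\sqrt{\gamma_2(\beta_1+1)}, \\
1 \le (s_1+s_2)\,\mathcal{E}(f) + \frac12\sqrt{\gamma_2(\beta_1+1)} + \frac12
 \ &\Longleftrightarrow\ 1 - \sqrt{\gamma_2(\beta_1+1)} \le 2(s_1+s_2)\,\mathcal{E}(f).
\end{align*}
```

The left-hand side of the last inequality is strictly positive precisely because γ_2 < 1/(β_1 + 1), which is where this hypothesis of Proposition 7.5.6 enters.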
Proposition 7.5.8 Let (E, μ, Γ) be a Markov Triple such that for some measurable set K ⊂ E with μ(K) > 0, some constant C > 0 and any f ∈ D(E),
\[
\int_K f^2\,d\mu \le C\,\mathcal{E}(f) + \frac{1}{\mu(K)}\Big(\int_K f\,d\mu\Big)^2. \tag{7.5.5}
\]
Therefore, if (7.5.5) holds for a sequence (K_ℓ)_{ℓ∈N} of measurable subsets of E such that lim_{ℓ→∞} μ(K_ℓ^c) = 0, a weak Poincaré inequality holds.
To conclude this section, we briefly describe how weak Poincaré inequalities are
related to heat kernel bounds, convergence to equilibrium and tail estimates. The
strategy is entirely similar to the one developed under generalized or weighted Nash
inequalities in the preceding sections. The procedure applies similarly to any func-
tional inequality of the form
Q(f ) ≤ E(f )
for any f ∈ D(E) where Q is a positive functional which is closed under (Pt )t≥0 and
for which Q(Pt f ) ≤ M(t)Q(f ), t ≥ 0, provided (r) is increasing and r −1 (r)
Choose r large enough so that q(r) ≤ 1/4 and then s such that !(q(r)/s²) ≤ q(r)/4, which may be achieved for
s = [ q(r) / !^{−1}(q(r)/4) ]^{1/2}.
Hence q(r + s) ≤ q(r)/2 and the conclusion follows as in Proposition 7.4.7.
This last section briefly surveys related families of functional inequalities which
have been developed for different purposes. As for the inequalities investigated in
the previous sections, each such family has its own interest and is more or less
better suited to specific issues (convergence to equilibrium, heat kernel bounds or
tail estimates). Only a few properties and illustrations are outlined here. In particular,
the guideline is the study of the family of probability measures
$d\mu_\alpha(x) = c_\alpha e^{-(1+x^2)^{\alpha/2}}dx$ defined in (7.0.1) on the real line for α ∈ [1, 2] which do satisfy
The first family comprises the Φ-entropy inequalities. Such an inequality depends
on some convex C² function Φ defined on an open interval I ⊂ R. For every
measurable function f : E → I in L¹(μ) such that $\int_E |\Phi(f)|\,d\mu < \infty$, define the
Φ-entropy of f as
\[
\mathrm{Ent}^\Phi_\mu(f) = \int_E \Phi(f)\,d\mu - \Phi\Big(\int_E f\,d\mu\Big). \tag{7.6.1}
\]
The usual variance Varμ is obtained with Φ(r) = r² on I = R and the entropy Entμ
((5.1.1), p. 236) for Φ(r) = r log r on I = (0, ∞).
Given then a function Φ : I → R as above, a Markov Triple (E, μ, Γ) is said to
satisfy a Φ-entropy inequality with constant C > 0 if
\[
\mathrm{Ent}^\Phi_\mu(f) \le \frac{C}{2}\int_E \Phi''(f)\,\Gamma(f)\,d\mu \tag{7.6.2}
\]
for every I-valued function f ∈ D(E). According to the previous examples of the
function Φ, this definition appears as a direct extension of Poincaré and logarithmic
Sobolev inequalities. To provide interesting consequences and applications, it
is necessary to supplement the convexity assumption of Φ by the further hypothesis
that Φ is C⁴ and that 1/Φ'' is concave on I. This is clearly the case for the examples of
Φ(r) = r² on I = R and Φ(r) = r log r on I = (0, ∞). In the following, only such
admissible functions Φ : I → R on an interval I ⊂ R are considered.
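The definition (7.6.1) can be illustrated on a finite state space. The sketch below is only an illustration: the three-point probability measure `mu`, the function `f` and the helper name `phi_entropy` are assumptions made here, not objects from the text. It recovers the variance for Φ(r) = r² and the entropy for Φ(r) = r log r.

```python
import math

def phi_entropy(phi, f, mu):
    """Phi-entropy on a finite space: sum phi(f_i) mu_i - phi(sum f_i mu_i)."""
    mean = sum(fi * mi for fi, mi in zip(f, mu))
    return sum(phi(fi) * mi for fi, mi in zip(f, mu)) - phi(mean)

# Hypothetical three-point probability space and positive function.
mu = [0.2, 0.5, 0.3]
f = [1.0, 2.0, 4.0]

# Phi(r) = r^2 gives the variance of f.
variance = phi_entropy(lambda r: r * r, f, mu)

# Phi(r) = r log r gives the entropy of f (f must take values in (0, infinity)).
entropy = phi_entropy(lambda r: r * math.log(r), f, mu)

print(variance, entropy)
```

Both quantities are non-negative by Jensen's inequality, since Φ is convex in each case.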
The Φ-entropy inequalities satisfy the standard stability properties by bounded
perturbation and by tensorization (which is less easy), analogous to the correspond-
ing ones for Poincaré and logarithmic Sobolev inequalities. They may also be char-
acterized by an exponential decay along the underlying semigroup, and hold under
curvature conditions analogous to those for Poincaré and logarithmic Sobolev in-
equalities discussed in Chaps. 4 and 5. The next statement summarizes the latter
two results.
for every t ≥ 0 and every I-valued function f ∈ L¹(μ) such that $\int_E |\Phi(f)|\,d\mu < \infty$.
Furthermore, under the curvature condition CD(ρ, ∞) for some ρ > 0, the Triple
(E, μ, Γ) satisfies a Φ-entropy inequality (7.6.2) with constant C = 1/ρ.
The proof of the first part of the proposition is similar to that of Theorem 5.2.1,
p. 244. The second part relies on the arguments put forward in the proof of
Proposition 5.5.3, p. 261, via the analysis of the function $s \mapsto P_s(\Phi(P_{t-s}f))$, s ∈ [0, t],
t > 0, for f : E → I in $\mathcal{A}_0^{\mathrm{const}}$, and the use of the admissibility property of Φ.
384 7 Generalized Functional Inequalities
One example of interest in the family of Φ-entropy inequalities concerns the choice
of
\[
\Phi(r) = \frac{r^p - 1}{p - 1}, \quad r \in (0, \infty),
\]
which is admissible only if p ∈ (1, 2], and includes the case Φ(r) = r log r in the
limit p = 1. For this family of functions Φ, the Φ-entropy inequality takes the form
(after changing f ≥ 0 into $f^{2/p}$ and letting $q = \frac{2}{p} \in [1, 2]$)
\[
\frac{1}{2 - q}\bigg[\int_E f^2\,d\mu - \Big(\int_E |f|^q\,d\mu\Big)^{2/q}\bigg] \le C\,\mathcal{E}(f) \tag{7.6.3}
\]
for every f ∈ D(E), which is known as the Beckner inequality $B_q(C)$, q ∈ [1, 2],
with constant C > 0. According to the study of Sect. 7.4, $B_1(C)$ corresponds to the
Poincaré inequality P(C), while (in the limit) $B_2(C)$ amounts to the logarithmic
Sobolev inequality LS(C). The inequality $B_q(C)$ is also part of the family (6.8.7)
of Remark 6.8.4, p. 312 (with n = ∞).
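The two endpoint cases of (7.6.3) can be checked numerically on a finite space. This is a minimal sketch under assumptions made here: the three-point measure `mu`, the positive function `f`, and the helper names are hypothetical. For q = 1 the left-hand side of (7.6.3) is the variance of |f|, and as q → 2 it approaches one half of Entμ(f²), which matches LS(C) written as Entμ(f²) ≤ 2C E(f).

```python
import math

def beckner_lhs(f, mu, q):
    """Left-hand side of (7.6.3) on a finite space, for 1 <= q < 2."""
    second_moment = sum(fi * fi * mi for fi, mi in zip(f, mu))
    q_moment = sum(abs(fi) ** q * mi for fi, mi in zip(f, mu))
    return (second_moment - q_moment ** (2.0 / q)) / (2.0 - q)

def variance(f, mu):
    mean = sum(fi * mi for fi, mi in zip(f, mu))
    return sum((fi - mean) ** 2 * mi for fi, mi in zip(f, mu))

def entropy_of_square(f, mu):
    """Ent_mu(f^2) = int f^2 log f^2 dmu - (int f^2 dmu) log(int f^2 dmu)."""
    second_moment = sum(fi * fi * mi for fi, mi in zip(f, mu))
    return (sum(fi * fi * math.log(fi * fi) * mi for fi, mi in zip(f, mu))
            - second_moment * math.log(second_moment))

# Hypothetical data: a positive function on a three-point probability space.
mu = [0.2, 0.5, 0.3]
f = [1.0, 2.0, 4.0]

at_q1 = beckner_lhs(f, mu, 1.0)            # equals Var(|f|)
near_q2 = beckner_lhs(f, mu, 2.0 - 1e-6)   # close to Ent(f^2) / 2

print(at_q1, variance(f, mu))
print(near_q2, 0.5 * entropy_of_square(f, mu))
```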
By convexity of the map $r \mapsto \log(\|f\|_{1/r})$, r ∈ (0, 1], the expression
\[
\frac{1}{\frac{1}{q} - \frac{1}{2}}\bigg[\int_E f^2\,d\mu - \Big(\int_E |f|^q\,d\mu\Big)^{2/q}\bigg]
\]
where C1 > 0 depends only on C and p (and a = 2(1 − 1/p)). The modified
logarithmic Sobolev inequality implies the concentration result for all p > 1 while
the Latała-Oleszkiewicz inequalities are valid only for p ∈ [1, 2]. Again, such tail
estimates capture the correct behaviors of the model measures μα .
on the Borel sets of $\mathbb{R}^n$. Recall that the density of μα is essentially $e^{-|x|^\alpha}$ but it is
convenient to smooth out the potential to avoid regularity issues. As for the Markov
Triple, the carré du champ operator is the usual one Γ(f) = |∇f|² on smooth functions
f on $\mathbb{R}^n$. The associated Markov semigroup is denoted by (Pt)t≥0.
According to (7.4.4), for all α > 0, μα satisfies a weighted Nash inequality
N(Φ) with growth function $\Phi(r) = r^{n/(n+2)}$, r ∈ (0, ∞), and weight
$w(x) = e^{-(1+|x|^2)^{\alpha/2}/2}$, $x \in \mathbb{R}^n$. In particular, following Proposition 7.4.9, the semigroup
(Pt)t≥0 has density kernels pt(x, y), t > 0, $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, with respect to μα
satisfying
pt (x, y) ≤ C(t)w(x)w(y)
with C(t) > 0. This inequality however does not yield any significant information
about the spectrum since the weight is not in L2 (μα ).
We next distinguish the various ranges of α > 0, and for simplicity restrict to
the one-dimensional case. Growth and rate functions Φ, Ψ, γ, β of the various
inequalities are only given for small or large values of the variables. Constants C > 0
may change from line to line.
• α ∈ (0, 1). Only a weak Poincaré inequality is satisfied here with growth function
$\Psi(r) = Cr(-\log r)^{2(1-\alpha)/\alpha}$ for small values of r > 0. Equivalently, in
the linearized form (7.5.2), $\gamma(s) = \exp(-Cs^{\alpha/(2-2\alpha)})$ for large values of s.
Poincaré, logarithmic Sobolev, Latała-Oleszkiewicz and modified logarithmic
Sobolev inequalities are not satisfied. Moreover the essential spectrum is not
empty (cf. Corollary 4.10.9, p. 229).
• α = 1. The measure μ1 behaves as the symmetric exponential measure.
A Poincaré inequality P (C) with constant C > 0 holds (cf. (4.4.3), p. 189),
as well as a Latała-Oleszkiewicz inequality with a = 0 or a modified logarithmic
Sobolev inequality with p = 1. A weak Poincaré inequality holds with
Ψ(r) = Cr, r ∈ (0, ∞), or γ(s) = 0 for every s ≥ C (with C the Poincaré constant).
The measure μ1 does not satisfy any logarithmic Sobolev inequality (see
for example Theorem 5.4.5, p. 256). Even if there is a spectral gap, the essential
spectrum is not empty (cf. p. 229).
7.8 Notes and References 387
• α ∈ (1, 2). A Poincaré inequality P (C) holds (see Theorem 4.5.1, p. 194), and
then also a weak Poincaré inequality with Ψ(r) = Cr, r ∈ (0, ∞). Both the
Latała-Oleszkiewicz inequality for any 0 ≤ a ≤ 2(1 − 1/α), and the modified
logarithmic Sobolev inequality for any 1 ≤ p ≤ α, hold. On the other hand, there is
no logarithmic Sobolev inequality or entropy-energy inequality. However, a
generalized Nash inequality N(Φ) holds with $\Phi(r) = Cr(\log r)^{-2(\alpha-1)/\alpha}$ for large
r > 0, as well as, for any κ ∈ (0, 1), a weighted Nash inequality WN(Φ) with
$\Phi(r) = C(1 + r)^\kappa$, r ∈ (0, ∞), and weight
\[
w(x) = (1 + x^2)^{-\sigma}\, e^{-(1+x^2)^{\alpha/2}/2}, \quad x \in \mathbb{R},
\]
where σ > 0. For σ large enough, w ∈ L2 (μα ) so that the operator Pt is Hilbert-
Schmidt for any t > 0 (while not ultracontractive). In particular, the essential
spectrum is empty (cf. the end of Sect. 7.4.3).
• α = 2. The measure μ2 behaves as the Gaussian measure and in particular satis-
fies a Poincaré and a logarithmic Sobolev inequality. All the properties of the case
α ∈ (1, 2) extend to α = 2, but due to the additional logarithmic Sobolev inequal-
ity, the semigroup (Pt )t≥0 is hypercontractive (while still not ultracontractive),
cf. Chap. 5.
• α > 2. Both the Poincaré and the logarithmic Sobolev inequalities hold. As
for α ∈ (1, 2], a weak Poincaré inequality holds with Ψ(r) = Cr, r ∈ (0, ∞),
where C > 0 is the Poincaré constant. The Latała-Oleszkiewicz inequality for
any a ∈ [0, 1] and the modified logarithmic Sobolev inequality for any p ∈ [1, α]
hold. An entropy-energy inequality EE(Φ) holds with $\Phi(r) = C(1 + r^{\alpha/(2\alpha-2)})$,
r ∈ (0, ∞), (cf. Sect. 7.3) as well as a generalized Nash inequality N(Φ) with
$\Phi(r) = Cr(\log r)^{2(1-\alpha)/\alpha}$ for large r > 0 (cf. p. 366). (Weighted Nash inequalities
may be established for suitable weights but are not useful as compared to
the generalized Nash inequality.) The semigroup (Pt )t≥0 is ultracontractive, the
density kernels are uniformly bounded for every t > 0, and hence the spectrum is
discrete.
mainly follow [26]. The optimal Euclidean bounds under the sharp Euclidean log-
arithmic Sobolev inequality presented in Corollary 7.1.6 and Proposition 7.2.4 are
taken from [34]. The optimal heat kernel bounds (7.1.8) are part of a much deeper
investigation of Gaussian kernels by E. Lieb [287]. More on heat kernel bounds in
manifolds and metric spaces may be found in [217].
The examples of Sect. 7.3 are taken from [265] (where a more probabilistic view-
point is emphasized) and [34]. Later and deeper refinements along these lines in-
clude in particular [46].
The ultracontractive bounds from the generalized Nash inequality (Theo-
rem 7.4.5 in Sect. 7.4) are due to M. Tomisaki [411] and T. Coulhon [139]. The
generalized Nash inequalities have also been considered in the equivalent lin-
earized form (7.4.1) called super Poincaré inequalities, extensively developed by
F.-Y. Wang (cf. [431] for a complete account as well as further bibliographical
information in the context of probability theory). They are presented here in the
equivalent language of generalized Nash inequalities which offer a somewhat more
flexible treatment. The results of Sect. 7.4.1, in particular tail estimates, are mainly
suitable translations of Wang’s main contributions in this context. Weighted Nash
inequalities were investigated in [430] and more recently in [32]. The results of
Sect. 7.4.3 are essentially taken from these references. Other types of generalized
Nash inequalities have been considered in the literature, in particular for infinite-
dimensional models [63, 254, 445].
Weak Poincaré inequalities, and variations in the form of positivity improving
properties, as presented in Sect. 7.5 were first used by S. Aida [4] and P. Math-
ieu [299] as a tool to tighten defective logarithmic Sobolev inequalities. The argu-
ment was further refined in for instance [361], and then extensively developed in the
monograph [431] to which we refer for further information and complete references.
Section 7.5 tries to offer a synthesized treatment to tighten arbitrary functional in-
equalities. The topic of Remark 7.5.9 has been studied extensively in the literature
(cf. e.g. [431] and the references therein).
Section 7.6 collects various forms of functional inequalities which have been
studied in recent years. Each form has its own advantages and privileged applications.
The Φ-entropic inequalities were first considered in [36], and further developed
in [84, 120, 238]. The Beckner inequality appeared in [55] for the Gaussian
measure (cf. also in a manifold setting [66]) and the Latała-Oleszkiewicz inequality
was introduced in [272]. Modified logarithmic Sobolev inequalities were considered
in, among other references, [79] and [441], and later extended in [48, 199–201, 261].
See also [80] for further developments in connection with isoperimetric bounds. We
refer to these references for the corresponding tensorization and tail properties, in
particular with the measures μα as model examples.
To conclude, and in addition to the families of Sect. 7.6, we mention here
the further family of F -Sobolev inequalities. It is established in [46] that for
α ∈ (1, 2), the probability measure μα on the line satisfies
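\[
\int_{\mathbb{R}} f^2 \log^\beta\big(1 + f^2\big)\,d\mu_\alpha - \int_{\mathbb{R}} f^2\,d\mu_\alpha\, \log^\beta\Big(1 + \int_{\mathbb{R}} f^2\,d\mu_\alpha\Big) \le C \int_{\mathbb{R}} f'^2\,d\mu_\alpha
\]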
for every smooth f : R → R, where β = 2(1 − 1/α). The suitable general form for a
Markov Triple (E, μ, Γ) appears as
\[
\int_E f^2 F\big(f^2\big)\,d\mu \le C\,\mathcal{E}(f) + D, \tag{7.8.1}
\]
The capacity of a set is a way to measure its size from the point of view of potential
theory. The theme of this chapter is inequalities comparing measure and capacity
uniformly over a given class of sets as equivalent forms of functional inequalities.
The passage from sets to functions is usually performed through the use of level sets
of functions and co-area formulas. Inequalities on sets often offer more flexibility
and a more transparent description of the hierarchy between functional inequalities.
Moreover, measure-capacity inequalities allow for the description of more general
forms of functional inequalities, dealing with Orlicz norms or with different mea-
sures on Lp -spaces and Dirichlet forms, and it is in these contexts that they are
most naturally studied. Furthermore, measure-capacity inequalities produce criteria
to satisfy a functional inequality much in the spirit of the Muckenhoupt character-
izations in dimension one (cf. Sect. 4.5.1, p. 194, and Sect. 5.4, p. 252), although
these criteria might appear less useful for exhibiting examples in higher dimension.
Capacities, and measure-capacity inequalities, may be investigated (and have
been considered in the literature) in rather large generality, but we only concen-
trate here on the so-called 2-capacities which capture the relevant Dirichlet form
information on sets. A second form of interest are the 1-capacities which are related
to boundary or surface measures. Measure-capacity inequalities in this case are then
isoperimetric-type inequalities comparing the measure of a set with the measure
of its boundary. Indeed, the first part of this chapter suitably adapts to the Markov
Triple framework some of the basic tools and results of the theory of capacities,
translating functional inequalities as measure-capacity inequalities. The second part
of the chapter is devoted to the particular case of Gaussian isoperimetric-type in-
equalities as an illustration of the power of heat flow techniques.
The context of this chapter is the traditional one of a (Standard or Full) Markov
Triple (E, μ, Γ) with Dirichlet form E, generator L and semigroup P = (Pt)t≥0, as
summarized in Sect. 3.4, p. 168. Full Markov Triples are actually only used in the
study of the local heat kernel measures in the second part of this chapter, starting
with Sect. 8.5, where curvature-dimension inequalities come into play. To translate,
for comparison, the functional isoperimetric-type inequalities into more (classical)
geometric statements, we will sometimes assume in addition that the intrinsic
distance d defines a Polish topology on (E, μ, Γ).
As announced, the first part of this chapter mainly revisits from the viewpoint
of sets some of the most important functional inequalities studied in the former
chapters. The first section introduces the basic notions associated with capacities and
the main technical tool to transfer (and back) functional inequalities into measure-
capacity inequalities. This tool may be seen as the essence of the slicing method
developed in previous chapters and is related to the famous co-area formulas in
geometric measure theory. In Sect. 8.2, Sobolev-type inequalities are described via
measure-capacity inequalities by a simple Orlicz space duality argument. Poincaré
and logarithmic Sobolev inequalities are handled in Sect. 8.3, and Nash and weak
Poincaré inequalities are examined in Sect. 8.4.
Replacing the 2-capacity by the boundary measure, measure-capacity inequali-
ties turn into isoperimetric-type inequalities. In this direction, Sect. 8.5 investigates,
with the heat kernel tools under a curvature condition, measure-capacity inequal-
ities of isoperimetric-type, leading in particular to the Gaussian isoperimetric in-
equality as well as to comparison results under the curvature condition CD(ρ, ∞).
On the basis of these developments, Sect. 8.6 revisits the infinite-dimensional Har-
nack inequalities of Chap. 5, Sect. 5.6, p. 265, by providing a new approach via
a reverse isoperimetric-type inequality along the semigroup. Combining the heat
kernel isoperimetric inequality with its reverse form yields furthermore an isoperi-
metric-type Harnack inequality. Finally, Sect. 8.7 addresses the relationships be-
tween Poincaré and logarithmic Sobolev inequalities and the resulting concentra-
tion properties raised in Chaps. 4 and 5. While the latter are not enough in general
to entail Poincaré and logarithmic Sobolev inequalities, they actually do under pos-
itive curvature bounds. More precise comparisons of the isoperimetric functions are
moreover available in this context.
For a Markov Triple (E, μ, Γ) with associated Dirichlet form $\mathcal{E}(f) = \int_E \Gamma(f)\,d\mu$,
f ∈ D(E), we first introduce the notion of the capacity of a measurable subset of E.
Capacities related to Lp (μ)-norms of the gradient for any p ≥ 1 are defined in the
same way, but as announced, only the value p = 2 is considered here.
8.1 Capacity Inequalities and Co-area Formulas 393
where the infimum is taken over all functions f in D(E) such that 1A ≤ f ≤ 1 (the
latter expression is +∞ if the infimum is taken over the empty set). The quantity
Capμ(A) is usually referred to as the 2-capacity of A relative to the measure μ and
the Dirichlet form E or carré du champ operator Γ.
For measurable subsets A ⊂ B ⊂ E, the capacity Capμ (A, B) is defined by re-
stricting E to B, and is referred to as the 2-capacity of A with respect to B relative
to μ and E or Γ. In particular Capμ(A, E) = Capμ(A). From the very definition,
Capμ (A, B) is increasing in A and decreasing in B.
If the constant function 1 belongs to D(E), as is the case for example when the
measure μ is finite, then Capμ (A) = 0 for every measurable set A ⊂ E. In order
to overcome this difficulty, introduce a modified capacity defined on all measurable
sets A with μ(A) ≤ 1/2 by
\[
\mathrm{Cap}^*_\mu(A) = \inf\Big\{\mathrm{Cap}_\mu(A, B)\,;\ A \subset B,\ \mu(B) \le \frac{1}{2}\Big\}. \tag{8.1.2}
\]
for all A with μ(A) ≤ 1/2. (In particular, it is enough in this case to know Φ on [0, 1/2].)
for every measurable set A in E (such that ν(A) < ∞). If ν is finite and normalized
into a probability measure, set
\[
\mathrm{Cap}^*_{\mu,\nu}(A) = \inf\Big\{\mathrm{Cap}_\mu(A, B)\,;\ A \subset B,\ \nu(B) \le \frac{1}{2}\Big\}
\]
and the second part of the definition then takes the form
ν(A) ≤ Cap∗μ,ν (A)
The purpose of this chapter is to relate Definition 8.1.1 to the functional inequal-
ities studied in the previous chapters. When μ has infinite mass, Sect. 8.2 studies
how (8.1.4) of Definition 8.1.1 is an alternative description of a Sobolev inequality
(for a suitable choice of the growth function Φ). When μ is a probability measure,
(8.1.5) will be related, in Sects. 8.3 and 8.4, according again to the growth
of Φ, to Poincaré, logarithmic Sobolev and other classical inequalities (such as
generalized Nash or weak Poincaré inequalities).
The key to this transfer is a suitable description of the energy of a function in
terms of capacities of its level sets, through versions of the so-called co-area for-
mula for 2-capacities, and achieved at the technical level by the slicing method. The
standard form of the co-area formula will be presented below in Theorem 8.5.1, but
a first formula at the level of 2-capacities is stated next.
Proof We use the classical slicing principle on the basis of Proposition 3.1.17,
p. 137. Letting f ∈ D(E), since the sets Nr are decreasing in r,
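\[
\begin{aligned}
2\int_0^\infty \mathrm{Cap}_\mu(N_r)\,r\,dr &= \sum_{k\in\mathbb{Z}} \int_{2^k}^{2^{k+1}} \mathrm{Cap}_\mu(N_r)\,d\big(r^2\big) \\
&\le \sum_{k\in\mathbb{Z}} \big(2^{2k+2} - 2^{2k}\big)\,\mathrm{Cap}_\mu(N_{2^k}) \\
&= 3\sum_{k\in\mathbb{Z}} 2^{2k}\,\mathrm{Cap}_\mu(N_{2^k}).
\end{aligned} \tag{8.1.7}
\]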
396 8 Capacity and Isoperimetric-Type Inequalities
The two functions Φ and ϒ are convex increasing on R+ and dual with respect to
the Legendre-Fenchel transform ($\Phi(r) = \Upsilon^*(r) = \sup_{s\in\mathbb{R}_+} [rs - \Upsilon(s)]$, r ∈ R+),
and form a pair of Orlicz functions. If μ is a measure on (E, F), the Orlicz space
$L^\Phi(\mu)$ is defined as the space of measurable functions f : E → R such that
\[
\|f\|_\Phi = \sup\Big\{\int_E |f|\,g\,d\mu\,;\ g \ge 0,\ \int_E \Upsilon(g)\,d\mu \le 1\Big\} < \infty. \tag{8.2.1}
\]
It is easily checked, using convexity and Jensen's inequality, that for every measurable
subset A ⊂ E with μ(A) < ∞,
\[
\|1_A\|_\Phi = \mu(A)\,\Upsilon^{-1}\Big(\frac{1}{\mu(A)}\Big). \tag{8.2.2}
\]
As a main example, if 1 < p < ∞ and $\frac{1}{p} + \frac{1}{p^*} = 1$, for $\Upsilon(r) = r^{p^*}$, then
$\Phi(r) = \frac{(p-1)^{p-1}}{p^p}\,r^p$, r ∈ R+, $L^\Phi(\mu)$ is the usual Lebesgue space $L^p(\mu)$ and $\|\cdot\|_\Phi = \|\cdot\|_p$.
(The results below for the case p = 1 may simply be obtained by taking the limit
p → 1.) Observe also that when $\Upsilon(r) = e^r - 1$, r ∈ R+, comparison with the
variational formula for entropy (5.1.3), p. 236, shows that for a probability measure μ
and f ≥ 0, $\mathrm{Ent}_\mu(f) \le \|f\|_\Phi$, although these quantities are not equal in general.
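The constant in the main example can be recovered by a direct Legendre-Fenchel computation. The following worked derivation is a sketch; the intermediate symbol $s_0$ for the maximizer is introduced here for illustration.

```latex
% Legendre-Fenchel transform of \Upsilon(s) = s^{p^*}, with 1/p + 1/p^* = 1.
\[
\frac{d}{ds}\big(rs - s^{p^*}\big) = r - p^* s^{p^*-1} = 0
\quad\Longleftrightarrow\quad
s_0 = \Big(\frac{r}{p^*}\Big)^{\frac{1}{p^*-1}} = \Big(\frac{r}{p^*}\Big)^{p-1},
\]
% using 1/(p^* - 1) = p - 1. Since s_0^{p^*-1} = r/p^*, the maximal value is
\[
r s_0 - s_0^{p^*} = r s_0\Big(1 - \frac{1}{p^*}\Big) = \frac{r}{p}\Big(\frac{r}{p^*}\Big)^{p-1}
= \frac{(p-1)^{p-1}}{p^p}\, r^p ,
\]
% where the last step uses p^* = p/(p-1).
```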
The following proposition is a model example of the equivalence between capac-
ity and functional inequalities.
Proposition 8.2.1 Let (E, μ, Γ) be a Markov Triple and (Φ, ϒ) be a pair of dual
Orlicz functions. The following assertions are equivalent.
(i) There is a constant C1 > 0 such that for all measurable subsets A ⊂ E with
μ(A) < ∞,
\[
\mu(A)\,\Upsilon^{-1}\Big(\frac{1}{\mu(A)}\Big) \le C_1\,\mathrm{Cap}_\mu(A). \tag{8.2.3}
\]
(ii) There is a constant C2 > 0 such that for all functions f ∈ D(E),
\[
\big\|f^2\big\|_\Phi \le C_2\,\mathcal{E}(f). \tag{8.2.4}
\]
for all bounded measurable subsets A ⊂ Rn . While not necessarily of practical value
in order to reach the standard Sobolev inequality, this equivalence produces a some-
what different view of its meaning.
Proof of Proposition 8.2.1 Start with the implication from (i) to (ii). For any function
f ∈ D(E), setting as above Nr = {|f| > r}, r > 0,
\[
\begin{aligned}
\big\|f^2\big\|_\Phi = \sup \int_E f^2 g\,d\mu &= 2 \sup \int_0^\infty \Big(\int_{N_r} g\,d\mu\Big)\, r\,dr \\
&\le 2 \int_0^\infty \sup \Big(\int_{N_r} g\,d\mu\Big)\, r\,dr,
\end{aligned}
\]
where the supremum is taken over all positive measurable functions g : E → R such
that $\int_E \Upsilon(g)\,d\mu \le 1$. Hence,
\[
\big\|f^2\big\|_\Phi \le 2 \int_0^\infty \|1_{N_r}\|_\Phi\, r\,dr.
\]
The conclusion then follows from (8.2.3) and Theorem 8.1.2 with C2 = 12C1. Turning
to the converse implication, given f ∈ D(E) such that 1A ≤ f ≤ 1 for A ⊂ E,
\[
\big\|f^2\big\|_\Phi \ge \|1_A\|_\Phi
\]
for all functions f ∈ D(E) with compact support in O. The optimal constant C2 > 0
describes a lower bound of the spectrum of the restriction of the operator L on
O with Dirichlet boundary conditions in the spirit of the Faber-Krahn inequalities
(Remark 6.2.4, p. 284). More precisely, in a classical formulation,
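\[
\frac{1}{12}\,\mathrm{Cap}_{\#}(O) \le \lambda(O) \le \mathrm{Cap}_{\#}(O)
\]
where $\mathrm{Cap}_{\#}(O) = \inf_{A\subset O} \frac{\mathrm{Cap}_\mu(A, O)}{\mu(A)}$ and $\lambda(O) = \inf \frac{\mathcal{E}(f)}{\int_O f^2 d\mu}$ where the infimum
runs over all non-zero functions f ∈ D(E) with compact support in O.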
Proposition 8.3.1 Let (E, μ, Γ) be a Markov Triple. Assume that for some constant
C1 > 0,
\[
\mu(A) \le C_1\,\mathrm{Cap}^*_\mu(A) \tag{8.3.1}
\]
for all subsets A ⊂ E such that μ(A) ≤ 1/2. Then μ satisfies a Poincaré inequality
P(C2) with C2 = 12C1. Conversely, under a Poincaré inequality P(C2), (8.3.1)
holds with C1 = 2C2.
Proof Let f ∈ D(E) and denote by m a median of f for μ (cf. p. 374). Setting
F+ = (f − m)+ and F− = (f − m)− ,
\[
\mathrm{Var}_\mu(f) \le \int_E (f - m)^2\,d\mu = \int_E F_+^2\,d\mu + \int_E F_-^2\,d\mu.
\]
With the same inequality for F−, the announced Poincaré inequality P(C2) holds
with C2 = 12C1, using the fact, from (3.1.17), p. 129, that
\[
\int_E \Gamma(F_+)\,d\mu + \int_E \Gamma(F_-)\,d\mu \le \mathcal{E}(f). \tag{8.3.2}
\]
Therefore
\[
\mathrm{Var}_\mu(f) \ge \big(1 - \mu(B)\big) \int_E f^2\,d\mu \ge \frac{1}{2}\,\mu(A)
\]
and, under the Poincaré inequality P (C2 ), 2C2 E(f ) ≥ μ(A). Taking the infimum
over all f ’s as above yields the announced claim. Proposition 8.3.1 is established.
8.3 Capacity and Poincaré and Logarithmic Sobolev Inequalities 401
for some C > 0 and every f in the Dirichlet domain D(E) of the Markov Triple
(E, μ, Γ). This inequality is then characterized via Proposition 8.3.1 by
for all sets A ⊂ E such that ν(A) ≤ 1/2. In particular, μ need not be finite in this
formulation.
Proposition 8.3.2 Let (E, μ, Γ) be a Markov Triple. Assume that for some constant
C1 > 0,
\[
\mu(A) \log\Big(1 + \frac{e^2}{\mu(A)}\Big) \le C_1\,\mathrm{Cap}^*_\mu(A) \tag{8.3.3}
\]
for all subsets A ⊂ E such that μ(A) ≤ 1/2. Then μ satisfies a logarithmic Sobolev
inequality LS(C2) with C2 = 12C1. Conversely, under a logarithmic Sobolev
inequality LS(C2), (8.3.3) holds with C1 = 8C2.
Again, there is a version of this statement for a pair of measures (μ, ν).
Proof The scheme of proof is rather similar to that of Proposition 8.3.1. One starting
point is Lemma 5.1.4, p. 239, which indicates that for all a ∈ R,
\[
\mathrm{Ent}_\mu\big(f^2\big) \le \mathrm{Ent}_\mu\big((f - a)^2\big) + 2 \int_E (f - a)^2\,d\mu.
\]
(choose $h = \log\big(1 + \frac{e^2}{\mu(A)}\big)\,1_A$). Combining with (8.3.3), for all A ⊂ B+ and h ∈ H+,
$\int_A h\,d\mu \le C_1\,\mathrm{Cap}_\mu(A, B_+)$, from which, by Proposition 8.2.1 and Remark 8.2.2,
\[
\int_{B_+} F_+^2\, h\,d\mu \le 12\,C_1 \int_{B_+} \Gamma(F_+)\,d\mu.
\]
Optimize then over h ∈ H+ and add the same bound for F− to reach LS(12C1 ) due
to (8.3.5) and (8.3.2).
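Turning to the converse, let B ⊂ E such that μ(B) ≤ 1/2, A ⊂ B and f ∈ D(E)
such that 1B ≥ f ≥ 1A. Then (5.1.3), p. 236, implies that
\[
\begin{aligned}
\mathrm{Ent}_\mu\big(f^2\big) \ge \sup \int_A g\,d\mu &= \mu(A) \log\Big(1 + \frac{1 - \mu(B)}{\mu(A)}\Big) \\
&\ge \mu(A) \log\Big(1 + \frac{1}{2\mu(A)}\Big)
\end{aligned}
\]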
where the supremum may be taken over all positive measurable functions g such
that $\int_B e^g\,d\mu \le 1$. Now, for every 0 < μ(A) ≤ 1/2,
\[
\frac{1}{4}\,\mu(A) \log\Big(1 + \frac{e^2}{\mu(A)}\Big) \le \mu(A) \log\Big(1 + \frac{1}{2\mu(A)}\Big)
\]
so that the claim immediately follows. The proposition is therefore established.
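The elementary comparison between the two logarithmic expressions, valid for 0 < μ(A) ≤ 1/2, can be checked numerically. The sketch below is an illustration only: the variable `a` stands for μ(A), and the uniform grid on (0, 1/2] is a choice made here. The gap is smallest relative to the size of the terms near a = 1/2, where the two sides nearly coincide.

```python
import math

def left_side(a):
    """(1/4) * a * log(1 + e^2 / a), with a standing for mu(A)."""
    return 0.25 * a * math.log(1.0 + math.e ** 2 / a)

def right_side(a):
    """a * log(1 + 1 / (2a)), with a standing for mu(A)."""
    return a * math.log(1.0 + 1.0 / (2.0 * a))

# Uniform grid of points in (0, 1/2], including the endpoint 1/2.
grid = [0.5 * (i + 1) / 10_000 for i in range(10_000)]
worst_gap = min(right_side(a) - left_side(a) for a in grid)
gap_at_half = right_side(0.5) - left_side(0.5)

print(worst_gap, gap_at_half)
```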
We start with the generalized Nash inequalities N(Φ) of Definition 7.4.1, p. 364,
that is for a (C¹ increasing and concave) growth function Φ on (0, ∞),
\[
\int_E f^2\,d\mu \le \Phi\big(\mathcal{E}(f)\big)
\]
which holds for all functions f ∈ D(E) with $\|f\|_1 = 1$ with rate function
δ : [1, ∞) → (0, ∞) (since μ is a probability measure, only the values u ≥ 1 are
relevant).
The next statement is the announced equivalence between Nash and suitable ca-
pacity inequalities.
Then μ satisfies a generalized Nash inequality (8.4.1) with associated rate func-
tion 12 δ. Conversely, assume furthermore that there exists a q ≥ 4 such that
qδ(qu) ≥ 4 δ(u) for all u ≥ 1. Then, if μ satisfies a generalized Nash in-
equality (8.4.1) with rate function δ, then it satisfies (8.4.2) with rate function
$u \mapsto 2q\,\delta\big(\frac{u}{2}\big)$.
Observe that for the classical Euclidean Nash inequality (6.2.3), p. 281 (with
A = 0), for which $\Phi(r) = Cr^{n/(n+2)}$ (although this case is formally excluded here
since the statement is restricted for convenience to probability measures), it holds
that $\mu(A)^{(n-2)/n} \le C\,\mathrm{Cap}(A)$, which is of the same form as the inequality arising
from Sobolev inequalities (Proposition 8.2.1). The conclusion here therefore recov-
ers the equivalence between Sobolev and Nash inequalities studied in Sect. 6.2,
p. 279.
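Proof Given f ∈ D(E), let m be a median of f for μ. For every u ≥ 1,
\[
\begin{aligned}
\int_E f^2\,d\mu - u \Big(\int_E |f|\,d\mu\Big)^2 &= \mathrm{Var}_\mu\big(|f|\big) - (u - 1) \Big(\int_E |f|\,d\mu\Big)^2 \\
&\le \int_E F^2\,d\mu - (u - 1) \Big(\int_E |F|\,d\mu\Big)^2
\end{aligned}
\]
where F = f − m. Since
\[
\Big(\int_E |F|\,d\mu\Big)^2 = \inf\Big\{\int_E F^2 g\,d\mu\,;\ g \ge 0,\ \int_E \frac{1}{g}\,d\mu \le 1\Big\},
\]
for every u ≥ 1,
\[
\begin{aligned}
\int_E F^2\,d\mu - (u - 1) \Big(\int_E |F|\,d\mu\Big)^2
&= \sup\Big\{\int_E F^2 h\,d\mu\,;\ h \le 1,\ \int_E (1 - h)^{-1}\,d\mu \le \frac{1}{u - 1}\Big\} \\
&\le \sup\Big\{\int_E F^2 g\,d\mu\,;\ g \in [0, 1],\ \int_E (1 - g)^{-1}\,d\mu \le 1 + \frac{1}{u - 1}\Big\}.
\end{aligned}
\]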
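It follows that
\[
\begin{aligned}
\int_E F^2\,d\mu - (u - 1) \Big(\int_E |F|\,d\mu\Big)^2
&\le \sup\Big\{\int_{B_+} F_+^2\, g\,d\mu\,;\ g \in [0, 1],\ \int_E (1 - g)^{-1}\,d\mu \le 1 + \frac{1}{u - 1}\Big\} \\
&\quad + \sup\Big\{\int_{B_-} F_-^2\, g\,d\mu\,;\ g \in [0, 1],\ \int_E (1 - g)^{-1}\,d\mu \le 1 + \frac{1}{u - 1}\Big\},
\end{aligned}
\]
(this and similar formulas remain valid in the limit case u = 1). Now, for every u ≥ 1
and a ∈ (0, 1/2], using the fact that δ(u) is decreasing and uδ(u) is increasing, it is
easily checked according to whether u ≤ 1/a or not that
\[
\frac{a}{1 + (u - 1)a} \le \frac{a\,\delta(u)}{\delta\big(\frac{1}{a}\big)}.
\]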
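The last inequality relies only on δ being decreasing and u ↦ uδ(u) being increasing. A quick numerical check can be run with any rate function of this type. The sketch below assumes the hypothetical choice δ(u) = u^(-1/2), which satisfies both monotonicity requirements, and a finite grid of values for a and u chosen here for illustration.

```python
def delta(u):
    """Hypothetical rate function: decreasing, with u * delta(u) increasing."""
    return u ** -0.5

def left_side(a, u):
    return a / (1.0 + (u - 1.0) * a)

def right_side(a, u):
    return a * delta(u) / delta(1.0 / a)

# Grid over a in (0, 1/2] and u in [1, 10^4] (geometric spacing in u).
a_values = [0.5 * (i + 1) / 200 for i in range(200)]
u_values = [10.0 ** (4.0 * j / 400) for j in range(401)]

violations = [(a, u) for a in a_values for u in u_values
              if left_side(a, u) > right_side(a, u) + 1e-12]

print(len(violations))
```

Both regimes of the case distinction are covered by the grid: for u ≤ 1/a the bound follows from δ(u) ≥ δ(1/a), and for u > 1/a from uδ(u) ≥ δ(1/a)/a.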
u ≥ 1,
\[
\tilde{\mu}(A) = \int_A g\,d\mu \le \delta(u)\,\mathrm{Cap}_\mu(A, B_+).
\]
Proposition 8.2.1 and Remark 8.2.2 applied to μ and μ̃ then yield that for every
u ≥ 1,
\[
\int_{B_+} F_+^2\, g\,d\mu \le 12\,\delta(u) \int_{B_+} \Gamma(F_+)\,d\mu.
\]
Optimizing over g together with the same inequality for F− allows us to conclude
the first assertion of the proposition.
Turning to the converse statement, let A ⊂ B with μ(B) ≤ 1/2 and let as usual f
in D(E) be such that 1A ≤ f ≤ 1B. Define for every k ∈ N, $f_k = (f - 2^{-k})_+ \wedge 2^{-k}$
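and $N_k = \{f > 2^{-k}\}$. For k ∈ N, (8.4.1) applied to $f_k$ shows that, for every u ≥ 1,
\[
\int_E f_k^2\,d\mu \le \delta(u) \int_E \Gamma(f_k)\,d\mu + u \Big(\int_E |f_k|\,d\mu\Big)^2.
\]
Since $\big(\int_E |f_k|\,d\mu\big)^2 \le \mu(N_k) \int_E f_k^2\,d\mu$, choosing $u = \frac{1}{2\mu(N_k)}$ this inequality becomes
\[
2^{-2k-1}\,\mu(N_{k-1}) \le \frac{1}{2} \int_E f_k^2\,d\mu \le \delta\Big(\frac{1}{2\mu(N_k)}\Big)\,\mathcal{E}(f_k) \le \delta\Big(\frac{1}{2\mu(N_k)}\Big)\,\mathcal{E}(f)
\]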
Optimizing over all f ’s as above then yields the converse statement of the proposi-
tion. Proposition 8.4.1 is therefore established.
For the proof to be complete, it remains to prove the following technical lemma.
ak−1 C 22k δk
≤ .
δk−1 δ( 2aq k )
We turn to the analogous discussion for the weak Poincaré inequalities of
Definition 7.5.2, p. 375, that is (for μ a probability measure),
\[
\mathrm{Var}_\mu(f) \le \Psi\big(\mathcal{E}(f)\big)
\]
for every bounded function f in D(E) such that (for example) osc(f) =
sup f − inf f = 1, where Ψ : (0, ∞) → R+ is a given (C¹) concave increasing
growth function bounded by 1 and such that $\lim_{r\to 0} \Psi(r) = \Psi(0) = 0$. Again, it
will be better to work with the linearized version (7.5.2), p. 376, or rather, with the
(generalized) inverse = γ −1 of the rate function γ there. Hence, we agree here
that (E, μ, Γ) satisfies a weak Poincaré inequality if
Proof The proof follows the same pattern as the previous ones. The second part
of the proposition is rather easy. Namely, for A ⊂ B ⊂ E such that μ(B) = 1/2, if
The choice of $u = \frac{\mu(A)}{4}$ then yields the claim.
Turning to the direct implication, assuming the capacity inequality (8.4.7), let
f ∈ D(E) and let m be a median of f with respect to μ. As in previous similar
proofs, set F+ = (f − m)+ and F− = (f − m)− so that
\[
\mathrm{Var}_\mu(f) \le \int_E (f - m)^2\,d\mu = \int_E F_+^2\,d\mu + \int_E F_-^2\,d\mu.
\]
We handle the term $\int_E F_+^2\,d\mu$, $\int_E F_-^2\,d\mu$ being treated similarly. Denote by B+ the
support of F+ thus satisfying μ(B+) ≤ 1/2. Fixing u ∈ (0, 1), define
\[
c = c(u) = \inf\big\{r \ge 0\,;\ \mu(F_+ > r) \le u\big\}.
\]
Since $u_0 \le u$ and
\[
\sum_{k\in\mathbb{N}} a^{2k}(u_{k+1} - u_k) = \frac{1 - a^2}{a^2} \sum_{k\ge 1} a^{2k}(u_k - u_0),
\]
Since : (0, 1) → (0, ∞) is positive and decreasing, for any integer k ≥ 1, the map
uk −u0 + θ u0
θ → (u+θ (uk −u)) is increasing on (0, 1). Hence
uk − u0 uk
≤ ≤ Cap∗μ (Nk ) ≤ Capμ (Nk , Nk+1 ),
(u) (uk )
and
uk − u0 ≤ (u) Capμ (Nk , Nk+1 ).
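Define then, for every k ∈ N,
\[
f_k = \frac{1}{c\,a^k(1 - a)} \Big[\big(F_+ - c\,a^{k+1}\big)_+ \wedge c\,a^k(1 - a)\Big].
\]
Then $1_{N_k} \le f_k \le 1_{N_{k+1}}$ and, by definition,
\[
\begin{aligned}
\mathrm{Cap}_\mu(N_k, N_{k+1}) &\le \int_{N_{k+1}\setminus N_k} \Gamma(f_k)\,d\mu \\
&\le \frac{1}{c^2 a^{2k}(1 - a)^2} \int_{N_{k+1}\setminus N_k} \Gamma(F_+)\,d\mu.
\end{aligned}
\]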
The conclusion then follows since if p = sup f and q = inf f, then for any
m ∈ [q, p], (p − m)² + (q − m)² ≤ (p − q)². The proof is complete.
for every bounded function f ∈ D(E) such that osc(f) = 1. Then, there exist universal
constants c, c′ > 0 such that
\[
c \max(b_-, b_+) \le C \le c' \max(B_-, B_+),
\]
where
x
1 1
b+ = sup μ [x, +∞) dt,
x≥m (μ([x, +∞))/4) m p(t)
x
1 1
B+ = sup μ [x, +∞) dt,
x≥m (μ([x, +∞))) m p(t)
and similarly,
1 m 1
b− = sup μ (−∞, x] dt,
x≤m (μ((−∞, x])/4) x p(t)
and
1 m 1
B− = sup μ (−∞, x] dt.
x≤m (μ((−∞, x])) x p(t)
in Sect. 7.7, p. 386) satisfies a weak Poincaré inequality for 0 < α < 1 with rate
function $u \mapsto C(-\log u)^{2(1-\alpha)/\alpha}$, u ∈ (0, 1). Alternatively, the growth function
Ψ is given by $\Psi(r) = \big(C\,r(-\log r)^{2(1-\alpha)/\alpha}\big) \wedge 1$, r ∈ (0, ∞), for some other constant
C > 0. When α = 1, since μ1 satisfies a Poincaré inequality P(C), the rate function is
constant, equal to C, on (0, 1). As a further example, if $d\mu(x) = c_\gamma (1 + x^2)^{-\gamma/2}dx$ with γ > 1, a
weak Poincaré inequality holds with rate function $u \mapsto C\,u^{2/(1-\gamma)}$, u ∈ (0, 1), or
equivalently $\Psi(r) = \big(C\,r^{(\gamma-1)/(\gamma+1)}\big) \wedge 1$, r ∈ (0, ∞).
for every f ∈ D(E). The following statement is in the same spirit as those for gen-
eralized Nash or weak Poincaré inequalities. We omit the proof.
for all measurable sets A ⊂ E such that μ(A) ≤ 1/2. Then μ satisfies a Latała-
Oleszkiewicz inequality LO(a, cC1 ) where c > 0 is a numerical constant. Con-
versely, under the Latała-Oleszkiewicz inequality LO(a, C2 ) of (8.4.8) with C2 > 0,
the inequality (8.4.9) holds with constant C1 = c C2 where c > 0 is numerical.
This result recovers the fact that Latała-Oleszkiewicz inequalities are an inter-
polation between Poincaré and logarithmic Sobolev inequalities according to the
parameter a ∈ [0, 1]. Moreover, following Proposition 8.4.1, it provides a link with
generalized Nash inequalities. Namely, if μ satisfies LO(a, C), a ∈ [0, 1], C > 0,
then it satisfies a generalized Nash inequality in the linearized form (8.4.1) with rate
function
\[
\delta(u) = c\,C\,\big(\log(1 + u)\big)^{-a}, \quad u \in (1, \infty),
\]
where c > 0 is a numerical constant. With the tools developed above, Latała-
Oleszkiewicz inequalities may be shown to be characterized similarly through a
Muckenhoupt-type criterion. In particular, the model family μα on the real line
satisfies LO(a, C) if and only if $\alpha \ge \frac{2}{2-a}$.
In the second part of this chapter, starting with this section, we consider another form
of measure-capacity inequalities: the isoperimetric-type inequalities. The p = 1
definition of capacity may be viewed as a boundary or surface measure so that a
capacity inequality compares in this case measures and surface measures, which is
the essence of isoperimetric inequalities.
To address these issues, the first sub-section collects general properties on bound-
ary measures and co-area formulas. Our main focus is then Gaussian-type isoperi-
metric inequalities for heat kernel measures under curvature condition, providing in
particular a semigroup proof of the classical Gaussian isoperimetric inequality.
The setting of this and the following sections is that of a Full Markov Triple
(E, μ, Γ). As in all this work, the emphasis is on functional inequalities, and the
main results here actually describe functional forms of isoperimetric properties (for
heat kernel and invariant measures).
412 8 Capacity and Isoperimetric-Type Inequalities
(for every x ∈ E). In this setting, and when μ is a probability, Lipschitz means that
|f (x) − f (y)| ≤ Cd(x, y) for some C > 0 on the support of μ ⊗ μ, and it will
be assumed in addition that bounded Lipschitz functions (in this sense) are in the
Dirichlet domain D(E).
This description suitably covers the examples of weighted Riemannian manifolds
with Γ(f) the usual Riemannian length squared of the gradient of a smooth function
f (cf. Sect. C.4, p. 509). These assumptions will be implicit as soon as geometric
statements on sets (involving isoperimetric neighborhood and surface measure) are
considered below, and they are in particular in force in the first sub-section. It should
be pointed out that these hypotheses might rule out natural infinite-dimensional ex-
amples such as Wiener spaces (cf. Sect. 2.7.2, p. 108). Finite-dimensional approxi-
mations may nevertheless be developed to cover such instances on the basis of the
dimension-free inequalities described here (for example for Gaussian measures).
\[ \mu^+(A) = \liminf_{\varepsilon \to 0} \frac{1}{\varepsilon}\bigl[\mu(A_\varepsilon) - \mu(A)\bigr] \tag{8.5.2} \]
where Aε = {x ∈ E ; d(x, A) < ε}, ε > 0, is the ε-(open) neighborhood of A ⊂ E.
The quantity μ+ (A) describes the surface measure of A.
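The definition (8.5.2) can be tried out on a simple case. The following is a minimal sketch, assuming the standard Gaussian measure on the real line with the Euclidean distance and the half-line A = (−∞, a] (these choices, and the helper name `minkowski_content_halfline`, are illustrative and not taken from the text). For such a set the ε-neighborhood is (−∞, a + ε), so the difference quotient should approach the Gaussian density at the boundary point a.

```python
from statistics import NormalDist


def minkowski_content_halfline(a: float, eps: float) -> float:
    """Difference quotient (mu(A_eps) - mu(A)) / eps for A = (-inf, a]
    under the standard Gaussian measure, where A_eps = (-inf, a + eps)."""
    gauss = NormalDist()
    return (gauss.cdf(a + eps) - gauss.cdf(a)) / eps


a = 0.7
approx = minkowski_content_halfline(a, 1e-6)  # small eps stands in for the lim inf
exact = NormalDist().pdf(a)                   # Gaussian density at the boundary point
print(approx, exact)
```

The two printed values should agree to several digits, which illustrates why μ+(A) is read as a surface measure.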
8.5 Gaussian Isoperimetric-Type Inequalities Under a Curvature Condition 413
Theorem 8.5.1 (Co-area formula II) In the metric space setting (E, d) induced by
the Markov Triple (E, μ, Γ) as described above, for every Lipschitz function f on
E,
\[ \int_{-\infty}^{+\infty} \mu^+(N_r)\,dr \le \int_E \sqrt{\Gamma(f)}\,d\mu \tag{8.5.4} \]
where N_r = {x ∈ E ; f(x) > r}, r ∈ R.
In the context of a general Markov Triple, there is no reason why (8.5.4) should
be an equality. Equality is known in specific instances such as Rn with a measure
μ which is absolutely continuous with respect to the Lebesgue measure, in which
case (8.5.4) is classically referred to as a co-area formula.
Proof Assume first that f : E → R is bounded and moreover, without loss of
generality, positive. For each ε > 0, set $f_\varepsilon(x) = \sup_{d(x,y)<\varepsilon} f(y)$, x ∈ E. Since f is
Lipschitz, f_ε is finite and lower semi-continuous. From (8.5.1), for every x ∈ E,
\[ \limsup_{\varepsilon \to 0} \frac{f_\varepsilon(x) - f(x)}{\varepsilon} \le \sqrt{\Gamma(f)}(x). \]
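$\int_0^\infty \mu\bigl((N_r)_\varepsilon\bigr)\,dr$,
\[ \frac{1}{\varepsilon} \int_E (f_\varepsilon - f)\,d\mu = \frac{1}{\varepsilon} \int_0^\infty \bigl[\mu\bigl((N_r)_\varepsilon\bigr) - \mu(N_r)\bigr]\,dr. \]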
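By Fatou’s Lemma,
\[ \liminf_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^\infty \bigl[\mu\bigl((N_r)_\varepsilon\bigr) - \mu(N_r)\bigr]\,dr
   \ge \int_0^\infty \liminf_{\varepsilon \to 0} \frac{\mu((N_r)_\varepsilon) - \mu(N_r)}{\varepsilon}\,dr
   = \int_0^\infty \mu^+(N_r)\,dr. \]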
Theorem 8.5.1 is typically used to provide evidence that a lower bound on the
isoperimetric profile I_μ may imply standard functional inequalities for μ. A classical
example is that of Lebesgue measure in R^n, n > 1. Namely, if f : R^n → R is
smooth and compactly supported, then by the co-area formula, denoting respectively
by vol_n and vol_n^+ the volume and surface measures,
\[ \int_0^\infty \mathrm{vol}_n^+\bigl(|f| > r\bigr)\,dr \le \int_{\mathbb{R}^n} |\nabla f|\,dx. \]
\[ n\,\omega_n^{1/n}\,\mathrm{vol}_n\bigl(|f| > r\bigr)^{(n-1)/n} \le \mathrm{vol}_n^+\bigl(|f| > r\bigr) \]
with $p = \frac{n}{n-1}$
. Changing f into f 2 then yields the standard Sobolev inequal-
ity (6.1.1), p. 278, in Rn (however not with its best constant).
Some more care has to be taken for finite measures. One classical example in
this case is the following statement expressing that whenever Iμ may be compared
to the isoperimetric profile of the exponential measure, then it satisfies a Poincaré
inequality. Similar statements may be obtained for different behaviors of the isoperi-
metric function and different functional inequalities. For example, as will be used
below in Theorem 8.7.2, comparison with the Gaussian isoperimetric profile yields
logarithmic Sobolev inequalities.
Proof By Theorem 8.5.1 and the hypothesis, for any positive and Lipschitz g,
\[ c \int_0^\infty \min\bigl(\mu(g > r),\, 1 - \mu(g > r)\bigr)\,dr \le \int_E \sqrt{\Gamma(g)}\,d\mu. \tag{8.5.5} \]
\[ \mu\bigl((F_+)^2 > r\bigr) \le \frac{1}{2} \quad \text{and} \quad \mu\bigl((F_-)^2 > r\bigr) \le \frac{1}{2}. \]
and similarly for F− . By (8.3.2), the right-hand side of (8.5.6) is less than or equal to
\[ 2 \Bigl(\int_E |f - m|^2\,d\mu\Bigr)^{1/2} \Bigl(\int_E \Gamma(f)\,d\mu\Bigr)^{1/2}. \]
defines the Gaussian isoperimetric profile (in any dimension). Note that the function
I is concave continuous, symmetric with respect to the vertical line going through
1/2, and such that I(0) = I(1) = 0, and satisfies the fundamental differential equation
I I'' = −1.
The main result of this section is that √ρ I is an isoperimetric function for the
invariant probability measure μ of a Markov Triple (E, μ, Γ) satisfying the
curvature condition CD(ρ, ∞) for some ρ > 0. To reach this goal, we use an ad-hoc
functional description of isoperimetry, known as a Bobkov inequality, and, as for
Poincaré and logarithmic Sobolev inequalities, we establish such Bobkov inequali-
ties for heat kernel measures under curvature conditions.
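The properties of the Gaussian isoperimetric profile listed above can be checked numerically. The sketch below assumes the usual description of the profile as the standard Gaussian density composed with the inverse of the standard Gaussian distribution function (an assumption here, since the defining formula is not reproduced in this passage); the helper names are illustrative. It evaluates the residual of the differential equation I I'' = −1 by a central finite difference and the symmetry about 1/2.

```python
from statistics import NormalDist

_GAUSS = NormalDist()


def gaussian_profile(v: float) -> float:
    """Assumed form of the profile: density evaluated at the v-quantile, 0 < v < 1."""
    return _GAUSS.pdf(_GAUSS.inv_cdf(v))


def second_derivative(func, v: float, h: float = 1e-3) -> float:
    """Central finite-difference approximation of func''(v)."""
    return (func(v + h) - 2.0 * func(v) + func(v - h)) / (h * h)


points = [0.1, 0.3, 0.5, 0.8]
# Residual of I * I'' + 1, which should be close to zero at every point.
ode_residuals = [
    gaussian_profile(v) * second_derivative(gaussian_profile, v) + 1.0
    for v in points
]
# Symmetry with respect to v = 1/2.
symmetry_gaps = [abs(gaussian_profile(v) - gaussian_profile(1.0 - v)) for v in points]
print(ode_residuals, symmetry_gaps)
```

The residuals are only zero up to the finite-difference error, so they should be read as a sanity check rather than a proof.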
Theorem 8.5.3 (Local Bobkov inequalities) Let (E, μ, Γ) be a Markov Triple with
semigroup P = (P_t)_{t≥0}. The following assertions are equivalent.
(i) The curvature condition CD(ρ, ∞) holds for some ρ ∈ R.
(ii) For every function f in A0 with values in [0, 1], every (or some) α ≥ 0 and
every t ≥ 0,
\[ \sqrt{\mathcal{I}^2(P_t f) + \alpha\,\Gamma(P_t f)} \le P_t\Bigl(\sqrt{\mathcal{I}^2(f) + c_\alpha(t)\,\Gamma(f)}\Bigr) \tag{8.5.8} \]
where
\[ c_\alpha(t) = \frac{1 - e^{-2\rho t}}{\rho} + \alpha\,e^{-2\rho t}, \qquad t \ge 0 \]
(with c_α(t) = 2t + α whenever ρ = 0).
Before turning to the proof of this result, let us illustrate its isoperimetric content
in the form of the following corollary, which follows from the choice of α = 1/ρ with
ρ > 0 (so that c_α(t) = 1/ρ for every t ≥ 0) and letting t → ∞ by ergodicity. In this
context, μ is a probability measure.
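The role of the choice α = 1/ρ can be seen by evaluating the coefficient c_α(t) of Theorem 8.5.3 directly. The following is a small sketch (the function name `c_alpha` and the sample values are illustrative): it checks that c_α(t) is constant in t when α = 1/ρ, that c_α(0) = α, and that the formula for ρ = 0 is the limit of the general one.

```python
import math


def c_alpha(t: float, rho: float, alpha: float) -> float:
    """(1 - exp(-2 rho t)) / rho + alpha exp(-2 rho t), with the value
    2 t + alpha when rho = 0."""
    if rho == 0.0:
        return 2.0 * t + alpha
    # expm1 keeps the quotient accurate when rho * t is small.
    return -math.expm1(-2.0 * rho * t) / rho + alpha * math.exp(-2.0 * rho * t)


rho = 1.5
constant_values = [c_alpha(t, rho, 1.0 / rho) for t in (0.0, 0.5, 3.0)]
print(constant_values)            # each entry should be close to 1 / rho
print(c_alpha(0.0, -0.7, 0.3))    # value at t = 0, expected to be alpha
print(c_alpha(2.0, 1e-9, 0.3), c_alpha(2.0, 0.0, 0.3))  # small rho against rho = 0
```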
Corollary 8.5.4 (Bobkov inequality under CD(ρ, ∞)) Let (E, μ, Γ) be a Markov
Triple, with μ a probability measure, satisfying the curvature condition CD(ρ, ∞)
for some ρ > 0. Then, for every f in D(E) with values in [0, 1],
\[ \sqrt{\rho}\;\mathcal{I}\Bigl(\int_E f\,d\mu\Bigr) \le \int_E \sqrt{\rho\,\mathcal{I}^2(f) + \Gamma(f)}\,d\mu. \tag{8.5.9} \]
for any function f ∈ A0 and any t ≥ 0. In particular, these various arguments settle
the implication from (ii) to (i).
We now address the converse implication in the proof of Theorem 8.5.3. The ar-
gument is very similar to the semigroup monotonicity proofs of the local Poincaré
and logarithmic Sobolev inequalities. Basic use will be made of the differential
equation I I'' = −1. It will be convenient to record the following technical lemma.
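Lemma 8.5.6 Let Ψ be smooth on R^3 (or some open set in R^3), f ∈ A0 and t > 0
be fixed. Then
\[ \frac{d}{ds}\,P_s \Psi\bigl(s, P_{t-s}f, \Gamma(P_{t-s}f)\bigr) = P_s(K) \]
with
\[ K = \partial_1\Psi + 2\,\partial_3\Psi\,\Gamma_2(g) + \partial_2^2\Psi\,\Gamma(g) + 2\,\partial_2\partial_3\Psi\,\Gamma\bigl(g, \Gamma(g)\bigr) + \partial_3^2\Psi\,\Gamma\bigl(\Gamma(g)\bigr) \]
where we wrote on the right-hand side g = P_{t−s}f and Ψ for Ψ(s, g, Γ(g)).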
Proof We have
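\[ \frac{d}{ds}\,P_s \Psi\bigl(s, P_{t-s}f, \Gamma(P_{t-s}f)\bigr)
   = P_s\Bigl(L\,\Psi\bigl(s, P_{t-s}f, \Gamma(P_{t-s}f)\bigr) + \frac{d}{ds}\,\Psi\bigl(s, P_{t-s}f, \Gamma(P_{t-s}f)\bigr)\Bigr). \]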
Proof of Theorem 8.5.3 Assume the curvature condition CD(ρ, ∞) for some
ρ ∈ R. Let $f \in \mathcal{A}_0^{\mathrm{const}+}$ with values in [0, 1] and t > 0 be fixed. For every s ∈ [0, t],
set
\[ \Lambda(s) = P_s\Bigl(\sqrt{\mathcal{I}^2(P_{t-s}f) + c_\alpha(s)\,\Gamma(P_{t-s}f)}\Bigr). \]
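\[ \Psi\,\partial_1\Psi = \frac{c_\alpha'}{2}\,v, \qquad \Psi\,\partial_2\Psi = \mathcal{I}\,\mathcal{I}', \qquad \Psi\,\partial_3\Psi = \frac{c_\alpha}{2}, \]
\[ \Psi^3\,\partial_2^2\Psi = -\mathcal{I}^2\,\mathcal{I}'^2 + \Psi^2\bigl(\mathcal{I}'^2 - 1\bigr), \qquad \Psi^3\,\partial_2\partial_3\Psi = -\frac{c_\alpha}{2}\,\mathcal{I}\,\mathcal{I}', \qquad \Psi^3\,\partial_3^2\Psi = -\frac{c_\alpha^2}{4}. \]
\[ \begin{aligned}
\Psi^3 K = {} & \Psi^2\,\frac{c_\alpha'(s)}{2}\,\Gamma(g) + \Psi^2\,c_\alpha(s)\,\Gamma_2(g) \\
 & - \mathcal{I}^2(g)\,\mathcal{I}'^2(g)\,\Gamma(g) + \Psi^2\bigl(\mathcal{I}'^2(g) - 1\bigr)\Gamma(g) \\
 & - c_\alpha(s)\,\mathcal{I}(g)\,\mathcal{I}'(g)\,\Gamma\bigl(g, \Gamma(g)\bigr) - \frac{c_\alpha(s)^2}{4}\,\Gamma\bigl(\Gamma(g)\bigr),
\end{aligned} \]
Since $\Psi^2 = \Psi\bigl(s, g, \Gamma(g)\bigr)^2 = \mathcal{I}^2(g) + c_\alpha(s)\,\Gamma(g)$,
it follows that
\[ \begin{aligned}
\Psi^3 K = {} & \bigl(\mathcal{I}^2(g) + c_\alpha(s)\,\Gamma(g)\bigr)\frac{c_\alpha'(s)}{2}\,\Gamma(g) \\
 & + \bigl(\mathcal{I}^2(g) + c_\alpha(s)\,\Gamma(g)\bigr)\,c_\alpha(s)\,\Gamma_2(g) \\
 & - \mathcal{I}^2(g)\,\mathcal{I}'^2(g)\,\Gamma(g) + \bigl(\mathcal{I}^2(g) + c_\alpha(s)\,\Gamma(g)\bigr)\bigl(\mathcal{I}'^2(g) - 1\bigr)\Gamma(g) \\
 & - c_\alpha(s)\,\mathcal{I}(g)\,\mathcal{I}'(g)\,\Gamma\bigl(g, \Gamma(g)\bigr) - \frac{c_\alpha(s)^2}{4}\,\Gamma\bigl(\Gamma(g)\bigr).
\end{aligned} \]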
8.6 Harnack Inequalities Revisited 421
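applied twice,
\[ \Psi^3 K \ge c_\alpha(s)\Bigl[\mathcal{I}^2(g)\,\frac{\Gamma(\Gamma(g))}{4\,\Gamma(g)} - \mathcal{I}'(g)\,\mathcal{I}(g)\,\Gamma\bigl(g, \Gamma(g)\bigr) + \mathcal{I}'^2(g)\,\Gamma(g)^2\Bigr]. \]
The right-hand side of this inequality is a quadratic form in I(g) and I'(g) which
is positive since, as Γ(f) ≥ 0 for every f,
\[ \Gamma\bigl(f, \Gamma(f)\bigr)^2 \le \Gamma(f)\,\Gamma\bigl(\Gamma(f)\bigr). \]
Hence K ≥ 0 and thus Λ'(s) = P_s(K) ≥ 0 for every s ∈ [0, t]. This completes the
proof of Theorem 8.5.3.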
The natural challenge after Theorem 8.5.3 would be to investigate similar con-
clusions under the curvature-dimension condition CD(ρ, n) (for some finite n). The
aim would be to reach in this way the isoperimetric inequality on the n-sphere,
and the corresponding comparison statement for manifolds with dimension n and
with a strictly positive lower bound on the Ricci curvature. However, this program
is mainly open since in particular a functional description of isoperimetry on the
sphere suitable for the Γ-calculus is still missing. The analysis should probably also
involve the fast diffusion equation rather than the heat flow as in Sect. 6.11, p. 330.
(see (5.6.4), p. 267) that the reverse forms of logarithmic Sobolev inequalities for
heat kernel measures may be used to get quite close to the optimal statement. To-
gether with the refined Gaussian isoperimetric function I, we show in this section
how to obtain these Harnack inequalities by means of a suitable reverse form of The-
orem 8.5.3. In addition, the conjunction of Theorem 8.5.3 together with its reverse
form will lead to an isoperimetric-type Harnack inequality.
The following statement is therefore a kind of reverse Bobkov inequality along
the semigroup. Recall $c_0(t) = \frac{1 - e^{-2\rho t}}{\rho}$ (= 2t if ρ = 0) from Theorem 8.5.3.
Proposition 8.6.1 Let (E, μ, Γ) be a Markov Triple satisfying the curvature
condition CD(ρ, ∞) for some ρ ∈ R. Then, for every f in A0 and every t ≥ 0,
\[ \bigl[\mathcal{I}(P_t f)\bigr]^2 - \bigl[P_t\bigl(\mathcal{I}(f)\bigr)\bigr]^2 \ge e^{2\rho t}\,c_0(t)\,\Gamma(P_t f). \tag{8.6.1} \]
By the usual extension procedure, the inequality (for t > 0) applies to any
measurable function f with values in [0, 1]. As a consequence, for every bounded
measurable function f and every t > 0,
\[ \Gamma(P_t f) \le \frac{e^{-2\rho t}}{2\pi\,c_0(t)}\,\|f\|_\infty^2. \tag{8.6.2} \]
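Proof The proof relies on the standard heat flow interpolation argument. For
$f \in \mathcal{A}_0^{\mathrm{const}+}$ with values in [0, 1] and t > 0, write
\[ \bigl[\mathcal{I}(P_t f)\bigr]^2 - \bigl[P_t\bigl(\mathcal{I}(f)\bigr)\bigr]^2 = -\int_0^t \frac{d}{ds}\bigl[P_s\bigl(\mathcal{I}(P_{t-s}f)\bigr)\bigr]^2\,ds. \]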
which is the result. The gradient bound (8.6.2) is sharp on the example of the
Ornstein-Uhlenbeck semigroup (2.7.3), p. 104, for f the characteristic function of
a half-space.
Now apply this inequality with f replaced by 1_{f ≥ r}, r ≥ 0, for a positive
measurable function f on E. Denoting by ν the distribution of f under P_t at the point y
(that is ν(B) = P_t(1_{f ∈ B})(y) for every Borel set B in R),
\[ P_t\bigl(1_{\{f \ge r\}}\bigr)(x) \le \Phi\bigl(\Phi^{-1}\circ\nu\bigl([r, \infty)\bigr) + \delta\bigr). \]
Theorem 8.6.2 Let (E, μ, Γ) be a Markov Triple satisfying the curvature condition
CD(ρ, ∞) for some ρ ∈ R. For every positive measurable function f on E, every
t > 0 and every x, y ∈ E,
\[ P_t f(x) \le e^{-\delta^2/2} \int_0^\infty e^{\delta\,\Phi^{-1}\circ F(s)}\,s\,dF(s) \]
Although not expressed in a very tractable form, Theorem 8.6.2 may be viewed
as the root of the various Harnack inequalities in this context. For example, by the
Cauchy-Schwarz inequality,
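\[ \int_0^\infty e^{\delta\,\Phi^{-1}\circ F(s)}\,s\,dF(s)
   \le \Bigl(\int_0^\infty e^{2\delta\,\Phi^{-1}\circ F(s)}\,dF(s)\Bigr)^{1/2} \Bigl(\int_0^\infty s^2\,dF(s)\Bigr)^{1/2}
   \le e^{\delta^2}\bigl(P_t f^2(y)\bigr)^{1/2} \]
since
\[ \int_0^\infty e^{2\delta\,\Phi^{-1}\circ F(s)}\,dF(s) = \int_0^1 e^{2\delta\,\Phi^{-1}(v)}\,dv = \int_{-\infty}^{+\infty} e^{2\delta u - u^2/2}\,\frac{du}{\sqrt{2\pi}} = e^{2\delta^2}. \]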
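The last Gaussian integral can be confirmed numerically. The sketch below is only an illustration (the function name and the quadrature parameters are our own choices): it approximates the integral of exp(2δu − u²/2) against du/√(2π) by the trapezoidal rule on a window centered at u = 2δ, where the integrand peaks, and compares it with exp(2δ²).

```python
import math


def gaussian_exp_moment(delta: float, half_width: float = 12.0, steps: int = 20000) -> float:
    """Trapezoidal approximation of the integral over the real line of
    exp(2 * delta * u - u**2 / 2) du / sqrt(2 * pi)."""
    center = 2.0 * delta          # the integrand is a Gaussian bump centered here
    lower = center - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(2.0 * delta * u - 0.5 * u * u)
    return total * h / math.sqrt(2.0 * math.pi)


for delta in (0.0, 0.5, 1.3):
    print(delta, gaussian_exp_moment(delta), math.exp(2.0 * delta ** 2))
```

Completing the square, 2δu − u²/2 = −(u − 2δ)²/2 + 2δ², explains both the centering of the window and the closed form.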
The preceding therefore exactly recovers Wang’s Harnack inequality of
Theorem 5.6.1, p. 265, for α = 2,
\[ (P_t f)^2(x) \le P_t\bigl(f^2\bigr)(y)\,\exp\Bigl(\frac{d(x, y)^2}{e^{2\rho t}\,c_0(t)}\Bigr). \]
By Hölder’s inequality rather than the Cauchy-Schwarz inequality, one obtains the
whole family of inequalities with α > 1. Using the entropic inequality (5.1.2),
p. 236, yields similarly the log-Harnack inequality (5.6.2) of Remark 5.6.2, p. 266.
With respect to the original argument of Theorem 5.6.1, the proof here avoids inter-
polation along geodesics and holds true in the general context of Markov Triples.
8.7 From Concentration to Isoperimetry 425
Proof In its integrated form (cf. (8.5.11)), Theorem 8.5.3 implies that for any y ∈ E,
any measurable set A ⊂ E, any t > 0 and any r > 0,
\[ \Phi^{-1}\circ P_t\bigl(1_{A_r}\bigr)(y) \ge \Phi^{-1}\circ P_t\bigl(1_A\bigr)(y) + \frac{r}{\sqrt{c_0(t)}} \]
where Ar is the r-neighborhood of A with respect to the distance d. On the other
hand, the Lipschitz property (8.6.3) applied to f = 1A ensures that
where $\delta = e^{-\rho t}\,c_0(t)^{-1/2}\,d(x, y)$. The combination of these two inequalities
immediately yields the result.
This section investigates the converse implication. While it may not be expected
to hold in general, as is clear for example from the characterization of Poincaré
inequalities on the line (cf. Theorem 4.5.1, p. 194), surprisingly a converse does
hold under suitable curvature assumptions.
In order to properly state the result, we say that (E, μ, Γ) has exponential
concentration if there are constants C, c > 0 such that for every integrable 1-Lipschitz
function f on E,
\[ \mu\Bigl(f \ge \int_E f\,d\mu + r\Bigr) \le C\,e^{-cr}, \qquad r \ge 0. \tag{8.7.1} \]
It should be mentioned that the relevant constant in this property is c which controls
the exponential decay. The first constant C is usually easily handled.
The following is then the announced converse implication. Recall the isoperimet-
ric profile Iμ of a measure μ.
where c' > 0 only depends on C, c. (In other words, the isoperimetric profile of
(E, μ, Γ) is bounded from below, up to the constant c', by the isoperimetric profile
of the exponential measure). In particular, (E, μ, Γ) satisfies a Poincaré inequality
P(C') where C' > 0 only depends on C, c.
Theorem 8.7.2 (Milman’s Theorem II) Let (E, μ, Γ) be a Markov Triple, with μ
a probability measure, satisfying the curvature condition CD(0, ∞). If (E, μ, Γ)
has Gaussian concentration with constants C, c > 0, then
Iμ ≥ c I,
Both Theorems 8.7.1 and 8.7.2 bound the isoperimetric functions I_μ for the
Markov Triple (E, μ, Γ) from below by the isoperimetric functions of exponential
and (standard) Gaussian measure. It is not too difficult to verify, either directly or via
the subsequent Poincaré or logarithmic Sobolev inequalities, that conversely such
lower bounds ensure the exponential, respectively Gaussian, concentration proper-
ties (8.7.1) and (8.7.3). As such, Theorems 8.7.1 and 8.7.2 provide a complete con-
nection between concentration properties and isoperimetric-type inequalities under
positive curvature bounds.
The schemes of the proofs of Theorems 8.7.1 and 8.7.2 are rather similar. For
simplicity, we only concentrate on the first one on exponential concentration.
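By the reverse local logarithmic Sobolev inequality (5.5.5) of Theorem 5.5.2, p. 259,
whenever 0 < ε ≤ f ≤ 1 where 0 < ε < 1, for every s > 0,
\[ \sqrt{\Gamma(P_s f)} \le \sqrt{\frac{1}{s}\log\frac{1}{\varepsilon}}\;P_s f. \]
Hence
\[ \int_E \frac{\Gamma(P_s f)}{P_s f}\,d\mu \le \sqrt{\frac{1}{s}\log\frac{1}{\varepsilon}} \int_E \sqrt{\Gamma(P_s f)}\,d\mu
   \le \sqrt{\frac{1}{s}\log\frac{1}{\varepsilon}} \int_E \sqrt{\Gamma(f)}\,d\mu \]
$\mathcal{A}_0^{\mathrm{const}+}$ such that ε ≤ f ≤ 1,
\[ \int_E f\log f\,d\mu - \int_E P_t f\log P_t f\,d\mu \le 2\sqrt{t\,\log\frac{1}{\varepsilon}} \int_E \sqrt{\Gamma(f)}\,d\mu. \tag{8.7.4} \]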
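Recall next (5.6.4), p. 267, which tells us that $-\psi = -\sqrt{\log\frac{2}{P_t f}}$ is
$\frac{1}{2\sqrt{t}}$-Lipschitz. (Note that Proposition 8.6.1 may also be used for this task.) By
the exponential concentration hypothesis (8.7.1), for every r ≥ 0,
\[ \mu(\psi \le m - r) \le C\,e^{-2cr\sqrt{t}} \]
where $m = \int_E \psi\,d\mu$. By convexity of the map $u \mapsto \sqrt{\log\frac{2}{u}}$ on (0, 1],
$m \ge \sqrt{\log\frac{2}{\int_E f\,d\mu}}$ so that, for every r > 0 and t > 0,
\[ \mu\Bigl(\psi \le \sqrt{\log\frac{2}{\int_E f\,d\mu}} - r\Bigr) \le C\,e^{-2cr\sqrt{t}}. \]
This implies that, for every $0 \le r \le \sqrt{\log\frac{2}{\int_E f\,d\mu}}$ and t > 0,
\[ \mu\Bigl(P_t f \ge \Bigl(2\int_E f\,d\mu\Bigr)^{1/2} e^{r^2}\Bigr) \le C\,e^{-2cr\sqrt{t}}. \]
Therefore, whenever 0 < δ ≤ 1 is such that $\delta \ge \bigl(2\int_E f\,d\mu\bigr)^{1/2} e^{r^2}$,
\[ \begin{aligned}
\int_E P_t f\,\log\frac{1}{P_t f}\,d\mu & \ge \log\frac{1}{\delta} \int_{\{P_t f \le \delta\}} P_t f\,d\mu \\
 & = \Bigl(\int_E f\,d\mu - \int_{\{P_t f > \delta\}} P_t f\,d\mu\Bigr)\log\frac{1}{\delta} \\
 & \ge \Bigl(\int_E f\,d\mu - C\,e^{-2cr\sqrt{t}}\Bigr)\log\frac{1}{\delta}.
\end{aligned} \tag{8.7.5} \]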
Let $\delta = [2(\varepsilon + (1-\varepsilon)\mu(A))]^{1/2}\,e^{r^2}$, which we assume to be less than 1. The
previous inequalities (8.7.4) and (8.7.5) extend to D(E). Now, for a closed set A ⊂ E,
apply these to f = max(1_A, ε), 0 < ε < 1, actually first to some suitable Lipschitz
approximations $f_\eta = \max\bigl(\varepsilon, (1 - \frac{1}{\eta}\,d(\cdot, A))_+\bigr)$ with η → 0, η > 0 (which belong to
D(E) by hypothesis). For this choice of f, (8.7.4) and (8.7.5) (together with (8.5.2))
imply that
\[ -\varepsilon\log\frac{1}{\varepsilon} + \Bigl[(1-\varepsilon)\,\mu(A) - C\,e^{-2cr\sqrt{t}}\Bigr]\log\frac{1}{\delta} \le 2\sqrt{t\,\log\frac{1}{\varepsilon}}\;\mu^+(A). \]
8.8 Notes and References 429
\[ -2\,\mu(A)^2\log\frac{1}{\mu(A)} + \frac{1}{4}\Bigl[\frac{\mu(A)}{2} - C\,e^{-2cr\sqrt{t}}\Bigr]\log\frac{1}{16\,\mu(A)} \le 4\sqrt{2}\;r\sqrt{t}\;\mu^+(A). \]
As a consequence, for t > 0 well chosen of the order of $\log\frac{1}{\mu(A)}$ (for instance
satisfying $r\sqrt{t} = \frac{1}{2c}\log\frac{4C}{\mu(A)}$), there exists a c' > 0 depending only on C, c > 0 such that
Similar, and actually easier, arguments show that (8.7.6) may be extended to every
set A with 0 < μ(A) ≤ 1/2. Taking the complement yields (8.7.2) and therefore
Theorem 8.7.1.
The notion of capacity has been developed in the second part of the 20th century
in various parts of mathematics, including functional analysis, potential theory, har-
monic analysis and geometric measure theory. A comprehensive account on the
notion of measure-capacity inequalities and the various developments in analysis,
together with bibliographical references, is the monograph [303] by V. G. Maz’ya.
Early developments in the context of geometric measure theory are considered by
H. Federer in [184]. See also [97, 98].
The first co-area formula of Theorem 8.1.2 may be found in [303, Chap. 2].
The topic of Sect. 8.2 linking classical Sobolev inequalities and capacity inequal-
ities is also discussed in [216, 303], where in particular the general Proposition 8.2.1
on capacity inequalities and pairs of Orlicz functions is emphasized. The classical
Faber-Krahn inequality compares the first eigenvalue of the Laplacian on a bounded
open domain in Rn with Dirichlet boundary conditions to that of the ball with the
same volume (cf. e.g. [121, 303]). As alluded to in Remark 8.2.2, such inequalities
may be studied similarly in the context of measure-capacity inequalities to formu-
late equivalently functional inequalities [35, 114, 216, 303].
The corresponding statements for Poincaré and logarithmic Sobolev inequalities
in Sect. 8.3 have been developed more recently in the works of F. Barthe, P. Cattiaux
and C. Roberto [46, 49]. Their results may be understood as a generalization of the
one-dimensional characterizations of B. Muckenhoupt [321] and S. Bobkov and
F. Götze [77] discussed respectively in Chap. 4, Sect. 4.5, p. 193, and Chap. 5,
Sect. 5.4, p. 252.
Generalized Nash inequalities from the point of view of capacity inequalities are
studied in [47, 207, 450]. For the corresponding results for weak Poincaré inequal-
ities, see [45]. Characterizations in dimension one of Latała-Oleszkiewicz inequali-
ties are considered in [49].
Extensions of the method have been studied for further families of functional
inequalities including F -Sobolev inequalities (cf. [46]), weak logarithmic Sobolev
inequalities (cf. [116]) or Lq -logarithmic Sobolev inequalities (cf. [162]).
The co-area formula of Theorem 8.5.1 in Sect. 8.5 is a standard statement
which may be found in numerous references including [97, 122, 184, 303]. See
also [78, 295, 313]. We refer to these references for an account of the historical de-
velopments. Proposition 8.5.2 linking (exponential) isoperimetric-type bounds with
Poincaré inequalities is an observation going back to J. Cheeger [124] in Rieman-
nian geometry (see also [99, 442]). It has been the source of many extensions on
the basis of the same principle (cf. for instance [78, 312]). The monograph [78]
by S. Bobkov and C. Houdré studies in particular a variety of statements along
these lines connecting isoperimetric-type inequalities and functional inequalities,
including those for Gaussian measures of Sect. 8.5.2 (see below). Moreover, the
metric framework of Sect. 8.5 (surface measure, co-area formulas etc.) is care-
fully described there and we refer the reader to it for all the necessary techni-
cal details. General introductions to isoperimetric and geometric inequalities in-
clude [98, 122, 195, 338].
The isoperimetric inequality for Gaussian measures (Sect. 8.5.2) is due to
C. Borell [87] and V. Sudakov and B. Cirel’son [401]. The functional form of
the Gaussian isoperimetric inequality illustrated in Corollary 8.5.4 was introduced
by S. Bobkov [73] who established it first on the two-point space and then in the
limit for the Gaussian measure by the central limit Theorem (following the orig-
inal approach of L. Gross [224] in his proof of the logarithmic Sobolev inequal-
ity, cf. the Notes and References in Chap. 5). On the basis of the Bobkov in-
equality, the local inequalities of Theorem 8.5.3 were established in [38]. Corol-
lary 8.5.5 may be considered as the infinite-dimensional extension of a famous re-
sult of P. Lévy and M. Gromov [220, 281, 316] comparing the isoperimetric profile
of a Riemannian manifold with a strictly positive lower bound on the Ricci curva-
ture to that of the sphere with the same (constant) curvature and dimension (see
also [60, 193, 221, 315]). A purely Markov operator proof of this statement under
the curvature-dimension condition CD(ρ, n), ρ > 0, n < ∞, is yet to be found. Al-
ternative proofs of Corollary 8.5.5 for a probability measure dμ = e−W dx on Rn
with ∇∇W ≥ ρ Id, ρ > 0, have been provided in [103] by mass transportation meth-
ods (more precisely using Theorem 9.3.4, p. 447, in the next chapter), in [75] via
the localization method of [293] and in [318] with a geometric derivation. See [315]
for a recent complete geometric picture of isoperimetric comparison theorems with
families of one-dimensional models under curvature-dimension conditions and di-
ameter bounds.
Section 8.6 is taken from [37]. Proposition 8.6.1 already appeared in [38] (in an
alternative proof of Gaussian isoperimetry).
Theorems 8.7.1 and 8.7.2 of Sect. 8.7 connecting isoperimetric and concentra-
tion properties in spaces with positive curvature are due to E. Milman [312, 314] in
a (weighted) Riemannian manifold setting. His results go far beyond the statements
presented here and cover a large spectrum of isoperimetric regimes and functional
inequalities. Preliminary contributions in this context go back to [38, 99, 273] (see
also [279]). The semigroup proof of Theorems 8.7.1 and 8.7.2 presented here is
taken from [280]. It should be pointed out that, in a (weighted) Riemannian manifold
framework with (extended) positive Ricci curvature, the isoperimetric profile of the
(weighted) Riemannian measure is always concave as established by V. Bayle [52]
(see [312, 314]). Therefore, in this case, the weakest concentration rate actually
implies its comparison with that of the exponential measure (and thus exponential
concentration). Earlier steps in the relationships between (Gaussian) measure con-
centration and logarithmic Sobolev inequalities are due to F.-Y. Wang [428] on the
basis of his Theorem 5.6.1, p. 265 (see also [431], [14, Chap. 7] and [80]).
Chapter 9
Optimal Transportation and Functional
Inequalities
the Notes and References for recent progress in this direction), for the sake of clar-
ity, the exposition is restricted to the specific smooth frameworks, indicating when
such extensions are possible.
The first section introduces the basic definitions and general results (without
proofs) of optimal transportation as well as the basic tool of the Brenier map. The
main topic of transportation cost inequalities and first examples, in particular for
Gaussian measures, are discussed in Sect. 9.2. Section 9.3 develops the tool of op-
timal transportation to establish logarithmic Sobolev and Sobolev inequalities (in
Euclidean space) with sharp constants. Non-linear Hamilton-Jacobi equations are
briefly presented in Sect. 9.4, while the subsequent section emphasizes a hypercon-
tractivity property of solutions of Hamilton-Jacobi equations analogous to the one
for linear heat equations. The preceding results are then applied in Sect. 9.6 to in-
vestigate the relationships between (quadratic) transportation cost inequalities and
logarithmic Sobolev inequalities. Section 9.7 investigates contraction properties in
Wasserstein space along the heat semigroup under a curvature condition by means
of commutation between the heat and Hopf-Lax semigroups. Section 9.8 is a very
brief overview of recent developments towards a notion of Ricci curvature lower
bounds based on optimal transportation.
Although we will mostly deal with Euclidean or Riemannian spaces in the various
illustrations, in order to develop some of the preliminaries of optimal transportation,
a natural topological framework is that of a Polish space (complete separable met-
ric space) (E, d) equipped with its Borel σ -field F . The product space E × E is
equipped with the product Borel σ -field. P(E) denotes the set of probability mea-
sures on (E, F).
where the infimum runs over the set of probability measures π ∈ P(E × E) with
respective marginals μ and ν. That is, for all bounded measurable functions u and
9.1 Optimal Transportation 435
v on E,
\[ \int_{E\times E} \bigl(u(x) + v(y)\bigr)\,d\pi(x, y) = \int_E u\,d\mu + \int_E v\,d\nu. \]
The function c is called a cost function and a measure π ∈ P(E × E) with marginals
μ and ν is called a coupling of (μ, ν).
Note that the set over which we take the infimum in (9.1.1) is not empty, the
product measure μ ⊗ ν being a coupling of μ and ν. The first result concerns the
existence of such an optimal coupling.
where the supremum runs over all bounded continuous functions u and v (or in
L1 (μ) and L1 (ν) respectively) satisfying, for all (x, y) ∈ E × E,
where the supremum runs over all 1-Lipschitz (with respect to d̄) functions u. (The
functions u may be assumed furthermore to be bounded.)
measures μ on E with finite p-th moment (i.e. such that $\int_E d(x, x_0)^p\,d\mu(x) < \infty$
for some x0 ∈ E). Therefore, for the choice of c = d^p, the optimal transportation
cost gives rise to the so-called Wasserstein distance
\[ W_p(\mu, \nu) = \inf\Bigl(\int_{E\times E} d(x, y)^p\,d\pi(x, y)\Bigr)^{1/p} = \mathcal{T}_{d^p}(\mu, \nu)^{1/p}, \tag{9.1.4} \]
where as above the infimum runs over all couplings π of (μ, ν) such that
μ, ν ∈ Pp(E). Note, as is easily seen, that if μ ∈ Pp(E) and Wp(μ, ν) < ∞, then
necessarily ν ∈ Pp(E). From a more probabilistic point of view,
\[ W_p(\mu, \nu) = \inf\bigl(\mathbb{E}\bigl(d(X, Y)^p\bigr)\bigr)^{1/p} \]
where the infimum is over all random variables X and Y with respective laws μ
and ν. On Pp (E), the Wasserstein distance Wp metrizes the weak convergence
topology together with convergence of the respective p-th moments, defining the
corresponding Wasserstein space.
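As an illustration of the definition (9.1.4), consider two empirical measures on the real line with the same number of equally weighted atoms. The sketch below relies on two classical facts that are not established in this section and are therefore assumptions here: for such measures the infimum over couplings is attained at a permutation of the atoms, and on the line with cost |x − y|^p, p ≥ 1, the monotone (sorted) matching is optimal. The function names and sample points are illustrative.

```python
from itertools import permutations


def wasserstein_1d(xs, ys, p=2.0):
    """W_p between uniform empirical measures on the line, via sorted matching."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same number of atoms")
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys))) / len(xs)
    return cost ** (1.0 / p)


def wasserstein_bruteforce(xs, ys, p=2.0):
    """Same quantity by minimizing over all permutation couplings (small n only)."""
    n = len(xs)
    best = min(
        sum(abs(x - ys[j]) ** p for x, j in zip(xs, perm))
        for perm in permutations(range(n))
    )
    return (best / n) ** (1.0 / p)


xs = [0.3, -1.2, 2.5, 0.9, -0.4]
ys = [1.1, 0.0, -2.0, 3.3, 0.6]
for p in (1.0, 2.0, 3.0):
    print(p, wasserstein_1d(xs, ys, p), wasserstein_bruteforce(xs, ys, p))
```

The brute-force search is factorial in the number of atoms and is only meant as a cross-check of the sorted matching on a handful of points.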
The Kantorovich-Rubinstein Theorem 9.1.4 is of particular interest in two special
cases. First, choose d̄ = d where d is the distance on the Polish space E. This
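choice gives rise to the Wasserstein distance W1. Another choice is the trivial metric
$\bar{d}(x, y) = 1_{x \neq y}$, (x, y) ∈ E × E, which yields the total variation distance between
μ and ν,
\[ \mathcal{T}_{\bar{d}}(\mu, \nu) = \frac{1}{2}\int_E \Bigl|1 - \frac{d\nu}{d\mu}\Bigr|\,d\mu = \|\mu - \nu\|_{TV} = \sup_{A \in \mathcal{F}}\bigl[\mu(A) - \nu(A)\bigr]. \tag{9.1.5} \]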
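On a finite set the three expressions in (9.1.5) can be compared directly. The following sketch uses two hypothetical probability vectors `mu` and `nu` on four points: it evaluates half the L1 distance, the supremum over all subsets by exhaustive search, and the transportation cost for the trivial metric, computed as one minus the total mass of the pointwise minimum (the classical maximal coupling formula, used here as an assumption).

```python
from itertools import combinations


def tv_half_l1(mu, nu):
    """Half of the L1 distance between the two probability vectors."""
    return 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))


def tv_sup_sets(mu, nu):
    """Supremum of mu(A) - nu(A) over all subsets A, by exhaustive search."""
    indices = range(len(mu))
    best = 0.0  # value for the empty set
    for size in range(1, len(mu) + 1):
        for subset in combinations(indices, size):
            best = max(best, sum(mu[i] - nu[i] for i in subset))
    return best


def tv_coupling(mu, nu):
    """Smallest probability of {x != y} over couplings: 1 - sum of min(mu_i, nu_i)."""
    return 1.0 - sum(min(m, n) for m, n in zip(mu, nu))


mu = [0.1, 0.4, 0.2, 0.3]
nu = [0.25, 0.25, 0.25, 0.25]
print(tv_half_l1(mu, nu), tv_sup_sets(mu, nu), tv_coupling(mu, nu))
```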
Theorem 9.1.5 (Brenier’s map) Let μ and ν be two probability measures on the
Borel sets of Rn with μ absolutely continuous with respect to the Lebesgue measure
and such that W2 (μ, ν) < ∞. Then there exists a convex function φ : Rn → R such
that T = ∇φ maps μ onto ν (denoted T #μ = ν), where, here, ∇φ is considered as
a map Rn → Rn . In other words, for every bounded measurable function h on Rn ,
\[ \int_{\mathbb{R}^n} h\,d\nu = \int_{\mathbb{R}^n} h(\nabla\phi)\,d\mu. \tag{9.1.6} \]
Euclidean case).
If μ and ν have densities f and g with respect to the Lebesgue measure on Rn ,
according to (9.1.6), for every bounded measurable map h : Rn → R,
\[ \int_{\mathbb{R}^n} h(y)\,g(y)\,dy = \int_{\mathbb{R}^n} h\bigl(\nabla\phi(x)\bigr)\,f(x)\,dx. \]
Whenever the change of variable y = ∇φ(x) is licit, the preceding leads to the
so-called Monge-Ampère equation
\[ f(x) = g\bigl(\nabla\phi(x)\bigr)\,\det\bigl(\nabla\nabla\phi(x)\bigr) \tag{9.1.8} \]
(where ∇∇φ is the matrix of the second derivatives of φ). However, the map
T = ∇φ exists only almost everywhere, and may not be differentiable in any usual
sense. The Monge-Ampère equation may then be understood in a generalized sense.
Actually it can be proved that φ is locally Lipschitz on the interior of its domain and
that the Monge-Ampère equation is valid f dx-almost everywhere with ∇∇φ being
understood as the Hessian of φ in the sense of Aleksandrov (the absolutely contin-
uous part of the distributional Hessian of φ). Alternatively, suitable assumptions on
f and g ensure its validity as in the next statement.
In the following, we say that a function f defined on an open set O in Rn belongs
to C k,α (O) for some k ∈ N if f ∈ C k (O) and all its derivatives up to order k are
locally Hölder continuous with exponent α ∈ (0, 1).
This section addresses transportation cost inequalities which are built on the cost
functions discussed in the preceding section. More precisely, transportation cost in-
equalities compare a transportation cost distance to a fluctuation distance expressed
by (relative) entropy.
The first example of a transportation cost inequality is the
Pinsker-Csiszár-Kullback inequality (5.2.2), p. 244,
\[ \|\mu - \nu\|_{TV}^2 \le \frac{1}{2}\,H(\nu \,|\, \mu), \]
where we recall that, for probability measures μ, ν on a metric space (E, d),
\[ H(\nu \,|\, \mu) = \int_E \log\frac{d\nu}{d\mu}\,d\nu \]
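The Pinsker-Csiszár-Kullback inequality is easy to test on a finite set. The sketch below draws random pairs of strictly positive probability vectors (so that the relative entropy is finite) and checks the inequality with the total variation norm taken as half the L1 distance, in agreement with (9.1.5). The helper names and the random seed are illustrative.

```python
import math
import random


def total_variation(mu, nu):
    """Total variation distance as half the L1 distance."""
    return 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))


def relative_entropy(nu, mu):
    """H(nu | mu) = sum of nu_i * log(nu_i / mu_i), for strictly positive mu."""
    return sum(n * math.log(n / m) for n, m in zip(nu, mu) if n > 0.0)


def random_distribution(size, rng):
    """A strictly positive probability vector of the given size."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
pairs = [(random_distribution(6, rng), random_distribution(6, rng)) for _ in range(200)]
# Small slack absorbs floating-point rounding.
pinsker_holds = all(
    total_variation(mu, nu) ** 2 <= 0.5 * relative_entropy(nu, mu) + 1e-12
    for mu, nu in pairs
)
print(pinsker_holds)
```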
for some C > 0 and all ν ∈ P(E). (Note that under this inequality, H(ν | μ) = ∞
whenever ν ∉ P1(E).) This family of transportation cost inequalities will in
particular be studied below in connection with concentration inequalities.
However, the connections between transportation cost inequalities, partial differ-
ential equations and functional inequalities as discussed in this work are actually
of most interest for a quadratic cost and the associated W2 Wasserstein distance
(with respect to the Euclidean or Riemannian structures). The first basic example in
this setting is the following quadratic transportation cost inequality for the Gaussian
measure in Rn . Recall that P2 (Rn ) denotes the set of probability measures on Rn
with a second moment.
standard Gaussian measure on the Borel sets of Rn . Then, for any ν ∈ P(Rn ),
This identity is the Monge-Ampère equation (9.1.8) in this case. Hence, taking
logarithms, for every x,
\[ \log f\bigl(T(x)\bigr) + \log T'(x) - \frac{1}{2}\,T(x)^2 = -\frac{x^2}{2}. \]
Integrating with respect to μ, and using that T#μ = ν,
\[ \int_{\mathbb{R}} \log f\,d\nu = \frac{1}{2}\int_{\mathbb{R}} \bigl[T(x)^2 - x^2\bigr]\,d\mu - \int_{\mathbb{R}} \log T'\,d\mu. \]
Talagrand’s inequality of Theorem 9.2.1 tells us that the standard Gaussian mea-
sure on Rn satisfies T2 (1) (justifying again the choice in the normalization of the
constant C).
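The constant in Talagrand's inequality can be probed on translates of the Gaussian measure. The sketch below works in dimension one and rests on assumptions that are not spelled out in this passage: T2(1) is read as W2(ν, μ)² ≤ 2 H(ν | μ), ν is the Gaussian law with mean m and unit variance, and the translation T(x) = x + m is taken as the optimal map (it is the derivative of a convex function pushing μ onto ν). Under these assumptions both sides can be computed by elementary quadrature, and they should coincide, which suggests that the constant cannot be improved.

```python
import math


def gaussian_density(x: float, mean: float = 0.0) -> float:
    """Density of the Gaussian law with the given mean and unit variance."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def trapezoid(func, lower: float, upper: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * h)
    return total * h


def entropy_of_shift(m: float) -> float:
    """H(nu | mu) for nu = N(m, 1), mu = N(0, 1): log(dnu/dmu)(x) = m x - m^2 / 2."""
    return trapezoid(
        lambda x: (m * x - 0.5 * m * m) * gaussian_density(x, mean=m),
        m - 12.0,
        m + 12.0,
    )


def squared_cost_of_shift(m: float) -> float:
    """Quadratic cost of the translation map T(x) = x + m under mu = N(0, 1)."""
    return trapezoid(lambda x: ((x + m) - x) ** 2 * gaussian_density(x), -12.0, 12.0)


for m in (0.5, 2.0):
    print(m, squared_cost_of_shift(m), 2.0 * entropy_of_shift(m))
```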
The following equivalent dual formulation of Definition 9.2.2 will turn out to
be important and useful. With this task in mind, we introduce, for any continuous
function f : E → R, the infimum-convolutions
\[ Q_t f(x) = Q_t(f)(x) = \inf_{y \in E}\Bigl[f(y) + \frac{1}{2t}\,d(x, y)^2\Bigr], \qquad t > 0,\ x \in E. \tag{9.2.2} \]
\[ \int_E e^{Q_C f}\,d\mu \le e^{\int_E f\,d\mu}. \tag{9.2.4} \]
Proof We first prove (9.2.4) from T2(C). According to (9.2.3), if $\frac{d\nu}{d\mu} = g$, the
quadratic transportation cost inequality T2(C) expresses that
\[ \int_E Q_1 f\,g\,d\mu - \int_E f\,d\mu \le C\,H(\nu \,|\, \mu) = C\int_E g\log g\,d\mu. \]
Choosing
\[ g = \frac{e^{(Q_1 f)/C}}{\int_E e^{(Q_1 f)/C}\,d\mu} \]
9.2 Transportation Cost Inequalities 441
The conclusion follows since by homogeneity $\frac{1}{C}\,Q_1 f = Q_C\bigl(\frac{f}{C}\bigr)$. Conversely, (9.2.4)
Again by means of (9.2.3), the T2(C) inequality follows after changing f into f/C.
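The homogeneity identity used in the proof above, (1/C) Q_1 f = Q_C(f/C), holds on any metric space and can be observed on a finite one. The sketch below takes a grid of points on the line with the usual distance and a hypothetical test function; the infimum in (9.2.2) then becomes a minimum over the grid.

```python
import math


def inf_convolution(f_values, points, t):
    """Q_t f on a finite set of real points: min over y of f(y) + |x - y|^2 / (2 t)."""
    return [
        min(fy + (x - y) ** 2 / (2.0 * t) for fy, y in zip(f_values, points))
        for x in points
    ]


points = [k / 10.0 for k in range(-20, 21)]
f_values = [abs(y) - 0.5 * math.cos(3.0 * y) for y in points]  # arbitrary test function
C = 2.5

lhs = [q / C for q in inf_convolution(f_values, points, 1.0)]   # (1/C) Q_1 f
rhs = inf_convolution([v / C for v in f_values], points, C)     # Q_C (f / C)
print(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

The printed discrepancy should be at the level of floating-point rounding, since dividing the bracket in (9.2.2) by C turns the weight 1/2 into 1/(2C).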
Before turning to the tensorization issues, note that a quadratic transportation cost
inequality T2 (C) (for μ ∈ P2 (E)) implies an inequality T1 (C) of the form (9.2.1),
with the same constant C > 0. Moreover, following the scheme of proof of the above
Proposition 9.2.3, such an inequality T1 (C) may be translated to the equivalent
\[ \int_E e^{sf}\,d\mu \le e^{s\int_E f\,d\mu + Cs^2/2} \tag{9.2.5} \]
Proof We work accordingly with the dual formulation of Proposition 9.2.3. Let
f : E1 × E2 → R be bounded and continuous. For C = max(C1 , C2 ) and
(x1, x2) ∈ E1 × E2,
\[ Q_C f(x_1, x_2) \le \inf_{(y_1, y_2) \in E_1 \times E_2}\Bigl[f(y_1, y_2) + \frac{1}{2C_1}\,d_1(x_1, y_1)^2 + \frac{1}{2C_2}\,d_2(x_2, y_2)^2\Bigr]. \]
Now, since $Q_{C_2}$ is an infimum,
\[ \int_{E_1} Q_{C_2} f(x_1, x_2)\,d\mu_1(x_1) \le Q_{C_2}\Bigl(\int_{E_1} f(x_1, \cdot)\,d\mu_1(x_1)\Bigr)(x_2), \]
so that integrating in dμ2 (x2 ) and applying Proposition 9.2.3 again, this time along
the variable x2 , yields the conclusion.
Together with Proposition 9.2.4, we may then conclude the proof of Talagrand’s
inequality of Theorem 9.2.1 in any dimension. Now, another way to obtain this
theorem is to try to perform a transportation proof directly in dimension n with
the help of the Brenier transportation map. This is the approach followed in the
next section, which will prove useful not only for transportation cost inequalities
but also for other functional inequalities such as logarithmic Sobolev and Sobolev
inequalities.
and the standard carré du champ operator $\Gamma(f) = |\nabla f|^2$ of the diffusion operator
Proof We only sketch the argument, without justifying several of the regularity is-
sues involved in the proof. Since T pushes f μ onto gμ, the corresponding Monge-
Ampère equation (9.1.8) reads
$$f(x)\, e^{-W(x)} = g\big(T(x)\big)\, e^{-W(x + \nabla\theta(x))} \det\big( \mathrm{Id} + \nabla\nabla\theta(x) \big), \quad x \in \mathbb{R}^n.$$
Note that the matrix Id +∇∇θ is positive at any point since the Brenier map φ is
convex. Hence
$$\log \det\big( \mathrm{Id} + \nabla\nabla\theta(x) \big) \le \Delta\theta(x).$$
Furthermore, by the convexity assumption on W , for every x ∈ Rn ,
$$W\big( x + \nabla\theta(x) \big) - W(x) \ge \nabla W(x) \cdot \nabla\theta(x) + \frac{\rho}{2}\, |\nabla\theta(x)|^2.$$
Therefore,
$$\log g\big(T(x)\big) \ge \log f(x) + \nabla W(x) \cdot \nabla\theta(x) - \Delta\theta(x) + \frac{\rho}{2}\, |\nabla\theta(x)|^2.$$
After integration with respect to f μ,
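$$\int_{\mathbb{R}^n} g \log g\, d\mu \ge \int_{\mathbb{R}^n} f \log f\, d\mu - \int_{\mathbb{R}^n} [\Delta\theta - \nabla W \cdot \nabla\theta]\, f\, d\mu + \frac{\rho}{2}\, W_2(f\mu, g\mu)^2$$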
where
$$I_\mu(f) = \int_{\mathbb{R}^n} \frac{|\nabla f|^2}{f}\, d\mu$$
is the Fisher information (5.1.6), p. 237, of f (or f μ) with respect to μ. (Recall that
according to Remark 5.1.2, p. 238, this Fisher information Iμ (f ) has to be under-
stood as a suitable limit of strictly positive functions decreasing to f .) The resulting
inequality is known as the HWI inequality (involving entropy H, Wasserstein dis-
tance W and Fisher information Iμ ) and is exhibited in the next corollary.
The HWI inequality actually admits an alternative proof which relies on the
semigroup tools developed in this work, allowing in particular with the material
presented in Sect. 8.6, p. 421, for the extension to the setting of a (Full) Markov
Triple (E, μ, Γ) under the curvature condition CD(ρ, ∞). We briefly outline the
argument keeping for simplicity the Euclidean notation, all the arguments holding
true in the same way in a Markov Triple.
Alternative proof of Corollary 9.3.3 Denote by P = (Pt )t≥0 the Markov semigroup
with generator $L = \Delta - \nabla W \cdot \nabla$. Let T > 0 and let f be a smooth bounded and
strictly positive function on $\mathbb{R}^n$ such that $\int_{\mathbb{R}^n} f\, d\mu = 1$. Following de Bruijn's
identity (5.7.3), p. 269,
$$\mathrm{Ent}_\mu(f) = \int_0^T I_\mu(P_t f)\, dt + \mathrm{Ent}_\mu(P_T f).$$
Since ∇∇W ≥ ρ Id, and thus the CD(ρ, ∞) criterion holds, it follows from the
exponential decay (5.7.4), p. 269, of Fisher information that
$$P_T(\log P_T f)(x) \le \log P_{2T} f(y) + \frac{|x - y|^2}{2\beta(T)}$$
$$H(f\mu \,|\, \mu) \le \frac{1}{2\rho}\, I_\mu(f),$$
which is the logarithmic Sobolev inequality for the measure dμ = e−W dx (as de-
duced in Corollary 5.7.2, p. 268, from the curvature condition CD(ρ, ∞) for some
ρ > 0). When only ρ ≥ 0, the quadratic transportation cost inequality T2 (C) still im-
plies a logarithmic Sobolev inequality, but with a weaker constant, namely LS(4C)
(the implication actually still holds as soon as $\rho > -\frac{1}{C}$). With a less precise
dependence on the constants, this implication actually holds if μ only satisfies the T1 (C)
transportation cost inequality (9.2.1) as a consequence of Theorem 8.7.2, p. 426. In
general, however, a quadratic transportation cost inequality is not enough to ensure
the validity of a logarithmic Sobolev inequality (see Sect. 9.6 below and the Notes
and References).
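The logarithmic Sobolev inequality H(fμ | μ) ≤ I_μ(f)/(2ρ) quoted above can be checked numerically in the simplest case. The sketch below assumes n = 1 and W(x) = x²/2 + (1/2) log(2π), so that μ is the standard Gaussian measure and ρ = 1. The midpoint quadrature and the two test densities are illustrative choices. For the exponential tilt f(x) = exp(mx − m²/2) one finds H = m²/2 and I_μ(f) = m², so the inequality is an equality in that case.

```python
import math

def gauss_int(func, lo=-12.0, hi=12.0, n=24000):
    """Midpoint rule for the integral of func against the standard Gaussian density."""
    h = (hi - lo) / n
    c = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += func(x) * c * math.exp(-0.5 * x * x)
    return total * h

def entropy_and_fisher(f, df):
    """Return (H(f mu | mu), I_mu(f)) for a positive density f with derivative df."""
    H = gauss_int(lambda x: f(x) * math.log(f(x)))
    I = gauss_int(lambda x: df(x) ** 2 / f(x))
    return H, I

m = 0.8
# Exponential tilt (equality case) and a polynomial density, each given with its derivative.
tilt = (lambda x: math.exp(m * x - 0.5 * m * m),
        lambda x: m * math.exp(m * x - 0.5 * m * m))
poly = (lambda x: 0.5 * (1.0 + x * x), lambda x: x)

for name, (f, df) in (("exponential tilt", tilt), ("(1 + x^2)/2", poly)):
    H, I = entropy_and_fisher(f, df)
    print(f"{name:18s} H = {H:.6f}   I/(2 rho) = {0.5 * I:.6f}")
```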
9.3 Transportation Proofs of Functional Inequalities 447
where x · y is as usual the Euclidean scalar product in Rn . The proof below will
make use of Young’s inequality
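$$x \cdot y \le \frac{\|x\|^p}{p} + \frac{\|y\|_*^{p^*}}{p^*} \tag{9.3.3}$$
for x, y ∈ Rn , 1 ≤ p ≤ ∞, $\frac{1}{p} + \frac{1}{p^*} = 1$, and of Hölder's inequality for (suitable)
functions F, G : Rn → Rn ,
$$\int_{\mathbb{R}^n} F \cdot G\, dx \le \Big( \int_{\mathbb{R}^n} \|F\|^p\, dx \Big)^{1/p} \Big( \int_{\mathbb{R}^n} \|G\|_*^{p^*}\, dx \Big)^{1/p^*}. \tag{9.3.4}$$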
The next statement is the announced version of the Sobolev inequality (6.1.1),
p. 278, on Rn (n > 2) with respect to such an arbitrary norm. We refer to Re-
mark 6.9.5, p. 319, for the class of functions satisfying the Sobolev inequality in
Euclidean space.
where the norms are understood to be with respect to the Lebesgue measure.
The sharp constant Cn > 0 is achieved on the extremal functions
$f_{\sigma,b,x_0}(x) = (\sigma^2 + b\, \|x - x_0\|^2)^{(2-n)/2}$, $x \in \mathbb{R}^n$, σ > 0, b > 0 and $x_0 \in \mathbb{R}^n$.
Proof Before starting the proof, note that the regularity properties of Theorem 9.1.6
will not be enough since the various densities should be smooth and compactly
supported and not bounded from below by a strictly positive constant. A careful
proof requires the use of distributional gradients, Laplacians and Hessians in the
sense of Aleksandrov. As in the previous sub-section, this will be mostly implicit
below and the proof therefore only emphasizes the main ideas and arguments.
It is enough to establish the theorem for positive functions. Let u, v be smooth
compactly supported probability densities with respect to the Lebesgue measure.
According to Theorem 9.1.5, there is a convex function φ : Rn → R such that
T = ∇φ, the Brenier map, is the optimal map from udx to vdx. The associated
Monge-Ampère equation (9.1.8) reads
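$$u = v(\nabla\varphi) \det(\nabla\nabla\varphi).$$
Hence,
$$v^{-1/n}(\nabla\varphi) = u^{-1/n} \det(\nabla\nabla\varphi)^{1/n}.$$
The arithmetic-geometric inequality (for the determinant of symmetric positive
matrices) indicates that
$$\det(\nabla\nabla\varphi)^{1/n} \le \frac{1}{n}\, \Delta\varphi$$
since $\Delta\varphi$ is the trace of $\nabla\nabla\varphi$ (which is at the heart of the CD(0, n) condition).
Therefore, after integration with respect to the measure udx,
$$\int_{\mathbb{R}^n} v^{-1/n}(\nabla\varphi)\, u\, dx \le \frac{1}{n} \int_{\mathbb{R}^n} u^{1-(1/n)}\, \Delta\varphi\, dx. \tag{9.3.5}$$
Since $T = \nabla\varphi$ maps udx onto vdx,
$$\int_{\mathbb{R}^n} v^{-1/n}(\nabla\varphi)\, u\, dx = \int_{\mathbb{R}^n} v^{1-(1/n)}\, dx.$$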
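The arithmetic-geometric inequality for determinants used in this step, det(A)^{1/n} ≤ tr(A)/n for symmetric positive matrices A, can be illustrated numerically. The sketch below takes n = 3 and random matrices of the form BᵀB + 0.1 Id; the helper names and the sampling scheme are assumptions made for this illustration only.

```python
import random

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def random_spd3(rng):
    """Random symmetric positive definite 3x3 matrix of the form B^T B + 0.1 Id."""
    b = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    return [[sum(b[k][i] * b[k][j] for k in range(3)) + (0.1 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def amgm_gap(a):
    """Return trace(a)/3 - det(a)^(1/3), which is nonnegative for positive matrices."""
    trace = a[0][0] + a[1][1] + a[2][2]
    return trace / 3.0 - det3(a) ** (1.0 / 3.0)

rng = random.Random(0)   # fixed seed so that the run is reproducible
gaps = [amgm_gap(random_spd3(rng)) for _ in range(1000)]
print("smallest gap over 1000 samples:", min(gaps))
```

The gap vanishes exactly for multiples of the identity, which corresponds to the equality case of the arithmetic-geometric inequality applied to the eigenvalues.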
The same strategy may be used to derive an (optimal) Sobolev trace inequality on the
half-space, illustrating the power of the transportation argument. Set below
$\mathbb{R}^n_+ = \mathbb{R}^{n-1} \times [0, \infty)$ and $\partial\mathbb{R}^n_+ = \mathbb{R}^{n-1} \times \{0\}$ and consider as before $\|\cdot\|$ a norm on $\mathbb{R}^n$.
Denote a generic point $x \in \mathbb{R}^n_+$ by x = (y, s) so that $y \in \mathbb{R}^{n-1}$ is identified with
$(y, 0) \in \partial\mathbb{R}^n_+$.
The sharp constant Cn > 0 is attained for the extremal functions $h(x) = \|x - e\|^{2-n}$,
$x \in \mathbb{R}^n_+$, with $e = (0, \ldots, 0, -1) \in \mathbb{R}^n$.
Proof The starting point is similar to the proof of Theorem 9.3.5 for the Sobolev
inequality. We again make use of the Monge-Ampère equation (9.1.8) in the sense
of Aleksandrov without any further notice.
Let u and v be two smooth probability densities with respect to the Lebesgue
measure on Rn+ with compact support, and denote by T = ∇φ the Brenier map
from udx onto vdx. As in (9.3.5), integrating with respect to the measure udx,
$$\int_{\mathbb{R}^n_+} v^{1-(1/n)}\, dx \le \frac{1}{n} \int_{\mathbb{R}^n_+} u^{1-(1/n)}\, \Delta\varphi\, dx. \tag{9.3.9}$$
Again the integration by parts is justified since u and v are compactly supported,
and the map φ may be extended to the full space as a convex function so that ∇φ
has a bounded variation at the boundary. Since T = ∇φ is the optimal map between
udx and vdx, ∇φ ∈ Rn+ and then ∇φ · e ≤ 0 on ∂Rn+ , so that the previous inequality
yields
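$$\int_{\mathbb{R}^n_+} v^{1-(1/n)}\, dx \le -\frac{1}{n} \int_{\mathbb{R}^n_+} \nabla\big( u^{1-(1/n)} \big) \cdot \nabla(\varphi - e \cdot x)\, dx - \frac{1}{n} \int_{\partial\mathbb{R}^n_+} u^{1-(1/n)}\, dx. \tag{9.3.11}$$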
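Let now f and g be two positive smooth compactly supported functions such
that $u = f^p$ and $v = g^p$ ($p = \frac{2n}{n-2}$) are probability densities. The Cauchy-Schwarz
inequality and the properties of the transportation map $T = \nabla\varphi$ then imply, as in the
proof of Theorem 9.3.5, that
$$\int_{\partial\mathbb{R}^n_+} f^{\tilde p}\, dx \le \tilde p\, \Big( \int_{\mathbb{R}^n_+} g^p\, \|x - e\|^2\, dx \Big)^{1/2} \big\|\, \|\nabla f\|_* \big\|_{L^2(\mathbb{R}^n_+)} - n \int_{\mathbb{R}^n_+} g^{\tilde p}\, dx.$$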
9.4 Hamilton-Jacobi Equations 451
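$$\frac{\|f\|^{\tilde p}_{L^{\tilde p}(\partial\mathbb{R}^n_+)}}{\big\|\, \|\nabla f\|_* \big\|^{\tilde p}_{L^2(\mathbb{R}^n_+)}} \le \kappa^{-\tilde p}\, \big[ \kappa A(g) - B(g) \big]$$
where
$$A(g) = \tilde p\, \Big( \int_{\mathbb{R}^n_+} g^p\, \|x - e\|^2\, dx \Big)^{1/2} \quad \text{and} \quad B(g) = n \int_{\mathbb{R}^n_+} g^{\tilde p}\, dx.$$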
The worst case is when $\kappa = \frac{2(n-1)}{n}\, \frac{B(g)}{A(g)}$, for which it follows that for all positive g
such that $g^p$ is a probability density,
$$\|f\|_{L^{\tilde p}(\partial\mathbb{R}^n_+)} \le C_n(g)\, \big\|\, \|\nabla f\|_* \big\|_{L^2(\mathbb{R}^n_+)}$$
where Cn (g) > 0. The optimal Sobolev trace inequality follows with the choice of
g = ch for a suitable normalization constant c > 0 since when f = g = ch, all the
previous inequalities turn into equalities. As in the proof of Theorem 9.3.5, even if h
is not compactly supported, all the arguments are justified since in this case ∇φ = x.
The proof of Theorem 9.3.6 is then easily completed along these lines.
where now $p = \frac{qn}{n-q}$. Similarly, the Sobolev trace inequality
$$\|f\|_{L^{\tilde p}(\partial\mathbb{R}^n_+)} \le C_{n,q}\, \big\|\, \|\nabla f\|_* \big\|_{L^q(\mathbb{R}^n_+)}$$
Existence and properties of the solutions are summarized in the next proposition.
Hamiltonians, and actually only concentrate below (as for Wasserstein distances in
the preceding section) on the quadratic example $H(r) = \frac{r^2}{2}$ for which $H^* = H$. In
this case, $Q_t^{(H)}$ will be denoted more simply by $Q_t$, so that
$$Q_t f(x) = \inf_{y \in M}\Big[ f(y) + \frac{1}{2t}\, d(x,y)^2 \Big], \quad t > 0,\ x \in M. \tag{9.4.4}$$
As already emphasized in (9.2.3), the link between the Hopf-Lax formula and
optimal transportation is made clear on the dual Kantorovich problem. Indeed, for
c(x, y) = H (d(x, y)), (x, y) ∈ M × M, Theorem 9.1.3 implies that for all probabil-
ity measures μ, ν ∈ P(M),
$$T_c(\mu, \nu) = \sup \Big[ \int_M Q_1^{(H)} f\, d\mu - \int_M f\, d\nu \Big] \tag{9.4.5}$$
In order to get a flavor of the connections between Hamilton-Jacobi and heat equa-
tions, and therefore between transportation inequalities and functional inequalities
related to heat kernels as developed earlier in this monograph, consider the example
of a diffusion semigroup (Pt )t≥0 with generator $L = \Delta_g - \nabla W \cdot \nabla$ on a Riemannian
manifold (M, g) with carré du champ operator $\Gamma(f) = |\nabla f|^2$. The Hamilton-Jacobi
equation, with $H(r) = \frac{r^2}{2}$, r ∈ R+ , with a bounded continuous initial condition
f : M → R takes here the form
$$\partial_t u + \frac{1}{2}\, |\nabla u|^2 = 0, \quad u(0, \cdot) = f(\cdot). \tag{9.4.6}$$
Consider now, for every ε > 0,
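$$u^\varepsilon = -2\varepsilon \log P_{\varepsilon t}\big( e^{-f/2\varepsilon} \big), \tag{9.4.7}$$
for which, by the heat equation and the change of variables formula,
$$\partial_t u^\varepsilon = \varepsilon\, L u^\varepsilon - \frac{1}{2}\, |\nabla u^\varepsilon|^2, \quad u^\varepsilon(0, \cdot) = f.$$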
As ε → 0, it is expected that uε will approach the solution u of (9.4.6) in a rea-
sonable sense. This approximation is classically known as the vanishing viscosity
method. Note that the limiting solution u given by the infimum-convolution $Q_t f$ is
independent of the potential W and thus of the invariant measure $d\mu = e^{-W} d\mu_g$ of
the diffusion operator $L = \Delta_g - \nabla W \cdot \nabla$. In particular, this asymptotics is explicit
on the basic Brownian and Ornstein-Uhlenbeck examples.
where pt (x, y), t > 0, (x, y) ∈ M × M, are the density kernels with respect to the
reference measure μ. Under this asymptotics, uε in (9.4.7) takes the form
$$u^\varepsilon(x) = -\log \Big\| \exp\Big( -f - \frac{d^2(x, \cdot)}{2t} + o(1) \Big) \Big\|_{1/2\varepsilon}, \quad x \in M$$
(where the norm is understood to be with respect to μ). Using the fact that
Lp -norms converge to L∞ -norms as p → ∞, it turns out in the end that uε con-
verges when ε → 0 to the infimum-convolution Qt f of (9.4.4) which recovers the
Hopf-Lax formulation. Of course, this should only be considered as a heuristic, the
precise approximation having to be set up more carefully (see Sect. 9.5 below).
The following statement is then the analogue of the (linear) hypercontractivity The-
orem 5.2.3, p. 246.
Conversely, if (9.5.1) holds for some a > 0 and all t > 0 and all f bounded and
continuous, then μ satisfies LS(C).
Proof The proof is similar to the proof of Theorem 5.2.3, p. 246. The derivative
along Pt is replaced by the derivative along Qt , which together with the Hamilton-
Jacobi equation (9.4.2) yields that
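$$\partial_t \int_M e^{q Q_t f}\, d\mu = -\frac{q}{2} \int_M |\nabla Q_t f|^2\, e^{q Q_t f}\, d\mu.$$
Then, setting
$$\Lambda = \Lambda(t, q) = \int_M e^{q Q_t f}\, d\mu, \quad t \ge 0,\ q > 0,$$
the LS(C) inequality applied to $q Q_t f$ expresses that
$$q\, \partial_q \Lambda - \Lambda \log \Lambda \le -C q\, \partial_t \Lambda.$$
For the choice of $q = q(t) = a + \frac{t}{C}$, it is then immediately checked from this
inequality that $H(t) = \frac{1}{q(t)} \log\big( \Lambda(t, q(t)) \big)$ is decreasing in t > 0. Inequality (9.5.1) is
then simply H (t) ≤ H (0).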
Conversely, assuming that, for some a > 0, (9.5.1) holds for any t > 0 and any
initial condition f , the time derivative at t = 0 yields
$$\mathrm{Ent}_\mu\big( e^{af} \big) \le \frac{C a^2}{2} \int_M |\nabla f|^2\, e^{af}\, d\mu,$$
which amounts (as a > 0) to the logarithmic Sobolev inequality LS(C). Theo-
rem 9.5.1 is therefore established. (Note that the absolute continuity of μ is im-
plicitly used in the Hamilton-Jacobi equation, which only holds almost everywhere.
More refined analyses developed in works mentioned in the Notes and References
actually allow us to suitably extend the conclusion without this assumption.)
of (Qt )t>0 directly from that for (Pt )t≥0 using the vanishing viscosity method out-
lined in Sect. 9.4. Indeed, under the logarithmic Sobolev inequality LS(C), apply
the reverse hypercontractivity property described in Remark 5.2.4, p. 247, stating
that
$$\|P_t h\|_q \ge \|h\|_p$$
for any strictly positive h : M → R and any −∞ < q < p < 1 such that
$e^{2t/C} = \frac{q-1}{p-1}$. For ε > 0, let 0 > p = −2εa > q = −2εb and t > 0 so that
$$e^{2\varepsilon t/C} = \frac{1 + 2\varepsilon b}{1 + 2\varepsilon a}.$$
(where the norms are understood here with respect to the Lebesgue measure). Fur-
thermore, as β → ∞ (with α = 1 for example), for every x ∈ Rn ,
$$Q_t f(x) \le \log \int_{\mathbb{R}^n} e^f\, dx + \frac{n}{2} \log \frac{1}{2\pi t}.$$
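as
$$\mathrm{Ent}_{dx}\big( e^f \big) \le \frac{n}{2r} \int_{\mathbb{R}^n} |\nabla f|^2\, e^f\, dx + \frac{n}{2} \log\Big( \frac{r}{2\pi e^2 n} \Big) \int_{\mathbb{R}^n} e^f\, dx, \quad r > 0.$$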
Setting $\Lambda(t) = \| e^{Q_t f} \|_{q(t)}$ with $q(t) = \frac{\alpha\beta}{(\alpha - \beta)t + \beta}$, 0 < α ≤ β, this family of logarithmic
Sobolev inequalities with $r = n q'(t)$ implies as in the proof of Theorem 9.5.1
that
$$\Lambda'(t) \le \Lambda(t)\, \frac{n q'(t)}{2 q(t)^2} \log \frac{q'(t)}{2\pi e^2}.$$
The conclusion follows after integration over [0, 1]. Note that the limiting case
β = ∞ may also be established directly since by definition of Qt f (x),
$$e^{Q_t f(x) - |x - y|^2/2t} \le e^{f(y)}$$
It should be noted that (9.5.3) of Theorem 9.5.2, which holds for every t ≥ 0 and
β ≥ α for a fixed α > 0, is formally equivalent to the Euclidean logarithmic Sobolev
inequality.
To conclude this section we briefly describe a local version of the main hyper-
contractivity Theorem 9.5.1 for solutions of the Hamilton-Jacobi equations in the
spirit of local hypercontractivity for the linear semigroup (Pt )t≥0 of Theorem 5.5.5,
p. 262. The result in particular produces a characterization of the curvature condi-
tion CD(ρ, ∞). We still work in the context of a (weighted) Riemannian manifold
(M, g).
Proof We only sketch the argument following the proofs of Theorem 5.5.5,
p. 262, and Theorem 9.5.1. Theorem 5.5.2, p. 259, indicates that (i) is equiva-
lent to the local logarithmic Sobolev inequality (5.5.5), p. 260. Fixing u > 0, let
$\Lambda(s) = P_u\big( e^{q(s) Q_s f} \big)^{1/q(s)}$, s ∈ [0, t], where q is affine and satisfies $q'(s) = \frac{\rho}{1 - e^{-2\rho u}}$.
Arguing as in the proof of Theorem 9.5.1, the local logarithmic Sobolev inequality
(5.5.5), p. 260, at time u readily implies that $\Lambda'(s) \le 0$ on [0, t], proving (ii).
Conversely, a simple asymptotics at t = 0 of (9.5.4) implies (5.5.5), p. 260, and
thus (i).
The first main result is that a logarithmic Sobolev inequality is stronger than a
quadratic transportation cost inequality.
$$\int_M e^{Q_C f}\, d\mu \le e^{\int_M f\, d\mu}.$$
But this is exactly the dual formulation of the T2 (C) inequality as expressed in
Proposition 9.2.3.
We will present below an alternative proof of the Otto-Villani Theorem. How-
ever, before we begin, it should be pointed out that the converse implication in The-
orem 9.5.1 only works when a > 0, that is a transportation cost inequality T2 (C)
does not in general imply in return a logarithmic Sobolev inequality (and explicit
counterexamples may be given on the real line). Nevertheless, a T2 (C) inequality
implies a Poincaré inequality P (C).
The proof again relies on the dual description of T2 (C) which yields that for
every bounded continuous function f on M, and every t > 0,
$$\int_M e^{t Q_{Ct} f}\, d\mu \le e^{t \int_M f\, d\mu}.$$
where we used Jensen’s inequality in the last inequality. By the entropic inequal-
ity (5.1.2), p. 236,
$$\frac{t}{C} \int_M e^f\, Q_t f\, d\mu \le \mathrm{Ent}_\mu\big( e^f \big) + \int_M e^f\, d\mu\, \log \int_M e^{t Q_t f/C}\, d\mu.$$
for every t > C. While we cannot let t → 0 in (9.6.1) to reach a logarithmic Sobolev
inequality, it turns out that, after some work, (9.6.1) implies in return a transportation
cost inequality T2 (C1 ) for some C1 > 0 only depending on C. In other words, (9.6.1)
is a kind of ersatz of the logarithmic Sobolev inequality replacing the generator L by
$\frac{1}{t}(\mathrm{Id} - Q_t)$ which is equivalent to T2 (C). The argument relies on a refined analysis of
the Herbst argument of Sect. 5.4, p. 252, but we skip the details here (see Sect. 9.9).
Now, due to Lemma 5.1.7, p. 240, and since f − Qt f ≥ 0, it is clear that (9.6.1)
is stable under bounded perturbations of μ. As a consequence, similarly to Poincaré
and logarithmic Sobolev inequalities, quadratic transportation cost inequalities T2
are stable under bounded perturbations.
The proof shows that the conclusion actually holds under a (uniform) Gaussian
measure concentration property (8.7.3), p. 426 (essentially equivalent to T1 (C)).
Then $1 - \mu(A) \le \frac{1}{1+\varepsilon}$ and, for every t > 0 and every x ∈ E,
$$Q_t f(x) \le \inf_{y \in A}\Big[ f(y) + \frac{1}{2t}\, d(x,y)^2 \Big] \le (1 + \varepsilon) \int_E f\, d\mu + \frac{1}{2t}\, d(x, A)^2$$
where d(x, A) is the distance from the point x to the set A. For every r > 0, the map
h(x) = min(d(x, A), r), x ∈ E, is 1-Lipschitz and
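$$\int_E h\, d\mu \le r\, \big( 1 - \mu(A) \big) \le \frac{r}{1+\varepsilon}.$$
Hence, if $s = \frac{\varepsilon r}{1+\varepsilon}$,
$$\mu\big( d(\cdot, A) \ge r \big) = \mu(h \ge r) = \mu\Big( h \ge \frac{r}{1+\varepsilon} + s \Big) \le \mu\Big( h \ge \int_E h\, d\mu + s \Big),$$
so that by the dual formulation (9.2.5) of the T1 inequality and its application (9.2.6)
via Markov's inequality,
$$\mu\big( d(\cdot, A) \ge r \big) \le e^{-s^2/2C} = e^{-\varepsilon^2 r^2/2C(1+\varepsilon)^2}$$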
9.6 Transportation Cost and Logarithmic Sobolev Inequalities 461
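for every r > 0. On the basis of this tail estimate, we may actually control $\int_E h\, d\mu$
in a finer way as
$$\int_E h\, d\mu = \int_0^r \mu\big( d(\cdot, A) \ge u \big)\, du \le \int_0^r e^{-\varepsilon^2 u^2/2C(1+\varepsilon)^2}\, du \le \sqrt{\frac{\pi}{2}}\, \frac{\sqrt{C}\, (1+\varepsilon)}{\varepsilon} = r_0(\varepsilon)$$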
to get similarly that, for every r ≥ r0 (ε),
$$\mu\big( d(\cdot, A) \ge r \big) = \mu(h \ge r) \le \mu\Big( h \ge \int_E h\, d\mu + r - r_0(\varepsilon) \Big) \le e^{-(r - r_0(\varepsilon))^2/2C}.$$
$$\int_E e^{Q_t f}\, d\mu \le K\, e^{(1+\varepsilon) \int_E f\, d\mu}.$$
By the hypothesis, apply the latter on $(E^k, \mu^{\otimes k})$, with respect to the $\ell^2$-metric
(thus with K uniform in k ≥ 1), to
it follows that
same constant). In this form, Theorem 9.6.1 may be extended to the Markov Triple
structure (E, μ, Γ) (provided the intrinsic distance d defines a Polish topology
on E).
Proof For simplicity we only prove the result for ρ = 0, the general result being
obtained in the same way. The most important part is the proof of (ii) under the
curvature condition. Assume for simplicity that f ≥ 0. To this end, use the
log-Harnack inequality (5.6.2), p. 266, which applied to $e^{u Q_1 f}$ yields
$$P_t(Q_1 f) \le Q_{2tu}\Big( \frac{1}{u} \log P_t\big( e^{u Q_1 f} \big) \Big)$$
for every t, u > 0. Now, the local hypercontractivity inequality (9.5.4) as p → 0
implies that
$$\log P_t\big( e^{\frac{1}{2t} Q_1 f} \big) \le \frac{1}{2t}\, P_t f$$
for every t > 0. Combining the inequalities with $u = \frac{1}{2t}$ yields (ii) for s = 1.
Arbitrary values of s are obtained by homogeneity. Conversely, under (ii), the s derivative
in (9.7.1) at s = 0 implies in return, by the Hamilton-Jacobi equation, the gradient
bound $|\nabla P_t f|^2 \le P_t(|\nabla f|^2)$ which is equivalent to the curvature-dimension condition
CD(0, ∞). The proof is complete.
Under the curvature condition CD(ρ, ∞), the commutation (9.7.1) actually extends
to the (Full) Markov Triple setting (E, μ, Γ) on the basis of the developments
9.7 Heat Flow Contraction in Wasserstein Space 463
in Sect. 8.6, p. 421. Indeed (assuming in addition that the intrinsic distance de-
fines a Polish topology on E), apply the isoperimetric Harnack inequality of The-
orem 8.6.3, p. 425, to A = {Q1 f ≥ u}, u ≥ 0, for f : E → R bounded continuous
and positive. If $z \in A_\varepsilon$ (where ε > 0), there exists an a ∈ A such that d(z, a) < ε so
that, by definition of the infimum-convolution $Q_1$,
$$f(z) + \frac{\varepsilon^2}{2} \ge f(z) + \frac{d(z,a)^2}{2} \ge Q_1 f(a) \ge u.$$
Hence
$$\{Q_1 f \ge u\}_{d(x,y)} \subset \Big\{ f \ge u - \frac{d(x,y)^2}{2} \Big\}$$
for all (x, y) ∈ E × E with d(x, y) > 0. Again taking ρ = 0 for simplicity,
Theorem 8.6.3 after integration with respect to u ≥ 0 then yields
$$P_t(Q_1 f)(x) \le P_t f(y) + \frac{d(x,y)^2}{2}$$
for every x and y in E. In other words, $P_t(Q_1 f) \le Q_1(P_t f)$ which amounts to (ii).
On the basis of the commutation property (9.7.1) between the heat and Hopf-
Lax semigroups of the last proposition, we next obtain contraction in Wasserstein
distance, which is again shown to be equivalent to the curvature condition. Recall
that f μ denotes for simplicity the measure with density f with respect to μ.
Theorem 9.7.2 (Heat flow contraction in Wasserstein space) Consider the Markov
Triple consisting of a smooth complete connected Riemannian manifold (M, g) and
of a diffusion operator $L = \Delta_g - \nabla W \cdot \nabla$, where W is a smooth potential, with
invariant measure $d\mu = e^{-W} d\mu_g$ and Markov semigroup (Pt )t≥0 . The following
assertions are equivalent.
(i) The curvature condition CD(ρ, ∞) holds for some ρ ∈ R.
(ii) For every t ≥ 0 and all probability densities f and g with respect to μ with a
finite second moment,
Proof We start with (i). For any bounded continuous h : M → R, by time reversibil-
ity and (9.7.1)
$$\int_M Q_1 h\, P_t f\, d\mu - \int_M h\, P_t g\, d\mu = \int_M P_t(Q_1 h)\, f\, d\mu - \int_M P_t h\, g\, d\mu \le \int_M Q_{e^{2\rho t}}(P_t h)\, f\, d\mu - \int_M P_t h\, g\, d\mu.$$
Since by homogeneity $Q_{e^{2\rho t}}(P_t h) = e^{-2\rho t}\, Q_1\big( e^{2\rho t} P_t h \big)$, the Kantorovich dual
description of the Wasserstein distance (Theorem 9.1.3) yields the contraction
property (9.7.2).
Assume conversely that (ii) holds, that is, again by Kantorovich duality and sym-
metry,
$$\int_M P_t(Q_1 h)\, f\, d\mu - \int_M P_t h\, g\, d\mu \le \frac{e^{-2\rho t}}{2}\, W_2(f\mu, g\mu)^2$$
for any bounded continuous h : M → R. Given x, y ∈ M, by a suitable approxima-
tion, let f μ and gμ approach respectively Dirac masses at x and y so that the latter
inequality turns into
$$P_t(Q_1 h)(x) - P_t h(y) \le e^{-2\rho t}\, \frac{d(x,y)^2}{2}$$
(alternatively take ν1 = δx and ν2 = δy in (9.7.4) below). Taking the infimum over y
then exactly expresses the commutation property (9.7.1), which is equivalent to the
CD(ρ, ∞) condition. The theorem is established.
Remark 9.7.3 Theorem 9.7.2 expresses equivalently that, under the curvature con-
dition CD(ρ, ∞),
$$W_2\big( P_t^* \nu_1, P_t^* \nu_2 \big) \le e^{-\rho t}\, W_2(\nu_1, \nu_2) \tag{9.7.4}$$
for all probability measures $\nu_1$ and $\nu_2$ in P2 (M), where according to (1.1.2), p. 9,
$P_t^* \nu_1$, respectively $P_t^* \nu_2$, is the law at time t ≥ 0 of the underlying Markov process
$\{X_t^x;\ t \ge 0,\ x \in M\}$ with initial data $\nu_1$, respectively $\nu_2$. It should also be observed
that the preceding inequalities extend, via the commutation property, to the whole
family of Wasserstein distances Wp , 1 ≤ p < ∞, and even to more general costs.
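The contraction (9.7.4) can be made explicit on the one-dimensional Ornstein-Uhlenbeck example. The sketch below assumes the generator Lf = f'' − xf' on the real line (so that ρ = 1 and the invariant measure is the standard Gaussian) and Gaussian initial data, for which both the evolved laws and the W2 distance are available in closed form: W2 between N(m1, s1²) and N(m2, s2²) equals the square root of (m1 − m2)² + (s1 − s2)². The function names and the initial data are hypothetical.

```python
import math

def w2_gauss(m1, s1, m2, s2):
    """W_2 distance between the one-dimensional Gaussians N(m1, s1^2) and N(m2, s2^2)."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

def ou_flow(m, s, t):
    """Law at time t of the Ornstein-Uhlenbeck process started from N(m, s^2).

    Assumes the generator L f = f'' - x f' (rho = 1, invariant measure N(0, 1)),
    for which X_t = e^{-t} X_0 + sqrt(1 - e^{-2t}) G with G standard normal.
    Returns the pair (mean, standard deviation).
    """
    decay = math.exp(-2.0 * t)
    return math.exp(-t) * m, math.sqrt(decay * s * s + 1.0 - decay)

nu1, nu2 = (2.0, 0.5), (-1.0, 3.0)   # (mean, standard deviation), hypothetical data
d0 = w2_gauss(*nu1, *nu2)

for t in (0.0, 0.1, 0.5, 1.0, 3.0):
    dt = w2_gauss(*ou_flow(*nu1, t), *ou_flow(*nu2, t))
    bound = math.exp(-t) * d0        # right-hand side of (9.7.4) with rho = 1
    print(f"t = {t:3.1f}   W2 = {dt:.6f}   e^(-t) W2(0) = {bound:.6f}")
```

The means contract exactly at rate e^{-t}, while the standard deviations contract at least as fast, which is why the bound is strict for t > 0 whenever the two initial variances differ.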
is called a length space if, for all (x, y) ∈ E × E, d(x, y) = inf L(γ ) where the
infimum is taken over all curves γ joining x to y. The space (E, d) is called
geodesic if all points x, y ∈ E are connected by a geodesic. In a geodesic space,
there exist middle points, or θ -middle points (θ ∈ [0, 1]), z between x, y ∈ E, i.e.
d(x, z) = θ d(x, y), d(z, y) = (1 − θ )d(x, y), and such a point z is necessarily on
a geodesic joining x to y. The Wasserstein space P2 (E) of probability measures
on the Borel sets of E with a finite second moment equipped with the Wasserstein
distance W2 is a length space if (E, d) is a length space.
In the following, μ is a fixed reference probability measure on (E, d), so that
(E, d, μ) is a metric measure space. According to the preceding developments in
a Riemannian context, we say that (E, d, μ) is of Ricci curvature bounded from
below by ρ, ρ ∈ R, if for every pair (μ0 , μ1 ) in P2 (E) (with supports contained in
the support of μ), there exists a geodesic (μθ )θ∈[0,1] in P2 (E) joining μ0 and μ1
such that
$$H(\mu_\theta \,|\, \mu) \le (1 - \theta)\, H(\mu_0 \,|\, \mu) + \theta\, H(\mu_1 \,|\, \mu) - \frac{\rho\, \theta (1 - \theta)}{2}\, W_2(\mu_0, \mu_1)^2 \tag{9.8.3}$$
for every θ ∈ [0, 1].
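The convexity inequality (9.8.3) can be tested on a fully explicit example. The sketch below assumes E = R with the standard Gaussian reference measure μ (so that ρ = 1) and Gaussian endpoints: the W2-geodesic between N(m0, s0²) and N(m1, s1²) interpolates means and standard deviations linearly, and H(N(m, s²) | μ) = (s² + m² − 1)/2 − log s. The endpoints and the helper names are hypothetical.

```python
import math

def rel_entropy(m, s):
    """H(N(m, s^2) | N(0, 1)) = (s^2 + m^2 - 1)/2 - log s."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

def geodesic(mu0, mu1, theta):
    """Point at time theta on the W_2-geodesic between two 1D Gaussians (mean, std)."""
    (m0, s0), (m1, s1) = mu0, mu1
    return (1.0 - theta) * m0 + theta * m1, (1.0 - theta) * s0 + theta * s1

mu0, mu1 = (-2.0, 0.3), (1.5, 2.0)        # hypothetical endpoints (mean, std)
w2_sq = (mu0[0] - mu1[0]) ** 2 + (mu0[1] - mu1[1]) ** 2
rho = 1.0                                  # standard Gaussian reference measure

def convexity_gap(theta):
    """Right-hand side minus left-hand side of (9.8.3); nonnegative when (9.8.3) holds."""
    lhs = rel_entropy(*geodesic(mu0, mu1, theta))
    rhs = ((1.0 - theta) * rel_entropy(*mu0) + theta * rel_entropy(*mu1)
           - 0.5 * rho * theta * (1.0 - theta) * w2_sq)
    return rhs - lhs

for k in range(11):
    theta = k / 10.0
    print(f"theta = {theta:.1f}   gap = {convexity_gap(theta):.6f}")
```

In this example the quadratic part of the entropy accounts exactly for the term involving W2, and the remaining gap comes from the concavity of the logarithm applied to the standard deviations.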
As already mentioned above, this definition is thus equivalent to the Ricci cur-
vature lower bound Ric +∇∇W ≥ ρ g in Riemannian manifolds. Among the most
noticeable aspects of this synthetic definition, not discussed here, it may be shown
to be stable under the so-called Gromov-Hausdorff convergence (a weak notion of
convergence of metric measure spaces for which a limit of Riemannian manifolds is
not necessarily a Riemannian manifold anymore). The preceding definition of Ricci
curvature lower bound also allows for Poincaré or logarithmic Sobolev inequalities
as investigated in this monograph. Further definitions may also be used to include
curvature and dimension of a metric measure space as the CD(ρ, n) condition (see
the Notes and References).
This chapter and the notes and references here are only a rough introduction to
mass transportation and transportation cost inequalities. General references on mass
transportation are the comprehensive books by S. T. Rachev [351], S. T. Rachev
and L. Rüschendorf [352, 353], and C. Villani [424, 426]. The first references are
more oriented towards probabilistic issues. The monographs by C. Villani present
a complete and deep investigation of modern mass transportation theory and its
rich interplay with partial differential equations, probability theory and geometry.
A recent account of transportation cost inequalities with bibliographical references
is the survey [210] by N. Gozlan and C. Léonard.
The reader may find in the preceding references the basic Kantorovich Theo-
rem 9.1.3 and Kantorovich-Rubinstein Theorem 9.1.4 of Sect. 9.1, presented in
greater generality. The existence of a transport map as the gradient of a convex
9.9 Notes and References 467
function (Theorem 9.1.5) goes back to M. Knott and C. Smith [267], L. Rüschen-
dorf and S. T. Rachev [373] and Y. Brenier [94], the latter having emphasized the
geometric relevance to partial differential equations. The manifold case is due to
R. McCann [304]. The associated Monge-Ampère equation (9.1.8) with the Hessian
of φ in the sense of Aleksandrov is proved in [305], on the basis of results in [180]
and the regularity properties of the Brenier map as expressed in Theorem 9.1.6 due
to L. Caffarelli [101, 102] (cf. [424, 426] for details and numerous further contribu-
tions).
The quadratic transportation cost inequality for the Gaussian measure (Theo-
rem 9.2.1) is due to M. Talagrand [404] with the one-dimensional proof together
with tensorization. A multi-dimensional transportation proof, further extended to
strictly log-concave measures, appeared in [71, 133] (see also below). The dual for-
mulation of the quadratic transportation cost inequality T2 (Proposition 9.2.3) was
introduced by S. Bobkov and F. Götze in [77]. The corresponding statement for T1
may also be found there. The link between the latter and measure concentration goes
back to K. Marton [298] (cf. [278, 426]). See [210, 213] for further developments
on transportation cost inequalities.
The new HWI inequality of Corollary 9.3.3 was introduced in the seminal pa-
per [340] by F. Otto and C. Villani. In this work, deep links between mass transporta-
tion and functional inequalities were identified on the basis of the Otto differential
calculus in the Wasserstein space of probability measures [259, 339]. The proof of
Theorem 9.3.1 given here, and its consequence for logarithmic Sobolev inequalities
and quadratic transportation cost inequalities, is due to D. Cordero-Erausquin [133].
The alternative heat flow proof of the HWI inequality comes from [76]. Caffarelli’s
contraction Theorem 9.3.4 is proved in [103] (see also [424]). On the basis of the
transportation proof of the HWI and logarithmic Sobolev inequalities, the optimal
transportation method to obtain classical Sobolev inequalities with their extremal
functions was developed by D. Cordero-Erausquin, B. Nazaret and C. Villani [137]
(where in particular more details on the integration by parts formula (9.3.6) may
be found). The optimal Sobolev trace inequality of Theorem 9.3.6 is due inde-
pendently to J. F. Escobar and W. Beckner [56, 177]. The proof presented in the
monograph and its generalization in Lp (as presented in Remark 9.3.7) is due
to B. Nazaret [324]. The mass transportation method has been further developed
to investigate a wide variety of inequalities between entropy functionals and en-
ergy in a partial differential equation context (see e.g. [3, 112, 113, 134, 296], and
cf. [424, 426]). It should be noted that, as far as functional inequalities such as
Sobolev-type inequalities are concerned, alternative maps may be considered in the
transportation proofs. In particular, the earlier triangular Knothe map [266] can be
used to this end, as M. Gromov did in a proof of the standard isoperimetric in-
equality in Euclidean space [316] (see also [137, 426]). The optimal Brenier map
becomes relevant as soon as more (Riemannian) geometric features enter into the
picture (see [426] for much more in this regard).
Classical properties of Hamilton-Jacobi equations as briefly outlined in Sect. 9.4
such as the Hopf-Lax infimum-convolution formula and the vanishing viscosity
method are presented, for example, in [44, 104, 179] (see also [424, 426] or [291]
in the context of length spaces). For recent developments in metric structures, see
[11, 12, 43, 212].
Hypercontractivity of solutions of Hamilton-Jacobi equations in Sect. 9.5 has
been emphasized in [76] towards a new proof of the Otto-Villani Theorem. Theo-
rem 9.5.1 appears there whereas ultracontractivity of the Hamilton-Jacobi solutions
is examined in [197] and in a more general context in [188, 198]. In the latter pa-
pers, generalizations of the Euclidean logarithmic Sobolev inequality in Lp (dx),
originally introduced in [148], are also considered. Local hypercontractivity for
Hamilton-Jacobi solutions, Theorem 9.5.3, is investigated in [31].
A first contraction result in Wasserstein distance (along the Boltzmann equation)
goes back to H. Tanaka [407, 408]. The Wasserstein contraction property put for-
ward in Theorem 9.7.2 may be traced back to the investigation [339] of the heat
flow as a gradient flow in the Wasserstein space, further developed in this spirit
in [112, 113, 341] and [9, 175] in connection with the Evolutionary Variational
Inequality. For coupling arguments, see [128, 129, 427, 431]. The crucial equiva-
lence with curvature bounds as expressed by Theorem 9.7.2 is due to M. von Re-
nesse and K.-T. Sturm [427]. See [424, 426] for more on Wasserstein contraction
properties. The proof given here, based on the commutation property between the
heat and Hopf-Lax semigroup (Proposition 9.7.1) is due to K. Kuwada [270] (see
also [12, 37]). Contraction under any curvature-dimension is studied, among other
recent developments, in [85, 176, 271].
The major link between quadratic transportation inequality and logarithmic
Sobolev inequality, at the source of many developments at the interface between
partial differential equations, probability theory and geometry, is due to F. Otto
and C. Villani [340]. The proof presented here is taken from [76]. That logarithmic
Sobolev inequalities are actually strictly stronger than quadratic transportation cost
inequalities is due to P. Cattiaux and A. Guillin [117] (see [208] for a direct charac-
terization). The perturbation Proposition 9.6.3 is due to N. Gozlan, C. Roberto and
P.-M. Samson [211], and was part of the early motivation of [340]. Theorem 9.6.4
is due to N. Gozlan [206], following the preliminary investigation [209] (cf. [210]).
Section 9.8 is a very short overview of recent deep investigations involving a no-
tion of Ricci curvature lower bounds in metric measure spaces due to J. Lott and
C. Villani [292] and K.-T. Sturm [399, 400]. Preliminary steps consisted of the dis-
placement convexity (9.8.1) by R. McCann [304], the heat flow as gradient flow of
entropy with respect to the Wasserstein metric by R. Jordan, D. Kinderlehrer and
F. Otto [259] and the Otto calculus [339], the links with functional inequalities by
F. Otto and C. Villani [340], the Borell-Brascamp-Lieb inequalities by D. Cordero-
Erausquin, R. McCann and M. Schmuckenschläger [135, 136] and the equiva-
lence with the Riemannian Ricci curvature lower bounds by M. von Renesse and
K.-T. Sturm [427]. This has been the starting point of many fruitful developments in
recent years. The monumental monograph [426] by C. Villani is a complete expo-
sition to which the reader is warmly referred for both mathematical developments
and historical background. See also [9].
In connection with the Markov operator curvature-dimension viewpoint empha-
sized in this book, it should be mentioned that recent developments by L. Ambrosio,
N. Gigli and G. Savaré [10, 11] actually conclude that in a suitable sub-class of met-
ric measure spaces for which the gradient energy is Hilbertian, the curvature as con-
vexity of relative entropy (9.8.3) is equivalent to the Markov operator curvature con-
dition CD(ρ, ∞), thus providing an even deeper link between Markov semigroups
and optimal transportation. The approach relies on the identification of the heat
flow as gradient flow of either Dirichlet energy or entropy in curved spaces. An im-
proved version of contraction in Wasserstein distance in the form of the Evolution-
ary Variational Inequality combining the heat flow interpolation along the geodesics
of optimal transportation provides one key step in this direction. This investigation
partially relies on the extension to a non-smooth framework of some of the ba-
sic semigroup tools described in this monograph by means of refined non-smooth
analysis in metric measure spaces. The case of curvature-dimension CD(ρ, n) is
investigated by M. Erbar, K. Kuwada and K.-T. Sturm [176, 271]. We refer to these
and the previously mentioned references for a complete account of these deep de-
velopments.
Appendices
The following appendices are devoted to the three basic aspects of this monograph,
the analytic aspect of semigroups of operators, the probabilistic aspect of diffusion
processes and their generators, and the geometric aspect of Laplace-type operators
and their curvature properties. These appendices only briefly feature a few central
objects and properties, in perspective with the main developments in the core of
the book. In particular, the analysis of Markov semigroups and their infinitesimal
generators may be found there. In this interplay between these different viewpoints,
the aim here is to help the reader to access these topics in a balanced way. It is of
course strongly advised to supplement these short surveys with more complete and
detailed references. A few such references are indicated at the end of each appendix.
The first appendix is thus devoted to some basics of the analysis of semigroups of
bounded operators on a Banach space, the second is a brief introduction to stochas-
tic calculus while the third presents some basic notions in differential and Rieman-
nian geometry. Note that the last two sections of the third appendix describe in this
geometric context some features of the Γ2 operator, in particular the reinforced
curvature condition, which are critically used in the core of the book.
Appendix A
Semigroups of Bounded Operators on a Banach
Space
This appendix presents some of the basics of the analysis of semigroups of operators
on a Banach space. It provides in particular the necessary material for the investi-
gation of Markov semigroups and their infinitesimal generators as developed in the
core of the book. The appendix surveys the Hille-Yosida theory, symmetric and
self-adjoint operators and their Friedrichs extensions, spectral decompositions and
Hilbert-Schmidt operators. Some general references on these topics are provided at
the end of the appendix.
For simplicity here (and throughout the book) Pt x = Pt (x). Note that linearity
together with the contraction property imply that for any x ∈ B, t → Pt x is contin-
uous in B since for every s ≥ 0,
$$\|P_{t+s}x - P_t x\| = \|P_t(P_s x - x)\| \le \|P_s x - x\|,$$
(thus corresponding to the choice of dν(t) = e^{−λt} dt). Anticipating the next results,
note that formally Rλ = (λ Id − L)^{−1}. Indeed, for any t ≥ 0 and x ∈ B,
$$R_\lambda P_t x = e^{\lambda t}\int_t^\infty P_s x\, e^{-\lambda s}\,ds.$$
In other words, Rλ = Rλ′ (Id + (λ′ − λ)Rλ). This identity indicates that the image of
Rλ in B is included in the image of Rλ′, and exchanging the roles of λ and λ′, that
this image is independent of λ. Call this image D.
Since Pt x → x as t → 0 by (i) and λRλ x = ∫_0^∞ P_{s/λ} x e^{−s} ds, it follows that for
every x ∈ B,
$$\lim_{\lambda\to\infty} \lambda R_\lambda x = x.$$
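The resolvent identity and the limit of λRλ x can be checked by hand in finite dimension. The following sketch is only an illustration under assumptions that are not part of the text: the generator is that of a two-state Markov chain, L = [[−a, a], [b, −b]] with arbitrarily chosen rates a = 1 and b = 2, and all function names are hypothetical.

```python
# Illustrative sketch: resolvent of a two-state Markov generator (assumed example).
A_RATE, B_RATE = 1.0, 2.0  # jump rates, chosen arbitrarily for the illustration


def resolvent(lam):
    """Return R_lam = (lam*Id - L)^{-1} for L = [[-a, a], [b, -b]], lam > 0."""
    m = [[lam + A_RATE, -A_RATE], [-B_RATE, lam + B_RATE]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]  # equals lam * (lam + a + b) > 0
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mat_mul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def resolvent_identity_gap(lam, lam2):
    """Largest entry of R_lam - R_lam2 (Id + (lam2 - lam) R_lam)."""
    r1, r2 = resolvent(lam), resolvent(lam2)
    inner = [[(1.0 if i == j else 0.0) + (lam2 - lam) * r1[i][j]
              for j in range(2)] for i in range(2)]
    rhs = mat_mul(r2, inner)
    return max(abs(r1[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


def scaled_resolvent(lam):
    """Return lam * R_lam, which should approach the identity as lam grows."""
    return [[lam * entry for entry in row] for row in resolvent(lam)]
```

In this example the rows of λRλ sum to 1 for every λ > 0, which reflects the Markov property, and the off-diagonal entries of λRλ are of order 1/λ.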
In the special case when B is a (real) Hilbert space H with scalar product ⟨·, ·⟩, one
may consider semigroups of bounded symmetric operators. Their generators are
then (unbounded) self-adjoint operators on H. In this book, H will mainly appear
as an L2 (μ)-space for some (positive) measure μ on a measurable space (E, F),
but what follows has nothing to do with this particular case. (Unbounded) self-
adjoint operators have special features, such as, for example, spectral decomposi-
tions (cf. Sect. A.4).
The first point we need to pay attention to is the difference between symmetry
and self-adjointness of unbounded operators. A linear operator A defined on a dense
For a symmetric operator A, its adjoint operator A∗ is defined on the domain D(A∗)
consisting of the points x ∈ H for which there exists a finite constant C(x) such that
for any y ∈ D(A), |⟨x, Ay⟩| ≤ C(x)‖y‖. Since A is symmetric, D(A) ⊂ D(A∗). On
D(A∗), A∗ is then defined by duality
$$\langle A^* x, y\rangle = \langle x, Ay\rangle.$$
More precisely, the map y → ⟨x, Ay⟩, which is defined on the dense subspace D(A),
may be uniquely extended into a continuous linear form on H, and thus may be
represented as the scalar product with some element A∗ x ∈ H. By the symmetry
assumption, A∗ x = Ax for every x ∈ D(A) and A∗ is therefore an extension of A.
Moreover, A∗ is a closed operator on its domain, meaning that whenever (xk )k∈N
is a sequence in D(A∗ ) converging to x such that (A∗ xk )k∈N converges to y, then
x ∈ D(A∗ ) and A∗ x = y. Thus, in particular, densely defined symmetric operators
are closable (they admit a closed extension).
The following definition makes precise the main difference between symmetry
and self-adjointness.
and then take the completion of D0 under this norm. Since the norm ‖·‖_A
is larger than the norm in H, this completion naturally imbeds into H. The
resulting completion thus defines a new Hilbert space H1 which is such that
D0 ⊂ H1 ⊂ H. Define then the domain D(A) as the set of points x ∈ H1 for
which the map ℓ(x) : y → ⟨x, y⟩_A is bounded for the H-topology (that is such that
|ℓ(x)(y)| ≤ C(x)‖y‖ for some C(x) < ∞). For these points x, ℓ(x)(y) may be represented
as ⟨T(x), y⟩ for some linear operator T defined on D(A) which is easily
seen to be self-adjoint by construction. The self-adjoint extension of A is then given
by T − Id, and the domain D(A) of A satisfies D0 ⊂ D(A) ⊂ H1 . This self-adjoint
extension is called the Friedrichs extension of A.
The Friedrichs extension is not in general the unique self-adjoint extension of A,
and two different self-adjoint extensions may have domains which cannot be com-
pared. However, the Friedrichs extension is minimal in some other sense, namely
in terms of Dirichlet domains. The construction of the Hilbert space H1 above may
actually be performed for any symmetric (and therefore any self-adjoint) extension
of A and is called the domain of the Dirichlet form. It is this domain of the Dirichlet
form which is minimal for the Friedrichs extension. When dealing with operators
Definition A.4.1 (Spectrum) Given a linear operator A on B, the resolvent set ρ(A)
of A is the set of all complex values λ ∈ C such that the range of λ Id −A is dense
in B and such that λ Id − A has a bounded inverse Rλ = (λ Id − A)^{−1}. The spectrum
σ (A) of A is C \ ρ(A).
When the operator (A, D(A)) is closed (for the domain topology), if λ ∈ ρ(A),
then the range of λ Id − A is all of B. Therefore the inverse (λ Id − A)^{−1} is everywhere defined.
Also, the resolvent set ρ(A) is always an open set in C.
A complex number λ may be in the spectrum σ (A) of A for three different rea-
sons. If there is a non-zero solution x ∈ B to Ax = λx, then λ is called an eigenvalue
(of A) and such an x ∈ B (x ≠ 0) is an eigenvector (associated to the eigenvalue
λ). The set of eigenvalues forms the so-called point spectrum. In case the range of
λ Id −A is dense in B but the inverse is not bounded, the set of those λ’s forms the
continuous spectrum. Finally, it may happen that the range of λ Id −A is not dense,
but λ is not an eigenvalue. The set of those λ’s forms the residual spectrum.
On real Banach spaces, as is the case throughout this book, one has to look at the
complexification of the space. Fortunately, this is irrelevant for self-adjoint operators
on separable Hilbert spaces for which the spectrum is always real, and the residual
spectrum is empty.
Self-adjoint operators are quite similar to symmetric matrices in finite dimension
which may be diagonalized in orthonormal bases of eigenvectors associated to real
eigenvalues. Things are of course more intricate in infinite dimension where there
may not exist any eigenvectors. Think for example of the operator Af = −f″ on the
line R, which is self-adjoint on L2 (dx) for the Lebesgue measure dx (the space of
smooth compactly supported functions being a core), and for which eigenvectors are
not in L2 (dx). Diagonalization then has to be replaced by what is called a spectral
decomposition.
We restrict ourselves here to positive self-adjoint operators A on a (real, separa-
ble) Hilbert space H as described in the previous section.
A spectral decomposition is then an increasing family (Hλ)λ≥0 of closed linear
subspaces of H, right-continuous in the sense that ∩_{λ′>λ} Hλ′ = Hλ. It is furthermore
required that ∪_{λ≥0} Hλ is dense in H. Consider then, for every λ ≥ 0, the
is bounded and increasing (and positive). Therefore, for any pair of points
(x, y) ∈ H × H, the map λ → ⟨Eλ x, y⟩ is right-continuous and of bounded variations
as a difference of bounded increasing functions
$$\langle E_\lambda x, y\rangle = \frac12\Big[\langle E_\lambda(x+y), x+y\rangle - \langle E_\lambda x, x\rangle - \langle E_\lambda y, y\rangle\Big].$$
Then given any bounded measurable function ψ : R+ → R, and any x, y ∈ H, one
may define the Stieltjes integral
$$\int_0^\infty \psi(\lambda)\,d\langle E_\lambda x, y\rangle.$$
For any measurable function ψ on R+, one may then define the operator ψ(A) as
$$\psi(A) = \int_0^\infty \psi(\lambda)\,dE_\lambda$$
on the domain (A.4.1) on which it satisfies ‖ψ(A)x‖² = ∫_0^∞ ψ(λ)² d⟨Eλ x, x⟩.
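In finite dimension the spectral decomposition reduces to diagonalization, and the measure d⟨Eλ x, x⟩ becomes a sum of point masses ⟨e_k, x⟩² at the eigenvalues λ_k. The sketch below illustrates the functional calculus ψ(A) in this simplified setting. The 2 × 2 matrix A = [[2, 1], [1, 2]], its hard-coded eigenpairs and the function names are assumptions made for the illustration only.

```python
import math

# Eigenpairs (lambda_k, e_k) of the symmetric positive matrix A = [[2, 1], [1, 2]].
# The matrix is an arbitrary example; its eigenvalues are 1 and 3.
EIGENPAIRS = [
    (1.0, (1 / math.sqrt(2), -1 / math.sqrt(2))),
    (3.0, (1 / math.sqrt(2), 1 / math.sqrt(2))),
]


def spectral_measure(x):
    """Point masses <e_k, x>^2 located at lambda_k: the measure d<E_lambda x, x>."""
    return [(lam, (e[0] * x[0] + e[1] * x[1]) ** 2) for lam, e in EIGENPAIRS]


def apply_psi(psi, x):
    """Compute psi(A) x = sum_k psi(lambda_k) <e_k, x> e_k."""
    out = [0.0, 0.0]
    for lam, e in EIGENPAIRS:
        coeff = psi(lam) * (e[0] * x[0] + e[1] * x[1])
        out[0] += coeff * e[0]
        out[1] += coeff * e[1]
    return out


def norm_squared_via_measure(psi, x):
    """Integral of psi(lambda)^2 against d<E_lambda x, x>."""
    return sum(psi(lam) ** 2 * mass for lam, mass in spectral_measure(x))
```

With ψ(λ) = √λ, applying `apply_psi` twice returns Ax, and `norm_squared_via_measure` returns ⟨x, Ax⟩, the finite-dimensional counterpart of the identity for ‖ψ(A)x‖².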
and also
$$\mathcal{E}(f,f) = \int_E f(-Lf)\,d\mu = \int_0^\infty \lambda\,d\langle E_\lambda f, f\rangle = \int_E \big((-L)^{1/2}f\big)^2 d\mu.$$
The relationship of the spectral decomposition Theorem A.4.2 with the spectrum
σ (A) of A as described in Definition A.4.1 is the following. For a positive real
number λ, λ ∈ σ (A) if and only if, for any ε > 0, the dimension of Hλ+ε is strictly
larger than the dimension of H_{λ−ε}. Furthermore, λ belongs to the point spectrum if
and only if the closed space H_{λ−} spanned by ∪_{λ′<λ} Hλ′ is such that H_{λ−} ≠ Hλ. In
that case, any point x ∈ Hλ which is orthogonal to Hλ− is an eigenvector of A, with
eigenvalue λ.
The spectrum σ (A) of a positive self-adjoint operator A may also be decomposed
in another way, much more useful in practice.
Indeed, since the resolvent set is an open subset in C, λ Id −A∗ is still injective
for some imaginary complex number close to λ with positive or negative imaginary
part. Therefore, both K+ and K− are reduced to 0 and the conclusion follows from
Theorem A.5.2. In particular, in order to ensure self-adjointness, it is enough to
exhibit some real number λ for which the equation λx = A∗ x admits no non-zero
solutions (in other words, if x is such that λ⟨y, x⟩ = ⟨Ay, x⟩ and |⟨y, x⟩| ≤ C‖y‖
for all y ∈ D0, then x must be 0).
We conclude this section with a classical illustration. A lot of attention has been
paid over the last century to the class of so-called Schrödinger operators on Rn , that
is, operators of the form Af = −Δf + Vf, where Δ is the standard Laplacian and
V is some L^1_loc-function on Rn. The main results on this family of operators are
summarized in the next statement.
The statement may be further completed with a description of the bottom of the
essential spectrum. That is, as soon as V ∈ L^p_loc with p > n/2 and
then
$$\inf \sigma_{\mathrm{ess}}(A) = \sup_{K\subset\mathbb{R}^n}\ \inf_{f\in C_c^\infty(K^c),\, f\neq 0} \frac{\int_{\mathbb{R}^n} f\,Af\,dx}{\int_{\mathbb{R}^n} f^2\,dx}$$
where K ranges among all compact subsets of Rn and where C_c^∞(K^c) denotes the
class of smooth functions compactly supported in K^c = Rn \ K. Operators A on
Rn sharing this property are called Persson operators in the literature and aspects
of their analysis are discussed in Sect. 4.10.3, p. 227. In particular, for such V , the
spectrum is discrete as soon as V goes to ∞ at infinity.
Theorem A.6.1 For a compact symmetric operator on a Hilbert space, the spec-
trum consists of a sequence of eigenvalues (λk )k∈N , the only limit point possibly not
in the discrete spectrum being 0. The non-zero eigenvalues form a (possibly finite)
sequence converging to 0 at infinity, have finite-dimensional eigenspaces, and there
is an orthonormal basis of eigenvectors. Conversely, an operator with a sequence of
eigenvalues converging to 0 is compact.
Hilbert-Schmidt operators are thus operators for which the sequence (λk)k∈N of
eigenvalues satisfies Σ_{k∈N} λk² < ∞. When the Hilbert space is L2(μ) over a measure
space (E, F, μ), a Hilbert-Schmidt operator K may be represented by a kernel
k(x, y) ∈ L2(μ ⊗ μ) such that
$$Kf(x) = \int_E k(x,y)\,f(y)\,d\mu(y), \quad x\in E,$$
where C = sup_{m∈N} ‖fm‖_2 < ∞. Since k(x, y) ∈ L2(μ ⊗ μ), it follows that K f_{m_p}
does indeed converge in L2(μ). By Theorem A.6.1, there exists an orthonormal
basis of eigenvectors of L2(μ) with corresponding sequence of eigenvalues (λℓ)ℓ∈N.
Using standard Hilbertian tools, it is easily seen that
$$\sum_{\ell\in\mathbb{N}} \lambda_\ell^2 = \int_E\int_E k^2(x,y)\,d\mu(x)\,d\mu(y)$$
Theorem A.6.4 For a symmetric semigroup P = (Pt )t≥0 with (unbounded) in-
finitesimal generator L such that −L is positive with domain D(L), the following
are equivalent:
(i) σess (−L) = ∅.
(ii) For all, or some, t > 0, Pt is compact.
(iii) For all, or some, λ > 0, the resolvent (λ Id − L)^{−1} is compact.
Stochastic calculus is a wide topic. For the reader not familiar with advanced prob-
ability theory, we briefly present in this appendix some basic notions adapted to the
calculus related to Brownian motion. This short exposition surveys stochastic inte-
grals with respect to Brownian motion, Itô’s formula, stochastic differential equa-
tions and diffusion processes. In particular, these elements are aimed at a better
understanding of the links with diffusion operators, semigroups and heat kernels as
informally described in Chap. 1. The Notes and References include pointers to the
literature and more complete accounts.
for some stochastic processes Hs (ω) provided they “depend only on the past”.
To understand this notion, it is necessary to introduce the so-called filtration
Ft , t ≥ 0, of Brownian motion consisting of the sub-σ -fields of Σ defined as
Ft = σ (Bs ; 0 ≤ s ≤ t) with, by convention, F∞ = σ (Bs ; s ≥ 0). In other words,
Ft is the smallest σ -field for which all the random variables Bs for s ≤ t are mea-
surable, and thus contains all information up to time t (as far as observations on
the process (Bs )s≥0 are concerned). The collection (Ft )t≥0 defines an increasing
family of σ -fields and, for technical reasons, it is usually enlarged so as to contain all
the elements of F∞ which have 0 probability. Sometimes, the σ -field Ft for every
t is also enlarged in order to include non-constant variables in F0 , which are in gen-
eral independent of F∞ , and this is systematically the case when solving stochastic
differential equations with non-constant initial data.
Ms = E(Mt | Fs ).
(for every t ≥ 0). Given such a process H = (Hs )s≥0 , the stochastic integrals
$$X_t = \int_0^t H_s\,dB_s, \quad t\ge 0, \tag{B.1.1}$$
define a new stochastic process, again with continuous paths (even if H itself is not
continuous). Although the integral Xt does not make sense for any particular choice
of ω since the map t → Bt (ω) has no bounded variation on any interval, the usual
way to define Xt is to consider it as a suitable limit of Riemann sums. Indeed, start
with Hs (ω) piecewise constant in the sense that
$$H_s(\omega) = \sum_{i=0}^{p-1} K_i(\omega)\,\mathbf{1}_{(t_i,t_{i+1}]}(s)$$
for some finite sequence 0 = t0 ≤ t1 < · · · < tp and random variables K0 , . . . , Kp−1 .
In order for this process to be adapted, assume that Ki is Fti -measurable for every
i, and of course in L2 (P). In this case, one directly defines
$$\int_0^\infty H_s\,dB_s = \sum_{i=0}^{p-1} K_i\,(B_{t_{i+1}} - B_{t_i})$$
and
$$X_t = \int_0^\infty H_s\,\mathbf{1}_{[0,t]}(s)\,dB_s, \quad t\ge 0.$$
It is easily checked that the process t → Xt is a martingale and that
$$\mathbb{E}\big(X_t^2\big) = \mathbb{E}\Big(\int_0^t H_s^2\,ds\Big)$$
for every t ≥ 0. This identity thus defines an isometry between the piecewise con-
stant processes H and L2 (P) random variables, which may then be extended by
L2 (P)-continuity to general (continuous) processes H .
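The isometry can be observed numerically on a piecewise constant integrand. The following sketch is a Monte Carlo illustration under assumptions not taken from the text: the integrand is Ki = B_{ti} on a uniform grid of [0, 1], the sample size and the seed are arbitrary, and the function name is hypothetical. For this choice both sides of the identity equal Σ ti (ti+1 − ti) = (p − 1)/(2p).

```python
import random


def ito_isometry_check(n_paths=20000, p=20, t_final=1.0, seed=0):
    """Monte Carlo estimates of E(X_t^2) and E(int_0^t H_s^2 ds) for H = B at left endpoints."""
    rng = random.Random(seed)
    dt = t_final / p
    sd = dt ** 0.5
    second_moment = 0.0  # accumulates X_t^2 over the sampled paths
    energy = 0.0         # accumulates int_0^t H_s^2 ds over the sampled paths
    for _ in range(n_paths):
        b = 0.0   # current value of the Brownian path
        x = 0.0   # stochastic integral sum_i K_i (B_{t_{i+1}} - B_{t_i})
        h2 = 0.0  # Riemann sum of H_s^2 ds
        for _ in range(p):
            db = rng.gauss(0.0, sd)
            x += b * db       # K_i = B_{t_i} is known at time t_i (adapted)
            h2 += b * b * dt
            b += db
        second_moment += x * x
        energy += h2
    return second_moment / n_paths, energy / n_paths
```

With the default p = 20, both returned values should be close to 19/40 = 0.475, up to Monte Carlo error. Evaluating the integrand at the left endpoint of each interval is what makes the process adapted; using the right endpoint instead would break the identity.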
Via this construction, the resulting process (Xt )t≥0 is then a continuous square
integrable martingale (with respect to the filtration (Ft )t≥0 ). In general X0 = 0. It is
however convenient to add some F0 -measurable random variable X0 and to consider
the process
$$X_t = X_0 + \int_0^t H_s\,dB_s, \quad t\ge 0,$$
with initial value X0 . These stochastic integrals are in general as irregular (in the
time variable) as the original Brownian motion (Bt )t≥0 itself.
A simple special case is obtained with the choice of non-random functions
Hs = hs , s ≥ 0. In this case, the random variable X(h) = ∫_0^∞ hs dBs , known as
a Wiener integral, is a Gaussian variable with mean 0 and variance ∫_0^∞ hs² ds (provided
it is finite). The covariance of two such variables X(h) and X(k) is given
similarly by
$$\mathbb{E}\big(X(h)X(k)\big) = \int_0^\infty h_s k_s\,ds.$$
This identity thus yields an embedding of L2 ([0, ∞), dt) into the L2 (P)-space of
random variables defined on (Ω, Σ, P) giving rise to the so-called reproducing kernel
Hilbert space or Cameron-Martin Hilbert space associated to Brownian motion.
More on the Cameron-Martin Hilbert space in the context of this book may be found
in Sect. 2.7.2, p. 108.
In general, it is not enough to construct stochastic integrals only for square inte-
grable processes H and it is required to “localize” this procedure. This localization
procedure is performed with the notion of stopping time.
Definition B.1.2 (Stopping time) A stopping time, with respect to the filtration
(Ft )t≥0 , is a random variable T : Ω → [0, ∞] such that, for any t ≥ 0, the set
{T ≤ t} = {ω ∈ Ω ; T (ω) ≤ t} is Ft -measurable.
It is important to allow stopping times to take the value +∞. A typical stopping
time is
$$T(\omega) = \inf\{t \ge 0\,;\, X_t(\omega)\in A\}, \quad \omega\in\Omega,$$
where (Xt )t≥0 is a continuous process (with values in R or Rn , for example) adapted
to the filtration (Ft )t≥0 and A any Borel set in R or Rn (with the usual convention
that inf{∅} = +∞).
Associated with a stopping time T , one introduces the σ -field FT of events which
take place before T , defined as
$$\mathcal{F}_T = \big\{A \in \mathcal{F}_\infty\,;\, A\cap\{T\le t\}\in\mathcal{F}_t \text{ for every } t\ge 0\big\}.$$
Clearly T is FT -measurable.
When (Xt )t≥0 is an adapted stochastic process (with respect to (Ft )t≥0 ), this
process may be stopped at any stopping time T (for (Ft )t≥0 ) by setting
$$X_{t\wedge T}(\omega) = X(\omega)_{\min(t,T(\omega))}, \quad t\ge 0,\ \omega\in\Omega.$$
Set X_T = X(ω)_{T(ω)}, being careful with this expression when T (ω) = ∞ (in general
assume that some X∞ random variable is given). This procedure thus produces
a new adapted stochastic process (Xt∧T )t≥0 , and it is a basic result, known
as Doob’s stopping time Theorem, that if (Xt )t≥0 is a continuous martingale, so
is the new process (Xt∧T )t≥0 (relative to the underlying filtration (Ft )t≥0 ). More-
over, when Xt = E(X∞ | Ft ), t ≥ 0, for some integrable random variable X∞ , then
XT = E(X∞ | FT ).
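A discrete-time analogue makes the stopping theorem concrete. The sketch below is not the continuous-time setting of the text: it assumes a simple symmetric random walk (a discrete martingale) stopped when it first reaches one of two barriers, and computes the exact law of the stopped walk with rational arithmetic. The barrier values and the function names are hypothetical choices for the illustration.

```python
from fractions import Fraction


def stopped_walk_law(n_steps, lower=-2, upper=5):
    """Exact law of S_{n ∧ T}, where T is the first time the walk hits lower or upper."""
    law = {0: Fraction(1)}  # the walk starts at 0 with probability 1
    for _ in range(n_steps):
        new_law = {}
        for pos, prob in law.items():
            if pos <= lower or pos >= upper:
                # Already stopped: the position stays frozen.
                new_law[pos] = new_law.get(pos, Fraction(0)) + prob
            else:
                # Still running: move up or down with probability 1/2 each.
                for step in (-1, 1):
                    nxt = pos + step
                    new_law[nxt] = new_law.get(nxt, Fraction(0)) + prob / 2
        law = new_law
    return law


def stopped_mean(n_steps, lower=-2, upper=5):
    """Expectation of the stopped walk S_{n ∧ T}."""
    law = stopped_walk_law(n_steps, lower, upper)
    return sum(pos * prob for pos, prob in law.items())
```

Although the barriers are not symmetric, the stopped walk keeps expectation 0 at every time n, as the stopped process is again a martingale.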
Local martingales are then defined as adapted processes (Xt )t≥0 for which there
exists an increasing sequence (Tk )k∈N of stopping times (relative to (Ft )t≥0 ) which
converges to ∞ and for which the processes (Xt∧Tk )t≥0 , k ∈ N, are martingales.
Stochastic integrals of the type (B.1.1)
$$\int_0^t H_s\,dX_s, \quad t\ge 0,$$
with respect to such a local martingale (Xt )t≥0 are then defined for (continuous)
adapted processes (Hs )s≥0 provided that
$$\mathbb{E}\Big(\int_0^{T_k} H_s^2\,ds\Big) < \infty$$
for any k. The resulting stochastic integral is then again a local martingale. This
construction yields much more freedom to define stochastic integrals (and later so-
lutions of stochastic differential equations) although one can no longer be sure that
a local martingale (Xt )t≥0 has a constant expectation. To reach this conclusion usu-
ally requires further integrability properties, such as supt≥0 E(|Xt |p ) < ∞ for some
p > 1.
f (Xt ) = f (X0 ) + Mt + At , t ≥ 0,
where
$$M_t = \int_0^t f'(X_s)H_s\,dB_s \quad\text{and}\quad A_t = \frac12\int_0^t f''(X_s)H_s^2\,ds.$$
In particular, (Mt )t≥0 is a local martingale and (At )t≥0 has locally bounded varia-
tions.
In other terms, using the differential notation dXt = Ht dBt , and introducing the
“bracket”
$$d\langle X, X\rangle_t = H_t^2\,dt,$$
Itô’s formula indicates that for a C 2 function f : R → R,
$$df(X_t) = f'(X_t)\,dX_t + \frac12 f''(X_t)\,d\langle X, X\rangle_t.$$
The quadratic notation d⟨X, X⟩ is natural under the interpretation
when the mesh of the subdivision goes to 0. It is easily seen that such a quantity is
0 as soon as (Xt )t≥0 is a continuous bounded variation process (and therefore this
bracket extracts the purely martingale part of the process (Xt )t≥0 in the same way
as the carré du champ operator of Definition 1.4.2, p. 20, extracts the second order
part of a differential operator L).
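For Brownian motion itself (Ht = 1) and f(x) = x², Itô's formula reads Bt² = 2 ∫_0^t Bs dBs + t. The sketch below checks this along a single simulated path: the sum of squared increments plays the role of the bracket and should be close to t. The step count, the seed and the function name are assumptions made for the illustration.

```python
import random


def ito_square_check(n_steps=100000, t_final=1.0, seed=1):
    """Return B_t^2, the sum for 2 * int B dB, and the sum of squared increments."""
    rng = random.Random(seed)
    dt = t_final / n_steps
    sd = dt ** 0.5
    b = 0.0
    martingale_part = 0.0  # left-endpoint sum approximating int_0^t 2 B_s dB_s
    bracket = 0.0          # sum of squared increments, approximating <B, B>_t
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)
        martingale_part += 2.0 * b * db
        bracket += db * db
        b += db
    return b * b, martingale_part, bracket
```

On the discretized path the identity B² = (martingale part) + (bracket) holds exactly up to rounding, since (b + db)² − b² = 2 b db + db². The content of Itô's formula is that the bracket term does not vanish as the mesh goes to 0 but converges to t.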
The change of variables formula expressed by Itô’s formula is valid for the larger
class of so-called semi-martingales. A process (Xt )t≥0 is called a semi-martingale
(with respect to a given filtration (Ft )t≥0 ) if it may be decomposed into a sum
Xt = X0 + Mt + At , t ≥ 0,
where (Mt )t≥0 is a local martingale and (At )t≥0 has locally bounded variations. For
example, as illustrated by Itô’s formula, a process (Xt )t≥0 given by
$$X_t = X_0 + \int_0^t H_s\,dB_s + \int_0^t K_s\,ds, \quad t\ge 0, \tag{B.2.1}$$
for suitable stochastic processes (Hs )s≥0 and (Ks )s≥0 such that ∫_0^t Hs dBs , t ≥ 0, is
a local martingale and ∫_0^t Ks ds, t ≥ 0, has bounded variations, is a semi-martingale.
Replacing in Itô’s formula dXt = Ht dBt by dXt = dMt + dAt shows that the class
of semi-martingales is stable under the action of C 2 functions.
One may also consider stochastic integrals driven by semi-martingales, for ex-
ample in the framework of example (B.2.1), set, for an adapted process (Rs )s≥0 ,
$$(R\cdot X)_t = \int_0^t R_s\,dX_s = \int_0^t R_sH_s\,dB_s + \int_0^t R_sK_s\,ds, \quad t\ge 0.$$
$$d(R\cdot X)_t = R_t\,dX_t, \qquad d\langle R\cdot X, R\cdot X\rangle_t = R_t^2\,d\langle X,X\rangle_t = R_t^2H_t^2\,dt.$$
The Itô chain rule formula is thus quite different from the formula for differentiable
processes for which the second order term vanishes. The full strength of stochastic
calculus relies on this specific feature, which deeply connects stochastic integration
to second order differential operators.
When dealing with multivariate processes, introduce (by polarization) for a pair
(Xt )t≥0 , (Yt )t≥0 of semi-martingales, the bounded variation process
$$d\langle X,Y\rangle_t = \frac12\Big[d\langle X+Y,X+Y\rangle_t - d\langle X,X\rangle_t - d\langle Y,Y\rangle_t\Big], \quad t\ge 0.$$
The multi-variable form of Itô’s chain rule formula for a vector Xt = (X_t^1 , . . . , X_t^k ),
t ≥ 0, of semi-martingales and f : Rk → R a C 2 function is then
$$df(X_t) = \sum_{i=1}^k \partial_i f(X_t)\,dX_t^i + \frac12\sum_{i,j=1}^k \partial^2_{ij} f(X_t)\,d\langle X^i, X^j\rangle_t.$$
This formula is of course closely related to the change of variables formula for
diffusion processes (1.11.4), p. 43, in Chap. 1.
$$dX_t^j = \sum_{i=1}^p \sigma_i^j(X_t)\,dB_t^i + b^j(X_t)\,dt, \quad X_0^j = x^j, \quad 1\le j\le n.$$
Here σ = σ (x) = (σ_i^j(x))_{1≤i≤p, 1≤j≤n} is an n × p matrix depending on x ∈ Rn and
b = b(x) = (b^j(x))_{1≤j≤n} is a vector depending on x ∈ Rn satisfying suitable growth
and smoothness assumptions (see Theorem B.3.1 below). The preceding equation is
then summarized more simply in the form
dXt = σ (Xt )dBt + b(Xt )dt, X0 = x, (B.3.1)
and the solution is the vector-valued process denoted by {X_t^x ; t ≥ 0, x ∈ Rn } to describe
dependence on the initial condition x (there should be no confusion with the
j -th coordinate of Xt ).
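Equation (B.3.1) can be approximated on a time grid by replacing dBt with Gaussian increments, a scheme usually called Euler-Maruyama, which is not discussed in the text and is used here only as an illustration. The sketch below treats the scalar case n = p = 1. The test equation dXt = √2 dBt − Xt dt (an Ornstein-Uhlenbeck type equation, for which E(Xt) = x e^{−t}), the parameters and the function names are all assumptions.

```python
import math
import random


def euler_maruyama(x0, sigma, b, t_final=1.0, n_steps=100, rng=None):
    """Approximate X_t for dX = sigma(X) dB + b(X) dt, X_0 = x0 (scalar case)."""
    rng = rng or random.Random()
    dt = t_final / n_steps
    sd = math.sqrt(dt)
    x = x0
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)  # Brownian increment over one time step
        x = x + sigma(x) * db + b(x) * dt
    return x


def ou_sample_mean(x0=2.0, n_paths=4000, seed=2):
    """Monte Carlo mean of X_1 for the assumed equation dX = sqrt(2) dB - X dt."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += euler_maruyama(x0, lambda x: math.sqrt(2.0), lambda x: -x, rng=rng)
    return total / n_paths
```

For x0 = 2 the sample mean should be near 2e^{−1} ≈ 0.736, up to Monte Carlo and discretization error. The scheme gives approximations of the solution on the grid, not the path-wise solution itself.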
There are many ways in which existence and uniqueness of such stochastic differ-
ential equations may be discussed. The existence may be thought of as the existence
of a Brownian motion (Bt )t≥0 on some probability space (Ω, Σ, P) such that this
holds. The uniqueness may also be regarded in terms of the uniqueness of the laws of
the processes {Xtx ; t ≥ 0, x ∈ Rn }. The discussion here is restricted to the simplest
notions. Say that {X_t^x ; t ≥ 0, x ∈ Rn } is a solution of (B.3.1) if it satisfies the equation
path-wise (that is for almost every ω ∈ Ω in the probability space on which the
Brownian motion is defined). Moreover, say that the solution is (path-wise) unique
if for any initial value x, two solutions X and X′ of (B.3.1) are such that, outside a
set of probability 0, the maps t → Xt and t → X′t coincide. In this context, the main
conclusions may be described by the following statement. There, | · | denotes any
norm on points, matrices and vectors which is compatible with the usual topology.
The link between stochastic differential equations and diffusion processes is one of
the main features of this monograph. Solutions of stochastic differential equations
with smooth coefficients define stochastic processes driven by second order differ-
ential operators. Solutions of the associated heat equation give rise to semigroups of
operators and heat kernels.
Let {Xtx ; t ≥ 0, x ∈ Rn } be a solution of the stochastic differential equa-
tion (B.3.1) with coefficients σ and b and driven by a p-dimensional Brownian
motion (Bt )t≥0 . For every smooth function f : Rn → R, by Itô’s formula,
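$$df(X_t) = \sum_{i=1}^p\sum_{j=1}^n \sigma_i^j(X_t)\,\partial_j f(X_t)\,dB_t^i + Lf(X_t)\,dt, \quad X_0 = x,$$
where
$$Lf = \frac12\sum_{i,j=1}^n\sum_{k=1}^p \sigma_k^i\sigma_k^j\,\partial^2_{ij} f + \sum_{i=1}^n b^i\,\partial_i f.$$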
Now, the first term in the expression of df (Xt ) gives rise to a local martingale
(M_t^f)_{t≥0} so that
$$f\big(X_t^x\big) = M_t^f + \int_0^t Lf\big(X_s^x\big)\,ds, \quad t\ge 0,\ x\in\mathbb{R}^n,$$
it holds that
$$P_t f(x) = f(x) + \int_0^t P_s Lf(x)\,ds, \quad t\ge 0,\ x\in\mathbb{R}^n,$$
$$L = \frac12\sum_{j=1}^d Z_j^2 + Z_0$$
where, for j = 0, 1, . . . , d, Zj is a vector field (in other words, the vector Zj (x) =
(Z_j^1(x), . . . , Z_j^n(x)) is identified with the first order differential operator Zj f =
Σ_{i=1}^n Z_j^i ∂i f ). Indeed, in this case, as soon as
$$dX_t = \sum_{j=1}^d Z_j(X_t)\circ dB_t^j + Z_0(X_t)\,dt$$
where (B_t^j)_{t≥0} , j = 1, . . . , d, are independent Brownian motions, the Itô formula
immediately yields that
$$f(X_t) = M_t^f + \int_0^t Lf(X_s)\,ds$$
where (M_t^f)_{t≥0} is the local martingale
$$dM_t^f = \sum_{j=1}^d Z_jf(X_t)\,dB_t^j.$$
The process (Xt )t≥0 thus similarly solves the martingale problem with respect to
the operator L.
When dealing with stochastic processes and stochastic differential equations on
a smooth manifold, one usually uses the localization procedure. In some open set
O on which there exists a local chart, one solves the stochastic differential equation
up to the stopping time T = inf{t ≥ 0 ; Xt ∉ O} and then changes the local chart.
The change of variable laws according to Itô’s formula assert that the result is in-
dependent of the choice of coordinate system. Then, the strong Markov property
at time T is used to start the process again in the new chart with initial value XT .
In such a way, the diffusion process is well defined as long as it does not reach
the boundary of the manifold (or infinity), and then specific arguments are required
to analyze whether or not the process reaches this boundary in finite time.
(However, the global growth condition of Theorem B.3.1 provides a useful criterion
towards non-explosion.) When dealing with Laplace operators on compact or com-
plete Riemannian manifolds, there exist some more global procedures (embedding
the manifold in some Euclidean space with larger dimensions, lifting the process to
the frame fiber bundle etc.), each of them having its specific advantages.
As already mentioned, solutions of stochastic differential equations are very ir-
regular in the time variable t (as irregular as the Brownian paths). However, they
depend smoothly on the initial condition x. In fact, solving for example the stochas-
tic differential equation
$$dX_t = \sum_{j=1}^d Z_j(X_t)\circ dB_t^j + Z_0(X_t)\,dt, \quad X_0 = x,$$
This appendix presents some basic notions of differential geometry which are used
in many parts of this book. Even for the reader only interested in diffusions in Rn
or open sets in Rn , it may sometimes be useful to consider the case of manifolds.
For example, it is much easier to obtain the optimal Sobolev inequality on spheres
than on Euclidean spaces, and then to carry those optimal inequalities from spheres
to Rn by conformal transformations as described in Sect. 6.9, p. 313. It is also true
that the analysis of Markov semigroups on compact Riemannian manifolds without
boundaries (like spheres, toruses and so on) is in many respects much easier than the
analysis of the same objects on Rn or open sets of Rn . (This is somewhat similar to
the difference in the analysis of Markov chains on a finite set or on an infinite set.)
Unfortunately, on manifolds there is no way to locate a point through a unique chart,
thus explaining why it is necessary to develop an (apparently) somewhat heavy ma-
chinery. Finally, there are some computation rules which have a natural interpreta-
tion in terms of differential geometry, even if only the case of Rn is considered (such
as connections, curvature tensors, Γ2 operators etc.).
This appendix thus briefly presents the language of differentiable and Rieman-
nian manifolds, and some of the classical rules. It also introduces the notion of
Ricci curvature and its connection with the curvature-dimension conditions exten-
sively used throughout this work. It should be mentioned in particular that the last
two sections outline arguments and results on the Γ2 operator and the reinforced
curvature-dimension condition, which are central to this book. As for the previous
appendices, this appendix is far from a complete and formal exposition of differen-
tiable and Riemannian manifolds and only emphasizes some basic elements neces-
sary for the further developments. The reader is referred to standard textbooks on the
subject for complete details, some of which are listed in the Notes and References.
In several places in this appendix, as well as in the corresponding parts of this
book, index conventions are used according to the covariant or contravariant rules of
differential geometry. (In particular, coordinates of vectors are indicated with upper
indices. For simplicity, in the core of the book, lower indices are usually used.) The
Kronecker symbol δ_i^j is equal to 1 if i = j , and 0 otherwise. The Einstein notation
summarizes a summation over an index when it appears twice, one time up, one time
down (usually from 1 to the dimension of the underlying structure). For example,
Z^i e_i is a shorthand for Σ_{i=1}^n Z^i e_i . In the same spirit
$$g^{ij} e_i e_j = \sum_{i,j=1}^n g^{ij} e_i e_j.$$
in the second one. Therefore, this correspondence defines a map ψ from the in-
tersection of the unit ball with {y 1 > 0} into the intersection of the unit ball with
{y n > 0} given by
$$\psi\big(y^1,\dots,y^n\big) = \Big(y^2,\dots,y^n,\ \sqrt{1-(y^1)^2-\cdots-(y^n)^2}\Big).$$
sets (Oi ), such that each Oi is homeomorphic to some open set in Rn via a home-
omorphism ψi . Every point x ∈ M belongs to at least one Oi , and the changes of
coordinates
$$\psi_j\circ\psi_i^{-1} : \psi_i(O_i\cap O_j) \to \psi_j(O_i\cap O_j)$$
are C k -diffeomorphisms in Rn .
A function from one manifold to another (in particular curves from R into a
manifold) is C k if it is C k in the given charts. In the following and throughout the
monograph, mostly C ∞ connected manifolds are considered (C k manifolds for ev-
ery k) without further mention. “Smooth” is usually understood as C ∞ below (and
throughout the book).
The first objects of importance in this context are tangent vectors. A C 1 curve on a
differentiable manifold M is a C 1 map from R, or an open interval I ⊂ R containing
0, into M. It is given in a local chart by coordinates γ (t) = (x i (t))1≤i≤n , t ∈ I .
The derivative at t = 0 is the tangent vector (Z i )1≤i≤n of the curve at γ (0) = x0 .
Through a change of coordinates given by y = y(x) in a neighborhood of x0 , this
tangent vector will have coordinates (V j )1≤j ≤n at the point y0 = y(x0 ). Then (recall
the Einstein convention),
$$V^j = J_i^j Z^i, \quad 1\le j\le n,$$
where (J_i^j) = (∂y^j/∂x^i)_{x=x0} is the Jacobian matrix at the point x0 .
The set of all tangent vectors to all curves passing through a given point x0 is
an n-dimensional vector space which is called the tangent space of M at x0 and
is denoted Tx0 (M). Its dual vector space (the linear forms on it) is spanned by the
differentials of functions f : M → R. Actually, given such a smooth function f and
a curve $\gamma(t)$ in $M$ with derivative $Z = (Z^i)_{1\le i\le n}$ at $t = 0$, the derivative at $t = 0$
of the function $f(\gamma(t))$ is nothing else than
$$Z^i \frac{\partial f}{\partial x^i} = Z^i \partial_i f.$$
The form (more precisely the 1-form) w = df with coordinates wi = ∂i f ,
1 ≤ i ≤ n, in the local coordinate system x follows a different rule in the change
of variables from x to y = y(x). Namely, in the coordinate system y, df has com-
ponents ηj , 1 ≤ j ≤ n, with
$$\eta_j J_i^j = w_i, \quad 1 \le i \le n,$$
which is precisely the inverse of the change of coordinates rule for vectors.
A vector field Z is for any x ∈ M a tangent vector Z(x) ∈ Tx (M) depending
in general smoothly on the point x, meaning that in a local system of coordi-
nates, its components are smooth functions. To any vector field Z is associated a
first order differential operator f → Zf given in a local system of coordinates by
Zf = Z i ∂i f . Thanks to the change of variables rules for vectors and forms, this
expression is independent of the system of coordinates in which it is considered.
502 C Basic Notions in Differential and Riemannian Geometry
In this monograph, more complicated objects than vectors and 1-forms will be
considered, namely tensors with many indices, some up (like vectors), some down
(like 1-forms). Tensors are required in particular to introduce Riemannian metrics,
which are simply Euclidean structures on the tangent space Tx (M) moving smoothly
with the point x ∈ M. The next section describes these objects first in the classical
Euclidean framework.
This section is devoted to some basic notions concerning tensor calculus in Eu-
clidean spaces. On Rn , or any n-dimensional vector space E, a Euclidean structure
is a strictly positive (positive-definite in the standard terminology) symmetric bilin-
ear form G(Z, Y ) which to any pair (Z, Y ) of vectors in E associates a real number
G(Z, Y ) such that G(Z, Z) ≥ 0 for any Z ∈ E and G(Z, Z) = 0 only for Z = 0.
Given any basis e = (ei )1≤i≤n in E, such a strictly positive symmetric bilinear
form G is represented by the positive-definite symmetric matrix gij = G(ei , ej ),
1 ≤ i, j ≤ n, so that if a vector Z is Z = Z i ei (that is (Z i )1≤i≤n are the coordinates
of Z in the basis e), then
$$G(Z, Z) = Z^i Z^j g_{ij}.$$
Such a non-degenerate bilinear form $G$ provides an isomorphism $\widehat{G}$ between $E$
and its dual space $E^*$ through $\widehat{G}(Z)(Y) = G(Z, Y)$, $Z, Y \in E$. The linear map
$\widehat{G}$ is represented by the matrix $G$ in the dual basis $e^*$. The Euclidean structure $G$
on $E$ may then be transferred to a Euclidean structure $G^*$ on $E^*$ by
$$G^*(V, W) = G\big(\widehat{G}^{-1}(V), \widehat{G}^{-1}(W)\big), \quad (V, W) \in E^* \times E^*.$$
It turns out that if $G$ has matrix $(g_{ij})_{1\le i,j\le n}$ in a basis $e$, the matrix of $G^*$ in the dual
basis $e^*$ is the inverse matrix of $G$, denoted $(g^{ij})_{1\le i,j\le n} = ((g_{ij})_{1\le i,j\le n})^{-1}$. When
$V$ and $W$ have respective coordinates $(V_i)_{1\le i\le n}$ and $(W_i)_{1\le i\le n}$ in the dual basis
$e^*$, then $G^*(V, W) = g^{ij} V_i W_j$ and the correspondence $\widehat{G}(Z) = V$ is such that
$V_i = g_{ij} Z^j$, $Z^i = g^{ij} V_j$, $1 \le i \le n$. This operation of identification between $E$ and
$E^*$ is called the lifting or lowering of indices. Observe also that since $g^{ij} g_{jk} = \delta^i_k$,
the lifting and lowering operators are inverse to each other.
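As a small illustration of lowering and lifting indices, the following sketch works in dimension 2 with exact rational arithmetic. The matrix `G` and the vector `Z` are hypothetical values chosen only for the example; the helper names are not taken from the text.

```python
from fractions import Fraction

# Hypothetical metric g_ij on R^2 (symmetric, positive-definite).
G = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]

def inverse2(m):
    """Inverse of a 2x2 matrix, used here to obtain the co-metric g^{ij}."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def apply(m, v):
    """Contract a 2-index object with a 1-index object: (m v)_i = m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def pair(m, u, v):
    """Bilinear form m(u, v) = m_ij u_i v_j."""
    return sum(m[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

G_inv = inverse2(G)                 # co-metric g^{ij}
Z = [Fraction(1), Fraction(-2)]     # a vector, coordinates Z^i
V = apply(G, Z)                     # lowering: V_i = g_ij Z^j
Z_back = apply(G_inv, V)            # lifting:  Z^i = g^{ij} V_j

print(V, Z_back, pair(G, Z, Z), pair(G_inv, V, V))
```

Lowering and then lifting returns the original vector, and the squared length is the same whether computed with the metric on the vector or with the co-metric on the associated form.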
Multi-linear forms acting both on E and E ∗ may be investigated similarly.
For example, a trilinear map $T(Z, Y, V)$ on $E \times E \times E^*$ would have coordinates
$T_{ij}{}^{k} = T(e_i, e_j, e^{*k})$ such that
$$T(Z, Y, V) = T_{ij}{}^{k}\, Z^i Y^j V_k.$$
Quite often, it is not important to precisely locate the position of the upper and lower
indices. When it is implicit or not necessary to specify, this aspect will simply be
omitted, as is usually the case. Such an object is called a tensor (here a 3-tensor). It
may have indices up or down, depending on whether it acts for some component on
C.2 Some Elementary Euclidean Geometry 503
(The notation | · | is used throughout the monograph to denote the Euclidean norm
of vectors and tensors.) It is easier to use the operation of lifting or lowering indices
to obtain simpler expressions such as $|T|^2 = T_{ijk} T^{ijk}$, or $T_{ij}{}^{k}\, T^{ij}{}_{k}$ (the multiplication
by the matrix $g^{ij}$ or the matrix $g_{ij}$ being hidden in the operation of lifting or
lowering indices).
Recall finally that the Lebesgue measure is in general defined on a vector space
up to some scaling constant (as the unique Radon measure which is invariant
under translation). It is well-defined on any Euclidean space by requiring, for
example, that the cube $C$ constructed on an orthonormal basis, that is
$C = \{t^i e_i \,;\, 0 \le t^i \le 1\}$, has measure 1. This property is independent of the
chosen orthonormal basis (due to the fact that the Lebesgue measure in $\mathbb{R}^n$ is
invariant under orthogonal transformations). In a non-orthonormal basis, with the
metric given by the matrix $G = (g_{ij})_{1\le i,j\le n}$, the Lebesgue measure is given by
$\det(G)^{1/2}\, dx^1 \cdots dx^n$.
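The factor $\det(G)^{1/2}$ can be checked by hand in the plane. The sketch below compares the area of the unit cell of a non-orthonormal basis computed from the Gram matrix with the area computed from the usual cross product. The basis vectors are hypothetical and chosen only for illustration.

```python
import math

def gram_matrix(basis):
    """Matrix g_ij = <e_i, e_j> for basis vectors given in orthonormal coordinates."""
    return [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical non-orthonormal basis of R^2.
e1, e2 = (2.0, 0.0), (1.0, 3.0)
G = gram_matrix([e1, e2])

# Area of the cell {t^1 e_1 + t^2 e_2 ; 0 <= t^i <= 1}, computed two ways.
area_from_metric = math.sqrt(det2(G))                   # det(G)^{1/2}
area_from_cross = abs(e1[0] * e2[1] - e1[1] * e2[0])    # |e_1 x e_2|
print(area_from_metric, area_from_cross)
```

Both numbers agree, which is the two-dimensional instance of the statement that the Lebesgue measure reads $\det(G)^{1/2}\, dx^1 dx^2$ in the coordinates attached to the basis $(e_1, e_2)$.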
In Euclidean spaces, it is thus easier to develop the associated calculus in or-
thonormal bases. But in Riemannian geometry, it is impossible in general to choose
a system of variables for which the tangent spaces, spanned by the vectors ∂i , are
orthonormal everywhere in the neighborhood of any point. This is why the calculus
has to be developed in non-orthonormal bases.
In the framework of this monograph, working with functions rather than points,
the relevant Riemannian object is the co-metric g (thus with upper indices) rather
than the metric G. It is indeed the co-metric which naturally enters into the descrip-
tion of the Markov generators and their carré du champ operators (cf. Sects. 1.10
and 1.11, p. 38 and p. 42.) We thus prefer to write (M, g) for the basic Riemannian
structure.
Due to the dependence on x ∈ M, it is important to be able to follow various
quantities in the tangent spaces, such as metrics and tensors, in different coordi-
nate systems. In a given coordinate system, a tensor T = T (x) is represented by
its coordinates $T^{i_1 \cdots i_\ell}_{j_1 \cdots j_m}(x)$, with as many functions of $x \in M$ as there are
possible values for the multi-indices $i_1 \cdots i_\ell$, $j_1 \cdots j_m$, each of them varying between
1 and the dimension n (this range for the various indices is not always indicated
in similar expressions below). For example, a vector field Z(x) = (Z i (x))1≤i≤n
has one index, and represents a first order differential operator. A matrix field like
G(x) = (gij (x))1≤i,j ≤n has two indices. Recall that tensors with more indices may
be considered similarly and appear naturally under the operation of tensor products
such as $(Z^i g_{k\ell})$ (corresponding to the tensor product $Z \otimes G$ of the vector field $Z$
and the metric G). Here and below, dependence on x ∈ M is often omitted.
C.3 Basic Notions in Riemannian Geometry 505
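$$Z^i \frac{\partial f}{\partial x^i} = Z^i \frac{\partial y^j}{\partial x^i} \frac{\partial f}{\partial y^j} = \hat{Z}^j \frac{\partial f}{\partial y^j}.$$
Similarly, a 2-tensor $T = (T_{ij})_{1\le i,j\le n}$ would be changed into $T_{k\ell}\, \bar{J}^k_i \bar{J}^\ell_j$, $1 \le i, j \le n$.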
A simple reminder is the change of variables formula
$$\frac{\partial f}{\partial y^i} = \frac{\partial f}{\partial x^j} \frac{\partial x^j}{\partial y^i}$$
which amounts to the change of variables formula for tensors with one index down.
Upper indices are called contravariant and lower indices are called covariant.
Since the Riemannian metric G(x) = (gij (x))1≤i,j ≤n is non-degenerate, the in-
verse matrix g(x) = (g ij (x))1≤i,j ≤n corresponds to the metric on the dual of the
space of 1-forms. Recall that, as vector fields correspond to first order differen-
tial operators, a vector field Z in a given coordinate system may be written as
Zf = Z i ∂i f (in short Z = Z i ∂i ) and the operators ∂i , 1 ≤ i ≤ n, form a linear
basis of this space of vector fields (or tangent vectors) in this system of coordinates.
In the dual space, a basis is given by the forms dx i , 1 ≤ i ≤ n, and
$$df = \partial_i f\, dx^i, \qquad \langle Z, df \rangle = Z^i \partial_i f.$$
Therefore, the duality action between a vector field Z and a 1-form df is what was
denoted by Zf . Note that not every 1-form is of the form df . For example, when g
and f are smooth functions, gdf is a 1-form, but may not in general be written as
dh.
Recall that the indices for vectors are up and for forms are down. As in the
Euclidean case, to compute the length of a 1-form $w = (w_i)_{1\le i\le n}$, one uses the
inverse matrix $g = (g^{ij}(x))_{1\le i,j\le n}$
$$|w|^2 = g^{ij} w_i w_j.$$
In particular, this formula defines the length $|df|^2$. As mentioned above, in the context
of this monograph, while working with second order differential operators, we
will be dealing precisely with the co-metric $g$ via the formula
$$\Gamma(f, f) = |df|^2 = g^{ij}\, \partial_i f\, \partial_j f$$
for the carré du champ operator $\Gamma$. This corresponds to a Riemannian metric only
when the underlying operator is elliptic. Some of the formulas defined in Riemannian
geometry however still make sense for degenerate operators, although the computations
are much easier in the language of Riemannian geometry (cf. Sect. C.5
below).
As already discussed, the metric and its inverse may be used to lift or lower
indices. For example, one may lift in this way the form df to yield the gradient
vector denoted by ∇f with coordinates
$$\nabla^i f = g^{ij} \partial_j f, \quad 1 \le i \le n.$$
where the coefficients γijk will satisfy a specific change of variables formula, dif-
ferent from the one for tensors, in order for the resulting operation ∇i Z i to behave
as a tensor. Those coefficients γijk = γi j k , 1 ≤ i, j, k ≤ n, are called the Christoffel
symbols of the connection. With tensors w = (wj )1≤j ≤n with one index down, signs
have to be changed. For example,
$$\nabla_i w_j = \partial_i w_j - \gamma_{ij}^k w_k, \quad 1 \le i, j \le n.$$
This yields in particular the following rule, for Z a vector field and w a 1-form,
meaning in coordinates
$$\partial_i \big(Z^j w_j\big) = \big(\nabla_i Z^j\big)\, w_j + Z^j \big(\nabla_i w_j\big), \quad 1 \le i \le n.$$
For tensors T with many indices, the rule may be applied on any index. For example,
$$\nabla_i T^{jk} = \partial_i T^{jk} + \gamma_{i\ell}^{j}\, T^{\ell k} + \gamma_{i\ell}^{k}\, T^{j\ell}.$$
The rule (changing sign for indices down) is applied as many times as there are
indices in the tensor. In this way,
being especially careful with the position of the indices in the resulting tensor.
The first fundamental theorem in Riemannian geometry is the existence of a par-
ticular connection.
$$\gamma_{ij}^k = \frac{1}{2}\, g^{kp} \big(\partial_i g_{jp} + \partial_j g_{ip} - \partial_p g_{ij}\big), \quad 1 \le i, j, k \le n.$$
The derivative of any quantity according to this connection rule is called the
covariant derivative.
Of course, when the metric is constant (that is, on Euclidean spaces), the con-
nection is just the usual derivative, as long as the system of coordinates is such that
(g ij ) is constant. But even in this case, in some other coordinate system (for exam-
ple polar coordinates in Rn ), the usual derivatives differ from the covariant ones.
The connection rules may be used to compute what corresponds to the usual deriva-
tive under a change of coordinates without coming back to the standard system of
coordinates. The only necessary information is the matrix g(x) = (g ij (x)).
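The remark on polar coordinates can be made concrete with a short numerical sketch. It evaluates the Christoffel formula above for the Euclidean plane written in coordinates $(r, \theta)$, where the metric is $\mathrm{diag}(1, r^2)$. The derivatives are approximated by central finite differences, and all function names are illustrative choices, not notation from the text.

```python
def metric(x):
    """Euclidean metric g_ij of the plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_diag(G):
    """Inverse of a diagonal 2x2 matrix (sufficient for this example)."""
    return [[1.0 / G[0][0], 0.0], [0.0, 1.0 / G[1][1]]]

def d_metric(x, p, h=1e-5):
    """Central finite difference of the matrix g_ij in the coordinate x^p."""
    xp, xm = list(x), list(x)
    xp[p] += h
    xm[p] -= h
    Gp, Gm = metric(xp), metric(xm)
    return [[(Gp[i][j] - Gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x, n=2):
    """gamma[k][i][j] = 1/2 g^{kp} (d_i g_jp + d_j g_ip - d_p g_ij)."""
    g_inv = inverse_diag(metric(x))
    dG = [d_metric(x, p) for p in range(n)]  # dG[p][i][j] approximates d_p g_ij
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                gamma[k][i][j] = 0.5 * sum(
                    g_inv[k][p] * (dG[i][j][p] + dG[j][i][p] - dG[p][i][j])
                    for p in range(n)
                )
    return gamma

gamma = christoffel((2.0, 0.3))
# Index 0 stands for r and index 1 for theta.
print(gamma[0][1][1], gamma[1][0][1])
```

At $r = 2$ the two printed values should be close to $-r$ and $1/r$: although the underlying metric is flat, the connection coefficients do not vanish in this coordinate system.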
There is often no need to explicitly compute the Christoffel symbols. This con-
nection is nevertheless a powerful tool when working with and differentiating ten-
sors. Thanks to the rules of the Riemannian connection, the operation ∇ commutes
with lifting and lowering of indices, and also with the contraction of indices.
The Levi-Civita connection is chosen to be torsion-free, meaning that second
derivatives of functions (0-tensors) are symmetric 2-tensors. This is no longer the
case with derivatives of higher order tensors. For example, for a vector field $Z$,
$\nabla_i \nabla_j Z^k$ may differ from $\nabla_j \nabla_i Z^k$. Similarly, for a smooth function $f$, $\nabla_i \nabla_j \nabla_k f$ is
in general not symmetric with respect to $(i, j, k)$ as is the case in Euclidean spaces.
Actually, the symmetry defect $(\nabla_i \nabla_j - \nabla_j \nabla_i) Z^k$ of a vector field $Z$ gives rise to a
4-tensor $R = (R_{ij}{}^{k}{}_{\ell})_{1\le i,j,k,\ell\le n}$, independent of the vector field $Z$, such that
$$(\nabla_i \nabla_j - \nabla_j \nabla_i) Z^k = R_{ij}{}^{k}{}_{\ell}\, Z^\ell.$$
This tensor R is the Riemann curvature tensor. It may be expressed (in a com-
plicated way) in terms of the metric (gij ) and its first and second derivatives in a
coordinate system.
A fundamental theorem of geometry asserts that the Riemann tensor R entirely
characterizes the metric. In particular, when it vanishes, the metric is Euclidean,
which means that there exists (locally) a change of coordinates in which the metric is
constant. It is also true that when it is constant (strictly) positive or constant (strictly)
negative, the metric is spherical or hyperbolic (provided the notion of constant tensor
is properly described). The Riemann curvature tensor $R$ has a lot of symmetries. In
particular, the tensor $(R_{ijk\ell})$ obtained by lowering the index $k$ is anti-symmetric in
$(i, j)$, anti-symmetric in $(k, \ell)$ and symmetric under the exchange of the pairs $(i, j)$
and $(k, \ell)$. There are many other identities involving the covariant derivatives of $R$,
for example the so-called Bianchi identities, not described here.
Most of the tools and results in this book do not use the full Riemann tensor R,
but a simpler one, known as the Ricci 2-tensor, which is the trace of R defined as
The Ricci tensor is a symmetric tensor. While defined here with lower indices, depending
on the context it may be convenient to use $\mathrm{Ric}^{ij}$ for which we retain the
same notation Ric without risk of confusion. In low dimension $n \le 3$, due to the
many symmetries of the Riemann tensor $R$, it also characterizes the metric. To say
that the Ricci tensor is constant (equal to some constant $\rho$) amounts to saying that
$\mathrm{Ric}_{ij} = \rho\, g_{ij}$. Therefore, in dimension less than or equal to 3, a constant Ricci tensor
indicates that the metric is a metric of a sphere when $\rho > 0$ and a metric of
hyperbolic space when $\rho < 0$ ($\rho = 0$ corresponding to the flat Euclidean space).
The Ricci tensor of $(M, g)$ is usually denoted by $\mathrm{Ric}_g$ to emphasize the underlying
(co-) metric, or more simply Ric (to avoid confusion with indices).
To say that the Ricci tensor is bounded below by ρ ∈ R means that the tensor
Ric −ρ g is positive. In other words, the lowest eigenvalue of the 2-tensor Ric in a
system of coordinates where at a point x, g(x) is the identity, is bounded from below
by ρ. This is usually denoted as Ric ≥ ρ g. In this definition, ρ may be a function,
although in most parts of this book only constant ρ is considered.
A further trace operation on this Ricci tensor produces the scalar curvature, that
is
$$\mathrm{sc}_g = g^{ij}\, \mathrm{Ric}_{ij} = \mathrm{Ric}_i{}^{i}. \tag{C.3.2}$$
The function $\mathrm{sc}_g(x)$, $x \in M$, plays an important role in conformal invariance of
Sobolev inequalities (Chap. 6). In dimension 2, it is also enough to characterize the
metric, as for the Riemann tensor in general and the Ricci tensor in dimension 3.
(In dimension one, there is only one metric after a change of variables, and it is the
usual, Euclidean, flat metric.)
This Ricci tensor appears crucially in the analysis of Markov semigroups through
the Bochner-Lichnerowicz formula. This formula connects the Laplace operator
with the Ricci curvature. In the same way as in $\mathbb{R}^n$ where it is given on a smooth
function $f$ as the trace of the symmetric matrix (Hessian) $(\partial^2_{ij} f)_{1\le i,j\le n}$, the
Laplace-Beltrami operator (or Laplacian) $\Delta_g$ on a Riemannian manifold $(M, g)$
is defined on smooth functions $f : M \to \mathbb{R}$ as
$$\Delta_g f = g^{ij} \nabla_i \nabla_j f = \nabla^i \nabla_i f = \nabla_i \nabla^i f. \tag{C.3.3}$$
This definition is coordinate-free and one would obtain the same operator in any
coordinate system, using the change of variables rules for tensors.
The Laplace-Beltrami operator $\Delta_g$ is presented in Sect. 1.11, p. 42, in a different
way as the differential operator whose carré du champ operator is $\Gamma(f, f) = |df|^2$
and which is invariant with respect to the Riemannian measure. In a local system of
C.4 Riemannian Distance 509
coordinates $(x^i)_{1\le i\le n}$, the Riemannian measure has density $\det(g)^{-1/2}$ with respect
to the Lebesgue measure $dx^1 \cdots dx^n$ where $\det(g)$ is the determinant of the matrix
$g = (g^{ij})$. This is coherent with the representation of the Lebesgue measure in
Euclidean space when the coordinate system is not orthogonal. Then, the Riemannian
measure, often denoted by $\mu_g$ in this monograph, corresponds to the Lebesgue
measure on the tangent space $T_x(M)$ equipped with the Euclidean metric associated
with $g$. These two definitions of the Laplacian $\Delta_g$ are equivalent.
The following theorem introduces the Bochner-Lichnerowicz formula, which inspires
many of the developments in this book. Indeed, it is the basis for the geometric
and functional analysis of curvature developed in the context of the $\Gamma$ and $\Gamma_2$
operators in Sect. C.6 below.
By comparison with the Euclidean case, the length of the curve $c$ between 0 and $t$
may be defined as
$$\int_0^t |\dot{c}(s)|\, ds.$$
Curves of minimal length are called geodesics. In a given system of coordinates, the
point c(t) = (ci (t))1≤i≤n along a geodesic satisfies the differential equation (Euler-
Lagrange equation)
$$\frac{d^2 c^i(t)}{dt^2} + \gamma^i_{jk}\, \frac{dc^j}{dt}\, \frac{dc^k}{dt} = 0, \quad 1 \le i \le n,$$
where $\gamma^i_{jk}$ are the Christoffel symbols. Geodesics are also curves of minimal energy,
minimizing $\int_0^t |\dot{c}(s)|^2\, ds$.
It is not always the case that there is a geodesic going from one point to an-
other. For example, in the Euclidean plane, if one removes the point 0, there is no
where the supremum is taken over all smooth functions f : M → R such that
|∇f | ≤ 1. This is reminiscent of the Rademacher Theorem describing Lipschitz
functions (in Euclidean space) as functions with (almost everywhere) bounded
gradients. Functions which approximate this supremum are in fact smooth approximations
of the distance $d(x, y)$. This dual description is much more convenient
when considering many aspects of functional inequalities. Furthermore, the lat-
ter definition makes sense for any diffusion operator, even for non-elliptic ones
(hypo-ellipticity is however necessary to effectively define a finite distance between
points), and this formulation is the one used throughout this monograph to introduce
a natural pseudo-distance associated with any generator of a diffusion semigroup
(see (3.3.9), p. 166).
Completeness of a Riemannian manifold (M, g) with respect to the Riemannian
distance d may be described similarly according to the following proposition. This
characterization will prove most useful in the analysis of self-adjointness or gradient
bounds as developed in Sect. 3.2.2, p. 141.
C.5 The Riemannian $\Gamma$ and $\Gamma_2$ Operators 511
$$Lf = g^{ij}\, \partial^2_{ij} f + b^i\, \partial_i f$$
$$L = \Delta_g + Z$$
$$L = \Delta_g + \nabla(\log w).$$
Actually, when the operator $L$ is of the form $\Delta_g + Z$, and provided the coefficients
of $Z$ are smooth, whenever it is symmetric with respect to a measure $\mu$, this measure
has a smooth density $w$ with respect to $\mu_g$. Setting $w = e^{-W}$, the vector field $Z$ is
the gradient field $-\nabla W$ and in particular $\nabla Z$ is a symmetric tensor. This necessary
condition on $Z$ in order for $L$ to be symmetric in $L^2(\mu)$ is also sufficient on simply
connected domains. A Riemannian manifold $(M, g)$ equipped with a measure
$d\mu = e^{-W} d\mu_g$ is referred to as a weighted Riemannian manifold.
$$\Gamma_2(f, f) = \frac{1}{2} \big[ L\, \Gamma(f, f) - 2\, \Gamma(f, Lf) \big] \tag{C.5.2}$$
on smooth functions $f : M \to \mathbb{R}$. To ease the notation, we often use $\Gamma(f, f) = \Gamma(f)$
and $\Gamma_2(f, f) = \Gamma_2(f)$. Both $\Gamma$ and $\Gamma_2$ are extended by polarization to define bilinear
forms $\Gamma(f, g)$ and $\Gamma_2(f, g)$ on pairs $(f, g)$ of smooth functions. In a local
system of coordinates,
$$\Gamma(f, g) = g^{ij}\, \partial_i f\, \partial_j g.$$
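To see the definition (C.5.2) at work, here is a minimal sketch on the real line for the operator $Lf = f'' - x f'$ (the Ornstein-Uhlenbeck generator, taken here as an assumed example, with $\Gamma(f, g) = f' g'$). Polynomials are stored as coefficient lists, and the helper names are illustrative. For this operator one expects $\Gamma_2(f) = f''^2 + f'^2$.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (keep at least the constant term)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim([i * a for i, a in enumerate(p)][1:] or [0])

X = [0, 1]  # the polynomial x

def L(f):
    """Assumed generator for this example: Lf = f'' - x f'."""
    return add(deriv(deriv(f)), scale(-1, mul(X, deriv(f))))

def gamma(f, g):
    """Carre du champ of L: Gamma(f, g) = f' g'."""
    return mul(deriv(f), deriv(g))

def gamma2(f):
    """Gamma_2(f) = 1/2 [ L Gamma(f, f) - 2 Gamma(f, Lf) ], as in (C.5.2)."""
    return scale(Fraction(1, 2),
                 add(L(gamma(f, f)), scale(-2, gamma(f, L(f)))))

f = [0, 0, 0, 1]                          # f(x) = x^3
d1, d2 = deriv(f), deriv(deriv(f))
expected = add(mul(d2, d2), mul(d1, d1))  # f''^2 + f'^2
print(gamma2(f), expected)
```

The two coefficient lists coincide. The extra term $f'^2 = \Gamma(f)$ on top of the squared second derivative is the one-dimensional trace of the curvature term discussed in Sect. C.6.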
$$(\nabla_S Z)_{ij} = \frac{1}{2} \big( \nabla_i Z_j + \nabla_j Z_i \big), \quad 1 \le i, j \le n,$$
is the symmetric part of the (non-symmetric) tensor $\nabla Z$. The symmetric tensor
$\mathrm{Ric}_g - \nabla_S Z$ on the right-hand side of (C.5.3) is denoted by
Remark C.5.1 (Hessian) When working with diffusion operators $L$ and their carré
du champ operators $\Gamma$, there is no need to compute the Christoffel symbols for
formulas on functions. For instance, to compute the Hessian of a smooth function $f$
acting on the gradient of smooth functions $g$ and $h$ on $M$,
$$\nabla\nabla^{ij} f\, \partial_i g\, \partial_j h = \frac{1}{2} \big[ \Gamma\big(g, \Gamma(f, h)\big) + \Gamma\big(h, \Gamma(f, g)\big) - \Gamma\big(f, \Gamma(g, h)\big) \big]. \tag{C.5.5}$$
This formula for Hessians is widely used in the construction of $\Gamma_2$ operators on the
domain of $L$ (in particular in Sect. 3.3, p. 151). The right-hand side makes sense
even in non-elliptic settings, when the (co-) metric is degenerate.
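As a sanity check of (C.5.5), the identity can be verified by hand in the flat case. The following computation assumes $M = \mathbb{R}^n$ with the identity co-metric, so that $\Gamma(u, v) = \partial_i u\, \partial_i v$ (summation over repeated indices) and covariant derivatives are ordinary partial derivatives.

```latex
\begin{aligned}
\Gamma\big(g, \Gamma(f,h)\big) &= \partial_i g \,\big( \partial^2_{ij} f \, \partial_j h + \partial_j f \, \partial^2_{ij} h \big), \\
\Gamma\big(h, \Gamma(f,g)\big) &= \partial_i h \,\big( \partial^2_{ij} f \, \partial_j g + \partial_j f \, \partial^2_{ij} g \big), \\
\Gamma\big(f, \Gamma(g,h)\big) &= \partial_i f \,\big( \partial^2_{ij} g \, \partial_j h + \partial_j g \, \partial^2_{ij} h \big).
\end{aligned}
```

Adding the first two lines and subtracting the third, the terms containing second derivatives of $g$ or $h$ cancel by the symmetry of $\partial^2_{ij}$, and what remains is $2\, \partial^2_{ij} f\, \partial_i g\, \partial_j h$, in agreement with the factor $\frac{1}{2}$ in (C.5.5).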
C.6 Curvature-Dimension Conditions 513
$$\Gamma_2(f) = |\nabla\nabla f|^2 + \mathrm{Ric}(\nabla f, \nabla f).$$
In this form, the second and first order terms are well separated, and it is easily seen
that
$$\Gamma_2(f) \ge \rho\, \Gamma(f) = \rho\, |\nabla f|^2 \tag{C.6.1}$$
for all $f$'s if and only if
$$\mathrm{Ric}(\nabla f, \nabla f) \ge \rho\, |\nabla f|^2$$
for all $f$'s. In other words, (C.6.1) holds if and only if the Ricci tensor at every point
is bounded from below by $\rho$.
Here $\rho$ could be a function, satisfying, for any smooth function $f$, $\mathrm{Ric}(\nabla f, \nabla f)
\ge \rho(x)\, \Gamma(f, f)$, which we denote by $\mathrm{Ric} \ge \rho\, g$. The best $\rho(x)$ may be seen as the
lowest eigenvalue of the matrix $\mathrm{Ric}_{ij}$. The inequality (C.6.1), equivalent to the lower
bound $\mathrm{Ric} \ge \rho\, g$ on the Ricci curvature, is called the curvature condition $CD(\rho, \infty)$
throughout this book, and $\rho$ will usually be a (real) constant.
However, there is a second parameter buried in the $\Gamma_2$ operator. Namely, recall
that, on a given function $f$,
$$\Delta_g f = \nabla^i \nabla_i f = \mathrm{Tr}(\nabla\nabla f)$$
$$\big(\mathrm{Tr}(N)\big)^2 \le n \sum_{i=1}^n N_{ii}^2 \le n\, |N|^2. \tag{C.6.2}$$
Note that this inequality may actually be seen in a more intrinsic way. Indeed, in
the Euclidean space of symmetric matrices $N$ endowed with the Hilbert-Schmidt
norm $|N|^2$, $\mathrm{Tr}(N)$ is the scalar product with the matrix $\mathrm{Id}$ (or equivalently with the
metric $g$). This identity has norm $|\mathrm{Id}|^2 = n$, and the inequality $(\mathrm{Tr}(N))^2 \le n |N|^2$ is
nothing else than the Cauchy-Schwarz inequality in this Euclidean space.
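The inequality $(\mathrm{Tr}(N))^2 \le n |N|^2$ is easy to probe numerically. The sketch below draws random symmetric matrices in an orthonormal frame (so that the metric is the identity, an assumption made for simplicity) and records the gap $n |N|^2 - (\mathrm{Tr}(N))^2$. The seed, the dimension and the sample size are arbitrary choices.

```python
import random

def trace_and_hs_norm_sq(N):
    """Return (Tr N, |N|^2), with |N|^2 the sum of the squared entries."""
    n = len(N)
    tr = sum(N[i][i] for i in range(n))
    hs = sum(N[i][j] ** 2 for i in range(n) for j in range(n))
    return tr, hs

def random_symmetric(n, rng):
    """Symmetrize a matrix with independent uniform entries in [-1, 1]."""
    A = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]

rng = random.Random(0)
n = 4
gaps = []
for _ in range(1000):
    tr, hs = trace_and_hs_norm_sq(random_symmetric(n, rng))
    gaps.append(n * hs - tr ** 2)   # non-negative by Cauchy-Schwarz

# Equality case: N proportional to the identity.
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
tr_id, hs_id = trace_and_hs_norm_sq(identity)
print(min(gaps), tr_id ** 2, n * hs_id)
```

The gap is never negative, and it vanishes for the identity matrix, which is the equality case of Cauchy-Schwarz. In the geometric setting this corresponds to Hessians proportional to the metric.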
On the basis of (C.6.2), the $\Gamma_2$ operator of the Laplace-Beltrami operator $\Delta_g$ is
bounded from below, on every smooth function $f : M \to \mathbb{R}$, by
$$\Gamma_2(f) \ge \rho\, \Gamma(f) + \frac{1}{n}\, (\Delta_g f)^2,$$
where $n$ is the dimension of the manifold and $\rho$ the lowest eigenvalue of the
Ricci tensor. This inequality is called a curvature-dimension condition or condition
$CD(\rho, n)$ throughout this book. Note that in this example, the best possible choice
for $(\rho, n)$ in the $CD(\rho, n)$ condition is the lower bound $\rho$ on the Ricci curvature
and the dimension $n$ of the manifold.
The preceding curvature-dimension conditions may be addressed similarly for
general operators of the form $L = \Delta_g + Z$ where $Z$ is a vector field on an
$n$-dimensional manifold $(M, g)$. From (C.5.3),
$$\Gamma_2(f) \ge \rho\, \Gamma(f) + \frac{1}{m}\, (Lf)^2$$
for every smooth $f : M \to \mathbb{R}$ if and only if $m \ge n$ and, as symmetric tensors,
$$\mathrm{Ric}(L) = \mathrm{Ric}_g - \nabla_S Z \ge \rho\, g + \frac{1}{m - n}\, Z \otimes Z. \tag{C.6.3}$$
Here, there is no longer a best choice for both $m$ and $\rho$, except in particular
cases. In particular, the parameter $m$ may be equal to the dimension of the manifold
only for Laplace-Beltrami operators. In this sense, among elliptic operators on a
manifold, the family of Laplace operators plays a particular role.
The condition (C.6.3) above takes a simpler form for symmetric operators
$L = \Delta_g - \nabla W \cdot \nabla$ for the invariant (reversible) measure with density $e^{-W}$ with respect
to the Riemannian measure $\mu_g$. Namely, setting $e^{-W} = w_1^{m-n}$, (C.6.3) then
turns into
$$\mathrm{Ric}_m(L) = \mathrm{Ric}_g - (m - n)\, \frac{1}{w_1}\, \nabla\nabla w_1 \ge \rho\, g.$$
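The passage from (C.6.3) to the condition on $\mathrm{Ric}_m(L)$ is a short computation. The sketch below assumes $m > n$ and uses only $Z = -\nabla W$ together with the substitution $e^{-W} = w_1^{m-n}$, that is $Z = (m-n)\, \nabla \log w_1$.

```latex
\begin{aligned}
\nabla_S Z = \nabla Z &= (m-n)\, \nabla\nabla \log w_1
  = (m-n) \left[ \frac{\nabla\nabla w_1}{w_1} - \frac{\nabla w_1 \otimes \nabla w_1}{w_1^2} \right], \\
\frac{1}{m-n}\, Z \otimes Z &= (m-n)\, \frac{\nabla w_1 \otimes \nabla w_1}{w_1^2}, \\
\mathrm{Ric}_g - \nabla_S Z - \frac{1}{m-n}\, Z \otimes Z &= \mathrm{Ric}_g - (m-n)\, \frac{\nabla\nabla w_1}{w_1}.
\end{aligned}
```

The quadratic terms in $\nabla w_1$ cancel exactly, which is why the choice of the exponent $m - n$ in the substitution linearizes the condition in $w_1$.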
It is worth mentioning that this tensor Ricm (L) has a simple geometric interpretation
when the dimension m is an integer. Indeed, set p = m − n, and start from the
$n$-dimensional manifold $(M, g)$ together with a smooth function $w_1 : M \to (0, \infty)$.
Consider then an auxiliary $p$-dimensional manifold $(M_1, g_1)$. Equip the product
$\widehat{M} = M \times M_1$ with the Riemannian metric
(such metrics are often called warped products). In terms of the carré du champ
operator, for a function $f(x, y) : M \times M_1 \to \mathbb{R}$ in a local system of coordinates
$(x^i, y^j)$,
$$\widehat{\Gamma}(f, f) = g^{ij}(x)\, \partial_{x^i} f\, \partial_{x^j} f + \frac{1}{w_1^2(x)}\, g_1^{k\ell}(y)\, \partial_{y^k} f\, \partial_{y^\ell} f.$$
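The Riemannian measure on $M \times M_1$ is given by $w_1^p(x)\, d\mu_g(x)\, d\mu_{g_1}(y)$. In the
same way, the new Laplace operator is
$$\widehat{\Delta} f = \Delta_g f + p\, \Gamma(\log w_1, f) + \frac{1}{w_1^2}\, \Delta_{g_1} f.$$
The Ricci operator for this new structure on $M \times M_1$ splits into two parts, that is
$$\widehat{\mathrm{Ric}} = \mathrm{Ric}_0 + \mathrm{Ric}_1$$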
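where the action of $\mathrm{Ric}_0$ only depends on $(\partial_{x^i} f)$ and that of $\mathrm{Ric}_1$ only on $(\partial_{y^j} f)$.
Using the techniques described in Sect. 6.9, p. 313, it may be shown that
$$\mathrm{Ric}_0 = \mathrm{Ric}_g - \frac{p}{w_1}\, \nabla\nabla w_1,$$
$$\mathrm{Ric}_1 = \frac{1}{w_1^4}\, \mathrm{Ric}_{g_1} - \big[ \Delta_g(\log w_1) + p\, \Gamma(\log w_1) \big]\, \frac{1}{w_1^2}\, g_1.$$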
w1 w1
$$4\, \Gamma(f) \big[ \Gamma_2(f) - \rho\, \Gamma(f) \big] \ge \Gamma\big(\Gamma(f)\big). \tag{C.6.4}$$
This reinforcement is rather easy to understand in this differential geometry context.
Namely, the $CD(\rho, \infty)$ condition amounts to $\mathrm{Ric}(L) \ge \rho\, g$. The reinforced
inequality expresses that
$$4\, |\nabla f|^2 \big[ |\nabla\nabla f|^2 + \mathrm{Ric}(L)(\nabla f, \nabla f) - \rho\, |\nabla f|^2 \big] \ge \big| \nabla \big( |\nabla f|^2 \big) \big|^2.$$
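and
$$T\big(\psi(f), \varphi(g)\big) = \psi'(f)\, \varphi'(g)\, T(f, g),$$
for smooth functions $\psi, \varphi : \mathbb{R} \to \mathbb{R}$, smooth functions $f, g : M \to \mathbb{R}$ and 2-tensor $T$,
lead to, for smooth functions $\Psi : \mathbb{R}^k \to \mathbb{R}$ and vectors $f = (f_1, \dots, f_k)$ of smooth
functions on $M$,
$$\begin{aligned}
\Gamma_2\big(\Psi(f)\big) = {} & \sum_{i,j=1}^k X_i X_j\, \Gamma_2(f_i, f_j) + 2 \sum_{i,j,\ell=1}^k X_i Y_{j\ell}\, H(f_i)(f_j, f_\ell) \\
& + \sum_{i,j,\ell,m=1}^k Y_{ij} Y_{\ell m}\, \Gamma(f_i, f_\ell)\, \Gamma(f_j, f_m)
\end{aligned} \tag{C.6.5}$$
where
$$X_i = \partial_i \Psi(f_1, \dots, f_k), \qquad Y_{ij} = \partial^2_{ij} \Psi(f_1, \dots, f_k)$$
and
$$H(f)(g, h) = \frac{1}{2} \big[ \Gamma\big(g, \Gamma(f, h)\big) + \Gamma\big(h, \Gamma(f, g)\big) - \Gamma\big(f, \Gamma(g, h)\big) \big]. \tag{C.6.6}$$
This last expression is directly related to the representation of the Hessian (C.5.5).
A similar formula applies to $\Gamma_2 - \rho\, \Gamma$ instead of $\Gamma_2$.
Note that (C.6.5) of course takes a simpler form when dealing with a single function
$f : M \to \mathbb{R}$ and a smooth $\psi : \mathbb{R} \to \mathbb{R}$,
$$\Gamma_2\big(\psi(f)\big) = \psi'(f)^2\, \Gamma_2(f) + \psi'(f)\, \psi''(f)\, \Gamma\big(f, \Gamma(f)\big) + \psi''(f)^2\, \Gamma(f)^2 \tag{C.6.7}$$
(recall the notation $\Gamma(f) = \Gamma(f, f)$ and $\Gamma_2(f) = \Gamma_2(f, f)$), a formula which will
be used extensively in Chaps. 5 and 6.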
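A one-dimensional example may help to read (C.6.7). The computation below assumes $L = d^2/dx^2$ on the real line, for which $\Gamma(f) = f'^2$, $\Gamma_2(f) = f''^2$ and $\Gamma(f, \Gamma(f)) = 2 f'^2 f''$, and takes $\psi(r) = r^2$, so that $\psi'(f) = 2f$ and $\psi''(f) = 2$.

```latex
\begin{aligned}
\psi'(f)^2\, \Gamma_2(f) + \psi'(f)\psi''(f)\, \Gamma\big(f, \Gamma(f)\big) + \psi''(f)^2\, \Gamma(f)^2
 &= 4 f^2 f''^2 + 8 f f'^2 f'' + 4 f'^4 \\
 &= \big( 2 f f'' + 2 f'^2 \big)^2 \\
 &= \big( (f^2)'' \big)^2 = \Gamma_2(f^2).
\end{aligned}
```

The three terms of (C.6.7) are exactly the three terms in the expansion of the square of $(f^2)''$, which shows in this simple case how the middle term couples first and second order derivatives.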
While these change of variables formulas are of a standard differential calculus
nature, it should be pointed out that a more intrinsic derivation may be developed
on the basis of the diffusion property of the generator $L$ as presented in Sect. 1.11,
p. 42. This point of view will be systematically emphasized throughout this work in
the form of the $\Gamma$-calculus.
Let us now describe how the reinforced inequality (C.6.4) may be obtained from
the standard curvature condition $CD(\rho, \infty)$ by differential calculus and the change
of variables formula (C.6.5). Indeed, given functions $(f_1, \dots, f_k)$ and at any point
$x$, the function $\Psi$ may be chosen in such a way that the coefficients $X_i$ and $Y_{ij}$ take
any particular value, provided the symmetries $Y_{ij} = Y_{ji}$ are respected (for example
just letting $\Psi$ vary among second degree polynomials). The curvature condition
$CD(\rho, \infty)$ therefore yields a positive quadratic form in the variables $(X_i, Y_{ij})$. It is
for example a quadratic form in the nine variables $X_i$, $Y_{ij}$, $i, j = 1, \dots, 3$, $i \le j$, in
case of three functions $f_1, f_2, f_3$. Illustrating the argument in this simple example,
restrict this quadratic form to the set where all the variables are 0 except $X_1$ and $Y_{23}$.
Its determinant
$$\big[ \Gamma_2(f_1) - \rho\, \Gamma(f_1) \big] \big[ \Gamma(f_2, f_3)^2 + \Gamma(f_2)\, \Gamma(f_3) \big] - 2\, H(f_1)(f_2, f_3)^2$$
Now,
$$H(f_1)(f_1, f_2) = \frac{1}{2}\, \Gamma\big(f_2, \Gamma(f_1)\big),$$
so that choosing $f_2 = \Gamma(f_1)$, it follows that
$$\Gamma(f_2)^2 \le 4 \big[ \Gamma_2(f_1) - \rho\, \Gamma(f_1) \big]\, \Gamma(f_1)\, \Gamma(f_2).$$
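The intermediate steps of this argument can be spelled out as follows. This is a sketch based on (C.6.5) applied to $\Gamma_2 - \rho\, \Gamma$; the abbreviations $a$, $b$, $c$ are introduced here only for readability and are not notation from the text.

```latex
\begin{aligned}
& a = \Gamma_2(f_1) - \rho\, \Gamma(f_1), \qquad b = H(f_1)(f_2, f_3), \qquad
  c = \Gamma(f_2, f_3)^2 + \Gamma(f_2)\, \Gamma(f_3), \\
& Q(X_1, Y_{23}) = a\, X_1^2 + 4 b\, X_1 Y_{23} + 2 c\, Y_{23}^2 \ge 0
  \quad \Longrightarrow \quad a\, c \ge 2\, b^2, \\
& f_3 = f_1 : \quad b = \tfrac{1}{2}\, \Gamma\big(f_2, \Gamma(f_1)\big), \qquad
  c \le 2\, \Gamma(f_1)\, \Gamma(f_2) \ \text{ since } \ \Gamma(f_1, f_2)^2 \le \Gamma(f_1)\, \Gamma(f_2), \\
& \Gamma\big(f_2, \Gamma(f_1)\big)^2 \le 4\, a\, \Gamma(f_1)\, \Gamma(f_2).
\end{aligned}
```

With $f_2 = \Gamma(f_1)$ the left-hand side is $\Gamma(f_2)^2$, and dividing by $\Gamma(f_2)$ where it does not vanish gives $\Gamma(\Gamma(f_1)) \le 4\, \Gamma(f_1) [\Gamma_2(f_1) - \rho\, \Gamma(f_1)]$, which is (C.6.4). The factors 4 and 2 in $Q$ come from the symmetry $Y_{23} = Y_{32}$.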
$$L = \sum_{j=1}^d Z_j^2 + Z_0$$
where $Z_0, Z_1, \dots, Z_d$ are vector fields, that is first order differential operators with
no zero-order terms. Here, the carré du champ operator is given, on smooth functions,
by $\Gamma(f) = \sum_{j=1}^d (Z_j f)^2$. In order to have a tractable form for the $\Gamma_2$ operator,
it is useful to introduce the symmetric second order derivatives associated to this
decomposition, that is $D_{ij} f = \frac{1}{2}(Z_i Z_j f + Z_j Z_i f)$, $0 \le i, j \le d$, and to denote the
usual commutators $[Z_i, Z_j]$ by $Z_{ij}$. Introduce moreover $D_{k,ij} = \frac{1}{2}(Z_k Z_{ij} + Z_{ij} Z_k)$.
Then
$$\Gamma_2(f) = \sum_{i,j=1}^d \Big( D_{ij} f + \frac{1}{2}\, Z_{ij} f \Big)^2 + 2 \sum_{i,j=1}^d Z_i f\, D_{j,ji} f + \sum_{j=1}^d Z_j f\, Z_{0j} f.$$
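This expression can be recovered by expanding the definition (C.5.2) directly. The following sketch records the intermediate steps, using only $\Gamma(f) = \sum_j (Z_j f)^2$ and the definitions of $D_{ij}$, $Z_{ij}$ and $D_{k,ij}$, with indices $i, j$ running from 1 to $d$.

```latex
\begin{aligned}
\tfrac{1}{2}\, L\, \Gamma(f) &= \sum_{i,j=1}^d \big[ (Z_i Z_j f)^2 + Z_j f \, Z_i Z_i Z_j f \big] + \sum_{j=1}^d Z_j f \, Z_0 Z_j f, \\
\Gamma(f, Lf) &= \sum_{i,j=1}^d Z_j f \, Z_j Z_i Z_i f + \sum_{j=1}^d Z_j f \, Z_j Z_0 f, \\
Z_i Z_i Z_j - Z_j Z_i Z_i &= Z_i Z_{ij} + Z_{ij} Z_i = 2\, D_{i,ij}, \qquad Z_i Z_j = D_{ij} + \tfrac{1}{2}\, Z_{ij}, \\
\Gamma_2(f) &= \sum_{i,j=1}^d \big( D_{ij} f + \tfrac{1}{2}\, Z_{ij} f \big)^2 + 2 \sum_{i,j=1}^d Z_j f \, D_{i,ij} f + \sum_{j=1}^d Z_j f \, Z_{0j} f.
\end{aligned}
```

Exchanging the names of the summation indices $i$ and $j$ in the middle sum gives the form displayed above.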
To get a useful formula, as is done in the elliptic case with the use of the language
of differential geometry, it is necessary to separate the first and second order terms.
This is possible for example whenever, for 0 ≤ i, j ≤ d,
$$Z_{ij} = \sum_{k=1}^d \alpha_{ij}^k\, Z_k$$
$$\Gamma_2(f) = \sum_{i,j=1}^d \Big( D_{ij} f + \sum_{k=1}^d \alpha_{jk}^{i}\, Z_k f \Big)^2 + R(f, f),$$
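where $R(f, f)$ (which plays the role of the Ricci tensor) is given by
$$\begin{aligned}
R(f, f) = {} & \sum_{i,j=1}^d \bigg[ \frac{1}{4} \Big( \sum_{k=1}^d \alpha_{ij}^k\, Z_k f \Big)^2 - \Big( \sum_{k=1}^d \alpha_{jk}^{i}\, Z_k f \Big)^2 \bigg] \\
& + \sum_{i,j=1}^d Z_i f\, Z_j f\, \Big( \alpha_{0i}^{j} + \sum_{k=1}^d Z_k \alpha_{ki}^{j} \Big).
\end{aligned}$$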
Standard references on differentiable manifolds include [68, 159, 166, 167]. Basics
on Riemannian geometry and Ricci curvature may be found, for example,
in [62, 123, 126, 194, 260, 346] where in particular the construction of the Laplace-
Beltrami operator and the Bochner-Lichnerowicz formula (Theorem C.3.3) are
emphasized. Comparison methods based on Ricci curvature bounds are surveyed
in [123, 126, 448]. See also [61] for an overview of modern Riemannian geometry.
The characterization of completeness of Proposition C.4.1 is due to R. Stri-
chartz [390] (see also [23] in the context of this work).
The $\Gamma$ and $\Gamma_2$-calculus of Sects. C.5 and C.6 was introduced in [36] in the context
of logarithmic Sobolev inequalities and in [22, 24, 25] in the study of Riesz transforms.
The definition of the curvature-dimension condition $CD(\rho, n)$ (also called
the $\Gamma_2$ criterion), which for Laplace-Beltrami operators boils down to the Bochner
C.7 Notes and References 519
formula in the context of harmonic maps between manifolds, may be found there.
For more on Bochner-Weitzenböck formulas, see [92]. These ideas have been fur-
ther developed in the early lecture notes [26] (see also [27, 28, 277]). Some geomet-
ric properties of the tensor Ric(L) are examined in [290].
Afterword
Preparation: 60 minutes.
Cooking time: 45 minutes.
– 1 Bresse chicken
– 50 g of butter
– 2 glasses of white wine (Jura wine, preferably Savagnin, is perfect)
– 150 g of Comté cheese, plus 50 g for gratin (cheese topping)
– 400 g of single cream
– 1 teaspoon of paprika
– 1 tablespoon of Dijon mustard
– 1 tablespoon of breadcrumbs
– salt and pepper
(i) Put the pieces of chicken in a cooking pot with the butter, the paprika, salt
and pepper. Turn them over periodically and let the preparation cook for
30 minutes.
(ii) In a separate pan, mix slowly while heating the wine, cheese, mustard and
cream. Do not let the mixture boil.
(iii) Put the chicken in a gratin dish with the above mixture. Cover the preparation
with the spare grated cheese and breadcrumbs.
(iv) Place the dish in a hot oven (240 °C) for 15 minutes.
What about the heat equation in the oven: is this some practical application of
the theory developed here, or just an exercise?
1. R.A. Adams, Sobolev Spaces. Pure and Applied Mathematics, vol. 65 (Academic Press, New
York, 1975)
2. R.A. Adams, J.J.F. Fournier, Sobolev Spaces, 2nd edn. Pure and Applied Mathematics,
vol. 140 (Elsevier/Academic Press, Amsterdam, 2003)
3. M. Agueh, N. Ghoussoub, X. Kang, Geometric inequalities via a general comparison princi-
ple for interacting gases. Geom. Funct. Anal. 14(1), 215–244 (2004)
4. S. Aida, Uniform positivity improving property, Sobolev inequalities, and spectral gaps.
J. Funct. Anal. 158(1), 152–185 (1998)
5. S. Aida, K.D. Elworthy, Differential calculus on path and loop spaces. I. Logarithmic Sobolev
inequalities on path spaces. C. R. Math. Acad. Sci. Paris, Sér. I 321(1), 97–102 (1995)
6. S. Aida, T. Masuda, I. Shigekawa, Logarithmic Sobolev inequalities and exponential integra-
bility. J. Funct. Anal. 126(1), 83–101 (1994)
7. S. Aida, D.W. Stroock, Moment estimates derived from Poincaré and logarithmic Sobolev
inequalities. Math. Res. Lett. 1(1), 75–86 (1994)
8. G. Allaire, A la recherche de l’inégalité perdue. Matapli 98, 52–64 (2012)
9. L. Ambrosio, N. Gigli, G. Savaré, Gradient Flows in Metric Spaces and in the Space of
Probability Measures, 2nd edn. Lectures in Mathematics ETH Zürich (Birkhäuser, Basel,
2008)
10. L. Ambrosio, N. Gigli, G. Savaré, Bakry-Emery curvature-dimension condition and Rieman-
nian Ricci curvature bounds. Preprint, 2012
11. L. Ambrosio, N. Gigli, G. Savaré, Metric measure spaces with Riemannian Ricci curvature
bounded from below. Preprint, 2012
12. L. Ambrosio, N. Gigli, G. Savaré, Calculus and heat flow in metric measure spaces and
applications to spaces with Ricci bounds from below. Invent. Math. (2013). doi:10.1007/
s00222-013-0456-1
13. L. Ambrosio, P. Tilli, Topics on Analysis in Metric Spaces. Oxford Lecture Series in Mathe-
matics and Its Applications, vol. 25 (Oxford University Press, Oxford, 2004)
14. C. Ané, S. Blachère, D. Chafaï, P. Fougères, I. Gentil, F. Malrieu, C. Roberto, G. Schef-
fer, Sur les Inégalités de Sobolev Logarithmiques. Panoramas et Synthèses, vol. 10 (Société
Mathématique de France, Paris, 2000)
15. D. Applebaum, Lévy Processes and Stochastic Calculus, 2nd edn. Cambridge Studies in
Advanced Mathematics, vol. 116 (Cambridge University Press, Cambridge, 2009)
16. A. Arnold, P. Markowich, G. Toscani, A. Unterreiter, On convex Sobolev inequalities and
the rate of convergence to equilibrium for Fokker-Planck type equations. Commun. Partial
Differ. Equ. 26(1–2), 43–100 (2001)
17. T. Aubin, Équations différentielles non linéaires et problème de Yamabe concernant la cour-
bure scalaire. J. Math. Pures Appl. 55(3), 269–296 (1976)
18. T. Aubin, Espaces de Sobolev sur les variétés Riemanniennes. Bull. Sci. Math. 100(2),
149–173 (1976)
19. T. Aubin, Problèmes isopérimétriques et espaces de Sobolev. J. Differ. Geom. 11(4), 573–598
(1976)
20. T. Aubin, Nonlinear Analysis on Manifolds. Monge-Ampère Equations. Grundlehren
der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences],
vol. 252 (Springer, New York, 1982)
21. T. Aubin, Some Nonlinear Problems in Riemannian Geometry. Springer Monographs in
Mathematics (Springer, Berlin, 1998)
22. D. Bakry, Transformations de Riesz pour les semi-groupes symétriques. II. Étude sous la con-
dition Γ₂ ≥ 0, in Séminaire de Probabilités, XIX, 1983/84. Lecture Notes in Math., vol. 1123
(Springer, Berlin, 1985), pp. 145–174
23. D. Bakry, Un critère de non-explosion pour certaines diffusions sur une variété Riemanni-
enne complète. C. R. Math. Acad. Sci. Paris, Sér. I 303(1), 23–26 (1986)
24. D. Bakry, Étude des transformations de Riesz dans les variétés Riemanniennes à courbure
de Ricci minorée, in Séminaire de Probabilités, XXI. Lecture Notes in Math., vol. 1247
(Springer, Berlin, 1987), pp. 137–172
25. D. Bakry, The Riesz transforms associated with second order differential operators, in Sem-
inar on Stochastic Processes, Gainesville, FL, 1988. Progr. Probab., vol. 17 (Birkhäuser,
Boston, 1989), pp. 1–43
26. D. Bakry, L’hypercontractivité et son utilisation en théorie des semigroupes, in Lectures on
Probability Theory, Saint-Flour, 1992. Lecture Notes in Math., vol. 1581 (Springer, Berlin,
1994), pp. 1–114
27. D. Bakry, On Sobolev and logarithmic Sobolev inequalities for Markov semigroups, in
New Trends in Stochastic Analysis, Charingworth, 1994 (World Sci., River Edge, 1997),
pp. 43–75
28. D. Bakry, Functional inequalities for Markov semigroups, in Probability Measures on
Groups: Recent Directions and Trends (Tata Inst. Fund. Res, Mumbai, 2006), pp. 91–147
29. D. Bakry, F. Barthe, P. Cattiaux, A. Guillin, A simple proof of the Poincaré inequality for
a large class of probability measures including the log-concave case. Electron. Commun.
Probab. 13, 60–66 (2008)
30. D. Bakry, F. Baudoin, M. Bonnefont, D. Chafaï, On gradient bounds for the heat kernel on
the Heisenberg group. J. Funct. Anal. 255(8), 1905–1938 (2008)
31. D. Bakry, F. Bolley, I. Gentil, Dimension dependent hypercontractivity for Gaussian kernels.
Probab. Theory Relat. Fields 154(3), 845–874 (2012)
32. D. Bakry, F. Bolley, I. Gentil, P. Maheux, Weighted Nash inequalities. Rev. Mat. Iberoam.
28(3), 879–906 (2012)
33. D. Bakry, P. Cattiaux, A. Guillin, Rate of convergence for ergodic continuous Markov pro-
cesses: Lyapunov versus Poincaré. J. Funct. Anal. 254(3), 727–759 (2008)
34. D. Bakry, D. Concordet, M. Ledoux, Optimal heat kernel bounds under logarithmic Sobolev
inequalities. ESAIM Probab. Stat. 1, 391–407 (1995/97) (electronic)
35. D. Bakry, T. Coulhon, M. Ledoux, L. Saloff-Coste, Sobolev inequalities in disguise. Indiana
Univ. Math. J. 44(4), 1033–1074 (1995)
36. D. Bakry, M. Émery, Diffusions hypercontractives, in Séminaire de Probabilités, XIX,
1983/1984. Lecture Notes in Math., vol. 1123 (Springer, Berlin, 1985), pp. 177–206
37. D. Bakry, I. Gentil, M. Ledoux, On Harnack inequalities and optimal transportation. Ann.
Sc. Norm. Sup. Pisa (2012). doi:10.2422/2036-2145.201210_007
38. D. Bakry, M. Ledoux, Lévy-Gromov’s isoperimetric inequality for an infinite-dimensional
diffusion generator. Invent. Math. 123(2), 259–281 (1996)
39. D. Bakry, M. Ledoux, Sobolev inequalities and Myers’s diameter theorem for an abstract
Markov generator. Duke Math. J. 85(1), 253–270 (1996)
40. D. Bakry, M. Ledoux, A logarithmic Sobolev form of the Li-Yau parabolic inequality. Rev.
Mat. Iberoam. 22(2), 683–702 (2006)
Bibliography 529
41. D. Bakry, S.Y. Orevkov, M. Zani, Orthogonal polynomials and diffusion operators. Preprint,
2013
42. D. Bakry, Z.M. Qian, Some new results on eigenvectors via dimension, diameter, and Ricci
curvature. Adv. Math. 155(1), 98–153 (2000)
43. Z.M. Balogh, A. Engulatov, L. Hunziker, O.E. Maasalo, Functional inequalities and
Hamilton–Jacobi equations in geodesic spaces. Potential Anal. 36(2), 317–337 (2012)
44. G. Barles, Solutions de Viscosité des Équations de Hamilton-Jacobi. Mathématiques & Ap-
plications (Berlin) [Mathematics & Applications], vol. 17 (Springer, Paris, 1994)
45. F. Barthe, P. Cattiaux, C. Roberto, Concentration for independent random variables with
heavy tails. Appl. Math. Res. Express 2, 39–60 (2005)
46. F. Barthe, P. Cattiaux, C. Roberto, Interpolated inequalities between exponential and Gaus-
sian, Orlicz hypercontractivity and isoperimetry. Rev. Mat. Iberoam. 22(3), 993–1067 (2006)
47. F. Barthe, P. Cattiaux, C. Roberto, Isoperimetry between exponential and Gaussian. Electron.
J. Probab. 12(44), 1212–1237 (2007) (electronic)
48. F. Barthe, A.V. Kolesnikov, Mass transport and variants of the logarithmic Sobolev inequal-
ity. J. Geom. Anal. 18(4), 921–979 (2008)
49. F. Barthe, C. Roberto, Sobolev inequalities for probability measures on the real line. Stud.
Math. 159(3), 481–497 (2003). Dedicated to Professor Aleksander Pełczyński on the occa-
sion of his 70th birthday
50. R.F. Bass, Diffusions and Elliptic Operators. Probability and Its Applications (New York)
(Springer, New York, 1998)
51. F. Baudoin, N. Garofalo, Curvature-dimension inequalities and Ricci lower bounds for sub-
Riemannian manifolds with transverse symmetries. Preprint, 2012
52. V. Bayle, A differential inequality for the isoperimetric profile. Int. Math. Res. Not. 7,
311–342 (2004)
53. M. Bebendorf, A note on the Poincaré inequality for convex domains. Z. Anal. Anwend.
22(4), 751–756 (2003)
54. W. Beckner, Inequalities in Fourier analysis. Ann. Math. (2) 102(1), 159–182 (1975)
55. W. Beckner, A generalized Poincaré inequality for Gaussian measures. Proc. Am. Math. Soc.
105(2), 397–400 (1989)
56. W. Beckner, Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality.
Ann. Math. (2) 138(1), 213–242 (1993)
57. W. Beckner, Geometric asymptotics and the logarithmic Sobolev inequality. Forum Math.
11(1), 105–137 (1999)
58. W. Beckner, M. Pearson, On sharp Sobolev embedding and the logarithmic Sobolev inequal-
ity. Bull. Lond. Math. Soc. 30(1), 80–84 (1998)
59. J. Bennett, A. Carbery, M. Christ, T. Tao, The Brascamp-Lieb inequalities: finiteness, struc-
ture and extremals. Geom. Funct. Anal. 17(5), 1343–1415 (2008)
60. P. Bérard, G. Besson, S. Gallot, Sur une inégalité isopérimétrique qui généralise celle de Paul
Lévy-Gromov. Invent. Math. 80(2), 295–308 (1985)
61. M. Berger, A Panoramic View of Riemannian Geometry (Springer, Berlin, 2003)
62. M. Berger, P. Gauduchon, E. Mazet, Le Spectre d’Une Variété Riemannienne. Lecture Notes
in Mathematics, vol. 194 (Springer, Berlin, 1971)
63. L. Bertini, B. Zegarliński, Coercive inequalities for Gibbs measures. J. Funct. Anal. 162(2),
257–286 (1999)
64. J. Bertoin, Lévy Processes. Cambridge Tracts in Mathematics, vol. 121 (Cambridge Univer-
sity Press, Cambridge, 1996)
65. J. Bertoin, Subordinators: examples and applications, in Lectures on Probability Theory and
Statistics, Saint-Flour, 1997. Lecture Notes in Math., vol. 1717 (Springer, Berlin, 1999),
pp. 1–91
66. M.-F. Bidaut-Véron, L. Véron, Nonlinear elliptic equations on compact Riemannian mani-
folds and asymptotics of Emden equations. Invent. Math. 106(3), 489–539 (1991)
67. P. Billingsley, Probability and Measure. Wiley Series in Probability and Statistics (Wiley,
Hoboken, 2012). Anniversary edition [of MR1324786], with a foreword by Steve Lalley and
a brief biography of Billingsley by Steve Koppes
68. R.L. Bishop, R.J. Crittenden, Geometry of Manifolds. Pure and Applied Mathematics,
vol. XV (Academic Press, New York, 1964)
69. J.-M. Bismut, Large Deviations and the Malliavin Calculus. Progress in Mathematics, vol. 45
(Birkhäuser, Boston, 1984)
70. W. Blaschke, Kreis und Kugel (Chelsea, New York, 1949)
71. G. Blower, The Gaussian isoperimetric inequality and transportation. Positivity 7(3),
203–224 (2003)
72. R.M. Blumenthal, R.K. Getoor, Markov Processes and Potential Theory. Pure and Applied
Mathematics, vol. 29 (Academic Press, New York, 1968)
73. S.G. Bobkov, An isoperimetric inequality on the discrete cube, and an elementary proof of
the isoperimetric inequality in Gauss space. Ann. Probab. 25(1), 206–214 (1997)
74. S.G. Bobkov, Isoperimetric and analytic inequalities for log-concave probability measures.
Ann. Probab. 27(4), 1903–1921 (1999)
75. S.G. Bobkov, A localized proof of the isoperimetric Bakry-Ledoux inequality and some ap-
plications. Teor. Veroâtn. Ee Primen. 47(2), 340–346 (2002)
76. S.G. Bobkov, I. Gentil, M. Ledoux, Hypercontractivity of Hamilton-Jacobi equations.
J. Math. Pures Appl. 80(7), 669–696 (2001)
77. S.G. Bobkov, F. Götze, Exponential integrability and transportation cost related to logarith-
mic Sobolev inequalities. J. Funct. Anal. 163(1), 1–28 (1999)
78. S.G. Bobkov, C. Houdré, Some connections between isoperimetric and Sobolev-type in-
equalities. Mem. Am. Math. Soc. 129, 616 (1997), pp. viii+111
79. S.G. Bobkov, M. Ledoux, Poincaré’s inequalities and Talagrand’s concentration phenomenon
for the exponential distribution. Probab. Theory Relat. Fields 107(3), 383–400 (1997)
80. S.G. Bobkov, B. Zegarliński, Entropy bounds and isoperimetry. Mem. Am. Math. Soc. 176,
829 (2005)
81. T. Bodineau, B. Zegarliński, Hypercontractivity via spectral theory. Infin. Dimens. Anal.
Quantum Probab. Relat. Top. 3(1), 15–31 (2000)
82. V.I. Bogachev, Gaussian Measures. Mathematical Surveys and Monographs, vol. 62 (Amer-
ican Mathematical Society, Providence, 1998)
83. V.I. Bogachev, Measure Theory, vol. I, II (Springer, Berlin, 2007)
84. F. Bolley, I. Gentil, Phi-entropy inequalities for diffusion semigroups. J. Math. Pures Appl.
93(5), 449–473 (2010)
85. F. Bolley, I. Gentil, A. Guillin, Dimensional contraction via Markov transportation distance.
Preprint, 2013
86. L. Boltzmann, Lectures on Gas Theory (University of California Press, Berkeley, 1964).
Translated by Stephen G. Brush
87. C. Borell, The Brunn-Minkowski inequality in Gauss space. Invent. Math. 30(2), 207–216
(1975)
88. C. Borell, Positivity improving operators and hypercontractivity. Math. Z. 180(2), 225–234
(1982)
89. A.A. Borovkov, S.A. Utev, An inequality and a characterization of the normal distribution
connected with it. Teor. Veroâtn. Ee Primen. 28(2), 209–218 (1983)
90. S. Boucheron, G. Lugosi, P. Massart, Concentration Inequalities: A Nonasymptotic Theory
of Independence (Oxford University Press, Oxford, 2013)
91. N. Bouleau, F. Hirsch, Dirichlet Forms and Analysis on Wiener Space. de Gruyter Studies in
Mathematics, vol. 14 (Walter de Gruyter, Berlin, 1991)
92. J.-P. Bourguignon, The “magic” of Weitzenböck formulas, in Variational Methods, Paris,
1988. Progr. Nonlinear Differential Equations Appl., vol. 4 (Birkhäuser, Boston, 1990),
pp. 251–271
93. H.J. Brascamp, E.H. Lieb, On extensions of the Brunn-Minkowski and Prékopa-Leindler
theorems, including inequalities for log concave functions, and with an application to the
diffusion equation. J. Funct. Anal. 22(4), 366–389 (1976)
94. Y. Brenier, Polar factorization and monotone rearrangement of vector-valued functions.
Commun. Pure Appl. Math. 44(4), 375–417 (1991)
95. H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Es-
paces de Hilbert. North-Holland Mathematics Studies, No. 5. Notas de Matemática (50)
(North-Holland, Amsterdam, 1973)
96. H. Brézis, Functional Analysis, Sobolev Spaces and Partial Differential Equations (Springer,
New York, 2011)
97. D. Burago, Y.D. Burago, S. Ivanov, A Course in Metric Geometry. Graduate Studies in Math-
ematics, vol. 33 (American Mathematical Society, Providence, 2001)
98. Y.D. Burago, V.A. Zalgaller, Geometric Inequalities. Grundlehren der mathematischen Wis-
senschaften [Fundamental Principles of Mathematical Sciences], vol. 285 (Springer, Berlin,
1988). Translated from the Russian by A.B. Sosinskiı̆, Springer Series in Soviet Mathematics
99. P. Buser, Notes: a geometric approach to invariant subspaces of orthogonal matrices. Am.
Math. Mon. 89(10), 751 (1982)
100. P.L. Butzer, H. Berens, Semi-Groups of Operators and Approximation. Die Grundlehren der
mathematischen Wissenschaften, vol. 145 (Springer, New York, 1967)
101. L.A. Caffarelli, Boundary regularity of maps with convex potentials. II. Ann. Math. (2)
144(3), 453–496 (1996)
102. L.A. Caffarelli, A priori estimates and the geometry of the Monge Ampère equation, in Non-
linear Partial Differential Equations in Differential Geometry, Park City, UT, 1992. IAS/Park
City Math. Ser., vol. 2. (Amer. Math. Soc., Providence, 1996), pp. 5–63
103. L.A. Caffarelli, Monotonicity properties of optimal transportation and the FKG and related
inequalities. Commun. Math. Phys. 214(3), 547–563 (2000)
104. P. Cannarsa, C. Sinestrari, Semiconcave Functions, Hamilton-Jacobi Equations, and Opti-
mal Control. Progress in Nonlinear Differential Equations and Their Applications, vol. 58
(Birkhäuser, Boston, 2004)
105. M. Capitaine, E.P. Hsu, M. Ledoux, Martingale representation and a simple proof of log-
arithmic Sobolev inequalities on path spaces. Electron. Commun. Probab. 2, 71–81 (1997)
(electronic)
106. E.A. Carlen, Superadditivity of Fisher’s information and logarithmic Sobolev inequalities.
J. Funct. Anal. 101(1), 194–211 (1991)
107. E.A. Carlen, J.A. Carrillo, M. Loss, Hardy-Littlewood-Sobolev inequalities via fast diffusion
flows. Proc. Natl. Acad. Sci. USA 107(46), 19696–19701 (2010)
108. E.A. Carlen, S. Kusuoka, D.W. Stroock, Upper bounds for symmetric Markov transition
functions. Ann. Inst. Henri Poincaré Probab. Stat. 23(2), 245–287 (1987)
109. E.A. Carlen, E.H. Lieb, M. Loss, A sharp analog of Young’s inequality on S^N and related
entropy inequalities. J. Geom. Anal. 14(3), 487–520 (2004)
110. E.A. Carlen, M. Loss, Sharp constant in Nash’s inequality. Int. Math. Res. Not. 7, 213–215
(1993)
111. J.A. Carrillo, A. Jüngel, P.A. Markowich, G. Toscani, A. Unterreiter, Entropy dissipation
methods for degenerate parabolic problems and generalized Sobolev inequalities. Monat-
shefte Math. 133(1), 1–82 (2001)
112. J.A. Carrillo, R.J. McCann, C. Villani, Kinetic equilibration rates for granular media and
related equations: entropy dissipation and mass transportation estimates. Rev. Mat. Iberoam.
19(3), 971–1018 (2003)
113. J.A. Carrillo, R.J. McCann, C. Villani, Contractions in the 2-Wasserstein length space and
thermalization of granular media. Arch. Ration. Mech. Anal. 179, 217–263 (2006)
114. G. Carron, Inégalités isopérimétriques de Faber-Krahn et conséquences, in Actes de la Table
Ronde de Géométrie Différentielle, Luminy, 1992. Sémin. Congr., vol. 1 (Soc. Math. France,
Paris, 1996), pp. 205–232
115. P. Cattiaux, A pathwise approach of some classical inequalities. Potential Anal. 20(4),
361–394 (2004)
116. P. Cattiaux, I. Gentil, A. Guillin, Weak logarithmic Sobolev inequalities and entropic conver-
gence. Probab. Theory Relat. Fields 139(3–4), 563–603 (2007)
117. P. Cattiaux, A. Guillin, On quadratic transportation cost inequalities. J. Math. Pures Appl.
86(4), 341–361 (2006)
118. P. Cattiaux, A. Guillin, Long Time Behavior of Markov process and Functional Inequalities.
Forthcoming Monograph, 2014
119. P. Cattiaux, A. Guillin, F.-Y. Wang, L. Wu, Lyapunov conditions for super Poincaré inequal-
ities. J. Funct. Anal. 256(6), 1821–1841 (2009)
120. D. Chafaï, Entropies, convexity, and functional inequalities: on Φ-entropies and Φ-Sobolev
inequalities. J. Math. Kyoto Univ. 44(2), 325–363 (2004)
121. I. Chavel, Eigenvalues in Riemannian Geometry. Pure and Applied Mathematics, vol. 115
(Academic Press, Orlando, 1984). Including a chapter by Burton Randol, with an appendix
by Jozef Dodziuk
122. I. Chavel, Isoperimetric Inequalities. Cambridge Tracts in Mathematics, vol. 145 (Cambridge
University Press, Cambridge, 2001). Differential geometric and analytic perspectives
123. I. Chavel, A modern introduction, in Riemannian Geometry. Cambridge Studies in Advanced
Mathematics, vol. 98, 2nd edn. (Cambridge University Press, Cambridge, 2006)
124. J. Cheeger, A lower bound for the smallest eigenvalue of the Laplacian, in Problems in Anal-
ysis (Papers Dedicated to Salomon Bochner, 1969) (Princeton Univ. Press, Princeton, 1970),
pp. 195–199
125. J. Cheeger, Differentiability of Lipschitz functions on metric measure spaces. Geom. Funct.
Anal. 9(3), 428–517 (1999)
126. J. Cheeger, D.G. Ebin, Comparison Theorems in Riemannian Geometry (AMS Chelsea,
Providence, 2008). Revised reprint of the 1975 original
127. L.H.Y. Chen, An inequality for the multivariate normal distribution. J. Multivar. Anal. 12(2),
306–315 (1982)
128. M.-F. Chen, Trilogy of couplings and general formulas for lower bound of spectral gap, in
Probability Towards 2000, New York, 1995. Lecture Notes in Statist., vol. 128 (Springer,
New York, 1998), pp. 123–136
129. M.-F. Chen, F.-Y. Wang, Application of coupling method to the first eigenvalue on manifold.
Prog. Nat. Sci. 5(2), 227–229 (1995)
130. H. Chernoff, A note on an inequality involving the normal distribution. Ann. Probab. 9(3),
533–535 (1981)
131. T.S. Chihara, An Introduction to Orthogonal Polynomials. Mathematics and Its Applications,
vol. 13 (Gordon and Breach, New York, 1978)
132. K.L. Chung, Lectures from Markov Processes to Brownian Motion. Grundlehren der math-
ematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 249
(Springer, New York, 1982)
133. D. Cordero-Erausquin, Some applications of mass transport to Gaussian-type inequalities.
Arch. Ration. Mech. Anal. 161(3), 257–269 (2002)
134. D. Cordero-Erausquin, W. Gangbo, C. Houdré, Inequalities for generalized entropy and op-
timal transportation, in Recent Advances in the Theory and Applications of Mass Transport.
Contemp. Math., vol. 353 (Am. Math. Soc., Providence, 2004), pp. 73–94
135. D. Cordero-Erausquin, R.J. McCann, M. Schmuckenschläger, A Riemannian interpolation
inequality à la Borell, Brascamp and Lieb. Invent. Math. 146(2), 219–257 (2001)
136. D. Cordero-Erausquin, R.J. McCann, M. Schmuckenschläger, Prékopa-Leindler type in-
equalities on Riemannian manifolds, Jacobi fields, and optimal transport. Ann. Fac. Sci.
Toulouse 15(4), 613–635 (2006)
137. D. Cordero-Erausquin, B. Nazaret, C. Villani, A mass-transportation approach to sharp
Sobolev and Gagliardo-Nirenberg inequalities. Adv. Math. 182(2), 307–332 (2004)
138. T. Coulhon, Inégalités de Gagliardo-Nirenberg pour les semi-groupes d’opérateurs et appli-
cations. Potential Anal. 1(4), 343–353 (1992)
139. T. Coulhon, Ultracontractivity and Nash type inequalities. J. Funct. Anal. 141(2), 510–539
(1996)
140. R. Courant, D. Hilbert, Methoden der mathematischen Physik, vol. 2 (1937) (German)
141. T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. (Wiley-Interscience,
Hoboken, 2006)
142. G. Da Prato, An Introduction to Infinite-Dimensional Analysis. Universitext (Springer, Berlin,
2006). Revised and extended from the 2001 original by Da Prato
143. E.B. Davies, One-Parameter Semigroups. London Mathematical Society Monographs,
vol. 15 (Academic Press, London, 1980)
144. E.B. Davies, Heat Kernels and Spectral Theory. Cambridge Tracts in Mathematics, vol. 92
(Cambridge University Press, Cambridge, 1989)
145. E.B. Davies, L. Gross, B. Simon, Hypercontractivity: a bibliographic review, in Ideas and
Methods in Quantum and Statistical Physics, Oslo, 1988 (Cambridge Univ. Press, Cam-
bridge, 1992), pp. 370–389
146. E.B. Davies, B. Simon, Ultracontractivity and the heat kernel for Schrödinger operators and
Dirichlet Laplacians. J. Funct. Anal. 59(2), 335–395 (1984)
147. M. Del Pino, J. Dolbeault, Best constants for Gagliardo-Nirenberg inequalities and applica-
tions to nonlinear diffusions. J. Math. Pures Appl. 81(9), 847–875 (2002)
148. M. Del Pino, J. Dolbeault, The optimal Euclidean L^p-Sobolev logarithmic inequality.
J. Funct. Anal. 197(1), 151–161 (2003)
149. H. Delin, A proof of the equivalence between Nash and Sobolev inequalities. Bull. Sci. Math.
120(4), 405–411 (1996)
150. C. Dellacherie, Capacités et Processus Stochastiques. Ergebnisse der Mathematik und ihrer
Grenzgebiete, vol. 67 (Springer, Berlin, 1972)
151. C. Dellacherie, P.-A. Meyer, Probabilities and Potential. North-Holland Mathematics Stud-
ies, vol. 29 (North-Holland, Amsterdam, 1978)
152. C. Dellacherie, P.-A. Meyer, Theory of martingales, in Probabilities and Potential. B. North-
Holland Mathematics Studies, vol. 72 (North-Holland, Amsterdam, 1982). Translated from
the French by J.P. Wilson
153. C. Dellacherie, P.-A. Meyer, Théorie du potentiel associée à une résolvante. Théorie des pro-
cessus de Markov, in Probabilités et Potentiel. 2nd edn. Publications de l’Institut de Mathé-
matiques de l’Université de Strasbourg (Hermann, Paris, 1987). Chapitres XII–XVI
154. C. Dellacherie, P.-A. Meyer, Potential theory for discrete and continuous semigroups,
in Probabilities and Potential. C. North-Holland Mathematics Studies, vol. 151 (North-
Holland, Amsterdam, 1988). Translated from the French by J.R. Norris
155. J. Demange, Porous media equation and Sobolev inequalities under negative curvature. Bull.
Sci. Math. 129(10), 804–830 (2005)
156. J. Demange, Improved Gagliardo-Nirenberg-Sobolev inequalities on manifolds with positive
curvature. J. Funct. Anal. 254(3), 593–611 (2008)
157. A. Dembo, T.M. Cover, J.A. Thomas, Information-theoretic inequalities. IEEE Trans. Inf.
Theory 37(6), 1501–1518 (1991)
158. J.-D. Deuschel, D.W. Stroock, Large Deviations. Pure and Applied Mathematics, vol. 137
(Academic Press, Boston, 1989)
159. M.P. do Carmo, Riemannian Geometry. Mathematics: Theory & Applications (Birkhäuser,
Boston, 1992). Translated from the second Portuguese edition by Francis Flaherty
160. J. Dolbeault, M.J. Esteban, M. Kowalczyk, M. Loss, Sharp interpolation inequalities on the
sphere: new methods and consequences. Chin. Ann. Math., Ser. B 34(1), 99–112 (2013)
161. J. Dolbeault, M. Esteban, M. Loss, Nonlinear flows and rigidity results on compact mani-
folds. Preprint, 2013
162. J. Dolbeault, I. Gentil, A. Guillin, F.-Y. Wang, L^q-functional inequalities and weighted
porous media equations. Potential Anal. 28(1), 35–59 (2008)
163. H. Donnelly, Exhaustion functions and the spectrum of Riemannian manifolds. Indiana Univ.
Math. J. 46(2), 505–527 (1997)
164. J.L. Doob, Classical Potential Theory and Its Probabilistic Counterpart. Grundlehren
der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences],
vol. 262 (Springer, New York, 1984)
165. O. Druet, E. Hebey, The AB program in geometric analysis: sharp Sobolev inequalities and
related problems. Mem. Am. Math. Soc. 160, 761 (2002), pp. viii+98
166. B.A. Dubrovin, A.T. Fomenko, S.P. Novikov, The geometry and topology of manifolds,
in Modern Geometry—Methods and Applications. Part II. Graduate Texts in Mathematics,
vol. 104 (Springer, New York, 1985). Translated from the Russian by Robert G. Burns
167. B.A. Dubrovin, A.T. Fomenko, S.P. Novikov, The geometry of surfaces, transformation
groups, and fields, in Modern Geometry—Methods and Applications. Part I, 2nd edn. Grad-
uate Texts in Mathematics, vol. 93 (Springer, New York, 1992). Translated from the Russian
by Robert G. Burns
168. N. Dunford, J.T. Schwartz, Linear Operators. I. General Theory. Pure and Applied Math-
ematics, vol. 7 (Interscience, New York, 1958). With the assistance of W.G. Bade and
R.G. Bartle
169. N. Dunford, J.T. Schwartz, Linear Operators. Part II: Spectral Theory. Self Adjoint Op-
erators in Hilbert Space (Interscience, New York, 1963). With the assistance of William
G. Bade, and Robert G. Bartle
170. E.B. Dynkin, Markov Processes. Vols. I, II. Die Grundlehren der mathematischen Wis-
senschaften, vols. 121, 122 (Academic Press, New York, 1965). Translated with the autho-
rization and assistance of the author by J. Fabius, V. Greenberg, A. Maitra, G. Majone
171. K.D. Elworthy, Stochastic Differential Equations on Manifolds. London Mathematical Soci-
ety Lecture Note Series, vol. 70 (Cambridge University Press, Cambridge, 1982)
172. K.D. Elworthy, Geometric aspects of diffusions on manifolds, in École d’Été de Probabil-
ités de Saint-Flour XV–XVII, 1985–87. Lecture Notes in Math., vol. 1362 (Springer, Berlin,
1988), pp. 277–425
173. M. Émery, J.E. Yukich, A simple proof of the logarithmic Sobolev inequality on the circle, in
Séminaire de Probabilités, XXI. Lecture Notes in Math., vol. 1247 (Springer, Berlin, 1987),
pp. 173–175
174. K.-J. Engel, R. Nagel, One-Parameter Semigroups for Linear Evolution Equations. Graduate
Texts in Mathematics, vol. 194 (Springer, New York, 2000). With contributions by S. Bren-
dle, M. Campiti, T. Hahn, G. Metafune, G. Nickel, D. Pallara, C. Perazzoli, A. Rhandi, S. Ro-
manelli and R. Schnaubelt
175. M. Erbar, The heat equation on manifolds as a gradient flow in the Wasserstein space. Ann.
Inst. Henri Poincaré Probab. Stat. 46(1), 1–23 (2010)
176. M. Erbar, K. Kuwada, K.-T. Sturm, On the equivalence of the entropy curvature-dimension
condition and Bochner’s inequality on metric measure spaces. Preprint, 2013
177. J.F. Escobar, Sharp constant in a Sobolev trace inequality. Indiana Univ. Math. J. 37(3),
687–698 (1988)
178. S.N. Ethier, T.G. Kurtz, Characterization and convergence, in Markov Processes. Wiley Se-
ries in Probability and Mathematical Statistics: Probability and Mathematical Statistics (Wi-
ley, New York, 1986)
179. L.C. Evans, Partial Differential Equations, 2nd edn. Graduate Studies in Mathematics,
vol. 19 (American Mathematical Society, Providence, 2010)
180. L.C. Evans, R.F. Gariepy, Measure Theory and Fine Properties of Functions. Studies in
Advanced Mathematics (CRC Press, Boca Raton, 1992)
181. E.B. Fabes, D.W. Stroock, A new proof of Moser’s parabolic Harnack inequality using the
old ideas of Nash. Arch. Ration. Mech. Anal. 96(4), 327–338 (1986)
182. S. Fang, Inégalité du type de Poincaré sur l’espace des chemins Riemanniens. C. R. Math.
Acad. Sci. Paris, Sér. I 318(3), 257–260 (1994)
183. P. Federbush, A partially alternate derivation of a result of Nelson. J. Math. Phys. 10(1),
50–52 (1969)
184. H. Federer, Geometric Measure Theory. Die Grundlehren der mathematischen Wis-
senschaften, vol. 153 (Springer, New York, 1969)
185. É. Fontenas, Sur les constantes de Sobolev des variétés riemanniennes compactes et les fonc-
tions extrémales des sphères. Bull. Sci. Math. 121(2), 71–96 (1997)
186. P. Fougères, Spectral gap for log-concave probability measures on the real line, in Sémi-
naire de Probabilités XXXVIII. Lecture Notes in Math., vol. 1857 (Springer, Berlin, 2005),
pp. 95–123
187. J. Franchi, Y. Le Jan, Hyperbolic Dynamics and Brownian Motions: An Introduction (Oxford
University Press, Oxford, 2012)
188. Y. Fujita, An optimal logarithmic Sobolev inequality with Lipschitz constants. J. Funct. Anal.
261(5), 1133–1144 (2011)
189. M. Fukushima, Dirichlet Forms and Markov Processes. North-Holland Mathematical Li-
brary, vol. 23 (North-Holland, Amsterdam, 1980)
190. M. Fukushima, Y. Oshima, M. Takeda, Dirichlet Forms and Symmetric Markov Processes.
de Gruyter Studies in Mathematics, vol. 19 (Walter de Gruyter, Berlin, 2011), extended edn.
191. M.P. Gaffney, The conservation property of the heat equation on Riemannian manifolds.
Commun. Pure Appl. Math. 12, 1–11 (1959)
192. E. Gagliardo, Proprietà di alcune classi di funzioni in più variabili. Ric. Mat. 7, 102–137
(1958)
193. S. Gallot, Inégalités isopérimétriques et analytiques sur les variétés Riemanniennes.
Astérisque (1988), no. 163–164, pp. 5–6, 31–91, 281 (1989). On the geometry of differ-
entiable manifolds (Rome, 1986)
194. S. Gallot, D. Hulin, J. Lafontaine, Riemannian Geometry, 3rd edn. Universitext (Springer,
Berlin, 2004)
195. R.J. Gardner, The Brunn-Minkowski inequality. Bull. Am. Math. Soc. (N.S.) 39(3), 355–405
(2002)
196. A.M. Garsia, Martingale Inequalities: Seminar Notes on Recent Progress. Mathematics Lec-
ture Notes Series (Benjamin, Reading, 1973)
197. I. Gentil, Ultracontractive bounds on Hamilton-Jacobi solutions. Bull. Sci. Math. 126(6),
507–524 (2002)
198. I. Gentil, The general optimal L^p-Euclidean logarithmic Sobolev inequality by Hamilton-
Jacobi equations. J. Funct. Anal. 202(2), 591–599 (2003)
199. I. Gentil, From the Prékopa-Leindler inequality to modified logarithmic Sobolev inequality.
Ann. Fac. Sci. Toulouse 17(2), 291–308 (2008)
200. I. Gentil, A. Guillin, L. Miclo, Modified logarithmic Sobolev inequalities and transportation
inequalities. Probab. Theory Relat. Fields 133(3), 409–436 (2005)
201. I. Gentil, A. Guillin, L. Miclo, Modified logarithmic Sobolev inequalities in null curvature.
Rev. Mat. Iberoam. 23(1), 235–258 (2007)
202. B. Gidas, J. Spruck, Global and local behavior of positive solutions of nonlinear elliptic
equations. Commun. Pure Appl. Math. 34(4), 525–598 (1981)
203. D. Gilbarg, N.S. Trudinger, Elliptic Partial Differential Equations of Second Order. Classics
in Mathematics (Springer, Berlin, 2001). Reprint of the 1998 edition
204. J. Glimm, Boson fields with nonlinear selfinteraction in two dimensions. Commun. Math.
Phys. 8, 12–25 (1968) (English)
205. J.A. Goldstein, Semigroups of Linear Operators and Applications. Oxford Mathematical
Monographs (The Clarendon Press, New York, 1985)
206. N. Gozlan, A characterization of dimension free concentration in terms of transportation
inequalities. Ann. Probab. 37(6), 2480–2498 (2009)
207. N. Gozlan, Poincaré inequalities and dimension free concentration of measure. Ann. Inst.
Henri Poincaré Probab. Stat. 46(3), 708–739 (2010)
208. N. Gozlan, Transport-entropy inequalities on the line. Electron. J. Probab. 17(49), 1–18
(2012)
209. N. Gozlan, C. Léonard, A large deviation approach to some transportation cost inequalities.
Probab. Theory Relat. Fields 139(1–2), 235–283 (2007)
210. N. Gozlan, C. Léonard, Transport inequalities. A survey. Markov Process. Relat. Fields 16,
635–736 (2010)
211. N. Gozlan, C. Roberto, P.-M. Samson, From concentration to logarithmic Sobolev and
Poincaré inequalities. J. Funct. Anal. 260(5), 1491–1522 (2011)
212. N. Gozlan, C. Roberto, P.-M. Samson, Hamilton Jacobi equations on metric spaces and trans-
port entropy inequalities. Rev. Mat. Iberoam. (2013, to appear)
213. N. Gozlan, C. Roberto, P.-M. Samson, Characterization of Talagrand’s transport-entropy in-
equality in metric spaces. Ann. Probab. 41(5), 3112–3139 (2013)
214. A. Grigor’yan, The heat equation on noncompact Riemannian manifolds. Mat. Sb. 182(1),
55–87 (1991)
215. A. Grigor’yan, Analytic and geometric background of recurrence and non-explosion of the
Brownian motion on Riemannian manifolds. Bull. Am. Math. Soc. (N.S.) 36(2), 135–249
(1999)
216. A. Grigor’yan, Isoperimetric inequalities and capacities on Riemannian manifolds, in The
Maz’ya Anniversary Collection, Vol. 1, Rostock, 1998. Oper. Theory Adv. Appl., vol. 109
(Birkhäuser, Basel, 1999), pp. 139–153
217. A. Grigor’yan, Heat Kernel and Analysis on Manifolds. AMS/IP Studies in Advanced Mathematics,
vol. 47 (American Mathematical Society, Providence, 2009)
218. A. Grigor’yan, Yau’s work on heat kernels, in Geometry and Analysis. No. 1. Adv. Lect.
Math. (ALM), vol. 17 (Int. Press, Somerville, 2011), pp. 113–117
219. G. Grillo, On Persson’s theorem in local Dirichlet spaces. Z. Anal. Anwend. 17(2), 329–338
(1998)
220. M. Gromov, Paul Lévy’s isoperimetric inequality. Preprint, 1980
221. M. Gromov, Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in
Mathematics, vol. 152 (Birkhäuser, Boston, 1999). Based on the 1981 French original. With
appendices by M. Katz, P. Pansu and S. Semmes, Translated from the French by Sean
Michael Bates
222. M. Gromov, V.D. Milman, A topological application of the isoperimetric inequality. Am. J.
Math. 105(4), 843–854 (1983)
223. L. Gross, Existence and uniqueness of physical ground states. J. Funct. Anal. 10, 52–109
(1972)
224. L. Gross, Logarithmic Sobolev inequalities. Am. J. Math. 97(4), 1061–1083 (1975)
225. L. Gross, Logarithmic Sobolev inequalities and contractivity properties of semigroups, in
Dirichlet Forms, Varenna, 1992. Lecture Notes in Math., vol. 1563 (Springer, Berlin, 1993),
pp. 54–88
226. L. Gross, Hypercontractivity, logarithmic Sobolev inequalities, and applications: a survey
of surveys, in Diffusion, Quantum Theory, and Radically Elementary Mathematics. Math.
Notes, vol. 47 (Princeton Univ. Press, Princeton, 2006), pp. 45–73
227. L. Gross, O.S. Rothaus, Herbst inequalities for supercontractive semigroups. J. Math. Kyoto
Univ. 38(2), 295–318 (1998)
228. O. Guédon, Concentration phenomena in high dimensional geometry. ESAIM Proc. (2013,
to appear)
229. A. Guionnet, B. Zegarliński, Lectures on logarithmic Sobolev inequalities, in Séminaire de
Probabilités, XXXVI. Lecture Notes in Math., vol. 1801 (Springer, Berlin, 2003), pp. 1–134
230. P. Hajłasz, P. Koskela, Sobolev met Poincaré. Mem. Am. Math. Soc. 145(688), x+101 (2000)
231. G.H. Hardy, Note on a theorem of Hilbert. Math. Z. 6(3–4), 314–317 (1920)
232. G.H. Hardy, J.E. Littlewood, G. Pólya, Inequalities, 2nd edn. (Cambridge University Press,
Cambridge, 1952)
233. E. Hebey, Sobolev Spaces on Riemannian Manifolds. Lecture Notes in Mathematics,
vol. 1635 (Springer, Berlin, 1996)
234. E. Hebey, Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities. Courant Lecture
Notes in Mathematics, vol. 5 (New York University Courant Institute of Mathematical
Sciences, New York, 1999)
235. E. Hebey, M. Vaugon, The best constant problem in the Sobolev embedding theorem for
complete Riemannian manifolds. Duke Math. J. 79(1), 235–279 (1995)
236. J. Heinonen, Lectures on Analysis on Metric Spaces. Universitext (Springer, New York,
2001)
237. B. Helffer, Remarks on decay of correlations and Witten Laplacians, Brascamp-Lieb inequalities
and semiclassical limit. J. Funct. Anal. 155(2), 571–586 (1998)
238. B. Helffer, Semiclassical Analysis, Witten Laplacians, and Statistical Mechanics. Series in
Partial Differential Equations and Applications, vol. 1 (World Scientific, River Edge, 2002)
239. B. Helffer, J. Sjöstrand, On the correlation for Kac-like models in the convex case. J. Stat.
Phys. 74(1–2), 349–409 (1994)
240. S. Helgason, Integral geometry, invariant differential operators, and spherical functions, in
Groups and Geometric Analysis. Mathematical Surveys and Monographs, vol. 83 (American
Mathematical Society, Providence, 2000). Corrected reprint of the 1984 original
241. E. Hille, R.S. Phillips, Functional Analysis and Semi-Groups. American Mathematical Society
Colloquium Publications, vol. 31 (American Mathematical Society, Providence, 1957),
rev. ed.
242. M. Hino, On short time asymptotic behavior of some symmetric diffusions on general state
spaces. Potential Anal. 16(3), 249–264 (2002)
243. F. Hirsch, Intrinsic metrics and Lipschitz functions. J. Evol. Equ. 3(1), 11–25 (2003). Dedicated
to Philippe Bénilan
244. P.D. Hislop, I.M. Sigal, Introduction to Spectral Theory. Applied Mathematical Sciences,
vol. 113 (Springer, New York, 1996). With applications to Schrödinger operators
245. R. Holley, D.W. Stroock, Logarithmic Sobolev inequalities and stochastic Ising models.
J. Stat. Phys. 46, 1159–1194 (1987)
246. L. Hörmander, $L^2$ estimates and existence theorems for the $\bar\partial$ operator. Acta Math. 113,
89–152 (1965)
247. L. Hörmander, Differential operators with constant coefficients, in The Analysis of Linear
Partial Differential Operators. II. Grundlehren der mathematischen Wissenschaften [Fundamental
Principles of Mathematical Sciences], vol. 257 (Springer, Berlin, 1983)
248. L. Hörmander, Distribution theory and Fourier analysis, in The Analysis of Linear Partial
Differential Operators. I, 2nd edn. (Springer, Berlin, 1990). Springer Study Edition
249. L. Hörmander, An Introduction to Complex Analysis in Several Variables, 3rd edn. North-Holland
Mathematical Library, vol. 7 (North-Holland, Amsterdam, 1990)
250. E.P. Hsu, Logarithmic Sobolev inequalities on path spaces over Riemannian manifolds. Commun.
Math. Phys. 189(1), 9–16 (1997)
251. E.P. Hsu, Stochastic Analysis on Manifolds. Graduate Studies in Mathematics, vol. 38 (American
Mathematical Society, Providence, 2002)
252. N. Ikeda, S. Watanabe, Stochastic Differential Equations and Diffusion Processes, 2nd edn.
North-Holland Mathematical Library, vol. 24 (North-Holland, Amsterdam, 1989)
253. S. Ilias, Constantes explicites pour les inégalités de Sobolev sur les variétés Riemanniennes
compactes. Ann. Inst. Fourier (Grenoble) 33(2), 151–165 (1983)
254. J. Inglis, M. Neklyudov, B. Zegarliński, Ergodicity for infinite particle systems with locally
conserved quantities. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 15(1), 1250005, 28 pp.
(2012)
255. K. Itô, H.P. McKean, Diffusion Processes and Their Sample Paths. Die Grundlehren der
mathematischen Wissenschaften, vol. 125 (Springer, Berlin, 1974). Second printing, corrected
256. J. Jacod, Calcul Stochastique et Problèmes de Martingales. Lecture Notes in Mathematics,
vol. 714 (Springer, Berlin, 1979)
257. S. Janson, Gaussian Hilbert Spaces. Cambridge Tracts in Mathematics, vol. 129 (Cambridge
University Press, Cambridge, 1997)
258. F. John, Partial Differential Equations, 4th edn. Applied Mathematical Sciences, vol. 1
(Springer, New York, 1982)
259. R. Jordan, D. Kinderlehrer, F. Otto, The variational formulation of the Fokker-Planck equation.
SIAM J. Math. Anal. 29(1), 1–17 (1998)
260. J. Jost, Riemannian Geometry and Geometric Analysis, 5th edn. Universitext (Springer,
Berlin, 2008)
261. A. Joulin, N. Privault, Functional inequalities for discrete gradients and application to the
geometric distribution. ESAIM Probab. Stat. 8, 87–101 (2004)
262. R. Kannan, L. Lovász, M. Simonovits, Isoperimetric problems for convex bodies and a localization
lemma. Discrete Comput. Geom. 13(3–4), 541–559 (1995)
263. I. Karatzas, S.E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edn. Graduate Texts
in Mathematics, vol. 113 (Springer, New York, 1991)
264. T. Kato, Perturbation Theory for Linear Operators, 2nd edn. Grundlehren der mathematischen
Wissenschaften, vol. 132 (Springer, Berlin, 1976)
265. O. Kavian, G. Kerkyacharian, B. Roynette, Quelques remarques sur l’ultracontractivité.
J. Funct. Anal. 111(1), 155–196 (1993)
266. H. Knothe, Contributions to the theory of convex bodies. Mich. Math. J. 4, 39–52 (1957)
267. M. Knott, C.S. Smith, On the optimal mapping of distributions. J. Optim. Theory Appl. 43(1),
39–49 (1984)
268. R. Koekoek, P.A. Lesky, R.F. Swarttouw, Hypergeometric Orthogonal Polynomials and Their
q-Analogues. Springer Monographs in Mathematics (Springer, Berlin, 2010). With a foreword
by Tom H. Koornwinder
269. A. Kufner, L.-E. Persson, Weighted Inequalities of Hardy Type (World Scientific, River Edge,
2003)
270. K. Kuwada, Duality on gradient estimates and Wasserstein controls. J. Funct. Anal. 258(11),
3758–3774 (2010)
271. K. Kuwada, Space-time Wasserstein controls and Bakry-Ledoux type gradient estimates.
Preprint, 2013
272. R. Latała, K. Oleszkiewicz, Between Sobolev and Poincaré, in Geometric Aspects of Functional
Analysis. Lecture Notes in Math., vol. 1745 (Springer, Berlin, 2000), pp. 147–168
273. M. Ledoux, A simple analytic proof of an inequality by P. Buser. Proc. Am. Math. Soc.
121(3), 951–959 (1994)
274. M. Ledoux, L’algèbre de Lie des gradients itérés d’un générateur markovien—
développements de moyennes et entropies. Ann. Sci. Éc. Norm. Super. 28(4), 435–460
(1995)
275. M. Ledoux, Remarks on logarithmic Sobolev constants, exponential integrability and bounds
on the diameter. J. Math. Kyoto Univ. 35(2), 211–220 (1995)
276. M. Ledoux, Concentration of measure and logarithmic Sobolev inequalities, in Séminaire
de Probabilités, XXXIII. Lecture Notes in Math., vol. 1709 (Springer, Berlin, 1999),
pp. 120–216
277. M. Ledoux, The geometry of Markov diffusion generators. Ann. Fac. Sci. Toulouse 9(2),
305–366 (2000). Probability theory
278. M. Ledoux, The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs,
vol. 89 (American Mathematical Society, Providence, 2001)
279. M. Ledoux, Spectral gap, logarithmic Sobolev constant, and geometric bounds, in Surveys in
Differential Geometry, vol. IX (Int. Press, Somerville, 2004), pp. 219–240
280. M. Ledoux, From concentration to isoperimetry: semigroup proofs, in Concentration, Functional
Inequalities and Isoperimetry. Contemp. Math., vol. 545 (Am. Math. Soc., Providence,
2011), pp. 155–166
281. P. Lévy, Problèmes Concrets d’Analyse Fonctionnelle, 2nd edn. Des Leçons d’Analyse Fonctionnelle
(Gauthier-Villars, Paris, 1951) (French). Avec un complément par F. Pellegrino
282. H.-Q. Li, Estimation optimale du gradient du semi-groupe de la chaleur sur le groupe de
Heisenberg. J. Funct. Anal. 236(2), 369–394 (2006)
283. H.-Q. Li, Estimations asymptotiques du noyau de la chaleur sur les groupes de Heisenberg.
C. R. Math. Acad. Sci. Paris 344(8), 497–502 (2007)
284. P. Li, Geometric Analysis (Cambridge University Press, Cambridge, 2012)
285. P. Li, S.-T. Yau, On the parabolic kernel of the Schrödinger operator. Acta Math. 156(3–4),
153–201 (1986)
312. E. Milman, On the role of convexity in isoperimetry, spectral gap and concentration. Invent.
Math. 177(1), 1–43 (2009)
313. E. Milman, A converse to the Maz’ya inequality for capacities under curvature lower bound,
in Around the Research of Vladimir Maz’ya. I. Int. Math. Ser. (N. Y.), vol. 11 (Springer, New
York, 2010), pp. 321–348
314. E. Milman, Isoperimetric and concentration inequalities: equivalence under curvature lower
bound. Duke Math. J. 154(2), 207–239 (2010)
315. E. Milman, Sharp isoperimetric inequalities and model spaces for curvature-dimension-
diameter condition. Preprint, 2012
316. V.D. Milman, G. Schechtman, Asymptotic Theory of Finite-Dimensional Normed Spaces.
Lecture Notes in Mathematics, vol. 1200 (Springer, Berlin, 1986). With an appendix by
M. Gromov
317. D.S. Mitrinović, J.E. Pečarić, A.M. Fink, Inequalities Involving Functions and Their Integrals
and Derivatives. Mathematics and Its Applications (East European Series), vol. 53
(Kluwer Academic, Dordrecht, 1991)
318. F. Morgan, Manifolds with density. Not. Am. Math. Soc. 52(8), 853–858 (2005)
319. J. Moser, A Harnack inequality for parabolic differential equations. Commun. Pure Appl.
Math. 17, 101–134 (1964)
320. J. Moser, A sharp form of an inequality by N. Trudinger. Indiana Univ. Math. J. 20,
1077–1092 (1970/71)
321. B. Muckenhoupt, Hardy’s inequality with weights. Stud. Math. 44, 31–38 (1972). Collection
of articles honoring the completion by Antoni Zygmund of 50 years of scientific activity, I
322. C.E. Mueller, F.B. Weissler, Hypercontractivity for the heat semigroup for ultraspherical
polynomials and on the n-sphere. J. Funct. Anal. 48(2), 252–283 (1982)
323. J. Nash, Continuity of solutions of parabolic and elliptic equations. Am. J. Math. 80, 931–954
(1958)
324. B. Nazaret, Best constant in Sobolev trace inequalities on the half-space. Nonlinear Anal.
65(10), 1977–1985 (2006)
325. E. Nelson, A quartic interaction in two dimensions, in Mathematical Theory of Elementary
Particles, Dedham, MA, 1965. Proc. Conf. (MIT Press, Cambridge, 1966), pp. 69–73
326. E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press, Princeton,
1967)
327. E. Nelson, The free Markoff field. J. Funct. Anal. 12, 211–227 (1973)
328. E. Nelson, Quantum fields and Markoff fields, in Partial Differential Equations, Berkeley,
CA, 1971. Proc. Sympos. Pure Math., vol. XXIII (Am. Math. Soc., Providence, 1973),
pp. 413–420
329. J. Neveu, Sur l’espérance conditionnelle par rapport à un mouvement brownien. Ann. Inst.
Henri Poincaré, Sect. B (N. S.) 12(2), 105–109 (1976)
330. V.H. Nguyen, Sharp weighted Sobolev and Gagliardo-Nirenberg inequalities on half-spaces
via mass transport and consequences. Preprint, 2013
331. L. Nirenberg, On elliptic partial differential equations. Ann. Sc. Norm. Super. Pisa, Cl. Sci.
(3) 13, 115–162 (1959)
332. J.R. Norris, Heat kernel asymptotics and the distance function in Lipschitz Riemannian manifolds.
Acta Math. 179(1), 79–103 (1997)
333. J.R. Norris, Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics,
vol. 2 (Cambridge University Press, Cambridge, 1998). Reprint of 1997 original
334. D. Nualart, The Malliavin Calculus and Related Topics, 2nd edn. Probability and Its Applications
(New York) (Springer, Berlin, 2006)
335. M. Obata, Certain conditions for a Riemannian manifold to be isometric with a sphere.
J. Math. Soc. Jpn. 14, 333–340 (1962)
336. B. Øksendal, Stochastic Differential Equations, 6th edn. Universitext (Springer, Berlin,
2003). An introduction with applications
337. E. Onofri, On the positivity of the effective action in a theory of random surfaces. Commun.
Math. Phys. 86(3), 321–326 (1982)
338. R. Osserman, The isoperimetric inequality. Bull. Am. Math. Soc. 84(6), 1182–1238 (1978)
339. F. Otto, The geometry of dissipative evolution equations: the porous medium equation. Commun.
Partial Differ. Equ. 26(1–2), 101–174 (2001)
340. F. Otto, C. Villani, Generalization of an inequality by Talagrand and links with the logarithmic
Sobolev inequality. J. Funct. Anal. 173(2), 361–400 (2000)
341. F. Otto, M. Westdickenberg, Eulerian calculus for the contraction in the Wasserstein distance.
SIAM J. Math. Anal. 37(4), 1227–1255 (2005) (electronic)
342. E.M. Ouhabaz, Analysis of Heat Equations on Domains. London Mathematical Society
Monographs Series, vol. 31 (Princeton University Press, Princeton, 2005)
343. L.E. Payne, H.F. Weinberger, An optimal Poincaré inequality for convex domains. Arch.
Ration. Mech. Anal. 5, 286–292 (1960)
344. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations.
Applied Mathematical Sciences, vol. 44 (Springer, New York, 1983)
345. A. Persson, Bounds for the discrete part of the spectrum of a semi-bounded Schrödinger
operator. Math. Scand. 8, 143–153 (1960)
346. P. Petersen, Riemannian Geometry. Graduate Texts in Mathematics, vol. 171 (Springer, New
York, 1998)
347. R.G. Pinsky, Positive Harmonic Functions and Diffusion. Cambridge Studies in Advanced
Mathematics, vol. 45 (Cambridge University Press, Cambridge, 1995)
348. H. Poincaré, Sur la théorie analytique de la chaleur. C. R. Séances Acad. Sci. 104, 1753–1759
(1887)
349. H. Poincaré, Sur les équations aux dérivées partielles de la physique mathématique. Am. J.
Math. 12(3), 211–294 (1890)
350. P.E. Protter, Stochastic Integration and Differential Equations, 2nd edn. Stochastic Modelling
and Applied Probability, vol. 21 (Springer, Berlin, 2005). Version 2.1, Corrected third
printing
351. S.T. Rachev, Probability Metrics and the Stability of Stochastic Models. Wiley Series in
Probability and Mathematical Statistics: Applied Probability and Statistics (Wiley, Chichester,
1991)
352. S.T. Rachev, L. Rüschendorf, Mass Transportation Problems. Vol. I. Probability and Its Applications
(New York) (Springer, New York, 1998). Theory
353. S.T. Rachev, L. Rüschendorf, Mass Transportation Problems. Vol. II. Probability and Its
Applications (New York) (Springer, New York, 1998). Applications
354. J.G. Ratcliffe, Foundations of Hyperbolic Manifolds, 2nd edn. Graduate Texts in Mathematics,
vol. 149 (Springer, New York, 2006)
355. M. Reed, B. Simon, Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-
Adjointness (Academic Press, New York, 1975)
356. M. Reed, B. Simon, Methods of Modern Mathematical Physics. I, 2nd edn. (Academic Press,
New York, 1980). Functional analysis
357. D. Revuz, Markov Chains, 2nd edn. North-Holland Mathematical Library, vol. 11 (North-
Holland, Amsterdam, 1984)
358. D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, 3rd edn. Grundlehren
der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences],
vol. 293 (Springer, Berlin, 1999)
359. F. Riesz, B.Sz. Nagy, Functional Analysis (Frederick Ungar, New York, 1955). Translated
by Leo F. Boron
360. C. Roberto, B. Zegarliński, Orlicz-Sobolev inequalities for sub-Gaussian measures and ergodicity
of Markov semi-groups. J. Funct. Anal. 243(1), 28–66 (2007)
361. M. Röckner, F.-Y. Wang, Weak Poincaré inequalities and $L^2$-convergence rates of Markov
semigroups. J. Funct. Anal. 185(2), 564–603 (2001)
362. L.C.G. Rogers, D. Williams, Itô calculus, in Diffusions, Markov Processes, and Martingales.
Vol. 2. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical
Statistics (Wiley, New York, 1987)
363. L.C.G. Rogers, D. Williams, Foundations, in Diffusions, Markov Processes, and Martingales.
Vol. 1. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical
Statistics, 2nd edn. (Wiley, Chichester, 1994)
364. G. Rosen, Minimum value for c in the Sobolev inequality $\|\phi^3\| \le c\,\|\nabla\phi\|^3$. SIAM J. Appl.
Math. 21, 30–32 (1971)
365. J. Rosen, Sobolev inequalities for weight spaces and supercontractivity. Trans. Am. Math.
Soc. 222, 367–376 (1976)
366. J.-P. Roth, Opérateurs dissipatifs et semi-groupes dans les espaces de fonctions continues.
Ann. Inst. Fourier (Grenoble) 26, 1–97 (1976)
367. O.S. Rothaus, Logarithmic Sobolev inequalities and the spectrum of Sturm-Liouville operators.
J. Funct. Anal. 39(1), 42–56 (1980)
368. O.S. Rothaus, Diffusion on compact Riemannian manifolds and logarithmic Sobolev inequalities.
J. Funct. Anal. 42(1), 102–109 (1981)
369. O.S. Rothaus, Logarithmic Sobolev inequalities and the spectrum of Schrödinger operators.
J. Funct. Anal. 42(1), 110–120 (1981)
370. O.S. Rothaus, Analytic inequalities, isoperimetric inequalities and logarithmic Sobolev inequalities.
J. Funct. Anal. 64(2), 296–313 (1985)
371. O.S. Rothaus, Hypercontractivity and the Bakry-Emery criterion for compact Lie groups.
J. Funct. Anal. 65(3), 358–367 (1986)
372. G. Royer, An Initiation to Logarithmic Sobolev Inequalities. SMF/AMS Texts and Monographs,
vol. 14 (American Mathematical Society, Providence, 2007). Translated from the
1999 French original by Donald Babbitt
373. L. Rüschendorf, S.T. Rachev, A characterization of random variables with minimum
$L^2$-distance. J. Multivar. Anal. 32(1), 48–54 (1990)
374. L. Saloff-Coste, A note on Poincaré, Sobolev, and Harnack inequalities. Int. Math. Res. Not.
2, 27–38 (1992)
375. L. Saloff-Coste, Convergence to equilibrium and logarithmic Sobolev constant on manifolds
with Ricci curvature bounded below. Colloq. Math. 67(1), 109–121 (1994)
376. L. Saloff-Coste, Aspects of Sobolev-Type Inequalities. London Mathematical Society Lecture
Note Series, vol. 289 (Cambridge University Press, Cambridge, 2002)
377. L. Saloff-Coste, Sobolev inequalities in familiar and unfamiliar settings, in Sobolev Spaces
in Mathematics. I. Int. Math. Ser. (N. Y.), vol. 8 (Springer, New York, 2009), pp. 299–343
378. L. Saloff-Coste, The heat kernel and its estimates, in Probabilistic Approach to Geometry.
Adv. Stud. Pure Math., vol. 57 (Math. Soc. Japan, Tokyo, 2010), pp. 405–436
379. K.-i. Sato, Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced
Mathematics, vol. 68 (Cambridge University Press, Cambridge, 1999). Translated
from the 1990 Japanese original, Revised by the author
380. R. Schoen, Conformal deformation of a Riemannian metric to constant scalar curvature.
J. Differ. Geom. 20(2), 479–495 (1984)
381. R. Schoen, Variational theory for the total scalar curvature functional for Riemannian metrics
and related topics, in Topics in Calculus of Variations, Montecatini Terme, 1987. Lecture
Notes in Math., vol. 1365 (Springer, Berlin, 1989), pp. 120–154
382. R. Schoen, S.-T. Yau, Conformally flat manifolds, Kleinian groups and scalar curvature.
Invent. Math. 92(1), 47–71 (1988)
383. C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423,
623–656 (1948)
384. B. Simon, R. Høegh-Krohn, Hypercontractive semigroups and two dimensional self-coupled
Bose fields. J. Funct. Anal. 9, 121–180 (1972)
385. J. Sjöstrand, Correlation asymptotics and Witten Laplacians. Algebra Anal. 8(1), 160–191
(1996)
386. S. Sobolev, Sur un théorème d’analyse fonctionnelle. Mat. Sb. (N.S.) 4, 471–497 (1938).
(Russian)
387. A.J. Stam, Some inequalities satisfied by the quantities of information of Fisher and Shannon.
Inf. Control 2, 101–112 (1959)
388. E.M. Stein, Singular Integrals and Differentiability Properties of Functions. Princeton Mathematical
Series, vol. 30 (Princeton University Press, Princeton, 1970)
389. E.M. Stein, G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces. Princeton Mathematical
Series, vol. 32 (Princeton University Press, Princeton, 1971)
390. R.S. Strichartz, Analysis of the Laplacian on the complete Riemannian manifold. J. Funct.
Anal. 52(1), 48–79 (1983)
391. D.W. Stroock, An Introduction to the Analysis of Paths on a Riemannian Manifold. Mathematical
Surveys and Monographs, vol. 74 (American Mathematical Society, Providence,
2000)
392. D.W. Stroock, Partial Differential Equations for Probabilists. Cambridge Studies in Advanced
Mathematics, vol. 112 (Cambridge University Press, Cambridge, 2008)
393. D.W. Stroock, S.R.S. Varadhan, Multidimensional Diffusion Processes. Classics in Mathematics
(Springer, Berlin, 2006). Reprint of the 1997 edition
394. D.W. Stroock, O. Zeitouni, Variations on a theme by Bismut. Astérisque 236, 291–301
(1996). Hommage à P. A. Meyer et J. Neveu
395. K.-T. Sturm, Analysis on local Dirichlet spaces. I. Recurrence, conservativeness and
$L^p$-Liouville properties. J. Reine Angew. Math. 456, 173–196 (1994)
396. K.-T. Sturm, Analysis on local Dirichlet spaces. II. Upper Gaussian estimates for the fundamental
solutions of parabolic equations. Osaka J. Math. 32(2), 275–312 (1995)
397. K.-T. Sturm, Diffusion processes and heat kernels on metric spaces. Ann. Probab. 26(1),
1–55 (1998)
398. K.-T. Sturm, The geometric aspect of Dirichlet forms, in New Directions in Dirichlet Forms.
AMS/IP Stud. Adv. Math., vol. 8 (Amer. Math. Soc., Providence, 1998), pp. 233–277
399. K.-T. Sturm, On the geometry of metric measure spaces. I. Acta Math. 196(1), 65–131
(2006)
400. K.-T. Sturm, On the geometry of metric measure spaces. II. Acta Math. 196(1), 133–177
(2006)
401. V.N. Sudakov, B.S. Cirel’son, Extremal properties of half-spaces for spherically invariant
measures. Zap. Nauč. Semin. LOMI 41, 14–24, 165 (1974). Problems in the theory of probability
distributions, II
402. G. Szegő, Orthogonal Polynomials, 4th edn. American Mathematical Society Colloquium
Publications, vol. XXIII (American Mathematical Society, Providence, 1975)
403. M. Talagrand, A new isoperimetric inequality and the concentration of measure phenomenon,
in Geometric Aspects of Functional Analysis (1989–90). Lecture Notes in Math., vol. 1469
(Springer, Berlin, 1991), pp. 94–124
404. M. Talagrand, Transportation cost for Gaussian and other product measures. Geom. Funct.
Anal. 6(3), 587–600 (1996)
405. G. Talenti, Osservazioni sopra una classe di disuguaglianze. Rend. Semin. Mat. Fis. Milano
39, 171–185 (1969)
406. G. Talenti, Best constant in Sobolev inequality. Ann. Mat. Pura Appl. (4) 110, 353–372
(1976)
407. H. Tanaka, An inequality for a functional of probability distributions and its application to
Kac’s one-dimensional model of a Maxwellian gas. Z. Wahrscheinlichkeitstheor. Verw. Geb.
27, 47–52 (1973)
408. H. Tanaka, Probabilistic treatment of the Boltzmann equation of Maxwellian molecules.
Z. Wahrscheinlichkeitstheor. Verw. Geb. 46(1), 67–105 (1978/79)
409. A. Terras, Harmonic Analysis on Symmetric Spaces—Euclidean Space, the Sphere, and the
Poincaré Upper Half-Plane (Springer, Berlin, 2013)
410. G. Tomaselli, A class of inequalities. Boll. Unione Mat. Ital. 2, 622–631 (1969)
411. M. Tomisaki, Comparison theorems on Dirichlet norms and their applications. Forum Math.
2(3), 277–295 (1990)
412. G. Toscani, Entropy production and the rate of convergence to equilibrium for the Fokker-
Planck equation. Q. Appl. Math. 57(3), 521–541 (1999)
A
Adjoint operator, 45, 48, 133, 141, 170

B
Beckner inequality, 312, 384, 388
Bessel
  operator, 95
  process, 94
  semigroup, 61, 94
Beta distribution, 114
Bobkov inequality, 417, 430
  local, 417
  reverse, 422
Bochner-Lichnerowicz formula, 509, 519
Boundary conditions
  Dirichlet, 93, 97, 99, 200, 221
  Neumann, 93, 97, 100, 199, 200, 273
  periodic, 200, 273
Bounded perturbation
  Φ-entropy inequality, 383
  logarithmic Sobolev inequality, 240, 274
  Poincaré inequality, 185, 231
  transportation cost inequality, 459, 468
Brascamp-Lieb inequality, 215, 232
Brenier’s map, 436, 447, 467
Brownian motion, 487
  Euclidean, 78
  hyperbolic, 89
  killed, 94
  reflected, 94
  spherical, 85
Brownian semigroup
  Euclidean, 79
  hyperbolic, 89
  spherical, 81, 85

C
Caffarelli’s
  contraction Theorem, 447
  regularity properties, 437, 467
Cameron-Martin Hilbert space, 110, 490
Capacity, 392, 429
Capacity inequality, 394, 397, 400, 401, 404, 407, 411, 429
Carré du champ operator Γ, 20, 34, 37, 42, 55, 74, 78, 82, 84, 89, 91, 98, 103, 111, 114, 120, 140, 169, 512
Cattiaux’s inequality, 248, 353
Cauchy kernel, 80
Change of coordinates, 56
Change of variables formula, 43, 124, 169, 171, 493, 516
Chapman-Kolmogorov equation, 16, 28, 264
Chart, 500
Christoffel symbols, 506
Closable operator, 476
Co-area formula, 395, 413, 429, 430
Co-metric, 47, 82, 504, 505
Commutation property, 144–146, 173, 259
  Hamilton-Jacobi, 462, 468
Compact operator, 483
Completeness, 141, 157, 172, 510
Concentration
  exponential, 426, 431, 467
  Gaussian, 426, 431, 441, 460, 467
Concentration under
  logarithmic Sobolev inequality, 255, 274
  Nash inequality, 368
  Poincaré inequality, 192, 231
  weak Poincaré inequality, 381
Conformal
  invariance, 314, 342
  invariant, 315, 321
  map, 84, 92
Connection, 506
  Levi-Civita, 507
Connexity, 33, 140, 156, 171
Continuity property, 11, 54
Contraction in Wasserstein space, 463, 468
Contraction principle
  logarithmic Sobolev inequality, 254
  Poincaré inequality, 192
Contraction semigroup, 10, 18, 473
Convergence to equilibrium, 33
Core, 19, 52, 101, 134, 142, 477
Cost function, 434
Coupling, 434
Covariant derivative, 507
Curvature condition CD(ρ, ∞), 72, 104, 111, 144, 209, 259, 416, 443, 457, 462, 463, 513
  reinforced, 146, 172, 421, 515
Curvature-dimension condition CD(ρ, n), 72, 75, 79, 87, 92, 104, 115, 159, 172, 211, 232, 268, 275, 298, 305, 337, 514
Cut-locus, 510

D
De Bruijn’s identity, 245, 274, 333
Del Pino-Dolbeault Theorem, 326, 342
Density kernel, 14, 25, 286, 356, 371
Diameter, 143, 166, 173, 293, 341
Differentiable manifold, 500
Diffusion
  operator, 43
  process, 38, 496
  property, 56, 122
Dirichlet
  domain, 30, 55, 125, 155, 169
  form, 30, 55, 125, 169, 477
  norm, 125, 169
Displacement convexity, 465, 468
Distance
  geodesic, 510
  harmonic, 360
  intrinsic, 166, 173
  Riemannian, 166, 510
  total variation, 244, 436, 438
  Wasserstein, 436, 445
Doob’s stopping time Theorem, 491
Dual semigroup, 9, 464
Duhamel’s formula, 130, 144, 145, 207, 214, 232

E
Eigenvalue, 478
Eigenvector, 478
Ellipticity, 40, 49–51, 511
  hypo-, 51
  semi-, 40, 43, 49, 50
  uniform, 50
  weak hypo-, 141, 157, 172
Entropic inequality, 236
Entropy, 236, 273, 348
  relative, 237, 244, 438, 445
Entropy-energy inequality, 281, 349, 387
Ergodic semigroup, 33
Ergodicity, 33, 70, 135, 170
ESA property, 134, 142, 157, 170
Essentially self-adjoint, 95, 101, 134, 170, 477, 482
Explosion time, 495
Exponential decay
  Φ-entropy, 383
  entropy, 244, 274
  variance, 183, 231
Exponential integrability
  eigenvector, 250, 274
  logarithmic Sobolev inequality, 252, 274
  Poincaré inequality, 190, 231
Exponential measure, 188, 231
Extended algebra, 152
Extremal function, 259, 285, 320, 448, 467

F
F-Sobolev inequality, 388, 430
Faber-Krahn inequality, 183, 284, 399, 429
Fast diffusion equation, 331, 342
Feynman-Kac formula, 64
Filtration, 488
Fisher information, 237, 333, 445
Fokker-Planck equation, 24, 28, 332
Friedrichs extension, 128, 477

G
Γ2 operator, 75, 79, 104, 144, 157, 172, 173, 275, 512, 519
Gagliardo-Nirenberg inequality, 324, 342, 451
Gamma distribution, 111
Gaussian measure, 103, 109, 296
Generalized inverse, 58, 350
Geodesic, 509
  optimal transportation, 465
Girsanov transformation, 62
Good measurable space, 7, 53, 169
Gozlan’s Theorem, 460, 468
Gradient, 506
  vector field, 47, 65
Index 549