
An Introduction to Lévy and Feller Processes

arXiv:1603.00251v2 [math.PR] 17 Oct 2016

– Advanced Courses in Mathematics - CRM Barcelona 2014 –

René L. Schilling

TU Dresden, Institut für Mathematische Stochastik,


01062 Dresden, Germany

rene.schilling@tu-dresden.de
http://www.math.tu-dresden.de/sto/schilling

These course notes will be published, together with Davar Khoshnevisan's notes on Invariance and Comparison
Principles for Parabolic Stochastic Partial Differential Equations, as From Lévy-Type Processes to
Parabolic SPDEs by the CRM, Barcelona and Birkhäuser, Cham 2017 (ISBN: 978-3-319-34119-4). The
arXiv version and the published version may differ in layout, pagination and wording, but not in content.
Contents

Preface 3

Symbols and notation 5

1. Orientation 7

2. Lévy processes 12

3. Examples 16

4. On the Markov property 24

5. A digression: semigroups 30

6. The generator of a Lévy process 36

7. Construction of Lévy processes 44

8. Two special Lévy processes 49

9. Random measures 55

10. A digression: stochastic integrals 64

11. From Lévy to Feller processes 75

12. Symbols and semimartingales 84

13. Dénouement 93

A. Some classical results 97

Bibliography 104

Preface

These lecture notes are an extended version of my lectures on Lévy and Lévy-type processes
given at the Second Barcelona Summer School on Stochastic Analysis organized by the Centre
de Recerca Matemàtica (CRM). The lectures are aimed at advanced graduate and PhD students.
In order to read these notes, one should have sound knowledge of measure theoretic probability
theory and some background in stochastic processes, as covered in my books Measures,
Integrals and Martingales [54] and Brownian Motion [56].

My purpose in these lectures is to give an introduction to Lévy processes, and to show how
one can extend this approach to space inhomogeneous processes which behave locally like Lévy
processes. After a brief overview (Chapter 1) I introduce Lévy processes, explain how to characterize them (Chapter 2) and discuss the quintessential examples of Lévy processes (Chapter 3).
The Markov (loss of memory) property of Lévy processes is studied in Chapter 4. A short analytic
interlude (Chapter 5) gives an introduction to operator semigroups, resolvents and their generators
from a probabilistic perspective. Chapter 6 brings us back to generators of Lévy processes which
are identified as pseudo differential operators whose symbol is the characteristic exponent of the
Lévy process. As a by-product we obtain the Lévy–Khintchine formula.
Continuing this line, we arrive at the first construction of Lévy processes in Chapter 7. Chapter 8 is devoted to two very special Lévy processes: (compound) Poisson processes and Brownian
motion. We give elementary constructions of both processes and show how and why they are
special Lévy processes, indeed. This is also the basis for the next chapter (Chapter 9) where we
construct a random measure from the jumps of a Lévy process. This can be used to provide a
further construction of Lévy processes, culminating in the famous Lévy–Itô decomposition and
yet another proof of the Lévy–Khintchine formula.
A second interlude (Chapter 10) embeds these random measures into the larger theory of random orthogonal measures. We show how we can use random orthogonal measures to develop an
extension of Itô’s theory of stochastic integrals for square-integrable (not necessarily continuous)
martingales, but we restrict ourselves to the bare bones, i.e. the L²-theory. In Chapter 11 we introduce Feller processes as the proper spatially inhomogeneous brethren of Lévy processes, and
we show how our proof of the Lévy–Khintchine formula carries over to this setting. We will see,
in particular, that Feller processes have a symbol which is the state-space dependent analogue of
the characteristic exponent of a Lévy process. The symbol describes the process and its generator. A probabilistic way to calculate the symbol and some first consequences (in particular the
semimartingale decomposition of Feller processes) is discussed in Chapter 12; we also show that


the symbol contains information on global properties of the process, such as conservativeness. In
the final Chapter 13, we summarize (mostly without proofs) how other path properties of a Feller
process can be obtained via the symbol. In order to make these notes self-contained, we collect in
the appendix some material which is not always included in standard graduate probability courses.

It is now about time to thank many individuals who helped to bring this enterprise on the way.
I am grateful to the scientific board and the organizing committee for the kind invitation to deliver
these lectures at the Centre de Recerca Matemàtica in Barcelona. The CRM is a wonderful place
to teach and to do research, and I am very happy to acknowledge their support and hospitality. I
would like to thank the students who participated in the CRM course as well as all students and
readers who were exposed to earlier (temporally & spatially inhomogeneous. . . ) versions of my
lectures; without your input these notes would look different!
I am greatly indebted to Ms. Franziska Kühn for her interest in this topic; her valuable comments
pinpointed many mistakes and helped to make the presentation much clearer.
And, last and most, I thank my wife for her love, support and forbearance while these notes
were being prepared.

Dresden, September 2015 René L. Schilling


Symbols and notation

This index is intended to aid cross-referencing, so notation that is specific to a single chapter is
generally not listed. Some symbols are used locally, without ambiguity, in senses other than those
given below; numbers following an entry are page numbers.
Unless otherwise stated, functions are real-valued and binary operations between functions such as f ± g, f · g, f ∧ g, f ∨ g, comparisons f ≤ g, f < g or limiting relations f_n → f (as n → ∞), lim_n f_n, lim inf_n f_n, lim sup_n f_n, sup_n f_n or inf_n f_n are understood pointwise.

General notation: analysis

positive        always in the sense ≥ 0
negative        always in the sense ≤ 0
N               1, 2, 3, . . .
inf ∅           inf ∅ = +∞
a ∨ b           maximum of a and b
a ∧ b           minimum of a and b
⌊x⌋             largest integer n ≤ x
|x|             norm in R^d: |x|² = x_1² + · · · + x_d²
x · y           scalar product in R^d: ∑_{j=1}^d x_j y_j
1_A             1_A(x) = 1 if x ∈ A, 0 if x ∉ A
δ_x             point mass at x
D               domain
∆               Laplace operator
∂_j             partial derivative ∂/∂x_j
∇, ∇_x          gradient (∂_{x_1}, . . . , ∂_{x_d})^⊤
F f, f̂          Fourier transform (2π)^{−d} ∫ e^{−i x·ξ} f(x) dx
F^{−1} f, f̌     inverse Fourier transform ∫ e^{i x·ξ} f(x) dx
supp f          support, closure of {f ≠ 0}
e_ξ(x)          e^{−i x·ξ}

General notation: probability

∼               'is distributed as'
⊥⊥              'is stochastically independent'
a.s.            almost surely (w. r. t. P)
iid             independent and identically distributed
N, Exp, Poi     normal, exponential, Poisson distribution
P, E            probability, expectation
V, Cov          variance, covariance
(L0)–(L3)       definition of a Lévy process, 7
(L2′)           12

Sets and σ-algebras

A^c             complement of the set A
Ā               closure of the set A
A ∪· B          disjoint union, i.e. A ∪ B for disjoint sets A ∩ B = ∅
B_r(x)          open ball, centre x, radius r
B(E)            Borel sets of E
F_t^X           canonical filtration σ(X_s : s ≤ t)
F_∞             σ(⋃_{t≥0} F_t)
F_τ             75
F_{τ+}          29
P               predictable σ-algebra, 101

Stochastic processes

P^x, E^x        law and mean of a Markov process starting at x, 24
X_{t−}          left limit lim_{s↑t} X_s
∆X_t            jump at time t: X_t − X_{t−}
σ, τ            stopping times: {σ ≤ t} ∈ F_t, t ≥ 0
τ_r^x, τ_r      inf{t > 0 : |X_t − X_0| > r}, first exit time from the open ball B_r(x) centered at x = X_0
càdlàg          right continuous on [0, ∞) with finite left limits on (0, ∞)

Spaces of functions

B(E)            Borel functions on E
B_b(E)          – – , bounded
C(E)            continuous functions on E
C_b(E)          – – , bounded
C_∞(E)          – – , lim_{|x|→∞} f(x) = 0
C_c(E)          – – , compact support
C^n(E)          n times continuously diff'ble functions on E
C^n_b(E)        – – , bounded (with all derivatives)
C^n_∞(E)        – – , 0 at infinity (with all derivatives)
C^n_c(E)        – – , compact support
L^p(E, µ), L^p(µ), L^p(E)   L^p space w. r. t. the measure space (E, A, µ)
S(R^d)          rapidly decreasing smooth functions on R^d, 36
1. Orientation

Stochastic processes with stationary and independent increments are classical examples of Markov
processes. Their importance both in theory and in applications justifies studying these processes
and their history.
The origins of processes with independent increments reach back to the late 1920s and they are
closely connected with the notion of infinite divisibility and the genesis of the Lévy–Khintchine
formula. Around this time, the limiting behaviour of sums of independent random variables

X0 := 0 and Xn := ξ1 + ξ2 + · · · + ξn , n ∈ N,

was well understood through the contributions of Borel, Markov, Cantelli, Lindeberg, Feller, de
Finetti, Khintchine, Kolmogorov and, of course, Lévy; two new developments emerged, on the
one hand the study of dependent random variables and, on the other, the study of continuous-time
analogues of sums of independent random variables. In order to pass from n ∈ N to a continuous
parameter t ∈ [0, ∞) we need to replace the steps ξk by increments Xt − Xs . It is not hard to see
that Xt , t ∈ N, with iid (independent and identically distributed) steps ξk enjoys the following
properties:

X_0 = 0 a.s.   (L0)
stationary increments:   X_t − X_s ∼ X_{t−s} − X_0   for all s ≤ t   (L1)
independent increments:   X_t − X_s ⊥⊥ σ(X_r, r ≤ s)   for all s ≤ t   (L2)

where '∼' stands for 'same distribution' and '⊥⊥' for stochastic independence. In the non-discrete setting we will also require a mild regularity condition

continuity in probability:   lim_{t→0} P(|X_t − X_0| > ε) = 0   for all ε > 0   (L3)

which rules out fixed discontinuities of the path t ↦ X_t. Under (L0)–(L2) one has that

X_t = ∑_{k=1}^n ξ_{k,n}(t)   and   ξ_{k,n}(t) = X_{kt/n} − X_{(k−1)t/n} are iid   (1.1)

for every n ∈ N. Letting n → ∞ shows that Xt arises as a suitable limit of (a triangular array of)
iid random variables which transforms the problem into a question of limit theorems and infinite
divisibility.
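
Two standard examples may help to fix ideas (an added illustration, not part of the original text). If X ∼ Poi(λ), then E e^{iuX} = exp(−λ(1 − e^{iu})) = [exp(−(λ/n)(1 − e^{iu}))]^n, so X has the same law as the sum of n iid Poi(λ/n) random variables; similarly, N(µ, σ²) is the n-fold convolution of N(µ/n, σ²/n). The uniform distribution on [0, 1], by contrast, is not infinitely divisible: its characteristic function has zeros, which is impossible for an infinitely divisible law (cf. the argument preceding Corollary 2.5, which shows that such a characteristic function cannot vanish).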


This was first observed in 1929 by de Finetti [15] who introduces (without naming it, the name
is due to Bawly [6] and Khintchine [29]) the concept of infinite divisibility of a random variable X:

∀n ∃ iid random variables ξ_{i,n} :   X ∼ ∑_{i=1}^n ξ_{i,n}   (1.2)

and asks for the general structure of infinitely divisible random variables. His paper contains two
remarkable results on the characteristic function χ(ξ) = E e^{i ξ·X} of an infinitely divisible random
variable (taken from [39]):

De Finetti's first theorem. A random variable X is infinitely divisible if, and only if, its characteristic function is of the form χ(ξ) = lim_{n→∞} exp(−p_n(1 − φ_n(ξ))) where p_n ≥ 0 and φ_n is a characteristic function.

De Finetti's second theorem. The characteristic function of an infinitely divisible random variable X is the limit of finite products of Poissonian characteristic functions

χ_n(ξ) = exp(−p_n(1 − e^{i h_n ξ})),

and the converse is also true. In particular, all infinitely divisible laws are limits of convolutions of Poisson distributions.

Because of (1.1), Xt is infinitely divisible and as such one can construct, in principle, all indepen-
dent-increment processes Xt as limits of sums of Poisson random variables. The contributions
of Kolmogorov [31], Lévy [37] and Khintchine [28] show the exact form of the characteristic
function of an infinitely divisible random variable

− log E e^{i ξ·X} = − i l·ξ + ½ ξ·Qξ + ∫_{y≠0} ( 1 − e^{i y·ξ} + i ξ·y 1_{(0,1)}(|y|) ) ν(dy)   (1.3)

where l ∈ R^d, Q ∈ R^{d×d} is a positive semidefinite symmetric matrix, and ν is a measure on
R^d \ {0} such that ∫_{y≠0} min{1, |y|²} ν(dy) < ∞. This is the famous Lévy–Khintchine formula.

The exact knowledge of (1.3) makes it possible to find the approximating Poisson variables in de
Finetti’s theorem explicitly, thus leading to a construction of Xt .
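
To see how the triplet (l, Q, ν) encodes concrete processes (an added illustration): Brownian motion with drift, X_t = tl + √Q W_t, corresponds to the triplet (l, Q, 0) and has exponent ψ(ξ) = −i l·ξ + ½ ξ·Qξ, while a Poisson process with intensity λ corresponds to l = 0, Q = 0, ν = λδ_1, for which (1.3) gives ψ(ξ) = λ ∫ (1 − e^{i yξ}) δ_1(dy) = λ(1 − e^{iξ}); note that the cut-off 1_{(0,1)}(|y|) does not affect jumps of size 1.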
A little later, and without knowledge of de Finetti’s results, Lévy came up in his seminal paper
[37] (see also [38, Chap. VII]) with a decomposition of Xt in four independent components: a
deterministic drift, a Gaussian part, the compensated small jumps and the large jumps ∆Xs :=
Xs − Xs− . This is now known as Lévy–Itô decomposition:
X_t = tl + √Q W_t + lim_{ε→0} ( ∑_{0<s≤t, ε≤|∆X_s|<1} ∆X_s − t ∫_{ε≤|y|<1} y ν(dy) ) + ∑_{0<s≤t, |∆X_s|≥1} ∆X_s   (1.4)

    = tl + √Q W_t + ∫∫_{(0,t]×B_1(0)} y (N(ds, dy) − ds ν(dy)) + ∫∫_{(0,t]×B_1(0)^c} y N(ds, dy).   (1.5)

Lévy uses results from the convergence of random series, notably Kolmogorov's three-series theorem, in order to explain the convergence of the series appearing in (1.4). A rigorous proof based
on the representation (1.5) is due to Itô [23] who completed Lévy's programme to construct X_t.
The coefficients l, Q, ν are the same as in (1.3), W is a d-dimensional standard Brownian mo-
tion, and Nω ((0,t] × B) is the random measure #{s ∈ (0,t] : Xs (ω) − Xs− (ω) ∈ B} counting the
jumps of X; it is a Poisson random variable with intensity EN((0,t] × B) = tν(B) for all Borel sets
B ⊂ R^d \ {0} such that 0 ∉ B̄.
Nowadays there are at least six possible approaches to constructing processes with (stationary
and) independent increments X = (Xt )t>0 .

The de Finetti–Lévy(–Kolmogorov–Khintchine) construction. The starting point is the


observation that each Xt satisfies (1.1) and is, therefore, infinitely divisible. Thus, the character-
istic exponent log E ei ξ ·Xt is given by the Lévy–Khintchine formula (1.3), and using the triplet

(l, Q, ν) one can construct a drift lt, a Brownian motion √Q W_t and compound Poisson processes,
i.e. Poisson processes whose intensities y ∈ Rd are mixed with respect to the finite measure
νε (dy) := 1[ε,∞) (|y|)ν(dy). Using a suitable compensation (in the spirit of Kolmogorov’s three
series theorem) of the small jumps, it is possible to show that the limit ε → 0 exists locally uni-
formly in t. A very clear presentation of this approach can be found in Breiman [10, Chapter
14.7–8], see also Chapter 7.
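
The following Python sketch (added; it is a minimal illustration, not the construction used in Chapter 7) implements this idea for a one-dimensional process with drift l, Gaussian variance q and the symmetric jump measure ν(dy) = |y|^{−1−α} dy: all jumps of modulus ≥ ε are simulated as a compound Poisson process, while the small jumps are simply discarded, which is harmless here because ν is symmetric and ε can be taken small.

    import numpy as np

    rng = np.random.default_rng(1)

    def levy_path_approx(T=1.0, n_grid=1000, l=0.1, q=0.04, alpha=1.2, eps=0.01):
        # drift l*t + sqrt(q)*Brownian motion on a regular grid
        t = np.linspace(0.0, T, n_grid + 1)
        dt = T / n_grid
        bm = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_grid))))
        path = l * t + np.sqrt(q) * bm
        # jumps of modulus >= eps form a compound Poisson process with intensity
        # lambda_eps = nu({|y| >= eps}) = 2 * eps**(-alpha) / alpha
        lam_eps = 2.0 * eps ** (-alpha) / alpha
        n_jumps = rng.poisson(lam_eps * T)
        jump_times = rng.uniform(0.0, T, n_jumps)
        # P(|Y| > r) = (r/eps)**(-alpha) for r >= eps, sampled by inversion
        jump_sizes = eps * rng.uniform(size=n_jumps) ** (-1.0 / alpha)
        jump_sizes *= rng.choice([-1.0, 1.0], n_jumps)
        for s, y in zip(jump_times, jump_sizes):
            path[t >= s] += y          # add the jump from time s onwards
        return t, path

    t, x = levy_path_approx()
    print("X_T =", x[-1])

Decreasing ε increases the jump intensity λ_ε = 2ε^{−α}/α and improves the approximation of the small-jump part.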

The Lévy–Itô construction. This is currently the most popular approach to independent-in-
crement processes, see e.g. Applebaum [2, Chapter 2.3–4] or Kyprianou [36, Chapter 2]. Origi-
nally the idea is due to Lévy [37], but Itô [23] gave the first rigorous construction. It is based on
the observation that the jumps of a process with stationary and independent increments define a
Poisson random measure Nω ([0,t] × B) and this can be used to obtain the Lévy–Itô decomposition
(1.5). The Lévy–Khintchine formula is then a corollary of the pathwise decomposition. Some of
the best presentations can be found in Gikhman–Skorokhod [18, Chapter VI], Itô [24, Chapter 4.3]
and Bretagnolle [11]. A proof based on additive functionals and martingale stochastic integrals is
due to Kunita & Watanabe [35, Section 7]. We follow this approach in Chapter 9.

Variants of the Lévy–Itô construction. The Lévy–Itô decomposition (1.5) is, in fact, the
semimartingale decomposition of a process with stationary and independent increments. Using the
general theory of semimartingales – which heavily relies on general random measures – we can
identify processes with independent increments as those semimartingales whose semimartingale
characteristics are deterministic, cf. Jacod & Shiryaev [27, Chapter II.4c]. A further interesting
derivation of the Lévy–Itô decomposition is based on stochastic integrals driven by martingales.
The key is Itô’s formula and, again, the fact that the jumps of a process with stationary and in-
dependent increments defines a Poisson point process which can be used as a good stochastic
10 R. L. Schilling: An Introduction to Lévy and Feller Processes

integrator; this unique approach1 can be found in Kunita [34, Chapter 2].

Kolmogorov’s construction. This is the classic construction of stochastic processes starting


from the finite-dimensional distributions. For a process with stationary and independent incre-
ments these are given as iterated convolutions of the form

E f(X_{t_0}, . . . , X_{t_n}) = ∫ · · · ∫ f(y_0, y_0 + y_1, . . . , y_0 + · · · + y_n) p_{t_0}(dy_0) p_{t_1−t_0}(dy_1) . . . p_{t_n−t_{n−1}}(dy_n)

with p_t(dy) = P(X_t ∈ dy) or ∫ e^{i ξ·y} p_t(dy) = exp[−tψ(ξ)] where ψ is the characteristic exponent

(1.3). Particularly nice presentations are those of Sato [51, Chapter 2.10–11] and Bauer [5, Chapter
37].

The invariance principle. Just as for a Brownian motion, it is possible to construct Lévy
processes as limits of (suitably interpolated) random walks. For finite dimensional distributions
this is done in Gikhman & Skorokhod [18, Chapter IX.6]; for the whole trajectory, i.e. in the
space of càdlàg2 functions D[0, 1] equipped with the Skorokhod topology, the proper references
are Prokhorov [42] and Grimvall [19].

Random series constructions. A series representation of an independent-increment pro-


cess (Xt )t∈[0,1] is an expression of the form
X_t = lim_{n→∞} ∑_{k=1}^n ( J_k 1_{[0,t]}(U_k) − t c_k )   a.s.

The random variables Jk represent the jumps, Uk are iid uniform random variables and ck are
suitable deterministic centering terms. Compared with the Lévy–Itô decomposition (1.4), the
main difference is the fact that the jumps are summed over a deterministic index set {1, 2, . . . n}
while the summation in (1.4) extends over the random set {s : |∆Xs | > 1/n}. In order to construct
a process with characteristic exponent (1.3) where l = 0 and Q = 0, one considers a disintegration
ν(dy) = ∫_0^∞ σ(r, dy) dr.

It is possible, cf. Rosiński [47], to choose σ (r, dy) = P(H(r,Vk ) ∈ dy) where V = (Vk )k∈N is any
sequence of d-dimensional iid random variables and H : (0, ∞) × Rd → Rd is measurable. Now
let Γ = (Γk )k∈N be a sequence of partial sums of iid standard exponential random variables and
U = (Uk )k∈N iid uniform random variables on [0, 1] such that U,V, Γ are independent. Then
J_k := H(Γ_k, V_k)   and   c_k = ∫_{k−1}^k ∫_{|y|<1} y σ(r, dy) dr
1 It is reminiscent of the elegant use of Itô's formula in Kunita and Watanabe's proof of Lévy's characterization of Brownian motion, see e.g. Schilling & Partzsch [56, Chapter 18.2].
2 A French acronym meaning 'right-continuous and finite limits from the left'.

is the sought-for series representation, cf. Rosiński [47] and [46]. This approach is important if
one wants to simulate independent-increment processes. Moreover, it still holds for Banach space
valued random variables.
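
A concrete instance (an added sketch): in the symmetric α-stable case one may take H(r, v) = r^{−1/α} v with V_k = ±1 symmetric signs; then the centering terms c_k vanish and, up to a constant scale factor which is omitted here, the truncated series already gives a usable simulation.

    import numpy as np

    rng = np.random.default_rng(2)

    def symmetric_stable_series(alpha=1.5, n_terms=5000, n_grid=500):
        t = np.linspace(0.0, 1.0, n_grid + 1)
        gamma = np.cumsum(rng.exponential(1.0, n_terms))   # Gamma_k: arrivals of a rate-1 Poisson process
        v = rng.choice([-1.0, 1.0], n_terms)               # V_k: symmetric signs
        u = rng.uniform(0.0, 1.0, n_terms)                 # U_k: uniform jump locations
        jumps = gamma ** (-1.0 / alpha) * v                # J_k = H(Gamma_k, V_k)
        # X_t = sum_k J_k * 1_{[0,t]}(U_k), evaluated on the whole time grid at once
        path = (jumps[None, :] * (u[None, :] <= t[:, None])).sum(axis=1)
        return t, path

    t, x = symmetric_stable_series()
    print("X_1 =", x[-1])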
2. Lévy processes

Throughout this chapter, (Ω, A , P) is a fixed probability space, t0 = 0 6 t1 6 . . . 6 tn and 0 6 s < t


are positive real numbers, and ξk , ηk , k = 1, . . . , n, denote vectors from Rd ; we write ξ · η for the
Euclidean scalar product.

Definition 2.1. A Lévy process X = (Xt )t>0 is a stochastic process Xt : Ω → Rd satisfying (L0)–
(L3); this is to say that X starts at zero, has stationary and independent increments and is continu-
ous in probability.

One should understand Lévy processes as continuous-time versions of sums of iid random vari-
ables. This can easily be seen from the telescopic sum
X_t − X_s = ∑_{k=1}^n ( X_{t_k} − X_{t_{k−1}} ),   s < t, n ∈ N,   (2.1)

where t_k = s + (k/n)(t − s). Since the increments X_{t_k} − X_{t_{k−1}} are iid random variables, we see that all
Xt of a Lévy process are infinitely divisible, i.e. (1.2) holds. Many properties of a Lévy process
will, therefore, resemble those of sums of iid random variables.
Let us briefly discuss the conditions (L0)–(L3).
Remark 2.2. We have stated (L2) using the canonical filtration FtX := σ (Xr , r 6 t) of the process
X. Often this condition is written in the following way

X_{t_n} − X_{t_{n−1}}, . . . , X_{t_1} − X_{t_0} are independent random variables
for all n ∈ N, t_0 = 0 < t_1 < · · · < t_n.   (L2′)

It is easy to see that this is actually equivalent to (L2): since the vectors (X_{t_1}, . . . , X_{t_n}) and (X_{t_1} − X_{t_0}, . . . , X_{t_n} − X_{t_{n−1}}) can be obtained from one another by a bi-measurable map, it follows that

F_t^X = σ( X_{t_1}, . . . , X_{t_n} : 0 ≤ t_1 ≤ . . . ≤ t_n ≤ t )
      = σ( X_{t_1} − X_{t_0}, . . . , X_{t_n} − X_{t_{n−1}} : 0 = t_0 ≤ t_1 ≤ . . . ≤ t_n ≤ t )   (2.2)
      = σ( X_u − X_v : 0 ≤ v ≤ u ≤ t ),

and we conclude that (L2) and (L2′) are indeed equivalent.


The condition (L3) is equivalent to either of the following


• ‘t 7→ Xt is continuous in probability’;

• ‘t 7→ Xt is a.s. càdlàg’1 (up to a modification of the process).

The equivalence with the first claim, and the direction ‘⇐’ of the second claim are easy:
lim_{u→t} P(|X_u − X_t| > ε) = lim_{|t−u|→0} P(|X_{|t−u|}| > ε) ≤ lim_{h→0} (1/ε) E(|X_h| ∧ ε),   (2.3)

but it takes more effort to show that continuity in probability (L3) guarantees that almost all paths
are càdlàg.2 Usually, this is proved by controlling the oscillations of the paths of a Lévy process,
cf. Sato [51, Theorem 11.1], or by the fundamental regularization theorem for submartingales, see
Revuz & Yor [44, Theorem II.(2.5)] and Remark 11.2; in contrast to the general martingale setting
[44, Theorem II.(2.9)], we do not need to augment the natural filtration because of (L1) and (L3).
Since our construction of Lévy processes gives directly a càdlàg version, we do not go into further
detail.
The condition (L3) has another consequence. Recall that the Cauchy–Abel functional equa-
tions have unique solutions if, say, φ , ψ and θ are (right-)continuous:

φ(s + t) = φ(s) · φ(t)                        φ(t) = φ(1)^t,
ψ(s + t) = ψ(s) + ψ(t)   (s, t ≥ 0)   =⇒   ψ(t) = ψ(1) · t   (2.4)
θ(st) = θ(s) · θ(t)                           θ(t) = t^c, c > 0.

The first equation is treated in Theorem A.1 in the appendix. For a thorough discussion on condi-
tions ensuring uniqueness we refer to Aczel [1, Chapter 2.1].

Proposition 2.3. Let (X_t)_{t≥0} be a Lévy process in R^d. Then

E e^{i ξ·X_t} = ( E e^{i ξ·X_1} )^t,   t ≥ 0, ξ ∈ R^d.   (2.5)

Proof. Fix s, t ≥ 0. We get

E e^{i ξ·(X_{t+s} − X_s) + i ξ·X_s} = E e^{i ξ·(X_{t+s} − X_s)} E e^{i ξ·X_s} = E e^{i ξ·X_t} E e^{i ξ·X_s},

using (L2) in the first and (L1) in the second equality; that is, φ(t + s) = φ(t) · φ(s) if we write φ(t) = E e^{i ξ·X_t}. Since x ↦ e^{i ξ·x} is continuous, there is for every ε > 0 some δ > 0 such that

|φ(t) − φ(s)| ≤ E | e^{i ξ·(X_t − X_s)} − 1 | ≤ ε + 2 P(|X_t − X_s| > δ) = ε + 2 P(|X_{|t−s|}| > δ).

Thus, (L3) guarantees that t ↦ φ(t) is continuous, and the claim follows from (2.4).

Notice that any solution f (t) of (2.4) also satisfies (L0)–(L2); by Proposition 2.3 Xt + f (t)
is a Lévy process if, and only if, f (t) is continuous. On the other hand, Hamel, cf. [1, p. 35],
constructed discontinuous (non-measurable and locally unbounded) solutions to (2.4). Thus, (L3)
means that t 7→ Xt has no fixed discontinuities, i.e. all jumps occur at random times.
1 ‘Right-continuous and finite limits from the left’
2 More precisely: that there exists a modification of X which has almost surely càdlàg paths.

Corollary 2.4. The finite-dimensional distributions P(X_{t_1} ∈ dx_1, . . . , X_{t_n} ∈ dx_n) of a Lévy process are uniquely determined by

E exp( i ∑_{k=1}^n ξ_k · X_{t_k} ) = ∏_{k=1}^n [ E exp( i(ξ_k + · · · + ξ_n) · X_1 ) ]^{t_k − t_{k−1}}   (2.6)

for all ξ_1, . . . , ξ_n ∈ R^d, n ∈ N and 0 = t_0 ≤ t_1 ≤ . . . ≤ t_n.

Proof. The left-hand side of (2.6) is just the characteristic function of (X_{t_1}, . . . , X_{t_n}), and a characteristic function determines the distribution uniquely; hence it is enough to prove (2.6). Using Proposition 2.3, we have

E exp( i ∑_{k=1}^n ξ_k · X_{t_k} )
 = E exp( i ∑_{k=1}^{n−2} ξ_k · X_{t_k} + i(ξ_n + ξ_{n−1}) · X_{t_{n−1}} + i ξ_n · (X_{t_n} − X_{t_{n−1}}) )
 = E exp( i ∑_{k=1}^{n−2} ξ_k · X_{t_k} + i(ξ_n + ξ_{n−1}) · X_{t_{n−1}} ) · ( E e^{i ξ_n·X_1} )^{t_n − t_{n−1}},

where the last equality uses (L2) and (L1). Since the first half of the right-hand side has the same structure as the original expression, we can iterate this calculation and obtain (2.6).

It is not hard to invert the Fourier transform in (2.6). Writing pt (dx) := P(Xt ∈ dx) we get
P(X_{t_1} ∈ B_1, . . . , X_{t_n} ∈ B_n) = ∫ · · · ∫ ∏_{k=1}^n 1_{B_k}(x_1 + · · · + x_k) p_{t_k − t_{k−1}}(dx_k)   (2.7)

                                    = ∫ · · · ∫ ∏_{k=1}^n 1_{B_k}(y_k) p_{t_k − t_{k−1}}(dy_k − y_{k−1}).   (2.8)

Let us discuss the structure of the characteristic function χ(ξ) = E e^{i ξ·X_1} of X_1. From (2.1) we see that each random variable X_t of a Lévy process is infinitely divisible. Clearly, |χ(ξ)|² is the (real-valued) characteristic function of the symmetrization X̃_1 = X_1 − X_1′ (X_1′ is an independent copy of X_1), and X̃_1 is again infinitely divisible:

X̃_1 = ∑_{k=1}^n ( X̃_{k/n} − X̃_{(k−1)/n} ) = ∑_{k=1}^n [ (X_{k/n} − X_{(k−1)/n}) − (X′_{k/n} − X′_{(k−1)/n}) ].

In particular, |χ|² = |χ_{1/n}|^{2n} where |χ_{1/n}|² is the characteristic function of X̃_{1/n}. Since everything is real and |χ(ξ)| ≤ 1, we get

θ(ξ) := lim_{n→∞} |χ_{1/n}(ξ)|² = lim_{n→∞} |χ(ξ)|^{2/n},   ξ ∈ R^d,

which is 0 or 1 depending on |χ(ξ)| = 0 or |χ(ξ)| > 0, respectively. As χ(ξ) is continuous at ξ = 0 with χ(0) = 1, we have θ ≡ 1 in a neighbourhood B_r(0) of 0. Now we can use Lévy's
continuity theorem (Theorem A.5) and conclude that the limiting function θ (ξ ) is continuous
everywhere, hence θ ≡ 1. In particular, χ(ξ ) has no zeroes.

Corollary 2.5. Let (Xt )t>0 be a Lévy process in Rd . There exists a unique continuous function
ψ : Rd → C such that
E exp(i ξ · Xt ) = e−tψ(ξ ) , t > 0, ξ ∈ Rd .

The function ψ is called the characteristic exponent.

Proof. In view of Proposition 2.3 it is enough to consider t = 1. Set χ(ξ ) := E exp(i ξ · X1 ). An


obvious candidate for the exponent is ψ(ξ ) = − log χ(ξ ), but with complex logarithms there is
always the trouble which branch of the logarithm one should take. Let us begin with the unique-
ness:
e^{−ψ} = e^{−φ}  =⇒  e^{−(ψ−φ)} = 1  =⇒  ψ(ξ) − φ(ξ) = 2π i k_ξ

for some integer k_ξ ∈ Z. Since φ, ψ are continuous and φ(0) = ψ(0) = 0, we get k_ξ ≡ 0.
To prove the existence of the logarithm, it is not sufficient to take the principal branch of the
logarithm. As we have seen above, χ(ξ ) is continuous and has no zeroes, i.e. inf|ξ |6r |χ(ξ )| > 0
for any r > 0; therefore, there is a ‘distinguished’, continuous3 version of the argument arg◦ χ(ξ )
such that arg◦ χ(0) = 0.
This allows us to take a continuous version of log χ(ξ ) = log |χ(ξ )| + arg◦ χ(ξ ).

Corollary 2.6. Let Y be an infinitely divisible random variable. Then there exists at most one4
Lévy process (Xt )t>0 such that X1 ∼ Y .

Proof. Since X1 ∼ Y , infinite divisibility is a necessary requirement for Y . On the other hand,
Proposition 2.3 and Corollary 2.4 show how to construct the finite-dimensional distributions of a
Lévy process, hence the process, from X1 .

So far, we have seen the following one-to-one correspondences

(X_t)_{t≥0} Lévy process  ←1:1→  E e^{i ξ·X_1}  ←1:1→  ψ(ξ) = − log E e^{i ξ·X_1}

and the next step is to find all possible characteristic exponents. This will lead us to the Lévy–
Khintchine formula.

3 A very detailed argument is given in Sato [51, Lemma 7.6], a completely different proof can be found in Dieudonné

[16, Chapter IX, Appendix 2].


4 We will see in Chapter 7 how to construct this process. It is unique in the sense that its finite-dimensional

distributions are uniquely determined by Y .


3. Examples

We begin with a useful alternative characterisation of Lévy processes.

Theorem 3.1. Let X = (Xt )t>0 be a stochastic process with values in Rd , P(X0 = 0) = 1 and
Ft = FtX = σ (Xr , r 6 t). The process X is a Lévy process if, and only if, there exists an exponent
ψ : R^d → C such that

E[ e^{i ξ·(X_t − X_s)} | F_s ] = e^{−(t−s)ψ(ξ)}   for all s < t, ξ ∈ R^d.   (3.1)

Proof. If X is a Lévy process, we get

E[ e^{i ξ·(X_t − X_s)} | F_s ] = E e^{i ξ·(X_t − X_s)} = E e^{i ξ·X_{t−s}} = e^{−(t−s)ψ(ξ)},

using (L2), (L1) and Corollary 2.5.
Conversely, assume that X_0 = 0 a.s. and (3.1) holds. Then

E e^{i ξ·(X_t − X_s)} = e^{−(t−s)ψ(ξ)} = E e^{i ξ·(X_{t−s} − X_0)}

which shows X_t − X_s ∼ X_{t−s} − X_0 = X_{t−s}, i.e. (L1).
For any F ∈ F_s we find from the tower property of conditional expectation

E[ 1_F · e^{i ξ·(X_t − X_s)} ] = E[ 1_F E( e^{i ξ·(X_t − X_s)} | F_s ) ] = E 1_F · e^{−(t−s)ψ(ξ)}.   (3.2)

Observe that e^{i u 1_F} = 1_{F^c} + e^{i u} 1_F for any u ∈ R; since both F and F^c are in F_s, we get

E[ e^{i u 1_F} e^{i ξ·(X_t − X_s)} ] = E[ 1_{F^c} e^{i ξ·(X_t − X_s)} ] + E[ 1_F e^{i u} e^{i ξ·(X_t − X_s)} ]
                                   = E( 1_{F^c} + e^{i u} 1_F ) e^{−(t−s)ψ(ξ)}      [by (3.2)]
                                   = E e^{i u 1_F} E e^{i ξ·(X_t − X_s)}.           [by (3.2)]

Thus, 1_F ⊥⊥ (X_t − X_s) for any F ∈ F_s, and (L2) follows.
Finally, lim_{t→0} E e^{i ξ·X_t} = lim_{t→0} e^{−tψ(ξ)} = 1 proves that X_t → 0 in distribution, hence in probability. This gives (L3).

Theorem 3.1 allows us to give concrete examples of Lévy processes.


Example 3.2. The following processes are Lévy processes.

a) Drift in direction l/|l|, l ∈ Rd , with speed |l|: Xt = tl and ψ(ξ ) = − i l · ξ .


b) Brownian motion with (positive semi-definite) covariance matrix Q ∈ R^{d×d}: Let (W_t)_{t≥0} be a standard Wiener process on R^d and set X_t := √Q W_t. Then ψ(ξ) = ½ ξ·Qξ and, provided Q is invertible, P(X_t ∈ dy) = (2πt)^{−d/2} (det Q)^{−1/2} exp(−y·Q^{−1}y / 2t) dy.

c) Poisson process in R with jump height 1 and intensity λ. This is an integer-valued counting process (N_t)_{t≥0} which increases by 1 after independent exponential waiting times with mean 1/λ. Thus,

N_t = ∑_{k=1}^∞ 1_{[0,t]}(τ_k),   τ_k = σ_1 + · · · + σ_k,   σ_k ∼ Exp(λ) iid.

Using this definition, it is a bit messy to show that N is indeed a Lévy process (see e.g. Çinlar [12,
Chapter 4]). We will give a different proof in Theorem 3.4 below. Usually, the first step is to show
that its law is a Poisson distribution

P(N_t = k) = e^{−λt} (λt)^k / k!,   k = 0, 1, 2, . . .

(thus the name!) and from this one can calculate the characteristic exponent

E e^{i u N_t} = ∑_{k=0}^∞ e^{i u k} e^{−λt} (λt)^k / k! = e^{−λt} exp( λt e^{i u} ) = exp( −λt (1 − e^{i u}) ),

i.e. ψ(u) = λ (1 − ei u ). Mind that this is strictly weaker than (3.1) and does not prove that N is a
Lévy process.

d) Compound Poisson process in Rd with jump distribution µ and intensity λ . Let N = (Nt )t>0
be a Poisson process with intensity λ and replace the jumps of size 1 by independent iid jumps of
random height H1 , H2 , . . . with values in Rd and H1 ∼ µ. This is a compound Poisson process:

C_t = ∑_{k=1}^{N_t} H_k,   H_k ∼ µ iid and independent of (N_t)_{t≥0}.

We will see in Theorem 3.4 that compound Poisson processes are Lévy processes.
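
A short simulation sketch (added) for Examples 3.2 b) and d); the covariance matrix Q, the intensity λ and the jump law µ = N(0, 1) below are arbitrary choices made for illustration.

    import numpy as np

    rng = np.random.default_rng(3)
    T, n = 1.0, 1000
    dt = T / n

    # b) X_t = sqrt(Q) W_t; for positive definite Q the Cholesky factor L (Q = L L^T)
    #    serves as a matrix square root, for merely semidefinite Q use an eigendecomposition
    Q = np.array([[2.0, 0.5], [0.5, 1.0]])
    L = np.linalg.cholesky(Q)
    dW = rng.normal(0.0, np.sqrt(dt), size=(n, 2))
    X = np.vstack([np.zeros(2), np.cumsum(dW @ L.T, axis=0)])   # Cov(X_t) = t*Q
    print("X_1 =", X[-1])

    # d) compound Poisson process with intensity lam and jump law mu = N(0, 1)
    lam = 5.0
    N_T = rng.poisson(lam * T)
    jump_times = np.sort(rng.uniform(0.0, T, N_T))
    jumps = rng.normal(0.0, 1.0, N_T)
    t_grid = np.linspace(0.0, T, n + 1)
    C = np.array([jumps[jump_times <= s].sum() for s in t_grid])
    print("C_1 =", C[-1], "built from", N_T, "jumps")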

Let us show that the Poisson and compound Poisson processes are Lévy processes. For this we need the following auxiliary result. Since t ↦ C_t is a step function, the Riemann–Stieltjes integral ∫ f(u) dC_u is well-defined.

Lemma 3.3 (Campbell's formula). Let C_t = H_1 + · · · + H_{N_t} be a compound Poisson process as in Example 3.2.d) with iid jumps H_k ∼ µ and an independent Poisson process (N_t)_{t≥0} with intensity λ. Then

E exp( i ∫_0^∞ f(t + s) dC_t ) = exp( λ ∫_0^∞ ∫_{y≠0} ( e^{i y·f(s+t)} − 1 ) µ(dy) dt )   (3.3)

holds for all s ≥ 0 and bounded measurable functions f : [0, ∞) → R^d with compact support.

Proof. Set τ_k = σ_1 + · · · + σ_k where σ_k ∼ Exp(λ) are iid. Then

φ(s) := E exp( i ∫_0^∞ f(s + t) dC_t )
      = E exp( i ∑_{k=1}^∞ f(s + σ_1 + · · · + σ_k) · H_k )
      = ∫_0^∞ E exp( i ∑_{k=2}^∞ f(s + x + σ_2 + · · · + σ_k) · H_k ) · E exp( i f(s + x) · H_1 ) P(σ_1 ∈ dx)
      = λ ∫_0^∞ φ(s + x) γ(s + x) e^{−λx} dx
      = λ e^{λs} ∫_s^∞ γ(t) φ(t) e^{−λt} dt,

where we use that the σ_k and H_k are iid, that the first factor under the integral equals φ(s + x), that the second factor is γ(s + x) := E exp(i f(s + x) · H_1), and that P(σ_1 ∈ dx) = λ e^{−λx} dx. This is equivalent to

e^{−λs} φ(s) = λ ∫_s^∞ ( φ(t) e^{−λt} ) γ(t) dt

and φ(∞) = 1 since f has compact support. This integral equation has a unique solution; it is now a routine exercise to verify that the right-hand side of (3.3) is indeed a solution.

Theorem 3.4. Let C_t = H_1 + · · · + H_{N_t} be a compound Poisson process as in Example 3.2.d) with iid jumps H_k ∼ µ and an independent Poisson process (N_t)_{t≥0} with intensity λ. Then (C_t)_{t≥0} (and also (N_t)_{t≥0}) is a d-dimensional Lévy process with characteristic exponent

ψ(ξ) = λ ∫_{y≠0} ( 1 − e^{i y·ξ} ) µ(dy).   (3.4)

Proof. Since the trajectories of t ↦ C_t are càdlàg step functions with C_0 = 0, the properties (L0) and (L3), see (2.3), are satisfied. We will show (L1) and (L2). Let ξ_k ∈ R^d, 0 = t_0 ≤ . . . ≤ t_n, and a < b. Then the Riemann–Stieltjes integral

∫_0^∞ 1_{(a,b]}(t) dC_t = ∑_{k=1}^∞ 1_{(a,b]}(τ_k) H_k = C_b − C_a

exists. We apply the Campbell formula (3.3) to the function

f(t) := ∑_{k=1}^n ξ_k 1_{(t_{k−1}, t_k]}(t)

and with s = 0. Then the left-hand side of (3.3) becomes the characteristic function of the increments

E exp( i ∑_{k=1}^n ξ_k · (C_{t_k} − C_{t_{k−1}}) ),

while the right-hand side is equal to

exp( λ ∫_{y≠0} ∑_{k=1}^n ∫_{t_{k−1}}^{t_k} ( e^{i ξ_k·y} − 1 ) dt µ(dy) ) = ∏_{k=1}^n exp( λ (t_k − t_{k−1}) ∫_{y≠0} ( e^{i ξ_k·y} − 1 ) µ(dy) )
                                                                  = ∏_{k=1}^n E exp( i ξ_k · C_{t_k − t_{k−1}} )

(use Campbell's formula with n = 1 for the last equality). This shows that the increments are independent, i.e. (L2′) holds, as well as (L1): C_{t_k} − C_{t_{k−1}} ∼ C_{t_k − t_{k−1}}.
If d = 1 and H_k ∼ δ_1, C_t is a Poisson process.

Denote by µ^{∗k} the k-fold convolution of the measure µ; as usual, µ^{∗0} := δ_0.

Corollary 3.5. Let (N_t)_{t≥0} be a Poisson process with intensity λ and C_t = H_1 + · · · + H_{N_t} a compound Poisson process with iid jumps H_k ∼ µ. Then, for all t ≥ 0,

P(N_t = k) = e^{−λt} (λt)^k / k!,   k = 0, 1, 2, . . .   (3.5)

P(C_t ∈ B) = e^{−λt} ∑_{k=0}^∞ (λt)^k / k! · µ^{∗k}(B),   B ⊂ R^d Borel.   (3.6)

Proof. If we use Theorem 3.4 for d = 1 and µ = δ1 , we see that the characteristic function of Nt is
χt (u) = exp[−λt(1 − ei u )]. Since this is also the characteristic function of the Poisson distribution
(i.e. the r.h.s. of (3.5)), we get Nt ∼ Poi(λt).
Since (H_k)_{k∈N} ⊥⊥ (N_t)_{t≥0}, we have for any Borel set B

P(C_t ∈ B) = ∑_{k=0}^∞ P(C_t ∈ B, N_t = k)
           = δ_0(B) P(N_t = 0) + ∑_{k=1}^∞ P(H_1 + · · · + H_k ∈ B) P(N_t = k)
           = e^{−λt} ∑_{k=0}^∞ (λt)^k / k! · µ^{∗k}(B).
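
A quick Monte Carlo check of (3.5) (added; the parameters below are arbitrary): simulate N_t by summing exponential waiting times and compare the empirical probabilities with the Poisson weights.

    import numpy as np
    from scipy.stats import poisson

    rng = np.random.default_rng(4)
    lam, t, n_sim = 2.0, 1.5, 100_000

    # N_t = #{k : sigma_1 + ... + sigma_k <= t}, sigma_k ~ Exp(lam) iid (mean 1/lam)
    waits = rng.exponential(1.0 / lam, size=(n_sim, 40))   # 40 summands are far more than enough here
    arrivals = np.cumsum(waits, axis=1)
    N_t = (arrivals <= t).sum(axis=1)

    for k in range(6):
        print(k, (N_t == k).mean(), poisson.pmf(k, lam * t))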

Example 3.2 contains the basic Lévy processes which will also be the building blocks for all
Lévy processes. In order to define more specialized Lévy processes, we need further assumptions
on the distributions of the random variables Xt .

Definition 3.6. Let (Xt )t>0 be a stochastically continuous process in Rd . It is called self-similar,
if
∀a > 0 ∃b = b(a) : (Xat )t>0 ∼ (bXt )t>0 (3.7)

in the sense that both sides have the same finite-dimensional distributions.

Lemma 3.7 (Lamperti). If (X_t)_{t≥0} is self-similar and non-degenerate, then there exists a unique index of self-similarity H > 0 such that b(a) = a^H. If (X_t)_{t≥0} is a self-similar Lévy process, then H ≥ 1/2.

Proof. Since (X_t)_{t≥0} is self-similar, we find for a, a′ > 0 and each t ≥ 0

b(aa′) X_t ∼ X_{aa′t} ∼ b(a) X_{a′t} ∼ b(a) b(a′) X_t,

and so b(aa′) = b(a)b(a′) as X_t is non-degenerate.¹ By the convergence of types theorem (Theorem A.6) and the continuity in probability of t ↦ X_t we see that a ↦ b(a) is continuous. Thus, the Cauchy functional equation b(aa′) = b(a)b(a′) has the unique continuous solution b(a) = a^H for some H > 0.
Assume now that (X_t)_{t≥0} is a Lévy process. We are going to show that H ≥ 1/2. Using self-similarity and the properties (L1), (L2) we get (primes always denote iid copies of the respective random variables)

(n + m)^H X_1 ∼ X_{n+m} = (X_{n+m} − X_m) + X_m ∼ X_n″ + X_m′ ∼ n^H X_1″ + m^H X_1′.   (3.8)

Any standard normal random variable X_1 satisfies (3.8) with H = 1/2. On the other hand, if X_1 has a second moment, we get (n + m) V X_1 = V X_{n+m} = V X_n″ + V X_m′ = n V X_1″ + m V X_1′ by Bienaymé's identity for variances, i.e. (3.8) can only hold with H = 1/2. Thus, any self-similar X_1 with finite second moment has to satisfy (3.8) with H = 1/2. If we can show that H < 1/2 implies the existence of a second moment, we have reached a contradiction.
If X_n is symmetric and H < 1/2, we find because of X_n ∼ n^H X_1 some u > 0 such that

P(|X_n| > u n^H) = P(|X_1| > u) < 1/4.

By the symmetrization inequality (Theorem A.7),

½ ( 1 − exp{ −n P(|X_1| > u n^H) } ) ≤ P(|X_n| > u n^H) < 1/4

which means that n P(|X_1| > u n^H) ≤ c for all n ∈ N. Thus, P(|X_1| > x) ≤ c′ x^{−1/H} for all x ≥ u + 1, and so

E|X_1|² = 2 ∫_0^∞ x P(|X_1| > x) dx ≤ (u + 1)² + 2c′ ∫_{u+1}^∞ x^{1−1/H} dx < ∞

as H < 1/2. If X_n is not symmetric, we use its symmetrization X_n − X_n′ where X_n′ are iid copies of X_n.

Definition 3.8. A random variable X is called stable if

∀n ∈ N ∃ b_n > 0, c_n ∈ R^d :   X_1′ + · · · + X_n′ ∼ b_n X + c_n   (3.9)

where X_1′, . . . , X_n′ are iid copies of X. If (3.9) holds with c_n = 0, the random variable is called strictly stable. A Lévy process (X_t)_{t≥0} is (strictly) stable if X_1 is a (strictly) stable random variable.

1 We use here that bX ∼ cX =⇒ b = c if X is non-degenerate. To see this, set χ(ξ) = E e^{i ξ·X} and notice that

|χ(ξ)| = |χ( (b/c) ξ )| = · · · = |χ( (b/c)^n ξ )|.

If b < c, the right-hand side converges for n → ∞ to χ(0) = 1, hence |χ| ≡ 1, contradicting the fact that X is non-degenerate. Since b, c play symmetric roles, we conclude that b = c.

Note that the symmetrization X − X′ of a stable random variable is strictly stable. Setting χ(ξ) = E e^{i ξ·X} it is easy to see that (3.9) is equivalent to

∀n ∈ N ∃ b_n > 0, c_n ∈ R^d :   χ(ξ)^n = χ(b_n ξ) e^{i c_n·ξ}.   (3.9′)

Example 3.9. a) Stable processes. By definition, any stable random variable is infinitely divisible,
and for every stable X there is a unique Lévy process on Rd such that X1 ∼ X, cf. Corollary 2.6.
A Lévy process (X_t)_{t≥0} is stable if, and only if, all random variables X_t are stable. This follows at once from (3.9′) if we use χ_t(ξ) := E e^{i ξ·X_t}:

χ_t(ξ)^n = ( χ_1(ξ)^n )^t = χ_1(b_n ξ)^t e^{i(t c_n)·ξ} = χ_t(b_n ξ) e^{i(t c_n)·ξ},

where the first and last equalities use (2.5) and the middle one uses (3.9′).

It is possible to determine the characteristic exponent of a stable process, cf. Sato [51, Theorem
14.10] and (3.10) further down.

b) Self-similar processes. Assume that (X_t)_{t≥0} is a self-similar Lévy process. Then

∀n ∈ N :   b(n) X_1 ∼ X_n = ∑_{k=1}^n ( X_k − X_{k−1} ) ∼ X′_{1,n} + · · · + X′_{n,n}

where the X′_{k,n} are iid copies of X_1. This shows that X_1, hence (X_t)_{t≥0}, is strictly stable. In fact, the converse is also true:

c) A strictly stable Lévy process is self-similar. We have already seen in b) that self-similar
Lévy processes are strictly stable. Assume now that (Xt )t>0 is strictly stable. Since Xnt ∼ bn Xt we
get
e^{−ntψ(ξ)} = E e^{i ξ·X_{nt}} = E e^{i b_n ξ·X_t} = e^{−tψ(b_n ξ)}.

Taking n = m, t ⇝ t/m and ξ ⇝ b_m^{−1} ξ we see

e^{−(t/m)ψ(ξ)} = e^{−tψ(b_m^{−1} ξ)}.

From these equalities we obtain for q = n/m ∈ Q+ and b(q) := bn /bm

e−qtψ(ξ ) = e−tψ(b(q)ξ ) =⇒ Xqt ∼ b(q)Xt =⇒ Xat ∼ b(a)Xt

for all t > 0 because of the continuity in probability of (Xt )t>0 . Since, by Corollary 2.4, the finite-
dimensional distributions are determined by the one-dimensional distributions, we conclude that
(3.7) holds.
This means, in particular, that strictly stable Lévy processes have an index of self-similarity H ≥ 1/2. It is common to call α = 1/H ∈ (0, 2] the index of stability of (X_t)_{t≥0}, and we have X_{nt} ∼ n^{1/α} X_t.
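
Worked example (added): for a standard Brownian motion, X_1′ + · · · + X_n′ ∼ N(0, n) ∼ n^{1/2} X_1, so (3.9) holds with b_n = n^{1/2} and c_n = 0. Brownian motion is therefore strictly 2-stable and self-similar with index H = 1/2, i.e. α = 1/H = 2 and X_{nt} ∼ n^{1/2} X_t — consistent with the lower bound H ≥ 1/2 of Lemma 3.7.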
If X is ‘only’ stable, its symmetrization is strictly stable and, thus, every stable Lévy process
has an index α ∈ (0, 2]. It plays an important role for the characteristic exponent. For a general

stable process the characteristic exponent is of the form

ψ(ξ) = ∫_{S^d} |z·ξ|^α ( 1 − i sgn(z·ξ) tan(απ/2) ) σ(dz) − i µ·ξ,   (α ≠ 1),
ψ(ξ) = ∫_{S^d} |z·ξ| ( 1 + (2/π) i sgn(z·ξ) log|z·ξ| ) σ(dz) − i µ·ξ,   (α = 1),   (3.10)

where σ is a finite measure on S^d and µ ∈ R^d. The strictly stable exponents have µ = 0 (if α ≠ 1) and ∫_{S^d} z_k σ(dz) = 0, k = 1, . . . , d (if α = 1). These formulae can be derived from the general Lévy–Khintchine formula; a good reference is the monograph by Samorodnitsky & Taqqu [48, Chapters 2.3–4].
If X is strictly stable such that the distribution of X_t is rotationally invariant, it is clear that ψ(ξ) = c|ξ|^α. If X_t is symmetric, i.e. X_t ∼ −X_t, then ψ(ξ) = ∫_{S^d} |z·ξ|^α σ(dz) for some finite, symmetric measure σ on the unit sphere S^d ⊂ R^d.
Let us finally show Kolmogorov’s proof of the Lévy–Khintchine formula for one-dimensional
Lévy processes admitting second moments. We need the following auxiliary result.

Lemma 3.10. Let (Xt )t>0 be a Lévy process on R. If VX1 < ∞, then VXt < ∞ for all t > 0 and

EXt = tEX1 =: tµ and VXt = tVX1 =: tσ 2 .

Proof. If VX_1 < ∞, then E|X_1| < ∞. With Bienaymé's identity, we get

V X_m = ∑_{k=1}^m V(X_k − X_{k−1}) = m V X_1   and   V X_1 = n V X_{1/n}.

In particular, V X_m, V X_{1/n} < ∞. This, and a similar argument for the expectation, show

V X_q = q V X_1   and   E X_q = q E X_1   for all q ∈ Q^+.

Moreover, V(X_q − X_r) = V X_{q−r} = (q − r) V X_1 for all rational numbers r ≤ q, and this shows that X_q − E X_q = X_q − qµ converges in L² as q → t. Since t ↦ X_t is continuous in probability, we can identify the limit and find X_q − qµ → X_t − tµ. Consequently, V X_t = tσ² and E X_t = tµ.

We have seen in Proposition 2.3 that the characteristic function of a Lévy process is of the form

χ_t(ξ) = E e^{i ξ X_t} = ( E e^{i ξ X_1} )^t = χ_1(ξ)^t.

Let us assume that X is real-valued and has finite (first and) second moments VX_1 = σ² and EX_1 = µ. By Taylor's formula

E e^{i ξ(X_t − tµ)} = E( 1 + i ξ(X_t − tµ) − ξ²(X_t − tµ)² ∫_0^1 (1 − θ) e^{i θ ξ(X_t − tµ)} dθ )
                  = 1 − E( ξ²(X_t − tµ)² ∫_0^1 (1 − θ) e^{i θ ξ(X_t − tµ)} dθ ).

Since

| ∫_0^1 (1 − θ) e^{i θ ξ(X_t − tµ)} dθ | ≤ ∫_0^1 (1 − θ) dθ = 1/2,

we get

| E e^{i ξ X_t} | = | E e^{i ξ(X_t − tµ)} | ≥ 1 − (ξ²/2) t σ².
Thus, χ_{1/n}(ξ) ≠ 0 if n ≥ N(ξ) ∈ N is large, hence χ_1(ξ) = χ_{1/n}(ξ)^n ≠ 0. For ξ ∈ R we find (using a suitable branch of the complex logarithm)

ψ(ξ) := − log χ_1(ξ) = − (∂/∂t) χ_1(ξ)^t |_{t=0}
      = lim_{t→0} ( 1 − E e^{i ξ X_t} ) / t
      = lim_{t→0} (1/t) ∫_{−∞}^∞ ( 1 − e^{i yξ} + i yξ ) p_t(dy) − i ξ µ
      = lim_{t→0} ∫_{−∞}^∞ ( 1 − e^{i yξ} + i yξ ) / y²  π_t(dy) − i ξ µ   (3.11)

where p_t(dy) = P(X_t ∈ dy) and π_t(dy) := y² t^{−1} p_t(dy). Yet another application of Taylor's theorem shows that the integrand in the above integral is bounded, vanishes at infinity, and admits a continuous extension onto the whole real line if we choose the value ½ ξ² at y = 0. The family (π_t)_{t∈(0,1]} is uniformly bounded,

(1/t) ∫ y² p_t(dy) = (1/t) E(X_t²) = (1/t) ( V X_t + (E X_t)² ) = σ² + tµ²  →  σ²   (t → 0),
hence sequentially vaguely relatively compact (see Theorem A.3). We conclude that every se-
quence (πt(n) )n∈N ⊂ (πt )t∈(0,1] with t(n) → 0 as n → ∞ has a vaguely convergent subsequence.
But since the limit (3.11) exists, all subsequential limits coincide which means² that π_t converges vaguely to a finite measure π on R. This proves that

ψ(ξ) = − log χ_1(ξ) = ∫_{−∞}^∞ ( 1 − e^{i yξ} + i yξ ) / y²  π(dy) − i ξ µ

for some finite measure π on (−∞, ∞) with total mass π(R) = σ². This is sometimes called the de Finetti–Kolmogorov formula. If we set ν(dy) := y^{−2} 1_{{y≠0}} π(dy) and σ_0² := π{0}, we obtain the Lévy–Khintchine formula

ψ(ξ) = − i µξ + ½ σ_0² ξ² + ∫_{y≠0} ( 1 − e^{i yξ} + i yξ ) ν(dy)

where σ² = σ_0² + ∫_{y≠0} y² ν(dy).

2 Note that e^{i yξ} = ∂²_ξ [ ( 1 − e^{i yξ} + i yξ ) / y² ], i.e. the kernel appearing in (3.11) is indeed measure-determining.
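
As a consistency check (added): for the Poisson process N_t with intensity λ we have µ = E N_1 = λ and π = λ δ_1, hence σ_0² = π{0} = 0, ν = λ δ_1 and π(R) = λ = V N_1 = σ²; the formula then yields ψ(ξ) = −iλξ + λ(1 − e^{iξ} + iξ) = λ(1 − e^{iξ}), in agreement with Example 3.2 c).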

4. On the Markov property

Let (Ω, A, P) be a probability space with some filtration (F_t)_{t≥0} and a d-dimensional adapted stochastic process X = (X_t)_{t≥0}, i.e. each X_t is F_t measurable. We write B(R^d) for the Borel sets and set F_∞ := σ( ⋃_{t≥0} F_t ).

The process X is said to be a simple Markov process, if

P(Xt ∈ B | Fs ) = P(Xt ∈ B | Xs ), s 6 t, B ∈ B(Rd ), (4.1)

holds true. This is pretty much the most general definition of a Markov process, but it is usually
too general to work with. It is more convenient to consider Markov families.

Definition 4.1. A (temporally homogeneous) Markov transition function is a measure kernel


pt (x, B), t > 0, x ∈ Rd , B ∈ B(Rd ) such that

a) B 7→ ps (x, B) is a probability measure for every s > 0 and x ∈ Rd ;

b) (s, x) 7→ ps (x, B) is a Borel measurable function for every B ∈ B(Rd );

c) the Chapman–Kolmogorov equations hold

p_{s+t}(x, B) = ∫ p_t(y, B) p_s(x, dy)   for all s, t ≥ 0, x ∈ R^d, B ∈ B(R^d).   (4.2)

Definition 4.2. A stochastic process (Xt )t>0 is called a (temporally homogeneous) Markov pro-
cess with transition function if there exists a Markov transition function pt (x, B) such that

P(Xt ∈ B | Fs ) = pt−s (Xs , B) a.s. for all s 6 t, B ∈ B(Rd ). (4.3)

Conditioning w.r.t. σ (Xs ) and using the tower property of conditional expectation shows that
(4.3) implies the simple Markov property (4.1). Nowadays the following definition of a Markov
process is commonly used.

Definition 4.3. A (universal) Markov process is a tuple (Ω, A , Ft , Xt ,t > 0, Px , x ∈ Rd ) such


that pt (x, B) = Px (Xt ∈ B) is a Markov transition function and (Xt )t>0 is for each Px a Markov
process in the sense of Definition 4.2 such that Px (X0 = x) = 1. In particular,

Px (Xt ∈ B | Fs ) = PXs (Xt−s ∈ B) Px -a.s. for all s 6 t, B ∈ B(Rd ). (4.4)


We are going to show that a Lévy process is a (universal) Markov process. Assume that (Xt )t>0
is a Lévy process and set Ft := FtX = σ (Xr , r 6 t). Define probability measures

Px (X• ∈ Γ) := P(X• + x ∈ Γ), x ∈ Rd ,

where Γ is a Borel set of the path space (R^d)^{[0,∞)} = {w | w : [0, ∞) → R^d}.¹ We set E^x := ∫ . . . dP^x. By construction, P = P^0 and E = E^0.
Note that Xtx := Xt + x satisfies the conditions (L1)–(L3), and it is common to call (Xtx )t>0 a
Lévy process starting from x.

Lemma 4.4. Let (Xt )t>0 be a Lévy process on Rd . Then

pt (x, B) := Px (Xt ∈ B) := P(Xt + x ∈ B), t > 0, x ∈ Rd , B ∈ B(Rd ),

is a Markov transition function.

Proof. Since pt (x, B) = E1B (Xt + x) (the proof of) Fubini’s theorem shows that x 7→ pt (x, B) is
a measurable function and B 7→ pt (x, B) is a probability measure. The Chapman–Kolmogorov
equations follow from

p_{s+t}(x, B) = P(X_{s+t} + x ∈ B) = P( (X_{s+t} − X_t) + x + X_t ∈ B )
             = ∫_{R^d} P(y + X_t ∈ B) P( (X_{s+t} − X_t) + x ∈ dy )      [by (L2)]
             = ∫_{R^d} P(y + X_t ∈ B) P( X_s + x ∈ dy )                  [by (L1)]
             = ∫_{R^d} p_t(y, B) p_s(x, dy).

Remark 4.5. The proof of Lemma 4.4 shows a bit more: From

p_t(x, B) = ∫ 1_B(x + y) P(X_t ∈ dy) = ∫ 1_{B−x}(y) P(X_t ∈ dy) = p_t(0, B − x)

we see that the kernels pt (x, B) are invariant under shifts in Rd (translation invariant). In slight
abuse of notation we write pt (x, B) = pt (B − x). From this it becomes clear that the Chapman–
Kolmogorov equations are convolution identities pt+s (B) = pt ∗ ps (B), and (pt )t>0 is a convolu-
tion semigroup of probability measures; because of (L3), this semigroup is weakly continuous at
t = 0, i.e. pt → δ0 as t → 0, cf. Theorem A.3 et seq. for the weak convergence of measures.
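
The convolution structure can be checked numerically (an added sketch, using the Gaussian transition densities of Brownian motion as an example): the grid convolution of p_s and p_t should reproduce p_{s+t}.

    import numpy as np

    def p(x, t):                      # Gaussian transition density of Brownian motion
        return np.exp(-x**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

    dx = 0.01
    x = np.linspace(-20.0, 20.0, 4001)
    s, t = 0.3, 0.7

    conv = np.convolve(p(x, s), p(x, t), mode="same") * dx    # grid approximation of p_s * p_t
    print("max |p_s*p_t - p_{s+t}| =", np.abs(conv - p(x, s + t)).max())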
Lévy processes enjoy an even stronger version of the above Markov property.

Theorem 4.6 (Markov property for Lévy processes). Let X be a d-dimensional Lévy process and
set Y := (Xt+a − Xa )t>0 for some fixed a > 0. Then Y is again a Lévy process satisfying

a) Y ⊥⊥ (X_r)_{r≤a}, i.e. F_∞^Y ⊥⊥ F_a^X.
1 Recall that B((Rd )[0,∞) ) is the smallest σ -algebra containing the cylinder sets Z = ×t>0 Bt where Bt ∈ B(Rd )
and only finitely many Bt 6= Rd .

b) Y ∼ X, i.e. X and Y have the same finite dimensional distributions.

Proof. Observe that F_s^Y = σ(X_{r+a} − X_a, r ≤ s) ⊂ F_{s+a}^X. Using Theorem 3.1 and the tower property of conditional expectation yields for all s ≤ t

E[ e^{i ξ·(Y_t − Y_s)} | F_s^Y ] = E[ E( e^{i ξ·(X_{t+a} − X_{s+a})} | F_{s+a}^X ) | F_s^Y ] = e^{−(t−s)ψ(ξ)}.

Thus, (Y_t)_{t≥0} is a Lévy process with the same characteristic function as (X_t)_{t≥0}. The property (L2′) for X gives

X_{t_n+a} − X_{t_{n−1}+a}, X_{t_{n−1}+a} − X_{t_{n−2}+a}, . . . , X_{t_1+a} − X_a ⊥⊥ F_a^X.

As σ(Y_{t_1}, . . . , Y_{t_n}) = σ(Y_{t_n} − Y_{t_{n−1}}, . . . , Y_{t_1} − Y_{t_0}) ⊥⊥ F_a^X for all t_0 = 0 < t_1 < · · · < t_n, we get

F_∞^Y = σ( ⋃_{t_1<···<t_n, n∈N} σ(Y_{t_1}, . . . , Y_{t_n}) ) ⊥⊥ F_a^X.

Using the Markov transition function p_t(x, B) we can define a linear operator on the bounded Borel measurable functions f : R^d → R:

P_t f(x) := ∫ f(y) p_t(x, dy) = E^x f(X_t),   f ∈ B_b(R^d), t ≥ 0, x ∈ R^d.   (4.5)

For a Lévy process, cf. Remark 4.5, we have p_t(x, B) = p_t(B − x) and the operators P_t are actually convolution operators:

P_t f(x) = E f(X_t + x) = ∫ f(y + x) p_t(dy) = f ∗ p̃_t(x)   where p̃_t(B) := p_t(−B).   (4.6)

Definition 4.7. Let Pt , t > 0, be defined by (4.5). The operators are said to be

a) acting on Bb (Rd ), if Pt : Bb (Rd ) → Bb (Rd ).

b) an operator semigroup, if Pt+s = Pt ◦ Ps for all s,t > 0 and P0 = id.

c) sub-Markovian if 0 6 f 6 1 =⇒ 0 6 Pt f 6 1.

d) contractive if kPt f k∞ 6 k f k∞ for all f ∈ Bb (Rd ).

e) conservative if Pt 1 = 1.

f) Feller operators, if Pt : C∞ (Rd ) → C∞ (Rd ).2

g) strongly continuous on C∞ (Rd ), if limt→0 kPt f − f k∞ = 0 for all f ∈ C∞ (Rd ).

h) strong Feller operators, if Pt : Bb (Rd ) → Cb (Rd ).

2 C_∞(R^d) denotes the space of continuous functions vanishing at infinity. It is a Banach space when equipped with the uniform norm ‖f‖_∞ = sup_{x∈R^d} |f(x)|.

Lemma 4.8. Let (Pt )t>0 be defined by (4.5). The properties 4.7.a)–e) hold for any Markov process,
4.7.a)–g) hold for any Lévy process, and 4.7.a)–h) hold for any Lévy process such that all transition
probabilities pt (dy) = P(Xt ∈ dy), t > 0, are absolutely continuous w.r.t. Lebesgue measure.

Proof. We only show the assertions about Lévy processes (Xt )t>0 .

a) Since Pt f (x) = E f (Xt + x), the boundedness of Pt f is obvious, and the measurability in x
follows from (the proof of) Fubini’s theorem.

b) By the tower property of conditional expectation, we get for s, t ≥ 0

P_{t+s} f(x) = E^x f(X_{t+s}) = E^x ( E^x[ f(X_{t+s}) | F_s ] ) = E^x ( E^{X_s} f(X_t) ) = P_s ∘ P_t f(x),

using (4.4) in the second-to-last equality. For the Markov transition functions this is the Chapman–Kolmogorov identity (4.2).

c) and d), e) follow directly from the fact that B 7→ pt (x, B) is a probability measure.

f) Let f ∈ C∞ (Rd ). Since x 7→ f (x + Xt ) is continuous and bounded, the claim follows from
dominated convergence as Pt f (x) = E f (x + Xt ).

g) f ∈ C∞ is uniformly continuous, i.e. for every ε > 0 there is some δ > 0 such that

|x − y| 6 δ =⇒ | f (x) − f (y)| 6 ε.

Hence,

‖P_t f − f‖_∞ ≤ sup_{x∈R^d} ∫ |f(X_t) − f(x)| dP^x
            = sup_{x∈R^d} ( ∫_{|X_t−x|≤δ} |f(X_t) − f(x)| dP^x + ∫_{|X_t−x|>δ} |f(X_t) − f(x)| dP^x )
            ≤ ε + 2‖f‖_∞ sup_{x∈R^d} P^x(|X_t − x| > δ)
            = ε + 2‖f‖_∞ P(|X_t| > δ)  −→  ε   as t → 0, by (L3).

Since ε > 0 is arbitrary, the claim follows. Note that this proof shows that uniform conti-
nuity in probability is responsible for the strong continuity of the semigroup.

h) see Lemma 4.9.

Lemma 4.9 (Hawkes). Let X = (Xt )t>0 be a Lévy process on Rd . Then the operators Pt defined
by (4.5) are strong Feller if, and only if, Xt ∼ pt (y) dy for all t > 0.

Proof. '⇐': Let X_t ∼ p_t(y) dy. Since p_t ∈ L¹ and since convolutions have a smoothing property (e.g. [54, Theorem 14.8] or [55, Satz 18.9]), we get with p̃_t(y) = p_t(−y)

P_t f = f ∗ p̃_t ∈ L^∞ ∗ L¹ ⊂ C_b(R^d).

'⇒': We show that p_t(dy) ≪ dy. Let N ∈ B(R^d) be a Lebesgue null set, λ^d(N) = 0, and g ∈ B_b(R^d). Then, by the Fubini–Tonelli theorem

∫ g(x) P_t 1_N(x) dx = ∫∫ g(x) 1_N(x + y) p_t(dy) dx = ∫ ( ∫ g(x) 1_N(x + y) dx ) p_t(dy) = 0,

since the inner integral vanishes. Take g = P_t 1_N, then the above calculation shows

∫ ( P_t 1_N(x) )² dx = 0.

Hence, P_t 1_N = 0 Lebesgue-a.e. By the strong Feller property, P_t 1_N is continuous, and so P_t 1_N ≡ 0, hence

p_t(N) = P_t 1_N(0) = 0.

Remark 4.10. The existence and smoothness of densities for a Lévy process are time-dependent
properties, cf. Sato [51, Chapter V.23]. The typical example is the Gamma process. This is a
(one-dimensional) Lévy process with characteristic exponent
ψ(ξ) = ½ log(1 + |ξ|²) − i arctan ξ,   ξ ∈ R,

and this process has the transition density

p_t(x) = (1/Γ(t)) x^{t−1} e^{−x} 1_{(0,∞)}(x),   t > 0.

The factor x^{t−1} gives a time-dependent condition for the property p_t ∈ L^p(dx). One can show, cf. [30], that

lim_{|ξ|→∞} Re ψ(ξ) / log(1 + |ξ|²) = ∞   =⇒   ∀t > 0 ∃ p_t ∈ C_∞(R^d).

The converse direction remains true if ψ(ξ ) is rotationally invariant or if it is replaced by its
symmetric rearrangement.
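
For instance (an added observation): for the Gamma process Re ψ(ξ) = ½ log(1 + |ξ|²), so Re ψ(ξ)/log(1 + |ξ|²) → ½ rather than ∞, and the sufficient condition above does not apply; indeed p_t(x) = x^{t−1}e^{−x}/Γ(t) is unbounded near x = 0 for t < 1 (although it is smooth on (0, ∞) for every t > 0), so p_t ∉ C_∞(R) for small t.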
Remark 4.11. If (Pt )t>0 is a Feller semigroup, i.e. a semigroup satisfying the conditions 4.7.a)-g),
then there exists a unique stochastic process (a Feller process) with (Pt )t>0 as transition semi-
group. The idea is to use Kolmogorov’s consistency theorem for the following family of finite-
dimensional distributions

p^x_{t_1,...,t_n}(B_1 × · · · × B_n) = P_{t_1}( 1_{B_1} P_{t_2−t_1}( 1_{B_2} P_{t_3−t_2}( . . . P_{t_n−t_{n−1}}(1_{B_n}) ) ) )(x)

Here Xt0 = X0 = x a.s. Note: It is not enough to have a semigroup on L p as we need pointwise
evaluations.
If the operators Pt are not a priori given on Bb (Rd ) but only on C∞ (Rd ), one still can use the
Riesz representation theorem to construct Markov kernels pt (x, B) representing and extending Pt
onto Bb (Rd ), cf. Lemma 5.2.

Recall that a stopping time is a random time τ : Ω → [0, ∞] such that {τ ≤ t} ∈ F_t for all t ≥ 0. It is not hard to see that τ_n := (⌊2^n τ⌋ + 1) 2^{−n}, n ∈ N, is a sequence of stopping times with values k 2^{−n}, k = 1, 2, . . . , such that

τ_1 ≥ τ_2 ≥ . . . ≥ τ_n ↓ τ = inf_{n∈N} τ_n.

This approximation is the key ingredient to extend the Markov property (Theorem 4.6) to random
times.

Theorem 4.12 (Strong Markov property for Lévy processes). Let X be a Lévy process on Rd
and set Y := (Xt+τ − Xτ )t>0 for some a.s. finite stopping time τ. Then Y is again a Lévy process
satisfying

a) Y ⊥⊥ (X_r)_{r≤τ}, i.e. F_∞^Y ⊥⊥ F_{τ+}^X := { F ∈ F_∞^X : F ∩ {τ < t} ∈ F_t^X ∀ t > 0 }.

b) Y ∼ X, i.e. X and Y have the same finite dimensional distributions.

Proof. Let τ_n := (⌊2^n τ⌋ + 1) 2^{−n}. For all 0 ≤ s < t, ξ ∈ R^d and F ∈ F_{τ+}^X we find by the right-continuity of the sample paths (or by the continuity in probability (L3))

E[ e^{i ξ·(X_{t+τ} − X_{s+τ})} 1_F ] = lim_{n→∞} E[ e^{i ξ·(X_{t+τ_n} − X_{s+τ_n})} 1_F ]
 = lim_{n→∞} ∑_{k=1}^∞ E[ e^{i ξ·(X_{t+k2^{−n}} − X_{s+k2^{−n}})} 1_{{τ_n = k2^{−n}}} · 1_F ]
 = lim_{n→∞} ∑_{k=1}^∞ E[ e^{i ξ·(X_{t+k2^{−n}} − X_{s+k2^{−n}})} ] E[ 1_{{(k−1)2^{−n} ≤ τ < k2^{−n}}} 1_F ]
 = lim_{n→∞} ∑_{k=1}^∞ E[ e^{i ξ·X_{t−s}} ] P( {(k−1)2^{−n} ≤ τ < k2^{−n}} ∩ F )
 = E[ e^{i ξ·X_{t−s}} ] P(F),

where we use that the increment is independent of F_{k2^{−n}}^X by (L2), while {(k−1)2^{−n} ≤ τ < k2^{−n}} ∩ F ∈ F_{k2^{−n}}^X since F ∈ F_{τ+}^X. In the last equality we use ⋃_{k=1}^∞ {(k−1)2^{−n} ≤ τ < k2^{−n}} = {τ < ∞} for all n ≥ 1.

The same calculation applies to finitely many increments. Let F ∈ F_{τ+}^X, t_0 = 0 < t_1 < · · · < t_n and ξ_1, . . . , ξ_n ∈ R^d. Then

E[ e^{i ∑_{k=1}^n ξ_k·(X_{t_k+τ} − X_{t_{k−1}+τ})} 1_F ] = ∏_{k=1}^n E[ e^{i ξ_k·X_{t_k − t_{k−1}}} ] P(F).

This shows that the increments X_{t_k+τ} − X_{t_{k−1}+τ} are independent and distributed like X_{t_k − t_{k−1}}. Moreover, all increments are independent of F ∈ F_{τ+}^X.
Therefore, all random vectors of the form (X_{t_1+τ} − X_τ, . . . , X_{t_n+τ} − X_{t_{n−1}+τ}) are independent of F_{τ+}^X, and we conclude that F_∞^Y = σ(X_{t+τ} − X_τ, t ≥ 0) ⊥⊥ F_{τ+}^X.
5. A digression: semigroups

We have seen that the Markov kernel pt (x, B) of a Lévy or Markov process induces a semigroup
of linear operators (Pt )t>0 . In this chapter we collect a few tools from functional analysis for the
study of operator semigroups. By Bb (Rd ) we denote the bounded Borel functions f : Rd → R, and
C∞ (Rd ) are the continuous functions vanishing at infinity, i.e. lim|x|→∞ f (x) = 0; when equipped
with the uniform norm k · k∞ both sets become Banach spaces.

Definition 5.1. A Feller semigroup is a family of linear operators Pt : Bb (Rd ) → Bb (Rd ) satisfy-
ing the properties a)–g) of Definition 4.7: (Pt )t>0 is a semigroup of conservative, sub-Markovian
operators which enjoy the Feller property Pt (C∞ (Rd )) ⊂ C∞ (Rd ) and which are strongly contin-
uous on C∞ (Rd ).

Notice that (t, x) ↦ P_t f(x) is continuous for every f ∈ C_∞(R^d). This follows from

|Pt f (x) − Ps f (y)| 6 |Pt f (x) − Pt f (y)| + |Pt f (y) − Ps f (y)|


6 |Pt f (x) − Pt f (y)| + kP|t−s| f − f k∞ ,

the Feller property 4.7.f) and the strong continuity 4.7.g).

Lemma 5.2. If (P_t)_{t≥0} is a Feller semigroup, then there is a Markov transition function p_t(x, dy) (Definition 4.1) such that P_t f(x) = ∫ f(y) p_t(x, dy).

Proof. By the Riesz representation theorem we see that the operators P_t are of the form

P_t f(x) = ∫ f(y) p_t(x, dy)

where p_t(x, dy) is a Markov kernel. The tricky part is to show the joint measurability of the transition function (t, x) ↦ p_t(x, B) and the Chapman–Kolmogorov identities (4.2).
For every compact set K ⊂ R^d the functions defined by

f_n(x) := d(x, U_n^c) / ( d(x, K) + d(x, U_n^c) ),   d(x, A) := inf_{a∈A} |x − a|,   U_n := {y : d(y, K) < 1/n},

are in C_∞(R^d) and f_n ↓ 1_K. By monotone convergence, p_t(x, K) = inf_{n∈N} P_t f_n(x) which proves the joint measurability in (t, x) for all compact sets.
By the same argument, the semigroup property P_{t+s} f_n = P_s P_t f_n entails the Chapman–Kolmogorov identities for compact sets: p_{t+s}(x, K) = ∫ p_t(y, K) p_s(x, dy). Since

D := { B ∈ B(R^d) : (t, x) ↦ p_t(x, B) is measurable and p_{t+s}(x, B) = ∫ p_t(y, B) p_s(x, dy) }

is a Dynkin system containing the compact sets, we have D = B(Rd ).

To get an intuition for semigroups it is a good idea to view the semigroup property

Pt+s = Ps ◦ Pt and P0 = id

as an operator-valued Cauchy functional equation. If t 7→ Pt is—in a suitable sense—continuous,


the unique solution will be of the form Pt = etA for some operator A. This can be easily made
rigorous for matrices A, Pt ∈ Rn×n since the matrix exponential is well defined by the uniformly
convergent series
\[
P_t=\exp(tA):=\sum_{k=0}^{\infty}\frac{t^kA^k}{k!}\qquad\text{and}\qquad A=\frac{d}{dt}P_t\Big|_{t=0}
\]

with A0 := id and Ak = A ◦ A ◦ · · · ◦ A (k times). With a bit more care, this can be made to work
also in general settings.
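To see the matrix picture at work, here is a small numerical sketch (NumPy and SciPy assumed; the conservative 3-state Q-matrix A is an arbitrary illustration, not an object from the text). It checks the semigroup property of $P_t=\exp(tA)$ and recovers $A$ as the derivative of $P_t$ at $t=0$.

```python
import numpy as np
from scipy.linalg import expm

# an illustrative generator (Q-matrix) of a Markov chain on 3 states:
# non-negative off-diagonal entries, rows summing to zero
A = np.array([[-1.0, 0.7, 0.3],
              [ 0.2,-0.5, 0.3],
              [ 0.4, 0.6,-1.0]])

def P(t):
    """semigroup P_t = exp(tA) via the matrix exponential"""
    return expm(t * A)

s, t = 0.3, 0.8
# semigroup property P_{t+s} = P_s P_t
assert np.allclose(P(s + t), P(s) @ P(t))
# recover the generator: A = d/dt P_t at t = 0, approximated by a difference quotient
h = 1e-6
print(np.max(np.abs((P(h) - np.eye(3)) / h - A)))   # of order 1e-6
```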

Definition 5.3. Let (Pt )t>0 be a Feller semigroup. The (infinitesimal) generator is a linear opera-
tor defined by
\[
D(A):=\Bigl\{f\in C_\infty(\mathbb{R}^d)\ \Big|\ \exists g\in C_\infty(\mathbb{R}^d):\ \lim_{t\to 0}\Bigl\|\frac{P_tf-f}{t}-g\Bigr\|_\infty=0\Bigr\} \tag{5.1}
\]
\[
Af:=\lim_{t\to 0}\frac{P_tf-f}{t},\qquad f\in D(A). \tag{5.2}
\]

The following lemma is the rigorous version for the symbolic notation ‘Pt = etA ’.

Lemma 5.4. Let (Pt )t>0 be a Feller semigroup with infinitesimal generator (A, D(A)). Then
Pt (D(A)) ⊂ D(A) and

d
Pt f = APt f = Pt A f for all f ∈ D(A), t > 0. (5.3)
dt
Rt
Moreover, 0 Ps f ds ∈ D(A) for any f ∈ C∞ (Rd ), and
Z t
Pt f − f = A Ps f ds, f ∈ C∞ (Rd ), t > 0 (5.4)
0
Z t
= Ps A f ds, f ∈ D(A), t > 0. (5.5)
0

Proof. Let $0<\varepsilon<t$ and $f\in D(A)$. The semigroup and contraction properties give
\[
\begin{aligned}
\Bigl\|\frac{P_tf-P_{t-\varepsilon}f}{\varepsilon}-P_tAf\Bigr\|_\infty
&\le \Bigl\|P_{t-\varepsilon}\frac{P_\varepsilon f-f}{\varepsilon}-P_{t-\varepsilon}Af\Bigr\|_\infty+\bigl\|P_{t-\varepsilon}Af-P_{t-\varepsilon}P_\varepsilon Af\bigr\|_\infty\\
&\le \Bigl\|\frac{P_\varepsilon f-f}{\varepsilon}-Af\Bigr\|_\infty+\bigl\|Af-P_\varepsilon Af\bigr\|_\infty \xrightarrow[\varepsilon\to 0]{}0,
\end{aligned}
\]
where we use the strong continuity in the last step. This shows $\frac{d^-}{dt}P_tf=AP_tf=P_tAf$; a similar (but simpler) calculation proves this also for $\frac{d^+}{dt}P_tf$.

Let $f\in C_\infty(\mathbb{R}^d)$ and $t,\varepsilon>0$. By Fubini's theorem and the representation of $P_t$ with a Markov transition function (Lemma 5.2) we get
\[
\int_0^t P_\varepsilon P_sf(x)\,ds=P_\varepsilon\int_0^t P_sf(x)\,ds,
\]
and so,
\[
\frac{P_\varepsilon-\mathrm{id}}{\varepsilon}\int_0^t P_sf(x)\,ds
=\frac1\varepsilon\int_0^t\bigl(P_{s+\varepsilon}f(x)-P_sf(x)\bigr)\,ds
=\frac1\varepsilon\int_t^{t+\varepsilon}P_sf(x)\,ds-\frac1\varepsilon\int_0^{\varepsilon}P_sf(x)\,ds.
\]
Since $t\mapsto P_tf(x)$ is continuous, the fundamental theorem of calculus applies, and we get
\[
\lim_{\varepsilon\to 0}\frac1\varepsilon\int_r^{r+\varepsilon}P_sf(x)\,ds=P_rf(x)
\]
for $r\ge 0$. This shows that $\int_0^t P_sf\,ds\in D(A)$ as well as (5.4). If $f\in D(A)$, then we deduce (5.5) from
\[
\int_0^t P_sAf(x)\,ds\overset{(5.3)}{=}\int_0^t\frac{d}{ds}P_sf(x)\,ds=P_tf(x)-f(x)\overset{(5.4)}{=}A\int_0^t P_sf(x)\,ds.
\]

Remark 5.5 (Consequences of Lemma 5.4). Write C∞ := C∞ (Rd ).


Rt
a) (5.4) shows that D(A) is dense in C∞ , since D(A) 3 t −1 0 Ps f ds −−→ f for any f ∈ C∞ .
t→0

b) (5.5) shows that A is a closed operator, i.e.


uniformly
fn ∈ D(A), ( fn , A fn ) −−−−−−→ ( f , g) ∈ C∞ × C∞ =⇒ f ∈ D(A) & A f = g.
n→∞

c) (5.3) means that A determines (Pt )t>0 uniquely.

Let us now consider the Laplace transform of (Pt )t>0 .

Definition 5.6. Let (Pt )t>0 be a Feller semigroup. The resolvent is a linear operator on Bb (Rd )
given by
\[
R_\lambda f(x):=\int_0^\infty e^{-\lambda t}P_tf(x)\,dt,\qquad f\in B_b(\mathbb{R}^d),\ x\in\mathbb{R}^d,\ \lambda>0. \tag{5.6}
\]
The following formal calculation can easily be made rigorous. Let (λ − A) := (λ id −A) for
$\lambda>0$ and $f\in D(A)$. Then
\[
\begin{aligned}
(\lambda-A)R_\lambda f
&=(\lambda-A)\int_0^\infty e^{-\lambda t}P_tf\,dt
\overset{(5.4),(5.5)}{=}\int_0^\infty e^{-\lambda t}(\lambda-A)P_tf\,dt\\
&=\lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\int_0^\infty e^{-\lambda t}\frac{d}{dt}P_tf\,dt\\
&\overset{\text{parts}}{=}\lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\bigl[e^{-\lambda t}P_tf\bigr]_{t=0}^{t=\infty}
=f.
\end{aligned}
\]

A similar calculation for Rλ (λ − A) gives



Theorem 5.7. Let (A, D(A)) and (Rλ )λ >0 be the generator and the resolvent of a Feller semi-
group. Then
Rλ = (λ − A)−1 for all λ > 0.

Since Rλ is the Laplace transform of (Pt )t>0 , the properties of (Rλ )λ >0 can be found from
(Pt )t>0 and vice versa. With some effort one can even invert the (operator-valued) Laplace trans-
form which leads to the familiar expression for $e^x$:
\[
\Bigl(\frac{n}{t}R_{n/t}\Bigr)^n=\Bigl(\mathrm{id}-\frac{t}{n}A\Bigr)^{-n}\xrightarrow[n\to\infty]{\text{strongly}}e^{tA}=P_t \tag{5.7}
\]

(the notation etA = Pt is, for unbounded operators A, formal), see Pazy [41, Chapter 1.8].
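Continuing the finite-dimensional illustration from above (same hypothetical 3x3 generator, NumPy and SciPy assumed), Theorem 5.7 and the approximation (5.7) can be checked numerically; this is only a sanity check in the matrix case, not a proof.

```python
import numpy as np
from scipy.linalg import expm, inv
from scipy.integrate import quad_vec

A = np.array([[-1.0, 0.7, 0.3],
              [ 0.2,-0.5, 0.3],
              [ 0.4, 0.6,-1.0]])
lam, t = 2.0, 0.7
I = np.eye(3)

# resolvent as the Laplace transform of the semigroup: R_lam = int_0^infty e^{-lam s} P_s ds
R_lam, _ = quad_vec(lambda s: np.exp(-lam * s) * expm(s * A), 0, 50)
print(np.max(np.abs(R_lam - inv(lam * I - A))))        # Theorem 5.7: R_lam = (lam - A)^{-1}

# approximation (5.7): ((n/t) R_{n/t})^n -> P_t strongly as n -> infinity
n = 2000
approx = np.linalg.matrix_power((n / t) * inv((n / t) * I - A), n)
print(np.max(np.abs(approx - expm(t * A))))            # small for large n
```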

Lemma 5.8. Let $(R_\lambda)_{\lambda>0}$ be the resolvent of a Feller1 semigroup $(P_t)_{t\ge 0}$. Then
\[
\frac{d^n}{d\lambda^n}R_\lambda=n!\,(-1)^n R_\lambda^{n+1},\qquad n\in\mathbb{N}_0. \tag{5.8}
\]

Proof. Using a symmetry argument we see
\[
t^n=\int_0^t\!\cdots\!\int_0^t dt_1\ldots dt_n=n!\int_0^t\!\!\int_0^{t_n}\!\cdots\!\int_0^{t_2} dt_1\ldots dt_n.
\]
Let $f\in C_\infty(\mathbb{R}^d)$ and $x\in\mathbb{R}^d$. Then
\[
\begin{aligned}
(-1)^n\frac{d^n}{d\lambda^n}R_\lambda f(x)
&=(-1)^n\int_0^\infty\frac{d^n}{d\lambda^n}e^{-\lambda t}\,P_tf(x)\,dt
=\int_0^\infty t^ne^{-\lambda t}P_tf(x)\,dt\\
&=n!\int_0^\infty\!\!\int_0^t\!\!\int_0^{t_n}\!\cdots\!\int_0^{t_2}e^{-\lambda t}P_tf(x)\,dt_1\ldots dt_n\,dt\\
&=n!\int_0^\infty\!\cdots\!\int_0^\infty \mathbf{1}_{\{t_1\le\cdots\le t_n\le t\}}\,e^{-\lambda t}P_tf(x)\,dt\,dt_1\ldots dt_n\\
&=n!\int_0^\infty\!\cdots\!\int_0^\infty e^{-\lambda(t+t_1+\cdots+t_n)}P_{t+t_1+\cdots+t_n}f(x)\,dt\,dt_1\ldots dt_n
=n!\,R_\lambda^{n+1}f(x).
\end{aligned}
\]

The key result identifying the generators of Feller semigroups is the following theorem due to
Hille, Yosida and Ray, a proof can be found in Pazy [41, Chapter 1.4] or Ethier & Kurtz [17,
Chapter 4.2]; a probabilistic approach is due to Itô [25].

Theorem 5.9 (Hille–Yosida–Ray). A linear operator (A, D(A)) on C∞ (Rd ) generates a Feller
semigroup (Pt )t>0 if, and only if,

a) D(A) ⊂ C∞ (Rd ) dense.

b) A is dissipative, i.e. kλ f − A f k∞ > λ k f k∞ for some (or all) λ > 0.

c) (λ − A)(D(A)) = C∞ (Rd ) for some (or all) λ > 0.


1 This Lemma only needs that the operators Pt are strongly continuous and contractive, Definition 4.7.g), d).

d) A satisfies the positive maximum principle:

\[
f\in D(A),\quad f(x_0)=\sup_{x\in\mathbb{R}^d}f(x)\ge 0\ \Longrightarrow\ Af(x_0)\le 0. \tag{PMP}
\]

This variant of the Hille–Yosida theorem is not the standard version from functional analysis
since we are interested in positivity preserving (sub-Markov) semigroups. Let us briefly discuss
the role of the positive maximum principle.
Remark 5.10. Let (Pt )t>0 be a strongly continuous contraction semigroup on C∞ (Rd ), i.e.

kPt f k∞ 6 k f k∞ and lim kPt f − f k∞ = 0,


t→0

cf. Definition 4.7.d),g).2

1° Sub-Markov ⇒ (PMP). Assume that $f\in D(A)$ is such that $f(x_0)=\sup f\ge 0$. Then, using $f\le f^+$,
\[
P_tf(x_0)-f(x_0)\le P_tf^+(x_0)-f^+(x_0)\le\|f^+\|_\infty-f^+(x_0)=0
\ \Longrightarrow\ Af(x_0)=\lim_{t\to 0}\frac{P_tf(x_0)-f(x_0)}{t}\le 0.
\]
Thus, (PMP) holds.

2° (PMP) ⇒ dissipativity. Assume that (PMP) holds and let $f\in D(A)$. Since $f\in C_\infty(\mathbb{R}^d)$, we may assume that $f(x_0)=|f(x_0)|=\sup|f|$ (otherwise replace $f$ by $-f$). Then
\[
\|\lambda f-Af\|_\infty\ge\lambda f(x_0)-\underbrace{Af(x_0)}_{\le 0}\ge\lambda f(x_0)=\lambda\|f\|_\infty.
\]
3◦ (PMP) ⇒ sub-Markov. Since Pt is contractive, we have Pt f (x) 6 kPt f k∞ 6 k f k∞ 6 1 for all
f ∈ C∞ (Rd ) such that | f | 6 1. In order to see positivity, let f ∈ C∞ (Rd ) be non-negative.
We distinguish between two cases:
1◦ Rλ f does not attain its infimum. Since Rλ f ∈ C∞ (Rd ) vanishes at infinity, we have nec-
essarily Rλ f > 0.
2◦ ∃x0 : Rλ f (x0 ) = inf Rλ f . Because of the (PMP) we find

λ Rλ f (x0 ) − f (x0 ) = ARλ f (x0 ) > 0


=⇒ λ Rλ f (x) > inf λ Rλ f = λ Rλ f (x0 ) > f (x0 ) > 0.

This proves that f > 0 =⇒ λ Rλ f > 0. From (5.8) we see that λ 7→ Rλ f (x) is completely
monotone, hence it is the Laplace transform of a positive measure. Since Rλ f (x) has the
integral representation (5.6), we conclude that Pt f (x) > 0 (for all t > 0 as t 7→ Pt f is contin-
uous).
Using the Riesz representation theorem (as in Lemma 5.2) we can extend Pt as a sub-Markov
operator onto Bb (Rd ).
2 These properties are essential for the existence of a generator and the resolvent on C∞ (Rd ).

In order to determine the domain D(A) of the generator the following ‘maximal dissipativity’
result is handy.

Lemma 5.11 (Dynkin, Reuter). Assume that $(A,D(A))$ generates a Feller semigroup and that $(\widetilde{A},D(\widetilde{A}))$ extends $A$, i.e. $D(A)\subset D(\widetilde{A})$ and $\widetilde{A}|_{D(A)}=A$. If
\[
u\in D(\widetilde{A}),\ u-\widetilde{A}u=0\ \Longrightarrow\ u=0, \tag{5.9}
\]
then $(A,D(A))=(\widetilde{A},D(\widetilde{A}))$.

Proof. Since $A$ is a generator, $(\mathrm{id}-A):D(A)\to C_\infty(\mathbb{R}^d)$ is bijective. On the other hand, the relation (5.9) means that $(\mathrm{id}-\widetilde{A})$ is injective, and a bijective operator cannot have a proper injective extension.

Theorem 5.12. Let $(P_t)_{t\ge 0}$ be a Feller semigroup with generator $(A,D(A))$. Then
\[
D(A)=\Bigl\{f\in C_\infty(\mathbb{R}^d)\ \Big|\ \exists g\in C_\infty(\mathbb{R}^d)\ \forall x:\ \lim_{t\to 0}\frac{P_tf(x)-f(x)}{t}=g(x)\Bigr\}. \tag{5.10}
\]

Proof. Denote by $D(\widetilde{A})$ the right-hand side of (5.10) and define
\[
\widetilde{A}f(x):=\lim_{t\to 0}\frac{P_tf(x)-f(x)}{t}\qquad\text{for all }f\in D(\widetilde{A}),\ x\in\mathbb{R}^d.
\]
Obviously, $(\widetilde{A},D(\widetilde{A}))$ is a linear operator which extends $(A,D(A))$. Since (PMP) is, essentially, a pointwise assertion (see Remark 5.10, 1°), $\widetilde{A}$ inherits (PMP); in particular, $\widetilde{A}$ is dissipative (see Remark 5.10, 2°):
\[
\|\widetilde{A}f-\lambda f\|_\infty\ge\lambda\|f\|_\infty.
\]
This implies (5.9), and the claim follows from Lemma 5.11.
6. The generator of a Lévy process

We want to study the structure of the generator of (the semigroup corresponding to) a Lévy process
X = (Xt )t>0 . This will also lead to a proof of the Lévy–Khintchine formula.
Our approach uses some Fourier analysis. We denote by $C_c^\infty(\mathbb{R}^d)$ and $\mathcal{S}(\mathbb{R}^d)$ the smooth, compactly supported functions and the smooth, rapidly decreasing 'Schwartz functions'.1 The Fourier transform is denoted by
\[
\widehat{f}(\xi)=\mathcal{F}f(\xi):=(2\pi)^{-d}\int_{\mathbb{R}^d}f(x)\,e^{-i\xi\cdot x}\,dx,\qquad f\in L^1(dx).
\]

Observe that F f is chosen in such a way that the characteristic function becomes the inverse
Fourier transform.
We have seen in Proposition 2.3 and its Corollaries 2.4 and 2.5 that X is completely character-
ized by the characteristic exponent $\psi:\mathbb{R}^d\to\mathbb{C}$,
\[
E\,e^{i\xi\cdot X_t}=\bigl(E\,e^{i\xi\cdot X_1}\bigr)^t=e^{-t\psi(\xi)},\qquad t\ge 0,\ \xi\in\mathbb{R}^d.
\]


We need a few more properties of ψ which result from the fact that χ(ξ ) = e−ψ(ξ ) is a character-
istic function.

Lemma 6.1. Let χ(ξ ) be any characteristic function of a probability measure µ. Then

|χ(ξ + η) − χ(ξ )χ(η)|2 6 (1 − |χ(ξ )|2 )(1 − |χ(η)|2 ), ξ , η ∈ Rd . (6.1)

Proof. Since $\mu$ is a probability measure, we find from the definition of $\chi$
\[
\begin{aligned}
\chi(\xi+\eta)-\chi(\xi)\chi(\eta)
&=\iint\bigl(e^{ix\cdot\xi}e^{ix\cdot\eta}-e^{ix\cdot\xi}e^{iy\cdot\eta}\bigr)\,\mu(dx)\,\mu(dy)\\
&=\frac12\iint\bigl(e^{ix\cdot\xi}-e^{iy\cdot\xi}\bigr)\bigl(e^{ix\cdot\eta}-e^{iy\cdot\eta}\bigr)\,\mu(dx)\,\mu(dy).
\end{aligned}
\]
In the last equality we use that the integrand is symmetric in x and y, which allows us to in-
terchange the variables. Using the elementary formula |ei a − ei b |2 = 2 − 2 cos(b − a) and the

1 Tobe precise, f ∈ S(Rd ), if f ∈ C∞ (Rd ) and if supx∈Rd (1 + |x|N )|∂ α f (x)| 6 cN,α for any N ∈ N0 and any
multiindex α ∈ Nd0 .


Cauchy--Schwarz inequality yield
\[
\begin{aligned}
|\chi(\xi+\eta)-\chi(\xi)\chi(\eta)|
&\le\frac12\iint\bigl|e^{ix\cdot\xi}-e^{iy\cdot\xi}\bigr|\cdot\bigl|e^{ix\cdot\eta}-e^{iy\cdot\eta}\bigr|\,\mu(dx)\,\mu(dy)\\
&=\iint\sqrt{1-\cos(y-x)\cdot\xi}\;\sqrt{1-\cos(y-x)\cdot\eta}\,\mu(dx)\,\mu(dy)\\
&\le\sqrt{\iint\bigl(1-\cos(y-x)\cdot\xi\bigr)\mu(dx)\,\mu(dy)}\;\sqrt{\iint\bigl(1-\cos(y-x)\cdot\eta\bigr)\mu(dx)\,\mu(dy)}.
\end{aligned}
\]
This finishes the proof as
\[
\iint\cos\bigl((y-x)\cdot\xi\bigr)\,\mu(dx)\,\mu(dy)=\mathrm{Re}\Bigl(\int e^{iy\cdot\xi}\mu(dy)\int e^{-ix\cdot\xi}\mu(dx)\Bigr)=|\chi(\xi)|^2.
\]

Theorem 6.2. Let $\psi:\mathbb{R}^d\to\mathbb{C}$ be the characteristic exponent of a Lévy process. Then the function $\xi\mapsto\sqrt{|\psi(\xi)|}$ is subadditive and
\[
|\psi(\xi)|\le c_\psi\,(1+|\xi|^2),\qquad\xi\in\mathbb{R}^d. \tag{6.2}
\]

Proof. We use (6.1) with $\chi=e^{-t\psi}$, divide by $t>0$ and let $t\to 0$. Since $|\chi|=e^{-t\,\mathrm{Re}\,\psi}$, this gives
\[
|\psi(\xi+\eta)-\psi(\xi)-\psi(\eta)|^2\le 4\,\mathrm{Re}\,\psi(\xi)\,\mathrm{Re}\,\psi(\eta)\le 4|\psi(\xi)|\cdot|\psi(\eta)|.
\]
By the lower triangle inequality,
\[
|\psi(\xi+\eta)|-|\psi(\xi)|-|\psi(\eta)|\le 2\sqrt{|\psi(\xi)|}\,\sqrt{|\psi(\eta)|},
\]
and this is the same as subadditivity: $\sqrt{|\psi(\xi+\eta)|}\le\sqrt{|\psi(\xi)|}+\sqrt{|\psi(\eta)|}$.

In particular, $|\psi(2\xi)|\le 4|\psi(\xi)|$. For any $\xi\ne 0$ there is some integer $n=n(\xi)\in\mathbb{Z}$ such that $2^{n-1}\le|\xi|\le 2^n$, so
\[
|\psi(\xi)|=|\psi(2^n2^{-n}\xi)|\le\max\{1,2^{2n}\}\sup_{|\eta|\le 1}|\psi(\eta)|\le 2\sup_{|\eta|\le 1}|\psi(\eta)|\,(1+|\xi|^2).
\]

Lemma 6.3. Let $(X_t)_{t\ge 0}$ be a Lévy process and denote by $(A,D(A))$ its infinitesimal generator. Then $C_c^\infty(\mathbb{R}^d)\subset D(A)$.

Proof. Let $f\in C_c^\infty(\mathbb{R}^d)$. By definition, $P_tf(x)=E\,f(X_t+x)$. Using the differentiation lemma for parameter-dependent integrals (e.g. [54, Theorem 11.5] or [55, 12.2]) it is not hard to see that $P_t:\mathcal{S}(\mathbb{R}^d)\to\mathcal{S}(\mathbb{R}^d)$. Obviously,
\[
e^{-t\psi(\xi)}=E\,e^{i\xi\cdot X_t}=E^x e^{i\xi\cdot(X_t-x)}=e_{-\xi}(x)\,P_te_\xi(x) \tag{6.3}
\]
for $e_\xi(x):=e^{i\xi\cdot x}$. Recall that the Fourier transform of $f\in\mathcal{S}(\mathbb{R}^d)$ is again in $\mathcal{S}(\mathbb{R}^d)$. From
\[
P_tf=P_t\int\widehat{f}(\xi)\,e_\xi(\cdot)\,d\xi=\int\widehat{f}(\xi)\,P_te_\xi(\cdot)\,d\xi
\overset{(6.3)}{=}\int\widehat{f}(\xi)\,e_\xi(\cdot)\,e^{-t\psi(\xi)}\,d\xi \tag{6.4}
\]
we conclude that $\widehat{P_tf}=\widehat{f}\,e^{-t\psi}$. Hence,
\[
P_tf=\mathcal{F}^{-1}\bigl(\widehat{f}\,e^{-t\psi}\bigr). \tag{6.5}
\]
Consequently,
\[
\frac{\widehat{P_tf}-\widehat{f}}{t}=\frac{e^{-t\psi}-1}{t}\,\widehat{f}\xrightarrow[t\to 0]{}-\psi\widehat{f}
\qquad\overset{\widehat{f}\in\mathcal{S}(\mathbb{R}^d)}{\Longrightarrow}\qquad
\frac{P_tf(x)-f(x)}{t}\xrightarrow[t\to 0]{}g(x):=\mathcal{F}^{-1}\bigl(-\psi\widehat{f}\bigr)(x).
\]
Since $\psi$ grows at most polynomially (Theorem 6.2) and $\widehat{f}\in\mathcal{S}(\mathbb{R}^d)$, we see $\psi\widehat{f}\in L^1(dx)$ and, by the Riemann--Lebesgue lemma, $g\in C_\infty(\mathbb{R}^d)$. Using Theorem 5.12 it follows that $f\in D(A)$.
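Formula (6.5) also gives a practical recipe for computing $P_tf$ numerically. The following one-dimensional sketch (NumPy assumed; the symmetric $\alpha$-stable exponent $\psi(\xi)=|\xi|^\alpha$ and all grid parameters are illustrative choices, and numpy's FFT normalisation differs from the convention used above) evaluates $P_tf$ on a grid.

```python
import numpy as np

alpha, t = 1.5, 0.3                      # stable index and time (illustrative values)
x = np.linspace(-20, 20, 2**12, endpoint=False)
dx = x[1] - x[0]
f = np.exp(-x**2)                        # a test function in the Schwartz class

xi = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)   # angular frequencies on the grid
psi = np.abs(xi)**alpha                          # exponent of the symmetric alpha-stable process

# P_t f = F^{-1}( \hat f * e^{-t psi} ); numpy's fft/ifft pair plays the role of F, F^{-1}
Ptf = np.real(np.fft.ifft(np.fft.fft(f) * np.exp(-t * psi)))

print(f.max(), Ptf.max())                           # smoothing: the maximum decreases
print(np.isclose(Ptf.sum() * dx, f.sum() * dx))     # mass is preserved since psi(0) = 0
```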

Definition 6.4. Let L : C2b (Rd ) → Cb (Rd ) be a linear operator. Then

L(x, ξ ) := e−ξ (x)Lx eξ (x) (6.6)

is the symbol of the operator L = Lx , where eξ (x) := ei ξ ·x .

The proof of Lemma 6.3 actually shows that we can recover an operator $L$ from its symbol $L(x,\xi)$ if, say, $L:C_b^2(\mathbb{R}^d)\to C_b(\mathbb{R}^d)$ is continuous:2 Indeed, for all $u\in C_c^\infty(\mathbb{R}^d)$
\[
Lu(x)=L\int\widehat{u}(\xi)\,e_\xi(x)\,d\xi=\int\widehat{u}(\xi)\,L_xe_\xi(x)\,d\xi=\int\widehat{u}(\xi)\,L(x,\xi)\,e_\xi(x)\,d\xi=\mathcal{F}^{-1}\bigl(L(x,\cdot)\,\mathcal{F}u(\cdot)\bigr)(x).
\]

Example 6.5. A typical example would be the Laplace operator (i.e. the generator of a Brownian motion)
\[
\tfrac12\Delta f(x)=-\tfrac12\bigl(\tfrac1i\partial_x\bigr)^2f(x)=\int\widehat{f}(\xi)\bigl(-\tfrac12|\xi|^2\bigr)e^{i\xi\cdot x}\,d\xi,\qquad\text{i.e.}\quad L(x,\xi)=-\tfrac12|\xi|^2,
\]
or the fractional Laplacian of order $\tfrac12\alpha\in(0,1)$ which generates a rotationally symmetric $\alpha$-stable Lévy process
\[
-(-\Delta)^{\alpha/2}f(x)=\int\widehat{f}(\xi)\bigl(-|\xi|^\alpha\bigr)e^{i\xi\cdot x}\,d\xi,\qquad\text{i.e.}\quad L(x,\xi)=-|\xi|^\alpha.
\]
More generally, if $P(x,\xi)$ is a polynomial in $\xi$, then the corresponding operator is obtained by replacing $\xi$ by $\tfrac1i\nabla_x$ and formally expanding the powers.

Definition 6.6. An operator of the form


Z
L(x, D) f (x) = fb(ξ )L(x, ξ )ei x·ξ dξ , f ∈ S(Rd ), (6.7)

is called (if defined) a pseudo differential operator with (non-classical) symbol L(x, ξ ).
2 As usual, C2b (Rd ) is endowed with the norm kuk(2) = ∑06|α|62 k∂ α uk∞ .

Remark 6.7. The symbol of a Lévy process does not depend on x, i.e. L(x, ξ ) = L(ξ ). This
is a consequence of the spatial homogeneity of the process which is encoded in the translation
invariance of the semigroup (cf. (4.6) and Lemma 4.4):

Pt f (x) = E f (Xt + x) =⇒ Pt f (x) = ϑx (Pt f )(0) = Pt (ϑx f )(0)

where ϑx u(y) = u(y +x) is the shift operator. This property is obviously inherited by the generator,
i.e.
A f (x) = ϑx (A f )(0) = A(ϑx f )(0), f ∈ D(A).

As a matter of fact, the converse is also true: If L : C∞ c (R ) → C(R ) is a linear operator


d d

satisfying ϑx (L f ) = L(ϑx f ), then L f = f ∗ λ where λ is a distribution, i.e. a continuous linear


functional λ : C∞ d
c (R ) → R, cf. Theorem A.10.

Theorem 6.8. Let $(X_t)_{t\ge 0}$ be a Lévy process with generator $A$. Then
\[
Af(x)=l\cdot\nabla f(x)+\tfrac12\nabla\cdot Q\nabla f(x)+\int_{y\ne 0}\bigl(f(x+y)-f(x)-\nabla f(x)\cdot y\,\mathbf{1}_{(0,1)}(|y|)\bigr)\,\nu(dy) \tag{6.8}
\]
for any $f\in C_c^\infty(\mathbb{R}^d)$, where $l\in\mathbb{R}^d$, $Q\in\mathbb{R}^{d\times d}$ is a positive semidefinite matrix, and $\nu$ is a measure on $\mathbb{R}^d\setminus\{0\}$ such that $\int_{y\ne 0}\min\{1,|y|^2\}\,\nu(dy)<\infty$.

Equivalently, $A$ is a pseudo differential operator
\[
Au(x)=-\psi(D)u(x)=-\int\widehat{u}(\xi)\,\psi(\xi)\,e^{ix\cdot\xi}\,d\xi,\qquad u\in C_c^\infty(\mathbb{R}^d), \tag{6.9}
\]
whose symbol is the characteristic exponent $-\psi$ of the Lévy process. It is given by the Lévy--Khintchine formula
\[
\psi(\xi)=-\,i\,l\cdot\xi+\tfrac12\,\xi\cdot Q\xi+\int_{y\ne 0}\bigl(1-e^{iy\cdot\xi}+i\,\xi\cdot y\,\mathbf{1}_{(0,1)}(|y|)\bigr)\,\nu(dy) \tag{6.10}
\]
with the triplet $(l,Q,\nu)$ as above.

I learned the following proof from Francis Hirsch; it is based on arguments by Courrège [13]
and Herz [20]. The presentation below follows the version in Böttcher, Schilling & Wang [9,
Section 2.3].

Proof. The proof is divided into several steps.

1◦ We have seen in Lemma 6.3 that C∞


c (R ) ⊂ D(A).
d

2◦ Set A0 f := (A f )(0) for f ∈ C∞


c (R ). This is a linear functional on Cc . Observe that
d ∞

(PMP)
f ∈ C∞ d
c (R ), f > 0, f (0) = 0 ===⇒ A0 f > 0.

3◦ By 2◦ , f 7→ A00 f := A0 (| · |2 · f ) is a positive linear functional on C∞ d


c (R ). Therefore it is
bounded. Indeed, let f ∈ C∞ c (K) for a compact set K ⊂ R and let φ ∈ Cc (R ) be a cut-off
d ∞ d

function such that 1K 6 φ 6 1. Then

k f k∞ φ ± f > 0.

By linearity and positivity k f k∞ A00 φ ± A00 f > 0 which shows |A00 f | 6 CK k f k∞ with the
constant CK = A00 φ .
By Riesz' representation theorem, there exists a Radon measure3 $\mu$ such that
\[
A_0^0f=A_0\bigl(|\cdot|^2f\bigr)=\int f(y)\,\mu(dy)=\int|y|^2f(y)\,\underbrace{\frac{\mu(dy)}{|y|^2}}_{=:\nu(dy)}=\int|y|^2f(y)\,\nu(dy).
\]
This implies that
\[
A_0f_0=\int_{y\ne 0}f_0(y)\,\nu(dy)\qquad\text{for all }f_0\in C_c^\infty(\mathbb{R}^d\setminus\{0\});
\]

since any compact subset of Rd \ {0} is contained in an annulus BR (0) \ Bε (0), we have
supp f0 ∩ Bε (0) = 0/ for some sufficiently small ε > 0. The measure ν is a Radon measure
on $\mathbb{R}^d\setminus\{0\}$.

4° Let $f,g\in C_c^\infty(\mathbb{R}^d)$, $0\le f,g\le 1$, $\operatorname{supp}f\subset B_1(0)$, $\operatorname{supp}g\subset B_1(0)^c$ and $f(0)=1$. From
\[
\sup_{y\in\mathbb{R}^d}\bigl(\|g\|_\infty f(y)+g(y)\bigr)=\|g\|_\infty=\|g\|_\infty f(0)+g(0)
\]
and (PMP), it follows that $A_0(\|g\|_\infty f+g)\le 0$. Consequently,
\[
A_0g\le-\|g\|_\infty A_0f.
\]
If $g\uparrow 1-\mathbf{1}_{B_1(0)}$, then this shows
\[
\int_{|y|\ge 1}\nu(dy)\le-A_0f<\infty.
\]
Hence, $\int_{y\ne 0}(|y|^2\wedge 1)\,\nu(dy)<\infty$.

5° Let $f\in C_c^\infty(\mathbb{R}^d)$ and $\phi(y)=\mathbf{1}_{(0,1)}(|y|)$. Define
\[
S_0f:=\int_{y\ne 0}\bigl(f(y)-f(0)-y\cdot\nabla f(0)\phi(y)\bigr)\,\nu(dy). \tag{6.11}
\]
By Taylor's formula, for $|y|<1$ there is some $\theta\in(0,1)$ such that
\[
f(y)-f(0)-y\cdot\nabla f(0)\phi(y)=\frac12\sum_{k,l=1}^d\frac{\partial^2f(\theta y)}{\partial x_k\partial x_l}\,y_ky_l.
\]
3 A Radon measure on a topological space Eis a Borel measure which is finite on compact subsets of E and regular:
for all open sets µ(U) = supK⊂U µ(K) and for all Borel sets µ(B) = infU⊃B µ(U) (K, U are generic compact and open
sets, respectively).

Using the elementary inequality $2y_ky_l\le y_k^2+y_l^2\le|y|^2$, we obtain
\[
|f(y)-f(0)-y\cdot\nabla f(0)\phi(y)|\le
\begin{cases}
\frac14\sum_{k,l=1}^d\bigl\|\frac{\partial^2}{\partial x_k\partial x_l}f\bigr\|_\infty\,|y|^2, & |y|<1,\\[2pt]
2\|f\|_\infty, & |y|\ge 1,
\end{cases}
\quad\le\ 2\|f\|_{(2)}\,(|y|^2\wedge 1).
\]

This means that S0 defines a distribution (generalized function) of order 2.

6◦ Set L0 := A0 − S0 . The steps 2◦ and 5◦ show that A0 is a distribution of order 2. Moreover,


Z  
L0 f 0 = f0 (0) − y · ∇ f0 (0)φ (y) ν(dy) = 0
y6=0

for any f0 ∈ C∞ d
c (R ) with f 0 |Bε (0) = 0 for some ε > 0. Hence, supp(L0 ) ⊂ {0}.

Let us show that L0 is almost positive (also: ‘fast positiv’, ‘prèsque positif’):

f 0 ∈ C∞ d
c (R ), f0 (0) = 0, f0 > 0 =⇒ L0 f0 > 0. (PP)

Indeed: Pick 0 6 φn ∈ C∞ d
c (R \ {0}), φn ↑ 1Rd \{0} and let f 0 be as in (PP). Then

supp L0 ⊂{0}
L0 f0 = L0 [(1 − φn ) f0 )]
= A0 [(1 − φn ) f0 ] − S0 [(1 − φn ) f0 ]
Z
f0 (0)=0
= A0 [(1 − φn ) f0 ] − (1 − φn (y)) f0 (y) ν(dy)
∇ f0 (0)=0 y6=0
2◦
Z
> − (1 − φn (y)) f0 (y) ν(dy) −−−→ 0
(PMP) n→∞

by the monotone convergence theorem.

7◦ As in 5◦ we find with Taylor’s formula for f ∈ C∞


c (R ), supp f ⊂ K and φ ∈ Cc (R ) satis-
d ∞ d

fying 1K 6 φ 6 1

( f (y) − f (0) − ∇ f (0) · y) φ (y) 6 2k f k(2) |y|2 φ (y).

(As usual, k f k(2) = ∑06|α|62 k∂ α f k∞ .) Therefore,

2k f k(2) |y|2 φ (y) + f (0)φ (y) + ∇ f (0) · yφ (y) − f (y) > 0,

and (PP) implies

L0 f 6 f (0)L0 φ + |∇ f (0)|L0 (| · |φ ) + 2k f k(2) L0 (| · |2 φ ) 6 CK k f k(2) .

8° We have seen in 6° that $L_0$ is of order 2 and $\operatorname{supp}L_0\subset\{0\}$. Therefore,
\[
L_0f=\frac12\sum_{k,l=1}^d q_{kl}\,\frac{\partial^2f(0)}{\partial x_k\partial x_l}+\sum_{k=1}^d l_k\,\frac{\partial f(0)}{\partial x_k}-c\,f(0). \tag{6.12}
\]

We will show that (qkl )k,l is positive semidefinite. Set g(y) := (y·ξ )2 f (y) where f ∈ C∞ d
c (R )
is such that 1B1 (0) 6 f 6 1. By (PP), L0 g > 0. It is not difficult to see that this implies
d
∑ qkl ξk ξl > 0, for all ξ = (ξ1 , . . . , ξd ) ∈ Rd .
k,l=1

9° Since Lévy processes and their semigroups are invariant under translations, cf. Remark 6.7, we get $Af(x)=A_0[f(x+\cdot)]$. If we replace $f$ by $f(x+\cdot)$, we get
\[
Af(x)=c\,f(x)+l\cdot\nabla f(x)+\frac12\nabla\cdot Q\nabla f(x)+\int_{y\ne 0}\bigl(f(x+y)-f(x)-y\cdot\nabla f(x)\,\mathbf{1}_{(0,1)}(|y|)\bigr)\,\nu(dy). \tag{6.8$'$}
\]

We will show in the next step that c = 0.

10◦ So far, we have seen in 5◦ , 7◦ and 9◦ that

kA f k∞ 6 Ck f k(2) = C ∑ k∂ α f k∞ , f ∈ C∞ d
c (R ),
|α|62

which means that A (has an extension which) is continuous as an operator from C2b (Rd ) to
Cb (Rd ). Therefore, (A, C∞ d
c (R )) is a pseudo differential operator with symbol

−ψ(ξ ) = e−ξ (x)Ax eξ (x), eξ (x) = ei ξ ·x .

Inserting eξ into (6.80 ) proves (6.10) and, as ψ(0) = 0, c = 0.

Remark 6.9. In step 8◦ of the proof of Theorem 6.8 one can use the (PMP) to show that the
coefficient c appearing in (6.80 ) is positive. For this, let ( fn )n∈N ⊂ C∞ d
c (R ), f n ↑ 1 and f n |B1 (0) = 1.
By (PMP), A0 fn 6 0. Moreover, ∇ fn (0) = 0 and, therefore,
Z
S0 f n = − (1 − fn (y)) ν(dy) −−−→ 0.
n→∞

Consequently,
lim sup L0 fn = lim sup(A0 fn − S0 fn ) 6 0 =⇒ c > 0.
n→∞ n→∞
For Lévy processes we have c = ψ(0) = 0 and this is a consequence of the infinite life-time of
the process:
P(Xt ∈ Rd ) = Pt 1 = 1 for all t > 0,
Rt
and we can use the formula Pt f − f = A 0 Ps f ds, cf. Lemma 5.4, for f ≡ 1 to show that

c = A1 = 0 ⇐⇒ Pt 1 = 1.

Definition 6.10. A Lévy measure is a Radon measure $\nu$ on $\mathbb{R}^d\setminus\{0\}$ such that $\int_{y\ne 0}(|y|^2\wedge 1)\,\nu(dy)<\infty$. A Lévy triplet is a triplet $(l,Q,\nu)$ consisting of a vector $l\in\mathbb{R}^d$, a positive semi-definite matrix $Q\in\mathbb{R}^{d\times d}$ and a Lévy measure $\nu$.

The proof of Theorem 6.8 incidentally shows that the Lévy triplet defining the exponent (6.10)
or the generator (6.8) is unique. The following corollary can easily be checked using the represen-
tation (6.8).

Corollary 6.11. Let $A$ be the generator and $(P_t)_{t\ge 0}$ the semigroup of a Lévy process. Then the Lévy triplet is given by
\[
\begin{aligned}
\int f_0\,d\nu &= Af_0(0)=\lim_{t\to 0}\frac{P_tf_0(0)}{t} &&\forall f_0\in C_c^\infty(\mathbb{R}^d\setminus\{0\}),\\
l_k &= A\phi_k(0)-\int y_k\bigl(\phi(y)-\mathbf{1}_{(0,1)}(|y|)\bigr)\,\nu(dy), && k=1,\dots,d,\\
q_{kl} &= A(\phi_k\phi_l)(0)-\int_{y\ne 0}\phi_k(y)\phi_l(y)\,\nu(dy), && k,l=1,\dots,d,
\end{aligned} \tag{6.13}
\]
where $\phi\in C_c^\infty(\mathbb{R}^d)$ satisfies $\mathbf{1}_{B_1(0)}\le\phi\le 1$ and $\phi_k(y):=y_k\phi(y)$. In particular, $(l,Q,\nu)$ is uniquely determined by $A$ or the characteristic exponent $\psi$.

We will see an alternative uniqueness proof in the next Chapter 7.


Remark 6.12. Setting $p_t(dy)=P(X_t\in dy)$, we can recast the formula for the Lévy measure as
\[
\nu(dy)=\lim_{t\to 0}\frac{p_t(dy)}{t}\qquad(\text{vague limit of measures on }\mathbb{R}^d\setminus\{0\}).
\]
Moreover, a direct calculation using the Lévy--Khintchine formula (6.13) gives the following alternative representation for the $q_{kl}$:
\[
\frac12\,\xi\cdot Q\xi=\lim_{n\to\infty}\frac{\psi(n\xi)}{n^2},\qquad\xi\in\mathbb{R}^d.
\]
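For readers who want to see the Lévy--Khintchine formula (6.10) 'in numbers': with a one-dimensional triplet $(l,Q,\nu)$ whose Lévy measure has a density, $\psi(\xi)$ can be evaluated by quadrature. A minimal sketch (NumPy and SciPy assumed; the drift, the Gaussian part and the gamma-type density $\nu(dy)=y^{-1}e^{-y}\,dy$ on $(0,\infty)$ are arbitrary illustrative choices):

```python
import numpy as np
from scipy.integrate import quad

l, Q = 0.5, 1.0                               # drift and Gaussian part (illustrative values)
nu = lambda y: np.exp(-y) / y                 # a Lévy measure density on (0, inf), gamma-type example

def psi(xi):
    """Lévy-Khintchine exponent (6.10) for the triplet (l, Q, nu), by numerical quadrature."""
    # integrable singularity at 0: the integrand behaves like xi^2 * y / 2 near the origin
    re, _ = quad(lambda y: (1 - np.cos(xi * y)) * nu(y), 1e-12, np.inf, limit=200)
    im, _ = quad(lambda y: (-np.sin(xi * y) + xi * y * (y < 1)) * nu(y), 1e-12, np.inf, limit=200)
    return complex(0.5 * Q * xi**2 + re, -l * xi + im)

for xi in (0.0, 0.5, 1.0, 2.0):
    print(xi, psi(xi))                        # psi(0) = 0 and Re psi >= 0, as it must be
```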
7. Construction of Lévy processes

Our starting point is now the Lévy--Khintchine formula for the characteristic exponent $\psi$ of a Lévy process
\[
\psi(\xi)=-\,i\,l\cdot\xi+\frac12\,\xi\cdot Q\xi+\int_{y\ne 0}\bigl(1-e^{iy\cdot\xi}+i\,\xi\cdot y\,\mathbf{1}_{(0,1)}(|y|)\bigr)\,\nu(dy) \tag{7.1}
\]
where (l, Q, ν) is a Lévy triplet in the sense of Definition 6.10; a proof of (7.1) is contained in
Theorem 6.8, but the exposition below is independent of this proof, see however Remark 7.7 at
the end of this chapter.
What will be needed is that a compound Poisson process is a Lévy process with càdlàg paths and characteristic exponent of the form
\[
\phi(\xi)=\int_{y\ne 0}\bigl(1-e^{iy\cdot\xi}\bigr)\,\rho(dy) \tag{7.2}
\]

(ρ is any finite measure), see Example 3.2.d), where ρ(dy) = λ · µ(dy).
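A compound Poisson path is also easy to simulate, which gives a quick Monte Carlo check of (7.2). A minimal sketch (NumPy assumed; the intensity $\lambda$ and the standard normal jump distribution $\mu$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t = 3.0, 2.0                       # jump intensity and time horizon (illustrative)

def compound_poisson(t):
    """one sample of X_t: the sum of N_t iid N(0,1) jumps, N_t ~ Poisson(lam*t)"""
    n = rng.poisson(lam * t)
    return rng.standard_normal(n).sum()

xi = 1.2
samples = np.array([compound_poisson(t) for _ in range(20000)])
emp = np.mean(np.exp(1j * xi * samples))                 # empirical E exp(i xi X_t)
phi = lam * (1 - np.exp(-xi**2 / 2))                     # (7.2) with rho = lam * N(0,1)
print(emp, np.exp(-t * phi))                             # the two numbers should be close
```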


Let $\nu$ be a Lévy measure and denote by $A_{a,b}=\{y: a\le|y|<b\}$ an annulus. Since we have $\int_{y\ne 0}\bigl(|y|^2\wedge 1\bigr)\,\nu(dy)<\infty$, the measure $\rho(B):=\nu(B\cap A_{a,b})$ is a finite measure, and there is a corresponding compound Poisson process. Adding a drift with $l=-\int y\,\rho(dy)$ shows that for every exponent
\[
\psi^{a,b}(\xi)=\int_{a\le|y|<b}\bigl(1-e^{iy\cdot\xi}+i\,y\cdot\xi\bigr)\,\nu(dy) \tag{7.3}
\]

there is some Lévy process X a,b = (Xta,b )t>0 . In fact,

Lemma 7.1. Let $0<a<b\le\infty$ and $\psi^{a,b}$ be given by (7.3). Then the corresponding Lévy process $X^{a,b}$ is an $L^2(P)$-martingale with càdlàg paths such that
\[
E\bigl[X_t^{a,b}\bigr]=0\qquad\text{and}\qquad E\bigl[X_t^{a,b}\cdot(X_t^{a,b})^\top\bigr]=t\Bigl(\int_{a\le|y|<b}y_ky_l\,\nu(dy)\Bigr)_{k,l}.
\]

Proof. Set Xt := Xta,b , ψ := ψ a,b and Ft := σ (Xr , r 6 t). Using the differentiation lemma for
parameter-dependent integrals we see that ψ is twice continuously differentiable and
∂ 2 ψ(0)
Z
∂ ψ(0)
=0 and = yk yl ν(dy).
∂ ξk ∂ ξk ∂ ξl a6|y|<b

Since the characteristic function e−tψ(ξ ) is twice continuously differentiable, X has first and second
moments, cf. Theorem A.2, and these can be obtained by differentiation:
(k) 1 ∂ ∂ ψ(0)
EXt = E ei ξ ·Xt = it =0
i ∂ ξk ξ =0 ∂ ξk


and

(k) (l) ∂2 t∂ 2 ψ(0) t∂ ψ(0) t∂ ψ(0)


E(Xt Xt ) = − E ei ξ ·Xt = −
∂ ξk ∂ ξl ξ =0 ∂ ξk ∂ ξl ∂ ξk ∂ ξl
Z
=t yk yl ν(dy).
a6|y|<b

The martingale property now follows from the independence of the increments: Let s 6 t, then
(L2)
E(Xt | Fs ) = E(Xt − Xs + Xs | Fs ) = E(Xt − Xs | Fs ) + Xs = E(Xt−s ) + Xs = Xs .
(L1)

We will use the processes from Lemma 7.1 as main building blocks for the Lévy process. For
this we need some preparations.

Lemma 7.2. Let (Xtk )t>0 be Lévy processes with characteristic exponents ψk . If X 1 ⊥
⊥ X 2 , then
X := X 1 + X 2 is a Lévy process with characteristic exponent ψ = ψ1 + ψ2 .

Proof. Set Ftk := σ (Xsk , s 6 t) and Ft = σ (Ft1 , Ft2 ). Since X 1 ⊥ ⊥ X 2 , we get for F = F1 ∩ F2 ,
Fk ∈ Fsk ,
   1 1 2 2

E ei ξ ·(Xt −Xs ) 1F = E ei ξ ·(Xt −Xs ) 1F1 · ei ξ ·(Xt −Xs ) 1F2
 1 1
  2 2

= E ei ξ ·(Xt −Xs ) 1F1 E ei ξ ·(Xt −Xs ) 1F2
(L2) −(t−s)ψ1 (ξ )
= e P(F1 ) · e−(t−s)ψ2 (ξ ) P(F2 )

= e−(t−s)(ψ1 (ξ )+ψ2 (ξ )) P(F).

As {F1 ∩ F2 : Fk ∈ Fsk } is a ∩-stable generator of Fs , we find


 
E ei ξ ·(Xt −Xs ) Fs = e−(t−s)ψ(ξ ) .

Observe that Fs could be larger than the canonical filtration FsX . Therefore, we first condition
w.r.t. E(· · · | FsX ) and then use Theorem 3.1, to see that X is a Lévy process with characteristic
exponent ψ = ψ1 + ψ2 .

Lemma 7.3. Let (X n )n∈N be a sequence of Lévy processes with characteristic exponents ψn .
Assume that Xtn → Xt converges in probability for every t > 0. If

either: the convergence is uniform in probability, i.e.


 
n
∀ε > 0 ∀t > 0 : lim P sup |Xs − Xs | > ε = 0,
n→∞ s6t

or: the limiting process X has càdlàg paths,

then X is a Lévy process with characteristic exponent ψ := limn→∞ ψn .



Proof. Let 0 = t0 < t1 < · · · < tm and ξ1 , . . . , ξm ∈ Rd . Since the X n are Lévy processes,
" #
m m h i
(L2),(L1)
E exp i ∑ ξk · (Xtnk − Xtnk−1 ) = ∏ E exp i ξk · Xtnk −tk−1
k=1 k=1

and, because of convergence in probability, this equality is inherited by the limiting process X.
This proves that X has independent (L20 ) and stationary (L1) increments.
The condition (L3) follows either from the uniformity of the convergence in probability or the
càdlàg property. Thus, X is a Lévy process. From
n
lim E ei ξ ·X1 = E ei ξ ·X1
n→∞

we get that the limit limn→∞ ψn = ψ exists.

Lemma 7.4 (Notation of Lemma 7.1). Let (an )n∈N be a sequence a1 > a2 > . . . decreasing to zero
and assume that the processes (X an+1 ,an )n∈N are independent Lévy processes with characteristic
exponents ψan+1 ,an . Then X := ∑∞ an+1 ,an is a Lévy process with exponent ψ := ∞ ψ
n=1 X ∑n=1 an+1 ,an
2
and càdlàg paths. Moreover, X is an L (P)-martingale.

Proof. Lemmas 7.1, 7.2 show that X an+m ,an = ∑m k=1 X


an+k ,an+k−1 is a Lévy process with characteristic

exponent ψan+m ,an = ∑m


k=1 ψan+k ,an+k−1 , and X
an+m ,an is an L2 (P)-martingale. By Doob’s inequality

and Lemma 7.1


 
| 6 4 E |Xtan+m ,an |2
an+m ,an 2

E sup |Xs
s6t
Z
dom. convergence
= 4t y2 ν(dy) −−−−−−−−−→ 0.
an+m 6|y|<an m,n→∞

Hence, the limit X = limn→∞ X an ,a1 exists (uniformly in t) in L2 , i.e. X is an L2 (P)-martingale;


since the convergence is also uniform in probability, Lemma 7.3 shows that X is a Lévy process
with exponent ψ = ∑∞ n=1 ψan+1 ,an . Taking a uniformly convergent subsequence, we also see that
the limit inherits the càdlàg property from the approximating Lévy processes X an ,a1 .

We can now prove the main result of this chapter.

Theorem 7.5. Let (l, Q, ν) be a Lévy triplet and ψ be given by (7.1). Then there exists a Lévy
process X with càdlàg paths and characteristic exponent ψ.

Proof. Because of Lemma 7.1, 7.2 and 7.4 we can construct X piece by piece.

1◦ Let (Wt )t>0 be a Brownian motion and set


p 1
Xtc := tl + and ψc (ξ ) := − i l · ξ + ξ · Qξ .
QWt
2

2◦ Write Rd \ {0} = ∞
S
·  1 1
n=0 An with A0 := {|y| > 1} and An := n+1 6 |y| < n ; set λn := ν(An )
and µn := ν(· ∩ An )/ν(An ).

3◦ Construct, as in Example 3.2.d), a compound Poisson process comprising the large jumps
Z
Xt0 := Xt1,∞ 1 − ei y·ξ ν(dy)
 
and ψ0 (ξ ) :=
16|y|<∞

and compensated compound Poisson processes taking account of all small jumps
1
Z
a ,a
Xtn := Xt n+1 n , 1 − ei y·ξ + i y · ξ ν(dy).
 
an := and ψn (ξ ) :=
n An

We can construct the processes X n stochastically independent (just choose independent jump
time processes and independent iid jump heights when constructing the compound Poisson
processes) and independent of the Wiener process W .

4◦ Setting ψ = ψ0 + ψc + ∑∞n=1 ψn , Lemma 7.2 and 7.4 prove the theorem. Since all approx-
imating processes have càdlàg paths, this property is inherited by the sums and the limit
(Lemma 7.4).

The proof of Theorem 7.5 also implies the following pathwise decomposition of a Lévy process.
We write ∆Xt := Xt − Xt− for the jump at time t. From the construction we know that
[1,∞)
(large jumps) Jt = ∑ ∆Xs 1[1,∞) (|∆Xs |) (7.4)
s6t
[1/n,1)
(small jumps) Jt = ∑ ∆Xs 1[1/n,1) (|∆Xs |) (7.5)
s6t
[1/n,1) [1/n,1) [1/n,1)
(compensated small jumps) Jet = Jt − EJt (7.6)
Z
= ∑ ∆Xs 1[1/n,1) (|∆Xs |) − t y ν(dy).
s6t 1
n 6|y|<1

are Lévy processes and J [1,∞) ⊥


⊥ Je[1/n,1) .

Corollary 7.6. Let $\psi$ be a characteristic exponent given by (7.1) and let $X$ be the Lévy process constructed in Theorem 7.5. Then
\[
X_t=\underbrace{\underbrace{\sqrt{Q}\,W_t}_{\text{continuous Gaussian}}+\underbrace{\lim_{n\to\infty}\Bigl(\sum_{s\le t}\Delta X_s\,\mathbf{1}_{[1/n,1)}(|\Delta X_s|)-t\int_{[1/n,1)}y\,\nu(dy)\Bigr)}_{\text{pure jump part}}}_{=:M_t,\ L^2\text{-martingale}}
\;+\;\underbrace{t\,l+\sum_{s\le t}\Delta X_s\,\mathbf{1}_{[1,\infty)}(|\Delta X_s|)}_{=:A_t,\ \text{bounded variation}}
\]

where all appearing processes are independent.

Proof. The decomposition follows directly from the construction in Theorem 7.5. By Lemma 7.4,
limn→∞ X 1/n,1 is an L2 (P)-martingale, and since the (independent!) Wiener process W is also an
L2 (P)-martingale, so is their sum M.

The paths t 7→ At (ω) are a.s. of bounded variation since, by construction, on any time-interval
[0,t] there are Nt0 (ω) jumps of size > 1. Since Nt0 (ω) < ∞ a.s., the total variation of At (ω) is less
or equal than |l|t + ∑s6t |∆Xs |1[1,∞) (|∆Xs |) < ∞ a.s.
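The decomposition of Corollary 7.6 translates directly into a simulation recipe: Gaussian part plus big jumps plus compensated small jumps down to a cut-off. The sketch below (NumPy assumed; the one-dimensional stable-type Lévy density $y^{-1-\alpha}$ on $(0,\infty)$, the cut-off $\varepsilon$ and all parameters are illustrative choices, and jumps below $\varepsilon$ are simply discarded rather than approximated) produces one approximate sample of $X_t$.

```python
import numpy as np

rng = np.random.default_rng(2)
t, l, sigma = 1.0, 0.1, 0.5                    # time, drift, sqrt(Q) (illustrative)
alpha, eps = 1.2, 1e-3                         # stable index and small-jump cut-off
c = lambda a, b: (a**(-alpha) - b**(-alpha)) / alpha   # nu([a,b)) for nu(dy) = y^{-1-alpha} dy

def jumps(a, b):
    """jumps with size in [a,b) up to time t: Poisson number, sizes by inverse-CDF sampling"""
    n = rng.poisson(t * c(a, b))
    u = rng.uniform(size=n)
    return (a**(-alpha) - u * (a**(-alpha) - b**(-alpha)))**(-1 / alpha)

big = jumps(1.0, np.inf).sum()                 # jumps of size >= 1, not compensated
small = jumps(eps, 1.0)
compensator = t * (eps**(1 - alpha) - 1.0) / (alpha - 1)   # t * int_eps^1 y nu(dy), alpha != 1
X_t = t * l + sigma * np.sqrt(t) * rng.standard_normal() + big + (small.sum() - compensator)
print(X_t)
```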

Remark 7.7. A word of caution: Theorem 7.5 associates with any ψ given by the Lévy–Khintchi-
ne formula (7.1) a Lévy process. Unless we know that all characteristic exponents are of this form
(this was proved in Theorem 6.8), it does not follow that we have constructed all Lévy processes.
On the other hand, Theorem 7.5 shows that the Lévy triplet determining ψ is unique. Indeed,
assume that (l, Q, ν) and (l 0 , Q0 , ν 0 ) are two Lévy triplets which yield the same exponent ψ. Now
we can associate, using Theorem 7.5, with each triplet a Lévy process X and X 0 such that
0
E ei ξ ·Xt = e−tψ(ξ ) = E ei ξ ·Xt .

Thus, X ∼ X 0 and so these processes have (in law) the same pathwise decomposition, i.e. the same
drift, diffusion and jump behaviour. This, however, means that (l, Q, ν) = (l 0 , Q0 , ν 0 ).
8. Two special Lévy processes

We will now study the structure of the paths of a Lévy process. We begin with two extreme cases:
Lévy processes which only grow by jumps of size 1 and Lévy processes with continuous paths.
Throughout this chapter we assume that all paths [0, ∞) 3 t 7→ Xt (ω) are right-continuous with
finite left-hand limits (càdlàg). This is a bit stronger than (L3), but it is always possible to
construct a càdlàg version of a Lévy process (see the discussion on page 13). This allows us to
consider the jumps of the process X

∆Xt := Xt − Xt− = Xt − lim Xs .


s↑t

Theorem 8.1. Let X be a one-dimensional Lévy process which moves only by jumps of size 1.
Then X is a Poisson process.

Proof. Set FtX := σ (Xs , s 6 t) and let T1 = inf{t > 0 : ∆Xt = 1} be the time of the first jump.
Since {T1 > t} = {Xt = 0} ∈ FtX , T1 is a stopping time.
Let T0 = 0 and Tk = inf{t > Tk−1 : ∆Xt = 1}, be the time of the kth jump; this is also a stopping
time. By the Markov property ((4.4) and Lemma 4.4),

P(T1 > s + t) = P(T1 > s, T1 > s + t)


= E 1{T1 >s} PXs (T1 > t)
 

= E 1{T1 >s} P0 (T1 > t)


 

= P(T1 > s)P(T1 > t)

where we use that Xs = 0 if T1 > s (the process hasn’t yet moved!) and P = P0 .
Since t 7→ P(T1 > t) is right-continuous, P(T1 > t) = exp[t log P(T1 > 1)] is the unique solution
of this functional equation (Theorem A.1). Thus, the sequence of inter-jump times σk := Tk −Tk−1 ,
k ∈ N, is an iid sequence of exponential times. This follows immediately from the strong Markov
property (Theorem 4.12) for Lévy processes and the observation that

Tk+1 − Tk = T1Y where Y = (Yt+Tk −YTk )t>0

and T1Y is the first jump time of the process Y .


Obviously, Xt = ∑∞ k=1 1[0,t] (Tk ), Tk = σ1 + · · · + σk , and Example 3.2.c) (and Theorem 3.4) show
that X is a Poisson process.
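Theorem 8.1 also suggests the standard way to simulate a Poisson process: sum iid exponential inter-jump times. A minimal check (NumPy assumed; the intensity is an arbitrary choice) that the counts at time $t$ have Poisson mean and variance $\lambda t$:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, runs = 2.0, 3.0, 50000

def N_t():
    """count the jumps up to time t when the inter-jump times are iid Exp(lam)"""
    s, n = 0.0, 0
    while True:
        s += rng.exponential(1 / lam)
        if s > t:
            return n
        n += 1

counts = np.array([N_t() for _ in range(runs)])
print(counts.mean(), counts.var(), lam * t)   # mean and variance of Poisson(lam*t) are both lam*t
```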

A Lévy process with uniformly bounded jumps admits moments of all orders.


Lemma 8.2. Let (Xt )t>0 be a Lévy process such that |∆Xt (ω)| 6 c for all t > 0 and some constant
c > 0. Then E(|Xt | p ) < ∞ for all p > 0.

Proof. Let FtX := σ (Xs , s 6 t) and define the stopping times



τ0 := 0, τn := inf t > τn−1 : |Xt − Xτn−1 | > c .

Since X has càdlàg paths, τ0 < τ1 < τ2 < . . . . Let us show that τ1 < ∞ a.s. For fixed t > 0 and
n ∈ N we have

P(τ1 = ∞) 6 P(τ1 > nt) 6 P(|Xkt − X(k−1)t | 6 2c, ∀k = 1, . . . , n)


n
(L2) (L1)
=0 ∏ P(|Xkt − X(k−1)t | 6 2c) = P(|Xt | 6 2c)n .
(L2 ) k=1

Letting n → ∞ we see that P(τ1 = ∞) = 0 if P(|Xt | 6 2c) < 1 for some t > 0. (In the alternative
case, we have P(|Xt | 6 2c) = 1 for all t > 0 which makes the lemma trivial.)
By the strong Markov property (Theorem 4.12) τn − τn−1 ∼ τ1 and τn − τn−1 ⊥ ⊥ FτXn−1 , i.e. (τn −
τn−1 )n∈N is an iid sequence. Therefore,
n
E e−τn = E e−τ1 = qn

for some q ∈ [0, 1). From the very definition of the stopping times we infer
n n 
|Xt∧τn | 6 ∑ |Xτ k
− Xτk−1 | 6 ∑ |∆Xτk | + |Xτk − − Xτk−1 | 6 2nc.
k=1 k=1 | {z } | {z }
6c 6c

Thus, |Xt | > 2nc implies that τn < t, and by Markov’s inequality

P(|Xt | > 2nc) 6 P(τn < t) 6 et E e−τn = et qn .

Finally,

E |Xt | p = |Xt | p 1{2nc<|Xt |62(n+1)c}
 
∑E
n=0
∞ ∞
6 (2c) p ∑ (n + 1) p P(|Xt | > 2nc) 6 (2c) p et ∑ (n + 1) p qn < ∞.
n=0 n=0

Recall that a Brownian motion (Wt )t>0 on Rd is a Lévy process such that Wt is a normal random
variable with mean 0 and covariance matrix t id. We will need Paul Lévy’s characterization of
Brownian motion which we state without proof. An elementary proof can be found in [56, Chapter
9.4].

Theorem 8.3 (Lévy). Let M = (Mt , Ft ), M0 = 0, be a one-dimensional martingale with contin-


uous sample paths such that (Mt2 − t, Ft )t>0 is also a martingale. Then M is a one-dimensional
standard Brownian motion.

Theorem 8.4. Let (Xt )t>0 be a Lévy process in Rd whose sample paths are a.s. continuous. Then

Xt ∼ tl + QWt where l ∈ Rd , Q is a positive semidefinite symmetric matrix, and W is a standard
Brownian motion in Rd .

We will give two proofs of this result.

Proof (using Theorem 8.3). By Lemma 8.2, the process (Xt )t>0 has moments of all orders. There-
ξ
fore, Mt := Mt := ξ · (Xt − EXt ) exists for any ξ ∈ Rd and is a martingale for the canonical
filtration Ft := σ (Xs , s 6 t). Indeed, for all s 6 t
(L2)
E(Mt | Fs ) = E(Mt − Ms | Fs ) − Ms = EMt−s + Ms = Ms .
(L1)

Moreover

E(Mt2 − Ms2 | Fs ) = E((Mt − Ms )2 + 2Ms (Mt − Ms ) | Fs )


(L2)
= E((Mt − Ms )2 ) + 2Ms E(Mt − Ms )
Lemma 3.10
= (t − s)EM12 = (t − s)VM1 ,

and so (Mt2 − tVM1 )t>0 and (Mt )t>0 are martingales with continuous paths.
Now we can use Theorem 8.3 and deduce that ξ · (Xt − EXt ) is a one-dimensional Brownian
motion with variance ξ · Qξ where tQ is the covariance matrix of Xt (cf. the proof of Lemma 7.1

or Lemma 3.10). Thus, Xt − EXt = QWt where Wt is a d-dimensional standard Brownian motion.
Finally, EXt = tEX1 =: tl.

Standard proof (using the CLT). Fix ξ ∈ Rd and set M(t) := ξ · (Xt − EXt ). Since X has moments
of all orders, M is well-defined and it is again a Lévy process. Moreover,

EM(t) = 0 and tσ 2 = VM(t) = E[(ξ · (Xt − EXt ))2 ] = tξ · Qξ

where Q is the covariance matrix of X, cf. the proof of Lemma 7.1. We proceed as in the proof of
the CLT: Using a Taylor expansion we get
n  n
i M(t)
0 
i ∑nk=1 [M(tk/n)−M(t(k−1)/n)] (L2 ) i M(t/n) 1 2
Ee = Ee = Ee = 1 − EM (t/n) + Rn .
(L1) 2

The remainder term Rn is estimated by 16 E|M 3 ( nt )|. If we can show that |Rn | 6 ε nt for large
n = n(ε) and any ε > 0, we get because of EM 2 ( nt ) = nt σ 2

t n
 
1 2 1 2 1 2
Ee i M(t)
= lim 1 − (σ + 2ε) = e− 2 (σ +2ε)t −−−→ e− 2 σ t .
n→∞ 2 n ε→0

This shows that ξ · (Xt − EXt ) is a centered Gaussian random variable with variance σ 2 . Since
EXt = tEX1 we conclude that Xt is Gaussian with mean tl and covariance tQ.

We will now estimate E|M 3 ( nt )|. For every ε > 0 we can use the uniform continuity of s 7→ M(s)
on [0,t] to get
lim max |M( nk t) − M( k−1
n t)| = 0.
n→∞ 16k6n

Thus, we have for all ε > 0


 
1 = lim P max |M( nk t) − M( k−1
n t)| 6 ε
n→∞ 16k6n
 n 
\
k k−1
= lim P |M( n t) − M( n t)| 6 ε
n→∞
k=1
n
(L2)
= lim ∏ P(|M( nt )| 6 ε)
(L1) n→∞
k=1
n
= lim 1 − P(|M( nt )| > ε)

n→∞

6 lim e−n P(|M(t/n)|>ε) 6 1


n→∞

where we use the inequality 1 + x 6 ex . This proves limn→∞ n P(|M(t/n)| > ε) = 0. Therefore,
Z
E|M 3 ( nt )| 6 εEM 2 ( nt ) + |M 3 ( nt )| dP
|M(t/n)|>ε
t
q q
6 ε σ 2 + P(|M( nt )| > ε) EM 6 ( nt ).
n
It is not hard to see that EM 6 (s) = a1 s + · · · + a6 s6 (differentiate Eei uM(s) = e−sψ(u) six times at
u = 0), and so
s
t t P(|M(t/n)| > ε) t
E|M 3 ( nt )| 6 ε σ 2 + c = εσ 2 + o(1)

n n t/n n

We close this chapter with Paul Lévy’s construction of a standard Brownian motion (Wt )t>0 .
Since W is a Lévy process which has the Markov property, it is enough to construct a Brownian
motion W (t) only for t ∈ [0, 1], then produce independent copies (W n (t))t∈[0,1] , n = 0, 1, 2, . . . , and
join them continuously:

W 0 (t), t ∈ [0, 1),
Wt :=
W 0 (1) + · · · +W n−1 (1) +W n (t − n), t ∈ [n, n + 1).

Since each Wt is normally distributed with mean 0 and variance t, we will get a Lévy process with
characteristic exponent 21 ξ 2 , ξ ∈ R. In the same vein we get a d-dimensional Brownian motion
(1) (d)
by making a vector (Wt , . . . ,Wt )t>0 of d independent copies of (Wt )t>0 . This yields a Lévy
process with exponent 12 (ξ12 + · · · + ξd2 ), ξ1 , . . . , ξd ∈ R.
Denote a one-dimensional normal distribution with mean m and variance σ 2 as N(m, σ 2 ). The
motivation for the construction is the observation that a Brownian motion satisfies the following
mid-point law (cf. [56, Chapter 3.4]):

P(W(s+t)/2 ∈ • | Ws = x,Wt = y) = N 12 (x + y), 41 (t − s) , s 6 t, x, y ∈ R.




Figure 8.1.: Interpolation of order four in Lévy’s construction of Brownian motion.

This can be turned into the following construction method:

Algorithm. Set W (0) = 0 and let W (1) ∼ N(0, 1). Let n > 1 and assume that the random variables
$W(k2^{-n})$, $k=1,\dots,2^n-1$, have already been constructed. Then
\[
W(l2^{-n-1}):=
\begin{cases}
W(k2^{-n}), & l=2k,\\
\tfrac12\bigl(W(k2^{-n})+W((k+1)2^{-n})\bigr)+\Gamma_{2^n+k}, & l=2k+1,
\end{cases}
\]
where $\Gamma_{2^n+k}$ is an independent (of everything else) $N(0,2^{-n}/4)$ Gaussian random variable, cf. Figure 8.1. In-between the nodes we use piecewise linear interpolation:

W2n (t, ω) := Linear interpolation of W (k2−n , ω), k = 0, 1, . . . , 2n ,



n > 1.

At the dyadic points t = k2− j we get the ‘true’ value of W (t, ω), while the linear interpolation
is an approximation, see Figure 8.1.
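The algorithm is easy to implement. The following sketch (NumPy assumed; the refinement depth is an arbitrary choice) refines the dyadic grid level by level, exactly as described above: keep the old grid values, and insert at each midpoint the average of the neighbours plus an independent $N(0,2^{-n}/4)$ perturbation.

```python
import numpy as np

rng = np.random.default_rng(4)
levels = 12                                    # refine down to the grid of mesh 2**(-levels)

# level 0: W(0) = 0, W(1) ~ N(0,1)
W = np.array([0.0, rng.standard_normal()])

for n in range(levels):
    mid = 0.5 * (W[:-1] + W[1:])               # midpoint law: average of the neighbouring values ...
    mid += np.sqrt(2.0**(-n) / 4) * rng.standard_normal(mid.size)  # ... plus N(0, 2^{-n}/4) noise
    refined = np.empty(2 * W.size - 1)
    refined[0::2] = W                          # keep the already constructed grid points
    refined[1::2] = mid                        # insert the new midpoints
    W = refined

t = np.linspace(0, 1, W.size)
print(W.size, W[-1])                           # 2**levels + 1 points on [0,1]; W[-1] is W(1)
```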

Theorem 8.5 (Lévy 1940). The series


∞ 
W (t, ω) := ∑ W2n+1 (t, ω) −W2n (t, ω) +W1 (t, ω), t ∈ [0, 1],
n=0

converges a.s. uniformly. In particular (W (t))t∈[0,1] is a one-dimensional Brownian motion.

Proof. Set ∆n (t, ω) := W2n+1 (t, ω) −W2n (t, ω). By construction,

∆n (2k − 1)2−n−1 , ω) = Γ2n +(k−1) (ω), k = 1, 2, . . . , 2n ,

are iid N(0, 2−(n+2) ) distributed random variables. Therefore,



xn
 √ 
P max n ∆n (2k − 1)2−n−1 > √ 6 2n P 2n+2 ∆n 2−n−1 > xn ,
 
16k62 2n+2

and the right-hand side equals

2 · 2n 2n+1 r 2n+1 −xn2 /2


Z ∞ Z ∞
−r2 /2 2 /2
√ e dr 6 √ e−r dr = √ e .
2π xn 2π xn xn xn 2π

Choose c > 1 and xn := c 2n log 2. Then

2n+1 −c2 log 2n


∞   ∞
−n−1
 xn
P max
∑ 16k62n n ∆ (2k − 1)2 > √ 6 ∑ √ e
n=1 2n+2 n=1 c 2π

2 2
= √ ∑ 2−(c −1)n < ∞.
c 2π n=1

Using the Borel–Cantelli lemma we find a set Ω0 ⊂ Ω with P(Ω0 ) = 1 such that for every ω ∈ Ω0
there is some N(ω) > 1 with
r
n log 2
max ∆n (2k − 1)2−n−1 6 c

for all n > N(ω).
16k62n 2n+1

∆n (t) is the distance between the polygonal arcs W2n+1 (t) and W2n (t); the maximum is attained at
one of the midpoints of the intervals [(k − 1)2−n , k2−n ], k = 1, . . . , 2n , see Figure 8.1. Thus,
r
−n−1
 n log 2
sup W2n+1 (t, ω) −W2n (t, ω) 6 max n ∆n (2k − 1)2 ,ω 6 c ,
06t61 16k62 2n+1

for all n > N(ω) which means that the limit


∞ 
W (t, ω) := lim W2N (t, ω) = ∑ W2n+1 (t, ω) −W2n (t, ω) +W1 (t, ω)
N→∞
n=0

exists for all ω ∈ Ω0 uniformly in t ∈ [0, 1]. Therefore, t 7→ W (t, ω), ω ∈ Ω0 , inherits the continuity
of the polygonal arcs t 7→ W2n (t, ω). Set

e (t, ω) := W (t, ω)1Ω0 (ω).


W

By construction, we find for all 0 6 k 6 l 6 2n

e (l2−n ) − W
W e (k2−n ) = W2n (l2−n ) −W2n (k2−n )
l
W2n (l2−n ) −W2n ((l − 1)2−n )

= ∑
l=k+1
iid
∼ N(0, (l − k)2−n ).

Since t 7→ W
e (t) is continuous and the dyadic numbers are dense in [0,t], we conclude that the
e (tk ) − W
increments W e (tk−1 ), 0 = t0 < t1 < · · · < tN 6 1 are independent N(0,tk − tk−1 ) distributed
random variables. This shows that (W e (t))t∈[0,1] is a Brownian motion.
9. Random measures

We continue our investigations of the paths of càdlàg Lévy processes. Independently of Chapters 5
and 6 we will show in Theorem 9.12 that the processes constructed in Theorem 7.5 are indeed all
Lévy processes; this gives also a new proof of the Lévy–Khintchine formula, cf. Corollary 9.13.
As before, we denote the jumps of (Xt )t>0 by

∆Xt := Xt − Xt− = Xt − lim Xs .


s↑t

Definition 9.1. Let X be a Lévy process. The counting measure

Nt (B, ω) := # {s ∈ (0,t] : ∆Xs (ω) ∈ B} , B ∈ B(Rd \ {0}) (9.1)

is called the jump measure of the process X.

Since a càdlàg function x : [0, ∞) → Rd has on any compact interval [a, b] at most finitely many
jumps |∆xt | > ε exceeding a fixed size,1 we see that

Nt (B, ω) < ∞ ∀t > 0, B ∈ B(Rd ) such that 0 ∈


/ B.

Notice that 0 ∈/ B is equivalent to Bε (0) ∩ B = 0/ for some ε > 0. Thus, B 7→ Nt (B, ω) is for every
ω a locally finite Borel measure on Rd \ {0}.

Definition 9.2. Let Nt (B) be the jump measure of the Lévy process X. For every Borel function
f : Rd → R with 0 ∈/ supp f we define
Z
Nt ( f , ω) := f (y) Nt (dy, ω). (9.2)

Since 0 ∈
/ supp f , it is clear that Nt (supp f , ω) < ∞, and for every ω

Nt ( f , ω) = ∑ f (∆Xs (ω)) = ∑ f (∆Xτ (ω))1(0,t] (τn (ω)).
n (9.20 )
0<s6t n=1

Both sums are finite sums, extending only over those s where ∆Xs (ω) 6= 0. This is obvious in the
second sum where τ1 (ω), τ2 (ω), τ3 (ω) . . . are the jump times of X.
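In a simulation the jump measure is just the list of jump times and jump heights, and (9.2') becomes a finite sum. A small sketch (NumPy assumed; the compound-Poisson-type jump data are an illustrative stand-in for the jumps of a general Lévy path):

```python
import numpy as np

rng = np.random.default_rng(5)
t, lam = 5.0, 2.0
n_jumps = rng.poisson(lam * t)
jump_times = np.sort(rng.uniform(0, t, n_jumps))     # jump times of a compound Poisson path
jump_sizes = rng.standard_normal(n_jumps)            # the corresponding jump heights

def N(s, B):
    """N_s(B): number of jumps up to time s with size in B (B given as a boolean indicator)"""
    return ((jump_times <= s) & B(jump_sizes)).sum()

def N_f(s, f):
    """N_s(f) = sum over u <= s of f(Delta X_u), cf. (9.2'); f must vanish near 0"""
    return f(jump_sizes[jump_times <= s]).sum()

print(N(t, lambda y: np.abs(y) > 1))                      # count of the 'big' jumps
print(N_f(t, lambda y: y**2 * (np.abs(y) > 0.5)))         # sum of f over the jumps
```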

Lemma 9.3. Let Nt (·) be the jump measure of a Lévy process X, f ∈ Cc (Rd \ {0}, Rm ) (i.e. f
takes values in Rm ), s < t and tk,n := s + nk (t − s). Then

 n−1 
Nt ( f , ω) − Ns ( f , ω) = ∑ f ∆Xu (ω) = lim ∑ f Xtk+1,n (ω) − Xtk,n (ω) . (9.3)
n→∞
s<u6t k=0
1 Otherwise we would get an accumulation point of jumps within [a, b], and x would be unbounded on [a, b].


Proof. Throughout the proof ω is fixed and we will omit it in our notation. Since 0 ∈ / supp f ,
there is some ε > 0 such that Bε (0) ∩ supp f = 0; / therefore, we need only consider jumps of size
|∆Xt | > ε. Denote by J = {τ1 , . . . , τN } those jumps. For sufficiently large n we can achieve that

• # J ∩ (tk,n ,tk+1,n ] 6 1 for all k = 0, . . . , n − 1;

• |Xtκ+1,n − Xtκ,n | < ε if κ is such that J ∩ (tκ,n ,tκ+1,n ] = 0.


/
Indeed: Assume this is not the case, then we could find sequences s < sk < tk 6 t such that
tk − sk → 0, J ∩ (sk ,tk ] = 0/ and |Xtk − Xsk | > ε. Without loss of generality we may assume
that sk ↑ u and tk ↓ u for some u ∈ (s,t]; u = s can be ruled out because of right-continuity.
By the càdlàg property of the paths, |∆Xu | > ε, i.e. u ∈ J, which is a contradiction.

Since we have f (Xtκ+1,n − Xtκ,n ) = 0 for intervals of the ‘second kind’, only the intervals containing
some jump contribute to the (finite!) sum (9.3), and the claim follows.

Lemma 9.4. Let Nt (·) be the jump measure of a Lévy process X.

a) (Nt ( f ))t>0 is a Lévy process on Rm for all f ∈ Cc (Rd \ {0}, Rm ).

b) (Nt (B))t>0 is a Poisson process for all B ∈ B(Rd ) such that 0 ∈


/ B.

c) ν(B) := EN1 (B) is a locally finite measure on Rd \ {0}.

Proof. Set Ft := σ (Xs , s 6 t).


a) Let f ∈ Cc (Rd \ {0}, Rm ). From Lemma 9.3 and (L20 ) we see that Nt ( f ) is Ft measurable
⊥ Fs , s 6 t. Moreover, if NtY (·) denotes the jump measure of the Lévy process
and Nt ( f ) − Ns ( f ) ⊥
Y = (Xt+s − Xs )t>0 , we see that Nt ( f ) − Ns ( f ) = Nt−sY ( f ). By the Markov property (Theorem 4.6),

X ∼ Y , and we get Nt−s Y (f) ∼ N


t−s ( f ). Since t 7→ Nt ( f ) is càdlàg, (Nt ( f ))t>0 is a Lévy process.

b) By definition, N0 (B) = 0 and t 7→ Nt (B) is càdlàg. Since X is a Lévy process, we see as in the
proof of Theorem 8.1 that the jump times
 
τ0 := 0, τ1 := inf t > 0 : ∆Xt ∈ B , τk := inf t > τk−1 : ∆Xt ∈ B

satisfy τ1 ∼ Exp(ν(B)), and the inter-jump times (τk −τk−1 )k∈N are an iid sequence. The condition
0∈/ B ensures that Nt (B) < ∞ a.s., which means that the intensity ν(B) is finite. Indeed, we have

1 − e−tν(B) = P(τ1 6 t) = P(Nt (B) > 0) −−→ 0;


t→0

this shows that ν(B) < ∞. Thus,



Nt (B) = ∑ 1(0,t] (τk )
k=1
is a Poisson process (Example 3.2) and, in particular, a Lévy process (Theorem 3.4).
c) The intensity of (Nt (B))t>0 is ν(B) = EN1 (B). By Fubini’s theorem it is clear that ν is a
measure.

Definition 9.5. Let Nt (·) be the jump measure of a Lévy process X. The intensity measure is the
measure ν(B) := EN1 (B) from Lemma 9.4.

We will see in Corollary 9.13 that ν is the Lévy measure of (Xt )t>0 appearing in the Lévy–
Khintchine formula.

Lemma 9.6. Let Nt (·) be the jump measure of a Lévy process X and ν the intensity measure. For
every f ∈ L1 (ν), f : Rd → Rm , the random variable Nt ( f ) := f (y) Nt (dy) exists as L1 -limit of
R

integrals of simple functions and satisfies


Z Z
ENt ( f ) = E f (y) Nt (dy) = t f (y) ν(dy). (9.4)
y6=0

Proof. For any step function f of the form f (y) = ∑M k=1 φk 1Bk (y) with 0 ∈
/ Bk the formula (9.4)
follows from ENt (Bk ) = tν(Bk ) and the linearity of the integral.
Since ν is defined on Rd \ {0}, any f ∈ L1 (ν) can be approximated by a sequence of step
functions ( fn )n∈N in L1 (ν)-sense, and we get
Z
E|Nt ( fn ) − Nt ( fm )| 6 t | fn − fm | dν −−−−→ 0.
m,n→∞

Because of the completeness of L1 (P), the limit limn→∞ Nt ( fn ) exists, and with a routine argument
we see that it is independent of the approximating sequence fn → f ∈ L1 (ν). This allows us to
define Nt ( f ) for f ∈ L1 (ν) as L1 (P)-limit of stochastic integrals of simple functions; obviously,
(9.4) is preserved under this limiting procedure.

Theorem 9.7. Let Nt (·) be the jump measure of a Lévy process X and ν the intensity measure.

a) Nt ( f ) := f (y) Nt (dy) is a Lévy process for every f ∈ L1 (ν), f : Rd → Rm . In particular,


R

(Nt (B))t>0 is a Poisson process for every B ∈ B(Rd ) such that 0 ∈ / B.

b) XtB := Nt (y1B (y)) and Xt − XtB are for every B ∈ B(Rd ), 0 ∈


/ B, Lévy processes.

Proof. a) Note that ν is a locally finite measure on Rd \ {0}. This means that, by standard density
results from integration theory, the family Cc (Rd \ {0}) is dense in L1 (ν). Fix f ∈ L1 (ν) and
choose fn ∈ Cc (Rd \ {0}) such that fn → f in L1 (ν). Then, as in Lemma 9.6,
Z
E|Nt ( f ) − Nt ( fn )| 6 t | f − fn | dν −−−→ 0.
n→∞

Since P(|Nt ( f )| > ε) 6 εt | f | dν → 0 for every ε > 0 as t → 0, the process Nt ( f ) is continuous


R

in probability. Moreover, it is the limit (in L1 , hence in probability) of the Lévy processes Nt ( fn )
(Lemma 9.4); therefore it is itself a Lévy process, see Lemma 7.3.
In view of Lemma 9.4.c), the indicator function 1B ∈ L1 (ν) whenever B is a Borel set satisfying
0∈/ B. Thus, Nt (B) = Nt (1B ) is a Lévy process which has only jumps of unit size, i.e. it is by
Theorem 8.1 a Poisson process.

b) Set f (y) := y1B (y) and Bn := B ∩ Bn (0). Then fn (y) = y1Bn (y) is bounded and 0 ∈ / supp fn ,
1
hence fn ∈ L (ν). This means that Nt ( fn ) is for every n ∈ N a Lévy process. Moreover,
Z Z
Nt ( fn ) = y Nt (dy) −−−→ y Nt (dy) = Nt ( f ) a.s.
Bn n→∞ B

Since Nt ( f ) changes its value only by jumps,

P(|Nt ( f )| > ε) 6 P(X has at least one jump of size B in [0,t])


= P(Nt (B) > 0) = 1 − e−tν(B) ,

which proves that the process Nt ( f ) is continuous in probability. Lemma 7.3 shows that Nt ( f ) is a
Lévy process.
Finally, approximate f (y) := y1B (y) by a sequence φl ∈ Cc (Rd \ {0}, Rd ). Now we can use
Lemma 9.3 to get

n−1  
Xt − Nt (φl ) = lim ∑ (Xtk+1,n − Xtk,n ) − φl (Xtk+1,n − Xtk,n ) ,
n→∞
k=0

The increments of X are stationary and independent, and so we conclude from the above for-
mula that X − N(φl ) has also stationary and independent increments. Since both X and N(φl ) are
continuous in probability, so is their difference, i.e. X − N(φl ) is a Lévy process. Finally,

Nt (φl ) −−→ Nt ( f ) and Xt − Nt (φl ) −−→ Xt − Nt ( f ),


l→∞ l→∞

and since X and N( f ) are continuous in probability, Lemma 7.3 tells us that X − N( f ) is a Lévy
process.

We will now show that Lévy processes with ‘disjoint jump heights’ are independent. For this
we need the following immediate consequence of Theorem 3.1:

Lemma 9.8 (Exponential martingale). Let (Xt )t>0 be a Lévy process. Then

ei ξ ·Xt
Mt := = ei ξ ·Xt etψ(ξ ) , t > 0,
E ei ξ ·Xt

is a martingale for the filtration FtX = σ (Xs , s 6 t) such that sups6t |Ms | 6 et Re ψ(ξ ) .

Theorem 9.9. Let Nt (·) be the jump measure of a Lévy process X and U,V ∈ B(Rd ), 0 ∈
/ U, 0 ∈
/V
and U ∩V = 0.
/ Then the processes

XtU := Nt (y1U (y)), XtV := Nt (y1V (y)), Xt − XtU∪V

are independent Lévy processes in Rd .



Proof. Set W := U ∪V . By Theorem 9.7, X U , X V and X − X W are Lévy processes. In fact, a slight
variation of that argument even shows that (X U , X V , X − X W ) is a Lévy process in R3d .
In order to see their independence, fix s > 0 and define for t > s and ξ , η, θ ∈ Rd the processes
U U V V
ei ξ ·(Xt −Xs ) ei η·(Xt −Xs )
Ct :=  U U
 − 1, Dt :=  V V
 − 1,
E ei ξ ·(Xt −Xs ) E ei η·(Xt −Xs )
W W
ei θ ·(Xt −Xt −Xs +Xs )
Et :=  W W
 − 1.
E ei θ ·(Xt −Xt −Xs +Xs )
By Lemma 9.8, these processes are bounded martingales satisfying ECt = EDt = EEt = 0. Set
tk,n = s + nk (t − s). Observe that
!
n−1
E(Ct Dt Et ) = E ∑ (Ctk+1,n −Ctk,n )(Dtl+1,n − Dtl,n )(Etm+1,n − Etm,n )
k,l,m=0
!
n−1
=E ∑ (Ct k+1,n
−Ctk,n )(Dtk+1,n − Dtk,n )(Etk+1,n − Etk,n ) .
k=0

In the second equality we use that martingale increments Ct −Cs , Dt − Ds , Et − Es are independent
of FsX , and by the tower property
 
E (Ctk+1,n −Ctk,n )(Dtl+1,n − Dtl,n )(Etm+1,n − Etm,n ) = 0 unless k = l = m.

An argument along the lines of Lemma 9.3 gives


 
E(Ct Dt Et ) = E ∑ ∆Cu ∆Du ∆Eu = 0
s<u6t
| {z }
=0

as XtU , XtV and Yt := Xt − XtW cannot jump simultaneously since U, V and Rd \ W are mutually
disjoint. Thus, h i
U U V V
E ei ξ ·(Xt −Xs ) ei η·(Xt −Xs ) ei θ ·(Yt −Ys )
h U U
i h V V
i h i (9.5)
= E ei ξ ·(Xt −Xs ) · E ei η·(Xt −Xs ) · E ei θ ·(Yt −Ys ) .

Since all processes are Lévy processes, (9.5) already proves the independence of X U , X V and
Y = X − X W . Indeed, we find for 0 = t0 < t1 < . . . < tm = t and ξk , ηk , θk ∈ Rd
i ξ ·(X U −X U ) i η ·(X V −X V )
 
E e ∑k k tk+1 tk e ∑k k tk+1 tk ei ∑k θk ·(Ytk+1 −Ytk )
i ξ ·(X U −X U ) i η ·(X V −X V )
 
= E ∏ e k tk+1 tk e k tk+1 tk ei θk ·(Ytk+1 −Ytk )
k
(L20 ) i ξ ·(X U −X U ) i η ·(X V −X V )
 
= ∏ E e k tk+1 tk e k tk+1 tk ei θk ·(Ytk+1 −Ytk )
k
i ξ ·(X U −X U ) i η ·(X V −X V )
     
(9.5)
= ∏ E e k tk+1 tk E e k tk+1 tk E ei θk ·(Ytk+1 −Ytk ) .
k

The last equality follows from (9.5); the second equality uses (L20 ) for the Lévy process

(XtU , XtV , Xt − XtW ).



This shows that the families (XtUk+1 −XtUk )k , (XtVk+1 −XtVk )k and (Ytk+1 −Ytk )k are independent, hence
the σ -algebras σ (XtU , t > 0), σ (XtV , t > 0) and σ (Xt − XtW , t > 0) are independent.

Corollary 9.10. Let Nt (·) be the jump measure of a Lévy process X and ν the intensity measure.

⊥(Nt (V ))t>0 for U,V ∈ B(Rd ), 0 ∈


a) (Nt (U))t>0 ⊥ / U, 0 ∈
/ V , U ∩V = 0.
/

b) For all measurable f : Rd → Rm satisfying f (0) = 0 and f ∈ L1 (ν)


\[
E\,e^{i\xi\cdot N_t(f)}=E\,e^{i\xi\cdot\int f(y)\,N_t(dy)}=e^{-t\int_{y\ne 0}[1-e^{i\xi\cdot f(y)}]\,\nu(dy)}. \tag{9.6}
\]

c) $\displaystyle\int_{y\ne 0}\bigl(|y|^2\wedge 1\bigr)\,\nu(dy)<\infty.$

Proof. a) Since (Nt (U))t>0 and (Nt (V ))t>0 are completely determined by the independent pro-
cesses (XtU )t>0 and (XtV )t>0 , cf. Theorem 9.9, the independence is clear.
b) Let us first prove (9.6) for step functions f (x) = ∑nk=1 φk 1Uk (x) with φk ∈ Rm and disjoint sets
U1 , . . . ,Un ∈ B(Rd ) such that 0 6∈ U k . Then
Z   n Z 
E exp i ξ · f (y) Nt (dy) = E exp i ∑ ξ · φk 1Uk (y) Nt (dy)
k=1
n
a)
= ∏ E exp [i ξ · φk Nt (Uk )]
k=1
n h i
9.7.a)
tν(Uk ) ei ξ ·φk − 1

= ∏ exp
k=1
 n
 i ξ ·φ 

= exp t ∑ e − 1 ν(Uk )
k

k=1
 Z  
i ξ · f (y)

= exp − t 1−e ν(dy) .

For any f ∈ L1 (ν) the integral on the right-hand side of (9.6) exists. Indeed, the elementary
inequality |1 − ei u | 6 |u| ∧ 2 and ν{|y| > 1} < ∞ (Lemma 9.4.c)) yield
Z Z Z
1 − ei ξ · f (y) ν(dy) 6 |ξ |
 
| f (y)| ν(dy) + 2 ν(dy) < ∞.
y6=0 0<|y|<1 |y|>1

Therefore, (9.6) follows with a standard approximation argument and dominated convergence.
c) We have already seen in Lemma 9.4.c) that ν{|y| > 1} < ∞.
Let us show that 0<|y|<1 |y|2 ν(dy) < ∞. For this we take U = {δ < |y| < 1}. Again by Theo-
R

rem 9.9, the processes XtU and Xt − XtU are independent, and we get
U U) U
0 < E ei ξ ·Xt = E ei ξ ·Xt · E ei ξ ·(Xt −Xt 6 E ei ξ ·Xt .

Since XtU is a compound Poisson process—use part b) with f (y) = y1U (y)—we get for all |ξ | 6 1

U
R 1 2
6 e−t δ <|y|61 4 (ξ ·y) ν(dy)
R
0 < E ei ξ ·Xt 6 E ei ξ ·Xt = e−t U (1−cos ξ ·y) ν(dy) .

For the equality we use $|e^z|=e^{\mathrm{Re}\,z}$; the inequality follows from $\frac14u^2\le 1-\cos u$ for $|u|\le 1$. Letting $\delta\to 0$ we see that $\int_{0<|y|<1}|y|^2\,\nu(dy)<\infty$.
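When $\nu$ is a finite measure, (9.6) can be checked by direct simulation, since then $N_t$ is the jump measure of a compound Poisson process. A minimal Monte Carlo sketch (NumPy assumed; $\nu=2\cdot N(0,1)$ and $f(y)=y\,\mathbf{1}_{\{|y|>1/2\}}$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(6)
t, lam, xi = 1.5, 2.0, 0.8                     # time, total mass of nu, test frequency
f = lambda y: y * (np.abs(y) > 0.5)            # f vanishes in a neighbourhood of 0

def N_f():
    """one sample of N_t(f): the sum of f over the jumps; nu(dy) = lam * N(0,1)(dy)"""
    jumps = rng.standard_normal(rng.poisson(lam * t))
    return f(jumps).sum()

emp = np.mean(np.exp(1j * xi * np.array([N_f() for _ in range(20000)])))

# right-hand side of (9.6), with the nu-integral computed by Monte Carlo over nu/lam = N(0,1)
y = rng.standard_normal(200000)
rhs = np.exp(-t * lam * np.mean(1 - np.exp(1j * xi * f(y))))
print(emp, rhs)                                # the two complex numbers should be close
```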

Corollary 9.11. Let Nt (·) be the jump measure of a Lévy process X and ν the intensity measure.
For all f : Rd → Rm satisfying f (0) = 0 and f ∈ L2 (ν) we have2
 Z 2 Z
| f (y)|2 ν(dy).
 
E f (y) Nt (dy) − tν(dy) =t (9.7)
y6=0

Proof. It is clearly enough to show (9.7) for step functions of the form
n
f (x) = ∑ φk 1B (x),
k
Bk disjoint, 0 6∈ Bk , φk ∈ Rm ,
k=1

and then use an approximation argument.


Since the processes Nt (Bk , ·) are independent Poisson processes with mean ENt (Bk ) = tν(Bk )
and variance VNt (Bk ) = tν(Bk ), we find
 
E (Nt (Bk ) − tν(Bk ))(Nt (Bl ) − tν(Bl ))
 
 0, if Bk ∩ Bl = 0,
/ i.e. k 6= l, 
=
 VN (B ) = tν(B ), if k = l, 
t k k

= tν(Bk ∩ Bl ).

Therefore,

2
 Z 
E f (y) Nt (dy) − tν(dy)
 ZZ 
 
=E f (y) f (z) Nt (dy) − tν(dy) Nt (dz) − tν(dz)
n   
= ∑ k lφ φ E N (B
t k ) − tν(B k ) N (B
t l ) − tν(Bl )
k,l=1 | {z }
=tν(Bk ∩Bl )
n Z
=t ∑ |φk |2 ν(Bk ) = t | f (y)|2 ν(dy).
k=1

In contrast to Corollary 7.6 the following theorem does not need (but constructs) the Lévy triplet
(l, Q, ν).

Theorem 9.12 (Lévy–Itô decomposition). Let X be a Lévy process and denote by Nt (·) and ν the

2 This is a special case of an Itô isometry, cf. (10.9) in the following chapter.

jump and intensity measures. Then


\[
X_t=\underbrace{\underbrace{\sqrt{Q}\,W_t}_{\text{continuous Gaussian}}+\underbrace{\int_{0<|y|<1}y\,\bigl(N_t(dy)-t\nu(dy)\bigr)}_{\text{pure jump part}}}_{=:M_t,\ L^2\text{-martingale}}
\;+\;\underbrace{t\,l+\int_{|y|\ge 1}y\,N_t(dy)}_{=:A_t,\ \text{bounded variation}} \tag{9.8}
\]

where l ∈ Rd and Q ∈ Rd×d is a positive semidefinite symmetric matrix and W is a standard


Brownian motion in Rd . The processes on the right-hand side of (9.8) are independent Lévy
processes.

Proof. 1◦ Set Un := { 1n < |y| < 1}, V = {|y| > 1}, Wn := Un ∪· V and define
Z Z Z
XtV := y Nt (dy) and XetUn := y Nt (dy) − t y ν(dy).
V Un Un

By Theorem 9.9 (XtV )t>0 , (XetUn )t>0 and Xt − XtWn + t Un y ν(dy) t>0 are independent Lévy pro-
R 

cesses. Since
X = (X − XeUn − X V ) + XeUn + X V ,

the theorem follows if we can show that the three terms on the right-hand side converge separately
as n → ∞.

2◦ Lemma 9.6 shows EXetUn = 0; since XeUn is a Lévy process, it is a martingale: for s 6 t
   
E XetUn Fs = E XetUn − XesUn Fs + XesUn
 
(L2) Un
= E Xet−s + XesUn = XesUn .
(L1)

(Fs can be taken as the natural filtration of X Un or X). By Doob’s L2 martingale inequality we find
for any t > 0 and m < n
   
Um 2 2
Un
E sup Xs − Xs
e e 6 4E XetUn − XetUm
s6t
Z
= 4t |y|2 ν(dy) −−−−→ 0.
1 1 m,n→∞
n <|y|6 m

Therefore, the limit 0<|y|<1 y Nt (dy) − tν(dy) = L2 -limn→∞ XetUn exists locally uniformly (in t).
R 

The limit is still an L2 martingale with càdlàg paths (take a locally uniformly a.s. convergent
subsequence) and, by Lemma 7.3, also a Lévy process.

3◦ Observe that
(X − XeUn − X V ) − (X − XeUm − X V ) = XeUm − XeUn ,

and so $X_t^c:=L^2\text{-}\lim_{n\to\infty}\bigl(X_t-\widetilde{X}_t^{U_n}-X_t^V\bigr)$ exists locally uniformly (in t). Since, by construction
|∆(Xt − XetUn − XtV )| 6 n1 , it is clear that X c has a.s. continuous sample paths. By Lemma 7.3 it
is a Lévy process. From Theorem 8.4 we know that all Lévy processes with continuous sample

paths are of the form tl + QWt where W is a Brownian motion, Q ∈ Rd×d a symmetric positive
semidefinite matrix and l ∈ Rd .

4◦ Since independence is preserved under L2 -limits, the decomposition (9.8) follows. Finally,
    ∫_{|y|>1} y Nt(dy, ω) = ∑_{0<s≤t} ∆Xs(ω) 1_{{∆Xs(ω)>1}} − ∑_{0<s≤t} |∆Xs(ω)| 1_{{∆Xs(ω)≤−1}}

is the difference of two increasing processes, i.e. it is of bounded variation.
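The decomposition (9.8) also suggests a direct way to simulate a Lévy path. The following Python sketch is purely illustrative and not part of the original text: it uses a hypothetical symmetric Lévy measure ν(dy) = |y|^{−1−α} dy, truncates the small jumps at a level eps (for a symmetric ν the compensator of the jumps in {eps < |y| < 1} vanishes, which keeps the sketch short), and assembles the drift, Gaussian, small-jump and big-jump parts.

    import numpy as np

    rng = np.random.default_rng(2)

    # hypothetical parameters: nu(dy) = |y|^{-1-alpha} dy, drift l, Q = sigma^2
    alpha, eps, t, l, sigma = 1.5, 1e-3, 1.0, 0.5, 0.3

    def sample_jumps(rate, inv_cdf):
        n = rng.poisson(rate * t)                 # number of jumps on (0, t]
        u = rng.random(n)
        return rng.choice([-1.0, 1.0], n) * inv_cdf(u)

    # jumps with eps < |y| < 1:  nu-mass = 2*(eps^-alpha - 1)/alpha
    small = sample_jumps(2 * (eps**-alpha - 1) / alpha,
                         lambda u: (eps**-alpha - u * (eps**-alpha - 1)) ** (-1 / alpha))
    # jumps with |y| >= 1:       nu-mass = 2/alpha  (Pareto-distributed sizes)
    big = sample_jumps(2 / alpha, lambda u: (1.0 - u) ** (-1 / alpha))

    W1 = rng.normal(0.0, np.sqrt(t))
    X_t = sigma * W1 + small.sum() + (l * t + big.sum())   # M_t-part + A_t-part as in (9.8)
    print(f"X_t = {X_t:.4f}  (martingale part {sigma*W1 + small.sum():.4f}, "
          f"bounded-variation part {l*t + big.sum():.4f})")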

Corollary 9.13 (Lévy–Khintchine formula). Let X be a Lévy process. Then the characteristic
exponent ψ is given by

    ψ(ξ) = −i l·ξ + ½ ξ·Qξ + ∫_{y≠0} ( 1 − e^{i y·ξ} + i ξ·y 1_{(0,1)}(|y|) ) ν(dy)          (9.9)

where ν is the intensity measure, l ∈ Rd and Q ∈ Rd×d is symmetric and positive semidefinite.

Proof. Since the processes appearing in the Lévy–Itô decomposition (9.8) are independent, we see

    e^{−ψ(ξ)} = E e^{i ξ·X₁} = E e^{i ξ·(l + √Q W₁)} · E e^{i ∫_{0<|y|<1} ξ·y (N₁(dy)−ν(dy))} · E e^{i ∫_{|y|>1} ξ·y N₁(dy)}.

Since W is a standard Brownian motion,

    E e^{i ξ·(l + √Q W₁)} = e^{i l·ξ − ½ ξ·Qξ}.

Using (9.6) with f(y) = y 1_{Un}(y), Un = {1/n < |y| < 1}, subtracting ∫_{Un} y ν(dy) and letting n → ∞ we get

    E exp( i ξ · ∫_{0<|y|<1} y (N₁(dy) − ν(dy)) ) = exp( − ∫_{0<|y|<1} ( 1 − e^{i y·ξ} + i ξ·y ) ν(dy) );

finally, (9.6) with f(y) = y 1_V(y), V = {|y| > 1}, once again yields

    E exp( i ξ · ∫_{|y|>1} y N₁(dy) ) = exp( − ∫_{|y|>1} ( 1 − e^{i y·ξ} ) ν(dy) ),

finishing the proof.


10. A digression: stochastic integrals

In this chapter we explain how one can integrate with respect to (a certain class of) random mea-
sures. Our approach is based on the notion of random orthogonal measures and it will include
the classical Itô integral with respect to square-integrable martingales. Throughout this chapter,
(Ω, A , P) is a probability space, (Ft )t>0 some filtration, (E, E ) is a measurable space and µ is
a (positive) measure on (E, E ). Moreover, R ⊂ E is a semiring, i.e. a family of sets such that
0/ ∈ R, for all R, S ∈ R we have R ∩ S ∈ R, and R \ S can be represented as a finite union of
disjoint sets from R, cf. [54, Chapter 6] or [55, Definition 5.1]. It is not difficult to check that
R0 := {R ∈ R : µ(R) < ∞} is again a semiring.
Definition 10.1. Let R be a semiring on the measure space (E, E , µ). A random orthogonal
measure with control measure µ is a family of random variables N(ω, R) ∈ R, R ∈ R0 , such that

    E[ |N(·, R)|² ] < ∞                  for all R ∈ R0,                                (10.1)
    E[ N(·, R) N(·, S) ] = µ(R ∩ S)      for all R, S ∈ R0.                             (10.2)

The following Lemma explains why N(R) = N(ω, R) is called a (random) measure.
Lemma 10.2. The random set function R ↦ N(R) := N(ω, R), R ∈ R0, is countably additive in L², i.e.

    N( ⋃_{n=1}^∞ Rn ) = L²-lim_{n→∞} ∑_{k=1}^n N(Rk)   a.s.                             (10.3)

for every sequence (Rn)_{n∈N} ⊂ R0 of mutually disjoint sets such that R := ⋃_{n=1}^∞ Rn ∈ R0. In particular, N(R ∪ S) = N(R) + N(S) a.s. for disjoint R, S ∈ R0 such that R ∪ S ∈ R0, and N(∅) = 0 a.s. (notice that the exceptional set may depend on the sets R, S).

Proof. From R = S = ∅ and E[N(∅)²] = µ(∅) = 0 we get N(∅) = 0 a.s. It is enough to prove (10.3), as finite additivity follows if we take (R₁, R₂, R₃, R₄, ...) = (R, S, ∅, ∅, ...). If Rn ∈ R0 are mutually disjoint sets such that R := ⋃_{n=1}^∞ Rn ∈ R0, then

    E[ ( N(R) − ∑_{k=1}^n N(Rk) )² ]
      = E N²(R) + ∑_{k=1}^n E N²(Rk) − 2 ∑_{k=1}^n E[N(R)N(Rk)] + ∑_{j≠k, j,k=1}^n E[N(Rj)N(Rk)]
      (10.2)
       =    µ(R) − ∑_{k=1}^n µ(Rk) −−−−→ 0    (n → ∞)

where we use the σ -additivity of the measure µ.


Example 10.3. a) (White noise) Let R = {(s,t] : 0 6 s < t < ∞} and µ = λ be Lebesgue
measure on (0, ∞). Clearly, R = R0 is a semiring. Let W = (Wt )t>0 be a one-dimensional standard
Brownian motion. The random set function

N(ω, (s,t]) := Wt (ω) −Ws (ω), 06s<t <∞

is a random orthogonal measure with control measure λ . This follows at once from
   
E (Wt −Ws )(Wv −Wu ) = t ∧ v − s ∨ u = λ (s,t] ∩ (u, v]

for all 0 6 s < t < ∞ and 0 6 u < v < ∞.


Mind, however, that N is not σ-additive. To see this, take Rn := (1/(n+1), 1/n], n ∈ N, and observe that ⋃_n Rn = (0,1]. Since W has stationary and independent increments, and scales like Wt ∼ √t W₁, we have

    E exp( − ∑_{n=1}^∞ |N(Rn)| ) = E exp( − ∑_{n=1}^∞ |W_{1/(n+1)} − W_{1/n}| )
                                 = ∏_{n=1}^∞ E exp( −(n(n+1))^{−1/2} |W₁| )
                                 ≤ ∏_{n=1}^∞ α^{(n(n+1))^{−1/2}}   (Jensen's inequality),   α := E e^{−|W₁|} ∈ (0, 1).

As the series ∑_{n=1}^∞ (n(n+1))^{−1/2} diverges, we get E exp[ −∑_{n=1}^∞ |N(Rn)| ] = 0, which means that ∑_{n=1}^∞ |N(ω, Rn)| = ∞ for almost all ω. This shows that N(·) cannot be countably additive. Indeed, countable additivity implies that the series

    N( ω, ⋃_{n=1}^∞ Rn ) = ∑_{n=1}^∞ N(ω, Rn)

converges. The left-hand side, hence the summation, is independent under rearrangements. This, however, entails absolute convergence of the series ∑_{n=1}^∞ |N(ω, Rn)| < ∞ which does not hold, as we have seen above.
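While N fails to be countably additive pathwise, the defining covariance identity E[N((s,t])N((u,v])] = λ((s,t] ∩ (u,v]) is easy to check by simulation. The following Python sketch is an illustrative aside (not from the original text); the intervals are hypothetical and the Brownian increments are coupled by splitting the intervals into disjoint pieces.

    import numpy as np

    rng = np.random.default_rng(3)
    n_paths = 400_000

    def brownian_increment(a, b, size):
        # W_b - W_a of a standard Brownian motion, drawn directly from its law
        return rng.normal(0.0, np.sqrt(b - a), size)

    s, t, u, v = 0.5, 2.0, 1.2, 3.0            # (s,t] ∩ (u,v] = (1.2, 2.0]
    dW1 = brownian_increment(s, u, n_paths)    # increment over (0.5, 1.2]
    dW2 = brownian_increment(u, t, n_paths)    # increment over (1.2, 2.0]  (common piece)
    dW3 = brownian_increment(t, v, n_paths)    # increment over (2.0, 3.0]

    N_st, N_uv = dW1 + dW2, dW2 + dW3
    print("empirical E[N(s,t]N(u,v]] =", round(float(np.mean(N_st * N_uv)), 4),
          "  lambda((s,t]∩(u,v]) =", t - u)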
b) (2nd order orthogonal noise) Let X = (Xt )t∈T be a complex-valued stochastic process defined
on a bounded or unbounded interval T ⊂ R. We assume that X has a.s. càdlàg paths. If E(|Xt |2 ) <
∞, we call X a second-order process; many properties of X are characterized by the correlation

function K(s,t) = E Xs X t , s,t ∈ T .
 
If E (Xt −Xs )(X v −X u ) = 0 for all s 6 t 6 u 6 v, s,t, u, v ∈ T , then X is said to have orthogonal
increments. Fix t0 ∈ T and define for all t ∈ T

E(|X − X |2 ), if t > t0 ,
t t0
F(t) :=
−E(|X − X |2 ), if t 6 t .
t0 t 0

Clearly, F is increasing and, since t 7→ Xt is a.s. right-continuous, it is also right-continuous.


Moreover,
F(t) − F(s) = E(|Xt − Xs |2 ) for all s 6 t, s,t ∈ T. (10.4)

To see this, we assume without loss of generality that t₀ ≤ s ≤ t. We have

    F(t) − F(s) = E(|Xt − X_{t₀}|²) − E(|Xs − X_{t₀}|²)
                = E(|(Xt − Xs) + (Xs − X_{t₀})|²) − E(|Xs − X_{t₀}|²)
                = E(|Xt − Xs|²)    (orthogonal increments).

This shows that µ(s,t] := F(t) − F(s) defines a measure on R = R0 = {(s,t] : −∞ < s < t <
∞, s,t ∈ T }, which is the control measure of N(ω, (s,t]) := Xt (ω) − Xs (ω). In fact, for s < t, u < v,
s,t, u, v ∈ T , we have
  
    Xt − Xs = (Xt − X_{t∧v}) + (X_{t∧v} − X_{s∨u}) + (X_{s∨u} − Xs),
    X̄v − X̄u = (X̄v − X̄_{t∧v}) + (X̄_{t∧v} − X̄_{s∨u}) + (X̄_{s∨u} − X̄u).

Using the orthogonality of the increments we get

    E[ (Xt − Xs)(X̄v − X̄u) ] = E[ (X_{t∧v} − X_{s∨u})(X̄_{t∧v} − X̄_{s∨u}) ]
                             = F(t ∧ v) − F(s ∨ u) = µ((s,t] ∩ (u,v]),

i.e. N(ω, •) is a random orthogonal measure.


c) (Martingale noise) Let M = (Mt )t>0 be a square-integrable martingale with respect to the fil-
tration (Ft )t>0 , M0 = 0, and with càdlàg paths. Denote by hMi the predictable quadratic variation,
i.e. the unique (hMi0 := 0) increasing predictable process such that M 2 − hMi is a martingale. The
random set function
N(ω, (s,t]) := Mt (ω) − Ms (ω), s 6 t,

is a random orthogonal measure on R = {(s,t] : 0 6 s < t < ∞} with control measure µ(s,t] =
E(⟨M⟩_t − ⟨M⟩_s). This follows immediately from the tower property of conditional expectation,

    E[Mt Mv] = E[Mt E(Mv | Ft)] = E[Mt²] = E⟨M⟩_t    if t ≤ v,

which, in turn, gives for all 0 ≤ s < t and 0 ≤ u < v

    E[ (Mt − Ms)(Mv − Mu) ] = E⟨M⟩_{t∧v} − E⟨M⟩_{s∧v} − E⟨M⟩_{t∧u} + E⟨M⟩_{s∧u}
                            = µ((s,t] ∩ (0,v]) − µ((s,t] ∩ (0,u])
                            = µ((s,t] ∩ (u,v]).

d) (Poisson random measure) Let X be a d-dimensional Lévy process,

    S := { B ∈ B(Rd) : 0 ∉ B̄ },    R := { (s,t] × B : 0 ≤ s < t < ∞, B ∈ S },

and Nt(B) the jump measure (Definition 9.1). The random set function

    Ñ(ω, (s,t] × B) := [Nt(ω, B) − tν(B)] − [Ns(ω, B) − sν(B)],    R = (s,t] × B ∈ R,

is a random orthogonal measure with control measure λ × ν where λ is Lebesgue measure on


(0, ∞) and ν is the Lévy measure of X. Indeed, by definition R = R0 , and it is not hard to see that
R is a semiring1 .
Set Ñt(B) := Ñ((0,t] × B) and let B, C ∈ S, t, v > 0. As in the proof of Corollary 9.11 we have

    E[ Ñt(B) Ñt(C) ] = tν(B ∩ C).

Since S is a semiring, we get B = (B ∩ C) ∪ (B \ C) = (B ∩ C) ∪ B₁ ∪ ··· ∪ Bn with finitely many mutually disjoint Bk ∈ S such that Bk ⊂ B \ C. The processes Ñ(Bk) and Ñ(C) are independent (Corollary 9.10) and centered. Therefore we have for t ≤ v

    E[ Ñt(B) Ñv(C) ] = E[ Ñt(B ∩ C) Ñv(C) ] + ∑_{k=1}^n E[ Ñt(Bk) Ñv(C) ]
                     = E[ Ñt(B ∩ C) Ñv(C) ] + ∑_{k=1}^n E Ñt(Bk) · E Ñv(C)
                     = E[ Ñt(B ∩ C) Ñv(C) ].

Use the same argument over again, as well as the fact that Ñt(B ∩ C) has independent and centered increments (Lemma 9.4), to get

    E[ Ñt(B) Ñv(C) ] = E[ Ñt(B ∩ C) Ñv(B ∩ C) ]
                     = E[ Ñt(B ∩ C) Ñt(B ∩ C) ] + E[ Ñt(B ∩ C) ( Ñv(B ∩ C) − Ñt(B ∩ C) ) ]       (10.5)
                     = E[ Ñt(B ∩ C) Ñt(B ∩ C) ]
                     = tν(B ∩ C)
                     = λ((0,t] ∩ (0,v]) ν(B ∩ C),

where the second expectation in the second line equals E Ñt(B∩C) · E[ Ñv(B∩C) − Ñt(B∩C) ] = 0. For s ≤ t, u ≤ v and B, C ∈ S a lengthy, but otherwise completely elementary, calculation based on (10.5) shows

    E[ Ñ((s,t] × B) Ñ((u,v] × C) ] = λ((s,t] ∩ (u,v]) ν(B ∩ C).
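The covariance formula above can be checked by a short Monte Carlo experiment. The Python sketch below is illustrative only (all parameters are hypothetical); it uses a compound Poisson process with standard normal jump sizes, so that ν(B) = λ·P(Z ∈ B) for Z ∼ N(0,1), and compares E[Ñt(B)Ñv(C)] with (t∧v)·ν(B∩C) for B = (0.5, ∞), C = (1, ∞).

    import math
    import numpy as np

    rng = np.random.default_rng(10)

    lam, t, v, n_paths = 4.0, 1.0, 2.5, 100_000
    b, c = 0.5, 1.0                      # B = (b, ∞), C = (c, ∞), B ∩ C = (c, ∞)

    def nu_tail(a):                      # nu((a, ∞)) = lam * P(Z > a)
        return lam * 0.5 * math.erfc(a / math.sqrt(2))

    prod = np.empty(n_paths)
    for i in range(n_paths):
        times = rng.uniform(0.0, v, rng.poisson(lam * v))   # jump times on (0, v]
        sizes = rng.normal(0.0, 1.0, len(times))
        Nt_B = np.sum((times <= t) & (sizes > b)) - t * nu_tail(b)   # compensated counts
        Nv_C = np.sum(sizes > c) - v * nu_tail(c)
        prod[i] = Nt_B * Nv_C

    print("empirical:", round(float(prod.mean()), 4),
          "  (t∧v)·nu(B∩C) =", round(min(t, v) * nu_tail(c), 4))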

e) (Space-time white noise) Let R := {(0,t] × B : t > 0, B ∈ B(Rd)} and µ = λ be Lebesgue measure on the half-space H⁺ := [0, ∞) × Rd. Consider the mean-zero, real-valued Gaussian process (W(R))_{R∈B(H⁺)} whose covariance function is given by Cov(W(R), W(S)) = λ(R ∩ S).² By its very definition, W(R) is a random orthogonal measure on R0 with control measure λ.

¹ Both S and I := {(s,t] : 0 ≤ s < t < ∞} are semirings, and so is their cartesian product R = I × S, see [54, Lemma 13.1] or [55, Lemma 15.1] for the straightforward proof.
² The map (R, S) ↦ λ(R ∩ S) is positive semidefinite, i.e. for R₁, ..., Rn ∈ B(H⁺) and ξ₁, ..., ξn ∈ R

    ∑_{j,k=1}^n ξj ξk λ(Rj ∩ Rk) = ∫ ∑_{j,k=1}^n ξj 1_{Rj}(x) ξk 1_{Rk}(x) λ(dx) = ∫ ( ∑_{k=1}^n ξk 1_{Rk}(x) )² λ(dx) ≥ 0.

We will now define a stochastic integral in the spirit of Itô’s original construction.

Definition 10.4. Let R be a semiring and R0 = {R ∈ R : µ(R) < ∞}. A simple function is a
deterministic function of the form
    f(x) = ∑_{k=1}^n ck 1_{Rk}(x),    n ∈ N, ck ∈ R, Rk ∈ R0.                           (10.6)

Intuitively, IN ( f ) = ∑nk=1 ck N(Rk ) should be the stochastic integral of a simple function f . The
only problem is the well-definedness. Since a random orthogonal measure is a.s. finitely addi-
tive, the following lemma has exactly the same proof as the usual well-definedness result for the
Lebesgue integral of a step function, see e.g. Schilling [54, Lemma 9.1] or [55, Lemma 8.1]; note
that finite unions of null sets are again null sets.

Lemma 10.5. Let f be a simple function and assume that f = ∑nk=1 ck 1Rk = ∑mj=1 b j 1S j has two
representations as step-function. Then
    ∑_{k=1}^n ck N(Rk) = ∑_{j=1}^m bj N(Sj)   a.s.

Definition 10.6. Let N(R), R ∈ R0 , be a random orthogonal measure with control measure µ. The
stochastic integral of a simple function f given by (10.6) is the random variable
    IN(ω, f) := ∑_{k=1}^n ck N(ω, Rk).                                                   (10.7)

The following properties of the stochastic integral are more or less immediate from the defini-
tion.

Lemma 10.7. Let N(R), R ∈ R0 , be a random orthogonal measure with control measure µ, f , g
simple functions, and α, β ∈ R.

a) IN (1R ) = N(R) for all R ∈ R0 ;

b) S 7→ IN (1S ) extends N uniquely to S ∈ ρ(R0 ), the ring generated by R0 ;3

c) IN (α f + β g) = αIN ( f ) + β IN (g); (linearity)

d) E[ IN(f)² ] = ∫ f² dµ.    (Itô's isometry)

Proof. The properties a) and c) are clear. For b) we note that ρ(R0 ) can be constructed from R0
by adding all possible finite unions of (disjoint) sets (see e.g. [54, Proof of Theorem 6.1, Step 2]).

3A ring is a family of sets which contains 0/ and which is stable under unions and differences of finitely many sets.
Since R ∩ S = R \ (R \ S), it is automatically stable under finite intersections. The ring generated by R0 is the smallest
ring containing R0 .

In order to see d), we use (10.6) and the orthogonality relation E [N(R j )N(Rk )] = µ(R j ∩ Rk ) to
get
n
    E[ IN(f)² ] = ∑_{j,k=1}^n cj ck E[N(Rj)N(Rk)]
                = ∑_{j,k=1}^n cj ck µ(Rj ∩ Rk)
                = ∫ ∑_{j,k=1}^n cj 1_{Rj}(x) ck 1_{Rk}(x) µ(dx)
                = ∫ f²(x) µ(dx).

Itô’s isometry now allows us to extend the stochastic integral to the L2 (µ)-closure of the simple
functions: L2 (E, σ (R), µ). For this take f ∈ L2 (E, σ (R), µ) and any approximating sequence
( fn )n∈N of simple functions, i.e.
    lim_{n→∞} ∫ |f − fn|² dµ = 0.

In particular, (fn)_{n∈N} is an L²(µ) Cauchy sequence, and Itô's isometry shows that the random variables (IN(fn))_{n∈N} are a Cauchy sequence in L²(P):

    E[ (IN(fn) − IN(fm))² ] = E[ IN(fn − fm)² ] = ∫ (fn − fm)² dµ −−−−→ 0    as m, n → ∞.

Because of the completeness of L2 (P), the limit limn→∞ IN ( fn ) exists and, by a standard argument,
it does not depend on the approximating sequence.
Definition 10.8. Let N(R), R ∈ R0 , be a random orthogonal measure with control measure µ. The
stochastic integral of a function f ∈ L2 (E, σ (R), µ) is the random variable
    ∫ f(x) N(ω, dx) := L²(P)-lim_{n→∞} IN(ω, fn)                                        (10.8)

where ( fn )n∈N is any sequence of simple functions which approximate f in L2 (µ).


It is immediate from the definition of the stochastic integral that f ↦ ∫ f dN is linear and enjoys Itô's isometry

    E[ ( ∫ f(x) N(dx) )² ] = ∫ f²(x) µ(dx).                                             (10.9)
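For the white-noise example 10.3.a) the integral of a simple function is just a weighted sum of Brownian increments, and (10.9) can be verified numerically. The following Python sketch is an illustrative aside (grid and coefficients are hypothetical, not taken from the text).

    import numpy as np

    rng = np.random.default_rng(4)

    # simple function f = sum_k c_k 1_{(t_k, t_{k+1}]}  (deterministic, as in Definition 10.4)
    t_grid = np.array([0.0, 0.5, 1.5, 2.0, 4.0])
    coeff  = np.array([2.0, -1.0, 0.5, 3.0])

    n_paths = 300_000
    dW = rng.normal(0.0, np.sqrt(np.diff(t_grid)), size=(n_paths, len(coeff)))

    I_N = dW @ coeff                      # I_N(f) = sum_k c_k (W_{t_{k+1}} - W_{t_k})
    print("E[I_N(f)^2] ≈", round(float(np.mean(I_N**2)), 4))
    print("∫ f^2 dλ    =", float(np.sum(coeff**2 * np.diff(t_grid))))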

Remark 10.9. Assume that the random orthogonal measure N is of space-time type, i.e. E =
(0, ∞) × X where (X, X ) is some measurable space, and R = {(0,t] × B : B ∈ S } where S is
a semiring in X . If for B ∈ S the stochastic process Nt (B) := N((0,t] × B) is a martingale w.r.t.
the filtration Ft := σ (N((0, s] × B), s 6 t, B ∈ S ), then
ZZ
Nt ( f ) := 1(0,t] (s) f (x) N(ds, dx), 1(0,t] ⊗ f ∈ L2 (µ), t > 0,

is again a(n L2 -)martingale. For simple functions f this follows immediately from the fact that
sums and differences of finitely many martingales (with a common filtration) are again a martin-
gale. Since L2 (P)-limits preserve the martingale property, the claim follows.

At first sight, the stochastic integral defined in 10.8 looks rather restrictive since we can only
integrate deterministic functions f . As all randomness can be put into the random orthogonal mea-
sure, we have considerable flexibility, and the following construction shows that Definition 10.8
covers pretty much the most general stochastic integrals.

From now on we assume that

• the random measure N(dt, dx) on (E, E ) = ((0, ∞) × X, B(Rd ) ⊗ X ) is of space-time type,
cf. Remark 10.9, with control measure µ(dt, dx);

• (Ft )t>0 is some filtration in (Ω, A , P).

Let τ be a stopping time; the set ⟦0, τ⟧ := {(ω,t) : 0 < t ≤ τ(ω)} is called a stochastic interval. We define

• E◦ := Ω × (0, ∞) × X;

• E◦ := P ⊗ X where P is the predictable σ-algebra in Ω × (0, ∞), see Definition A.8 in the appendix;

• R◦ := {⟦0, τ⟧ × B : τ bounded stopping time, B ∈ S};

• µ◦(dω, dt, dx) := P(dω) µ(dt, dx) as control measure;

• N◦(ω, ⟦0, τ⟧ × B) := N(ω, (0, τ(ω)] × B) as random orthogonal measure⁴.
Lemma 10.10. Let N ◦ , R0◦ and µ ◦ be as above. The R0◦ -simple processes
    f(ω,t,x) := ∑_{k=1}^n ck 1_{⟦0,τk⟧}(ω,t) 1_{Bk}(x),    ck ∈ R, ⟦0, τk⟧ × Bk ∈ R0◦

are L2 (µ ◦ )-dense in L2 (E ◦ , P ⊗ σ (S ), µ ◦ ).

Proof. This follows from standard arguments from measure and integration; notice that the pre-
dictable σ -algebra P ⊗ σ (S ) is generated by sets of the form K0, τK × B where τ is a bounded
stopping time and B ∈ S , cf. Theorem A.9 in the appendix.

Observe that for the simple processes appearing in Lemma 10.10

    ∫∫ f(ω,t,x) N(ω, dt, dx) := ∑_{k=1}^n ck N◦(ω, ⟦0,τk⟧ × Bk) = ∑_{k=1}^n ck N(ω, (0, τk(ω)] × Bk)

is a stochastic integral which satisfies

    E[ ( ∫∫ f(·,t,x) N(·, dt, dx) )² ] = ∑_{k=1}^n ck² µ◦(⟦0,τk⟧ × Bk) = ∑_{k=1}^n ck² E µ((0,τk] × Bk).

Just as above we can now extend the stochastic integral to L2 (E ◦ , P ⊗ σ (S ), µ ◦ ).


4 To see that it is indeed a random orthogonal measure, use a discrete approximation of τ.

Corollary 10.11. Let N(ω, dt, dx) be a random orthogonal measure on E of space-time type
(cf. Remark 10.9) with control measure µ(dt, dx) and f : Ω × (0, ∞) × X → R be an element of
L2 (E ◦ , P ⊗ σ (S ), µ ◦ ). Then the stochastic integral
    ∫∫ f(ω,t,x) N(ω, dt, dx)

exists and satisfies the following Itô isometry

    E[ ( ∫∫ f(·,t,x) N(·, dt, dx) )² ] = E[ ∫∫ f²(·,t,x) µ(dt, dx) ].                   (10.10)

Let us show that the stochastic integral w.r.t. a space-time random orthogonal measure extends
the usual Itô integral. To do so we need the following auxiliary result.

Lemma 10.12. Let N(ω, dt, dx) be a random orthogonal measure on E of space-time type (cf.
Remark 10.9) with control measure µ(dt, dx) and τ a stopping time. Then
    ∫∫ φ(ω) 1_{⟦τ,∞⟦}(ω,t) f(ω,t,x) N(ω, dt, dx) = φ(ω) ∫∫ 1_{⟦τ,∞⟦}(ω,t) f(ω,t,x) N(ω, dt, dx)        (10.11)

for all φ ∈ L∞ (Fτ ) and f ∈ L2 (E ◦ , P ⊗ σ (S ), µ ◦ ).

Proof. Since t ↦ 1_{⟦τ,∞⟦}(ω,t) is adapted to the filtration (Ft)_{t≥0} and left-continuous, the integrands appearing in (10.11) are predictable, hence all stochastic integrals are well-defined.

1° Assume that φ(ω) = 1_F(ω) for some F ∈ Fτ and f(ω,t,x) = 1_{⟦0,σ⟧}(ω,t) 1_B(x) for some bounded stopping time σ and ⟦0, σ⟧ × B ∈ R0◦. Define

    Nσ(ω, B) := N(ω, (0, σ(ω)] × B) = N◦(ω, ⟦0, σ⟧ × B)

for any ⟦0, σ⟧ × B ∈ R0◦. The random time τ_F := τ1_F + ∞1_{F^c} is a stopping time⁵, and we have

    φ 1_{⟦τ,∞⟦} f = 1_F 1_{⟦τ,∞⟦} 1_{⟦0,σ⟧} 1_B = 1_{⟦τ_F,∞⟦} 1_{⟦0,σ⟧} 1_B = 1_{⟦τ_F∧σ,σ⟧} 1_B.

From this we get (10.11) for our choice of φ and f:

    ∫∫ φ 1_{⟦τ,∞⟦}(t) f(t,x) N(dt, dx) = ∫∫ 1_{⟦τ_F∧σ,σ⟧}(t) 1_B(x) N(dt, dx)
                                       = Nσ(B) − N_{τ_F∧σ}(B)
                                       = 1_F · ( Nσ(B) − N_{τ∧σ}(B) )
                                       = 1_F ∫∫ 1_{⟦τ∧σ,σ⟧}(t) 1_B(x) N(dt, dx)
                                       = 1_F ∫∫ 1_{⟦τ,∞⟦}(t) 1_{⟦0,σ⟧}(t) 1_B(x) N(dt, dx)
                                       = φ ∫∫ 1_{⟦τ,∞⟦}(t) f(t,x) N(dt, dx).
⁵ Indeed, {τ_F ≤ t} = {τ ≤ t} ∩ F ∈ Ft for all t ≥ 0, since F ∈ Fτ.

2° If φ = 1_F for some F ∈ Fτ and f is a simple process, then (10.11) follows from 1° because of the linearity of the stochastic integral.

3° If φ = 1_F for some F ∈ Fτ and f ∈ L²(µ◦), then (10.11) follows from 2° and Itô's isometry: let fn be a sequence of simple processes which approximate f. Then

    E[ ( ∫∫ ( φ 1_{⟦τ,∞⟦}(t) fn(t,x) − φ 1_{⟦τ,∞⟦}(t) f(t,x) ) N(dt, dx) )² ]
      = ∫∫ E[ ( φ 1_{⟦τ,∞⟦}(t) fn(t,x) − φ 1_{⟦τ,∞⟦}(t) f(t,x) )² ] µ(dt, dx)
      ≤ ∫∫ E[ ( fn(t,x) − f(t,x) )² ] µ(dt, dx) −−−→ 0    as n → ∞.

4° If φ is an Fτ measurable step function and f ∈ L²(µ◦), then (10.11) follows from 3° because of the linearity of the stochastic integral.

5° Since we can approximate φ ∈ L^∞(Fτ) uniformly by Fτ measurable step functions φn, (10.11) follows from 4° and Itô's isometry because of the following inequality:

    E[ ( ∫∫ ( φn 1_{⟦τ,∞⟦}(t) f(t,x) − φ 1_{⟦τ,∞⟦}(t) f(t,x) ) N(dt, dx) )² ]
      = ∫∫ E[ ( φn 1_{⟦τ,∞⟦}(t) f(t,x) − φ 1_{⟦τ,∞⟦}(t) f(t,x) )² ] µ(dt, dx)
      ≤ ‖φn − φ‖²_{L∞(P)} ∫∫ E[ f²(t,x) ] µ(dt, dx).

We will now consider ‘martingale noise’ random orthogonal measures, see Example 10.3.c),
which are given by (the predictable quadratic variation of) a square-integrable martingale M. For
these random measures our definition of the stochastic integral coincides with Itô’s definition.
Recall that the Itô integral driven by M is first defined for simple, left-continuous processes of the
form
    f(ω,t) := ∑_{k=1}^n φk(ω) 1_{⟦τk, τ_{k+1}⟧}(ω,t),    t ≥ 0,                          (10.12)

where 0 ≤ τ₁ ≤ τ₂ ≤ ... ≤ τ_{n+1} are bounded stopping times and φk are bounded Fτk measurable random variables. The Itô integral for such simple processes is

    ∫ f(ω,t) dMt(ω) := ∑_{k=1}^n φk(ω) ( M_{τ_{k+1}}(ω) − M_{τk}(ω) )

and it is extended by Itô’s isometry to all integrands from L2 (Ω × (0, ∞), P, dP ⊗ dhMit ). For
details we refer to any standard text on Itô integration, e.g. Protter [43, Chapter II] or Revuz & Yor
[44, Chapter IV].
We will now use Lemma 10.12 in the particular situation where the space component dx is not
present.

Theorem 10.13. Let N(dt) be a ‘martingale noise’ random orthogonal measure induced by the
square-integrable martingale M (Example 10.3). The stochastic integral w.r.t. the random orthog-
onal measure N(dt) and Itô’s stochastic integral w.r.t. M coincide.

Proof. Let 0 ≤ τ₁ ≤ τ₂ ≤ ... ≤ τ_{n+1} be bounded stopping times, φk ∈ L^∞(Fτk) bounded random variables and f(ω,t) a simple stochastic process of the form (10.12). From Lemma 10.12 we get

    ∫ f(t) N(dt) = ∑_{k=1}^n ∫ φk 1_{⟦τk,τ_{k+1}⟧}(t) N(dt)
                 = ∑_{k=1}^n ∫ φk 1_{⟦τk,∞⟦}(t) 1_{⟦0,τ_{k+1}⟧}(t) N(dt)
                 = ∑_{k=1}^n φk ∫ 1_{⟦τk,∞⟦}(t) 1_{⟦0,τ_{k+1}⟧}(t) N(dt)
                 = ∑_{k=1}^n φk ( M_{τ_{k+1}} − M_{τk} ).

This means that both stochastic integrals coincide on the simple stochastic processes. Since both
integrals are extended by Itô’s isometry, the assertion follows.

Example 10.14. Using random orthogonal measures we can re-state the Lévy-Itô decomposition
appearing in Theorem 9.12. For this, let Ñ(dt, dx) be the Poisson random orthogonal measure (Example 10.3.d) on E = (0, ∞) × (Rd \ {0}) with control measure dt × ν(dx) (ν is a Lévy measure). Additionally, we define for all deterministic functions h : (0, ∞) × Rd → R

    ∫∫ h(s, x) N(ω, ds, dx) := ∑_{0<s<∞} h(s, ∆Xs(ω))    for all ω ∈ Ω,

provided that the sum ∑_{0<s<∞} |h(s, ∆Xs(ω))| < ∞ for each ω.⁶ If X is a Lévy process with characteristic exponent ψ and Lévy triplet (l, Q, ν), then

    Xt = √Q Wt + ∫∫ 1_{(0,t]}(s) y 1_{(0,1)}(|y|) Ñ(ds, dy)     [ =: Mt, L²-martingale ]
         + tl + ∫∫ 1_{(0,t]}(s) y 1_{{|y|≥1}} N(ds, dy)          [ =: At, bdd. variation ]

where √Q Wt is the continuous Gaussian part and the integral terms form the pure jump part.
Example 10.14 is quite particular in the sense that N(·, dt, dx) is a bona fide positive measure,
and the control measure µ(dt, dx) is also the compensator, i.e. a measure such that

    Ñ((0,t] × B) = N((0,t] × B) − µ((0,t] × B)

is a square-integrable martingale.

6 This is essentially an ω-wise Riemann–Stieltjes integral. A sufficient condition for the absolute convergence is,
e.g. that h is continuous and h(t, ·) vanishes uniformly in t in some neighbourhood of x = 0. The reason for this is the
fact that Nt (ω, Bcε (0)) = N(ω, (0,t] × Bcε (0)) < ∞, i.e. there are at most finitely many jumps of size exceeding ε > 0.

Following Ikeda & Watanabe [22, Chapter II.4] we can generalize the set-up of Example 10.14 in the following way: let N(ω, dt, dx) be for each ω a positive measure of space-time type. Since t ↦ N(ω, (0,t] × B) is increasing, there is a unique compensator N̂(ω, dt, dx) such that for all B with E N̂((0,t] × B) < ∞

    Ñ(ω, (0,t] × B) := N(ω, (0,t] × B) − N̂(ω, (0,t] × B),    t > 0,

is a square-integrable martingale. If t ↦ N̂((0,t] × B) is continuous and B ↦ N̂((0,t] × B) a σ-finite measure, then one can show that the angle bracket satisfies

    ⟨ Ñ((0,·] × B), Ñ((0,·] × C) ⟩_t = N̂((0,t] × (B ∩ C)).

This means, in particular, that Ñ(ω, dt, dx) is a random orthogonal measure with control measure µ((0,t] × B) = E N̂((0,t] × B), and we are back in the theory which we have developed in the first part of this chapter.
It is possible to develop a fully-fledged stochastic calculus for this kind of random measures.
Definition 10.15. Let (Ω, A , P) be a probability space with a filtration (Ft )t>0 . A semimartin-
gale is a stochastic process X of the form

    Xt = X0 + At + Mt + ∫_0^t ∫ f(·, s, x) Ñ(ds, dx) + ∫_0^t ∫ g(·, s, x) N(ds, dx)

(with ∫_0^t := ∫_{(0,t]}) where

• X0 is an F0 measurable random variable,

• M is a continuous square-integrable local martingale (w.r.t. Ft),

• A is a continuous Ft adapted process of bounded variation,

• N(ds, dx), Ñ(ds, dx) and N̂(ds, dx) are as described above,

• f 1_{⟦0,τn⟧} ∈ L²(Ω × (0, ∞) × X, P ⊗ X, µ◦) for some increasing sequence τn ↑ ∞ of bounded stopping times,

• g is such that ∫_0^t ∫ g(ω, s, x) N(ω, ds, dx) exists as an ω-wise integral,

• f(·, s, x) g(·, s, x) ≡ 0.
In this case, we even have Itô's formula, see [22, Chapter II.5], for any F ∈ C²(R, R):

    F(Xt) − F(X0) = ∫_0^t F'(X_{s−}) dAs + ∫_0^t F'(X_{s−}) dMs + ½ ∫_0^t F''(X_{s−}) d⟨M⟩_s
                    + ∫_0^t ∫ ( F(X_{s−} + f(s,x)) − F(X_{s−}) ) Ñ(ds, dx)
                    + ∫_0^t ∫ ( F(X_{s−} + g(s,x)) − F(X_{s−}) ) N(ds, dx)
                    + ∫_0^t ∫ ( F(Xs + f(s,x)) − F(Xs) − f(s,x) F'(Xs) ) N̂(ds, dx),

where we use again the convention that ∫_0^t := ∫_{(0,t]}.
11. From Lévy to Feller processes

We have seen in Lemma 4.8 that the semigroup Pt f (x) := Ex f (Xt ) = E f (Xt + x) of a Lévy pro-
cess (Xt )t>0 is a Feller semigroup. Moreover, the convolution structure of the transition semigroup
R
E f (Xt +x) = f (x +y)P(Xt ∈ dy) is a consequence of the spatial homogeneity (translation invari-
ance) of the Lévy process, see Remark 4.5 and the characterization of translation invariant linear
functionals (Theorem A.10). Lemma 4.4 shows that the translation invariance of a Lévy process
is due to the assumptions (L1) and (L2).
It is, therefore, a natural question to ask what we get if we consider stochastic processes whose
semigroups are Feller semigroups which are not translation invariant. Since every Feller semi-
group admits a Markov transition kernel (Lemma 5.2), we can use Kolmogorov’s construction to
obtain a Markov process. Thus, the following definition makes sense.

Definition 11.1. A Feller process is a càdlàg Markov process (Xt )t>0 , Xt : Ω → Rd , t > 0, whose
transition semigroup Pt f (x) = Ex f (Xt ) is a Feller semigroup.

Remark 11.2. It is no restriction to require that a Feller process has càdlàg paths. By a fundamental
result in the theory of stochastic processes we can construct such modifications. Usually, one
argues like this: It is enough to study the coordinate processes, i.e. d = 1. Rather than looking
at t 7→ Xt we consider a (countable, point-separating) family of functions u : R → R and show
that each t 7→ u(Xt ) has a càdlàg modification. One way of achieving this is to use martingale
regularization techniques (e.g. Revuz & Yor [44, Chapter II.2]) which means that we should pick
u in such a way that u(Xt ) is a supermartingale. The usual candidate for this is the resolvent
e^{−λt} Rλ f(Xt) for some f ∈ C∞⁺(R). Indeed, if Ft = σ(Xs, s ≤ t) is the natural filtration, f ≥ 0 and s ≤ t, then

    Ex[ Rλ f(Xt) | Fs ] = E^{Xs} ∫_0^∞ e^{−λr} Pr f(X_{t−s}) dr = ∫_0^∞ e^{−λr} Pr P_{t−s} f(Xs) dr
                        = e^{λ(t−s)} ∫_{t−s}^∞ e^{−λu} Pu f(Xs) du ≤ e^{λ(t−s)} ∫_0^∞ e^{−λu} Pu f(Xs) du
                        = e^{λ(t−s)} Rλ f(Xs).

Let Ft = FtX := σ (Xs , s 6 t) be the canonical filtration.

Lemma 11.3. Every Feller process (Xt )t>0 is a strong Markov process, i.e.

    Ex[ f(X_{t+τ}) | Fτ ] = E^{Xτ} f(Xt),    Px-a.s. on {τ < ∞},  t ≥ 0,                 (11.1)

holds for any stopping time τ, Fτ := {F ∈ F∞ : F ∩ {τ 6 t} ∈ Ft ∀t > 0} and f ∈ C∞ (Rd ).


A routine approximation argument shows that (11.1) extends to f (y) = 1K (y) (where K is a
compact set) and then, by a Dynkin-class argument, to any f (y) = 1B (y) where B ∈ B(Rd ).

Proof. To prove (11.1), approximate τ from above by the discrete stopping times τn := (⌊2ⁿτ⌋ + 1) 2⁻ⁿ and observe that for F ∈ Fτ ∩ {τ < ∞}

    Ex[1_F f(X_{t+τ})]  (i)=  lim_{n→∞} Ex[1_F f(X_{t+τn})]  (ii)=  lim_{n→∞} Ex[1_F E^{Xτn} f(Xt)]  (iii)=  Ex[1_F E^{Xτ} f(Xt)].

Here we use that t ↦ Xt is right-continuous, plus (i) dominated convergence and (iii) the Feller continuity 4.7.f); (ii) is the strong Markov property for discrete stopping times which follows directly from the Markov property: since {τn < ∞} = {τ < ∞}, we get

    Ex[1_F f(X_{t+τn})] = ∑_{k=1}^∞ Ex[ 1_{F∩{τn=k2⁻ⁿ}} f(X_{t+k2⁻ⁿ}) ]
                        = ∑_{k=1}^∞ Ex[ 1_{F∩{τn=k2⁻ⁿ}} E^{X_{k2⁻ⁿ}} f(Xt) ]
                        = Ex[ 1_F E^{Xτn} f(Xt) ].

In the last calculation we use that F ∩ {τn = k2⁻ⁿ} ∈ F_{k2⁻ⁿ} for all F ∈ Fτ.

Once we know the generator of a Feller process, we can construct many important martingales
with respect to the canonical filtration of the process.

Corollary 11.4. Let (Xt )t>0 be a Feller process with generator (A, D(A)) and semigroup (Pt )t>0 .
For every f ∈ D(A) the process
    Mt^{[f]} := f(Xt) − ∫_0^t A f(Xr) dr,    t ≥ 0,                                      (11.2)

is a martingale for the canonical filtration FtX := σ (Xs , s 6 t) and any Px , x ∈ Rd .


Proof. Let s ≤ t, f ∈ D(A) and write, for short, Fs := FsX and Mt := Mt^{[f]}. By the Markov property

    Ex[ Mt − Ms | Fs ] = Ex[ f(Xt) − f(Xs) − ∫_s^t A f(Xr) dr | Fs ]
                       = E^{Xs} f(X_{t−s}) − f(Xs) − E^{Xs} ∫_0^{t−s} A f(Xu) du.

On the other hand, we get from the semigroup identity (5.5)

    E^{Xs} ∫_0^{t−s} A f(Xu) du = ∫_0^{t−s} Pu A f(Xs) du = P_{t−s} f(Xs) − f(Xs) = E^{Xs} f(X_{t−s}) − f(Xs)

which shows that Ex[Mt − Ms | Fs] = 0.
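The martingale property of (11.2) can also be checked numerically. The Python sketch below is an illustrative aside (not part of the original text): for Brownian motion the generator acts as A f = ½ f'' (the test function f = cos is only in the Feller domain after a cut-off, which does not affect the numerics), and the sample mean of M_T^{[f]} should reproduce f(x).

    import numpy as np

    rng = np.random.default_rng(5)

    x0, T, n_steps, n_paths = 0.7, 1.0, 400, 100_000
    dt = T / n_steps

    f  = np.cos
    Af = lambda y: -0.5 * np.cos(y)          # A f = (1/2) f'' for Brownian motion

    X = x0 + np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
    X = np.hstack([np.full((n_paths, 1), x0), X])      # prepend the starting point

    integral = np.sum(Af(X[:, :-1]) * dt, axis=1)      # left-point Riemann sum for ∫_0^T Af(X_s) ds
    M_T = f(X[:, -1]) - integral                       # M_T^{[f]} as in (11.2)

    print("E^x[M_T] ≈", round(float(np.mean(M_T)), 4), "   f(x) =", round(float(f(x0)), 4))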

Our approach from Chapter 6 to prove the structure of a Lévy generator ‘only’ uses the positive
maximum principle. Therefore, it can be adapted to Feller processes provided that the domain
D(A) is rich in the sense that C∞c (R ) ⊂ D(A). All we have to do is to take into account that
d

Feller processes are not any longer invariant under translations. The following theorem is due to
Courrège [14] and von Waldenfels [61, 62].

Theorem 11.5 (von Waldenfels, Courrège). Let (A, D(A)) be the generator of a Feller process such that C∞_c(Rd) ⊂ D(A). Then A|_{C∞_c(Rd)} is a pseudo differential operator

    Au(x) = −q(x, D)u(x) := − ∫ q(x, ξ) û(ξ) e^{i x·ξ} dξ                               (11.3)

whose symbol q : Rd × Rd → C is a measurable function of the form

    q(x, ξ) = q(x, 0) − i l(x)·ξ + ½ ξ·Q(x)ξ + ∫_{y≠0} ( 1 − e^{i y·ξ} + i y·ξ 1_{(0,1)}(|y|) ) ν(x, dy)      (11.4)

with q(x, 0) ≥ 0, and (l(x), Q(x), ν(x, dy)) is a Lévy triplet¹ for every fixed x ∈ Rd.

If we insert (11.4) into (11.3) and invert the Fourier transform we obtain the following integro-differential representation of the Feller generator A:

    A f(x) = l(x)·∇f(x) + ½ ∇·Q(x)∇f(x)
             + ∫_{y≠0} ( f(x+y) − f(x) − ∇f(x)·y 1_{(0,1)}(|y|) ) ν(x, dy).              (11.5)

This formula obviously extends to all functions f ∈ C²_b(Rd). In particular, we may use the function f(x) = e_ξ(x) = e^{i x·ξ}, and get²

    e_{−ξ}(x) A e_ξ(x) = −q(x, ξ).                                                        (11.6)
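Relation (11.6) can be verified symbolically for the local (diffusion) part of (11.5). The following Python/sympy sketch is an illustrative aside with hypothetical coefficients l(x) = sin x and Q(x) = 1 + x² in dimension one; it applies the local part of (11.5) to e_ξ and recovers the corresponding symbol.

    import sympy as sp

    x, xi = sp.symbols("x xi", real=True)
    l = sp.sin(x)          # hypothetical drift coefficient l(x)
    Q = 1 + x**2           # hypothetical diffusion coefficient Q(x) >= 0

    f = sp.exp(sp.I * x * xi)                                            # e_xi(x) = e^{i x xi}
    Af = l * sp.diff(f, x) + sp.Rational(1, 2) * Q * sp.diff(f, x, 2)    # local part of (11.5)

    symbol = sp.simplify(-sp.exp(-sp.I * x * xi) * Af)   # q(x, xi) = -e_{-xi}(x) A e_xi(x), cf. (11.6)
    print(symbol)          # expect  -I*xi*sin(x) + xi**2*(x**2 + 1)/2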

Proof of Theorem 11.5 (sketch). For a worked-out version see [9, Chapter 2.3]. In the proof of Theorem 6.8 use, instead of A₀ and A₀₀,

    A₀ f ⇝ A_x f := (A f)(x)    and    A₀₀ f ⇝ A_{xx} f := A_x(|· − x|² f)

for every x ∈ Rd. This is needed since Pt and A are no longer translation invariant, i.e. we cannot shift A₀ f to get A_x f. Then follow the steps 1°–4° to get ν(dy) ⇝ ν(x, dy) and 6°–9° for (l(x), Q(x)). Remark 6.9 shows that the term q(x, 0) is non-negative.
The key observation is, as in the proof of Theorem 6.8, that we can use in steps 3° and 7° the positive maximum principle³ to make sure that A_x f is a distribution of order 2, i.e.

    |A_x f| = |L_x f + S_x f| ≤ C_K ‖f‖_{(2)}    for all f ∈ C∞_c(K) and all compact sets K ⊂ Rd.

Here L_x is the local part with support in {x} accounting for (q(x,0), l(x), Q(x)), and S_x is the non-local part supported in Rd \ {x} giving ν(x, dy).

With some abstract functional analysis we can show some (local) boundedness properties of
x 7→ A f (x) and (x, ξ ) 7→ q(x, ξ ).
1 Cf.Definition 6.10
2 This should be compared with Definition 6.4 and the subsequent comments.
3 To be precise: its weakened form (PP), cf. page 41.

Corollary 11.6. In the situation of Theorem 11.5, the condition (PP) shows that

    sup_{|x|≤r} |A f(x)| ≤ C_r ‖f‖_{(2)}       for all f ∈ C∞_c(B_r(0)),                  (11.7)

and the positive maximum principle (PMP) gives

    sup_{|x|≤r} |A f(x)| ≤ C_{r,A} ‖f‖_{(2)}   for all f ∈ C∞_c(Rd), r > 0.               (11.8)

Proof. In the above sketched analogue of the proof of Theorem 6.8 we have seen that the family of linear functionals

    { C∞_c(B_r(0)) ∋ f ↦ A_x f : x ∈ B_r(0) },    A_x f := (A f)(x),

satisfies

    |A_x f| ≤ c_{r,x} ‖f‖_{(2)},    f ∈ C∞_c(B_r(0)),

i.e. A_x : (C²_b(B_r(0)), ‖·‖_{(2)}) → (R, |·|) is bounded. By the Banach–Steinhaus theorem (uniform boundedness principle)

    sup_{|x|≤r} |A_x f| ≤ C_r ‖f‖_{(2)}.

Since A also satisfies the positive maximum principle (PMP), we know from step 4° of the (suitably adapted) proof of Theorem 6.8 that

    ∫_{|y|>1} ν(x, dy) ≤ Aφ₀(x)    for some φ₀ ∈ C_c(B₁(0)).

Let r > 1 and pick χ = χ_r ∈ C∞_c(Rd) such that 1_{B_{2r}(0)} ≤ χ ≤ 1_{B_{3r}(0)}. We get for |x| ≤ r

    A f(x) = A[χ f](x) + A[(1 − χ) f](x)
           = A[χ f](x) + ∫_{|y|>r} ( (1 − χ(x+y)) f(x+y) − (1 − χ(x)) f(x) ) ν(x, dy),

where the term (1 − χ(x)) f(x) vanishes since χ ≡ 1 on B_{2r}(0) and |x| ≤ r, and so

    sup_{|x|≤r} |A f(x)| ≤ C_r ‖χ f‖_{(2)} + ‖f‖_∞ ‖Aφ₀‖_∞ ≤ C_{r,A} ‖f‖_{(2)}.

Corollary 11.7. In the situation of Theorem 11.5 there exists a locally bounded nonnegative func-
tion γ : Rd → [0, ∞) such that

|q(x, ξ )| 6 γ(x)(1 + |ξ |2 ), x, ξ ∈ Rd . (11.9)

Proof. Using (11.8) we can extend A by continuity to C2b (Rd ) and, therefore,

−q(x, ξ ) = e−ξ (x)Aeξ (x), eξ (x) = ei x·ξ

makes sense. Moreover, we have sup|x|6r |Aeξ (x)| 6 Cr,A keξ k(2) for any r > 1; since keξ k(2) is a
polynomial of order 2 in the variable ξ , the claim follows.

For a Lévy process we have ψ(0) = 0 since P(Xt ∈ Rd ) = 1 for all t > 0, i.e. the process Xt
does not explode in finite time. For Feller processes the situation is more complicated. We need
the following technical lemmas.

Lemma 11.8. Let q(x, ξ) be the symbol of (the generator of) a Feller process as in Theorem 11.5 and F ⊂ Rd be a closed set. Then the following assertions are equivalent.

a) |q(x, ξ)| ≤ C(1 + |ξ|²) for all x ∈ F and ξ ∈ Rd, where C = 2 sup_{|ξ|≤1} sup_{x∈F} |q(x, ξ)|.

b) sup_{x∈F} q(x, 0) + sup_{x∈F} |l(x)| + sup_{x∈F} ‖Q(x)‖ + sup_{x∈F} ∫_{y≠0} |y|²/(1 + |y|²) ν(x, dy) < ∞.

If F = Rd ,
then the equivalent properties of Lemma 11.8 are often referred to as ‘the symbol
has bounded coefficients’.

Outline of the proof (see [57, Appendix] for a complete proof ). The direction b)⇒a) is proved as
Theorem 6.2. Observe that ξ 7→ q(x, ξ ) is for fixed x the characteristic exponent of a Lévy process.
The finiteness of the constant C follows from the assumption b) and the Lévy–Khintchine formula
(11.4).
For the converse a)⇒b) we note that the integrand appearing in (11.4) can be estimated by c |y|²/(1 + |y|²), which is itself a Lévy exponent:

    |y|²/(1 + |y|²) = ∫ ( 1 − cos(y·ξ) ) g(ξ) dξ,    g(ξ) = ½ ∫_0^∞ (2πλ)^{−d/2} e^{−|ξ|²/(2λ)} e^{−λ/2} dλ.

Therefore, by Tonelli's theorem,

    ∫_{y≠0} |y|²/(1 + |y|²) ν(x, dy) = ∫∫_{y≠0} ( 1 − cos(y·ξ) ) ν(x, dy) g(ξ) dξ
                                     = ∫ g(ξ) ( Re q(x, ξ) − ½ ξ·Q(x)ξ − q(x, 0) ) dξ.
Lemma 11.9. Let A be the generator of a Feller process, assume that C∞_c(Rd) ⊂ D(A) and denote by q(x, ξ) the symbol of A. For any cut-off function χ ∈ C∞_c(Rd) satisfying 1_{B₁(0)} ≤ χ ≤ 1_{B₂(0)} and χ_r(x) := χ(x/r) one has

    |q(x, D)(χ_r e_ξ)(x)| ≤ 4 sup_{|η|≤1} |q(x, η)| ∫_{Rd} ( 1 + r^{−2}|ρ|² + |ξ|² ) |χ̂(ρ)| dρ,          (11.10)

    lim_{r→∞} A(χ_r e_ξ)(x) = −e_ξ(x) q(x, ξ)    for all x, ξ ∈ Rd.                                       (11.11)

Proof. Observe that (χ_r e_ξ)^∧(η) = r^d χ̂(r(η − ξ)) and

    q(x, D)(χ_r e_ξ)(x) = ∫ q(x, η) e^{i x·η} (χ_r e_ξ)^∧(η) dη
                        = ∫ q(x, η) e^{i x·η} r^d χ̂(r(η − ξ)) dη                                          (11.12)
                        = ∫ q(x, ξ + r^{−1}ρ) e^{i x·(ξ + ρ/r)} χ̂(ρ) dρ.

Therefore we can use the estimate (11.9) with the optimal constant γ(x) = 2 sup_{|η|≤1} |q(x, η)| and the elementary estimate (a + b)² ≤ 2(a² + b²) to obtain

    |q(x, D)(χ_r e_ξ)(x)| ≤ ∫ |q(x, ξ + r^{−1}ρ)| |χ̂(ρ)| dρ
                          ≤ 4 sup_{|η|≤1} |q(x, η)| ∫ ( 1 + r^{−2}|ρ|² + |ξ|² ) |χ̂(ρ)| dρ.

This proves (11.10); it also allows us to use dominated convergence in (11.12) to get (11.11). Just observe that χ̂ ∈ S(Rd) and ∫ χ̂(ρ) dρ = χ(0) = 1.

Lemma 11.10. Let q(x, ξ ) be the symbol of (the generator of ) a Feller process. Then the following
assertions are equivalent:

a) x 7→ q(x, ξ ) is continuous for all ξ .

b) x 7→ q(x, 0) is continuous.

c) Tightness: lim_{r→∞} sup_{x∈K} ν(x, Rd \ B_r(0)) = 0 for all compact sets K ⊂ Rd.

d) Uniform continuity at the origin: lim_{|ξ|→0} sup_{x∈K} |q(x, ξ) − q(x, 0)| = 0 for all compact sets K ⊂ Rd.

Proof. Let χ : [0, ∞) → [0, 1] be a decreasing C∞ -function satisfying 1[0,1) 6 χ 6 1[0,4) . The
functions χn (x) := χ(|x|2 /n2 ), x ∈ Rd and n ∈ N, are radially symmetric, smooth functions with
1Bn (0) 6 χn 6 1B2n (0) . Fix any compact set K ⊂ Rd and some sufficiently large n0 such that
K ⊂ Bn0 (0).

a)⇒b) is obvious.

b)⇒c) For m > n > 2n₀ the positive maximum principle implies

    1_K(x) A(χ_n − χ_m)(x) ≥ 0.

Therefore, −1_K(x) q(x, 0) = lim_{n→∞} 1_K(x) Aχ_{n+n₀}(x) is a decreasing limit of continuous functions. Since the limit function q(x, 0) is continuous, Dini's theorem implies that the limit is uniform on the set K. From the integro-differential representation (11.5) of the generator we get

    1_K(x) |Aχ_m(x) − Aχ_n(x)| = 1_K(x) ∫_{n−n₀ ≤ |y| ≤ 2m+n₀} ( χ_m(x+y) − χ_n(x+y) ) ν(x, dy).

Letting m → ∞ yields

    1_K(x) |q(x, 0) + Aχ_n(x)| = 1_K(x) ∫_{|y| > n−n₀} ( 1 − χ_n(x+y) ) ν(x, dy)
                               ≥ 1_K(x) ∫_{|y| > n−n₀} ( 1 − 1_{B_{2n+n₀}(0)}(y) ) ν(x, dy)
                               ≥ 1_K(x) ν(x, B^c_{2n+n₀}(0)),

where we use that K ⊂ Bn0 (0) and

χn (x + y) 6 1B2n (0) (x + y) = 1B2n (0)−x (y) 6 1B2n+n0 (0) (y).

Since the left-hand side converges uniformly to 0 as n → ∞, c) follows.


c)⇒d) Since the function x ↦ q(x, ξ) is locally bounded, we conclude from Lemma 11.8 that sup_{x∈K} |l(x)| + sup_{x∈K} ‖Q(x)‖ < ∞. Thus, lim_{|ξ|→0} sup_{x∈K} (|l(x)·ξ| + |ξ·Q(x)ξ|) = 0, and we may safely assume that l ≡ 0 and Q ≡ 0. If |ξ| ≤ 1 we find, using (11.4) and Taylor's formula for the integrand,

    |q(x,ξ) − q(x, 0)| = | ∫_{y≠0} ( 1 − e^{i y·ξ} + i y·ξ 1_{(0,1)}(|y|) ) ν(x, dy) |
                       ≤ ∫_{0<|y|²<1} ½ |y|²|ξ|² ν(x, dy) + ∫_{1≤|y|²<1/|ξ|} |y||ξ| ν(x, dy) + 2 ∫_{|y|²≥1/|ξ|} ν(x, dy)
                       ≤ 2 ∫_{0<|y|²<1/|ξ|} |y|²/(1 + |y|²) ν(x, dy) ( 1 + |ξ|^{−1/2} ) |ξ| + 2 ν(x, {y : |y|² ≥ 1/|ξ|}).

Since this estimate is uniform for x ∈ K, we get d) as |ξ | → 0.


d)⇒c) As before, we may assume that l ≡ 0 and Q ≡ 0. For every r > 0

    ν(x, B^c_r(0)) ≤ 2 ∫_{|y|>r} |y/r|²/(1 + |y/r|²) ν(x, dy)
                   = 2 ∫_{|y|>r} ∫_{Rd} ( 1 − cos(η·y/r) ) g(η) dη ν(x, dy)
                   ≤ 2 ∫_{Rd} ( Re q(x, η/r) − q(x, 0) ) g(η) dη,

where g(η) is as in the proof of Lemma 11.8. Since ∫ (1 + |η|²) g(η) dη < ∞, we can use (11.9) and find

    ν(x, B^c_r(0)) ≤ c_g sup_{|η| ≤ 1/r} ( Re q(x, η) − q(x, 0) ).

Taking the supremum over all x ∈ K and letting r → ∞ proves c).


c)⇒a) From Lemma 11.9 we know that lim_{n→∞} e_{−ξ}(x) A[χ_n e_ξ](x) = −q(x, ξ). Let us show that this convergence is uniform for x ∈ K. Let m > n > 2n₀. For x ∈ K

    | e_{−ξ}(x) A[e_ξ χ_n](x) − e_{−ξ}(x) A[e_ξ χ_m](x) |
      = | ∫_{y≠0} ( e_ξ(y) χ_n(x+y) − e_ξ(y) χ_m(x+y) ) ν(x, dy) |
      ≤ ∫_{y≠0} | χ_m(x+y) − χ_n(x+y) | ν(x, dy)
      = ∫_{n−n₀ ≤ |y| ≤ 2m+n₀} ( χ_m(x+y) − χ_n(x+y) ) ν(x, dy)
      ≤ ν(x, B^c_{n−n₀}(0)).



In the penultimate step we use that, because of the definition of the functions χn ,

supp χm (x + ·) − χn (x + ·) ⊂ B2m (x) \ Bn (x) ⊂ B2m+n0 (0) \ Bn−n0 (0)

for all x ∈ K ⊂ Bn0 (0). The right-hand side tends to 0 uniformly for x ∈ K as n → ∞, hence
m → ∞.

Remark 11.11. The argument used in the first three lines of the step b)⇒c) in the proof of Lemma
11.10 shows, incidentally, that

x 7→ q(x, ξ ) is always upper semicontinuous

since it is (locally) a decreasing limit of continuous functions.


Remark 11.12. Let A be the generator of a Feller process, and assume that C∞ c (R ) ⊂ D(A);
d

although A maps D(A) into C∞ (Rd ), this is not enough to guarantee that the symbol q(x, ξ ) is
continuous in the variable x. On the other hand, if the Feller process X has only bounded jumps,
i.e. if the support of the Lévy measure ν(x, ·) is uniformly bounded, then q(·, ξ ) is continuous.
This is, in particular, true for diffusions.
This follows immediately from Lemma 11.10.c) which holds if ν(x, Bcr (0)) = 0 for some r > 0
and all x ∈ Rd .
We can also give a direct argument: pick χ ∈ C∞ d
c (R ) satisfying 1B3r (0) 6 χ 6 1. From the
representation (11.5) it is not hard to see that

A f (x) = A[χ f ](x) for all f ∈ C2b (Rd ) and x ∈ Br (0);

in particular, A[χ f ] is continuous.


If we take f (x) := eξ (x), then, by (11.6), −q(x, ξ ) = e−ξ (x)Aeξ (x) = e−ξ (x)A[χeξ ](x) which
proves that x 7→ q(x, ξ ) is continuous on every ball Br (0), hence everywhere.
We can now discuss the role of q(x, ξ ) for the conservativeness of a Feller process.

Theorem 11.13. Let (Xt )t>0 be a Feller process with infinitesimal generator (A, D(A)) such that
C∞
c (R ) ⊂ D(A), symbol q(x, ξ ) and semigroup (Pt )t>0 .
d

a) If x 7→ q(x, ξ ) is continuous for all ξ ∈ Rd and Pt 1 = 1, then q(x, 0) = 0.

b) If q(x, ξ ) has bounded coefficients and q(x, 0) = 0, then x 7→ q(x, ξ ) is continuous for all
ξ ∈ Rd and Pt 1 = 1.

Proof. Let χ and χr , r ∈ N, be as in Lemma 11.9. Then eξ χr ∈ D(A) and, by Corollary 11.4,
Z
Mt := eξ χr (Xt ) − eξ χr (x) − A(eξ χr )(Xs ) ds, t > 0,
[0,t)

is a martingale. Using optional stopping for the stopping time

τ := τRx := inf{s > 0 : |Xs − x| > R}, x ∈ Rd , R > 0,



the stopped process (Mt∧τ )t>0 is still a martingale. Since Ex Mt∧τ = 0, we get
    Ex[ (χ_r e_ξ)(X_{t∧τ}) ] − χ_r e_ξ(x) = Ex ∫_{[0,t∧τ)} A(χ_r e_ξ)(Xs) ds.

Note that the integrand is evaluated only for times s < t ∧ τ where |Xs| ≤ R + |x|. Since A(e_ξ χ_r)(x) is locally bounded, we can use dominated convergence and Lemma 11.9 and we find, as r → ∞,

    Ex[ e_ξ(X_{t∧τ}) ] − e_ξ(x) = −Ex ∫_{[0,t∧τ)} e_ξ(Xs) q(Xs, ξ) ds.

a) Set ξ = 0 and observe that Pt1 = 1 implies that τ = τ_R^x → ∞ a.s. as R → ∞. Therefore,

    Px(X_{t∧τ} ∈ Rd) − 1 = −Ex ∫_{[0,τ∧t)} q(Xs, 0) ds,

and with Fatou's Lemma we can let R → ∞ to get

    0 = lim inf_{R→∞} Ex ∫_{[0,τ∧t)} q(Xs, 0) ds ≥ Ex [ lim inf_{R→∞} ∫_{[0,τ∧t)} q(Xs, 0) ds ] = Ex ∫_0^t q(Xs, 0) ds.

Since x ↦ q(x, 0) is continuous and q(x, 0) non-negative, we conclude with Tonelli's theorem that

    q(x, 0) = lim_{t→0} (1/t) Ex ∫_0^t q(Xs, 0) ds = 0.

b) Set ξ = 0 and observe that the boundedness of the coefficients implies that

    Px(Xt ∈ Rd) − 1 = −Ex ∫_{[0,t)} q(Xs, 0) ds

as R → ∞. Since the right-hand side is 0, we get Pt1 = Px(Xt ∈ Rd) = 1.

Remark 11.14. The boundedness of the coefficients in Theorem 11.13.b) is important. If the
coefficients of q(x, ξ ) grow too rapidly, we may observe explosion in finite time even if q(x, 0) = 0.
A typical example in dimension 3 is the diffusion process given by the generator
1
L f (x) = a(x)∆ f (x)
2
where a(x) is continuous, rotationally symmetric, a(x) = α(|x|) for a suitable function α(r) satisfying ∫_1^∞ dr/α(r) < ∞, see Stroock & Varadhan [60, p. 260, 10.3.3]; the corresponding symbol is q(x, ξ) = ½ a(x)|ξ|². This process explodes in finite time. Since this is essentially a
time-changed Brownian motion (see Böttcher, Schilling & Wang [9, Chapter 4.1]), this example
works only if Brownian motion is transient, i.e. in dimensions d = 3 and higher. A sufficient
criterion for conservativeness in terms of the symbol is

    lim inf_{r→∞} sup_{|y−x|≤2r} sup_{|η|≤1/r} |q(y, η)| < ∞    for all x ∈ Rd,

see [9, Theorem 2.34].


12. Symbols and semimartingales

So far, we have been treating the symbol q(x, ξ ) of (the generator of) a Feller process X as an an-
alytic object. On the other hand, Theorem 11.13 indicates, that there should be some probabilistic
consequences. In this chapter we want to follow this lead, show a probabilistic method to calcu-
late the symbol and link it to the semimartingale characteristics of a Feller process. The blueprint
for this is the relation of the Lévy–Itô decomposition (which is the semimartingale decomposition
of a Lévy process, cf. Theorem 9.12) with the Lévy–Khintchine formula for the characteristic
exponent (which coincides with the symbol of a Lévy process, cf. Corollary 9.13).
For a Lévy process Xt with semigroup Pt f (x) = Ex f (Xt ) = E f (Xt + x) the symbol can be cal-
culated in the following way:

    lim_{t→0} ( e_{−ξ}(x) Pt e_ξ(x) − 1 ) / t = lim_{t→0} ( Ex e^{i ξ·(Xt−x)} − 1 ) / t = lim_{t→0} ( e^{−tψ(ξ)} − 1 ) / t = −ψ(ξ).       (12.1)
For a Feller process a similar formula is true.

Theorem 12.1. Let X = (Xt )t>0 be a Feller process with transition semigroup (Pt )t>0 and gen-
erator (A, D(A)) such that C∞c (R ) ⊂ D(A). If x 7→ q(x, ξ ) is continuous and q has bounded
d

coefficients (Lemma 11.8 with F = Rd ), then

    −q(x, ξ) = lim_{t→0} ( Ex e^{i ξ·(Xt−x)} − 1 ) / t.                                  (12.2)
Proof. Pick χ ∈ C∞_c(Rd), 1_{B₁(0)} ≤ χ ≤ 1_{B₂(0)}, and set χ_n(x) := χ(x/n). Obviously, χ_n → 1 as n → ∞. By Lemma 5.4,

    Pt[χ_n e_ξ](x) − χ_n(x) e_ξ(x) = ∫_0^t A Ps[χ_n e_ξ](x) ds = ∫_0^t Ex A[χ_n e_ξ](Xs) ds
                                   = − ∫_0^t ∫_{Rd} Ex[ e_η(Xs) q(Xs, η) ] (χ_n e_ξ)^∧(η) dη ds,    (χ_n e_ξ)^∧(η) = n^d χ̂(n(η−ξ)),
                                   −−−→ − ∫_0^t Ps[ e_ξ q(·, ξ) ](x) ds    as n → ∞.

Since x ↦ q(x, ξ) is continuous, we can divide by t and let t → 0; this yields

    lim_{t→0} (1/t) ( Pt e_ξ(x) − e_ξ(x) ) = −e_ξ(x) q(x, ξ).
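Formula (12.2) suggests a simple Monte Carlo approximation of the symbol. The Python sketch below is an illustrative aside (parameters are hypothetical): for the Lévy process Xt = lt + σWt, the finite-t quotient in (12.1)/(12.2) is compared with −ψ(ξ) = i lξ − ½σ²ξ².

    import numpy as np

    rng = np.random.default_rng(7)

    l, sigma, xi = 1.0, 0.8, 2.0
    t, n_paths = 1e-3, 2_000_000

    X_t = l * t + sigma * rng.normal(0.0, np.sqrt(t), n_paths)
    estimate = (np.mean(np.exp(1j * xi * X_t)) - 1) / t       # ≈ -psi(xi) for small t
    psi = -1j * l * xi + 0.5 * sigma**2 * xi**2

    print("(E e^{i xi X_t} - 1)/t ≈", np.round(estimate, 3))
    print("-psi(xi)               =", np.round(-psi, 3))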


Theorem 12.1 is a relatively simple probabilistic formula to calculate the symbol. We want to
relax the boundedness and continuity assumptions. Here Dynkin’s characteristic operator becomes
useful.

Lemma 12.2 (Dynkin’s formula). Let (Xt )t>0 be a Feller process with semigroup (Pt )t>0 and
generator (A, D(A)). For every stopping time σ with Ex σ < ∞ one has
    Ex f(X_σ) − f(x) = Ex ∫_{[0,σ)} A f(Xs) ds,    f ∈ D(A).                             (12.3)

Proof. From Corollary 11.4 we know that Mt^{[f]} := f(Xt) − f(X0) − ∫_0^t A f(Xs) ds is a martingale;
thus (12.3) follows from the optional stopping theorem.

Definition 12.3. Let (Xt )t>0 be a Feller process. A point a ∈ Rd is an absorbing point, if

Pa (Xt = a, ∀t > 0) = 1.

Denote by τr := inf{t > 0 : |Xt − X0 | > r} the first exit time from the ball Br (x) centered at the
starting position x = X0 .

Lemma 12.4. Let (Xt )t>0 be a Feller process and assume that b ∈ Rd is not absorbing. Then there
exists some r > 0 such that Eb τr < ∞.

Proof. 1◦ If b is not absorbing, then there is some f ∈ D(A) such that A f (b) 6= 0. Assume the
contrary, i.e.
A f (b) = 0 for all f ∈ D(A).

By Lemma 5.4, Ps f ∈ D(A) for all s > 0, and


Z t
Pt f (b) − f (b) = A(Ps f )(b) ds = 0.
0

So, Pt f (b) = f (b) for all f ∈ D(A). Since the domain D(A) is dense in C∞ (Rd ) (Remark 5.5), we
get Pt f (b) = f (b) for all f ∈ C∞ (Rd ), hence Pb (Xt = b) = 1 for any t > 0. Therefore,

Pb (Xq = b, ∀q ∈ Q, q > 0) = 1 and Pb (Xt = b, ∀t > 0) = 1,

because of the right-continuity of the sample paths. This means that b is an absorbing point,
contradicting our assumption.

2◦ Pick f ∈ D(A) such that A f (b) > 0. Since A f is continuous, there exist ε > 0 and r > 0 such
that A f |Br (b) > ε > 0. From Dynkin’s formula (12.3) with σ = τr ∧ n, n > 1, we deduce
Z
εEb (τr ∧ n) 6 Eb A f (Xs ) ds = Eb f (Xτr ∧n ) − f (b) 6 2k f k∞ .
[0,τr ∧n)

Finally, monotone convergence shows that Eb τr 6 2k f k∞ /ε < ∞.



Definition 12.5 (Dynkin’s operator). Let X be a Feller process. The linear operator (A, D(A))
defined by
    A f(x) := {  lim_{r→0} ( Ex f(X_{τ_r}) − f(x) ) / Ex τ_r,    if x is not absorbing,
                 0,                                              otherwise,

    D(A) := { f ∈ C∞(Rd) : the above limit exists pointwise },

is called Dynkin’s (characteristic) operator.

Lemma 12.6. Let (Xt )t>0 be a Feller process with generator (A, D(A)) and characteristic opera-
tor (A, D(A)).

a) A is an extension of A, i.e. D(A) ⊂ D(A) and A|D(A) = A.

b) (A, D) = (A, D(A)) if D = { f ∈ D(A) : f , A f ∈ C∞ (Rd )}.

Proof. a) Let f ∈ D(A) and assume that x ∈ Rd is not absorbing. By Lemma 12.4 there is some
r = r(x) > 0 with Ex τr < ∞. Since we have A f ∈ C∞ (Rd ), there exists for every ε > 0 some δ > 0
such that
|A f (y) − A f (x)| < ε for all y ∈ Bδ (x).

Without loss of generality let δ < r. Using Dynkin's formula (12.3) with σ = τ_δ, we see

    | Ex f(X_{τ_δ}) − f(x) − A f(x) Ex τ_δ | ≤ Ex ∫_{[0,τ_δ)} |A f(Xs) − A f(x)| ds ≤ ε Ex τ_δ.

Thus, lim_{r→0} ( Ex f(X_{τ_r}) − f(x) ) / Ex τ_r = A f(x).




If x is absorbing and f ∈ D(A), then A f (x) = 0 and so A f (x) = A f (x).

b) Since (A, D) satisfies the (PMP), the claim follows from Lemma 5.11.

Theorem 12.7. Let (Xt)_{t≥0} be a Feller process with infinitesimal generator (A, D(A)) such that C∞_c(Rd) ⊂ D(A) and x ↦ q(x, ξ) is continuous¹. Then

    −q(x, ξ) = lim_{r→0} ( Ex e^{i(X_{τ_r}−x)·ξ} − 1 ) / Ex τ_r                           (12.4)

for all x ∈ Rd (as usual, 1/∞ := 0). In particular, q(a, ξ) = 0 for all absorbing states a ∈ Rd.

Proof. Let χ_n ∈ C∞_c(Rd) such that 1_{B_n(0)} ≤ χ_n ≤ 1. By Dynkin's formula (12.3)

    e_{−ξ}(x) Ex[ χ_n(X_{τ_r∧t}) e_ξ(X_{τ_r∧t}) ] − χ_n(x) = Ex ∫_{[0,τ_r∧t)} e_{−ξ}(x) A[χ_n e_ξ](Xs) ds.

1 For instance, if X
has bounded jumps, see Lemma 11.10 and Remark 11.12. Our proof will show that it is actually
enough to assume that s 7→ q(Xs , ξ ) is right-continuous.

Observe that A[χ_n e_ξ](Xs) is bounded if s < τ_r, see Corollary 11.6. Using the dominated convergence theorem, we can let n → ∞ to get

    e_{−ξ}(x) Ex e_ξ(X_{τ_r∧t}) − 1 = Ex ∫_{[0,τ_r∧t)} e_{−ξ}(x) A e_ξ(Xs) ds
                                    (11.6)
                                     =   −Ex ∫_{[0,τ_r∧t)} e_ξ(Xs − x) q(Xs, ξ) ds.        (12.5)

If x is absorbing, we have q(x, ξ) = 0, and (12.4) holds trivially. For non-absorbing x, we pass to the limit t → ∞ and get, using Ex τ_r < ∞ (see Lemma 12.4),

    ( e_{−ξ}(x) Ex e_ξ(X_{τ_r}) − 1 ) / Ex τ_r = − (1/Ex τ_r) Ex ∫_{[0,τ_r)} e_ξ(Xs − x) q(Xs, ξ) ds.

Since s ↦ q(Xs, ξ) is right-continuous at s = 0, the limit r → 0 exists, and (12.4) follows.

A small variation of the above proof yields

Corollary 12.8 (Schilling, Schnurr [57]). Let X be a Feller process with generator (A, D(A)) such that C∞_c(Rd) ⊂ D(A) and x ↦ q(x, ξ) is continuous. Then

    −q(x, ξ) = lim_{t→0} ( Ex e^{i(X_{t∧τ_r}−x)·ξ} − 1 ) / t                               (12.6)

for all x ∈ Rd and r > 0.

Proof. We follow the proof of Theorem 12.7 up to (12.5). This relation can be rewritten as

    ( Ex e^{i(X_{t∧τ_r}−x)·ξ} − 1 ) / t = − (1/t) Ex ∫_0^t e_ξ(Xs − x) q(Xs, ξ) 1_{[0,τ_r)}(s) ds.

Observe that Xs is bounded if s < τ_r and that s ↦ q(Xs, ξ) is right-continuous. Therefore, the limit t → 0 exists and yields (12.6).

Every Feller process (Xt)_{t≥0} such that C∞_c(Rd) ⊂ D(A) is a semimartingale. Moreover, the semimartingale characteristics can be expressed in terms of the Lévy triplet (l(x), Q(x), ν(x, dy)) of the symbol. Recall that a (d-dimensional) semimartingale is a stochastic process of the form

    Xt = X0 + X^c_t + ∫_0^t ∫ y 1_{(0,1)}(|y|) ( µ^X(·, ds, dy) − ν(·, ds, dy) ) + ∑_{s≤t} 1_{[1,∞)}(|∆Xs|) ∆Xs + Bt

where X^c is the continuous martingale part, B is a previsible process with paths of finite variation (on compact time intervals), and

    µ^X(ω, ds, dy) = ∑_{s : ∆Xs(ω)≠0} δ_{(s, ∆Xs(ω))}(ds, dy)

is the jump measure whose compensator is ν(ω, ds, dy). The triplet (B, C, ν) with the (predictable) quadratic variation C = [X^c, X^c] of X^c is called the semimartingale characteristics.

Theorem 12.9 (Schilling [53], Schnurr [59]). Let (Xt)_{t≥0} be a Feller process with infinitesimal generator (A, D(A)) such that C∞_c(Rd) ⊂ D(A) and symbol q(x, ξ) given by (11.4). If q(x, 0) = 0,² then X is a semimartingale whose semimartingale characteristics can be expressed by the Lévy triplet (l(x), Q(x), ν(x, dy)):

    Bt = ∫_0^t l(Xs) ds,    Ct = ∫_0^t Q(Xs) ds,    ν(·, ds, dy) = ν(Xs(·), dy) ds.

² A sufficient condition is, for example, that X has infinite life-time and x ↦ q(x, ξ) is continuous (either for all ξ or just for ξ = 0), cf. Theorem 11.13 and Lemma 11.10.

Proof. 1° Combining Corollary 11.4 with the integro-differential representation (11.5) of the generator shows that

    Mt^{[f]} = f(Xt) − ∫_{[0,t)} A f(Xs) ds
             = f(Xt) − ∫_{[0,t)} l(Xs)·∇f(Xs) ds − ½ ∫_{[0,t)} ∇·Q(Xs)∇f(Xs) ds
               − ∫_{[0,t)} ∫_{y≠0} ( f(Xs + y) − f(Xs) − ∇f(Xs)·y 1_{(0,1)}(|y|) ) ν(Xs, dy) ds

is a martingale for all f ∈ C²_b(Rd) ∩ D(A).

2° We claim that C²_c(Rd) ⊂ D(A). Indeed, let f ∈ C²_c(Rd) with supp f ⊂ B_r(0) for some r > 0 and pick a sequence fn ∈ C∞_c(B_{2r}(0)) such that lim_{n→∞} ‖f − fn‖_{(2)} = 0. Using (11.7) we get

    sup_{|x|≤3r} |A fn(x) − A fm(x)| ≤ c_{3r} ‖fn − fm‖_{(2)} −−−−→ 0    as m, n → ∞.

Since supp fn ⊂ B_{2r}(0) and fn → f uniformly, there is some u ∈ C∞_c(B_{3r}(0)) with |fn(x)| ≤ u(x). Therefore, we get for |x| > 2r

    |A fn(x) − A fm(x)| ≤ ∫_{y≠0} |fn(x+y) − fm(x+y)| ν(x, dy)
                        ≤ 2 ∫_{y≠0} u(x+y) ν(x, dy) = 2 A u(x) −−−→ 0    as |x| → ∞,

uniformly for all m, n. This shows that (A fn)_{n∈N} is a Cauchy sequence in C∞(Rd). By the closedness of (A, D(A)) we gather that f ∈ D(A) and A f = lim_{n→∞} A fn.

3° Fix r > 1, pick χ_r ∈ C∞_c(Rd) such that 1_{B_{3r}(0)} ≤ χ_r ≤ 1, and set

    σ = σ_r := inf{t > 0 : |Xt − X0| > r} ∧ inf{t > 0 : |∆Xt| > r}.

Since (Xt)_{t≥0} has càdlàg paths and infinite life-time, (σ_r)_{r>0} is a family of stopping times with σ_r ↑ ∞. For any f ∈ C²(Rd) ∩ C_b(Rd) and x ∈ Rd we set f_r := χ_r f, f^x := f(· − x), f_r^x := f_r(· − x), and consider

    M^{[f^x]}_{t∧σ} = f^x(X_{t∧σ}) − ∫_{[0,t∧σ)} l(Xs)·∇f^x(Xs) ds − ½ ∫_{[0,t∧σ)} ∇·Q(Xs)∇f^x(Xs) ds
                      − ∫_{[0,t∧σ)} ∫_{y≠0} ( f^x(Xs + y) − f^x(Xs) − ∇f^x(Xs)·y 1_{(0,1)}(|y|) ) ν(Xs, dy) ds
                    = f_r^x(X_{t∧σ}) − ∫_{[0,t∧σ)} l(Xs)·∇f_r^x(Xs) ds − ½ ∫_{[0,t∧σ)} ∇·Q(Xs)∇f_r^x(Xs) ds
                      − ∫_{[0,t∧σ)} ∫_{0<|y|<r} ( f_r^x(Xs + y) − f_r^x(Xs) − ∇f_r^x(Xs)·y 1_{(0,1)}(|y|) ) ν(Xs, dy) ds
                    = M^{r[f^x]}_{t∧σ}.

Since f_r ∈ C²_c(Rd) ⊂ D(A), we see that M^{[f^x]}_t is a local martingale (with reducing sequence σ_r, r > 0), and by a general result of Jacod & Shiryaev [27, Theorem II.2.42], it follows that X is a semimartingale with the characteristics mentioned in the theorem.

We close this chapter with the discussion of a very important example: Lévy driven stochastic
differential equations. From now on we assume that

Φ : Rd → Rd×n is a matrix-valued, measurable function,


L = (Lt )t>0 is an n-dimensional Lévy process with exponent ψ(ξ ),

and consider the following Itô stochastic differential equation (SDE)

dXt = Φ(Xt− ) dLt , X0 = x. (12.7)

If Φ is globally Lipschitz continuous, then the SDE (12.7) has a unique solution which is a strong
Markov process, see Protter [43, Chapter V, Theorem 32]3 . If we write Xtx for the solution of (12.7)
with initial condition X0 = x = X0x , then the flow x 7→ Xtx is continuous [43, Chapter V, Theorem
38].
If we use Lt = (t,Wt , Jt )> as driving Lévy process where W is a Brownian motion and J is a
pure-jump Lévy process (we assume4 that W ⊥ ⊥ J), and if Φ is a block-matrix, then we see that
(12.7) covers also SDEs of the form

dXt = f (Xt− ) dt + F(Xt− ) dWt + G(Xt− ) dJt .
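A minimal numerical scheme for such equations is the Euler–Maruyama method with an additional compound Poisson term. The following Python sketch is an illustrative aside with hypothetical coefficients f, F, G and a compound Poisson J with normal jump sizes; it is a crude approximation of a solution of (12.7), not part of the original text.

    import numpy as np

    rng = np.random.default_rng(8)

    # hypothetical coefficients for  dX_t = f(X_{t-}) dt + F(X_{t-}) dW_t + G(X_{t-}) dJ_t
    f = lambda x: -x                  # drift
    F = lambda x: 0.4                 # diffusion coefficient
    G = lambda x: 0.2 * np.sin(x)     # jump coefficient
    jump_rate, T, n_steps, x0 = 2.0, 5.0, 5_000, 1.0
    dt = T / n_steps

    x, path = x0, [x0]
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        dJ = rng.normal(0.0, 1.0, rng.poisson(jump_rate * dt)).sum()   # compound Poisson increment
        x = x + f(x) * dt + F(x) * dW + G(x) * dJ                      # Euler-Maruyama step
        path.append(x)

    print("X_T ≈", round(float(path[-1]), 4), "after", n_steps, "Euler steps")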

Lemma 12.10. Let Φ be bounded and Lipschitz, X the unique solution of the SDE (12.7), and
denote by A the generator of X. Then C∞
c (R ) ⊂ D(A).
d

3 Protterrequires that L has independent coordinate processes, but this is not needed in his proof. For existence and
uniqueness the local Lipschitz and linear growth conditions are enough; the strong Lipschitz condition is used for the
Markov nature of the solution.
4 This assumption can be relaxed under suitable assumptions on the (joint) filtration, see for example Ikeda &

Watanabe [22, Theorem II.6.3]



Proof. Because of Theorem 5.12 (this theorem does not only hold for Feller semigroups, but
for any strongly continuous contraction semigroup satisfying the positive maximum principle), it
suffices to show
    lim_{t→0} (1/t) ( Ex f(Xt) − f(x) ) = g(x)    and    g ∈ C∞(Rd)

for any f ∈ C∞_c(Rd). For this we use, as in the proof of the following theorem, Itô's formula to get

    Ex f(Xt) − f(x) = Ex ∫_0^t A f(Xs) ds,

and a calculation similar to the one in the proof of the next theorem. A complete proof can be
found in Schilling & Schnurr [57, Theorem 3.5].

Theorem 12.11. Let (Lt )t>0 be an n-dimensional Lévy process with exponent ψ and assume that Φ
is Lipschitz continuous. Then the unique Markov solution X of the SDE (12.7) admits a generalized
symbol in the sense that

    lim_{t→0} ( Ex e^{i(X_{t∧τ_r}−x)·ξ} − 1 ) / t = −ψ(Φ^⊤(x)ξ),    r > 0.

If Φ ∈ C¹_b(Rd), then X is Feller, C∞_c(Rd) ⊂ D(A) and q(x, ξ) = ψ(Φ^⊤(x)ξ) is the symbol of the generator.

Theorem 12.11 indicates that the formulae (12.4) and (12.6) may be used to associate symbols
not only with Feller processes but with more general Markov semimartingales. This has been
investigated by Schnurr [58] and [59] who shows that the class of Itô processes with jumps is
essentially the largest class that can be described by symbols; see also [9, Chapters 2.4–5]. This
opens up the way to analyze rather general semimartingales using the symbol. Let us also point
out that the boundedness of Φ is only needed to ensure that X is a Feller process.

Proof. Let τ = τ_r be the first exit time of the process X from the ball B_r(x) centered at x = X0. We use the Lévy–Itô decomposition (9.8) of L. From Itô's formula for jump processes (see, e.g. Protter [43, Chapter II, Theorem 33]) we get

    (1/t) Ex[ e_ξ(X_{t∧τ} − x) − 1 ]
      = (1/t) Ex[ i ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · dXs − ½ ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · d[X,X]^c_s ξ
                  + e_{−ξ}(x) ∑_{s≤τ∧t} ( e_ξ(Xs) − e_ξ(X_{s−}) − i e_ξ(X_{s−}) ξ·∆Xs ) ]
      =: I₁ + I₂ + I₃.

We consider the three terms separately.

    I₁ = (1/t) Ex i ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · dXs
       = (1/t) Ex i ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · Φ(X_{s−}) dLs
       = (1/t) Ex i ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · Φ(X_{s−}) ( ds l + ∫_{|y|≥1} y Ns(dy) )
       =: I₁₁ + I₁₂,

where we use that the diffusion part and the compensated small jumps of a Lévy process are martingales, cf. Theorem 9.12. Further,

    I₃ + I₁₂ = (1/t) Ex ∫_0^{t∧τ} ∫_{y≠0} e_ξ(X_{s−} − x) ( e_ξ(Φ(X_{s−})y) − 1 − i ξ·Φ(X_{s−})y 1_{(0,1)}(|y|) ) ds Ns(dy)
             = (1/t) Ex ∫_0^{t∧τ} ∫_{y≠0} e_ξ(X_{s−} − x) ( e_ξ(Φ(X_{s−})y) − 1 − i ξ·Φ(X_{s−})y 1_{(0,1)}(|y|) ) ν(dy) ds
             −−→ ∫_{y≠0} ( e^{i ξ·Φ(x)y} − 1 − i ξ·Φ(x)y 1_{(0,1)}(|y|) ) ν(dy)    as t → 0.

Here we use that ν(dy) ds is the compensator of ds Ns(dy), see Lemma 9.4. This requires that the integrand is ν-integrable, but this is ensured by the local boundedness of Φ(·) and the fact that ∫_{y≠0} min{|y|², 1} ν(dy) < ∞. Moreover,

    I₁₁ = (1/t) Ex i ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · Φ(X_{s−}) l ds −−→ i ξ·Φ(x) l    as t → 0,

and, finally, we observe that

    [X, X]^c = [ ∫ Φ(X_{s−}) dLs , ∫ Φ(X_{s−}) dLs ]^c = ∫ Φ(X_{s−}) d[L, L]^c_s Φ^⊤(X_{s−}) = ∫ Φ(X_{s−}) Q Φ^⊤(X_{s−}) ds ∈ Rd×d,

which gives

    I₂ = −(1/2t) Ex ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · d[X,X]^c_s ξ
       = −(1/2t) Ex ∫_0^{t∧τ} e_ξ(X_{s−} − x) ξ · Φ(X_{s−}) Q Φ^⊤(X_{s−}) ξ ds
       −−→ −½ ξ · Φ(x) Q Φ^⊤(x) ξ    as t → 0.

This proves

    q(x, ξ) = −i l·Φ^⊤(x)ξ + ½ ξ·Φ(x)QΦ^⊤(x)ξ + ∫_{y≠0} ( 1 − e^{i y·Φ^⊤(x)ξ} + i y·Φ^⊤(x)ξ 1_{(0,1)}(|y|) ) ν(dy)
            = ψ(Φ^⊤(x)ξ).

For the second part of the theorem we use Lemma 12.10 to see that C_c^∞(R^d) ⊂ D(A). The continuity of the flow x ↦ X_t^x, where X^x is the unique solution of the SDE with initial condition X_0^x = x (Protter [43, Chapter V, Theorem 38]), ensures that the semigroup P_t f(x) := E^x f(X_t) = E f(X_t^x) maps f ∈ C_∞(R^d) to C_b(R^d). In order to see that P_t f ∈ C_∞(R^d) we need that lim_{|x|→∞} |X_t^x| = ∞ a.s. This requires some longer calculations, see e.g. Schnurr [58, Theorem 2.49] or Kunita [34, Proof of Theorem 3.5, p. 353, line 13 from below]; for Brownian motion this argument is fully worked out in Schilling & Partzsch [56, Corollary 19.31].
13. Dénouement

It is well known that the characteristic exponent ψ(ξ) of a Lévy process L = (L_t)_{t≥0} can be used to describe many probabilistic properties of the process. The key is the formula
\[
   E^x e^{i\,\xi\cdot(L_t - x)} = E\, e^{i\,\xi\cdot L_t} = e^{-t\psi(\xi)} \tag{13.1}
\]
which gives access to the Fourier transform of the transition function P^x(L_t ∈ dy) = P(L_t + x ∈ dy). Although it is no longer true that the symbol q(x,ξ) of a Feller process X = (X_t)_{t≥0} is the characteristic exponent, we may interpret formulae like (12.4),
\[
   -q(x,\xi) = \lim_{r\to 0} \frac{E^x e^{i(X_{\tau_r}-x)\cdot\xi} - 1}{E^x \tau_r},
\]
as infinitesimal versions of the relation (13.1). What is more, both ψ(ξ ) and q(x, ξ ) are the
Fourier symbols of the generators of the processes. We have already used these facts to discuss
the conservativeness of Feller processes (Theorem 11.13) and the semimartingale decomposition
of Feller processes (Theorem 12.9).
It is indeed possible to investigate further path properties of a Feller process using its symbol
q(x, ξ ). Below we will, mostly without proof, give some examples which are taken from Böttcher,
Schilling & Wang [9]. Let us point out the two guiding principles.

I  For sample path properties, the symbol q(x,ξ) of a Feller process assumes the same role as the characteristic exponent ψ(ξ) of a Lévy process.

II  A Feller process is ‘locally Lévy’, i.e. for short-time path properties the Feller process, started at x_0, behaves like the Lévy process (L_t + x_0)_{t≥0} with characteristic exponent ψ(ξ) := q(x_0, ξ).

The latter property is the reason why such Feller processes are often called Lévy-type processes. The model case is the stable-like process whose symbol is given by q(x,ξ) = |ξ|^{α(x)}, where α : R^d → (0,2) is sufficiently smooth¹. This process behaves locally, and for short times t ≪ 1, like an α(x)-stable process, except that the index now depends on the starting point X_0 = x.
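As an illustration, here is a rough Euler-type simulation of a one-dimensional stable-like process: over each short time step the index is frozen at the current position and a symmetric α(X_t)-stable increment is added. This is only a heuristic sketch which ignores the smoothness issues discussed above; the index function α and all parameters are ad hoc choices.

```python
import numpy as np

rng = np.random.default_rng(7)

def sym_stable(a):
    """One draw of a standard symmetric a-stable variable (Chambers-Mallows-Stuck),
    with characteristic function exp(-|xi|**a)."""
    U = rng.uniform(-np.pi / 2, np.pi / 2)
    W = rng.exponential(1.0)
    return (np.sin(a * U) / np.cos(U) ** (1 / a)
            * (np.cos((1 - a) * U) / W) ** ((1 - a) / a))

def alpha(x):
    """Illustrative smooth index function with values in (0, 2)."""
    return 1.2 + 0.6 * np.tanh(x)

def stable_like_path(x0=0.0, T=1.0, n=2000):
    """Euler-type scheme: freeze the index at the current state and add a
    symmetric alpha(x)-stable increment scaled by dt**(1/alpha(x))."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        a = alpha(x[k])
        x[k + 1] = x[k] + dt ** (1.0 / a) * sym_stable(a)
    return x

print(stable_like_path()[-5:])
```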
The key to many path properties is given by the following maximal estimates, which were first proved in [53]. The present proof is taken from [9]; the observation that we may use a random time τ instead of a fixed time t is due to F. Kühn.
¹ In dimension d = 1, Lipschitz or even Dini continuity is enough (see Bass [4]); in higher dimensions we need something like C^{5d+3}-smoothness, cf. Hoh [21]. Meanwhile, Kühn [33] established the existence for d ≥ 1 with α(x) satisfying any Hölder condition.


Theorem 13.1. Let (X_t)_{t≥0} be a Feller process with generator A, C_c^∞(R^d) ⊂ D(A) and symbol q(x,ξ). If τ is an integrable stopping time, then
\[
   P^x\Bigl( \sup_{s\le\tau} |X_s - x| \ge r \Bigr)
   \;\le\; c\, E^x\tau\; \sup_{|y-x|\le r}\ \sup_{|\xi|\le r^{-1}} |q(y,\xi)|. \tag{13.2}
\]

Proof. Denote by σ_r = σ_r^x the first exit time from the closed ball B_r(x). Clearly,
\[
   \{\sigma_r^x < \tau\} \subset \Bigl\{ \sup_{s\le\tau} |X_s - x| \ge r \Bigr\} \subset \{\sigma_r^x \le \tau\}.
\]
Pick u ∈ C_c^∞(R^d), 0 ≤ u ≤ 1, u(0) = 1, supp u ⊂ B_1(0), and set
\[
   u_r^x(y) := u\Bigl( \frac{y-x}{r} \Bigr).
\]
In particular, u_r^x vanishes outside of B_r(x). Hence,
\[
   \mathbb{1}_{\{\sigma_r^x \le \tau\}} \le 1 - u_r^x(X_{\tau\wedge\sigma_r^x}).
\]
Now we can use (5.5) or (12.3) to get
\[
\begin{aligned}
   P^x(\sigma_r^x \le \tau)
   &\le E^x\bigl[ 1 - u_r^x(X_{\tau\wedge\sigma_r^x}) \bigr] \\
   &= E^x \int_{[0,\tau\wedge\sigma_r^x)} q(X_s, D)\, u_r^x(X_s)\; ds \\
   &= E^x \int_{[0,\tau\wedge\sigma_r^x)} \int \mathbb{1}_{B_r(x)}(X_s)\, e_\xi(X_s)\, q(X_s,\xi)\, \widehat{u_r^x}(\xi)\; d\xi\, ds \\
   &\le E^x \int_{[0,\tau\wedge\sigma_r^x)} \int \sup_{|y-x|\le r} |q(y,\xi)|\; |\widehat{u_r^x}(\xi)|\; d\xi\, ds \\
   &= E^x[\tau\wedge\sigma_r^x] \int \sup_{|y-x|\le r} |q(y,\xi)|\; |\widehat{u_r^x}(\xi)|\; d\xi \\
   &\le c\, E^x\tau\; \sup_{|y-x|\le r}\ \sup_{|\xi|\le r^{-1}} |q(y,\xi)| \int (1+|\xi|^2)\, |\widehat{u}(\xi)|\; d\xi,
\end{aligned}
\]
where the last step uses (11.9) and Lemma 11.8.
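A quick sanity check of the scaling in (13.2): for a one-dimensional Brownian motion we have q(y,ξ) = ½ξ², so with the deterministic time τ = t the right-hand side is proportional to t/r². The crude Monte Carlo sketch below estimates the left-hand side for several pairs (t, r) with the same ratio t/r²; the sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)

def prob_sup_exceeds(t, r, n_steps=1000, n_paths=5000):
    """Monte Carlo estimate of P( sup_{s<=t} |B_s| >= r ) for Brownian motion."""
    dt = t / n_steps
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    sup_abs = np.abs(np.cumsum(incr, axis=1)).max(axis=1)
    return np.mean(sup_abs >= r)

for t, r in [(0.0625, 0.5), (0.25, 1.0), (1.0, 2.0)]:
    lhs = prob_sup_exceeds(t, r)
    # symbol bound (up to the constant c):  t * sup_{|xi|<=1/r} |xi|^2 / 2  =  t / (2 r^2)
    print(f"t={t}, r={r}:  P ~ {lhs:.4f},   t/(2 r^2) = {t / (2 * r * r):.4f}")
```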

There is also a probability estimate for sup_{s≤t} |X_s − x| ≤ r, but this requires some sector condition for the symbol, that is, an estimate of the form
\[
   |\operatorname{Im} q(x,\xi)| \le \kappa\, \operatorname{Re} q(x,\xi), \qquad x,\xi \in \mathbb{R}^d. \tag{13.3}
\]
One consequence of (13.3) is that the drift (which is contained in the imaginary part of the symbol) is not dominant. This is a familiar assumption in the study of path properties of Lévy processes, see e.g. Blumenthal & Getoor [8]; a typical example where this condition is violated is the (Lévy) symbol ψ(ξ) = iξ + |ξ|^{1/2}. For a Lévy process the sector condition on ψ coincides with the sector condition for the generator and the associated non-symmetric Dirichlet form, see Jacob [26, Volume 1, 4.7.32–33].

With some more effort (see [9, Theorem 5.5]), we may replace the sector condition by imposing conditions on the expression
\[
   \sup_{|y-x|\le r} \frac{\operatorname{Re} q(x,\xi)}{|\xi|\, |\operatorname{Im} q(y,\xi)|}
   \qquad\text{as } r\to\infty.
\]

Theorem 13.2 (see [9, pp. 117–119]). Let (X_t)_{t≥0} be a Feller process with infinitesimal generator A, C_c^∞(R^d) ⊂ D(A) and symbol q(x,ξ) satisfying the sector condition (13.3). Then
\[
   P^x\Bigl( \sup_{s\le t} |X_s - x| < r \Bigr)
   \;\le\; \frac{c_{\kappa,d}}{\,t\, \sup_{|\xi|\le r^{-1}} \inf_{|y-x|\le 2r} |q(y,\xi)|\,}. \tag{13.4}
\]

The maximal estimates (13.2) and (13.4) are quite useful tools. With them we can estimate the mean exit time from balls (X and q(x,ξ) are as in Theorem 13.2):
\[
   \frac{c}{\sup_{|\xi|\le 1/r}\, \inf_{|y-x|\le r} |q(y,\xi)|}
   \;\le\; E^x \sigma_r^x \;\le\;
   \frac{c_\kappa}{\sup_{|\xi|\le k^*/r}\, \inf_{|y-x|\le r} |q(y,\xi)|}
\]
for all x ∈ R^d and r > 0 and with k^* = \arccos\sqrt{2/3} ≈ 0.615.
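As a concrete illustration of this two-sided bound, take again a one-dimensional Brownian motion: both denominators are of order r^{-2}, so E^x σ_r^x should be of order r² (for the exit from (−r, r) started at 0 it equals r²). A crude Monte Carlo sketch, with arbitrary discretization parameters:

```python
import numpy as np

rng = np.random.default_rng(5)

def mean_exit_time(r, dt=1e-3, n_paths=3000, t_max=50.0):
    """Crude Monte Carlo estimate of E[sigma_r] for a 1d Brownian motion
    started at 0, where sigma_r is the first exit time from (-r, r)."""
    x = np.zeros(n_paths)
    exit_time = np.full(n_paths, t_max)
    alive = np.ones(n_paths, dtype=bool)
    for k in range(1, int(t_max / dt) + 1):
        x[alive] += rng.normal(0.0, np.sqrt(dt), size=alive.sum())
        just_exited = alive & (np.abs(x) >= r)
        exit_time[just_exited] = k * dt
        alive &= ~just_exited
        if not alive.any():
            break
    return exit_time.mean()

for r in [0.5, 1.0, 2.0]:
    print(f"r = {r}:  E[sigma_r] ~ {mean_exit_time(r):.3f}   (exact value r^2 = {r * r})")
```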

Recently, Kühn [32] studied the existence of and estimates for generalized moments; a typical
result is contained in the following theorem.

Theorem 13.3. Let X = (X_t)_{t≥0} be a Feller process with infinitesimal generator (A, D(A)) and C_c^∞(R^d) ⊂ D(A). Assume that the symbol q(x,ξ), given by (11.4), satisfies q(x,0) = 0 and has (l(x), Q(x), ν(x,dy)) as x-dependent Lévy triplet. If f : R^d → [0,∞) is (comparable to) a twice continuously differentiable submultiplicative function such that
\[
   \sup_{x\in K} \int f(y)\, \nu(x,dy) < \infty \quad\text{for a compact set } K\subset\mathbb{R}^d,
\]
then the generalized moment sup_{x∈K} sup_{s≤t} E^x f(X_{s∧τ_K} − x) < ∞ exists and
\[
   E^x f(X_{t\wedge\tau_K}) \le c\, f(x)\, e^{C(M_1+M_2)t};
\]
here τ_K = inf{t > 0 : X_t ∉ K} is the first exit time from K, C = C_K is some absolute constant and
\[
   M_1 = \sup_{x\in K}\Bigl( |l(x)| + |Q(x)| + \int_{y\neq 0} \bigl(|y|^2\wedge 1\bigr)\, \nu(x,dy) \Bigr),
   \qquad
   M_2 = \sup_{x\in K} \int_{|y|\ge 1} f(y)\, \nu(x,dy).
\]
If X has bounded coefficients (Lemma 11.8), then K = R^d is admissible.

There are also counterparts for the Blumenthal–Getoor and Pruitt indices. Below we give two representatives; for a full discussion we refer to [9].

Definition 13.4. Let q(x,ξ) be the symbol of a Feller process. Then
\[
   \beta_\infty^x := \inf\Bigl\{ \lambda > 0 : \lim_{|\xi|\to\infty}
   \frac{\sup_{|\eta|\le|\xi|}\, \sup_{|y-x|\le 1/|\xi|} |q(y,\eta)|}{|\xi|^\lambda} = 0 \Bigr\}, \tag{13.5}
\]
\[
   \delta_\infty^x := \sup\Bigl\{ \lambda > 0 : \lim_{|\xi|\to\infty}
   \frac{\inf_{|\eta|\le|\xi|}\, \inf_{|y-x|\le 1/|\xi|} |q(y,\eta)|}{|\xi|^\lambda} = \infty \Bigr\}. \tag{13.6}
\]

By definition, 0 ≤ δ_∞^x ≤ β_∞^x ≤ 2. For example, if q(x,ξ) = |ξ|^{α(x)} with a smooth exponent function α(x), then β_∞^x = δ_∞^x = α(x); in general, however, we cannot expect that the two indices coincide.
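In simple cases the indices can be read off numerically. For a Lévy symbol (no x-dependence) the quotient log|ψ(ξ)|/log|ξ| stabilizes, for large |ξ|, near the common value β_∞ = δ_∞; the toy exponent below is an illustrative choice with index 1.5 at infinity.

```python
import numpy as np

def psi(xi):
    """Toy Levy exponent: sum of two stable exponents; the index at infinity is 1.5."""
    return np.abs(xi) ** 1.5 + np.abs(xi) ** 0.5

for R in [1e2, 1e4, 1e6, 1e8]:
    print(f"|xi| = {R:.0e}:  log|psi| / log|xi| = {np.log(psi(R)) / np.log(R):.4f}")
```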
As for Lévy processes, these indices can be used to describe the path behaviour.

Theorem 13.5. Let (X_t)_{t≥0} be a d-dimensional Feller process with generator A such that C_c^∞(R^d) ⊂ D(A) and symbol q(x,ξ). For every bounded analytic set E ⊂ [0,∞), the Hausdorff dimension satisfies
\[
   \dim\{X_t : t\in E\} \;\le\; \min\Bigl\{ d,\ \sup_{x\in\mathbb{R}^d} \beta_\infty^x \cdot \dim E \Bigr\}. \tag{13.7}
\]

A proof can be found in [9, Theorem 5.15]. It is instructive to observe that we have to take
the supremum w.r.t. the space variable x, as we do not know how the process X moves while we
observe it during t ∈ E. This shows that we can only expect to get ‘exact’ results if t → 0. Here is
such an example.

Theorem 13.6. Let (X_t)_{t≥0} be a d-dimensional Feller process with symbol q(x,ξ) satisfying the sector condition. Then, P^x-a.s.,
\[
   \lim_{t\to 0} \frac{\sup_{0\le s\le t} |X_s - x|}{t^{1/\lambda}} = 0 \quad \forall\, \lambda > \beta_\infty^x, \tag{13.8}
\]
\[
   \lim_{t\to 0} \frac{\sup_{0\le s\le t} |X_s - x|}{t^{1/\lambda}} = \infty \quad \forall\, \lambda < \delta_\infty^x. \tag{13.9}
\]
As one would expect, these results are proved using the maximal estimates (13.2) and (13.4) in
conjunction with the Borel–Cantelli lemma, see [9, Theorem 5.16].
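For Brownian motion (where β_∞^x = δ_∞^x = 2) the dichotomy (13.8)/(13.9) can be glimpsed on a single simulated path: sup_{s≤t}|B_s|/t^{1/λ} shrinks for λ = 3 > 2 and blows up for λ = 1.5 < 2 as t decreases. A rough sketch, which of course cannot resolve the limit t → 0 beyond the grid of the simulated path:

```python
import numpy as np

rng = np.random.default_rng(11)

# one Brownian path on [0, 1] sampled on a dyadic grid
n = 2 ** 20
dt = 1.0 / n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))))
run_max = np.maximum.accumulate(np.abs(B))      # M(t) = sup_{s<=t} |B_s| on the grid

for k in range(4, 18, 3):
    t = 2.0 ** (-k)
    M = run_max[int(round(t * n))]
    # lambda = 3 (> 2): ratio should shrink;  lambda = 1.5 (< 2): ratio should blow up
    print(f"t = 2^-{k:2d}:  M/t^(1/3) = {M / t ** (1/3):8.4f}   M/t^(2/3) = {M / t ** (2/3):10.4f}")
```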

If we are interested in the long-term behaviour, we can introduce indices ‘at zero’, where we replace in Definition 13.4 the limit |ξ| → ∞ by |ξ| → 0, but we will always have to pay the price that we lose the influence of the starting point X_0 = x, i.e. we will have to take the supremum or infimum over all x ∈ R^d.

With the machinery we have developed here, one can also study further path properties, such as
invariant measures, ergodicity, transience and recurrence etc. For this we refer to the monograph
[9] as well as recent developments by Behme & Schnurr [7] and Sandrić [49, 50].
A. Some classical results

In this appendix we collect some classical results from (or needed for) probability theory which
are not always contained in a standard course.

The Cauchy–Abel functional equation

Below we reproduce the standard proof for continuous functions which, however, also works for right-continuous (or monotone) functions.

Theorem A.1. Let φ : [0,∞) → C be a right-continuous function satisfying the functional equation φ(s+t) = φ(s)φ(t). Then φ(t) = φ(1)^t.

Proof. Assume that φ(a) = 0 for some a > 0. Then we find for all t > 0
\[
   \varphi(a+t) = \varphi(a)\varphi(t) = 0 \implies \varphi|_{[a,\infty)} \equiv 0.
\]
To the left of a we find for all n ∈ N
\[
   0 = \varphi(a) = \varphi\Bigl(\frac{a}{n}\Bigr)^{n} \implies \varphi\Bigl(\frac{a}{n}\Bigr) = 0.
\]
Since φ is right-continuous, we infer that φ|_{[0,∞)} ≡ 0, and φ(t) = φ(1)^t holds.

Now assume that φ(1) ≠ 0. Setting f(t) := φ(t)φ(1)^{-t} we get
\[
   f(s+t) = \varphi(s+t)\varphi(1)^{-(s+t)} = \varphi(s)\varphi(1)^{-s}\,\varphi(t)\varphi(1)^{-t} = f(s)f(t)
\]
as well as f(1) = 1. Applying the functional equation k times we conclude that
\[
   f\Bigl(\frac{k}{n}\Bigr) = f\Bigl(\frac{1}{n}\Bigr)^{k} \quad\text{for all } k,n\in\mathbb{N}.
\]
The same calculation done backwards yields
\[
   f\Bigl(\frac{1}{n}\Bigr)^{k} = \Bigl( f\Bigl(\frac{1}{n}\Bigr)^{n} \Bigr)^{k/n} = f(1)^{k/n} = 1.
\]
Hence, f|_{Q^+} ≡ 1. Since φ, hence f, is right-continuous, we see that f ≡ 1 or, equivalently, φ(t) = [φ(1)]^t for all t > 0.


Characteristic functions and moments


Theorem A.2 (Even moments and characteristic functions). Let Y = (Y^{(1)}, …, Y^{(d)}) be a random variable in R^d and let χ(ξ) = E e^{iξ·Y} be its characteristic function. Then E(|Y|²) exists if, and only if, the second derivatives ∂²χ(0)/∂ξ_k², k = 1,…,d, exist and are finite. In this case all mixed second derivatives exist and
\[
   E Y^{(k)} = \frac{1}{i}\,\frac{\partial\chi(0)}{\partial\xi_k}
   \quad\text{and}\quad
   E\bigl(Y^{(k)}Y^{(l)}\bigr) = -\frac{\partial^2\chi(0)}{\partial\xi_k\,\partial\xi_l}. \tag{A.1}
\]

Proof. In order to keep the notation simple, we consider only d = 1. If E(Y²) < ∞, then the formulae (A.1) are routine applications of the differentiation lemma for parameter-dependent integrals, see e.g. [54, Theorem 11.5] or [55, Satz 12.2]. Moreover, χ is twice continuously differentiable.
Let us prove that E(Y²) ≤ −χ''(0). An application of l'Hospital's rule gives
\[
\begin{aligned}
   \chi''(0)
   &= \lim_{h\to 0} \frac{1}{2}\biggl( \frac{\chi'(2h)-\chi'(0)}{2h} + \frac{\chi'(0)-\chi'(-2h)}{2h} \biggr)
    = \lim_{h\to 0} \frac{\chi'(2h)-\chi'(-2h)}{4h} \\
   &= \lim_{h\to 0} \frac{\chi(2h)-2\chi(0)+\chi(-2h)}{4h^2}
    = \lim_{h\to 0} E\biggl[ \Bigl( \frac{e^{ihY}-e^{-ihY}}{2h} \Bigr)^{2} \biggr]
    = -\lim_{h\to 0} E\biggl[ \Bigl( \frac{\sin hY}{h} \Bigr)^{2} \biggr].
\end{aligned}
\]
From Fatou's lemma we get
\[
   \chi''(0) \le -E\biggl[ \lim_{h\to 0} \Bigl( \frac{\sin hY}{h} \Bigr)^{2} \biggr] = -E\bigl[ Y^2 \bigr].
\]
In the multivariate case observe that E|Y^{(k)}Y^{(l)}| ≤ E[(Y^{(k)})²] + E[(Y^{(l)})²].

Vague and weak convergence of measures


A sequence of locally finite¹ Borel measures (µ_n)_{n∈N} on R^d converges vaguely to a locally finite measure µ if
\[
   \lim_{n\to\infty} \int \varphi\; d\mu_n = \int \varphi\; d\mu \quad\text{for all } \varphi\in C_c(\mathbb{R}^d). \tag{A.2}
\]

Since the compactly supported continuous functions Cc (Rd ) are dense in the space of continuous
functions vanishing at infinity C∞ (Rd ) = {φ ∈ C(Rd ) : lim|x|→∞ φ (x) = 0}, we can replace in (A.2)
the set Cc (Rd ) with C∞ (Rd ). The following theorem guarantees that a family of Borel measures
is sequentially relatively compact2 for the vague convergence.
1 I.e. every compact set K has finite measure.
2 Note that compactness and sequential compactness need not coincide!

Theorem A.3. Let (µt )t>0 be a family of measures on Rd which is uniformly bounded, in the sense
that supt>0 µt (Rd ) < ∞. Then every sequence (µtn )n∈N has a vaguely convergent subsequence.

If we test in (A.2) against all bounded continuous functions φ ∈ Cb (Rd ), we get weak conver-
gence of the sequence µn → µ. One has

Theorem A.4. A sequence of measures (µn )n∈N converges weakly to µ if, and only if, µn converges
vaguely to µ and limn→∞ µn (Rd ) = µ(Rd ) (preservation of mass). In particular, weak and vague
convergence coincide for sequences of probability measures.

Proofs and a full discussion of vague and weak convergence can be found in Malliavin [40,
Chapter III.6] or Schilling [55, Chapter 25].
For any finite measure µ on R^d we denote by $\check{\mu}(\xi) := \int_{\mathbb{R}^d} e^{i\,\xi\cdot y}\,\mu(dy)$ its characteristic function.

Theorem A.5 (Lévy's continuity theorem). Let (µ_n)_{n∈N} be a sequence of finite measures on R^d. If µ_n → µ weakly, then the characteristic functions $\check{\mu}_n(\xi)$ converge locally uniformly to $\check{\mu}(\xi)$.
Conversely, if the limit $\lim_{n\to\infty}\check{\mu}_n(\xi) = \chi(\xi)$ exists for all ξ ∈ R^d and defines a function χ which is continuous at ξ = 0, then there exists a finite measure µ such that $\check{\mu}(\xi) = \chi(\xi)$ and µ_n → µ weakly.

A proof in one dimension is contained in the monograph by Breiman [10]; d-dimensional versions can be found in Bauer [5, Chapter 23] and Malliavin [40, Chapter IV.4].
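A toy numerical illustration of the first half of Theorem A.5: the Binomial(n, λ/n) laws converge weakly to the Poisson(λ) law, and their characteristic functions converge uniformly on compact ξ-windows. The grid and parameters below are arbitrary choices.

```python
import numpy as np

lam = 2.0
xi = np.linspace(-10.0, 10.0, 401)          # a compact xi-window

chi_poisson = np.exp(lam * (np.exp(1j * xi) - 1.0))

for n in [5, 50, 500, 5000]:
    p = lam / n
    chi_binomial = (1.0 - p + p * np.exp(1j * xi)) ** n
    err = np.max(np.abs(chi_binomial - chi_poisson))
    print(f"n = {n:5d}:  sup over |xi|<=10 of |chi_n - chi| = {err:.2e}")
```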

Convergence in distribution
By $\xrightarrow{d}$ we denote convergence in distribution.

Theorem A.6 (Convergence of types). Let (Y_n)_{n∈N}, Y and Y' be random variables and suppose that there are constants a_n > 0, c_n ∈ R such that
\[
   Y_n \xrightarrow[n\to\infty]{d} Y \quad\text{and}\quad a_nY_n + c_n \xrightarrow[n\to\infty]{d} Y'.
\]
If Y and Y' are non-degenerate, then the limits a = lim_{n→∞} a_n and c = lim_{n→∞} c_n exist and Y' ∼ aY + c.

Proof. Write χ_Z(ξ) := E e^{i ξ·Z} for the characteristic function of the random variable Z.

1° By Lévy's continuity theorem (Theorem A.5) convergence in distribution ensures that
\[
   \chi_{a_nY_n+c_n}(\xi) = e^{i\, c_n\cdot\xi}\, \chi_{Y_n}(a_n\xi)
   \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_{Y'}(\xi)
   \quad\text{and}\quad
   \chi_{Y_n}(\xi) \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_Y(\xi).
\]
Take some subsequence (a_{n(k)})_{k∈N} ⊂ (a_n)_{n∈N} such that lim_{k→∞} a_{n(k)} = a ∈ [0,∞].

2° Claim: a > 0. Assume, on the contrary, that a = 0.
\[
   |\chi_{a_{n(k)}Y_{n(k)}+c_{n(k)}}(\xi)| = |\chi_{Y_{n(k)}}(a_{n(k)}\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(0)| = 1.
\]

Thus, |χ_{Y'}| ≡ 1 which means that Y' would be degenerate, contradicting our assumption.

3° Claim: a < ∞. If a = ∞, we use Y_n = (a_n)^{-1}((a_nY_n + c_n) − c_n) and the argument from step 1° to reach the contradiction
\[
   (a_{n(k)})^{-1} \xrightarrow[k\to\infty]{} a^{-1} > 0 \iff a < \infty.
\]

4° Claim: There exists a unique a ∈ [0,∞) such that lim_{n→∞} a_n = a. Assume that there were two different subsequential limits a_{n(k)} → a, a_{m(k)} → a' and a ≠ a'. Then
\[
   |\chi_{a_{n(k)}Y_{n(k)}+c_{n(k)}}(\xi)| = |\chi_{a_{n(k)}Y_{n(k)}}(\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a\xi)|,
\]
\[
   |\chi_{a_{m(k)}Y_{m(k)}+c_{m(k)}}(\xi)| = |\chi_{a_{m(k)}Y_{m(k)}}(\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a'\xi)|.
\]
On the other hand, |χ_Y(aξ)| = |χ_Y(a'ξ)| = |χ_{Y'}(ξ)|. If a' < a, we get by iteration
\[
   |\chi_Y(\xi)| = \Bigl| \chi_Y\Bigl(\frac{a'}{a}\,\xi\Bigr) \Bigr| = \cdots
   = \Bigl| \chi_Y\Bigl(\Bigl(\frac{a'}{a}\Bigr)^{N}\xi\Bigr) \Bigr| \xrightarrow[N\to\infty]{} |\chi_Y(0)| = 1.
\]
Thus, |χ_Y| ≡ 1 and Y is a.s. constant. Since a, a' can be interchanged, we conclude that a = a'.

5° We have
\[
   e^{i\, c_n\cdot\xi} = \frac{\chi_{a_nY_n+c_n}(\xi)}{\chi_{a_nY_n}(\xi)}
   = \frac{\chi_{a_nY_n+c_n}(\xi)}{\chi_{Y_n}(a_n\xi)}
   \xrightarrow[n\to\infty]{} \frac{\chi_{Y'}(\xi)}{\chi_Y(a\xi)}.
\]
Since χ_Y is continuous and χ_Y(0) = 1, the limit lim_{n→∞} e^{i c_n·ξ} exists for all |ξ| ≤ δ and some small δ. For ξ = tξ_0 with |ξ_0| = 1, we get
\[
   0 < \biggl| \int_0^\delta \frac{\chi_{Y'}(t\xi_0)}{\chi_Y(ta\xi_0)}\; dt \biggr|
     = \lim_{n\to\infty} \biggl| \int_0^\delta e^{i t\, c_n\cdot\xi_0}\; dt \biggr|
     = \lim_{n\to\infty} \biggl| \frac{e^{i\delta\, c_n\cdot\xi_0} - 1}{i\, c_n\cdot\xi_0} \biggr|
     \le \liminf_{n\to\infty} \frac{2}{|c_n\cdot\xi_0|},
\]
and so limsup_{n→∞} |c_n| < ∞; if there were two limit points c ≠ c', then e^{i c·ξ} = e^{i c'·ξ} for all |ξ| ≤ δ. This gives c = c', and we finally see that c_n → c, e^{i c_n·ξ} → e^{i c·ξ}, as well as
\[
   \chi_{Y'}(\xi) = e^{i\, c\cdot\xi}\, \chi_Y(a\xi).
\]

A random variable Y is called symmetric if Y ∼ −Y.

Theorem A.7 (Symmetrization inequality). Let Y_1, …, Y_n be independent symmetric random variables. Then the partial sum S_n = Y_1 + ⋯ + Y_n is again symmetric and
\[
   P(|Y_1+\dots+Y_n| > u) \ge \tfrac12\, P\Bigl( \max_{1\le k\le n} |Y_k| > u \Bigr). \tag{A.3}
\]
If the Y_k are iid with Y_1 ∼ µ, then
\[
   P(|Y_1+\dots+Y_n| > u) \ge \tfrac12\bigl( 1 - e^{-nP(|Y_1| > u)} \bigr). \tag{A.4}
\]

Proof. By independence, S_n = Y_1 + ⋯ + Y_n ∼ −Y_1 − ⋯ − Y_n = −S_n.
Let τ = min{1 ≤ k ≤ n : |Y_k| = max_{1≤l≤n} |Y_l|} and set Y_{n,τ} = S_n − Y_τ. Then the four (counting all possible ± combinations) random variables (±Y_τ, ±Y_{n,τ}) have the same law. Moreover,
\[
   P(Y_\tau > u) \le P(Y_\tau > u,\, Y_{n,\tau} \ge 0) + P(Y_\tau > u,\, Y_{n,\tau} \le 0)
   = 2\, P(Y_\tau > u,\, Y_{n,\tau} \ge 0),
\]
and so
\[
   P(S_n > u) = P(Y_\tau + Y_{n,\tau} > u) \ge P(Y_\tau > u,\, Y_{n,\tau} \ge 0) \ge \tfrac12\, P(Y_\tau > u).
\]
By symmetry, this implies (A.3). In order to see (A.4), we use that the Y_k are iid, hence
\[
   P\Bigl( \max_{1\le k\le n} |Y_k| \le u \Bigr) = P(|Y_1| \le u)^{n} \le e^{-n P(|Y_1| > u)},
\]
along with the elementary inequality 1 − p ≤ e^{-p} for 0 ≤ p ≤ 1. This proves (A.4).
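A Monte Carlo sanity check of (A.3) and (A.4) for iid symmetric (standard normal) random variables; the parameters are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

n, u, n_sim = 20, 3.0, 200_000
Y = rng.normal(size=(n_sim, n))          # iid symmetric (standard normal) rows

lhs = np.mean(np.abs(Y.sum(axis=1)) > u)            # P(|S_n| > u)
rhs_a3 = 0.5 * np.mean(np.abs(Y).max(axis=1) > u)   # right-hand side of (A.3)
p1 = np.mean(np.abs(Y[:, 0]) > u)                   # P(|Y_1| > u)
rhs_a4 = 0.5 * (1.0 - np.exp(-n * p1))              # right-hand side of (A.4)

print(f"P(|S_n| > u)       = {lhs:.4f}")
print(f"(A.3) lower bound  = {rhs_a3:.4f}")
print(f"(A.4) lower bound  = {rhs_a4:.4f}")
```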

The predictable σ -algebra


Let (Ω, A , P) be a probability space and (Ft )t>0 some filtration. A stochastic process (Xt )t>0 is
called adapted, if for every t > 0 the random variable Xt is Ft measurable.

Definition A.8. The predictable σ -algebra P is the smallest σ -algebra on Ω × (0, ∞) such that
all left-continuous adapted stochastic processes (ω,t) 7→ Xt (ω) are measurable. A P measurable
process X is called a predictable process.

For a stopping time τ we denote by

\[
   \rrbracket 0, \tau\rrbracket := \{(\omega,t) : 0 < t \le \tau(\omega)\}
   \quad\text{and}\quad
   \rrbracket \tau, \infty\llbracket\; := \{(\omega,t) : t > \tau(\omega)\} \tag{A.5}
\]

the left-open stochastic intervals. The following characterization of the predictable σ -algebra is
essentially from Jacod & Shiryaev [27, Theorem I.2.2].

Theorem A.9. The predictable σ-algebra P is generated by any one of the following families of random sets:

a) $\rrbracket 0,\tau\rrbracket$ where τ is any bounded stopping time;

b) F_s × (s,t] where F_s ∈ F_s and 0 ≤ s < t.

Proof. We write P_a and P_b for the σ-algebras generated by the families listed in a) and b), respectively.

1° Pick 0 ≤ s < t, F = F_s ∈ F_s and let n > t. Observe that s_F := s·1_F + n·1_{F^c} is a bounded stopping time³ and F × (s,t] = $\rrbracket s_F, t_F\rrbracket$. Therefore, $\rrbracket s_F, t_F\rrbracket = \rrbracket 0, t_F\rrbracket \setminus \rrbracket 0, s_F\rrbracket$ ∈ P_a, and we conclude that P_b ⊂ P_a.

³ Indeed, {s_F ≤ t} = {s ≤ t} ∩ F equals ∅ if s > t and F if s ≤ t, so it belongs to F_t for all t ≥ 0.

2° Let τ be a bounded stopping time. Since $t\mapsto \mathbb{1}_{\rrbracket 0,\tau\rrbracket}(\omega,t)$ is adapted and left-continuous, we have P_a ⊂ P.

3° Let X be an adapted and left-continuous process and define for every n ∈ N
\[
   X^n_t := \sum_{k=0}^{\infty} X_{k2^{-n}}\; \mathbb{1}_{\rrbracket k2^{-n},\,(k+1)2^{-n}\rrbracket}(t).
\]
Obviously, X^n = (X^n_t)_{t≥0} is P_b measurable; because of the left-continuity of t ↦ X_t, the limit lim_{n→∞} X^n_t = X_t exists, and we conclude that X is P_b measurable; consequently, P ⊂ P_b.

The structure of translation invariant operators


Let $\vartheta_x f(y) := f(y+x)$ be the translation operator and $\widetilde f(x) := f(-x)$. A linear operator $L : C_c^\infty(\mathbb{R}^d)\to C(\mathbb{R}^d)$ is called translation invariant if
\[
   \vartheta_x(Lf) = L(\vartheta_x f). \tag{A.6}
\]
A distribution λ is an element of the topological dual $(C_c^\infty(\mathbb{R}^d))'$, i.e. a continuous linear functional $\lambda : C_c^\infty(\mathbb{R}^d)\to\mathbb{R}$. The convolution of a distribution with a function $f\in C_c^\infty(\mathbb{R}^d)$ is defined as
\[
   f*\lambda(x) := \lambda(\vartheta_{-x}\widetilde f), \qquad
   \lambda\in (C_c^\infty(\mathbb{R}^d))',\ f\in C_c^\infty(\mathbb{R}^d),\ x\in\mathbb{R}^d.
\]

If λ = µ is a measure, this formula generalizes the ‘usual’ convolution
\[
   f*\mu(x) = \int f(x-y)\,\mu(dy) = \int \widetilde f(y-x)\,\mu(dy) = \mu(\vartheta_{-x}\widetilde f).
\]

Theorem A.10. If $\lambda\in (C_c^\infty(\mathbb{R}^d))'$ is a distribution, then $Lf(x) := f*\lambda(x)$ defines a translation invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d)\to C(\mathbb{R}^d)$.
Conversely, every translation invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d)\to C(\mathbb{R}^d)$ is of the form $Lf = f*\lambda$ for some unique distribution $\lambda\in (C_c^\infty(\mathbb{R}^d))'$.

Proof. Let Lf = f*λ. From the very definition of the convolution we get, for all x, y ∈ R^d,
\[
\begin{aligned}
   \bigl((\vartheta_{x} f)*\lambda\bigr)(y)
   &= \lambda\bigl(\vartheta_{-y}\,\widetilde{\vartheta_{x}f}\,\bigr)
    = \lambda\bigl(\vartheta_{-y}\,[f(x-\cdot\,)]\bigr) \\
   &= \lambda\bigl(\vartheta_{-(y+x)}\,[f(-\cdot\,)]\bigr)
    = (f*\lambda)(y+x)
    = \bigl(\vartheta_{x}(f*\lambda)\bigr)(y),
\end{aligned}
\]
i.e. L(ϑ_x f) = ϑ_x(Lf), which is the translation invariance (A.6).

For any sequence x_n → x in R^d and f ∈ C_c^∞(R^d) we know that $\vartheta_{-x_n}\widetilde f \to \vartheta_{-x}\widetilde f$ in C_c^∞(R^d). Since λ is a continuous linear functional on C_c^∞(R^d), we conclude that
\[
   \lim_{n\to\infty} Lf(x_n) = \lim_{n\to\infty} \lambda(\vartheta_{-x_n}\widetilde f) = \lambda(\vartheta_{-x}\widetilde f) = Lf(x)
\]
which shows that Lf ∈ C(R^d). (With a similar argument based on difference quotients we could even show that Lf ∈ C^∞(R^d).)

In order to prove the continuity of L, it is enough to show that L : C_c^∞(K) → C(R^d) is continuous for every compact set K ⊂ R^d (this is because of the definition of the topology in C_c^∞(R^d)). We will use the closed graph theorem: assume that f_n → f in C_c^∞(R^d) and f_n * λ → g in C(R^d); then we have to show that g = f * λ.


For every x ∈ R^d we have $\vartheta_{-x}\widetilde{f_n} \to \vartheta_{-x}\widetilde f$ in C_c^∞(R^d), and so
\[
   g(x) = \lim_{n\to\infty} (f_n*\lambda)(x) = \lim_{n\to\infty} \lambda(\vartheta_{-x}\widetilde{f_n})
   = \lambda(\vartheta_{-x}\widetilde f) = (f*\lambda)(x).
\]

Assume now that L is translation invariant and continuous. Define λ(f) := (L\widetilde{f})(0). Since L is linear and continuous, and f ↦ \widetilde{f} and the evaluation at x = 0 are continuous operations, λ is a continuous linear map on C_c^∞(R^d). Because of the translation invariance of L we get
\[
   (Lf)(x) = (\vartheta_x Lf)(0) = L(\vartheta_x f)(0)
   = \lambda\bigl(\widetilde{\vartheta_x f}\bigr) = \lambda(\vartheta_{-x}\widetilde f) = (f*\lambda)(x).
\]
If µ is a further distribution with Lf(0) = f*µ(0), we see
\[
   (\mu-\lambda)(\widetilde f) = f*(\mu-\lambda)(0) = f*\mu(0) - f*\lambda(0) = 0
   \quad\text{for all } f\in C_c^\infty(\mathbb{R}^d),
\]
which proves µ = λ.
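The first half of Theorem A.10 can be ‘seen’ numerically: discretize the convolution Lf = f∗λ for an integrable kernel playing the role of λ and check that shifting the input shifts the output. The sketch below works on a periodic grid (so that the discrete shift is exact) and is only an illustration of the identity ϑ_x(Lf) = L(ϑ_x f), not of the distributional statement itself; the kernel and grid are arbitrary choices.

```python
import numpy as np

n = 512
x = np.linspace(-10.0, 10.0, n, endpoint=False)
dx = x[1] - x[0]

f = np.exp(-x ** 2)                       # a smooth, rapidly decaying test function
lam = np.exp(-np.abs(x))                  # integrable kernel playing the role of lambda

def L(g):
    """Discrete (circular) convolution g * lam on the grid, via FFT."""
    return np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(lam))) * dx

shift = 25                                # translation by 25 grid points
theta_f = np.roll(f, shift)               # translated input
print("max |L(theta f) - theta(L f)| =",
      np.max(np.abs(L(theta_f) - np.roll(L(f), shift))))
```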
Bibliography

[1] Aczél, J.: Lectures on Functional Equations and Their Applications. Academic Press, New
York (NY) 1966.

[2] Applebaum, D.: Lévy Processes and Stochastic Calculus. Cambridge University Press, Cam-
bridge 2009 (2nd ed).

[3] Barndorff-Nielsen, O.E. et al.: Lévy Processes. Theory and Applications. Birkhäuser, Boston
(MA) 2001.

[4] Bass, R.: Uniqueness in law for pure-jump Markov processes. Probab. Theor. Relat. Fields
79 (1988) 271–287.

[5] Bauer, H.: Probability Theory. De Gruyter, Berlin 1996.

[6] Bawly, G.M.: Über einige Verallgemeinerungen der Grenzwertsätze der Wahrscheinlich-
keitsrechnung. Mat. Sbornik (Réc. Math. Moscou, N.S.) 1 (1936) 917–930.

[7] Behme, A., Schnurr, A.: A criterion for invariant measures of Itô processes based on the
symbol. Bernoulli 21 (2015) 1697–1718.

[8] Blumenthal, R.M., Getoor, R.K.: Sample functions of stochastic processes with stationary
independent increments. J. Math. Mech. 10 (1961) 493–516.

[9] Böttcher, B., Schilling, R.L., Wang, J.: Lévy-type Processes: Construction, Approxima-
tion and Sample Path Properties. Lecture Notes in Mathematics 2099 (Lévy Matters III),
Springer, Cham 2014.

[10] Breiman, L.: Probability. Addison–Wesley, Reading (MA) 1968 (several reprints by SIAM,
Philadelphia).

[11] Bretagnolle, J.L.: Processus à accroissements independants. In: Bretagnolle, J.L. et al.:
École d’Été de Probabilités: Processus Stochastiques. Springer, Lecture Notes in Mathe-
matics 307, Berlin 1973, pp. 1–26.

[12] Çinlar, E.: Introduction to Stochastic Processes. Prentice–Hall, Englewood Cliffs (NJ) 1975
(reprinted by Dover, Mineola).


[13] Courrège, P.: Générateur infinitésimal d’un semi-groupe de convolution sur Rn , et formule
de Lévy–Khintchine. Bull. Sci. Math. 88 (1964) 3–30.

[14] Courrège, P.: Sur la forme intégro différentielle des opérateurs de Ck∞ dans C satisfaisant
au principe du maximum. In: Séminaire Brelot–Choquet–Deny. Théorie du potentiel 10
(1965/66) exposé 2, pp. 1–38.

[15] de Finetti, B.: Sulle funzioni ad incremento aleatorio. Rend. Accad. Lincei Ser. VI 10 (1929)
163–168.

[16] Dieudonné, J.: Foundations of Modern Analysis. Academic Press, New York (NY) 1969.

[17] Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. John Wiley
& Sons, New York (NY) 1986.

[18] Gikhman, I.I., Skorokhod, A.V.: Introduction to the Theory of Random Processes. W.B.
Saunders, Philadelphia (PA) 1969 (reprinted by Dover, Mineola 1996).

[19] Grimvall, A.: A theorem on convergence to a Lévy process. Math. Scand. 30 (1972) 339–
349.

[20] Herz, C.S.: Théorie élémentaire des distributions de Beurling. Publ. Math. Orsay no. 5, 2ème
année 1962/63, Paris 1964.

[21] Hoh, W.: Pseudo differential operators with negative definite symbols of variable order. Rev.
Mat. Iberoamericana 16 (2000) 219–241.

[22] Ikeda, N., Watanabe S.: Stochastic Differential Equations and Diffusion Processes. North-
Holland Publishing Co./Kodansha, Amsterdam and Tokyo 1989 (2nd ed).

[23] Itô, K.: On stochastic processes. I. (Infinitely divisible laws of probability). Japanese J. Math.
XVIII (1942) 261–302.

[24] Itô, K.: Lectures on Stochastic Processes. Tata Institute of Fundamental Research, Bombay
1961 (reprinted by Springer, Berlin 1984).
http://www.math.tifr.res.in/~publ/ln/tifr24.pdf

[25] Itô, K.: Semigroups in probability theory. In: Functional Analysis and Related Topics, Pro-
ceedings in Memory of K. Yosida (Kyoto 1991). Springer, Lecture Notes in Mathematics
1540, Berlin 1993, 69–83.

[26] Jacob, N.: Pseudo Differential Operators and Markov Processes (3 volumes). Imperial Col-
lege Press, London 2001–2005.

[27] Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, Berlin 1987
(2nd ed Springer, Berlin 2003).

[28] Khintchine, A.Ya.: A new derivation of a formula by P. Lévy. Bull. Moscow State Univ. 1
(1937) 1–5 (Russian; an English translation is contained in Appendix 2.1, pp. 44–49 of [45]).

[29] Khintchine, A.Ya.: Zur Theorie der unbeschränkt teilbaren Verteilungsgesetze. Mat. Sbornik
(Réc. Math. Moscou, N.S.) 2 (1937) 79–119 (German; an English translation is contained in
Appendix 3.3, pp. 79–125 of [45]).

[30] Knopova, V., Schilling, R.L.: A note on the existence of transition probability densities for
Lévy processes. Forum Math. 25 (2013) 125–149.

[31] Kolmogorov, A.N.: Sulla forma generale di un processo stocastico omogeneo (Un problema
di Bruno de Finetti). Rend. Accad. Lincei Ser. VI 15 (1932) 805–808 and 866–869 (an En-
glish translation is contained in: Shiryayev, A.N. (ed.): Selected Works of A.N. Kolmogorov.
Kluwer, Dordrecht 1992, vol. 2, pp.121–127).

[32] Kühn, F.: Existence and estimates of moments for Lévy-type processes. Stoch. Proc. Appl.
(in press). DOI: 10.1016/j.spa.2016.07.008

[33] Kühn, F.: Probability and Heat Kernel Estimates for Lévy(-Type) Processes. PhD Thesis,
Technische Universität Dresden 2016.

[34] Kunita, H.: Stochastic differential equations based on Lévy processes and stochastic flows
of diffeomorphisms. In: Rao, M.M. (ed.): Real and Stochastic Analysis. Birkhäuser, Boston
(MA) 2004, pp. 305–373.

[35] Kunita, H., Watanabe, S.: On square integrable martingales. Nagoya Math. J. 30 (1967)
209–245.

[36] Kyprianou, A.: Introductory Lectures on Fluctuations and Lévy Processes with Applications.
Springer, Berlin 2006.

[37] Lévy, P.: Sur les intégrales dont les élements sont des variables aléatoires indépendantes.
Ann. Sc. Norm. Sup. Pisa 3 (1934) 337–366.

[38] Lévy, P.: Théorie de l’addition des variables aléatoires. Gauthier–Villars, Paris 1937.

[39] Mainardi, F., Rogosin, S.V.: The origin of infinitely divisible distributions: from de Finetti’s
problem to Lévy–Khintchine formula. Math. Methods Economics Finance 1 (2006) 37–55.

[40] Malliavin, P.: Integration and Probability. Springer, New York (NY) 1995.

[41] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations.
Springer, New York (NY) 1983.

[42] Prokhorov, Yu.V.: Convergence of random processes and limit theorems in probability the-
ory. Theor. Probab. Appl. 1 (1956) 157–214.

[43] Protter, P.E.: Stochastic Integration and Differential Equations. Springer, Berlin 2004 (2nd
ed).

[44] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin 2005
(3rd printing of the 3rd ed).

[45] Rogosin, S.V., Mainardi, F.: The Legacy of A.Ya. Khintchine’s Work in Probability Theory.
Cambridge Scientific Publishers, Cambridge 2010.

[46] Rosiński, J.: On series representations of infinitely divisible random vectors. Ann. Probab.
18 (1990) 405–430.

[47] Rosiński, J.: Series representations of Lévy processes from the perspective of point pro-
cesses. In: Barndorff-Nielsen et al. [3] (2001) 401–415.

[48] Samorodnitsky, G., Taqqu, M.S.: Stable Non-Gaussian Random Processes. Chapman &
Hall, New York (NY) 1994.

[49] Sandrić, N.: On recurrence and transience of two-dimensional Lévy and Lévy-type pro-
cesses. Stoch. Proc. Appl. 126 (2016) 414–438.

[50] Sandrić, N.: Long-time behavior for a class of Feller processes. Trans. Am. Math. Soc. 368
(2016) 1871–1910.

[51] Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press,
Cambridge 1999 (2nd ed 2013).

[52] Schilling, R.L.: Conservativeness and extensions of Feller semigroups. Positivity 2 (1998)
239–256.

[53] Schilling, R.L.: Growth and Hölder conditions for the sample paths of a Feller process.
Probab. Theor. Relat. Fields 112 (1998) 565–611.

[54] Schilling, R.L.: Measures, Integrals and Martingales. Cambridge University Press, Cam-
bridge 2011 (3rd printing).

[55] Schilling, R.L.: Maß und Integral. De Gruyter, Berlin 2015.

[56] Schilling, R.L., Partzsch, L.: Brownian Motion. An Introduction to Stochastic Processes. De
Gruyter, Berlin 2014 (2nd ed).

[57] Schilling, R.L., Schnurr, A.: The symbol associated with the solution of a stochastic differ-
ential equation. El. J. Probab. 15 (2010) 1369–1393.

[58] Schnurr, A.: The Symbol of a Markov Semimartingale. PhD Thesis, Technische Universität
Dresden 2009. Shaker-Verlag, Aachen 2009.

[59] Schnurr, A.: On the semimartingale nature of Feller processes with killing. Stoch. Proc. Appl.
122 (2012) 2758–2780.

[60] Stroock, D.W., Varadhan, S.R.S.: Multidimensional Stochastic Processes. Springer, Berlin
1997 (2nd corrected printing).

[61] von Waldenfels, W.: Eine Klasse stationärer Markowprozesse. Kernforschungsanlage Jülich,
Institut für Plasmaphysik, Jülich 1961.

[62] von Waldenfels, W.: Fast positive Operatoren. Z. Wahrscheinlichkeitstheorie verw. Geb. 4
(1965) 159–174.
