F70CF1 Continuous-Time Finance

Torsten Kleinow: Room CM.F11


Email: t.kleinow@hw.ac.uk

Aims
This course develops the theory and practice of financial derivatives pricing in continuous time.
The first part of the course provides an introduction to stochastic processes in continuous time. These
processes are essential for modelling asset price processes in continuous time. The main topics which
we will cover are
Theory of martingales in continuous time;
Brownian motion: definitions and properties;
Brownian motion as the limit of a binomial random-walk process;
Introduction to stochastic integration, stochastic differential equations and Itô's formula;
Geometric Brownian motion and the Ornstein-Uhlenbeck process;
Introduction to Girsanov's theorem and the martingale representation theorem.
In the second part of the course we will cover some of the core topics in continuous-time financial
derivative pricing. The main topics we will cover are
The Black-Scholes model;
Derivatives pricing in the Black-Scholes model, using both the martingale and PDE approaches;
Extensions to foreign currencies and dividend-paying stocks;
Portfolio risk management using the Greeks;
Introduction to interest rate models;
Introduction to credit risk models.

Web Page
Course information will be posted on Vision, the University's online learning environment. All
handouts will also be available on Vision.

Office Hours
Office hours will be announced on Vision.

Feedback
Verbal feedback on tutorial solutions will be given during tutorial hours. For example, if you have
a tutorial solution which differs from the official solution you could ask whether your solution is
correct. Students also have the opportunity to get feedback during office hours. I am also happy to
give feedback on your attempts at past exams.

Reading
Some additional reading is suggested to get a better understanding of the course material.
For the first part of the course, the recommended textbooks are Williams, and Durrett and/or
Øksendal. Williams covers introductory probability, integration (i.e. expectation), and the theory
of discrete-time martingales. Durrett and Øksendal cover the continuous-time setting, including
Brownian motion, continuous-time martingales, Girsanov's theorem, the martingale representation
theorem and, of course, the theory of stochastic differential equations.
For the second part of the course, the recommended textbooks are Baxter and Rennie, Björk
and Hull. Baxter and Rennie give a good description of the Black-Scholes model. Björk provides
a nice, simple description of much of the stochastic analysis needed for the course, as well as much
continuous-time theory. Hull has a broad coverage of the practical side of modelling derivatives
markets, and is one of the best known books in the field.
There are many other useful books which can be found in the library, containing similar or related
material.

Bibliography
1. Williams, D. (1991), Probability with Martingales. Cambridge University Press.
2. Durrett, R. (1996), Stochastic Calculus: A Practical Introduction. CRC Press.
3. Øksendal, B. (2003), Stochastic Differential Equations, 6th ed. Springer.
4. Baxter, M. and Rennie, A. (1996), Financial Calculus. Cambridge University Press.
5. Björk, T. (2009), Arbitrage Theory in Continuous Time, 3rd ed. Oxford University Press.
6. Hull, J. (2008), Options, Futures and Other Derivative Securities, 7th ed. Prentice Hall.
7. Jacod, J. and Protter, P. (2004), Probability Essentials, 2nd ed. Springer.
(NB: earlier or later editions of these books should be fine.)

Stochastic Processes

1.1 Preliminaries

Probability Space (Ω, F, P)
Ω is a sample space: the set of all possible outcomes, ω, of an experiment. For example Ω = {H, T}
for tossing a coin.
An event A is a collection of outcomes (paths).
F is a family of events to which we will assign a probability. F must be a σ-algebra:
(i) ∅, Ω ∈ F;
(ii) if A ∈ F then A^c ∈ F;
(iii) if A_1, A_2, ... is a countably infinite sequence of events in F then ∪_{i=1}^∞ A_i ∈ F.
[Countable intersections then also stay in F: ∩_{i=1}^∞ A_i = (∪_{i=1}^∞ A_i^c)^c ∈ F.]
The pair (Ω, F) is called a measurable space.


A probability measure P is a function P : F → [0, 1] which satisfies
(i) P(∅) = 0, P(Ω) = 1;
(ii) P(A^c) = 1 − P(A);
(iii) if A_1, A_2, ... is a sequence of disjoint events then P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).
The triple (Ω, F, P) is called a probability space, which we now fix.


An event A is said to be null if P(A) = 0.
An event A is said to be almost sure (a.s.) if P(A) = 1.
Two σ-algebras G_1, G_2 are said to be independent if P(G_1 ∩ G_2) = P(G_1)P(G_2) whenever G_1 ∈ G_1
and G_2 ∈ G_2.
If G_1 and G_2 are two σ-algebras (no assumption of independence!), then so is G_1 ∩ G_2, which is smaller
than both G_1 and G_2.
Random Variables
Given a σ-algebra G on a sample space Ω, a function X : Ω → ℝ is said to be measurable w.r.t. G if
{X ≤ a} := {ω : X(ω) ≤ a} ∈ G   for all a ∈ ℝ.

Lemma 1.1 If X, Y are measurable w.r.t. G, and f is a continuous function, then f(X, Y) is
measurable w.r.t. G.
Proof (1-dim case). This needs the results that a function is continuous if and only if the
preimage of every closed set is closed, and that a function is measurable if and only if {X ∈ B} ∈ G for all
closed subsets B ⊆ ℝ. Then it's easy:
{f(X) ≤ a} = {X ∈ f⁻¹((−∞, a])} ∈ G.

A random variable (r.v.) X is a function X : Ω → ℝ which is measurable w.r.t. F.

Given a random variable X, there exists a smallest σ-algebra on Ω with respect to which X is
measurable. This is called the σ-algebra generated by X, denoted σ(X).
Example 1.2 Toss a coin twice. Ω = {HH, HT, TH, TT}, F = {all subsets of Ω}.
Let X_1 = 1 if the 1st toss is H, and X_1 = 0 otherwise; define X_2 similarly for the 2nd toss.
What is σ(X_1)?
This is the smallest σ-algebra containing events of the form {X_1 ≤ a}.
σ(X_1) = {∅, {TH, TT}, {HH, HT}, Ω}.
Similarly, σ(X_2) = {∅, {HT, TT}, {HH, TH}, Ω}.
Is X_1 σ(X_2)-measurable? i.e. is {X_1 ≤ a} ∈ σ(X_2)?
{X_1 ≤ 1} = Ω ∈ σ(X_2)
{X_1 ≤ 0} = {TH, TT} ∉ σ(X_2)
Therefore, X_1 is NOT σ(X_2)-measurable. Similarly, X_2 is NOT σ(X_1)-measurable.
Similarly, given a family of random variables (X_t)_{t∈T}, there exists a smallest σ-algebra σ{X_t : t ∈ T}
with respect to which all X_t are measurable.
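For a concrete feel for σ(X_1) in Example 1.2, the generated σ-algebra can be computed by brute force. This is an illustrative sketch, not part of the notes; helper names such as `preimage` are ad hoc:

```python
from itertools import product

# Sample space for two coin tosses; X1 depends only on the first toss.
omega = ["".join(t) for t in product("HT", repeat=2)]  # HH, HT, TH, TT
X1 = {w: 1 if w[0] == "H" else 0 for w in omega}

# sigma(X1) is generated by the preimages {X1 <= a}; for a 0/1 variable the
# distinct preimages are the empty set, {X1 = 0}, and the whole space.
def preimage(a):
    return frozenset(w for w in omega if X1[w] <= a)

sigma = {preimage(a) for a in (-1, 0, 1)}

# Close under complements and finite unions to obtain the sigma-algebra.
changed = True
while changed:
    changed = False
    for A in list(sigma):
        comp = frozenset(omega) - A
        if comp not in sigma:
            sigma.add(comp)
            changed = True
    for A in list(sigma):
        for B in list(sigma):
            if A | B not in sigma:
                sigma.add(A | B)
                changed = True

print(sorted(sorted(A) for A in sigma))
```

The closure loop reproduces the four events listed in the example: ∅, {TH, TT}, {HH, HT} and Ω.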
Two random variables X, Y are said to be independent if the σ-algebras σ(X) and σ(Y) are
independent. This agrees with the traditional definition because {X ≤ x} ∈ σ(X) and {Y ≤ y} ∈ σ(Y),
so
P({X ≤ x} ∩ {Y ≤ y}) = P(X ≤ x)P(Y ≤ y).

Lemma 1.3 If X is independent of F and f is a continuous function then the r.v. f(X) is
independent of F.

Almost sure statements about r.v.s
Given two r.v.s X, Y we say X = Y a.s. if the event {X = Y} is almost sure, that is, P(X = Y) = 1.
Given a sequence of r.v.s (X_n) and a r.v. X, we say X_n → X a.s. as n → ∞ if the event
A = {X_n → X} is almost sure, i.e.
P({ω : X_n(ω) → X(ω) as n → ∞}) = 1.

Expectations of Random Variables

Let X be a random variable on a probability space (Ω, F, P). The expectation represents the average
value that X can take.
We wish to define the expectation as an integral E_P[X] = ∫_Ω X(ω) dP(ω).
A random variable is called simple if it has the form X(ω) = Σ_{i=1}^n a_i 1_{A_i}(ω), where A_1, A_2, ..., A_n
are disjoint events, a_i ∈ ℝ, and 1_{A_i}(ω) = 1 if ω ∈ A_i, and 0 otherwise.
In this case
E_P[X] = ∫_Ω X(ω) dP := Σ_{i=1}^n a_i P(A_i).
For a general random variable X ≥ 0, take a sequence X_1, X_2, X_3, ... of simple random variables
such that 0 ≤ X_n ≤ X and X_n → X a.s. as n → ∞.
Define E_P[X] := sup_n E_P[X_n].

Simple properties of expectations
(i) If X = 1_A a.s. then E_P[X] = P(A).
(ii) If X ≥ 0 a.s. then E_P[X] ≥ 0.
(iii) E_P[a_1 X_1 + a_2 X_2] = a_1 E_P[X_1] + a_2 E_P[X_2].
The variance of a random variable X is defined by
Var(X) := E_P[(X − E_P[X])²] = E_P[X²] − E_P[X]².
The variance tells you about the average (squared) distance of X from its mean value.

Useful theorems for E_P[X]
Theorem 1.4 (Monotone convergence theorem) If X_n ≥ 0 is a monotone increasing sequence
of random variables with X_n → X a.s. as n → ∞, then E_P[X] = E_P[lim_{n→∞} X_n] = lim_{n→∞} E_P[X_n].

Typically, if we want to prove a result about E_P[X], we first take an approximating sequence of
simple random variables with X_n → X a.s., prove the result for E_P[X_n] (hopefully easier), then pass
to the limit.
An example of this is the following result.
Proposition 1.5 If F(x) = P(X ≤ x) is the distribution function of X then
E_P[X] = ∫ x dF(x).
Proof (sketch, for X ≥ 0). Because of the monotone convergence theorem (m.c.t.), it is sufficient to
prove this for simple X. Any simple r.v. can be written in the form
X = Σ_{i=0}^n a_i 1_{A_i},   0 = a_0 < a_1 < a_2 < ··· < a_n,
where the A_i are disjoint and ∪_i A_i = Ω [just reorder the a_i in increasing order and add
A_0 := (∪_{i≥1} A_i)^c].
By definition, E_P[X] = Σ_{i=1}^n a_i P(A_i).
Clearly, F(a_0) = P(X ≤ a_0) = P(A_0), F(a_1) = P(X ≤ a_1) = P(A_0) + P(A_1), etc.
So
∫ x dF(x) = a_1(F(a_1) − F(a_0)) + a_2(F(a_2) − F(a_1)) + ··· + a_n(F(a_n) − F(a_{n−1}))
= a_1 P(A_1) + a_2 P(A_2) + ··· + a_n P(A_n)
= E_P[X].

N.B. If F is differentiable then E_P[X] = ∫ x f(x) dx, where f(x) = F′(x) is the density function.

Here's another application of the m.c.t.:

Theorem 1.6 (Fubini's theorem) If X_n is a sequence of non-negative r.v.s then
E_P[Σ_{n=1}^∞ X_n] = Σ_{n=1}^∞ E_P[X_n].
Proof. Define Y_n = Σ_{i=1}^n X_i. Then Y_n is an increasing sequence of r.v.s such that
Y_n → Σ_{i=1}^∞ X_i as n → ∞. By the m.c.t.,
E_P[Σ_{i=1}^∞ X_i] = E_P[lim_{n→∞} Σ_{i=1}^n X_i] = E_P[lim_{n→∞} Y_n]
= lim_{n→∞} E_P[Y_n] = lim_{n→∞} E_P[Σ_{i=1}^n X_i]
= lim_{n→∞} Σ_{i=1}^n E_P[X_i] = Σ_{i=1}^∞ E_P[X_i].

Theorem. If X, Y are independent r.v.s and f, g are continuous functions then
E_P[f(X)g(Y)] = E_P[f(X)] E_P[g(Y)].
Proof. See Probability Essentials (Jacod & Protter). Essentially,
P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y)
implies
E_P[1_{X≤x} 1_{Y≤y}] = E_P[1_{X≤x}] E_P[1_{Y≤y}].
Then use the monotone convergence theorem.
Jensen's Inequality
Let f(x) be a convex function, i.e.
f(cx_1 + (1 − c)x_2) ≤ c f(x_1) + (1 − c) f(x_2)   for all 0 ≤ c ≤ 1,
e.g. f(x) = x², e^x, |x|.
If f is twice differentiable, then f is convex ⟺ f″ ≥ 0.
Another characterisation:
f is convex ⟺ at every point x_0 there exists a linear function l_{a,b}(x) = ax + b such that
f(x_0) = l_{a,b}(x_0) and f(x) ≥ l_{a,b}(x) for all x.
Thus all such tangents lie below f, and if
L := {(a, b) ∈ ℝ² : f(x) ≥ l_{a,b}(x) for all x ∈ ℝ}
then
f(x) = sup_{(a,b)∈L} l_{a,b}(x)   for all x ∈ ℝ.
Lemma 1.7 (Jensen) If f is a convex function, then
E_P[f(X)] ≥ f(E_P[X]).
Proof. Given (a, b) ∈ L,
E_P[f(X)] ≥ E_P[l_{a,b}(X)] = E_P[aX + b]
= a E_P[X] + b = l_{a,b}(E_P[X]).
Since this is true for all (a, b) ∈ L we have
E_P[f(X)] ≥ sup_{(a,b)∈L} l_{a,b}(E_P[X]) = f(E_P[X]).

Similarly, if g is concave (−g is convex) then E_P[g(X)] ≤ g(E_P[X]).
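As a quick numerical illustration (not part of the notes; the choice of a standard normal X is arbitrary), Jensen's inequality can be checked for the three convex examples above. Note that the sample-mean version holds exactly, since Jensen applies to the empirical distribution as well:

```python
import numpy as np

# Check E[f(X)] >= f(E[X]) for the convex functions x^2, e^x and |x|.
rng = np.random.default_rng(0)
X = rng.normal(size=100_000)

for f in (np.square, np.exp, np.abs):
    # Sample analogue of Jensen: mean of f(X) dominates f(mean of X).
    assert f(X).mean() >= f(X.mean())
    print(f.__name__, float(f(X).mean()), ">=", float(f(X.mean())))
```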


Conditional expectations
If X, Y are discrete random variables,
E_P[X|Y = y] = Σ_x x P(X = x|Y = y).
As y varies, the value of E_P[X|Y = y] will vary, i.e. the conditional expectation is a function of y.
Hence E_P[X|Y] is a function of the random variable Y and is therefore itself a random variable.
Example: Two fair coins, Ω = {HH, HT, TH, TT}. Let X_i = 1 if the i-th toss is H, and X_i = 0
otherwise, for i = 1, 2.
Consider E_P[X_1 + X_2|X_1].
Elementary calculation:
E_P[X_1 + X_2|X_1 = 0] = 0 + E_P[X_2] = 1/2
E_P[X_1 + X_2|X_1 = 1] = 1 + E_P[X_2] = 3/2
Therefore, as a random variable it is given by the table

ω:                        HH    HT    TH    TT
E_P[X_1 + X_2|X_1](ω):    3/2   3/2   1/2   1/2
Important observations
Recall: F_1 = σ(X_1) = {∅, {TH, TT}, {HH, HT}, Ω}.

(i)
{ω : E_P[X_1 + X_2|X_1](ω) = 3/2} = {HH, HT} = {X_1 = 1} ∈ F_1.
{ω : E_P[X_1 + X_2|X_1](ω) = 1/2} = {TH, TT} = {X_1 = 0} ∈ F_1.
⟹ E_P[X_1 + X_2|X_1] is F_1-measurable.

(ii) Let A ∈ F_1 and let 1_A(ω) := 1 if ω ∈ A, and 0 if ω ∉ A.
Consider E_P[1_A E_P[X_1 + X_2|X_1]].

Take A = {HH, HT} = {X_1 = 1}, so 1_A = X_1. Then
E_P[1_A E_P[X_1 + X_2|X_1]] = 1 · (3/2) · P(X_1 = 1) + 0 · (1/2) · P(X_1 = 0)
= (3/2) · (1/2) = 3/4.
E_P[1_A(X_1 + X_2)] = E_P[1_A X_1] + E_P[1_A X_2]
= E_P[X_1²] + E_P[X_1 X_2]
= E_P[X_1] + E_P[X_1] E_P[X_2]
= 1/2 + (1/2)(1/2) = 3/4.

Similarly, if A = {TH, TT} = {X_1 = 0}, then 1_A = 1 − X_1. So
E_P[1_A E_P[X_1 + X_2|X_1]] = 0 · (3/2) · P(X_1 = 1) + 1 · (1/2) · P(X_1 = 0) = 1/4.
E_P[1_A(X_1 + X_2)] = E_P[1_A X_1] + E_P[1_A X_2]
= E_P[(1 − X_1)X_1] + E_P[(1 − X_1)X_2]
= 0 + E_P[X_2] − E_P[X_1 X_2]
= 1/2 − 1/4 = 1/4.

Finally, take A = Ω, so 1_A = 1.
E_P[E_P[X_1 + X_2|X_1]] = (1/2)(1/2) + (3/2)(1/2) = 1
E_P[X_1 + X_2] = 1/2 + 1/2 = 1.

Conclusion: for every A ∈ F_1 it holds that
E_P[1_A E_P[X_1 + X_2|X_1]] = E_P[1_A(X_1 + X_2)].
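The coin-toss computations above can be verified exactly with rational arithmetic. This is an illustrative sketch (the helper names `E` and `cond_exp` are ad hoc, not from the notes):

```python
from itertools import product
from fractions import Fraction

# Two fair coins: each outcome has probability 1/4.
omega = ["".join(t) for t in product("HT", repeat=2)]
p = {w: Fraction(1, 4) for w in omega}
X1 = {w: int(w[0] == "H") for w in omega}
X2 = {w: int(w[1] == "H") for w in omega}

def E(f):
    # Expectation of a map omega -> value.
    return sum(p[w] * f(w) for w in omega)

def cond_exp(w):
    # E[X1+X2 | X1] as a random variable: average over the level set of X1.
    level = [v for v in omega if X1[v] == X1[w]]
    return sum(p[v] * (X1[v] + X2[v]) for v in level) / sum(p[v] for v in level)

# Defining property: E[1_A * E[X|G]] = E[1_A * X] for every A in sigma(X1).
for A in [set(), {"TH", "TT"}, {"HH", "HT"}, set(omega)]:
    lhs = E(lambda w: (w in A) * cond_exp(w))
    rhs = E(lambda w: (w in A) * (X1[w] + X2[w]))
    assert lhs == rhs

print(cond_exp("HH"), cond_exp("TT"))
```

The two printed values are the 3/2 and 1/2 computed in the example.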
This example suggests the following definition:
Given (Ω, F, P), let X be an F-measurable r.v., and let G ⊆ F be a smaller σ-algebra (in general, X is
not G-measurable).

Definition 1.8 The conditional expectation E_P[X|G] is defined to be the a.s. unique r.v. satisfying
(i) E_P[X|G] is G-measurable;
(ii) E_P[1_A E_P[X|G]] = E_P[1_A X] for all A ∈ G.
The definition is equivalent to saying that E_P[X|G] is the unique G-measurable r.v. such that
E_P[Y E_P[X|G]] = E_P[Y X]   for every G-measurable r.v. Y.   (*)
So, to prove a statement along the lines of E_P[X|G] = Z for some G-measurable r.v. Z, we could show
that
E_P[Y Z] = E_P[Y X]   for every G-measurable r.v. Y.   (**)
(**)
Properties of conditional expectations
(i) If X ≥ 0 a.s. then E_P[X|G] ≥ 0 a.s. for any G ⊆ F.
(ii) Linearity: E_P[a_1 X_1 + a_2 X_2|G] = a_1 E_P[X_1|G] + a_2 E_P[X_2|G].
(iii) Taking out what is known: if Z is a G-measurable r.v., then E_P[XZ|G] = Z E_P[X|G].
Proof. Let Y be a G-measurable r.v. Then since YZ is G-measurable, (*) implies that
E_P[YZ E_P[X|G]] = E_P[YZX].
By (**) this means that E_P[ZX|G] = Z E_P[X|G].
(iv) In particular, if X is G-measurable, then E_P[X|G] = X.
(v) E_P[E_P[X|G]] = E_P[X]. (Let Y = 1 in (*), or A = Ω.)
(vi) Tower property: if G_1 ⊆ G_2 are σ-algebras then
E_P[E_P[X|G_2]|G_1] = E_P[X|G_1].
Proof. To prove E_P[X|G_1] = Z for some expression Z, we need to show that E_P[YZ] = E_P[YX]
for every G_1-measurable Y.
With Z := E_P[E_P[X|G_2]|G_1], write X̂ := E_P[X|G_2], so that Z = E_P[X̂|G_1]. Then by the
definition (see (*)),
E_P[YZ] = E_P[Y X̂] = E_P[Y E_P[X|G_2]]
= E_P[E_P[YX|G_2]]   (by (iii), since Y is also G_2-measurable)
= E_P[YX]   (by (v)).
(vii) Independence and conditioning: X is independent of G (i.e. P({X ≤ a} ∩ A) = P(X ≤ a)P(A)
for all a ∈ ℝ, A ∈ G) iff E_P[X|G] = E_P[X].
Proof (we show only that independence implies E_P[X|G] = E_P[X]). Suppose X is independent
of G. We need to show that
E_P[Y E_P[X]] = E_P[YX]   for every G-measurable Y.
LHS = E_P[X] E_P[Y], because E_P[X] is a constant.
RHS = E_P[YX] = E_P[Y] E_P[X], by independence of X and Y (independence of X and Y here is
an easy exercise!).

Note: The conditional expectation E_P[X|G] is the G-measurable r.v. Y that minimizes E_P[(Y − X)²].
It is used in least squares estimation (LSE): think of G as the available data, in which case
E_P[X|G] is the LSE of X.
The monotone convergence theorem and Jensen's inequality also apply to conditional expectations.
e.g. Monotone convergence theorem: if X_n ≥ 0 with 0 ≤ X_1 ≤ X_2 ≤ ... and X_n → X a.s. as
n → ∞, then
lim_{n→∞} E_P[X_n|G] = E_P[lim_{n→∞} X_n|G] = E_P[X|G] a.s.

Jensen's inequality: if f is a convex function,
E_P[f(X)|G] ≥ f(E_P[X|G]) a.s.

Relationship to the elementary notion of conditional expectation:
For a fixed value y we know that E_P[X|Y = y] is a number which represents a particular value of the
random variable E_P[X|σ(Y)]. Similarly, the r.v. E_P[X|G] can take on values of the form E_P[X|A],
for some A ∈ G.
For example, in the previous coin tossing example, X = X_1 + X_2, G = σ(X_1). We calculated that
E_P[X|X_1 = 0] = 1/2,   E_P[X|X_1 = 1] = 3/2.

1.2 Stochastic processes

Let T ⊆ [0, ∞) be a time index set. The two usual cases are T = {0, 1, 2, ...} and T = [0, ∞).
Definition 1.9 A real-valued stochastic process is a family of real-valued random variables X =
{X_t}_{t∈T} defined on (Ω, F, P).
If T = {0, 1, 2, ...} then X is a sequence of random variables, and we say that X is a discrete-time
stochastic process (or a time series). If T = [0, ∞) then we say that X is a continuous-time stochastic
process.
We shall sometimes write X(t) instead of X_t, but these mean the same. For a particular value ω we
write X(t, ω) = X_t(ω). When we write X(ω) we mean a particular path of X.
If you are observing a share price over time then information about the process will be gradually
revealed over time. We model this with a filtration.
Definition 1.10 A filtration of (Ω, F, P) is an increasing family of σ-algebras {F_t}_{t∈T} such that
F_t ⊆ F. We shall say that (Ω, F, (F_t)_{t∈T}, P) is a filtered probability space.
Increasing means that if s ≤ t, then F_s ⊆ F_t. Think of a filtration as the history of the process
X.
Definition 1.11 Let X be a stochastic process defined on (Ω, F, P). We say that X is adapted to
the filtration (F_t)_{t∈T} if X_t is F_t-measurable for each t ∈ T.
Any process X is automatically adapted to the filtration F_t^X := σ{X_s : s ≤ t} generated by itself.
Example 1.12 Suppose that the first coin is tossed at time 1, and the second at time 2. The
winnings from both bets can be represented by a process Y = {Y_t}_{t∈T} with T = {0, 1, 2}:
Y_0 := 0,   Y_1 := X_1,   Y_2 := X_1 + X_2.
Let {F_t}_{t∈T} be the filtration generated by Y:
F_0 := {∅, Ω},
F_1 := σ(Y_1) = σ(X_1) = {∅, {HH, HT}, {TH, TT}, Ω},
F_2 := σ(Y_1, Y_2) = σ(X_1, X_2) = F.
Check that at each time t ∈ T, the r.v. Y_t is measurable w.r.t. F_t:
e.g. for t = 1, Y_1 = X_1 is (by definition!) σ(Y_1)-measurable, and σ(Y_1) = F_1.
Let's check directly: {X_1 ≤ 0.3} = {TH, TT} ∈ F_1, etc.

1.3 Martingales

Definition 1.13 A stochastic process {M_t}_{t∈T} is said to be a P-martingale with respect to a filtration
{F_t}_{t∈T} if
(i) E_P[|M_t|] < ∞ for all t ∈ T,
(ii) {M_t}_{t∈T} is {F_t}_{t∈T}-adapted,
(iii) E_P[M_t|F_s] = M_s if s ≤ t.

N.B. Due to property (v) of E_P[·|F],
E_P[M_t] = E_P[E_P[M_t|F_s]] = E_P[M_s]   for all s ≤ t.
Note however: if M_t satisfies E_P[M_t] = E_P[M_s] for all s ≤ t, then M_t does not have to be a martingale.
A process M_t is an {F_t}-supermartingale if (i)-(ii) hold and E_P[M_t|F_s] ≤ M_s a.s. So E_P[M_t] ≤
E_P[M_s] for s ≤ t.
A process M_t is an {F_t}-submartingale if (i)-(ii) hold and E_P[M_t|F_s] ≥ M_s a.s. So E_P[M_t] ≥
E_P[M_s] for s ≤ t.
Note: A process M_t is a martingale iff it is both a super- and a submartingale.

Proposition 1.14 A discrete-time stochastic process M_n is an F_n-martingale iff
E_P[M_{n+1}|F_n] = M_n   for all n ∈ ℕ.
Proof. We need to show that E_P[M_n|F_k] = M_k for all k < n:
E_P[M_n|F_k] = E_P[E_P[M_n|F_{n−1}]|F_k]   (because F_k ⊆ F_{n−1}, so the tower property applies)
= E_P[M_{n−1}|F_k]
= ...
= E_P[M_{k+1}|F_k]
= M_k.

Some examples of (super-/sub-)martingales

Example 1.15 (Sums of i.i.d. r.v.s) Let X_1, X_2, X_3, ... be i.i.d. random variables with
E_P[|X_1|] < ∞.
Let M_n := m_0 + X_1 + X_2 + ··· + X_n, with m_0 ∈ ℝ.
Let F_n = σ{X_1, X_2, ..., X_n} = σ{M_1, M_2, ..., M_n}.
Then
If E_P[X_1] = 0 then M_n is an {F_n}-martingale.
If E_P[X_1] ≤ 0 then M_n is an {F_n}-supermartingale.
If E_P[X_1] ≥ 0 then M_n is an {F_n}-submartingale.
Check the 3 conditions in the definition of a martingale:
(i) E_P[|M_n|] = E_P[|m_0 + X_1 + ··· + X_n|]
≤ E_P[|m_0| + |X_1| + ··· + |X_n|]
= |m_0| + n E_P[|X_1|] < ∞.
(ii) This follows by definition of {F_n}.
(iii) E_P[M_{n+1}|F_n] = E_P[m_0 + X_1 + ··· + X_n + X_{n+1}|F_n]
= E_P[M_n|F_n] + E_P[X_{n+1}|F_n]
= M_n + E_P[X_{n+1}],
which equals M_n if E_P[X_{n+1}] = 0, is ≤ M_n if E_P[X_{n+1}] ≤ 0, and is ≥ M_n if E_P[X_{n+1}] ≥ 0.
The symmetric random walk is a special case of the last example: let X_1 = ±1 with equal probability,
so E_P[X_1] = 0.
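As a numerical illustration of Example 1.15 (not part of the notes; the path count and horizon are arbitrary), one can simulate many symmetric random walks and check that E_P[M_n] stays at m_0 for every n, as the martingale property requires:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, m0 = 100_000, 50, 0.0

# Symmetric random walk: i.i.d. steps of +/-1 with equal probability, so E[X_1] = 0.
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
M = m0 + np.cumsum(steps, axis=1)

# Martingale => E[M_n] = m0 for every n; all Monte Carlo means should be near 0.
means = M.mean(axis=0)
print(float(np.abs(means).max()))
```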
Example 1.16 (Products of i.i.d. r.v.s) Let Z_1, Z_2, ... be i.i.d. with E_P[|Z_1|] < ∞.
Let M_n = m_0 ∏_{k=1}^n Z_k, m_0 ∈ ℝ. Suppose E_P[Z_1] = 1 and let F_n = σ{Z_1, Z_2, ..., Z_n}. Then M_n is
an {F_n}-martingale:
E_P[|M_n|] = E_P[|m_0| ∏_{k=1}^n |Z_k|] = |m_0| ∏_{k=1}^n E_P[|Z_k|]
= |m_0| E_P[|Z_1|]^n < ∞.
E_P[M_{n+1}|F_n] = E_P[m_0 Z_1 Z_2 ··· Z_n Z_{n+1}|F_n]
= M_n E_P[Z_{n+1}|F_n]
= M_n E_P[Z_{n+1}]
= M_n.
Similarly, if E_P[Z_1] ≥ 1 (resp. E_P[Z_1] ≤ 1), M_n is a submartingale (resp. supermartingale).
The binomial tree is a special case of the last example. This is how a stock price was modelled in
Discrete-Time Finance. We set
S_n = S_0 Z_1 Z_2 ··· Z_n
where the Z_n are i.i.d. and
Z_1 = u with probability p,  d with probability 1 − p.

e.g. [three-step binomial tree: from S_0 the price moves to S_0u (prob. p) or S_0d (prob. 1 − p);
after two steps to S_0u², S_0ud or S_0d²; after three steps to S_0u³, S_0u²d, S_0ud² or S_0d³.]
Note that E_P[Z_1] = pu + (1 − p)d. If we choose p = (e^δ − d)/(u − d), where δ is the risk-free rate per
period, then E_P[e^{−δ} Z_1] = 1, so
M_n := S_0(e^{−δ}Z_1)(e^{−δ}Z_2) ··· (e^{−δ}Z_n) = e^{−δn} S_n
is a martingale (by Example 1.16 applied to the i.i.d. factors e^{−δ}Z_k).
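The risk-neutral choice of p can be checked directly. This is an illustrative sketch with arbitrary parameter values for u, d and the per-period rate δ:

```python
import math

# One-period binomial model; u, d, delta are arbitrary illustrative values.
u, d, delta = 1.2, 0.9, 0.05
p = (math.exp(delta) - d) / (u - d)
assert 0.0 < p < 1.0  # needed for an arbitrage-free tree (d < e^delta < u)

# Defining property: E[e^{-delta} Z_1] = 1, which makes e^{-delta n} S_n a martingale.
expected = math.exp(-delta) * (p * u + (1 - p) * d)
print(expected)
```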
Example 1.17 (Doob's martingale) Let X be a r.v. on (Ω, F, {F_t}_{t∈T}, P) such that E_P[|X|] < ∞,
and let M_t := E_P[X|F_t]. Then M_t is a martingale.
(i) By Jensen's inequality,
E_P[|M_t|] = E_P[|E_P[X|F_t]|] ≤ E_P[E_P[|X| | F_t]] = E_P[|X|] < ∞.
(ii) M_t is F_t-measurable, so the process is adapted.
(iii) By the tower property, E_P[M_t|F_s] = E_P[E_P[X|F_t]|F_s] = E_P[X|F_s] = M_s.

Proposition 1.18 (Jensen's inequality) Suppose M_t is a martingale, and let f be a convex
function. Then f(M_t) is a submartingale (assuming E_P[f(M_t)] < ∞).
Proof. E_P[f(M_t)|F_s] ≥ f(E_P[M_t|F_s]) = f(M_s).
Similarly, if g is concave then g(M_t) is a supermartingale.

1.4 Brownian motion as the limit of a random walk

Consider the first step of a symmetric random walk, M_t, starting at 0.
[Diagram: from 0 at t = 0 the walk moves to +1 or −1 at t = 1, each branch with probability 1/2.]
E[M_1] = 0,   Var_P(M_1) = E[M_1²] − E[M_1]² = 1.
Suppose we would like to model the random walk on a finer timescale.
[Diagram: the same walk with each unit time step subdivided, so several smaller up/down moves
occur per unit of time.]
Keep doubling the number of time steps. Define
Δt := 2^{−n}   and   T^{(n)} := {0, Δt, 2Δt, 3Δt, ...}.
Let X^{(n)}_{Δt}, X^{(n)}_{2Δt}, X^{(n)}_{3Δt}, ... be i.i.d. random variables, with
X^{(n)}_{iΔt} := +√Δt w.p. 1/2,  −√Δt w.p. 1/2.
Note that E[X^{(n)}_{Δt}] = 0 and Var_P(X^{(n)}_{Δt}) = Δt. Now define the random walk
M^{(n)}_0 := 0,   M^{(n)}_t := Σ_{i=1}^k X^{(n)}_{iΔt},   t = kΔt ∈ T^{(n)}.
Note that M^{(0)}_t is just the original symmetric random walk, M_t.

We know from Example 1.15 that each process M^{(n)} is a martingale. Furthermore, for any t = kΔt
we have E[M^{(n)}_t] = 0, and
Var_P(M^{(n)}_t) = Σ_{i=1}^k Var_P(X^{(n)}_{iΔt}) = Σ_{i=1}^k Δt = kΔt = t,
e.g. Var_P(M^{(n)}_1) = 1.
Rewrite
M^{(n)}_t = √t · (Σ_{i=1}^k X^{(n)}_{iΔt} − k E[X^{(n)}_{Δt}]) / √(k Var_P(X^{(n)}_{Δt})).
Keeping t fixed, the Central Limit Theorem states that the term in brackets has a standard normal
distribution in the limit as n → ∞. Therefore,
W_t := M^{(∞)}_t ~ N(0, t).

Properties of W_t
(i) W_0 = 0: W_0 = lim_{n→∞} M^{(n)}_0 = 0.
(ii) W_t has continuous trajectories: as n gets larger, the paths which the random walk M^{(n)}_t
takes become continuous. In the limit, you could draw the trajectories without lifting pen from
paper.
(iii) W_t − W_s ~ N(0, t − s): take s = jΔt < t = kΔt. Then
M^{(n)}_t − M^{(n)}_s = Σ_{i=j+1}^k X^{(n)}_{iΔt}
= √(t − s) · (Σ_{i=j+1}^k X^{(n)}_{iΔt} − (k − j)E[X^{(n)}_{Δt}]) / √((k − j) Var_P(X^{(n)}_{Δt})).
By the CLT, we have
W_t − W_s = M^{(∞)}_t − M^{(∞)}_s ~ N(0, t − s).
(iv) W_t − W_s (t > s) is independent of W_u for u ≤ s (independent increments):
the X^{(n)}_{iΔt} are i.i.d., therefore Σ_{i=1}^j X^{(n)}_{iΔt} is independent of Σ_{i=k+1}^l X^{(n)}_{iΔt} provided
0 < j ≤ k < l. In the limit as n → ∞ we see that W_u is independent of W_t − W_s, provided
u ≤ s < t.
A process W with properties (i)-(iv) is called Brownian motion.
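The scaled random walk construction above is easy to simulate. A minimal sketch (parameter values are arbitrary): sum i.i.d. ±√Δt steps up to a fixed time t and check that the result has mean 0 and variance t, as the CLT argument predicts.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8                    # time step size dt = 2**-n, as in the construction above
dt = 2.0 ** -n
t = 1.0
k = int(t / dt)
n_paths = 20_000

# Scaled walk: i.i.d. +/- sqrt(dt) steps summed up to time t = k * dt.
steps = rng.choice([-1.0, 1.0], size=(n_paths, k)) * np.sqrt(dt)
Mt = steps.sum(axis=1)

# CLT: M_t^(n) is approximately N(0, t) for large n.
print(float(Mt.mean()), float(Mt.var()))
```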


1.5 Geometric Brownian motion as the limit of the binomial tree

Let's return to the binomial tree (Example 1.16), but with a time-step of Δt = 2^{−n}. That is,
S^{(n)}_t = S_0 Z_{Δt} Z_{2Δt} ··· Z_t,
where the
Z^{(n)}_{iΔt} = u with probability p_u,  d with probability 1 − p_u
are i.i.d. r.v.s modelling the returns. Suppose we choose p_u = 1/2, and we set
u = e^{(μ − σ²/2)Δt + σ√Δt},
d = e^{(μ − σ²/2)Δt − σ√Δt}.
We have carefully chosen u, d so that the mean and variance of the return over each time interval
are approximately e^{μΔt} and σ²Δt.
Using the expansion e^x = 1 + x + x²/2 + ...,
E[Z_{Δt}] = p_u u + (1 − p_u)d
= (1/2) e^{(μ − σ²/2)Δt + σ√Δt} + (1/2) e^{(μ − σ²/2)Δt − σ√Δt}
= (1/2)[1 + (μ − σ²/2)Δt + σ√Δt + (1/2)σ²Δt + ...]
+ (1/2)[1 + (μ − σ²/2)Δt − σ√Δt + (1/2)σ²Δt + ...]
= 1 + μΔt + ...
≈ e^{μΔt}.


Var_P(Z_{Δt}) ≈ E[(Z_{Δt} − e^{μΔt})²]
= (1/2)(u − e^{μΔt})² + (1/2)(d − e^{μΔt})²
= (1/2) e^{2μΔt}(e^{−σ²Δt/2 + σ√Δt} − 1)²
+ (1/2) e^{2μΔt}(e^{−σ²Δt/2 − σ√Δt} − 1)²
= (1/2) e^{2μΔt}(−(1/2)σ²Δt + σ√Δt + (1/2)σ²Δt + ...)²
+ (1/2) e^{2μΔt}(−(1/2)σ²Δt − σ√Δt + (1/2)σ²Δt + ...)²
≈ e^{2μΔt} σ²Δt = (1 + 2μΔt + ...)σ²Δt
≈ σ²Δt.
Define
X^{(n)}_{iΔt} := (ln Z^{(n)}_{iΔt} − (μ − σ²/2)Δt)/σ = +√Δt w.p. 1/2,  −√Δt w.p. 1/2.
Then
ln S^{(n)}_t = ln S_0 + Σ_{i=1}^k ln Z^{(n)}_{iΔt}   (t = kΔt)
= ln S_0 + Σ_{i=1}^k [(μ − σ²/2)Δt + σ X^{(n)}_{iΔt}]
= ln S_0 + k(μ − σ²/2)Δt + σ Σ_{i=1}^k X^{(n)}_{iΔt}
= ln S_0 + (μ − σ²/2)t + σ M^{(n)}_t.
Taking the limit as n → ∞,
ln S_t = ln S_0 + (μ − σ²/2)t + σW_t.
Thus
S_t = S_0 e^{(μ − σ²/2)t + σW_t}.
This process is called geometric Brownian motion, and is the model that we will use for a share price.
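The closed form S_t = S_0 e^{(μ − σ²/2)t + σW_t} can be simulated exactly by drawing W_t ~ N(0, t). A quick Monte Carlo sketch (parameter values are arbitrary) confirms that the mean return over [0, t] is e^{μt}, as the expansion above suggests:

```python
import numpy as np

rng = np.random.default_rng(3)
S0, mu, sigma, t, n_paths = 1.0, 0.05, 0.2, 1.0, 200_000

# Exact simulation at time t: W_t ~ N(0, t), then plug into the closed form.
W = rng.normal(0.0, np.sqrt(t), size=n_paths)
St = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

# Lognormal mean: E[S_t] = S0 * e^{mu t}.
print(float(St.mean()), float(S0 * np.exp(mu * t)))
```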


1.6 Brownian motion: definitions and properties

The most important example of a continuous-time martingale is Brownian motion. This process is
named after Robert Brown, a Scottish botanist (born in Montrose, Angus) who was studying pollen
grains in suspension in the early nineteenth century. He observed that the pollen performed a very
random movement and thought this was because the pollen grains were alive. We now know this
rapid movement is due to collisions at the molecular level.
Two of the many people responsible for giving a mathematical description of Brownian
motion are Louis Bachelier (in his PhD thesis Théorie de la Spéculation in 1900) and Albert
Einstein (in a 1905 paper).

Definition 1.19 A process X is said to have continuous sample paths if the map t ↦ X_t(ω) is
continuous for almost all ω.

Definition 1.20 A stochastic process W = {W_t}_{t≥0} is called a standard Brownian motion if
(i) W_0 = 0;
(ii) W has continuous sample paths;
(iii) W_t − W_s ~ N(0, t − s) for s ≤ t;
(iv) W_t − W_s is independent of F^W_s := σ{W_u : u ≤ s}.
{F^W_t}_{t≥0} is the filtration generated by the Brownian motion. Hence (iv) means that the increments
are independent of everything that happened up to time s.
Brownian motion is also frequently referred to as a Wiener process.

Properties of Brownian motion
Although the paths of Brownian motion are continuous, they are nowhere differentiable (a.s.).
One-dimensional Brownian motion is recurrent, i.e. it always returns eventually to any level in
finite time (a.s.).
Let W_t be a standard Brownian motion. Then so are the following:
(i) B_t = c^{−1} W_{c²t} for any c > 0 (Brownian scaling)
(ii) B_t = −W_t (reflection, or symmetry)

(iii) B_t = W_{t+s} − W_s for fixed s (stationary independent increments property)
(iv) B_t = tW_{1/t} for t > 0, B_0 := 0 (time inversion)

Lemma 1.21 A Brownian motion W is a martingale with respect to P and {F^W_t}_{t≥0}.
Proof. We write F_t for F^W_t.
(i) For each t > 0, W_t is F_t-measurable by definition of F_t. Hence {W_t}_{t≥0} is {F_t}_{t≥0}-adapted.
(ii) Since W_t ~ N(0, t) we have
E[|W_t|] = (1/√(2πt)) ∫_{−∞}^{∞} |w| e^{−w²/(2t)} dw
= (2/√(2πt)) ∫_0^∞ w e^{−w²/(2t)} dw
= (2/√(2πt)) [−t e^{−w²/(2t)}]_0^∞
= √(2t/π) < ∞.
(iii) Given s ≤ t,
E[W_t|F_s] = E[W_t − W_s|F_s] + E[W_s|F_s]
= E[W_t − W_s] + W_s = W_s.

In the tutorials, we'll show that both M_t := W_t² − t and Z_t := exp(σW_t − (σ²/2)t) are
{F_t}_{t≥0}-martingales.

Multidimensional Brownian motion: W_t = (W_t^1, W_t^2, ..., W_t^n), where the W_t^i are independent
one-dimensional Brownian motions.
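The two tutorial martingales can be checked numerically: since both start at known values (0 and 1), their expectations must stay there for every t. A Monte Carlo sketch with arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)
t, sigma, n_paths = 2.0, 0.3, 400_000

# Sample W_t ~ N(0, t) directly.
W = rng.normal(0.0, np.sqrt(t), size=n_paths)

# Martingale starting values: E[W_t^2 - t] = 0 and E[exp(sigma*W_t - sigma^2 t/2)] = 1.
m1 = (W**2 - t).mean()
m2 = np.exp(sigma * W - 0.5 * sigma**2 * t).mean()
print(float(m1), float(m2))
```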


1.7 Quadratic variation

We want to measure the amount of oscillation (or volatility) in a process. Take an interval [0, t], and
partition it into n intervals of equal length Δt = t/n.

The variation of a process X is defined to be
lim_{n→∞} Σ_{i=0}^{n−1} |X(s_{i+1}) − X(s_i)|,
where s_i := iΔt and Δt = t/n.
For Brownian motion this is ∞ (a.s.).

The quadratic variation of a process X is defined to be
[X]_t := lim_{n→∞} Σ_{i=0}^{n−1} (X(s_{i+1}) − X(s_i))².
For Brownian motion the quadratic variation on [0, t] is t. Thus [W]_t = t.

Note: If A is a continuous, finite variation process then [A]_t = 0.
Given two processes X, Y, their quadratic co-variation is defined by
[X, Y]_t := lim_{n→∞} Σ_{i=0}^{n−1} (X_{s_{i+1}} − X_{s_i})(Y_{s_{i+1}} − Y_{s_i}).

Some useful results:
[X, X]_t = [X]_t
[X, Y]_t = [Y, X]_t,   [X, Y + Z]_t = [X, Y]_t + [X, Z]_t.
If X and Y are independent then [X, Y]_t = 0. Therefore
[X + Y]_t = [X + Y, X + Y]_t = [X]_t + 0 + [Y]_t.
If either X or Y has finite variation then [X, Y]_t = 0.
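The contrast between the two notions of variation for a Brownian path is easy to see numerically. A sketch with an arbitrary partition size: the quadratic variation of a simulated path settles at t, while the first variation keeps growing as the partition is refined (it is of order √n here).

```python
import numpy as np

rng = np.random.default_rng(5)
t, n = 1.0, 200_000
dt = t / n

# One Brownian path on [0, t] via independent N(0, dt) increments.
dW = rng.normal(0.0, np.sqrt(dt), size=n)

variation = np.abs(dW).sum()   # first variation: grows like sqrt(n), diverges
quad_var = (dW**2).sum()       # quadratic variation: converges to [W]_t = t

print(float(quad_var), float(variation))
```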


1.8 Introduction to stochastic integration

Riemann integration
Let h be a function on [0, t]. Then we define
∫_0^t h(s) ds := lim_{n→∞} Σ_{i=0}^{n−1} h(s_i)(s_{i+1} − s_i),
where [0, t] is divided into n intervals [s_i, s_{i+1}].

Riemann-Stieltjes integration
Let F be a continuous function with finite variation on [0, t]. The integral is
∫_0^t h(s) dF(s) := lim_{n→∞} Σ_{i=0}^{n−1} h(s_i)[F(s_{i+1}) − F(s_i)].
Note also that
∫_0^t h(s) dF(s) = lim_{n→∞} Σ_{i=0}^{n−1} h(s_{i+1})[F(s_{i+1}) − F(s_i)].

Stochastic integration
Let H_t be a stochastic process. We will need to define the stochastic integrals
I_t := ∫_0^t H_s ds   and   J_t := ∫_0^t H_s dW_s.
The first integral can be defined pathwise, i.e. fix ω, set h(s) := H_s(ω) and do normal
Riemann integration.
Note that the variation of I_t is ∫_0^t |H_s| ds, and the quadratic variation [I]_t is identically zero.
The second integral is more difficult. It looks like a Riemann-Stieltjes integral, but because W_t has
infinite variation, the sum
Σ_i H_{s_i}[W_{s_{i+1}} − W_{s_i}]
diverges a.s. The standard construction of the integral won't work.


Example 1.22 Compare the following two possible approximations to ∫_0^t W_s dW_s:
(i) Σ_{i=0}^{n−1} W_{s_i}(W_{s_{i+1}} − W_{s_i});
(ii) Σ_{i=0}^{n−1} W_{s_{i+1}}(W_{s_{i+1}} − W_{s_i}).
Take expectations in (i) and (ii).
In (i),
E[Σ_{i=0}^{n−1} W_{s_i}(W_{s_{i+1}} − W_{s_i})] = Σ_{i=0}^{n−1} E[W_{s_i}(W_{s_{i+1}} − W_{s_i})]
= Σ_{i=0}^{n−1} E[W_{s_i}] E[W_{s_{i+1}} − W_{s_i}]
= 0.
In (ii),
E[Σ_{i=0}^{n−1} W_{s_{i+1}}(W_{s_{i+1}} − W_{s_i})]
= Σ_{i=0}^{n−1} E[W_{s_{i+1}}(W_{s_{i+1}} − W_{s_i})] − Σ_{i=0}^{n−1} E[W_{s_i}(W_{s_{i+1}} − W_{s_i})]   (the subtracted sum is 0 by (i))
= Σ_{i=0}^{n−1} E[(W_{s_{i+1}} − W_{s_i})²]
= Σ_{i=0}^{n−1} (s_{i+1} − s_i) = t.
The approximation (i) leads to the Itô integral, while (ii) leads to the Stratonovich integral. We will
concentrate on the Itô integral.
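The gap of size t between the two approximating sums in Example 1.22 shows up clearly in simulation. A sketch with arbitrary discretisation parameters:

```python
import numpy as np

rng = np.random.default_rng(6)
t, n, n_paths = 1.0, 800, 4_000
dt = t / n

# Brownian paths on [0, t]: increments, then cumulative sums with W_0 = 0.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
W = np.hstack([np.zeros((n_paths, 1)), dW.cumsum(axis=1)])

left = (W[:, :-1] * dW).sum(axis=1)    # evaluate at left endpoints (Ito)
right = (W[:, 1:] * dW).sum(axis=1)    # evaluate at right endpoints (Stratonovich-style)

print(float(left.mean()), float(right.mean()))  # near 0 and near t
```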

Construction of the Itô integral
A process K_t is called simple if it has the following form:
K_t = H_0 1_{[0,s_1]}(t) + Σ_{i=1}^{n−1} H_i 1_{(s_i, s_{i+1}]}(t),
where the H_i are F_{s_i}-measurable random variables.

Step 1: Define
∫_0^t K_s dW_s := Σ_i H_i (W_{s_{i+1}} − W_{s_i}).

Step 2:
E[∫_0^t K_s dW_s] = E[Σ_i H_i (W_{s_{i+1}} − W_{s_i})] = Σ_i E[H_i] E[W_{s_{i+1}} − W_{s_i}] = 0.
Also,
E[(∫_0^t K_s dW_s)²] = E[(Σ_i H_i (W_{s_{i+1}} − W_{s_i}))²]
= E[Σ_{i,j} H_i (W_{s_{i+1}} − W_{s_i}) H_j (W_{s_{j+1}} − W_{s_j})]
= Σ_i E[H_i² (W_{s_{i+1}} − W_{s_i})²] + Σ_{i≠j} E[H_i (W_{s_{i+1}} − W_{s_i}) H_j (W_{s_{j+1}} − W_{s_j})]
= Σ_i E[H_i²] E[(W_{s_{i+1}} − W_{s_i})²] + 0
= Σ_i E[H_i²] (s_{i+1} − s_i)
= E[Σ_i H_i² (s_{i+1} − s_i)]
= E[∫_0^t K_s² ds].

Step 3: Any continuous, adapted process H_t with E[∫_0^t H_s² ds] < ∞ for all t can be approximated
by a sequence of simple processes K_n(t) in the L² sense:
i.e. E[∫_0^t (H_s − K_n(s))² ds] → 0   as n → ∞.

Step 4: Steps 2 and 3 imply that the integrals ∫_0^t K_n(s) dW_s converge in L² to a limit J(t),
i.e. E[(∫_0^t K_n(s) dW_s − J(t))²] → 0   as n → ∞.
Define the integral to be this limit:
∫_0^t H_s dW_s := J(t).
For more information on the above construction, see Björk or Øksendal.

Properties of Itô integrals
(i) J_t is adapted to {F_t}_{t≥0}, where F_t := σ{W_u : u ≤ t}.
(ii) ∫_0^t H_s dW_s = ∫_0^u H_s dW_s + ∫_u^t H_s dW_s for u < t.
(iii) ∫_0^t (aH_s + bH′_s) dW_s = a ∫_0^t H_s dW_s + b ∫_0^t H′_s dW_s for any a, b ∈ ℝ.
(iv) If E[∫_0^t H_s² ds] < ∞ for all t > 0 then the results of Step 2 are preserved:
E[∫_0^t H_s dW_s] = 0
and
E[(∫_0^t H_s dW_s)²] = E[∫_0^t H_s² ds].
This is called the Itô isometry.
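The Itô isometry can be checked by Monte Carlo for a concrete integrand, say H_s = W_s (an arbitrary illustrative choice). Using ∫_0^t E[W_s²] ds = t²/2, both sides should be close to 0.5 for t = 1:

```python
import numpy as np

rng = np.random.default_rng(7)
t, n, n_paths = 1.0, 500, 10_000
dt = t / n

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
W = np.hstack([np.zeros((n_paths, 1)), dW.cumsum(axis=1)])

# Ito sums for the integrand H_s = W_s, and the isometry's right-hand side.
J = (W[:, :-1] * dW).sum(axis=1)          # approximates int_0^t W_s dW_s
rhs = (W[:, :-1] ** 2).sum(axis=1) * dt   # approximates int_0^t W_s^2 ds

print(float((J**2).mean()), float(rhs.mean()))  # both near t^2/2
```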


(v) If H_s is deterministic (i.e. non-random) then
∫_0^t H_s dW_s ~ N(0, ∫_0^t H_s² ds).
(vi) For a stochastic integral J_t := ∫_0^t H_s dW_s, the quadratic variation is [J]_t = ∫_0^t H_s² ds.

Theorem 1.23 If E[∫_0^t H_s² ds] < ∞ for all t then J_t := ∫_0^t H_s dW_s is a martingale.
Proof (sketch). We only show property (iii) of a martingale:
E_P[J_t|F_s] = E_P[J_s + (J_t − J_s)|F_s]
= J_s + E_P[J_t − J_s]
= J_s + E_P[J_t] − E_P[J_s] = J_s.

So far, we have been able to define the integral J_t := ∫_0^t H_s dW_s for H_t satisfying
E[∫_0^t H_s² ds] < ∞.
The definition can actually be extended to include H_t satisfying P(∫_0^t H_s² ds < ∞) = 1; however,
the nice properties in (iv) above (zero expectation and the Itô isometry) then need not hold. In this
case J_t might fail to be a martingale, in which case it is a local martingale.

1.9 Itô processes and Itô's formula

Definition 1.24 An Itô process is any stochastic process of the form
X_t := X_0 + ∫_0^t μ_s ds + ∫_0^t σ_s dW_s,
where W_t is a standard Brownian motion, and μ_t and σ_t are integrable processes.

Usually, we write the above using the shorthand
dX_t = μ_t dt + σ_t dW_t.   (*)
To work out the quadratic variation of an Itô process, we write X_t = X_0 + I_t + J_t, where
I_t := ∫_0^t μ_s ds,   J_t := ∫_0^t σ_s dW_s.
Then
[X]_t = [X_0 + I + J]_t = [I + J, I + J]_t
= [I]_t + 2[I, J]_t + [J]_t
= 0 + 0 + ∫_0^t σ_s² ds.
In shorthand, d[X]_t = σ_t² dt.
It's easy to define integrals with respect to an Itô process: multiply both sides of (*) by another
process σ′_t,
σ′_t dX_t = σ′_t μ_t dt + σ′_t σ_t dW_t.
Then
∫_0^t σ′_s dX_s := ∫_0^t σ′_s μ_s ds + ∫_0^t σ′_s σ_s dW_s.

Itô's formula

Recall the fundamental theorem of calculus:
$$f(t) - f(0) = \int_0^t f'(s)\,ds.$$
More generally, if $x_t$ has finite variation:
$$f(x_t) - f(x_0) = \int_0^t f'(x_s)\,dx_s.$$
The stochastic version of the fundamental theorem of calculus is the Itô formula.

Proposition 1.25 Let $f(t, x)$ be a smooth function. Let $\dot f(t,x) := \frac{\partial f}{\partial t}(t,x)$, $f'(t,x) := \frac{\partial f}{\partial x}(t,x)$ and $f''(t,x) := \frac{\partial^2 f}{\partial x^2}(t,x)$. Then $f(t, W_t)$ is an Itô process satisfying
$$f(t, W_t) - f(0, W_0) = \int_0^t \dot f(s, W_s)\,ds + \int_0^t f'(s, W_s)\,dW_s + \frac{1}{2}\int_0^t f''(s, W_s)\,ds.$$
Proof (Sketch). Write down the Taylor series expansion of $f$ (around 0):
$$f(t, W_t) = f(0, W_0) + \sum_i \frac{\partial f}{\partial s}\,\Delta s_i + \sum_i \frac{\partial f}{\partial x}\,\Delta W_i + \frac{1}{2}\sum_i \frac{\partial^2 f}{\partial s^2}(\Delta s_i)^2 + \sum_i \frac{\partial^2 f}{\partial s\,\partial x}\,\Delta s_i\,\Delta W_i + \frac{1}{2}\sum_i \frac{\partial^2 f}{\partial x^2}(\Delta W_i)^2 + o\big((\Delta s_i)^3, (\Delta W_i)^3\big),$$
where $\Delta s_{i+1} := s_{i+1} - s_i$ and $\Delta W_{i+1} := W_{s_{i+1}} - W_{s_i}$ for some partition $\{[s_i, s_{i+1}]\}$ of $[0, t]$. Let $\Delta s_i, \Delta W_i \to 0$ in the above.

Itô's formula is usually abbreviated to
$$df(t, W_t) = \left(\dot f(t, W_t) + \frac{1}{2}f''(t, W_t)\right)dt + f'(t, W_t)\,dW_t.$$
If $f(t, x) = f(x)$ this becomes
$$df(W_t) = f'(W_t)\,dW_t + \frac{1}{2}f''(W_t)\,dt.$$

Examples 1.26

(i) If $Y_t = \exp(W_t)$, what is $dY_t$?
$$f(x) = e^x, \qquad f'(x) = f''(x) = e^x,$$
$$dY_t = df(W_t) = f'(W_t)\,dW_t + \frac{1}{2}f''(W_t)(dW_t)^2 = e^{W_t}\,dW_t + \frac{1}{2}e^{W_t}\,dt = \frac{1}{2}Y_t\,dt + Y_t\,dW_t.$$

(ii) Calculate $\int_0^t W_s\,dW_s$. First guess: $\frac{1}{2}W_t^2$...
$$f(x) = x^2, \quad f'(x) = 2x, \quad f''(x) = 2.$$
With $X_t = W_t^2$,
$$dX_t = 2W_t\,dW_t + \frac{1}{2}\cdot 2\,(dW_t)^2 = 2W_t\,dW_t + dt.$$
Thus,
$$W_t^2 = X_t - X_0 = \int_0^t dX_s = 2\int_0^t W_s\,dW_s + t,$$
so in fact,
$$\int_0^t W_s\,dW_s = \frac{1}{2}W_t^2 - \frac{1}{2}t.$$
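The identity $\int_0^t W_s\,dW_s = \frac{1}{2}W_t^2 - \frac{1}{2}t$ (in particular the extra $-\frac{1}{2}t$ that ordinary calculus would miss) can be seen directly in the discrete sums defining the integral. A small sketch, not part of the notes, with illustrative discretisation parameters:

```python
import math
import random

random.seed(1)

def int_W_dW(T=1.0, n=1000):
    """Simulate one Brownian path; return (left-endpoint sum of W dW, W_T)."""
    dt = T / n
    W, integral = 0.0, 0.0
    for _ in range(n):
        dW = random.gauss(0.0, math.sqrt(dt))
        integral += W * dW   # left endpoint, as in the Ito construction
        W += dW
    return integral, W

T = 1.0
max_err = 0.0
for _ in range(200):
    integral, WT = int_W_dW(T)
    max_err = max(max_err, abs(integral - (0.5 * WT**2 - 0.5 * T)))

print(f"max |sum - (W_T^2/2 - T/2)| over 200 paths: {max_err:.4f}")
```

The exact discrete identity is $\sum_i W_{s_i}\Delta W_i = \frac{1}{2}W_T^2 - \frac{1}{2}\sum_i(\Delta W_i)^2$, and $\sum_i(\Delta W_i)^2 \to T$ (the quadratic variation of $W$), which is where the correction term comes from.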

Multidimensional Itô's formula for general Itô processes

Proposition 1.27 Let $X = (X_t)_{t\ge 0}$ and $Y = (Y_t)_{t\ge 0}$ be two Itô processes, and let $f$ be a smooth function of two real variables. Then the stochastic process $Z_t := f(X_t, Y_t)$ is also an Itô process, satisfying
$$dZ_t = df(X_t, Y_t) = \frac{\partial f}{\partial x}(X_t, Y_t)\,dX_t + \frac{\partial f}{\partial y}(X_t, Y_t)\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(X_t, Y_t)\,d[X]_t + \frac{\partial^2 f}{\partial x\,\partial y}(X_t, Y_t)\,d[X, Y]_t + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(X_t, Y_t)\,d[Y]_t.$$
We can calculate the quadratic variation and covariation processes above as follows. Suppose that we can write $dX_t$ and $dY_t$ in the form
$$dX_t = \mu_t\,dt + \sigma_t\,dW_t, \qquad dY_t = \mu'_t\,dt + \sigma'_t\,dW_t.$$
Then we can use the multiplication table

           | dWt   dt
      -----+----------
       dWt |  dt    0
       dt  |   0    0

to obtain
$$d[X]_t = (dX_t)^2 = (\mu_t\,dt + \sigma_t\,dW_t)^2 = \sigma_t^2\,dt,$$
$$d[X, Y]_t = dX_t\,dY_t = (\mu_t\,dt + \sigma_t\,dW_t)(\mu'_t\,dt + \sigma'_t\,dW_t) = \sigma_t\sigma'_t\,dt,$$
$$d[Y]_t = (dY_t)^2 = (\mu'_t\,dt + \sigma'_t\,dW_t)^2 = (\sigma'_t)^2\,dt.$$
Itô's formula is just a Taylor expansion up to degree 2.
Example 1.28 Product rule: What is $d(X_tY_t)$?
Let $f(x, y) = xy$. Then
$$d(X_tY_t) = df(X_t, Y_t) = \frac{\partial f}{\partial x}(X_t, Y_t)\,dX_t + \frac{\partial f}{\partial y}(X_t, Y_t)\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(X_t, Y_t)(dX_t)^2 + \frac{\partial^2 f}{\partial x\,\partial y}(X_t, Y_t)\,dX_t\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(X_t, Y_t)(dY_t)^2$$
$$= Y_t\,dX_t + X_t\,dY_t + dX_t\,dY_t.$$


1.10 Stochastic differential equations

The movement of a drifting particle $X$ (could be an asset price...), which is affected by some noise, is frequently modelled by writing
$$X_{t+\Delta t} = X_t + \mu(t, X_t)\Delta t + \sigma(t, X_t)\Delta W_t = \text{old position} + \text{drift} + \text{noise},$$
where $\Delta W_t := W_{t+\Delta t} - W_t$, and $\Delta t$ is very small.
As $\Delta t \to 0$, we write the limit of the above equation as a stochastic differential equation (SDE)
$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \qquad X_0 = x_0,$$
meaning
$$X_t = x_0 + \int_0^t \mu(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dW_s.$$
The coefficient $\sigma$ is called the diffusion coefficient, while the coefficient $\mu$ is called the drift coefficient.
[If $\sigma = 0$ this is just an ordinary differential equation (ODE):
$$dX_t = \mu(t, X_t)\,dt \qquad\text{or}\qquad \frac{dX_t}{dt} = \mu(t, X_t).]$$
As with ODEs, the idea is to find a solution to the equation.

When the coefficients $\mu$ and $\sigma$ do not depend on $t$, $\mu = \mu(X_t)$, $\sigma = \sigma(X_t)$, the solution to an SDE is called a diffusion process.
Some examples of SDEs

Geometric Brownian motion
Common model for share prices.
Let $S_t$ satisfy the SDE
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t, \qquad \mu, \sigma \in \mathbb{R}.$$
This is a linear SDE because $\mu(t, s) := \mu s$ and $\sigma(t, s) := \sigma s$.

If $\sigma = 0$, the solution to the ODE $\frac{dS_t}{dt} = \mu S_t$ is $S_t = S_0e^{\mu t}$.

Define $Z_t := \ln S_t$. In preparation for Itô's formula, let's write $Z_t = f(S_t)$, where $f(s) := \ln s$, $f'(s) = 1/s$, $f''(s) = -1/s^2$. Then
$$dZ_t = f'(S_t)\,dS_t + \frac{1}{2}f''(S_t)\,d[S]_t = \frac{1}{S_t}(\mu S_t\,dt + \sigma S_t\,dW_t) - \frac{1}{2}\frac{1}{S_t^2}(\mu S_t\,dt + \sigma S_t\,dW_t)^2 = \mu\,dt + \sigma\,dW_t - \frac{1}{2}\sigma^2\,dt = \left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW_t.$$
To solve this SDE, just integrate:
$$Z_t - Z_0 = \int_0^t \left(\mu - \frac{1}{2}\sigma^2\right)ds + \int_0^t \sigma\,dW_s = \left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma W_t.$$
Therefore
$$\ln S_t = \ln S_0 + \left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma W_t.$$
Hence
$$S_t = S_0\,e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W_t}.$$
Compare with Section 1.5 (GBM as limit of Binomial Tree).

Exercise 1.29 Use Itô's formula on the above solution, to check that $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$.
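As a numerical sanity check (not part of the notes; the parameter values below are illustrative assumptions), an Euler-Maruyama discretisation of $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, driven by the same Brownian increments, should land close to the exact solution $S_0\exp\left((\mu - \frac{1}{2}\sigma^2)t + \sigma W_t\right)$:

```python
import math
import random

random.seed(7)

S0, mu, sigma, T, n = 100.0, 0.05, 0.2, 1.0, 10_000
dt = T / n

S = S0   # Euler-Maruyama approximation
W = 0.0  # running Brownian path, shared with the exact solution
for _ in range(n):
    dW = random.gauss(0.0, math.sqrt(dt))
    S += mu * S * dt + sigma * S * dW   # S_{t+dt} = S_t + mu S_t dt + sigma S_t dW
    W += dW

S_exact = S0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * W)
rel_err = abs(S - S_exact) / S_exact
print(f"Euler {S:.4f}, exact {S_exact:.4f}, relative error {rel_err:.2e}")
```

The discrepancy shrinks as the step size $dt$ decreases, which is a useful check when no closed-form solution is available.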
Ornstein-Uhlenbeck process
Let $X_t$ satisfy
$$dX_t = a(b - X_t)\,dt + \sigma\,dW_t$$
where $a > 0$, $b, \sigma \in \mathbb{R}$. See Tutorials.
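For intuition (the details are left to the tutorials; the numbers below are illustrative assumptions), the drift $a(b - X_t)$ pulls the process back towards the level $b$: paths started far from $b$ have sample mean close to $b$ once $t \gg 1/a$. A simulation sketch:

```python
import math
import random

random.seed(3)

a, b, sigma = 2.0, 1.0, 0.3
dt, n_steps = 0.01, 500   # simulate up to t = 5, so exp(-a t) is negligible

def ou_endpoint(x0):
    """Euler scheme for dX = a(b - X) dt + sigma dW; return X at t = n_steps * dt."""
    x = x0
    for _ in range(n_steps):
        x += a * (b - x) * dt + sigma * random.gauss(0.0, math.sqrt(dt))
    return x

finals = [ou_endpoint(5.0) for _ in range(2000)]   # start well above b
mean_final = sum(finals) / len(finals)
print(f"sample mean at t=5: {mean_final:.3f} (mean-reversion level b = {b})")
```

This matches the exact mean $E[X_t] = b + (X_0 - b)e^{-at}$, which tends to $b$ as $t \to \infty$.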

1.11 Martingale Representation Theorem

In Theorem 1.23 we saw that the stochastic integral $J_t := \int_0^t H_s\,dW_s$ is a martingale, provided $E_P\left[\int_0^t H_s^2\,ds\right] < \infty$ for all $t > 0$. The MRT is the reverse of this.

Theorem 1.30 Let $W_t$ be a BM, and let $M_t$ be a continuous $\mathcal{F}_t^W$-martingale such that $E_P[M_t^2] < \infty$. Then there exists an adapted process $H_t$ such that $M_t = M_0 + \int_0^t H_s\,dW_s$, and $[M]_t = \int_0^t H_s^2\,ds < \infty$.

The brief version of this is $dM_t = H_t\,dW_t$ and $d[M]_t = (dM_t)^2 = H_t^2(dW_t)^2 = H_t^2\,dt$.
Integrals of the form $\int_0^t X_s\,dM_s$ are just Itô integrals w.r.t. B.M.:
$$\int_0^t X_s\,dM_s = \int_0^t X_sH_s\,dW_s.$$
In particular, if $M_t$ is a (local) martingale and $X_t = \int_0^t Y_s\,dM_s$, then $X_t$ is a local martingale.

Example 1.31 Consider the martingale $M_t = \exp\left(\sigma W_t - \frac{1}{2}\sigma^2t\right)$ (see tutorial). How to represent this as a stochastic integral?
Apply Itô's formula to $f(t, x) = \exp\left(\sigma x - \frac{1}{2}\sigma^2t\right)$:
$$dM_t = df(t, W_t) = \sigma e^{\sigma W_t - \frac{1}{2}\sigma^2t}\,dW_t - \frac{1}{2}\sigma^2e^{\sigma W_t - \frac{1}{2}\sigma^2t}\,dt + \frac{1}{2}\sigma^2e^{\sigma W_t - \frac{1}{2}\sigma^2t}\,dt = \sigma M_t\,dW_t.$$
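The martingale property forces $E_P[M_t] = M_0 = 1$ for every $t$, and since $W_t \sim N(0, t)$ this is easy to confirm by direct simulation. A sketch with illustrative parameter values (not from the notes):

```python
import math
import random

random.seed(11)

sigma, t, N = 0.5, 2.0, 100_000

# M_t = exp(sigma W_t - sigma^2 t / 2), with W_t ~ N(0, t)
total = 0.0
for _ in range(N):
    Wt = random.gauss(0.0, math.sqrt(t))
    total += math.exp(sigma * Wt - 0.5 * sigma**2 * t)
mean_M = total / N
print(f"Monte Carlo E[M_t] = {mean_M:.4f} (theory: 1)")
```

Without the compensating $-\frac{1}{2}\sigma^2t$ term the mean would instead be $e^{\frac{1}{2}\sigma^2t} > 1$, so the exponent correction is exactly what makes $M_t$ a martingale.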

Motivation for the martingale representation theorem?!

Let $X$ be a random variable which is measurable with respect to $\mathcal{F}_T$, and such that $E_P[|X|^2] < \infty$. Then the process
$$X_t = E_P[X\,|\,\mathcal{F}_t], \qquad t \in [0, T],$$
is a martingale (see Example 1.17) with
$$X_0 = E_P[X] \qquad\text{and}\qquad X_T = X.$$
Using the martingale representation theorem, we see that there exists an adapted process $h_t$ such that
$$X_t = X_0 + \int_0^t h_s\,dW_s.$$
Now letting $t = T$,
$$X = E_P[X] + \int_0^T h_s\,dW_s.$$
Later on, $h$ represents a trading strategy, and $\int_0^t h_s\,dW_s$ is the gains from trading the asset $W$, so any financial payoff $X$ can be synthesized by a trading strategy (the market is complete). This is not quite right however, because our asset does not follow a Brownian motion!

1.12 Change of Measure

Let $X$ be a non-negative random variable on a probability space $(\Omega, \mathcal{F}, P)$ such that $E_P[X] = 1$.
Define now a map $Q : \mathcal{F} \to \mathbb{R}$ by $Q(A) := E_P[X1_A]$.

Proposition 1.32 $Q$ is a probability measure on $(\Omega, \mathcal{F})$. Furthermore, if $P(A) = 0$ for some $A \in \mathcal{F}$ then $Q(A) = 0$.

Proof. Suppose first that $P(A) = 0$. Then
$$P(X1_A = 0) = P(X = 0 \text{ or } 1_A = 0) \ge P(1_A = 0) = P(A^c) = 1.$$
Thus $X1_A = 0$ $P$-a.s., and hence $Q(A) = E_P[X1_A] = 0$.
Let's show that $Q$ is a prob. meas.
(i) Since $P(\emptyset) = 0$ we immediately have $Q(\emptyset) = 0$.
(ii) $Q(A^c) = E_P[X1_{A^c}] = E_P[X(1 - 1_A)] = E_P[X] - E_P[X1_A] = 1 - Q(A)$.
(iii) $Q(\Omega) = 1 - Q(\emptyset) = 1$.
(iv) Since $X \ge 0$ we have $Q(A) = E_P[X1_A] \ge 0$. Also $Q(A) = 1 - Q(A^c) \le 1 - 0 = 1$.
(v) Let $A_1, A_2, \ldots$ be a sequence of disjoint events. Then by Fubini's theorem (Th 1.6)
$$Q\left(\bigcup_{i=1}^\infty A_i\right) = E_P\left[X1_{\bigcup_{i=1}^\infty A_i}\right] = E_P\left[X\sum_{i=1}^\infty 1_{A_i}\right] = \sum_{i=1}^\infty E_P[X1_{A_i}] = \sum_{i=1}^\infty Q(A_i).$$

Definition 1.33 If $P(A) = 0 \implies Q(A) = 0$ for all $A \in \mathcal{F}$ then we say that $Q$ is absolutely continuous with respect to $P$, and we write $Q \ll P$.
If $Q \ll P$ and $P \ll Q$ then we say that $Q$ is equivalent to $P$, and we write $Q \sim P$.
Warning: This certainly does not mean that they are the same measure!

Lemma 1.34 Suppose that $Q \ll P$. Then for any $A \in \mathcal{F}$ we have

(i) $P(A) = 1 \implies Q(A) = 1$
(ii) $Q(A) > 0 \implies P(A) > 0$
(iii) $Q(A) < 1 \implies P(A) < 1$

NB: If $Q \sim P$ then the implications in (i)-(iii) become equivalences.
Proof. See Tutorial.
The converse of Proposition 1.32 is also true. If $Q$ is a probability measure such that $Q \ll P$ then there exists an a.s. unique, $\mathcal{F}$-measurable random variable $X$ such that $E_P[X] = 1$ and $Q(A) = E_P[X1_A]$.
We call $X$ the Radon-Nikodym derivative, and we use the notation $\frac{dQ}{dP}$ to make it clear which measures we are changing between. Thus
$$Q(A) = E_P\left[\frac{dQ}{dP}1_A\right].$$
Note that if $\frac{dQ}{dP} \equiv 1$ then $Q(A) = E_P[1\cdot 1_A] = P(A)$, so $Q = P$.

Proposition 1.35 Let $Q \ll P$. Given any $\mathcal{F}$-measurable, non-negative r.v. $Y$ we have
$$E_Q[Y] = E_P\left[\frac{dQ}{dP}Y\right].$$
Proof. Let $Y = \sum_{i=1}^n a_i1_{A_i}$ be a simple r.v. Then
$$E_Q[Y] := \sum_{i=1}^n a_iQ(A_i) = \sum_{i=1}^n a_iE_P\left[\frac{dQ}{dP}1_{A_i}\right] = E_P\left[\frac{dQ}{dP}\sum_{i=1}^n a_i1_{A_i}\right] = E_P\left[\frac{dQ}{dP}Y\right].$$
The M.C.T. does the rest: if $Y_n$ are simple r.v.s such that $Y_n \uparrow Y$ then
$$E_Q[Y] = \lim_{n\to\infty} E_Q[Y_n] = \lim_{n\to\infty} E_P\left[\frac{dQ}{dP}Y_n\right] = E_P\left[\frac{dQ}{dP}Y\right].$$
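On a finite sample space the proposition reduces to a reweighting of probabilities, which makes it easy to check by hand. A small sketch, not from the notes; the three-point space and all numbers are illustrative assumptions:

```python
# Finite sample space Omega = {0, 1, 2}, P uniform.
P = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}

# Radon-Nikodym derivative X = dQ/dP: non-negative with E_P[X] = 1.
X = {0: 0.6, 1: 0.9, 2: 1.5}
assert abs(sum(X[w] * P[w] for w in P) - 1.0) < 1e-12

# Q(A) := E_P[X 1_A]; on singletons this is just a reweighting of P.
Q = {w: X[w] * P[w] for w in P}

Y = {0: 2.0, 1: 3.0, 2: 5.0}  # an arbitrary non-negative random variable

E_Q_Y = sum(Y[w] * Q[w] for w in Q)          # expectation under Q
E_P_XY = sum(X[w] * Y[w] * P[w] for w in P)  # E_P[(dQ/dP) Y]
print(E_Q_Y, E_P_XY)
```

Both computations give the same number, and the $Q$-weights sum to 1, illustrating Propositions 1.32 and 1.35 in miniature.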

Lemma 1.36 (Change of measure) Let $Q \ll P$ be two probability measures. Define the $P$-martingale $Z_t := E_P\left[\frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\right]$. An adapted process $X_t$ is a $Q$-martingale if and only if $X_tZ_t$ is a $P$-martingale.

Proof. Given $t > 0$ and any event $A \in \mathcal{F}_t$, Proposition 1.35 implies that
$$E_Q[X_t1_A] = E_P\left[\frac{dQ}{dP}X_t1_A\right] = E_P\left[E_P\left[\frac{dQ}{dP}X_t1_A\,\Big|\,\mathcal{F}_t\right]\right] = E_P\left[E_P\left[\frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\right]X_t1_A\right] = E_P[Z_tX_t1_A].$$
Now
$$X_t \text{ is a } Q\text{-martingale} \iff E_Q[X_t\,|\,\mathcal{F}_s] = X_s \quad \forall t > s$$
$$\iff E_Q[X_t1_A] = E_Q[X_s1_A] \quad \forall A \in \mathcal{F}_s$$
$$\iff E_P[Z_tX_t1_A] = E_P[Z_sX_s1_A] \quad \forall A \in \mathcal{F}_s$$
$$\iff E_P[Z_tX_t\,|\,\mathcal{F}_s] = Z_sX_s \quad \forall t > s$$
$$\iff Z_tX_t \text{ is a } P\text{-martingale.}$$

Let $W_t$ be a $P$-B.M., let $\mathcal{F}_t$ be the filtration generated by $W_t$, and fix a time $T \in (0, \infty)$. Suppose that $\gamma_t$ is an $\mathcal{F}_t$-adapted process such that $\int_0^T \gamma_s^2\,ds < \infty$ a.s.
Define the process $\widetilde W_t := W_t + \int_0^t \gamma_s\,ds$ (equivalently, $d\widetilde W_t = dW_t + \gamma_t\,dt$). If $\gamma_t$ is non-zero then $\widetilde W_t$ has a non-zero drift.

Theorem 1.37 (Girsanov Theorem) If
$$Z_t := \exp\left(-\int_0^t \gamma_s\,dW_s - \frac{1}{2}\int_0^t \gamma_s^2\,ds\right)$$
is a $P$-martingale then there is a probability measure $Q \sim P$ defined on $\mathcal{F}_T$, such that $\widetilde W_t$ is a $Q$-Brownian motion, and $\frac{dQ}{dP} = Z_T$.
Proof. We only give the proof for $\gamma = $ constant, in which case $Z_t$ is automatically a martingale. Since $Z_0 = 1$, we have $E_P[Z_T] = 1$. Define $Q$ by $Q(A) := E_P[1_AZ_T]$ (for $A \in \mathcal{F}$), so $\frac{dQ}{dP} = Z_T$. From Lemma 1.36, for any $\theta \in \mathbb{R}$,
$$\exp\left(\theta\widetilde W_t - \frac{1}{2}\theta^2t\right) \text{ is a } Q\text{-martingale}$$
$$\iff Z_t\exp\left(\theta\widetilde W_t - \frac{1}{2}\theta^2t\right) \text{ is a } P\text{-martingale}$$
$$\iff \exp\left(-\gamma W_t - \frac{1}{2}\gamma^2t\right)\exp\left(\theta(W_t + \gamma t) - \frac{1}{2}\theta^2t\right) \text{ is a } P\text{-martingale}$$
$$\iff \exp\left((\theta - \gamma)W_t - \frac{1}{2}(\theta - \gamma)^2t\right) \text{ is a } P\text{-martingale.}$$
Since $W_t$ is a $P$-BM, the above statements are true, and hence (see Tutorials) $\widetilde W_t$ is a $Q$-BM.
Remark 1.38 A sufficient condition for $Z_t$ to be a $P$-martingale is Novikov's condition
$$E_P\left[\exp\left(\frac{1}{2}\int_0^T \gamma_s^2\,ds\right)\right] < \infty.$$
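For constant drift the theorem can be checked by Monte Carlo: under $P$ the drifted process has mean equal to the drift times $T$, but reweighting by the density $Z_T$ should recover the Brownian (mean-zero) behaviour, since $E_Q[\widetilde W_T] = E_P[Z_T\widetilde W_T]$. A sketch, not from the notes, with illustrative parameters:

```python
import math
import random

random.seed(5)

gamma, T, N = 0.8, 1.0, 100_000

sum_P, sum_Q = 0.0, 0.0
for _ in range(N):
    WT = random.gauss(0.0, math.sqrt(T))            # P-Brownian motion at time T
    W_tilde = WT + gamma * T                        # drifted process W~_T = W_T + gamma T
    Z = math.exp(-gamma * WT - 0.5 * gamma**2 * T)  # dQ/dP for constant gamma
    sum_P += W_tilde
    sum_Q += Z * W_tilde

mean_P = sum_P / N   # under P: E[W~_T] = gamma * T
mean_Q = sum_Q / N   # under Q: W~ is a BM, so mean 0
print(f"E_P = {mean_P:.3f} (theory {gamma * T}), E_Q = {mean_Q:.3f} (theory 0)")
```

The same reweighting idea is what lets us price derivatives under a risk-neutral measure while simulating under the real-world one.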
Theorem 1.39 (Girsanov converse) If $W_t$ is a $P$-Brownian motion, and $Q \sim P$, then there exists an adapted process $\gamma_t$ such that $\widetilde W_t := W_t + \int_0^t \gamma_s\,ds$ is a $Q$-Brownian motion. Moreover,
$$\frac{dQ}{dP} = \exp\left(-\int_0^T \gamma_s\,dW_s - \frac{1}{2}\int_0^T \gamma_s^2\,ds\right).$$



Proof. The process $Z_t := E_P\left[\frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\right]$ is a positive martingale with $Z_0 = 1$. Using the MRT, there exists an adapted process $H_t$ such that $dZ_t = H_t\,dW_t$. Define $\gamma_t = -H_t/Z_t$, so that $dZ_t = -\gamma_tZ_t\,dW_t$.
Define $Y_t = \ln(Z_t)$, so $Y_0 = 0$. By Itô's formula,
$$dY_t = \frac{1}{Z_t}\,dZ_t - \frac{1}{2}\frac{1}{Z_t^2}(dZ_t)^2 = -\gamma_t\,dW_t - \frac{1}{2}\gamma_t^2\,dt.$$
Integrating this gives $Y_t = -\int_0^t \gamma_s\,dW_s - \frac{1}{2}\int_0^t \gamma_s^2\,ds$. Hence
$$Z_t = e^{Y_t} = \exp\left(-\int_0^t \gamma_s\,dW_s - \frac{1}{2}\int_0^t \gamma_s^2\,ds\right).$$
Note that $Z_T = \frac{dQ}{dP}$, so $Q$ is the measure in Girsanov's Theorem, and $\widetilde W_t$ is a $Q$-B.M.
