Mean-Field Games and Ambiguity Aversion

Xuancheng Huang, Sebastian Jaimungal

Department of Statistical Sciences, University of Toronto, Toronto, Canada

Abstract

In this paper, we wish to extend the probabilistic framework for mean-field games to accom-
modate ambiguity aversion. We study a fairly general class of models under independence,
along with a smaller class of models when there is common noise.
This paper looks at a general framework for mean-field games with ambiguity averse
players based on the probabilistic framework described in Carmona and Delarue (2013). A
framework for mean-field games with ambiguity averse players is presented, using a version of
the stochastic maximum principle to find the optimal controls of the players. The dynamics
under the optimal control are characterized through a forwards-backwards stochastic differ-
ential equation and a relationship between the finite player game and the mean-field game is
established. Explicit solutions are derived in the case of the linear-quadratic mean-field game.
Keywords: Mean-field games, Nash equilibrium, Stochastic games, Model Uncertainty,
Ambiguity aversion

The authors would like to thank NSERC for partially funding this work.
Email addresses: xuancheng.huang@mail.utoronto.ca (Xuancheng Huang),
sebastian.jaimungal@utoronto.ca (Sebastian Jaimungal)

Electronic copy available at: https://ssrn.com/abstract=3033869


1. Introduction

The first questions of existence and uniqueness of mean-field games were addressed by Lasry
and Lions (2007) and Huang et al. (2006). One of the first papers to consider robustness in
mean-field games was Bauso et al. (2012), and more recently, extending to the linear-quadratic
framework, Moon and Başar (2017). However, in both of these papers, there is no mean-field
term in the dynamics, which allows them to use the result from Lasry and Lions (2007)
to establish existence and uniqueness. In this paper, we propose to use the framework of
Carmona and Delarue (2013) for a general existence/uniqueness result for robust convex–
concave mean-field games. Afterwards, we look at the linear-quadratic framework with a
mean-field term in the dynamics, and characterize its solution as well as the conditions for
existence, uniqueness and an ε-Nash equilibrium to hold in Section 5.
In Carmona and Delarue (2013), the players’ dynamics were given by

$$dX_t^i = b^i(t, X_t^i, \mu_t, \alpha_t^i)\,dt + \sigma(t, X_t^i)\,dW_t^i$$

and the respective cost functions were


$$J^i(\alpha^i) = \mathbb{E}\Big[\int_0^T f^i(t, X_t^i, \mu_t, \alpha_t^i)\,dt + g^i(X_T^i)\Big]$$

where the goal of player i is to minimize her cost through her control αi , and µt is the
mean-field measure flow which describes the distribution of player states at time t.
We now extend this setup to incorporate model uncertainty. In particular, we will look at
a mean-field game where all the players are ambiguity averse.

1.1. N-player game


To begin, for 1 ≤ i ≤ N, we let W_t^i be an m-dimensional uncorrelated standard Brownian motion defined on the filtered probability space (Ω, F^i, {F_t^i}_{t≥0}, P), where {F_t^i}_{t≥0} is the P-augmentation of the natural filtration generated by W_t^i. We search for optimal controls α_t^i, η_t^i within the admissible sets A, E ⊂ R^k, respectively. The private states of player i are denoted U_t^i ∈ R^d. The drift and volatility coefficients are, respectively, b^i : [0, T] × R^d × P(R^d) × A × E → R^d and σ^i : [0, T] → R^{d×m}. The empirical distribution of the players' states (U_t^1, …, U_t^N) is

$$\bar\nu_t = \frac{1}{N}\sum_{i=1}^N \delta_{U_t^i} \tag{1}$$



The stochastic differential equation for each player’s state is then

$$dU_t^i = b^i(t, U_t^i, \bar\nu_t, \alpha_t^i, \eta_t^i)\,dt + \sigma^i(t)\,dW_t^i \tag{2}$$

The strategies α_t^i, η_t^i are chosen such that

$$\mathbb{E}\Big[\int_0^T |\alpha_t^i|^2\,dt\Big] < \infty, \qquad \mathbb{E}\Big[\int_0^T |\eta_t^i|^2\,dt\Big] < \infty \tag{3}$$

and minimize/maximize the cost function

$$J^i(\alpha, \eta) = \mathbb{E}\Big[\int_0^T f^i(t, U_t^i, \bar\nu_t, \alpha_t^i, \eta_t^i)\,dt + g^i(U_T^i, \bar\nu_T)\Big] \tag{4}$$

where f^i : [0, T] × R^d × P(R^d) × A × E → R and g^i : R^d × P(R^d) → R. In particular, the players wish to find a control α̂_t^i such that

$$J^i(\hat\alpha, \hat\eta) = \inf_{\alpha_t^i \in A}\ \sup_{\eta_t^i \in E}\ \mathbb{E}\Big[\int_0^T f^i(t, U_t^i, \bar\nu_t, \alpha_t^i, \eta_t^i)\,dt + g^i(U_T^i, \bar\nu_T)\Big] \tag{5}$$
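To make the setup concrete, the following minimal Euler-Maruyama simulation generates the coupled system (2) interacting through the empirical distribution (1). The linear mean-reverting drift a(mean − U) and the controls frozen at zero are illustrative assumptions of ours, not the paper's model:

```python
import numpy as np

# Euler-Maruyama simulation of the N-player dynamics (2) with an
# empirical-mean interaction. The drift a*(mean - U) and the zero
# controls are illustrative assumptions, not the paper's model.
rng = np.random.default_rng(0)
N, steps, T = 500, 200, 1.0
dt = T / steps
a, sigma = 1.0, 0.3           # mean-reversion toward the crowd; volatility
U = rng.normal(0.0, 1.0, N)   # initial private states U_0^i

for _ in range(steps):
    nu_bar_mean = U.mean()                 # mean of the empirical measure (1)
    drift = a * (nu_bar_mean - U)          # b^i with alpha = eta = 0
    U = U + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=N)

print(U.mean(), U.std())
```

The cross-sectional dispersion shrinks over time, since each player's drift pulls her toward the empirical mean of the crowd.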

1.2. Mean-Field Game

We can introduce the mean-field limit of the game with ambiguity averse players, with the frozen measure flow µ_t replacing the empirical distribution ν̄_t, in which case the players' dynamics become

$$dX_t^i = b^i(t, X_t^i, \mu_t, \alpha_t^i, \eta_t^i)\,dt + \sigma(t, X_t^i)\,dW_t^i$$

with cost functions

$$J^i(\alpha^i, \eta^i) = \mathbb{E}\Big[\int_0^T f^i(t, X_t^i, \mu_t, \alpha_t^i, \eta_t^i)\,dt + g^i(X_T^i, \mu_T)\Big]$$

Note that by having a frozen measure flow µt , the minimax cost for a representative agent
can be viewed as a standard optimization problem.
The objective in this case is to minimize the worst-case misspecification of the model, so player i wishes to find a strategy α̂^i such that the following cost function is minimized:

$$J(\alpha^i, \hat\eta^i) = \sup_{\eta_t^i \in E} \mathbb{E}\Big[\int_0^T f^i(t, X_t^i, \mu_t, \alpha_t^i, \eta_t^i)\,dt + g^i(X_T^i, \mu_T)\Big]$$

A simple class of models which falls under this framework is the linear-quadratic model, in which case we assume that the drift is linear and the cost is quadratic and convex in (x, α), and quadratic and concave in η^i.
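One concrete member of this class (the constants a, q, λ, λ_η > 0 below are illustrative choices of ours, not fixed by the paper) is:

```latex
b^i(t, x, \mu, \alpha, \eta) = a\Big(\textstyle\int y\,\mu(dy) - x\Big) + \alpha + \eta,
\qquad
f^i(t, x, \mu, \alpha, \eta) = \frac{q}{2}\,|x|^2 + \frac{\lambda}{2}\,|\alpha|^2 - \frac{1}{2\lambda_\eta}\,|\eta|^2
```

so the drift is affine, the running cost is jointly convex in (x, α), and it is strictly concave in η, which is what keeps the inner supremum over η finite.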

1.3. Optimizing via the Hamiltonian

We introduce the Hamiltonian

$$H(t, x, \mu, y, \alpha, \eta) = \langle b(t, x, \mu, \alpha, \eta), y\rangle + f(t, x, \mu, \alpha, \eta)$$

The purpose is to find the minimax of the Hamiltonian, and to determine sufficient conditions under which the minimax controls for the Hamiltonian exist and yield the minimax controls for the mean-field game. To this end, we assume that there is a pair (α̂, η̂) such that

$$H(t, \hat X_t, \hat\alpha_t, \hat Y_t, \hat Z_t, \hat\eta_t) = \min_{\alpha\in A}\ \max_{\eta\in E}\ H(t, \hat X_t, \alpha, \hat Y_t, \hat Z_t, \eta) \tag{6}$$

From Theorem 2.1 we will see that, under certain conditions, this equilibrium gives a minimax value for
the game.
Here we follow the notation introduced by Carmona and Delarue (2013) and denote by P_p(E) the set of probability measures of order p, so that

$$M_{p,E}(\mu) = \Big(\int_E \|x\|_E^p\, d\mu(x)\Big)^{1/p} < \infty$$

where E is a separable Banach space and µ ∈ P_p(E).

Assumption 1.1. The function [0, T] ∋ t → b(t, x, µ, α, η) is affine in x, α and η, i.e.

$$b(t, x, \mu, \alpha, \eta) = b_0(t, \mu) + b_1(t)\,x + b_2(t)\,\alpha + b_3(t)\,\eta$$

Assumption 1.2. The cost function satisfies the following inequality:

$$f(t, x', \mu, \alpha', \eta) - f(t, x, \mu, \alpha, \eta) - \langle (x'-x,\ \alpha'-\alpha),\ \partial_{x,\alpha} f(t, x, \mu, \alpha, \eta)\rangle \ge \lambda\,|\alpha'-\alpha|^2 \tag{7}$$

This is the standard definition for the cost function to be jointly convex in (x, α).

Given that η is maximizing the cost function, we can see that setting η' = η recovers the usual definition of joint convexity.
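As a sanity check that (7) captures joint convexity, consider the hypothetical purely quadratic cost f(t, x, µ, α, η) = (q/2)|x|² + (λ₀/2)|α|², with q ≥ 0 and λ₀ ≥ 2λ:

```latex
f(t,x',\mu,\alpha',\eta) - f(t,x,\mu,\alpha,\eta)
- \langle (x'-x,\ \alpha'-\alpha),\ \partial_{x,\alpha} f(t,x,\mu,\alpha,\eta)\rangle
= \frac{q}{2}|x'-x|^2 + \frac{\lambda_0}{2}|\alpha'-\alpha|^2
\ \ge\ \lambda\,|\alpha'-\alpha|^2
```

so any sufficiently convex quadratic cost satisfies (7) with margin λ.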

Assumption 1.3. The function x → g(x, µ) is locally bounded, once continuously differen-
tiable, convex and has a Lipschitz continuous first order derivative.

Assumption 1.4. For the minimizer α̂(t, x, µ, y) of the Hamiltonian in feedback form, the Hamiltonian is jointly concave in the variables (y, η):

$$H(t,x,\mu,\hat\alpha,y,\eta) - H(t,x,\mu,\hat\alpha',y',\eta') - \langle (y-y',\ \eta-\eta'),\ \partial_{y,\eta} H(t,x,\mu,\hat\alpha,y,\eta)\rangle \ge \lambda_\eta\,|\eta'-\eta|^2 \tag{8}$$

where α̂ = α̂(t, x, µ, y) and α̂' = α̂(t, x, µ, y').

Although this assumption looks strange at first, it is consistent with the assumptions needed for the verification Theorem 2.1 to hold, and it allows the problem to have a well-defined control η̂ which maximizes the performance criterion. In the linear-quadratic case, the purpose of equation (8) is similar to the concept of disturbance attenuation in Bauso et al. (2016), as the optimal strategy for a player who is extremely ambiguity averse may not be admissible. Mathematically, a cost function which is not sufficiently concave may not have an attainable supremum. We will show in Section 5 that this is a sharp bound for the linear-quadratic game.

Lemma 1.1. There exists a unique minimizer (resp. maximizer) α̂ (resp. η̂) of H. Moreover, α̂, η̂ are Lipschitz-continuous with respect to (x, y), uniformly in (t, µ).

Proof. We have the inequality

$$H(t,x,\mu,0,y,\eta) \ge H(t,x,\mu,\hat\alpha,y,\eta) \ge H(t,x,\mu,0,y,\eta) + \langle \hat\alpha, \partial_\alpha H\rangle + \lambda|\hat\alpha|^2 \tag{9}$$

so that

$$|\hat\alpha(t,x,\mu,y)| \le \frac{1}{\lambda}\big(|\partial_\alpha f(t,x,\mu,0,\eta)| + |b_2(t)|\,|y|\big) \tag{10}$$

Similarly, we have that

$$|\hat\eta(t,x,\mu,y)| \le \frac{1}{\lambda_\eta}\big(|\partial_\eta f(t,x,\mu,\alpha,0)| + |b_3(t)|\,|y|\big) \tag{11}$$

These bounds help to control the growth rate of α̂, η̂ in later proofs. The following three assumptions are regularity assumptions which allow us to show that there exists a unique equilibrium for the mean-field game.
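To illustrate how the bound (10) controls the minimizer, consider an assumed quadratic-in-α cost f(α) = ½λα² + c₁α, so that ∂_α f(t, x, µ, 0, η) = c₁ and the minimizer of H is available in closed form; all constants below are hypothetical:

```python
import numpy as np

# Numerical check of the bound (10) for an assumed quadratic cost
# f(alpha) = 0.5*lam*alpha**2 + c1*alpha; the Hamiltonian
# H = b2*alpha*y + f(alpha) then has a closed-form minimizer.
lam, b2, c1 = 2.0, 0.7, -0.4   # illustrative constants

def alpha_hat(y):
    # solves d_alpha H = lam*alpha + c1 + b2*y = 0
    return -(c1 + b2 * y) / lam

for y in np.linspace(-5.0, 5.0, 101):
    bound = (abs(c1) + abs(b2) * abs(y)) / lam   # right-hand side of (10)
    assert abs(alpha_hat(y)) <= bound + 1e-12

print("bound (10) holds on the test grid")
```

The bound is linear in |y|, which is precisely what lets the feedback controls be absorbed into the Lipschitz structure of the FBSDE later on.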

5
Assumption 1.5. For all 0 ≤ t ≤ T, x, x' ∈ R^d, α, α' ∈ R^k, µ, µ' ∈ P_2(R^d), we have

$$|(f,g)(t,x',\mu',\alpha',\eta') - (f,g)(t,x,\mu,\alpha,\eta)| \le C\big[1 + |(x',\alpha',\eta')| + |(x,\alpha,\eta)| + M_2(\mu) + M_2(\mu')\big]\big[|(x',\alpha',\eta') - (x,\alpha,\eta)| + W_2(\mu',\mu)\big] \tag{12}$$

Furthermore, we have that b_0, b_1, b_2 are bounded by C and

$$|b_0(t,\mu') - b_0(t,\mu)| \le C\, W_2(\mu,\mu') \tag{13}$$

We note that a consequence of Assumption 1.5 is that there exists a λ' ∈ R_+ such that

$$f(t,x',\mu,\alpha',\eta') - f(t,x,\mu,\alpha,\eta) - \langle (x'-x,\ \alpha'-\alpha),\ \partial_{x,\alpha} f(t,x,\mu,\alpha,\eta)\rangle \ge \lambda\,|\alpha'-\alpha|^2 - \lambda'\,|\eta'-\eta|^2 \tag{14}$$

Assumption 1.6. For all (t, x, µ), |∂_α f(t, x, µ, 0, η)| ≤ C and |∂_η f(t, x, µ, α, 0)| ≤ C.

Assumption 1.7. For all (t, x), ⟨x, ∂_x f(t, 0, δ_x, 0, 0)⟩ ≥ −C (1 + |x|) and ⟨x, ∂_x g(0, δ_x)⟩ ≥ −C (1 + |x|).

2. Stochastic Nash-Pontryagin Principle

We introduce a modification of the stochastic Pontryagin principle to deal with the minimization-maximization problem, as the standard stochastic maximum principle only provides a result for either minimization or maximization. A deterministic version is proven in Geering (2007); however, the techniques used there break down in the stochastic control setting.
We provide a proof of the stochastic Nash-Pontryagin principle in our setting to establish the
existence of a Nash equilibrium.

Theorem 2.1 (Stochastic Nash-Pontryagin Principle). Assume that there exists a solution (Ŷ_t, Ẑ_t) to the BSDE

$$d\hat Y_t = -\partial_x H(t, \hat X_t, \hat\alpha_t, \hat Y_t, \hat Z_t, \hat\eta_t)\,dt + \hat Z_t\,dW_t, \qquad \hat Y_T = \partial_x g(\hat X_T)$$

such that

$$H(t, \hat X_t, \hat\alpha_t, \hat Y_t, \hat Z_t, \hat\eta_t) = \min_{\alpha\in A}\ \max_{\eta\in E}\ H(t, \hat X_t, \alpha, \hat Y_t, \hat Z_t, \eta)$$

that

$$H(t, x, \alpha, \hat y, \hat z, \hat\eta_t) = \max_{\eta\in E} H(t, x, \alpha, \hat y, \hat z, \eta)$$

is convex in (x, α), and that

$$H(t, x, \hat\alpha_t(x, y, z), y, z, \eta) = \min_{\alpha\in A} H(t, x, \alpha, y, z, \eta)$$

is concave in (y, η) and convex in x.

Given the mean-field-game measure flow µ_t, the optimal strategy α̂^i of the ith agent minimizes her cost function J(α^i; α^{−i}), so that for any strategy α^i ∈ A,

$$J(\alpha^i, \hat\eta^i) \ge J(\hat\alpha^i, \hat\eta^i), \quad \forall i$$

and η̂^i maximizes the cost function, so that for any alternative strategy η^i ∈ E and for the optimal α̂^i ∈ A we have

$$J(\hat\alpha^i, \eta^i) \le J(\hat\alpha^i, \hat\eta^i)$$

and hence

$$J(\hat\alpha, \hat\eta) = \min_{\alpha\in A}\max_{\eta\in E} J(\alpha, \eta) = \max_{\eta\in E}\min_{\alpha\in A} J(\alpha, \eta)$$

Proof. The verification for a mean-field game with ambiguity aversion is two-fold. First, we must verify that the ambiguity averse measure maximizes the expected value of the cost functional given the control α^i. Then we must verify that the ambiguity averse agent's control minimizes the cost functional under the ambiguity averse measure. The arguments used are standard in the theory of optimal control, but are provided below for completeness.
Holding α^i ∈ A fixed as an adapted strategy, we write

$$J(\alpha, \hat\eta) - J(\alpha, \eta) = \mathbb{E}\Big[\int_0^T f(t, \mu_t, \hat X_t, \alpha_t, \hat\eta_t) - f(t, \mu_t, X_t, \alpha_t, \eta_t)\,dt + g(\hat X_T) - g(X_T)\Big]$$

where it is understood that X̂_t is driven by the arbitrary control α_t together with the optimal control η̂_t. For the remainder of the proof, we will use the notation Ŷ_t = Y_t(X̂_t), Y_t = Y_t(X_t). We first consider the terminal cost g(X_T).
7
Under the assumption of convexity, we have that

$$g(\hat X_T) - g(X_T) \ge \langle \hat X_T - X_T, Y_T\rangle = \int_0^T \langle \hat X_t - X_t, dY_t\rangle + \langle d\hat X_t - dX_t, Y_t\rangle$$

Furthermore, we have

$$\begin{aligned}
\mathbb{E}\Big[\int_0^T f(t,\mu_t,\hat X_t,\hat\alpha_t,\hat\eta_t) - f(t,\mu_t,X_t,\hat\alpha_t,\eta_t)\,dt\Big]
&= \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t) - H(t,X_t,\hat\alpha,Y_t,\hat Z_t,\eta_t)\,dt - \int_0^T \big(\langle d\hat X_t, \hat Y_t\rangle - \langle dX_t, Y_t\rangle\big)\Big] \\
&= \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t) - H(t,X_t,\hat\alpha,Y_t,\hat Z_t,\eta_t)\,dt \\
&\qquad - \int_0^T \langle b(t,\hat X_t,\hat Y_t,\hat\eta_t) - b(t,X_t,Y_t,\eta_t),\ Y_t\rangle\,dt - \int_0^T \langle b(t,\hat X_t,\hat Y_t,\hat\eta_t),\ \hat Y_t - Y_t\rangle\,dt\Big]
\end{aligned} \tag{15}$$

where we take advantage of the fact that

$$H(t,X_t,\hat\alpha,Y_t,\hat Z_t,\eta_t) - H(t,X_t,\hat\alpha,Y_t,Z_t,\eta_t) = \sigma^\top(\hat Z_t - Z_t)$$
We now obtain independent bounds for three portions of this problem.


We have that

$$\mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha,Y_t,Z_t,\eta_t) - H(t,X_t,\hat\alpha,Y_t,Z_t,\eta_t)\,dt - \int_0^T \langle \hat X_t - X_t,\ \partial_x H(t,X_t,\hat\alpha,Y_t,Z_t,\eta_t)\rangle\,dt\Big] \ge 0$$

by convexity. In addition, we have

$$\mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t) - H(t,\hat X_t,\hat\alpha,Y_t,\hat Z_t,\eta_t)\,dt - \int_0^T \langle \partial_y H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t),\ \hat Y_t - Y_t\rangle\,dt\Big] \ge 0$$

since H is concave in (y, η), so that

$$H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t) - H(t,\hat X_t,\hat\alpha,Y_t,\hat Z_t,\eta_t) \ge \langle \partial_y H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t),\ \hat Y_t - Y_t\rangle + \langle \partial_\eta H(t,\hat X_t,\hat\alpha,\hat Y_t,\hat Z_t,\hat\eta_t),\ \hat\eta_t - \eta_t\rangle \tag{16}$$

and the latter term ⟨∂_η H(t, X̂_t, α̂, Ŷ_t, Ẑ_t, η̂_t), η̂_t − η_t⟩ ≥ 0. Therefore,

$$\begin{aligned}
J(\hat\alpha_t, \hat\eta_t) - J(\hat\alpha_t, \eta_t)
&\ge \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha_t,\hat Y_t,\hat Z_t,\hat\eta_t) - H(t,X_t,\hat\alpha_t,Y_t,\hat Z_t,\eta_t)\,dt \\
&\qquad - \int_0^T \big(\langle d\hat X_t,\hat Y_t\rangle - \langle dX_t,Y_t\rangle\big) + \int_0^T \langle \hat X_t - X_t, dY_t\rangle + \langle d\hat X_t - dX_t, Y_t\rangle\Big] \ge 0
\end{aligned} \tag{17}$$

and so η̂_t achieves the maximum with respect to any admissible control η_t.
A similar argument, again using the convexity of g and Itô's formula to compare the optimal strategy α̂_t with an arbitrary control α_t, yields J(α̂_t, η̂_t) ≤ J(α_t, η̂_t). The proof is provided for completeness. Under the assumption of convexity, we have that

$$g(\hat X_T) - g(X_T) \le \langle \hat X_T - X_T, \hat Y_T\rangle = \int_0^T \langle \hat X_t - X_t, d\hat Y_t\rangle + \langle d\hat X_t - dX_t, \hat Y_t\rangle$$

Furthermore, we have

$$\mathbb{E}\Big[\int_0^T f(t,\mu_t,\hat X_t,\hat\alpha_t,\hat\eta_t) - f(t,\mu_t,X_t,\alpha_t,\hat\eta_t)\,dt\Big] = \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha_t,\hat Y_t,\hat Z_t,\hat\eta) - H(t,X_t,\alpha_t,\hat Y_t,\hat Z_t,\hat\eta)\,dt - \int_0^T \langle d\hat X_t - dX_t,\ \hat Y_t\rangle\Big] \tag{18}$$

where the second statement is ≤ 0, so again, adding up the bounds for f and g yields

$$\begin{aligned}
J(\hat\alpha_t, \hat\eta_t) - J(\alpha_t, \hat\eta_t)
&\le \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha_t,\hat Y_t,\hat Z_t,\hat\eta) - H(t,X_t,\alpha_t,\hat Y_t,\hat Z_t,\hat\eta)\,dt + \int_0^T \langle \hat X_t - X_t, d\hat Y_t\rangle\Big] \\
&\le \mathbb{E}\Big[\int_0^T H(t,\hat X_t,\hat\alpha_t,\hat Y_t,\hat Z_t,\hat\eta) - H(t,X_t,\alpha_t,\hat Y_t,\hat Z_t,\hat\eta)\,dt - \int_0^T \langle \hat X_t - X_t,\ \partial_x H(t,\hat X_t,\hat\alpha_t,\hat Y_t,\hat Z_t,\hat\eta)\rangle\,dt\Big] \le 0
\end{aligned} \tag{19}$$

by convexity of H with respect to (x, α).

Lemma 2.1. Under the same assumptions as Lemma 1.1, we have that

$$J(\hat\alpha^\mu, \hat\eta^\mu; \mu) + \langle x_0' - x_0, Y_0\rangle + \lambda\,\mathbb{E}\Big[\int_0^T |\hat\alpha_t^{\mu'} - \hat\alpha_t^\mu|^2 dt\Big] - C\,\mathbb{E}\Big[\int_0^T |\hat\eta_t^{\mu'} - \hat\eta_t^\mu|^2 dt\Big] \le J([\hat\alpha^{\mu'}, \hat\eta^{\mu'}, \mu']; \mu) + \mathbb{E}\Big[\int_0^T \langle b_0(t,\mu') - b_0(t,\mu),\ Y_t\rangle dt\Big] \tag{20}$$

where J([α̂^{µ'}, η̂^{µ'}, µ']; µ) uses the optimal controls α̂, η̂ with respect to the measure µ', but evaluates the cost function with respect to the measure µ.
Furthermore, Assumptions 1.4 and 1.5 give us

$$J(\hat\alpha^\mu, \hat\eta^\mu; \mu) + \langle x_0' - x_0, Y_0'\rangle - \lambda_\eta\,\mathbb{E}\Big[\int_0^T |\hat\eta_t^{\mu'} - \hat\eta_t^\mu|^2 dt\Big] \ge J([\hat\alpha^{\mu'}, \hat\eta^{\mu'}, \mu']; \mu) + \mathbb{E}\Big[\int_0^T \langle b_0(t,\mu') - b_0(t,\mu),\ Y_t'\rangle dt\Big] \tag{21}$$

where Y'_t in the second expression is the adjoint process of the BSDE driven by (X'_t, η̂_t^{µ'}).
9
Proof. We substitute Assumptions 1.2 and 1.4 into the respective expressions in the proof of Theorem 2.1. In particular, we have that

$$\langle X_T' - X_T, Y_T\rangle = \langle X_0' - X_0, Y_0\rangle + \int_0^T d\langle X_t' - X_t, Y_t\rangle \tag{22}$$

and the contribution of the drift terms with respect to the measures µ, µ' comes from the equation

$$\langle b(t,x,\mu',\alpha,\eta),\ y\rangle + f(t,x,\mu,\alpha,\eta) = H(t,x,\mu,\alpha,y,\eta) + \langle b_0(t,\mu') - b_0(t,\mu),\ y\rangle \tag{23}$$

Corollary 2.1. There exist constants C, C' ∈ R_+ such that

$$J(\hat\alpha^\mu, \hat\eta^\mu; \mu) + C\langle x_0' - x_0,\ Y_0 - Y_0'\rangle + \lambda\,\mathbb{E}\Big[\int_0^T |\hat\alpha_t^{\mu'} - \hat\alpha_t^\mu|^2 dt\Big] \le J([\hat\alpha^{\mu'}, \hat\eta^{\mu'}, \mu']; \mu) + C'\,\mathbb{E}\Big[\int_0^T \langle b_0(t,\mu') - b_0(t,\mu),\ Y_t - Y_t'\rangle dt\Big] \tag{24}$$

3. Mean-Field FBSDE

The optimal controls α̂, η̂ can be inserted into the stochastic differential equation for the state X_t, as well as into the stochastic differential equation for the adjoint process Y_t. The dependency of the feedback forms of α̂, η̂ on (X_t, Y_t) creates a coupling between these SDEs and allows us to derive a McKean-Vlasov type forwards-backwards stochastic differential equation (FBSDE). The FBSDE can be written as

$$\begin{aligned}
dX_t &= b(t, X_t, \mu_t, \hat\alpha_t, \hat\eta_t)\,dt + \sigma\,dW_t \\
dY_t &= -\partial_x H(t, X_t, \mu_t, \hat\alpha_t, Y_t, Z_t, \hat\eta_t)\,dt + Z_t\,dW_t
\end{aligned} \tag{25}$$

with initial condition X_0 = x_0 and terminal condition Y_T = ∂_x g(X_T, µ_T).


The existence and uniqueness of an equilibrium measure µt for the mean-field game comes
as a result of the Lipschitz continuity of the parameters in the problem, as well as Lipschitz
regularity for the measure µt with respect to the 2-Wasserstein distance.

Theorem 3.1. Under the assumptions (1.1,1.2,1.4,1.5,1.6,1.7), (25) has a solution.

Lemma 3.1. Given a measure flow µ_t ∈ P_2(R^d), the FBSDE has a unique solution, and there exist a constant c and a function u(t, x; µ) such that

$$|u(t, x; \mu) - u(t, x'; \mu)| \le c\,|x - x'| \tag{26}$$

and, P-almost surely, Y_t = u(t, X_t; µ).

Proof. Standard results from Delarue (2002) give us a unique solution of the FBSDE in a local neighbourhood of t, due to the Lipschitz structure of the backwards SDE in (x, y). We note that η̂_t = η̂(t, X_t, µ_t, Y_t, α_t) in feedback form, so that we can view the FBSDE as the solution of a standard optimization problem. We now claim that

$$|Y_t^x - Y_t^{x'}|^2 \le c\,|x - x'|^2, \qquad \forall x, x' \in \mathbb{R}^d \tag{27}$$

so that Delarue (2002) implies that a unique solution exists on [0, T].


By the estimates in Lemma 2.1, we can deduce that there exists some constant c independent of t such that

$$\mathbb{E}\Big[\int_{t_0}^T |\hat\eta_t^{t_0,x_0} - \hat\eta_t^{t_0,x_0'}|^2\Big]dt + \mathbb{E}\Big[\int_{t_0}^T |\hat\alpha_t^{t_0,x_0} - \hat\alpha_t^{t_0,x_0'}|^2\Big]dt \le c\,\langle x_0' - x_0,\ Y_{t_0}^{t_0,x_0'} - Y_{t_0}^{t_0,x_0}\rangle \tag{28}$$

By standard SDE (see Karatzas and Shreve (2014)) and BSDE (see Karoui and Mazliak (1997), Yong and Zhou (1999)) estimates, we have, for some constant C,

$$\mathbb{E}\Big[\sup_{t_0\le t\le T} |X_t^{t_0,x_0} - X_t^{t_0,x_0'}|^2\Big] + \mathbb{E}\Big[\sup_{t_0\le t\le T} |Y_t^{t_0,x_0} - Y_t^{t_0,x_0'}|^2\Big] \le C\Big(\mathbb{E}\Big[\int_{t_0}^T |\hat\alpha_t^{t_0,x_0} - \hat\alpha_t^{t_0,x_0'}|^2\Big] + \mathbb{E}\Big[\int_{t_0}^T |\hat\eta_t^{t_0,x_0} - \hat\eta_t^{t_0,x_0'}|^2\Big]\Big) \tag{29}$$

and so, combining the estimates, we obtain

$$\mathbb{E}\Big[\sup_{t_0\le t\le T} |X_t^{t_0,x_0} - X_t^{t_0,x_0'}|^2\Big] + \mathbb{E}\Big[\sup_{t_0\le t\le T} |Y_t^{t_0,x_0} - Y_t^{t_0,x_0'}|^2\Big] \le c'\,\langle x_0' - x_0,\ Y_{t_0}^{t_0,x_0'} - Y_{t_0}^{t_0,x_0}\rangle \tag{30}$$

for some constant c', which proves (27). The representation u(t, x) = Y_t^{t,x} almost surely comes from Lipschitz continuity in x and Delarue (2002).

Definition 3.1. Φ : P(R^d) → P(R^d) is the mapping of the measure µ_t to Law(X_t) under the dynamics induced by (α̂, η̂).

Φ maps the frozen measure flow to the distribution of X_t. In order for a mean-field game equilibrium to exist, we need to show that there exists a measure flow µ_t such that the consistency equation Φ(µ_t) = µ_t is satisfied.
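A minimal numerical sketch of the consistency condition: for an assumed scalar model dX_t = a(m_t − X_t)dt + σdW_t with frozen mean flow m_t, the map Φ acts on the mean of the law through the ODE d/dt E[X_t] = a(m_t − E[X_t]), and Picard iteration of Φ converges to the constant equilibrium flow E[X_0]:

```python
import numpy as np

# Picard iteration on the consistency map Phi (Definition 3.1) for an
# assumed scalar model dX_t = a*(m_t - X_t)dt + sigma*dW_t with frozen
# mean flow m_t. Phi maps m to the mean of Law(X_t), which solves the
# ODE d/dt E[X_t] = a*(m_t - E[X_t]); the noise does not affect the mean.
a, T, steps, m0 = 1.5, 1.0, 200, 0.8
dt = T / steps
m = np.zeros(steps + 1)            # initial guess for the frozen flow

for _ in range(50):                # Picard iterates of Phi
    mean = np.empty(steps + 1)
    mean[0] = m0                   # E[X_0]
    for k in range(steps):
        mean[k + 1] = mean[k] + a * (m[k] - mean[k]) * dt
    m = mean                       # update the frozen flow

# at the fixed point the equilibrium mean flow is constant, equal to E[X_0]
print(np.max(np.abs(m - m0)))
```

Each iterate contracts the deviation from the fixed point, mirroring the continuity-plus-fixed-point argument (Schauder) used in the proof below.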

Lemma 3.2. The system (25) is solvable if we assume that |∂_x g|, |∂_x f| ≤ c for some constant c, as well as Assumptions 1.1, 1.2, 1.4, 1.5, 1.6 and 1.7.

Proof. First Step.
We first establish a bound on the derivatives of the Hamiltonian. In particular,

$$|\partial_x H(t, x, \mu_t, \hat\alpha_t, y, \hat\eta_t)| \le C_1 + C_2\,|y| \tag{31}$$

and so, by Gronwall's Lemma, we observe that there is a constant c depending on C_1, C_2 such that

$$|Y_t^{x_0}| \le c, \qquad \forall\, 0 \le t \le T \tag{32}$$

and therefore there is a (possibly different) constant c such that

$$|\hat\alpha(t, X_t, \mu_t, Y_t)|,\ |\hat\eta(t, X_t, \mu_t, Y_t)| \le c, \qquad \forall\, 0 \le t \le T \tag{33}$$

Finally, these estimates tell us that there is a constant C such that

$$\mathbb{E}\Big[\sup_{0\le t\le T} |X_t^{x_0,\mu}|^4\Big] < C \tag{34}$$

We therefore restrict ourselves to the set of measures with finite fourth moments

$$E = \{\mu \in P_4 \mid M_4(\mu) \le C\} \tag{35}$$

Second Step.
We can see that the image of Φ is relatively compact since

$$|X_t^{x_0;\mu} - X_s^{x_0;\mu}| \le c\Big[(t-s)\Big(1 + \sup_{0\le r\le T}|X_r^{x_0;\mu}|\Big) + |B_t - B_s|\Big] \tag{36}$$

Third Step.
We first note that

$$W_1(\Phi(\mu), \Phi(\mu')) \le \mathbb{E}\Big[\sup_{0\le t\le T} |X_t^{x_0;\mu} - X_t^{x_0;\mu'}|\Big] \tag{37}$$

and from the same proof as in Lemma 2.1, we obtain

$$J(\hat\alpha, \hat\eta; \mu) + \lambda\,\mathbb{E}\Big[\int_0^T |\hat\alpha_t' - \hat\alpha_t|^2 dt\Big] \le J([\hat\alpha', \hat\eta', \mu']; \mu) + \mathbb{E}\Big[\int_0^T \langle b_0(t,\mu_t') - b_0(t,\mu_t),\ Y_t\rangle dt\Big] \tag{38}$$

so that

$$\lambda\,\mathbb{E}\Big[\int_0^T |\hat\alpha_t' - \hat\alpha_t|^2 dt\Big] \le J(\hat\alpha, \hat\eta; \mu') - J(\hat\alpha, \hat\eta; \mu) + J([\hat\alpha', \hat\eta', \mu']; \mu) - J(\hat\alpha', \hat\eta'; \mu') + C\,\mathbb{E}\Big[\int_0^T \langle b_0(t,\mu_t') - b_0(t,\mu_t),\ Y_t - Y_t'\rangle dt\Big] \tag{39}$$

We define the controlled diffusion

$$dU_t = \big[b_0(t,\mu_t') + b_1(t)U_t + b_2(t)\hat\alpha_t + b_3(t)\hat\eta_t\big]dt + \sigma\,dW_t, \qquad U_0 = x_0 \tag{40}$$

from which Gronwall's Lemma implies that there is some constant C (which may differ line by line) such that

$$\mathbb{E}\Big[\sup_{0\le t\le T} |X_t^{x_0;\mu} - U_t|^2\Big] \le C \int_0^T W_2^2(\mu_t, \mu_t')\,dt \tag{41}$$

By Assumption 1.5,

$$J(\hat\alpha, \hat\eta; \mu') - J(\hat\alpha, \hat\eta; \mu) + J([\hat\alpha', \hat\eta'; \mu']; \mu) - J(\hat\alpha', \hat\eta'; \mu') \le C\Big(\int_0^T W_2^2(\mu_t, \mu_t')\,dt\Big)^{1/2} \tag{42}$$

which allows us to obtain

$$\mathbb{E}\Big[\int_0^T |\hat\alpha_t' - \hat\alpha_t|^2 dt\Big] \le C \int_0^T W_2^2(\mu_t, \mu_t')\,dt \tag{43}$$

We can use the approximation for η from Lemma 2.1 with Assumption 1.5 and derive a similar result for ∫_0^T |η̂' − η̂|² dt by using (43), which, when combining the above result with Assumption 1.2, (41) and (43), gives us

$$\mathbb{E}\Big[\int_0^T |\hat\alpha_t' - \hat\alpha_t|^2 dt\Big] + \mathbb{E}\Big[\int_0^T |\hat\eta_t' - \hat\eta_t|^2 dt\Big] + \mathbb{E}\Big[\sup_{0\le t\le T} |X_t^{x_0;\mu} - X_t^{x_0;\mu'}|^2\Big] \le C\Big(\int_0^T W_2^2(\mu_t, \mu_t')\,dt\Big)^{1/2} \tag{44}$$

by Gronwall's Lemma. Hölder's inequality for probability spaces, in conjunction with the fact that µ ∈ E has bounded 4th order moments, then gives

$$W_1(\Phi(\mu), \Phi(\mu')) \le c\Big(\int_0^T W_2^2(\mu_t, \mu_t')\,dt\Big)^{1/4} \le c\Big(\int_0^T W_1(\mu_t, \mu_t')^{1/2}\,dt\Big)^{1/4} \tag{45}$$

from which we can conclude that Φ is continuous on E with respect to the 1-Wasserstein distance W_1. Therefore, by Schauder's fixed point theorem, there exists an equilibrium measure µ_t for the mean-field game, and so (25) is solvable.

4. ε-Nash equilibrium

Assumption 4.1. The cost function f(t, x, µ, α, η) satisfies

$$\int_0^T |\beta_t|^2\,dt \to \infty \implies \int_0^T f(t, x, \mu, \beta_t, \hat\eta_t)\,dt \to \infty \tag{46}$$

where η̂ = η̂(t, X_t, β_t, Y_t) in feedback form.

A consequence of this assumption is that a player’s control must be almost surely bounded.
The following result is provided as reference during the proof.

Lemma 4.1 (Theorem 10.2.1, Rachev and Rüschendorf (1998)). Given µ ∈ P_{d+5}(R^d), we have that

$$\mathbb{E}[W_2^2(\bar\mu^N, \mu)] \le C\,N^{-2/(d+4)} \tag{47}$$

where C is a constant that depends on d and M_{d+5}(µ), and µ̄^N denotes the empirical measure of a sample of size N drawn from the probability measure µ.
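The decay in N can be observed directly by Monte Carlo. The sketch below uses d = 1 and µ = Uniform(0, 1), where W_2 admits the quantile (monotone) coupling, which is optimal in one dimension; the sample sizes and repetition count are arbitrary choices:

```python
import numpy as np

# Monte Carlo illustration of Lemma 4.1 in dimension d = 1 with
# mu = Uniform(0, 1). W_2^2 between the empirical measure of an n-sample
# and mu is approximated through the quantile coupling, matching the i-th
# order statistic with the midpoint quantile (i - 0.5)/n.
rng = np.random.default_rng(7)

def w2_sq(n):
    x = np.sort(rng.uniform(size=n))      # order statistics of the sample
    q = (np.arange(n) + 0.5) / n          # midpoint quantiles of Uniform(0,1)
    return np.mean((x - q) ** 2)          # approximate W_2^2(empirical, mu)

ests = [np.mean([w2_sq(n) for _ in range(500)]) for n in (10, 100, 1000)]
print(ests)   # decreasing in n, consistent with the C*N^(-2/(d+4)) bound
```

In one dimension the actual decay is faster than N^(-2/5); the lemma only provides an upper bound, uniform over dimensions through the exponent 2/(d+4).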

Before we proceed, there are a few definitions which will be used in order to (hopefully) avoid confusion when distinguishing the finite player game from the mean-field game.

Definition 4.1. We let J̄^{N,i}(α) denote the cost function of the ith player in an N-player game when the players use the strategy α. In contrast, J(α) will denote the cost function of the ith player in the mean-field game when the players use the strategy α.
Similarly, α̂_t^{N,i} is used when the ith player employs the mean-field optimal strategy in the N-player game.
µ_t^N is the empirical distribution in the N-player version of the game.
X̄_t^i is used to denote the ith player's state in the mean-field game, while X_t^i denotes the state in the N-player game.

Theorem 4.1. Under Assumptions 1.1, 1.2, 1.4, 1.5, 1.6, 1.7 and 4.1, the strategies α̂_t^{N,i} form an ε-Nash equilibrium of the N-player game. Specifically, we have

• ε_N ≤ c N^{−1/(d+4)}

• J̄^{N,i}(β^i; α̂^{−i}) ≥ J̄^{N,i}(α̂) − ε_N

for some constant c and sequence of positive real numbers {ε_N}_{N∈N}, where β^i ∈ A and lim_{N→∞} ε_N = 0.

Proof. By exchangeability, the problem is symmetric for any particular player, and so we can assume without loss of generality that i = 1. We first define the process U_t^1 to be the dynamics of the player under an arbitrary admissible control β_t:

$$dU_t^1 = b(t, U_t^1, \mu_t, \beta_t^1, \eta_t^1)\,dt + \sigma\,dW_t^1 \tag{48}$$

By Gronwall's inequality, we have that

$$\mathbb{E}\Big[\sup_{0\le t\le T} |U_t^1|^2\Big] \le c\Big(1 + \mathbb{E}\Big[\int_0^T |\beta_t^1|^2 + |\eta_t^1|^2\,dt\Big]\Big) \tag{49}$$

and, observing that E[sup_{0≤t≤T} |U_t^j|²] < c for the non-deviating players j ≥ 2, we get that

$$\frac{1}{N}\sum_{j=1}^N \mathbb{E}\Big[\sup_{0\le t\le T} |U_t^j|^2\Big] \le c\Big(1 + \frac{1}{N}\mathbb{E}\Big[\int_0^T |\beta_t^1|^2 + |\eta_t^1|^2\,dt\Big]\Big) \tag{50}$$

Then, Sznitman's coupling argument allows us to derive the uniform bound

$$\max_{1\le i\le N}\ \mathbb{E}\Big[\sup_{0\le t\le T} |X_t^i - \bar X_t^i|^2\Big] \le c\,N^{-2/(d+4)} \tag{51}$$

so that the inequality

$$W_2^2(\bar\mu_t^N, \mu_t) \le \frac{2}{N}\sum_{i=1}^N |X_t^i - \bar X_t^i|^2 + 2\,W_2^2\Big(\frac{1}{N}\sum_{i=1}^N \delta_{\bar X_t^i},\ \mu_t\Big) \tag{52}$$

gives us the estimate

$$\sup_{0\le t\le T} \mathbb{E}[W_2^2(\bar\mu_t^N, \mu_t)] \le c\,N^{-2/(d+4)} \tag{53}$$

Lipschitz properties of f, g give us the bound

$$\begin{aligned}
|J - J^N| &= \Big|\mathbb{E}\Big[g(\bar X_T, \mu_T) + \int_0^T f(t, \bar X_t^i, \mu_t, \hat\alpha_t^i, \hat\eta_t^i)\,dt - g(X_T, \mu_T^N) - \int_0^T f(t, X_t^i, \mu_t^N, \hat\alpha_t^{N,i}, \hat\eta_t^{N,i})\,dt\Big]\Big| \\
&\le c\,\mathbb{E}\Big[1 + |\bar X_T^i|^2 + |X_T^i|^2 + \frac{1}{N}\sum_{j=1}^N |X_T^j|^2\Big]^{1/2} \mathbb{E}\big[|\bar X_T^i - X_T^i|^2 + W_2^2(\mu_T, \mu_T^N)\big]^{1/2} \\
&\quad + c\int_0^T \mathbb{E}\Big[1 + |\bar X_t^i|^2 + |X_t^i|^2 + |\hat\alpha_t^i|^2 + |\hat\alpha_t^{N,i}|^2 + |\hat\eta_t^i|^2 + |\hat\eta_t^{N,i}|^2 + \frac{1}{N}\sum_{j=1}^N |X_t^j|^2\Big]^{1/2} \\
&\qquad\qquad \times \mathbb{E}\big[|\bar X_t^i - X_t^i|^2 + |\hat\alpha_t^i - \hat\alpha_t^{N,i}|^2 + |\hat\eta_t^i - \hat\eta_t^{N,i}|^2 + W_2^2(\mu_t, \mu_t^N)\big]^{1/2}\,dt
\end{aligned} \tag{54}$$

so that we can use our previous bounds to obtain

$$|J - J^N| \le c\,\mathbb{E}\big[|\bar X_T^i - X_T^i|^2 + W_2^2(\mu_T, \mu_T^N)\big]^{1/2} + c\int_0^T \mathbb{E}\big[|\bar X_t^i - X_t^i|^2 + |\hat\alpha_t^i - \hat\alpha_t^{N,i}|^2 + |\hat\eta_t^i - \hat\eta_t^{N,i}|^2 + W_2^2(\mu_t, \mu_t^N)\big]^{1/2}\,dt \tag{55}$$

and hence, since the controls α̂, η̂ are Lipschitz in X,

$$|\hat\alpha_t^i - \hat\alpha_t^{N,i}|,\ |\hat\eta_t^i - \hat\eta_t^{N,i}| \le C\,|\bar X_t^i - X_t^i| \tag{56}$$

we have that

$$\bar J^N = J + O(N^{-1/(d+4)}) \tag{57}$$

Furthermore, we get

$$\begin{aligned}
\mathbb{E}\Big[\sup_{0\le s\le t} |U_s^1 - X_s^1|^2\Big] &\le \frac{c}{N}\int_0^T \sum_{j=1}^N \mathbb{E}\Big[\sup_{0\le r\le s} |U_r^j - X_r^j|^2\Big]\,ds \\
&\quad + c\,\mathbb{E}\Big[\int_0^T |\beta_t^1 - \hat\alpha_t^{N,1}|^2\,dt\Big] + c\,\mathbb{E}\Big[\int_0^T |\eta_t^1 - \eta_t^{N,1}|^2\,dt\Big]
\end{aligned} \tag{58}$$

and, for the players who do not deviate,

$$\mathbb{E}\Big[\sup_{0\le s\le t} |U_s^i - X_s^i|^2\Big] \le \frac{c}{N}\int_0^T \sum_{j=1}^N \mathbb{E}\Big[\sup_{0\le r\le s} |U_r^j - X_r^j|^2\Big]\,ds \tag{59}$$

so that Gronwall's inequality gives

$$\frac{1}{N}\sum_{j=1}^N \mathbb{E}\Big[\sup_{0\le t\le T} |U_t^j - X_t^j|^2\Big] \le \frac{c}{N}\Big(\mathbb{E}\Big[\int_0^T |\beta_t^1 - \hat\alpha_t^{N,1}|^2\,dt\Big] + \mathbb{E}\Big[\int_0^T |\eta_t^1 - \eta_t^{N,1}|^2\,dt\Big]\Big) \tag{60}$$

and so therefore

$$\sup_{0\le t\le T} \mathbb{E}\big[|U_t^i - X_t^i|^2\big] \le \frac{c}{N}\Big(\mathbb{E}\Big[\int_0^T |\beta_t^1 - \hat\alpha_t^{N,1}|^2\,dt\Big] + \mathbb{E}\Big[\int_0^T |\eta_t^1 - \eta_t^{N,1}|^2\,dt\Big]\Big) \tag{61}$$

If we fix a constant A > 0, then we can see that there exists a constant C_A such that

$$\mathbb{E}\Big[\int_0^T |\beta_t^1|^2\,dt\Big] \le A \implies \max_{2\le i\le N}\ \sup_{0\le t\le T} \mathbb{E}\big[|U_t^i - \bar X_t^i|^2\big] \le C_A\,N^{-2/(d+4)} \tag{62}$$

from which we can then obtain the bound

$$\frac{1}{N-1}\sum_{j=2}^N \mathbb{E}\big[|U_t^j - \bar X_t^j|^2\big] \le C_A\,N^{-2/(d+4)} \tag{63}$$

Applying a triangle inequality for the Wasserstein distance, we get

$$\begin{aligned}
\mathbb{E}[W_2^2(\bar\nu_t^N, \mu_t)] &\le c\,\mathbb{E}\Big[W_2^2\Big(\frac{1}{N}\sum_{j=1}^N \delta_{U_t^j},\ \frac{1}{N-1}\sum_{j=2}^N \delta_{U_t^j}\Big)\Big] \\
&\quad + \frac{1}{N-1}\sum_{j=2}^N \mathbb{E}\big[|U_t^j - \bar X_t^j|^2\big] + \mathbb{E}\Big[W_2^2\Big(\frac{1}{N-1}\sum_{j=2}^N \delta_{\bar X_t^j},\ \mu_t\Big)\Big]
\end{aligned} \tag{64}$$

Combining with the inequality

$$\mathbb{E}\Big[W_2^2\Big(\frac{1}{N}\sum_{j=1}^N \delta_{U_t^j},\ \frac{1}{N-1}\sum_{j=2}^N \delta_{U_t^j}\Big)\Big] \le \frac{1}{N(N-1)}\sum_{j=2}^N \mathbb{E}\big[|U_t^1 - U_t^j|^2\big] \tag{65}$$

as well as previous estimates and Lemma 4.1, we get

$$\mathbb{E}[W_2^2(\bar\nu_t^N, \mu_t)] \le C\,N^{-2/(d+4)} \tag{66}$$

Finally, if we define

$$d\bar U_t^1 = b(t, \bar U_t^1, \mu_t, \beta_t^1, \hat\eta_t^1)\,dt + \sigma\,dW_t^1 \tag{67}$$

we get

$$U_t^1 - \bar U_t^1 = \int_0^t \big[b_0(s, \bar\nu_s^N) - b_0(s, \mu_s)\big]ds + \int_0^t b_1(s)\big[U_s^1 - \bar U_s^1\big]ds + \int_0^t b_3(s)\big[\hat\eta_s^{U} - \hat\eta_s^{\bar U}\big]ds \tag{68}$$

From (56) and the fact that η̂ is Lipschitz with respect to x, we can see that an application of Gronwall's inequality gives us

$$\sup_{0\le t\le T} \mathbb{E}\big[|U_t^1 - \bar U_t^1|^2\big] \le C_A\,N^{-2/(d+4)} \tag{69}$$

(where C_A may be a different constant than previously defined) and so, applying the same techniques as in the derivation of (57), we get that

$$\bar J^{N,1}(\beta^1; \hat\alpha^{-1}) \ge J - C_A\,N^{-1/(d+4)} \tag{70}$$

As Assumption 4.1 always allows us to find such a constant A > 0, combining (57) with (70) we obtain the bound

$$\bar J^{N,i}(\beta^i; \hat\alpha^{-i}) \ge \bar J^{N,i}(\hat\alpha) - \epsilon_N \tag{71}$$

for any β^i ∈ A.

4.1. Comments on assumptions
The concavity assumptions in the paper are in line with the multiplier preference setting, as the relative entropy of the measure change in a setting with Brownian motions is quadratic in nature. In particular, (8) and the conditions of Theorem 2.1 are both satisfied when the cost function is quadratic in α and η, subject to certain constraints on the constants in front of the quadratic terms.
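To make the quadratic nature explicit: for a Girsanov change of measure that shifts the Brownian drift by σ_t^{-1}η_t, the relative entropy is (a standard fact, recorded here for reference)

```latex
\mathcal{R}(\mathbb{Q}\,\|\,\mathbb{P})
= \mathbb{E}^{\mathbb{Q}}\Big[\log \tfrac{d\mathbb{Q}}{d\mathbb{P}}\Big]
= \tfrac{1}{2}\,\mathbb{E}^{\mathbb{Q}}\Big[\int_0^T \big|\sigma_t^{-1}\eta_t\big|^2\,dt\Big]
```

so an entropy penalty enters the cost quadratically, and with a negative sign, in η, matching the concavity structure required by (8).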
Furthermore, although having bounded derivatives ∂_x f, ∂_x g is fairly restrictive in general, the class of functions which satisfy these assumptions can be used as approximate solutions to other problems. Unfortunately, the convex-concave nature of the problem makes a formal approximation theorem difficult to prove; however, we believe that an approximation theorem like the one in Carmona and Delarue (2013) should hold, and we leave the proof for a future paper.

5. Linear-Quadratic Game

The setup for an ambiguity averse agent without a mean-field component in the dynamics has been studied in Bauso et al. (2016). We will look at a vector-valued version of the problem with a mean-field term in the drift, and characterize its solution.

5.1. Problem Formulation


Let W_t be an n-dimensional Brownian motion, and x_t ∈ R^n, v_t ∈ R^k. We wish to find an optimal control v ∈ L²([0, T], R^k) to minimize

$$\begin{aligned}
J(v) =\ &\mathbb{E}\Big[\frac{1}{2}\int_0^T x_t^\intercal Q_t x_t + v_t^\intercal R_t v_t + (x_t - S_t\mathbb{E}[x_t])^\intercal \bar Q_t (x_t - S_t\mathbb{E}[x_t]) - \varphi \log\frac{d\mathbb{Q}}{d\mathbb{P}}\,dt\Big] \\
&+ \mathbb{E}\Big[\frac{1}{2}\, x_T^\intercal Q_T x_T + \frac{1}{2}(x_T - S_T\mathbb{E}[x_T])^\intercal \bar Q_T (x_T - S_T\mathbb{E}[x_T])\Big]
\end{aligned} \tag{72}$$

subject to the dynamics

$$dx_t = (A_t x_t + B_t v_t + \bar A_t \mathbb{E}[x_t] + \eta_t)\,dt + \sigma_t\,dW_t, \qquad x(0) = x_0 \tag{73}$$

where η_t is the change in drift induced by the measure change dQ/dP. Due to Girsanov's theorem, we have an explicit representation for the measure change of the form

$$\log\frac{d\mathbb{Q}}{d\mathbb{P}} = \int_0^T \eta_t^\intercal \sigma_t^{-1}\,dW_t + \frac{1}{2}\int_0^T \eta_t^\intercal \sigma_t^{-1} \eta_t\,dt$$

so that the performance criterion can be rewritten as

$$\begin{aligned}
J(v) =\ &\mathbb{E}\Big[\frac{1}{2}\int_0^T x_t^\intercal Q_t x_t + v_t^\intercal R_t v_t + (x_t - S_t\mathbb{E}[x_t])^\intercal \bar Q_t (x_t - S_t\mathbb{E}[x_t]) - \eta_t^\intercal \Phi_t^{-1} \eta_t\,dt\Big] \\
&+ \mathbb{E}\Big[\frac{1}{2}\, x_T^\intercal Q_T x_T + \frac{1}{2}(x_T - S_T\mathbb{E}[x_T])^\intercal \bar Q_T (x_T - S_T\mathbb{E}[x_T])\Big]
\end{aligned} \tag{74}$$

Theorem 5.1. The above problem is solvable with optimal control v = −R^{-1}B^⊺p and measure change η_t = Φ_t p, where (y, p) satisfy the forwards-backwards stochastic differential equation

$$\begin{aligned}
dy_t &= \big[A_t y_t - (B_t R_t^{-1} B_t^\intercal - \Phi_t)\, p_t + \bar A_t \mathbb{E}[y_t]\big]dt + \sigma_t\,dW_t \\
dp_t &= -\big[(Q_t + \bar Q_t)y_t + A_t^\intercal p_t - \bar Q_t S_t \mathbb{E}[y_t] - S_t^\intercal \bar Q_t (I - S_t)\mathbb{E}[y_t] + \bar A_t^\intercal \mathbb{E}[p_t]\big]dt + q_t\,dW_t \\
y_0 &= x_0, \qquad p_T = (Q_T + \bar Q_T)y_T - \bar Q_T S_T \mathbb{E}[y_T] - S_T^\intercal \bar Q_T (I - S_T)\mathbb{E}[y_T]
\end{aligned} \tag{75}$$

Proof. First, we define

$$H(t, x, v, p, \eta) = (A_t x_t + B_t v_t + \bar A_t \mathbb{E}[x_t] + \eta_t)^\intercal p_t + \frac{1}{2}\Big(x_t^\intercal Q_t x_t + v_t^\intercal R_t v_t + (x_t - S_t\mathbb{E}[x_t])^\intercal \bar Q_t (x_t - S_t\mathbb{E}[x_t]) - \eta_t^\intercal \Phi_t^{-1} \eta_t\Big) \tag{76}$$

Then, by solving the equations

$$\partial_v H(t, x, v, p, \eta) = 0, \qquad \partial_\eta H(t, x, v, p, \eta) = 0 \tag{77}$$

we obtain the optimal controls

$$v = -R^{-1} B^\intercal p, \qquad \eta_t = \Phi_t\, p \tag{78}$$

Plugging these back in, and noting that the solution should satisfy the FBSDE

$$\begin{aligned}
dX_t &= \partial_p H(t, x, v, p, \eta)\,dt + \sigma_t\,dW_t \\
dY_t &= -\partial_x H(t, x, v, p, \eta)\,dt + q_t\,dW_t \\
X_0 &= x_0, \qquad Y_T = (Q_T + \bar Q_T)X_T - \bar Q_T S_T \mathbb{E}[X_T] - S_T^\intercal \bar Q_T (I - S_T)\mathbb{E}[X_T]
\end{aligned} \tag{79}$$

we obtain the equations (75).
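As a sketch of how (75) can be solved in a special case, take a scalar state with constant coefficients and no mean-field terms (Ā = Q̄ = S = 0; all numerical values below are illustrative choices of ours). The ansatz p_t = P_t y_t reduces the FBSDE to the Riccati ODE P' = −Q − 2AP + (B²/R − Φ)P² with P(T) = Q_T, which we integrate backward with an Euler scheme:

```python
# Backward Euler integration of the Riccati ODE obtained from the FBSDE (75)
# in a simplified scalar case (constant coefficients, no mean-field terms).
# Ambiguity aversion enters through Phi: it weakens the effective control
# penalty, so the value coefficient P_0 increases with Phi.
A, B, R, Q, QT = 0.5, 1.0, 1.0, 1.0, 1.0   # illustrative constants
T, steps = 1.0, 10_000
dt = T / steps

def riccati_P0(Phi):
    P = QT                                  # terminal condition P(T) = Q_T
    for _ in range(steps):                  # step backward: P(t-dt) = P(t) - dt*P'(t)
        dP = -Q - 2.0 * A * P + (B * B / R - Phi) * P * P
        P -= dt * dP
    return P

print(riccati_P0(0.0), riccati_P0(0.5))     # P_0 without and with ambiguity
```

In this reduced setting the admissibility discussion of Section 5.3 appears as the condition B²/R − Φ > 0: as Φ approaches B²/R, the stabilizing quadratic term vanishes and the backward solution can blow up over long horizons.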

Although the ambiguity averse setting is in general distinct from the standard linear-quadratic setup, if B_t is invertible for all t ∈ [0, T], then there is an equivalent linear-quadratic game without ambiguity aversion with the same control, namely obtained by re-parameterizing R to R − Φ^{−1}B.

5.2. ε-Nash equilibrium

Since the dynamics under the optimal control of the linear-quadratic mean-field game with ambiguity aversion can be associated with a version of the linear-quadratic mean-field game without ambiguity aversion, the ε-Nash equilibrium holds by the same arguments as in Bensoussan et al. (2016). We note that the existence, uniqueness and validity of the ε-Nash equilibrium are contingent on Assumption 1.2, so that the condition provides a sufficient bound for the disturbance attenuation parameter Φ.

5.3. Comments

We can see that in the linear-quadratic case, ambiguity aversion is equivalent to reducing the cost of the control in the cost function. Furthermore, Assumption 1.2 and the concavity assumption on the Hamiltonian are equivalent in this scenario. We note that when Assumption 1.2 is close to being violated, the optimal control approaches ∞, so that it is unattainable. Therefore, for linear-quadratic games, the condition is both necessary and sufficient for the mean-field game to exist.
Bauso et al. (2016) provide a framework for robust mean-field games in which the solution requires iteratively solving two PDEs. The setup with the Pontryagin maximum principle allows us to use numerical FBSDE schemes, which are an active area of research. Some examples of FBSDE numerical schemes are presented in Douglas et al. (1996), Bender et al. (2008) and Zhao et al. (2014).

References

Bauso, D., H. Tembine, and T. Başar (2012). Robust mean field games with application to
production of an exhaustible resource. IFAC Proceedings Volumes 45(13), 454–459.

Bauso, D., H. Tembine, and T. Başar (2016). Robust mean field games. Dynamic Games and Applications 6(3), 277–303.

Bender, C. and J. Zhang (2008). Time discretization and Markovian iteration for coupled FBSDEs. The Annals of Applied Probability 18(1), 143–177.

Bensoussan, A., K. Sung, S. C. P. Yam, and S.-P. Yung (2016). Linear-quadratic mean field
games. Journal of Optimization Theory and Applications 169(2), 496–529.

Carmona, R. and F. Delarue (2013). Probabilistic analysis of mean-field games. SIAM Journal
on Control and Optimization 51(4), 2705–2734.

Delarue, F. (2002). On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case. Stochastic Processes and their Applications 99(2), 209–286.

Douglas, J., J. Ma, and P. Protter (1996). Numerical methods for forward-backward stochastic differential equations. The Annals of Applied Probability 6(3), 940–968.

Geering, H. P. (2007). Optimal Control with Engineering Applications. Springer Berlin Heidelberg.

Huang, M., R. P. Malhamé, and P. E. Caines (2006). Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Communications in Information & Systems 6(3), 221–252.

Karatzas, I. and S. Shreve (2014). Brownian Motion and Stochastic Calculus. Graduate Texts
in Mathematics. Springer New York.

El Karoui, N. and L. Mazliak (1997). Backward Stochastic Differential Equations. Chapman & Hall/CRC Research Notes in Mathematics Series. Taylor & Francis.

Lasry, J.-M. and P.-L. Lions (2007). Mean field games. Japanese Journal of Mathematics 2(1), 229–260.

Moon, J. and T. Başar (2017). Linear quadratic risk-sensitive and robust mean field games.
IEEE Transactions on Automatic Control 62(3), 1062–1077.

Rachev, S. T. and L. Rüschendorf (1998). Mass Transportation Problems: Volume I: Theory. Springer Science & Business Media.

Yong, J. and X. Y. Zhou (1999). Stochastic controls: Hamiltonian systems and HJB equations.
Springer-Verlag: New York, NY.

Zhao, W., Y. Fu, and T. Zhou (2014). New kinds of high-order multistep schemes for
coupled forward backward stochastic differential equations. SIAM Journal on Scientific
Computing 36(4), A1731–A1751.
