Microeconomic Theory For The Social Sciences
Takashi Hayashi
August 30, 2015
Preface
This book covers microeconomic theory at the level of intermediate to advanced
undergraduates, but I also intend it to be an introduction for those with other
intellectual backgrounds, who do not necessarily agree with what so-called mainstream economists say, but at least feel it is OK to know how they think and see
things.
On terminologies
Microeconomics is related to society, and because of this the words used
there naturally overlap with everyday wording. This is actually dangerous.
For example, what do you imagine from words such as "utility," "perfect
competition," and "efficiency"? If you imagine some kind of substance from
the word "utility," that's wrong. If you imagine a situation like everybody killing
each other from the words "perfect competition," that's wrong. If you imagine a
one-dimensional criterion which ranks all social alternatives from the
word "efficiency," that's wrong.
In economics these words are given precise boundaries in the form of definitions, and I relegate them to the corresponding chapters. What I would like to say
to you here is that you should wipe away what you imagine from the everyday
usages of those words.
Unfortunately, terminologies like the above are the accepted ones, and it may
be embarrassing if I make up new ones, so I decided to accept most of them
while providing adequate discussion when I introduce them. Nevertheless, I decided to use unconventional terminology in some cases where I'm afraid that using
the accepted terminology would cause a serious misunderstanding. I hope it doesn't
embarrass you too much.
On mathematical expositions
I use mathematical exposition as long as it is easier than the verbal one.
Economic theorists use mathematics not because they like to mystify but because it is the easiest way to share understanding precisely. It is the easiest
way to share definitions, assumptions and the process of deriving conclusions from them precisely, and to avoid the confusions and errors which often happen in
arguments made in natural language.
Of course what we mean by "easier" will depend on the audience. I guess it is
hard in the beginning, but I bet you will find it much easier as you proceed. Also,
I have tried to explain mathematical notions in a self-contained manner
as much as possible when they are necessary for reading ahead.
He or She
Because of the nature of the subject, I use the third-person singular pronoun repeatedly. It is always a problem for economic theorists whether we should use "he" or
"she."
There isn't a gender-neutral third-person singular pronoun in English (nor in my mother language) which refers to an abstract individual, and it
would be embarrassing if I made one up. So I have to make a choice. I
sometimes use "she" in research papers, but given that I'm a male this might
be somewhat artificial. So I decided to use "he." I hope this doesn't make my
text look sexist.
Acknowledgements
This book is largely based on my book published in Japanese (1st edition in
2007, 2nd in 2013). I would like to thank the publisher Minerva Shobo and Mr. Kentaro
Horikawa for their cooperation which enabled the publication of that book. Most
of the materials are originally based on my lecture notes given at the University
of Texas at Austin. Some topics are based on my lecture notes given at the
University of Glasgow.
Mathematical Notation
I'm not actually using serious mathematics, and most of the difficulties you might
face will simply be due to unfamiliar notation, which I use for the purpose of
concision. Here I give a brief description of it.
$x \in X$ is read as "x belongs to X" or "x belonging to X."
$\{x \in X : f(x)\}$ denotes the set consisting of the x belonging to X which
satisfy proposition $f(x)$.
$\forall$ is the universal quantifier. $\forall x;\ f(x)$ is read as "every x satisfies proposition
f(x)," and $\forall x \in X;\ f(x)$ is read as "every x belonging to X satisfies
proposition f(x)."
$\exists$ is the existential quantifier. $\exists x;\ f(x)$ is read as "there exists at least one
x which satisfies proposition f(x)," and $\exists x \in X;\ f(x)$ is read as "there
exists at least one x belonging to X which satisfies proposition f(x)."
$\Longrightarrow$ is the symbol of implication. For two propositions A, B, $A \Longrightarrow B$
is read as "If A is true, then B is true."
$\Longleftrightarrow$ is the symbol of logical equivalence. For two propositions A, B,
$A \Longleftrightarrow B$ is read as "A is true if and only if B is true."
$\mathbb{R}$ denotes the set of real numbers.
$\mathbb{R}_+$ denotes the set of non-negative real numbers.
$\mathbb{R}_{++}$ denotes the set of positive real numbers.
$\mathbb{R}^n$ denotes the set of n-dimensional vectors of real numbers. Its element
is denoted for example by $x = (x_1, \dots, x_n)$, where its i-th coordinate is
$x_i$.
$\mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x_i \ge 0,\ i = 1, \dots, n\}$ denotes the set of n-dimensional
non-negative vectors.
$\mathbb{R}^n_{++} = \{x \in \mathbb{R}^n : x_i > 0,\ i = 1, \dots, n\}$ denotes the set of n-dimensional
positive vectors.
$\prod_{i=1}^{n} A_i$ denotes the product of n sets $A_1, \dots, A_n$, that is, $A_1 \times \cdots \times A_n$.
Contents
Part I
3 Preference
3.1 Preference relation
3.2 Preference over consumptions
3.3 Marginal rate of substitution
3.4 Smooth preferences
3.5 Convexity and diminishing marginal rate of substitution
3.6 Exercises
6 Demand analysis
6.1 Normal and inferior goods
6.2 Ordinary and Giffen goods
6.3 Gross substitutes and gross complements
6.4 Elasticity of demand
6.5 Substitution effect and income effect
6.6 Income evaluation of welfare change
6.7 Exercises
10 Revealed preference
14 Production technology
14.1 1-input/1-output case
14.2 2-input/1-output case
14.3 Exercises
Part III
19 Monopoly
19.1 Monopoly equilibrium
19.2 Pareto inefficiency of monopoly equilibrium
19.3 Price discrimination and monopolistic surplus
19.4 Exercises
22 Oligopoly
22.1 Simultaneous quantity setting (Cournot competition)
22.2 Sequential quantity setting: Stackelberg competition
22.3 Simultaneous price setting: Bertrand competition
22.4 Sequential price setting
22.5 Convergence to perfect competition
22.6 Collusion
22.7 Exercises
Part IV
24 Auction
24.1 Prominent auction formats
24.2 Information, timeline and the natures of values
24.3 Preferences
24.4 First-price auction
24.5 Second-price auction
24.6 The revenue equivalence theorem
24.7 Exercises
26 Externality
26.1 Market failure
26.2 Solutions
27 Public goods and the free-rider problem
27.1 Public goods
27.2 Efficiency criterion: the Samuelson condition
27.3 The case of quasi-linear preferences
27.4 The free-rider problem
27.5 Strategy-proof mechanism
27.6 Exercises
Postscripts
Part I
Chapter 1
On the concept of rationality in economics
Economics is very often criticized for making unrealistic assumptions. The most
common criticism is about rationality, saying that real human beings are
not rational as assumed in economics.
I agree with some of these criticisms eventually (probably not in the way the readers
expect), but let me give some clarifications before I proceed, since the word
"rationality" is broad. In daily life the word "rational" or "irrational" has
been used even as a convenient rhetoric to justify and praise or criticize and disparage
somebody's choice or action while maintaining the appearance of being value-neutral.
It is obviously a hard problem to summarize the notion of rationality in
economics so that everybody can agree. Let me try, however, to summarize what
I understand to be consistently underlying economic theory. I would say rationality
in economics means that
1. an individual has a certain consistent subjective criterion of value (called
preference);
2. he takes all the relevant contingencies into account and perceives them
correctly;
3. he goes through logically correct reasoning; and
4. he fulfills the criterion to the maximum.
That is, the notion of rationality here is a purely formal one at the individual
level. As far as the above conditions are met we have to say that one is rational
even if he is a vicious killer. At the same time, this notion of rationality does not
presume that one is selfish, and does not at all exclude altruism from being a component
of an individual's subjective criterion, as far as the criterion remains consistent.
CHAPTER 1. RATIONALITY
Put differently, whether some action is rational for an individual and whether it is
socially desirable are different issues.
Of course this clarified notion of rationality again faces criticisms, saying
that real human beings are not as consistent, knowledgeable, precise or smart
as described above. It is an idealized principle which is impossible if you take
it literally.
To borrow an outdated analogy (it's a shame but I can come up with only
this one), this is analogous to how physics starts its first-step argument by
assuming a vacuum and no friction. There is no vacuum or frictionless situation
in reality, but such an assumption helps us to build the first several laws of classical
mechanics, which take us up a level so that we can understand more realistic
situations.
In my view, in the social sciences it is not only helpful to start with such an idealized
assumption but rather necessary, in that we cannot see or understand reality as
it is without standing on such a baseline. Reality is of course different from the
baseline, but it can be understood only by seeing how it is different or distant
from the baseline.
A natural question arising here is what the postulate for a good choice of
such a baseline is. The most extreme form of the positivist view says that the baseline doesn't have
to have anything to do with reality: it should be the simplest assumption
under the simplest setting which can derive as many predictions consistent with real
phenomena as possible, and the more unrealistic it is, the better.
It says for example that it is meaningless to test whether individuals are really
solving their maximization problems rationally, and that what is important is that
their behaviors are explained "as if" they are solving maximization problems
rationally.
I don't take this view, however, because it does not say anything about
the necessity of particular assumptions, as there may be several equally simple
principles which can explain reality in the "as if" way. Why do we have to
choose the above assumptions over the others?
Also, economists have the task of doing welfare analysis and providing normative
arguments, which critically depend on how far individuals are responsible
for their rationality. If an economist takes the above pure positivist view he
should not be able to draw any normative implication from his positive analysis.
If he does so it must be a deception. A typical deception is that in positive
analysis one describes individuals' choices "as if" they are acting rationally, and
in its normative implication he implicitly switches the interpretation so that the
individuals are indeed rational and responsible for their choices.
Another view about rationality often invoked is an evolutionary story, which
says that if one is not rational he would die or perish either in the social or
biological sense, meaning that there is little to lose by assuming that those who
are living (that is, who have survived until now) are rational.
I don't take this view either, since it says at most that certain characteristics
make one the "fittest" and more likely to survive under a certain environment.
We cannot draw any normative implication from this either, while applications
of evolutionary arguments to the social sciences often fall into the confusion that
such characteristics are desirable and those with such characteristics should
be dominant in society, even in a modern civilized society. It is strange from the
outset to insist on that, since if some people were really the fittest they
would already have been dominant before anyone said "should."
I would say that the primary role of economics is to provide a consistent and
meaningful understanding of commensurability between different individual values, each of which is solid, stable and deliberate, and of how
to realize such commensuration. It is not to provide an explanation or
prediction of behavior in general, nor to grasp an organic formation of value
sentiments in society as a whole.
The choice of baseline should serve this objective, and a certain abstraction is
necessary in order that we can clearly see solid, stable and deliberate individual
values.
Note that such abstraction is a purely formal one, in the sense that we
identify time horizons, spaces and contingencies over which the notion of solid,
stable and deliberate individual values makes sense, by abstracting away certain
ranges of idiosyncratic details of choice situations. It is not selecting a particular
content of social life over another, such as selecting "economic" rationality and
abstracting away the other ones such as political, social and cultural.
Of course, in this sense, we should note that economics has an imperialistic
ambition which tries to apply its methodology to any aspect of social life that
possesses the same formal structure.
It is natural that you wonder whether such a rationality approach works (or should
work). So let me briefly explain the nature of the approach and what types of
abstraction are carried out there, and list challenges to it.
To illustrate, let x denote the input, which is an observable external element
given to the individual (such as a constraint or situation), and let y denote the output,
which is his observed behavior. If we take the most extreme stance of so-called behavioralism, then we consider only a functional relationship which holds
between input and output. Denote this functional relationship by f; then the
relation between input and output is written as
$y = f(x)$.
On the empirical side, given data consisting of pairs of input and output
$(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$,
this stance looks for a function f which satisfies
$$
\begin{aligned}
y_1 &= f(x_1) \\
y_2 &= f(x_2) \\
&\ \,\vdots \\
y_n &= f(x_n),
\end{aligned}
$$
but it does not impose any other features on it, particularly ones about
internal mental states.
On the other hand, the rationality approach in economics considers the
functional relationship
$y = g(\theta, x)$,
where the parameter $\theta$ describes preference (corresponding to item 1 in the list above)
and g describes the maximization behavior (corresponding to items 2 to 4 in the
list above). There the right-hand side $g(\theta, x)$ refers to the choice which fulfills the preference to the maximum under the given external condition x.
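The relationship between preference, choice situation and choice can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the preference parameter is encoded as a hypothetical dictionary of numbers (higher means more preferred), the input x is a finite set of available objects, and the lunch items are made-up data.

```python
def g(theta, x):
    """Return the most preferred element of the available set x.

    theta: dict mapping each object to a number (higher = more preferred).
           This numeric encoding of preference is an assumption of the sketch.
    x:     non-empty collection of available objects.
    """
    return max(x, key=lambda obj: theta[obj])


def rationalizes(theta, data):
    """Check whether y_i equals g(theta, x_i) for every observed pair (x_i, y_i)."""
    return all(g(theta, x) == y for x, y in data)


# Hypothetical preference and hypothetical observations.
theta = {"pasta": 3, "pizza": 2, "hamburger": 1}
data = [({"pasta", "pizza"}, "pasta"), ({"pizza", "hamburger"}, "pizza")]

print(rationalizes(theta, data))  # True
```

If no candidate preference passes this check for the observed pairs, the data cannot be read as maximization of a single fixed preference within this simple encoding.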
On the empirical side, given data consisting of pairs of input and output
$(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$,
if you can back out a parameter $\theta$ which meets
$$
\begin{aligned}
y_1 &= g(\theta, x_1) \\
y_2 &= g(\theta, x_2) \\
&\ \,\vdots \\
y_n &= g(\theta, x_n)
\end{aligned}
$$
where (pasta, hamburger) denotes the choice of eating pasta on the first day and
hamburger on the second day (although we cannot specify the ranking among
the three on the right-hand side from this observation alone).
It is natural to wonder: "As you set the time horizon longer you can absorb
variation of choices across periods into the length of the time horizon. Doesn't
it mean that by taking the time horizon arbitrarily long we can explain anything
as a rational choice generated by a fixed preference over arbitrarily long objects?" In other words, if we take it literally that life happens just once, anything is
rational since there is only one sample.
This is ultimately a problem for the outside observer, who judges how long
the time horizon should be taken so that the observed choice data are seen as a repetition of some "complete" problem and it is meaningful to think of consistency
and inconsistency across samples. For example, you can think of the proportions of
kinds of lunch meals during one month, and see the data as a repetition of such
monthly summaries.
What if life is not a repetition of an identical problem, and, as with a whole
life, only one dynamic choice problem is given to an individual and we can
observe just one sample of his life path? In the statistical treatment of such dynamic
choice problems, we usually consider that people are ex-ante identical and
take different life paths because of different inputs, which are observable, and
unobservable noise.
Now, how should we think if we see inconsistencies of choices even after selecting the time horizon as adequately as possible? Let us think of the following
example.
Example 1.1 You have a choice of starting cocaine or not. There are three
possible paths:
A: Start it and quit it later.
B: Start it and continue.
C: Don't take it at all.
As you have not started cocaine yet and you are curious, your preference
over the paths is
A > C > B.
However, because of the nature of addiction, once you start cocaine you become
a different person, literally, and the preference of your new personality is
B > A.
That is, there are two different selves, before and after taking cocaine, who
have contradicting preferences.
In such cases one preference cannot simply determine the choice. There
are at least two ways of making a choice. One is that the current self makes a choice while
(mistakenly) believing that he can control his future selves. This is called a naive
decision. In the above example, the naive decision is to start taking cocaine,
intending to stop later, and then not actually to stop later. The other
is that the current self foresees how future selves behave, and makes a choice by
taking how future selves behave as a given constraint. This is called a sophisticated
decision. In the above example the sophisticated decision is not to take cocaine
at all, given that the future self cannot stop taking cocaine once he starts it.
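The difference between the two decision rules can be traced step by step in a short Python sketch. The rankings are the ones stated in Example 1.1; the function names and the list encoding (best path first) are hypothetical choices made only for this illustration.

```python
# Rankings from Example 1.1, best path first.
current_self = ["A", "C", "B"]  # before starting
future_self = ["B", "A"]        # after starting


def best(ranking, options):
    """Return the option that appears earliest in `ranking`."""
    return min(options, key=ranking.index)


def naive_outcome():
    """Return (planned path, realized path) for the naive decision."""
    # The current self plans as if his own ranking governed every stage.
    plan = best(current_self, ["A", "B", "C"])
    if plan == "C":
        return plan, "C"
    # Once started, the future self decides between quitting (A) and continuing (B).
    realized = best(future_self, ["A", "B"])
    return plan, realized


def sophisticated_outcome():
    """Return the path chosen when the future self's behavior is taken as given."""
    after_start = best(future_self, ["A", "B"])
    # Compare what starting really leads to with never starting.
    return best(current_self, [after_start, "C"])


print(naive_outcome())          # ('A', 'B')
print(sophisticated_outcome())  # C
```

The naive self plans path A but ends up on path B, while the sophisticated self compares only B and C and chooses C.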
The cocaine example may be a bit too extreme, but this type of problem
often occurs in choice with habit formation.
Let us think of one more example.
Example 1.2 Consider the following two choice problems.
Problem A
A1: Receiving 1000 dollars after one year.
A2: Receiving 1050 dollars after one year and one week.
Problem B
B1: Receiving 1000 dollars now.
B2: Receiving 1050 dollars after one week.
The example as presented above is somewhat misleading since you can
save money, so assume that you have to spend the money immediately after
receiving it. Then (in more carefully designed experiments) the pair of choices
A2 and B1 is frequently observed.
What's the problem with this? Suppose you choose A2 in A and B1 in B.
Then you would choose to sign a contract to receive 1050 dollars after one year
and one week, rather than a contract to receive 1000 dollars after one year. After one
year, you will regret it, since you would rather receive 1000 dollars immediately
than wait one more week. Thus, there is a conflict between the current self
and the self after one year.
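One way to reproduce the pair of choices A2 and B1 numerically is sketched below. The discounting rule is an assumption brought in purely for illustration and is not stated in the text: every delayed payment is scaled by a hypothetical factor beta, and additionally by a hypothetical factor delta for each week of delay. The parameter values and the convention of 52 weeks per year are also assumptions.

```python
def present_value(amount, weeks, beta=0.9, delta=0.999):
    """Value today of `amount` received after `weeks` weeks.

    beta and delta are hypothetical parameters, not taken from the text:
    any delayed payment is scaled by beta, and by delta once per week of delay.
    """
    if weeks == 0:
        return amount
    return beta * delta ** weeks * amount


def choose(options):
    """Pick the label whose (amount, weeks) pair has the highest present value."""
    return max(options, key=lambda label: present_value(*options[label]))


# One year is taken to be 52 weeks (an assumption of this sketch).
problem_a = {"A1": (1000, 52), "A2": (1050, 53)}
problem_b = {"B1": (1000, 0), "B2": (1050, 1)}

print(choose(problem_a))  # A2
print(choose(problem_b))  # B1
```

After one year has passed, the self who signed up for A2 faces exactly Problem B and would now pick B1, which is the conflict between selves described above.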
How do economists think when such successive selves are present? Mostly
we then take an individual as a "society" consisting of different selves, and
apply game theory or social choice theory to such a micro-society. However, we should
be careful about the applicability of these theories to the micro-society, since
successive selves are not totally different persons from each other. I will come
to this issue in the last part of the postscripts.
Issue 2: It is untrue that an individual chooses the best available thing for him.
There are cases in which he gives up selfish choice, due to certain social
reasons.
Consider the following example. Suppose you receive 100 dollars from somebody, and you can either spend all of it on yourself or give half of it to
your brother. If you care only about your own consumption in the current period
you will spend it all on yourself, but many of you may give half of it to your
brother.
We can explain this in two ways.
1. The apparent departure from rationality is again due to misspecification
of the time horizon and the relevant contingencies. There is nothing wrong with
a rational action causing a loss in the short run while realizing gains in
the long run.
Also, there is nothing wrong with a rational action causing a loss under
a particular state while being profitable in expectation from the ex-ante viewpoint. Insurance is a typical example.
In this way, we can explain mutual help as a collection of individuals'
selfish behaviors in the long run or under uncertainty.
2. Altruism and care for social status are nothing but a part of preference. It
appears that an individual is not choosing what he likes because the outside observer is misspecifying his preference or captures it only partially.
In such cases, an individual compares the satisfaction of his selfish motive with the satisfaction of his altruistic motive and his taste for social
status, and after weighing them he makes the overall decision.
By going through the above ways we can extend the standard choice theory,
often borrowing the help of game theory, which is covered in a later part of the
book. I would say that economics puts priority on the first way, since allowing
the second way without discipline may lead to "anything goes." In any case, from
the viewpoint of rationality as summarized above it is not essential whether
an individual is selfish or not.
Issue 3: An individual's choice criterion does not exist independently of the choice
situation.
In the above model, the structural parameter $\theta$ is supposed to exist prior to
and independently of x being given. That is, preference is supposed to exist
independently of the choice situation. The rationality approach then considers
that behavior is a function of preference and choice situation.
It is shown by many experimental studies, however, that an individual's choice
criterion depends on how choice opportunities are given. The following example
is due to Benartzi and Thaler [3].
Example 1.3 Consider coin flipping and the following two choice problems.
Problem 1: Split 100 dollars between two securities below.
Chapter 2
The method (not the substantive contents conveyed by it) covered in this book
can be applied to general kinds of spheres of society, not only to individual
and material consumption. It allows for example that one's consumption may
affect another's consumption (externality), and that there may be a good
which many people can use at the same time (public good). Also it is not
limited to material consumption but can be applied to non-material kinds of
social actions such as political, social or cultural choice.
In the beginning, the set of choice objects X is just an abstract set. For
example, the set of choice objects in a US presidential election is, let's say,
$X = \{\text{Obama}, \text{Clinton}, \text{Romney}, \text{Palin}, \text{McCain}, \dots\}$,
and the set of choice objects in the problem of which school to attend is, let's
say,
$X = \{\text{Univ. A}, \text{Univ. B}, \text{Univ. C}, \text{Univ. D}, \text{Univ. E}, \dots\}$,
and the set of choice objects in the problem of which company to work for is,
let's say,
$X = \{\text{Co. A}, \text{Co. B}, \text{Co. C}, \text{Co. D}, \text{Co. E}, \dots\}$.
I guess the readers wonder here. How can we choose from them even when
not all of them can be presidential candidates? How can we choose from
them even when not all of those schools make offers to me? How can we choose
from them even when not all of those companies make offers to me? Here I
take X to be the set of all the conceivable and potentially available objects,
putting aside which ones are actually available to choose.
The first step in microeconomics is to write down the right set of choice
objects according to the interest. Consider again the example
$X = \{\text{Co. A}, \text{Co. B}, \text{Co. C}, \text{Co. D}, \text{Co. E}, \dots\}$.
Here it is implicitly assumed that salary is already a fixed component of each
company's features. However, one can treat salary as an explicit variable
as well; then the set of choice objects is
$X = \{\text{Co. A}, \text{Co. B}, \text{Co. C}, \text{Co. D}, \text{Co. E}, \dots\} \times \mathbb{R}_+$,
where $\mathbb{R}_+$ denotes the non-negative half line. Its element is for example (Co. D, m),
which means working for Company D for salary m.
Also, one can treat which city to work in as an explicit variable
as well. Then the set of choice objects is
$X = \{\text{Co. A}, \text{Co. B}, \text{Co. C}, \text{Co. D}, \text{Co. E}, \dots\} \times \mathbb{R}_+ \times \{\text{City } \alpha, \text{City } \beta, \text{City } \gamma, \dots\}$,
and its element is for example (Co. D, m, $\alpha$), which means working for Company
D for salary m and living in City $\alpha$.
Also, consider what to eat for lunch. Then the set of choice objects is
$X = \{\text{pizza}, \text{hamburger}, \text{pasta}, \text{sandwich}, \text{fish and chips}, \dots\}$,
but if you are talking not just about today's lunch but also about lunch
for tomorrow, the right description is
$X \times X$,
and its element is for example (sandwich, pizza), which says eating a sandwich
today and pizza tomorrow.
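Building a richer set of choice objects as a product of simpler ones can be sketched in Python as follows. The sketch truncates the lunch menu to the five items named above (the text's set is open-ended), and the variable names are hypothetical.

```python
from itertools import product

# The five lunch items named in the text; the original set continues beyond these.
lunch = ["pizza", "hamburger", "pasta", "sandwich", "fish and chips"]

# Two-day choice objects: every (today, tomorrow) pair, that is, the product of
# the lunch set with itself.
two_day = list(product(lunch, repeat=2))

print(len(two_day))                      # 25
print(("sandwich", "pizza") in two_day)  # True
```

Each element of `two_day` is one complete two-day plan, so a preference over this set can rank plans that differ only in the order of meals.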
2.2 Opportunity sets
As explained above, the set of choice objects X consists of all the potentially
available ones. However, in actual choice opportunities we are given only a
subset of it. Let us call it an opportunity set. Denote it, let's say, by B; then
it must satisfy $B \subseteq X$ and $B \neq \emptyset$.
In the example of school choice, given the set of all schools
$X = \{\text{Univ. A}, \text{Univ. B}, \text{Univ. C}, \text{Univ. D}, \text{Univ. E}, \dots\}$,
the set of schools one can be admitted to is, let's say,
$B = \{\text{Univ. C}, \text{Univ. E}, \text{Univ. J}\}$.
In the example of choosing which company to work for, given the set of all companies
$X = \{\text{Co. A}, \text{Co. B}, \text{Co. C}, \text{Co. D}, \text{Co. E}, \dots\}$,
the set of companies from which one can get an offer is, let's say,
$B = \{\text{Co. A}, \text{Co. K}, \text{Co. M}, \text{Co. Q}\}$.
Here let $\mathcal{B}$ denote the family of opportunity sets which are institutionally
possible. The simplest form of $\mathcal{B}$ consists of all the non-empty subsets of
X, but this is not always the case institutionally. For example, in a US presidential
election, because no more than one candidate can run from one party, we cannot
have a choice opportunity like
$B = \{\text{Obama}, \text{Clinton}\}$.
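The family of institutionally possible opportunity sets can be enumerated in a short Python sketch. It restricts X to the five candidates named earlier (the text's set is open-ended), and the helper names and the party dictionary are hypothetical devices introduced for the illustration.

```python
from itertools import combinations

# The five candidates named in the text; the original set continues beyond these.
X = ["Obama", "Clinton", "Romney", "Palin", "McCain"]
party = {"Obama": "D", "Clinton": "D", "Romney": "R", "Palin": "R", "McCain": "R"}


def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def institutionally_possible(B):
    """At most one candidate may run from each party."""
    parties = [party[candidate] for candidate in B]
    return len(parties) == len(set(parties))


all_sets = list(nonempty_subsets(X))
family = [B for B in all_sets if institutionally_possible(B)]

print(len(all_sets))                             # 31
print(len(family))                               # 11
print(frozenset({"Obama", "Clinton"}) in family)  # False
```

The simplest family would be all 31 non-empty subsets; the one-candidate-per-party rule leaves 11 of them and excludes the set containing both Obama and Clinton.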
2.3 Consumption set
So far, the set of choice objects X and the choice objects x and y could be anything.
However, in the first half of this book we mostly consider individual consumption,
as far as we are concerned with market theory.
In this context the set of choice objects, which consists of all the potentially
possible combinations of consumption, is called the consumption set. Again, because consumers are constrained by their incomes, not all of them are always
available to choose. I need to explain the consumption set first, however.
2.3.1
To simplify the explanation, we mostly assume that there are just two goods.
Of course this does not mean that there are really only two goods in the world;
it is simply that the two-good illustration is enough for understanding
the contents covered in this book. Here let me call them Good 1 and Good 2.
Also, we mostly assume that each good is homogeneous and divisible.
As with gasoline, we consider that this 1 gallon of it and that 1 gallon of it are
identical and that we can buy it in arbitrarily fine quantities such as 1.367... gallons.
Of course actual accounting does not allow this, but let us consider that we
can do something like this as closely as possible. On the other hand, this
house and that house are typically different. They are heterogeneous. Also,
typically we cannot buy 0.47 units of a house. It is indivisible. We will consider
heterogeneous goods and indivisible goods in the next section and in a later
chapter.
As we assume each of the two goods is homogeneous and divisible, the consumption set is given as the non-negative quadrant of the 2-dimensional plane,
$\mathbb{R}^2_+$. Its element $x = (x_1, x_2)$ is called a consumption vector. When the consumer is receiving $x = (x_1, x_2)$ it means he is receiving $x_1$ units of Good 1 and
$x_2$ units of Good 2 (see Figure 2.1). For example, when Good 1 is gasoline and
[Figure 2.1: a consumption vector $x = (x_1, x_2)$ plotted with Good 1 on the horizontal axis and Good 2 on the vertical axis]
2.3.2 Indivisible goods
There are variations in how to describe heterogeneity and indivisibility, but here
let me pick a simple illustration: Good 1 is indivisible but homogeneous and
Good 2 is homogeneous and divisible. I will take heterogeneity and indivisibility
more seriously in Chapter 28.
Good 1, which is homogeneous but indivisible, allows only integer amounts
of consumption. Good 2 is the same as before. Then the consumption set is
$\mathbb{Z}_+ \times \mathbb{R}_+$. See Figure 2.2, where the first coordinate takes only integer
values. For example, $x = (x_1, x_2)$ is in the consumption set because $x_1 = 3$
is an integer. On the other hand, $y = (y_1, y_2)$ is not in the consumption set
because $y_1$ is not an integer.
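The two consumption sets can be compared with simple membership checks in Python. The function names and the numerical values other than the first coordinate 3 are hypothetical and serve only as an illustration.

```python
def in_divisible_set(x):
    """Membership in the non-negative quadrant: both coordinates are >= 0."""
    return all(coordinate >= 0 for coordinate in x)


def in_indivisible_set(x):
    """Membership when Good 1 is indivisible: the first coordinate must be a
    non-negative integer, the second any non-negative real number."""
    x1, x2 = x
    return x1 >= 0 and float(x1).is_integer() and x2 >= 0


x = (3, 1.5)    # first coordinate is the integer 3, as in the text
y = (2.5, 1.5)  # hypothetical vector with a non-integer first coordinate

print(in_indivisible_set(x))  # True
print(in_indivisible_set(y))  # False
print(in_divisible_set(y))    # True
```

The vector y is excluded only because Good 1 is indivisible; it would be a valid consumption vector if both goods were divisible.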
2.3.3
In the above formulation of the consumption set we have assumed that, unless the
consumer is constrained by his budget, we may consider arbitrarily large amounts
of consumption. This is inadequate for the case of labor and leisure, as
one cannot work more than 24 hours a day to begin with. Therefore, when
we analyze the choice of labor and leisure we assume that the available hours per
period are limited from the outset, after excluding the minimal hours necessary for
subsistence such as sleeping hours.
[Figure 2.2: Good 1 is indivisible. Points x and y plotted with Good 1 on the horizontal axis and Good 2 on the vertical axis]
2.3.4
What is important in economics is that even if goods are materially the same
they are treated as different goods if they are to be consumed in different time
periods or under different contingencies. For example, gasoline to be consumed today
and gasoline to be consumed tomorrow are different goods. Saving is an action
of buying future consumption by means of selling current consumption.
The simplest model of such intertemporal consumption is the 2-period model.
Assume that there are just two periods, Period 1 and Period 2, and there is just
one material good in each period. It might look too simple, but this is enough
for understanding and it can be extended to many periods.
Then the consumption set is the non-negative quadrant $\mathbb{R}^2_+$. That is, when
[Figure 2.3: Leisure/labor and consumption]
2.3.5
The same argument holds for uncertainty as well. For example, 1 gallon of gasoline
when Republicans win the US presidential election is a different good than one
gallon of gasoline when Democrats win. If you have to make some investment
decision before the election, your choice (a bet) is described in the form of state-contingent consumption.
To simplify, focus on the case that there are just two possible states of the
world, like "Republicans or Democrats" and "hot summer or cold summer". Call
the first one State 1 and the second one State 2, and suppose there is just one material
good at each state. It might look too simple again, but this is enough for the
understanding and it can be extended to many states.
Then the set of state-contingent consumption vectors is described by the non-negative quadrant $\mathbb{R}^2_+$. That is, when a vector of state-contingent consumption
$x = (x_1, x_2)$ is given it means that the consumer receives $x_1$ units of consumption
at State 1 and $x_2$ units at State 2.
2.4
Budget constraint
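[Figure 2.4: the budget line p1 x1 + p2 x2 = w, with intercept w/p1 on the Good 1 axis and w/p2 on the Good 2 axis, and a point y outside the triangle]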
prices of the goods. That is, the set of consumptions affordable under the
budget constraint is a subset of the consumption set.
Budget constraint in the standard form
Let us think of the simplest case, which I call the standard form. When a pair of
prices (called a price vector) $p = (p_1, p_2)$ and income $w$ are given, any affordable
combination of consumption $x = (x_1, x_2)$ must satisfy
$$p_1 x_1 + p_2 x_2 \le w.$$
As the left-hand side is expenditure and the right-hand side is income, the above
inequality says that expenditure should not exceed income.
Graphically speaking, any affordable consumption vector cannot go outside
of the triangle depicted in Figure 2.4. For example, consumption vector $y =
(y_1, y_2)$ is not affordable.
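The affordability check is a single inequality, so it is easy to sketch in Python. The prices, income and bundles below are hypothetical numbers picked for illustration, and `is_affordable` is a name I made up.

```python
def is_affordable(p, w, x):
    """True if bundle x = (x1, x2) costs no more than income w at prices p = (p1, p2)."""
    expenditure = p[0] * x[0] + p[1] * x[1]
    return expenditure <= w


p = (2, 5)   # hypothetical prices of Good 1 and Good 2
w = 100      # hypothetical income

inside = (10, 10)   # costs 2*10 + 5*10 = 70
on_line = (25, 10)  # costs 2*25 + 5*10 = 100, exactly the income
outside = (30, 10)  # costs 2*30 + 5*10 = 110

for bundle in (inside, on_line, outside):
    print(bundle, is_affordable(p, w, bundle))
```

A bundle that spends exactly the whole income is still affordable, since the constraint is a weak inequality.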
Remark 2.1 One may naturally ask, where does the price $p = (p_1, p_2)$ come
from, and how is it determined? Please delay this question until the chapters
on markets. Here I'm just talking about how consumers respond to given prices.
Given a price vector $p = (p_1, p_2)$ and income $w$, denote the set of consumption vectors satisfying the budget constraint by
$$B(p, w) = \{x \in \mathbb{R}^2_+ : p_1 x_1 + p_2 x_2 \le w\}.$$
This is called the budget set. Graphically, $B(p, w)$ corresponds to the area enclosed by the triangle in Figure 2.4. Its downward-sloping edge is called the budget
line. When the consumer spends all his income his consumption vector must
lie on the budget line. The budget line is described by the equality
$$p_1 x_1 + p_2 x_2 = w,$$
2.4.1
In the above budget constraint in the standard form I did not specify the source
of income $w$. Income may have many sources in reality, such as sales of goods,
wages, returns from assets, dividends from firm shares, and so on. Here let me
consider the simplest one, income in an exchange economy.
In an exchange economy each consumer brings his initial endowment
$e = (e_1, e_2)$ to the market. Then he either sells Good 1 and buys Good 2, or
sells Good 2 and buys Good 1, or sells or buys nothing. Given a price vector
$p = (p_1, p_2)$, his income is the market valuation of his initial endowment
$$p_1 e_1 + p_2 e_2.$$
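Here is a small Python sketch of income as the market value of the endowment. The prices, the endowment and the chosen bundle are hypothetical, and the helper names are mine. The net trade shows which good is sold (negative entry) and which is bought (positive entry).

```python
def endowment_income(p, e):
    """Market value of endowment e = (e1, e2) at prices p = (p1, p2)."""
    return p[0] * e[0] + p[1] * e[1]


def net_trade(x, e):
    """Consumption minus endowment, good by good."""
    return (x[0] - e[0], x[1] - e[1])


p = (2, 5)    # hypothetical prices
e = (10, 4)   # hypothetical initial endowment
x = (5, 6)    # hypothetical consumption bundle chosen by the consumer

income = endowment_income(p, e)
spending = endowment_income(p, x)   # same formula, applied to the chosen bundle
print(income, spending, net_trade(x, e))
```

In this example the consumer gives up 5 units of Good 1 and obtains 2 extra units of Good 2, and the two sides of the trade have the same market value.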
[Figure 2.5: Budget constraint in an exchange economy]
$x_2 > e_2$
then he is selling Good 1 and buying Good 2. Similarly for the opposite direction.
Numeraire
By the way, it is immediate to see that the budget constraint $p_1 x_1 + p_2 x_2 \le p_1 e_1 + p_2 e_2$ is equivalent to
$$\frac{p_1}{p_2} x_1 + x_2 \le \frac{p_1}{p_2} e_1 + e_2.$$
That is, in an exchange economy only the relative price matters, not the
absolute level. Therefore it is OK to normalize the price of some good to
1. Such a good is called the numeraire. Any good can be a numeraire, but here let's
say it is Good 2, and let $p$ denote the relative price of Good 1 in terms of Good 2. Then
the budget constraint becomes
$$p x_1 + x_2 \le p e_1 + e_2.$$
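The claim that only the relative price matters can be checked numerically. The following Python sketch uses hypothetical prices, endowment and bundles, and the function names are mine. It compares the original constraint, the same constraint with both prices multiplied by 10, and the normalized constraint with Good 2 as numeraire.

```python
def affordable(p1, p2, x, e):
    """Budget constraint in an exchange economy with absolute prices p1, p2."""
    return p1 * x[0] + p2 * x[1] <= p1 * e[0] + p2 * e[1]


def affordable_normalized(p, x, e):
    """Same constraint with Good 2 as numeraire; p is the relative price p1/p2."""
    return p * x[0] + x[1] <= p * e[0] + e[1]


e = (10, 4)                         # hypothetical endowment
bundles = [(5, 6), (12, 4), (8, 8)]  # hypothetical bundles
p1, p2 = 6, 3                        # hypothetical prices, so p1/p2 = 2

for x in bundles:
    original = affordable(p1, p2, x, e)
    scaled = affordable(10 * p1, 10 * p2, x, e)
    normalized = affordable_normalized(p1 / p2, x, e)
    print(x, original, scaled, normalized)
```

For each bundle the three answers coincide, which is all the normalization argument says.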
2.4.2
$\frac{q}{p}$ is the wage measured in units of the consumption good, which is the real wage.
Here, if the consumer wants 1 extra unit of leisure then he has to
give up $\frac{q}{p}$ units of consumption. Thus, the opportunity cost of 1 extra unit of
leisure is $\frac{q}{p}$ units of the consumption good.
2.4.3
Let me repeat that even if goods are materially the same they are treated as
different goods if they are to be consumed at different time periods. We describe
this by the two-period model, and let me introduce the budget constraint here.
In the two-period model the initial endowment is interpreted as an earning stream.
That is, when the consumer has initial endowment $e = (e_1, e_2)$ it means that
he earns $e_1$ units of the consumption good in the current period and $e_2$ units in
the future period. It might be too short to be called a "stream," but let me go
with this.
2.5
Exercises
Exercise 1 You have 120 units of income. Price of Good 1 is 4, that of Good
2 is 3.
Chapter 3
Preference
3.1
Preference relation
A preference relation describes an individual's subjective ranking over choice objects. It is denoted by $\succsim$, $\succ$, $\sim$. To help understanding you may take an analogy
to the inequality and equality symbols $\ge$, $>$, $=$, while this analogy is not quite right
as seen below.
Let $X$ be the set of choice objects, which is at this point abstract and may
consist of anything. Then, given choice objects $x, y \in X$, the relation
$$x \succsim y$$
is read as "$x$ is at least as good as $y$ for the individual."
Likewise,
$$x \succ y$$
is read as "$x$ is better than $y$ for the individual."
Also,
$$x \sim y$$
is read as "$x$ is as good as $y$ for the individual" or "the individual is indifferent
between $x$ and $y$."
We will need only $\succsim$, under the completeness condition introduced below,
because $x \succ y$ may be defined by "$y \succsim x$ is not true", and $x \sim y$ may be defined
by "both $x \succsim y$ and $y \succsim x$ are true".
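The way the strict relation and indifference are derived from the weak relation can be sketched in Python. The particular weak preference used here (compare bundles by the sum of their coordinates) is a hypothetical example of a complete preference, and all function names are mine.

```python
def weakly_prefers(x, y):
    """Hypothetical complete weak preference: x is at least as good as y
    when the total amount in x is at least the total amount in y."""
    return x[0] + x[1] >= y[0] + y[1]


def strictly_prefers(x, y):
    """x is better than y: 'y is at least as good as x' is not true."""
    return not weakly_prefers(y, x)


def indifferent(x, y):
    """x is as good as y: both weak comparisons hold."""
    return weakly_prefers(x, y) and weakly_prefers(y, x)


print(strictly_prefers((3, 2), (2, 2)))
print(indifferent((3, 1), (2, 2)))
print((3, 1) == (2, 2))
```

The last two lines illustrate the point about the analogy with equality: the two bundles are different objects, yet the individual is indifferent between them.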
Above I wrote that the analogy to $\ge$, $>$, $=$ is not quite right. This is because
two different objects can be equally preferable. That is, the relation $x \sim y$ can
be true for two different objects $x$ and $y$. On the other hand, the equality
relation $x = y$ can be true only when $x$ and $y$ are an identical object.
Throughout the book we assume that individual preference relation satisfies
the following two properties. One is
CHAPTER 3. PREFERENCE
[Figure: a point x with regions labeled A, B and C; horizontal axis: Good 1, vertical axis: Good 2]
3.2
3.2.1
[Figure 3.2: Indifference curves]
indifference curves $C$ and $C'$ cross as in Figure 3.3, and denote the intersection
by $x$. Then, since $y$ in the figure is above $x$ across $C$ we have $y \succ x$. Likewise,
since $z$ is above $y$ across $C$ we have $z \succ y$. However, since $x$ is above $z$ across
$C$ we have $x \succ z$, which leads to a cycle and contradicts Transitivity.
Let me give you some examples of preference. The simplest one consists of
parallel and straight indifference curves as in Figure 3.4. Here the two goods
are said to be perfect substitutes of each other. Here the slope of the indifference
curves being 3 in the graph means $(x_1, x_2) \sim (x_1 + t, x_2 - 3t)$ for any $t$. In
other words, the consumer is willing to give up 3 units of Good 2 per one extra
unit of Good 1. Thus the slope of the indifference curves expresses the consumer's
subjective rate of exchange between the two goods. We call this the marginal rate of
substitution of Good 2 for Good 1, while its more general definition will be
given later.
The next example consists of L-shaped indifference curves located parallel along
an upward-sloping straight line passing through the origin. Here the two goods
are said to be perfect complements of each other. In this graph the L-shaped
indifference curves are located parallel along the line $x_1 = 2x_2$. This means the
consumer sticks to some fixed proportion between Good 1 and Good 2, which is
2:1 here, and any extras have no value for him. So for example when he originally
has $(8, 4)$, receives 6 extra units of Good 1 and ends up with $(14, 4)$, because
the extra 6 units of Good 1 have no value we have $(8, 4) \sim (14, 4)$. Likewise,
when we add 5 units of Good 2 to $(8, 4)$ so as to obtain $(8, 9)$, again because
the extra 5 units of Good 2 have no value we have $(8, 4) \sim (8, 9)$.
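One way to see the three bundles above as equally good is to count how many complete 2:1 "sets" each bundle contains. The following Python sketch does this. The counting function is my own device for illustration and is only one of many numerical indices consistent with this preference.

```python
def complete_sets(x1, x2):
    """Number of complete 'sets' made of 2 units of Good 1 and 1 unit of Good 2.
    Anything beyond the 2:1 proportion is treated as having no value."""
    return min(x1 / 2, x2)


for bundle in [(8, 4), (14, 4), (8, 9), (16, 8)]:
    print(bundle, complete_sets(*bundle))
```

The first three bundles give the same count, which matches $(8, 4) \sim (14, 4)$ and $(8, 4) \sim (8, 9)$. Only the last bundle, which has more of both goods in the right proportion, gives a higher count.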
In this book I refer to perfect substitution and perfect complementarity as
extreme cases mostly. More flexible preferences will be between the two.
[Figure 3.3: indifference curves C and C' crossing at x, with points y and z]
[Figure 3.4: Perfect substitution]
[Figure: horizontal axis: Good 1, vertical axis: Good 2]
3.2.2
Monotonicity
Next I introduce two assumptions which are natural for preferences over consumptions. One is monotonicity, which says "more is better."
Strong Monotonicity: For any $x = (x_1, x_2)$ and $y = (y_1, y_2)$, if $x_1 \ge y_1$,
$x_2 \ge y_2$ and at least one of these inequalities is strict, then $x \succ y$.
This means the consumer is better off when the consumptions of both goods
increase or the consumption of one good increases while that of the other stays
the same. Therefore our mountain does not have a peak, and extends upward
in the north-east direction. Under Monotonicity, the set of consumption vectors
better than $x$ contains the quadrant in the north-east direction and the indifference
curves are always downward-sloping (see Figure 3.6).
Of course you can think of preferences which violate monotonicity. Consider
for example that the consumer dislikes some commodity, which is a bad for him,
then his preference violates monotonicity. Also it is violated when the consumer
gets full and consumption more than that makes him sick.
Monotonicity is pretty innocuous, however. If there is a harmful commodity
one can trade it for a negative price, which is equivalent to trading the right to
put it away for a positive price, and monotonicity is taken to hold with
regard to such a right. What about the case of becoming full? It is a matter of
how long we take one period to be. If we take it to be short we may have a case
where the consumer becomes full, but if we take it to be sufficiently long then the
consumer's preference satisfies the property that more is better.
Now, as I put the word "strong" in the above definition it suggests that there
is a weaker definition.
Weak Monotonicity: For all $x = (x_1, x_2)$, $y = (y_1, y_2)$, if $x_1 > y_1$ and
$x_2 > y_2$, then $x \succ y$.
[Figure 3.6: Monotonicity]
This says it is better if you increase the amounts of both goods, and it leaves the
possibility that you don't get strictly better off when you increase the amount
of just one good.
To illustrate the difference, consider the case of perfect complementarity.
When you increase the amounts of both Good 1 and Good 2 at $(x_1, x_2)$ and
obtain $(y_1, y_2)$, we have $(y_1, y_2) \succ (x_1, x_2)$ and weak monotonicity is met. On
the other hand, when we increase the amount of Good 1 only, let's say by $t$, as
we just move along the same indifference curve we have $(x_1 + t, x_2) \sim (x_1, x_2)$,
which says strong monotonicity fails. Likewise, when we increase the amount of
Good 2 only, let's say by $s$, as we just move along the same indifference curve
we have $(x_1, x_2 + s) \sim (x_1, x_2)$, which again says strong monotonicity fails.
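The difference between the two notions can be checked numerically for perfect complements. In the Python sketch below I assume the 2:1 proportion from the earlier example and start from the kink point (8, 4). The index `complete_sets` is my own illustrative device, and the increments are hypothetical.

```python
def complete_sets(x1, x2):
    """Illustrative index for perfect complements with a fixed 2:1 proportion."""
    return min(x1 / 2, x2)


x = (8, 4)            # a kink point on the line x1 = 2 * x2
more_of_both = (10, 5)
more_good1 = (11, 4)  # only Good 1 increased, by t = 3
more_good2 = (8, 7)   # only Good 2 increased, by s = 3

base = complete_sets(*x)
print(complete_sets(*more_of_both) > base)   # strictly better: weak monotonicity holds here
print(complete_sets(*more_good1) == base)    # indifferent: strong monotonicity fails
print(complete_sets(*more_good2) == base)    # indifferent: strong monotonicity fails
```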
3.2.3
Convexity
[Figure 3.7: Convexity. Points $x$ and $y$ and their mixture $\lambda x + (1 - \lambda) y$]
now and 0 in the future, (0, 10) refers to consuming 0 now and 10 in the future,
the midpoint (5, 5) refers to consuming 5 both now and in the future. It is
thus natural to prefer the midpoint when the consumer dislikes fluctuation over
time.[1]
In the setting of consumption under uncertainty, taking the middle corresponds
to hedging uncertainty. For example, while state-contingent vector $(10, 0)$ refers
to consuming 10 if Republicans win and 0 if Democrats win, and $(0, 10)$ refers to
consuming 0 if Republicans win and 10 if Democrats win, the midpoint $(5, 5)$
refers to consuming 5 regardless of the election outcome. It is thus natural to
prefer the midpoint when the consumer dislikes uncertainty.
Now, as I put the word "strict" in the above definition it suggests that there
is a weaker definition.
Weak Convexity: For any $x = (x_1, x_2)$ and $y = (y_1, y_2)$, if $x \sim y$ then for all
$0 < \lambda < 1$ it holds
$$\lambda x + (1 - \lambda) y \succsim x \sim y.$$
This means that taking the middle of any two equally preferable points does not
make the consumer worse off. The difference here is that the consumer may not
get strictly better off.
To illustrate, consider the case of perfect substitution. Because indifference
curves are straight here, any point between any two equally preferable points
is again equally preferable to those, which fails to satisfy strict convexity while
the weak one is met.
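The contrast between weak and strict convexity can be sketched in Python. For perfect substitution I use the index $3x_1 + x_2$, which matches the slope 3 of the earlier example. For comparison I also use the index $x_1 x_2$, a hypothetical example of a preference with curved indifference curves. Both indices and all bundles are my own illustrative choices.

```python
def mix(x, y, lam):
    """Convex combination lam * x + (1 - lam) * y of two bundles."""
    return (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])


def linear_index(x):
    """Perfect substitution with indifference curves of slope 3."""
    return 3 * x[0] + x[1]


def curved_index(x):
    """Hypothetical preference with curved indifference curves."""
    return x[0] * x[1]


# Two equally preferable bundles under perfect substitution, and their midpoint.
x, y = (0, 6), (2, 0)
print(linear_index(x), linear_index(y), linear_index(mix(x, y, 0.5)))

# Two equally preferable bundles under the curved preference, and their midpoint.
a, b = (1, 4), (4, 1)
print(curved_index(a), curved_index(b), curved_index(mix(a, b, 0.5)))
```

In the first case the midpoint is exactly as good as the endpoints (only the weak condition holds), while in the second case the midpoint is strictly better.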
[1] We need to be a bit more careful, since typically a consumer is not indifferent between
10 units to be received now and 10 units to be received in the future, since he is normally
impatient. I will come to the issue of impatience in Chapter 8, and let us pretend here that
the consumer is patient and 10 units now and 10 units in the future are equally valuable.
3.3
[Figure 3.8: Marginal rate of substitution. Moving from $(x_1, x_2)$ to $(x_1 + \Delta x_1, x_2 + \Delta x_2)$ along an indifference curve]
must be on the same indifference curve. While this indifference curve is not
straight, it is taken to be straight locally when the change in consumption
is very small. This local slope is equal to the slope of the tangent line to the
indifference curve at $(x_1, x_2)$. Denote this by $\frac{\Delta x_2}{\Delta x_1}$. This is the amount of Good
2 one can give up in order to get 1 extra unit of Good 1, in the local sense.
Recall that the indifference curves are downward sloping as preference satisfies monotonicity. Thus, when $\Delta x_1$ is positive $\Delta x_2$ is negative. Hence the local
slope of any indifference curve $\frac{\Delta x_2}{\Delta x_1}$ is negative. Because we are interested in the
absolute value of it, the marginal rate of substitution at $x$ is given by
$$MRS(x) = \left| \frac{\Delta x_2}{\Delta x_1} \right| = -\frac{\Delta x_2}{\Delta x_1}.$$
Here I put $(x)$ after $MRS$ because I like to emphasize the fact that the marginal
rate of substitution, which is the local slope of an indifference curve, varies
across points. That is, $MRS(x)$ is a function of $x$.
Now let us describe the marginal rate of substitution for several examples of
preference.
Example 3.3 Perfect substitution: Consider a preference exhibiting perfect
substitution, where the (absolute value of the) slope of the indifference curves is $\alpha$.
Then we have
$$MRS(x) = \alpha$$
for all $x = (x_1, x_2)$. Note that $\alpha = 3$ in the first example.
Example 3.4 Cobb-Douglas preference: MRS of Good 2 for Good 1 takes
the following form.
$$MRS(x) = \frac{x_2}{x_1}$$
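$$MRS(x) = \begin{cases} \infty & \text{if } x_1/x_2 < \alpha \\ \text{not uniquely determined} & \text{if } x_1/x_2 = \alpha \\ 0 & \text{if } x_1/x_2 > \alpha \end{cases}$$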
That is, when Good 1 is scarce compared to the determined rate it cannot
be compensated by any larger amount of Good 2, hence MRS is infinity. On
the other hand, when Good 1 is abundant compared to the determined rate
it is simply valueless and the consumer does not want to sacrifice any amount
of Good 2 for that, hence MRS is 0. Finally, MRS is not uniquely determined
at kinked points.
3.4
Smooth preferences
[Figure 3.9: Diminishing MRS]
3.5
Let us rethink the meaning of convexity in terms of the marginal rate of substitution. Indifference curves generated by a convex preference are steeper as the quantity of Good 1
is smaller, and flatter as the quantity of Good 1 is larger, as in Figure 3.9.
Equivalently, the marginal rate of substitution $MRS(x)$ is larger as $x_1$ is smaller,
and smaller as $x_1$ is larger. That is, when Good 1 is scarce the amount of Good
2 one can give for one extra unit of it is larger, and when Good 1 is abundant
the amount of Good 2 one can give for one extra unit of it is smaller. This is
called the law of diminishing marginal rate of substitution. I'll come back to
this in relation to the so-called law of diminishing marginal utility.
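A quick numerical illustration in Python may help. I assume, purely for illustration, a preference whose marginal rate of substitution is $x_2 / x_1$. Its indifference curves then have the form $x_1 x_2 = \text{constant}$, and I pick the hypothetical constant 16. Walking along that curve toward larger amounts of Good 1, the MRS falls.

```python
def mrs(x1, x2):
    """Assumed marginal rate of substitution x2 / x1 (illustration only)."""
    return x2 / x1


level = 16.0  # hypothetical indifference curve x1 * x2 = 16
points = [(x1, level / x1) for x1 in (1.0, 2.0, 4.0, 8.0)]
rates = [mrs(x1, x2) for x1, x2 in points]

for point, rate in zip(points, rates):
    print(point, rate)
```

All four points are on the same indifference curve, so the comparison is between equally preferable bundles that differ only in how scarce Good 1 is.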
3.6
Exercises
Exercise 2 Let Good 1 be the consumption good at Period 1, and Good 2 be the consumption good at Period 2.
(i) Describe by means of indifference curves the preference of a consumer who
cares only about consumption at Period 1.
(ii) Describe by means of indifference curves the preference of a consumer exhibiting perfect substitution between consumptions at the two periods, such that
he cares more about the current consumption.
(iii) Describe by means of indifference curves the preference of a consumer exhibiting perfect substitution between consumptions at the two periods, such that
he cares more about the future consumption.
Exercise 3 Let Good 1 be the consumption good available at State 1, and
Good 2 be the one available at State 2.
(i) Suppose the probability of State 1 is 2/3, and describe by means of indifference
curves the preference of a consumer who cares only about the expected value of
his consumption.
(ii) Describe by means of indifference curves the preference of a consumer who
cares only about the worst case, meaning the case with the lower amount of
consumption.
Chapter 4
of relative values, and we should not read any economic content in such numbers
or be aware that there must be certain faith in order to do so.
Definition 4.1 A function $u : X \to \mathbb{R}$ is said to be a utility representation of
preference $\succsim$ if for all $x, y \in X$ it holds that
$x \succsim y$ implies $u(x) \ge u(y)$,
$x \succ y$ implies $u(x) > u(y)$,
$x \sim y$ implies $u(x) = u(y)$.
In other words, it says that the assigned numbers are consistent with the given
preference.
For example, suppose that the preference is given by
$$x \succ y \sim z \succ w.$$
Let us assign numbers to the above alternatives consistently with the ranking. Denote the number assigned to each alternative by u(x), u(y), u(z), u(w),
respectively.
Then a set of numbers consistent with the ranking is for example
u(x) = 3, u(y) = u(z) = 2, u(w) = 1.
This is one utility representation of the preference.
However, for the preference $x \succ y \sim z \succ w$ the above one is not the only
utility representation. For example, let us double the assigned numbers and
obtain
$$u'(x) = 6, \quad u'(y) = u'(z) = 4, \quad u'(w) = 2.$$
This is also a utility representation of the preference $x \succ y \sim z \succ w$. Now, if we represent the preference by $u'$ instead of $u$, is our consumer's happiness doubled?
That's nonsense. They are simply representations of the same preference.
Likewise, let us add 4 uniformly to the first set of numbers:
$$u''(x) = 7, \quad u''(y) = u''(z) = 6, \quad u''(w) = 5.$$
This is also a utility representation of the preference $x \succ y \sim z \succ w$. Now, if we
represent the preference by $u''$ instead of $u$, is our consumer 4 units happier?
That's nonsense again. They are simply representations of the same preference.
There is no meaning in a phrase like "4 units happier."
Actually, any set of numbers is fine as long as it is consistent with the ranking.
For example
$$\hat{u}(x) = 5, \quad \hat{u}(y) = \hat{u}(z) = 2, \quad \hat{u}(w) = -3$$
is fine and
$$\tilde{u}(x) = -2, \quad \tilde{u}(y) = \tilde{u}(z) = -5, \quad \tilde{u}(w) = -8$$
is fine as well.
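The consistency requirement can be checked mechanically. The Python sketch below encodes the ranking of the four alternatives by tiers (a lower tier number means a better alternative) and tests whether a given assignment of numbers respects it. The encoding and the function name are my own choices for illustration.

```python
from itertools import product

# Ranking x > y ~ z > w encoded by tiers: lower tier = better alternative.
tier = {"x": 0, "y": 1, "z": 1, "w": 2}


def is_representation(u):
    """True if the numbers in u are consistent with the ranking:
    strictly better alternatives get strictly larger numbers,
    and indifferent alternatives get equal numbers."""
    for a, b in product(tier, repeat=2):
        if (tier[a] < tier[b]) != (u[a] > u[b]):
            return False
        if (tier[a] == tier[b]) != (u[a] == u[b]):
            return False
    return True


u = {"x": 3, "y": 2, "z": 2, "w": 1}
doubled = {k: 2 * v for k, v in u.items()}
shifted = {k: v + 4 for k, v in u.items()}
reversed_numbers = {"x": 1, "y": 2, "z": 2, "w": 3}

for candidate in (u, doubled, shifted, reversed_numbers):
    print(candidate, is_representation(candidate))
```

The first three assignments pass and the last one fails, since it gives the worst alternative the largest number.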
More generally, for any monotone transformation we have the following result. A function $f$ from real numbers to real numbers is said to be a monotone
transformation if $f(u) > f(v)$ whenever $u > v$. That is, it is any function
such that its graph is upward-sloping.
Theorem 4.1 Suppose that function $u$ is a utility representation of preference
$\succsim$. Then for any monotone transformation $f$ the function defined by
$$\hat{u}(x) \equiv f(u(x))$$
is also a utility representation of $\succsim$.
Proof. If $x \succ y$, because $u$ is a utility representation we have $u(x) > u(y)$. Since
$f$ is a monotone transformation, $f(u(x)) > f(u(y))$.
If $x \sim y$, because $u$ is a utility representation we have $u(x) = u(y)$. Hence we
have $f(u(x)) = f(u(y))$.
If $x \succsim y$ it holds either $x \succ y$ or $x \sim y$, and from the above we have either $f(u(x)) >
f(u(y))$ or $f(u(x)) = f(u(y))$. Therefore it holds $f(u(x)) \ge f(u(y))$.
That is, once there is a utility representation of a given preference relation
we can cook up arbitrarily many representations for the same preference by taking arbitrary monotone transformations. A utility representation has
meaning only as a representation of a preference ordering and it has no quantitative meaning. Thus it is called ordinal utility.
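To see the theorem at work, the following Python sketch applies several transformations to the numbers $u(x) = 3$, $u(y) = u(z) = 2$, $u(w) = 1$ used above and checks whether every pairwise comparison keeps its direction. The helper names and the particular transformations are hypothetical choices of mine. The last two transformations are not monotone and are included only for contrast.

```python
import math
from itertools import combinations

u = {"x": 3, "y": 2, "z": 2, "w": 1}


def sign(t):
    """Return 1, 0 or -1 according to the sign of t."""
    return (t > 0) - (t < 0)


def same_ordering(f):
    """True if f(u(.)) compares every pair of alternatives the same way as u(.)."""
    return all(
        sign(f(u[a]) - f(u[b])) == sign(u[a] - u[b])
        for a, b in combinations(u, 2)
    )


print(same_ordering(math.exp))                # monotone
print(same_ordering(lambda t: 2 * t + 4))     # monotone
print(same_ordering(lambda t: t ** 3))        # monotone
print(same_ordering(lambda t: -t))            # reverses the ranking
print(same_ordering(lambda t: (t - 2) ** 2))  # not monotone: x and w become tied
```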
Let me restate this point with regard to preference over a consumption space.
To illustrate, consider the case of perfect substitution, in which the slope of
the indifference curves is $-\alpha$, that is, the marginal rate of substitution is always constant
and equal to $\alpha$. The simplest representation of it will be like
$$u(x) = \alpha x_1 + x_2.$$
Indeed, if we take utility level $\bar{u}$, any consumption vector $(x_1, x_2)$ yielding this
satisfies
$$\alpha x_1 + x_2 = \bar{u},$$
which describes a straight line with slope $-\alpha$.
This is not the only representation, of course. For example, if we double the
above representation (that is, by transforming via $f(u) = 2u$) we obtain
$$\hat{u}(x) = 2\alpha x_1 + 2x_2.$$
Then the above noted indifference curve is described by
$$2\alpha x_1 + 2x_2 = 2\bar{u}.$$
Is our consumer's happiness doubled? No. Both $u$ and $\hat{u}$
are no more than representations of the same preference. It is immediate to see that the indifference
curve described by $\alpha x_1 + x_2 = \bar{u}$ and the one described by $2\alpha x_1 + 2x_2 = 2\bar{u}$ are
the same.
That is, the utility level $\bar{u}$ here is no more than an index referring to
which indifference curve we are looking at.
Likewise, we can think of a transformation such as $f(u) = e^u$. Because this is
a monotone transformation the function
$$\tilde{u}(x) = e^{\alpha x_1 + x_2}$$
is also a representation of the same preference, and the above-noted indifference
curve is described by
$$e^{\alpha x_1 + x_2} = e^{\bar{u}}.$$
Again it is immediate to see that the indifference curve described by $\alpha x_1 + x_2 = \bar{u}$
which is a generalization of hyperbola and has x1 -axis and x2 -axis as its asymptotes.
Again, this is of course not the only utility representation. We can, say, take the
log transformation and obtain
$$\hat{u}(x) = \ln(x_1^a x_2^b) = a \ln x_1 + b \ln x_2,$$
which is a representation of the same preference. It is immediate to see that
the indifference curve described by $x_1^a x_2^b = \bar{u}$ and the one described by $a \ln x_1 +
b \ln x_2 = \ln \bar{u}$ are the same.
We can obtain arbitrarily many representations of the Cobb-Douglas preference
by taking arbitrary monotone transformations, but the above two are the typical
ones. Sometimes the first one is easier to handle and sometimes the second is
easier.
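As a sanity check, here is a Python sketch comparing the two typical representations. The exponents $a = 1$, $b = 2$ and the list of bundles are hypothetical choices of mine. The sketch sorts the bundles by each representation and confirms that the resulting rankings coincide.

```python
import math

a, b = 1.0, 2.0  # hypothetical Cobb-Douglas exponents


def u_power(x):
    """Representation x1^a * x2^b."""
    return x[0] ** a * x[1] ** b


def u_log(x):
    """Representation a * ln(x1) + b * ln(x2), the log transformation of u_power."""
    return a * math.log(x[0]) + b * math.log(x[1])


bundles = [(2, 2), (1, 1), (3, 1), (1, 2), (2, 1)]

ranking_power = sorted(bundles, key=u_power)
ranking_log = sorted(bundles, key=u_log)

print(ranking_power)
print(ranking_log)
print(ranking_power == ranking_log)
```

The numbers assigned by the two functions are quite different, but the order in which they rank the bundles is the same, which is the only thing that matters.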
4.2
Marginal utility
Just like values of utility representation have no quantitative or economic meaning, so-called marginal utility has no quantitative or economic meaning either.
Why do I introduce it, nevertheless? It is a technical concept which helps us
to describe marginal rate of substitution in an operational manner, but that is
the only role. Economists tend to use the word marginal utility as if it is a
substantive concept, however, when its relation to marginal rate of substitution
is straightforward. We should keep that in mind.
4.2.1
One-good case
To illustrate, let me start with the case that there is just one good and preference
satisfies monotonicity, which simply says more of it is better. Then its representation u(x) can be any monotone increasing function, that is, any function
with upward-sloping graph.
Let me start with the simplest example of monotone increasing function,
which is a function with a linear graph,
u(x) = ax + b
with a > 0. Now let me ask in this representation how much utility increases
as the consumer gains extra one unit of the good, knowing that such question
has no economic content. It is immediate to see that the answer is a, slope of
the graph.
Now, let us consider another representation
$$\hat{u}(x) = \ln(ax + b).$$
This time its graph is not linear, so we have to look at local slopes of the graph.
Suppose we are currently at $x$ and we increase the amount of the good slightly.
rate of substitution instead, in the sense that the first sip of beer is more
precious and the consumer is willing to sacrifice more units of the other goods,
and as he has more units of beer he is willing to give up fewer units of the other
goods. To understand this, however, we need to think of the case of 2 goods at
least.
4.2.2
2-good case
When there are two or more goods, one has to obtain marginal utility of each
good. That is, in the two-good case, marginal utility of Good 1 corresponds to
how much utility increases as the consumer gets extra one unit of
Good 1 as he keeps the amount of Good 2 to be constant. Likewise,
marginal utility of Good 2 corresponds to how much utility increases as
the consumer gets extra one unit of Good 2 as he keeps the amount
of Good 1 to be constant. Of course, we should note that a phrase like how
much utility increases has no economic content.
Let me illustrate with the case of perfect substitution. Of course there are
arbitrarily many equivalent representations of the same preference, so I'm taking
the simplest one, which is a linear function
$$u(x) = ax_1 + bx_2.$$
Here it is easy to see how much utility increases as we add 1 unit of Good 1
alone, which is $a$, the coefficient on $x_1$. This is the marginal utility of Good 1 in
the given representation. Likewise, it is easy to see how much utility increases
as we add 1 unit of Good 2 alone, which is $b$, the coefficient on $x_2$. This is the
marginal utility of Good 2 in the given representation.
Graphically, this is illustrated as follows. The mountain given by representation $u(x) = ax_1 + bx_2$ consists of a plane as in Figure 4.1. Marginal utility
of Good 1 corresponds to the slope of this mountain toward the east,
that is, in the direction along the $x_1$-axis. Likewise, marginal utility of Good 2
corresponds to the slope of this mountain toward the north, that is,
in the direction along the $x_2$-axis. Since this mountain is a flat plane, the
slope toward the east is $a$ everywhere and the slope toward the north is $b$ everywhere.
Of course one can take another representation of the same preference, however, for example
$$\hat{u}(x) = \ln(ax_1 + bx_2).$$
The mountain given by this representation looks like Figure 4.2. Because $u$
and $\hat{u}$ are representations of the same preference they induce the same series of
indifference curves. That is, when you look at the mountain given by $u$ and
the mountain given by $\hat{u}$ from above you will see the same series of level
curves as in Figure 4.3. Let me repeat that only how the level curves look
should matter economically.
[Figure 4.1: $u(x) = ax_1 + bx_2$]
[Figure 4.2: $\hat{u}(x) = \ln(ax_1 + bx_2)$]
[Figure 4.3: Preference represented by $u(x) = ax_1 + bx_2$ and $\hat{u}(x) = \ln(ax_1 + bx_2)$]
Keep this in mind, and look at the mountain given by a general utility representation $u$. This time the mountain is not necessarily straight but its slope
varies across points, therefore we need to look at its local slope. See Figure
4.4. Suppose we are now at $x = (x_1, x_2)$, and consider the local slope toward the east.
To see that, keep the amount of Good 2, $x_2$, constant and consider adding
a slight amount of Good 1. Let $\Delta x_1$ denote this slight amount. That is, we
move from $(x_1, x_2)$ to $(x_1 + \Delta x_1, x_2)$ along the $x_1$-axis. Then the local slope
toward the east is given by
$$\frac{u(x_1 + \Delta x_1, x_2) - u(x_1, x_2)}{\Delta x_1}.$$
Now as we make this slight amount $\Delta x_1$ indefinitely small we
obtain the so-called partial derivative of $u$ with respect to $x_1$ at $x = (x_1, x_2)$,
$$\frac{\partial u(x)}{\partial x_1} = \lim_{\Delta x_1 \to 0} \frac{u(x_1 + \Delta x_1, x_2) - u(x_1, x_2)}{\Delta x_1}.$$
This is the marginal utility of Good 1 obtained for representation $u$ at $x =
(x_1, x_2)$.
Consider for example $\hat{u}(x) = \ln(ax_1 + bx_2)$, then the marginal utility of Good
1 is
$$\frac{\partial \ln(ax_1 + bx_2)}{\partial x_1} = \frac{a}{ax_1 + bx_2}.$$
Here the partial derivative of a function with respect to $x_1$ is obtained by taking $x_2$ to be
constant and taking the derivative of it as if it is a single-variable function of
$x_1$.
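The limit definition can be tried out numerically. In the Python sketch below the coefficients $a = 2$, $b = 3$, the point $(1, 1)$ and the step size are hypothetical choices of mine. The sketch computes the local slope toward the east with a small but finite increment and compares it with the formula $a / (ax_1 + bx_2)$.

```python
import math

A, B = 2.0, 3.0  # hypothetical coefficients a and b


def u_hat(x1, x2):
    """Representation ln(a*x1 + b*x2)."""
    return math.log(A * x1 + B * x2)


def local_slope_east(x1, x2, dx=1e-6):
    """(u(x1 + dx, x2) - u(x1, x2)) / dx, holding the amount of Good 2 fixed."""
    return (u_hat(x1 + dx, x2) - u_hat(x1, x2)) / dx


x1, x2 = 1.0, 1.0
numeric = local_slope_east(x1, x2)
analytic = A / (A * x1 + B * x2)
print(numeric, analytic)
```

The two numbers agree up to a small discretization error, which shrinks as the increment is made smaller (until floating-point rounding takes over).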
Likewise, consider the local slope toward the north at $x = (x_1, x_2)$. See Figure
4.5. Now keep the Good 1 quantity $x_1$ constant and consider adding a
slight amount of Good 2. Let $\Delta x_2$ denote this slight amount. That is, we
are moving from $(x_1, x_2)$ to $(x_1, x_2 + \Delta x_2)$ along the $x_2$-axis. Then the local
[Figure 4.4: Marginal utility of Good 1]
[Figure 4.5: Marginal utility of Good 2]
$$x_1^{1/3} x_2^{2/3}$$
$$x_1 x_2^2$$
$$x_1^2 x_2^4$$
respectively. Note that they represent the same preference and yield the same
series of indifference curves.
However, for each of them marginal utility of Good 1 at $x = (x_1, x_2)$ is
$$\frac{\partial\, x_1^{1/3} x_2^{2/3}}{\partial x_1} = \frac{1}{3} x_1^{-2/3} x_2^{2/3}$$
$$\frac{\partial\, x_1 x_2^2}{\partial x_1} = x_2^2$$
$$\frac{\partial\, x_1^2 x_2^4}{\partial x_1} = 2 x_1 x_2^4$$
respectively, and the first one is decreasing, the second is constant, and the third is increasing
in $x_1$. Again, when different representations of the same preference
lead to different conclusions about a proposition, it means that such a proposition
has no economic content.
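The three marginal utilities above can be tabulated directly. The Python sketch below evaluates the three formulas at a few hypothetical amounts of Good 1, holding Good 2 fixed at 1. The function names are mine.

```python
def mu_first(x1, x2):
    """Marginal utility of Good 1 for x1^(1/3) * x2^(2/3)."""
    return (1 / 3) * x1 ** (-2 / 3) * x2 ** (2 / 3)


def mu_second(x1, x2):
    """Marginal utility of Good 1 for x1 * x2^2."""
    return x2 ** 2


def mu_third(x1, x2):
    """Marginal utility of Good 1 for x1^2 * x2^4."""
    return 2 * x1 * x2 ** 4


amounts = (1.0, 2.0, 4.0)  # hypothetical amounts of Good 1
x2 = 1.0                   # amount of Good 2 held fixed

first = [mu_first(x1, x2) for x1 in amounts]
second = [mu_second(x1, x2) for x1 in amounts]
third = [mu_third(x1, x2) for x1 in amounts]
print(first)
print(second)
print(third)
```

The same preference thus shows "diminishing", "constant" and "increasing" marginal utility depending only on which representation we happen to write down.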
4.3
This book takes the stance that the concept of marginal utility helps us to
describe marginal rate of substitution in an operational manner, but it has no
more role than that. In other words, only marginal rate of substitution has
economic content and marginal utility itself has no such content.
Now how can we describe marginal rate of substitution by means of marginal
utilities? Consider that we are at $x = (x_1, x_2)$ as in Figure 3.8, and ask how
much of Good 2 the consumer can give up in order to get a slight amount of
Good 1. Denote this slight amount of Good 1 by $\Delta x_1$, and let $\Delta x_2$ denote the
amount of Good 2 he is willing to give up. Then the two points $(x_1, x_2)$ and $(x_1 +
\Delta x_1, x_2 + \Delta x_2)$ must be on the same indifference curve. Because indifference
curves are downward-sloping, when $\Delta x_1$ is positive $\Delta x_2$ must be negative.
Then let $\Delta u$ denote the change from utility $u(x_1, x_2)$ at $(x_1, x_2)$ to utility
$u(x_1 + \Delta x_1, x_2 + \Delta x_2)$ at $(x_1 + \Delta x_1, x_2 + \Delta x_2)$.
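Here because the local slope toward the east is $\frac{\partial u(x)}{\partial x_1}$, the change of
height made when we move to the east by $\Delta x_1$ is $\frac{\partial u(x)}{\partial x_1} \Delta x_1$. Also, because the local slope toward the north
is $\frac{\partial u(x)}{\partial x_2}$, the change of height made when we move to the north by $\Delta x_2$ is $\frac{\partial u(x)}{\partial x_2} \Delta x_2$. Adding the two, the total change is
$$\Delta u = \frac{\partial u(x)}{\partial x_1} \Delta x_1 + \frac{\partial u(x)}{\partial x_2} \Delta x_2.$$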
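Now recall that we are moving along the same indifference curve and the
change of utility must be kept zero (see Figure 3.8). Therefore, from $\Delta u = 0$
we have
$$0 = \frac{\partial u(x)}{\partial x_1} \Delta x_1 + \frac{\partial u(x)}{\partial x_2} \Delta x_2.$$
By rearranging this equality we obtain
$$-\frac{\Delta x_2}{\Delta x_1} = \frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2}.$$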
Notice that the left-hand side above is the marginal rate of substitution, the
absolute value of the local slope of the indifference curve. Recall that since
indifference curves are downward-sloping, when $\Delta x_1$ is positive $\Delta x_2$ is negative,
and that's why we are putting the minus sign.
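Summing up, we obtain
$$MRS(x) = \frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2}.$$
$$\frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2} = \frac{a}{b},$$
$$\frac{\partial \hat{u}(x)/\partial x_1}{\partial \hat{u}(x)/\partial x_2} = \frac{\frac{a}{ax_1 + bx_2}}{\frac{b}{ax_1 + bx_2}} = \frac{a}{b}.$$
$$\frac{\partial \tilde{u}(x)/\partial x_1}{\partial \tilde{u}(x)/\partial x_2} = \frac{a e^{ax_1 + bx_2}}{b e^{ax_1 + bx_2}} = \frac{a}{b}.$$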
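$$\frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2} = \frac{a x_1^{a-1} x_2^{b}}{b x_1^{a} x_2^{b-1}} = \frac{a x_2}{b x_1}$$
$$\frac{\partial \hat{u}(x)/\partial x_1}{\partial \hat{u}(x)/\partial x_2} = \frac{a / x_1}{b / x_2} = \frac{a x_2}{b x_1}.$$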
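$$MRS(x) = \frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2}, \qquad \widehat{MRS}(x) = \frac{\partial \hat{u}(x)/\partial x_1}{\partial \hat{u}(x)/\partial x_2},$$
but since $\hat{u}(x) = 2u(x)$ we have
$$\widehat{MRS}(x) = \frac{\partial\, 2u(x)/\partial x_1}{\partial\, 2u(x)/\partial x_2} = \frac{2\, \partial u(x)/\partial x_1}{2\, \partial u(x)/\partial x_2},$$
and as the 2s in the numerator and the denominator cancel out we obtain
$$\widehat{MRS}(x) = \frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2},$$
which implies $MRS(x) = \widehat{MRS}(x)$.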
While marginal utilities have no economic content, why does marginal rate
of substitution described by them have economic content? This is because
marginal rate of substitution is given as the ratio between marginal utilities,
where the scales of marginal utilities are canceled out across the numerator and
the denominator.
Now we have the following general claim.
Theorem 4.2 For any monotone transformation $f$ both $u(x)$ and $\hat{u}(x) = f(u(x))$
give the same marginal rate of substitution.
Proof. Let M RS(x) denote the marginal rate of substitution derived from
\
u(x), and let M
RS(x) denote the marginal rate of substitution derived from
u
b(x).
Marginal utility of Good 1 in representation u
b(x) is
b
u(x)
f (u(x))
u(x)
=
= f (u(x))
,
x1
x1
x1
from the formula for partial derivative of composite function. Likewise, marginal
utility of Good 2 in representation u
b(x) is
f (u(x))
u(x)
b
u(x)
=
= f (u(x))
x2
x2
x2
Therefore,
\
M
RS(x) =
u
b(x)
x1
u
b(x)
x2
f (u(x)) u(x)
x1
f (u(x)) u(x)
x1
u(x)
x1
u(x)
x1
= M RS(x).
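The invariance claimed in the theorem can be checked numerically. The following is a minimal sketch, assuming a Cobb-Douglas representation with illustrative parameters `a = 2`, `b = 3` and two monotone transformations (logarithm and doubling); the helper `mrs` and all parameter values are assumptions made for illustration, not part of the text.

```python
import math

def mrs(u, x1, x2, h=1e-6):
    """Numerical MRS: ratio of central-difference marginal utilities."""
    mu1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    mu2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return mu1 / mu2

a, b = 2.0, 3.0  # illustrative parameters (assumed)

def u(x1, x2):
    return x1**a * x2**b          # Cobb-Douglas representation

def u_log(x1, x2):
    return math.log(u(x1, x2))    # monotone transformation f(t) = ln t

def u_double(x1, x2):
    return 2 * u(x1, x2)          # monotone transformation f(t) = 2t

x1, x2 = 1.5, 2.5
exact = a * x2 / (b * x1)         # closed form a*x2 / (b*x1)
values = [mrs(f, x1, x2) for f in (u, u_log, u_double)]
```

All three entries of `values` should agree with `exact` up to numerical error, although the marginal utilities themselves differ across the three representations.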
4.4
you and she is 1.3 times happier than he is. From the standpoint of ordinal
utility, the only cardinal notion, which has quantitative meanings, is marginal
rate of substitution.
On the other hand, there is a standpoint of cardinal utility, asserting that utilities do have quantitative meanings. Without faith, it is hard to be convinced by the standpoint of cardinal utility, in particular by the assertion that utilities are comparable across individuals. However, it is true that this appeals to our intuition. One may say, for example, "If we cannot compare utilities across individuals, can't we compare the utility from an extra 10 dollars for a poor person with the utility from an extra 10 dollars for a rich person? That's absurd!" I somehow understand such claims, but I think it should be justified (if we want to) in a richer model with richer dimensions and descriptions of time and uncertainty, which are implicit and fixed in our current argument. Having said that, I will proceed with the standpoint of ordinal utility.
While this book maintains the standpoint of ordinal utility throughout, in later chapters the readers may wonder whether I'm using some notions of cardinal utility, particularly in the chapters about quasi-linear preferences, discounted utility and expected utility.
Let me emphasize I'm not. If it looks as if I'm using some notions of cardinal utility, it will be because a certain type of preference allows a class of natural forms of representations, which is a subset of all the possible utility representations of it, and I restrict attention to such natural ones.
For example, if a preference relation allows representation in the form
\[ u(x) = v_1(x_1) + v_2(x_2) \]
it is said to be additively separable. When we argue about quasi-linear preferences, discounted utility and expected utility, $v_1$ and $v_2$ are considered to have certain quantitative meanings as far as we restrict attention to the class of additively separable representations.
Of course, even if a preference has additively separable representations it is not the only class of representations of it. For example we can take the exponential transformation and obtain
\[ \hat{u}(x) = e^{u(x)} = e^{v_1(x_1)+v_2(x_2)} = e^{v_1(x_1)} e^{v_2(x_2)}, \]
and let $\hat{v}_1(x_1) = e^{v_1(x_1)}$, $\hat{v}_2(x_2) = e^{v_2(x_2)}$. Then we obtain a multiplicatively separable representation
\[ \hat{u}(x) = \hat{v}_1(x_1)\hat{v}_2(x_2). \]
Also we can take any arbitrary monotone transformation $f$ and obtain another representation
\[ \tilde{u}(x) = f(u(x)), \]
which is in general not additively separable.
With this point kept in mind, economists choose the explanation that the class of additive representations is the natural one, interpret $v_1$ to be the utility of Good 1 and $v_2$ to be the utility of Good 2, take the individual to care about the sum of the two, and call $v_1$ and $v_2$ "cardinal utility." We should note, however, that such additions and quantitative comparisons of utilities are limited to those within a given individual. Therefore, it still does not get into any kind of interpersonal comparability of utility, which requires a different level of faith as said above.
4.5
Exercises
Chapter 5
As before, let $X$ be the set of all the potentially available choice alternatives, which is the consumption set in the context of choosing consumptions. Let $\succsim$ denote the preference relation defined over $X$. Now suppose that an opportunity set $B$ is a subset of $X$. Then the best choice in $B$ for a given individual is defined as a maximal element in it according to his preference.
Definition 5.1 Say that $x^* \in B$ is a maximal element in $B$ for preference relation $\succsim$ if there is no $x \in B$ such that $x \succ x^*$.
When the preference satisfies completeness and transitivity, this is equivalent to saying that it is at least as good as any element in $B$. That is, one can also say that $x^* \in B$ is a maximal element in $B$ for preference relation $\succsim$ if $x^* \succsim x$ for all $x \in B$.
Now let us apply this to consumption choice under budget constraint.
Definition 5.2 Given price vector $p = (p_1, p_2)$ and income $w$, consumption vector $x^* = (x_1^*, x_2^*) \in B(p, w)$ is said to be a maximal element in budget set $B(p, w)$ if there is no $x \in B(p, w)$ such that $x \succ x^*$.
Let me illustrate this in Figure 5.1. Pick any point in the budget set, say $x$. Is it maximal? No, because we can find a point in the budget set which is above the indifference curve passing through $x$, such as $y$. Then how about $y$? It is not maximal, because we can find a point in the budget set which is above the indifference curve passing through $y$, such as $z$. And so on.
In the end, the consumer will choose a point like $x^*$. There is no point in the budget set which is above the indifference curve passing through $x^*$. Thus $x^*$ is a maximal element in the given budget set.
[Figure 5.1. Horizontal axis: Good 1; vertical axis: Good 2; marked points x, y, z.]
Now I depicted the maximal element $x^*$ on the budget line, but if preference does not meet monotonicity and allows the consumer to get full at some point, then the consumer might not spend all his income at his optimal choice.
However, we can show that under monotonicity any maximal element must
be on the budget line.
Proposition 5.1 If preference satisfies (weak) monotonicity then any maximal
element in the budget set must be on the budget line.
Proof. Suppose that there is a maximal element which is not on the budget line, at which the consumer does not spend all his income. Denote it by $x = (x_1, x_2)$, then we have $p_1 x_1 + p_2 x_2 < w$. Then we can take $y = (y_1, y_2)$ such that $y_1 > x_1$, $y_2 > x_2$ and $p_1 y_1 + p_2 y_2 \le w$.
From the weak monotonicity of preference we have $y \succ x$, which contradicts $x$ being maximal in the budget set.
Since monotonicity of preference is an innocuous assumption after making
appropriate interpretation as discussed before, we consider the case that any
maximal element is on the budget equation throughout.
Now look at Figure 5.1 again, then you might notice that there is just one maximal element here, which is $x^*$. In general, however, there may be more than one maximal element. For example, when preference is weakly convex but not strictly convex we may have a situation like Figure 5.2, in which all the points on the flat part of the indifference curve coinciding with the budget line are maximal elements. Also, if preference even fails to be weakly convex then, as in Figure 5.3, points that are distant from each other may all be maximal.
However, we can show that the maximal element is always unique when preference is strictly convex.
Figure 5.2: Weakly convex preference
[Figure 5.3. Horizontal axis: Good 1; vertical axis: Good 2; three marked points.]
5.2
There are two more points I would like you to notice about the maximal element as depicted in Figure 5.1. One is that it is not on either edge of the budget line, and the other is that at the maximal element the indifference curve passing through it is tangent to the budget line.
When the maximal element is strictly between the endpoints and the quantities of both Good 1 and Good 2 are positive it is said to be an interior solution. On the other hand, when the maximal element is on one of the edges as in Figure 5.4 it is called a corner solution.
Thus, let me say that the consumption choice meets the tangency condition when the corresponding indifference curve is tangent to the budget line at the maximal element. The tangency condition may not hold even when the solution is interior, when the corresponding indifference curve has a kink as in Figure 5.5.
Let me say that the consumption choice is smooth when it is an interior solution and meets the tangency condition.
Proposition 5.3 When preference is smooth consumption choice is smooth.
Let me omit the rigorous proof, but the idea is as follows.
Consider the case that the corresponding indifference curve is steeper than the budget line, like at $x$ as depicted in Figure 5.6. Then the local slope of the indifference curve is greater than the slope of the budget line. This means that the marginal rate of substitution of Good 2 for Good 1 is greater than the relative price of Good 1 for Good 2. That is, we have
\[ MRS(x) > \frac{p_1}{p_2} . \]
[Figure 5.4. Horizontal axis: Good 1; vertical axis: Good 2.]
Figure 5.5: Indifference curves with kinks
Recall that $MRS(x)$ is the amount of Good 2 the consumer is willing to give up in order to get one extra unit of Good 1. On the other hand, $p_1/p_2$ is the amount of Good 2 he has to give up in order to get one extra unit of Good 1, that is, the opportunity cost of one extra unit of Good 1 measured in Good 2. In the current situation the amount of sacrifice he is willing to make in order to get one extra unit of Good 1 is greater than the amount of sacrifice he has to make. Thus he will increase the amount of Good 1 by sacrificing the consumption of Good 2.
Consider the opposite case as well. Consider the case that the corresponding indifference curve is flatter than the budget line, like at $x'$ as depicted in Figure 5.6. Then the local slope of the indifference curve is smaller than the slope of the budget line. This means that the marginal rate of substitution of Good 2 for Good 1 is smaller than the relative price of Good 1 for Good 2. That is, we have
\[ MRS(x') < \frac{p_1}{p_2} . \]
In the current situation the amount of sacrifice he is willing to make in order to get one extra unit of Good 1 is smaller than the amount of sacrifice he has to make. Or, equivalently, by taking the inverse in the above inequality, the amount of Good 1 he is willing to give up in order to get one extra unit of Good 2 is greater than the amount of Good 1 he has to sacrifice. Thus he will increase the amount of Good 2 by sacrificing the consumption of Good 1.
When the corresponding indifference curve is tangent to the budget line, like at $x^*$ as depicted in Figure 5.6, the local slope of the indifference curve is equal to the slope of the budget line. That is, we have equality
\[ MRS(x^*) = \frac{p_1}{p_2} . \]
Here the amount of sacrifice he is willing to make in order to get one extra unit of Good 1 is equal to the amount of sacrifice he has to make. If he moves to the upper left on the budget line he arrives at the first situation, in which the amount of Good 1 is too small, and if he moves to the lower right on the budget line he arrives at the second situation, in which the amount of Good 1 is too large.
Therefore, the optimal consumption is determined to be the point on the budget line at which the marginal rate of substitution and relative price are equalized. As the marginal rate of substitution moves between 0 and infinity monotonically and continuously under smoothness of preference, there is a unique point $x^*$ with $x_1^*, x_2^* > 0$ which satisfies the above equality.
Thus, smooth consumption is determined by the tangency condition and the maximal element $x^*$ satisfies the equation
\[ MRS(x^*) = \frac{p_1}{p_2} . \]
[Figure 5.6. Horizontal axis: Good 1; vertical axis: Good 2; three marked points on the budget line.]
By solving the two equations, the tangency condition and the budget equation $p_1 x_1^* + p_2 x_2^* = w$, for the two unknowns we can find $x^* = (x_1^*, x_2^*)$.
5.2.1
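The marginal rate of substitution is
\[ MRS(x) = \frac{\partial u(x)/\partial x_1}{\partial u(x)/\partial x_2} = \frac{a/x_1}{b/x_2}, \]
since $\partial u(x)/\partial x_1 = a/x_1$ and $\partial u(x)/\partial x_2 = b/x_2$. Hence the tangency condition is
\[ \frac{a x_2}{b x_1} = \frac{p_1}{p_2} . \]
By rewriting this we obtain a linear relationship $x_2 = \frac{b p_1}{a p_2} x_1$. Plug this into the budget equation, then we have
\[ p_1 x_1 + p_2 x_2 = p_1 x_1 + p_2 \frac{b p_1}{a p_2} x_1 = \frac{(a + b) p_1}{a} x_1 = w . \]
Hence
\[ x_1 = \frac{a}{a+b} \frac{w}{p_1}, \]
and by plugging this into $x_2 = \frac{b p_1}{a p_2} x_1$ we obtain
\[ x_2 = \frac{b}{a+b} \frac{w}{p_2} . \]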
Consumption choice given by Cobb-Douglas preference is that first the consumer splits his income between Good 1 and Good 2 at the proportion a versus
b and then divides the income allocated to each good by its price. This proportionality is observed in data in a quite robust manner, and this is the reason why
Cobb-Douglas preference (or its generalization) is frequently used in application.
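The constant-share property described above can be illustrated with a few lines of code. This is a minimal sketch under assumed illustrative values for $a$, $b$, prices and income; the function name `cobb_douglas_demand` is hypothetical.

```python
def cobb_douglas_demand(a, b, p1, p2, w):
    """Demand for u(x) = a ln x1 + b ln x2: income is split a : b, then divided by price."""
    x1 = a / (a + b) * w / p1
    x2 = b / (a + b) * w / p2
    return x1, x2

a, b = 1.0, 3.0                 # illustrative preference parameters (assumed)
p1, p2, w = 2.0, 5.0, 100.0     # illustrative prices and income (assumed)

x1, x2 = cobb_douglas_demand(a, b, p1, p2, w)
spending = p1 * x1 + p2 * x2    # should exhaust income
mrs = a * x2 / (b * x1)         # should equal the relative price p1 / p2
share1 = p1 * x1 / w            # should equal a / (a + b)
```

With these numbers the consumer spends a quarter of income on Good 1 and three quarters on Good 2, whatever the prices are.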
5.3
[Figure. Horizontal axis: Good 1; vertical axis: Good 2.]
Figure 5.8: Perfect substitution: Case 2
Figure 5.9: Perfect substitution: Case 3
5.4
Finally let us consider consumption choice given by preferences exhibiting perfect complementarity, which is an example of an interior solution to which the tangency condition does not apply. Here let us represent the preference exhibiting perfect complementarity by, say, $u(x) = \min\{x_1/a, x_2/b\}$.
The indifference curves given by this preference are L-shaped, with their kinks along the line $x_1/x_2 = a/b$. Therefore the maximal element is obtained as the intersection of the locus of kinks $x_1/x_2 = a/b$ and the budget line $p_1 x_1 + p_2 x_2 = w$, as depicted in Figure 5.10. By solving these two equations we obtain
\[ x_1 = \frac{a w}{a p_1 + b p_2}, \qquad x_2 = \frac{b w}{a p_1 + b p_2} . \]
Here the tangency condition does not apply because the corresponding indifference curve is kinked at the maximal element and we cannot take the marginal rate of substitution at this point.¹
5.5
Demand function
Hereafter, given price $p = (p_1, p_2)$ and income $w$, denote the maximal element in $B(p, w)$ by
\[ x(p, w) = (x_1(p, w), x_2(p, w)) \]
and call it the demand function. I restate it like this because I want to emphasize that consumption choice may vary as the price-income pair $(p, w)$ varies.
¹ This may fall under a generalized version of the tangency condition when we generalize the notion of derivative, though.
Figure 5.10: Perfect complementarity
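For example, the demand function given by Cobb-Douglas preference is
\[ x(p, w) = \left( \frac{a}{a+b} \frac{w}{p_1}, \; \frac{b}{a+b} \frac{w}{p_2} \right) . \]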
In order to be able to define the demand function the maximal element has to be unique, but what if there are multiple optima? For example, consider preference exhibiting perfect substitution represented in the form $u(x) = a x_1 + b x_2$. Then what would correspond to the demand function is
\[ x(p, w) = \begin{cases} \left( \frac{w}{p_1}, 0 \right) & \text{if } \frac{p_1}{p_2} < \frac{a}{b} \\ \text{all the points on the budget line} & \text{if } \frac{p_1}{p_2} = \frac{a}{b} \\ \left( 0, \frac{w}{p_2} \right) & \text{if } \frac{p_1}{p_2} > \frac{a}{b} \end{cases} \]
This maps each point to a set, rather than a point to a point. It is better to call this a correspondence. So more generally it is better to call the mapping a demand correspondence rather than a demand function, but as we mostly consider the case that the maximal element is uniquely determined given each price-income pair, we will call it the demand function.
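The set-valued nature of this mapping can be made concrete in code. The sketch below is one possible encoding, assuming the perfect-substitution representation $u(x) = a x_1 + b x_2$; the function name and the convention of returning the two endpoints of the budget line in the knife-edge case are assumptions made for illustration.

```python
def perfect_substitutes_demand(a, b, p1, p2, w):
    """Maximal elements for u(x) = a*x1 + b*x2.

    A unique optimum is returned as a one-element list. In the knife-edge
    case p1/p2 == a/b the two endpoints of the budget line are returned,
    standing for the whole segment between them.
    """
    if p1 * b < p2 * a:            # p1/p2 < a/b: Good 1 is the better deal
        return [(w / p1, 0.0)]
    if p1 * b > p2 * a:            # p1/p2 > a/b: Good 2 is the better deal
        return [(0.0, w / p2)]
    return [(w / p1, 0.0), (0.0, w / p2)]

only_good_1 = perfect_substitutes_demand(2, 1, 1, 1, 10)
only_good_2 = perfect_substitutes_demand(1, 2, 1, 1, 10)
whole_line = perfect_substitutes_demand(1, 1, 1, 1, 10)
```

Returning a list rather than a single pair is what distinguishes a correspondence from a function here.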
5.6
So far we have not specified the source of income $w$, but here let us consider consumption choice and demand in an exchange economy in which each consumer's income is given as the market value of his initial endowment.
Given a price vector $p = (p_1, p_2)$, the income of a consumer who has initial endowment $e = (e_1, e_2)$ is $p_1 e_1 + p_2 e_2$. Therefore the budget constraint is
\[ p_1 x_1 + p_2 x_2 \le p_1 e_1 + p_2 e_2 . \]
Taking the initial endowment $e$ to be fixed, the budget set is denoted by $B(p)$, where
\[ B(p) = \{ x \in \mathbb{R}^2_+ : p_1 x_1 + p_2 x_2 \le p_1 e_1 + p_2 e_2 \} . \]
Here we obtain the demand function denoted by
\[ x(p) = (x_1(p), x_2(p)), \]
as the maximal element in $B(p)$ for each given $p$. It is obtained by replacing $w$ by $p_1 e_1 + p_2 e_2$ in the previous argument.
For example, the demand function given by Cobb-Douglas preference represented in the form $u(x) = a \ln x_1 + b \ln x_2$ is
\[ (x_1(p), x_2(p)) = \left( \frac{a}{a+b} \frac{p_1 e_1 + p_2 e_2}{p_1}, \; \frac{b}{a+b} \frac{p_1 e_1 + p_2 e_2}{p_2} \right) . \]
Here if demand (x1 (p), x2 (p)) is in the left of the initial endowment point,
that is,
x1 (p) < e1 , x2 (p) > e2
then the consumer is buying Good 2 by selling Good 1 under given price p =
(p1 , p2 ). Similarly for the opposite case.
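A small numerical illustration of buying and selling relative to the endowment, assuming Cobb-Douglas preference with equal weights and an illustrative endowment; the function name `endowment_demand` and all numbers are assumptions.

```python
def endowment_demand(a, b, p1, p2, e1, e2):
    """Cobb-Douglas demand when income is the market value of the endowment."""
    w = p1 * e1 + p2 * e2
    return a / (a + b) * w / p1, b / (a + b) * w / p2

a, b = 1.0, 1.0          # illustrative weights (assumed)
e1, e2 = 8.0, 2.0        # illustrative endowment (assumed)
p1, p2 = 1.0, 2.0        # illustrative prices (assumed)

x1, x2 = endowment_demand(a, b, p1, p2, e1, e2)
net_trade = (x1 - e1, x2 - e2)                           # negative entry: selling
trade_value = p1 * net_trade[0] + p2 * net_trade[1]      # purchases are financed by sales
scaled = endowment_demand(a, b, 10 * p1, 10 * p2, e1, e2)  # only relative prices matter
```

Here the consumer sells Good 1 and buys Good 2, and multiplying both prices by the same factor leaves the demand unchanged.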
5.7
subject to
\[ p_1 x_1 + p_2 x_2 \le w . \]
Because the maximized value depends on the price-income pair $(p, w)$ we write it as
\[ v(p, w) = \max_{x :\, p_1 x_1 + p_2 x_2 \le w} u(x) \]
subject to
\[ p_1 x_1 + p_2 x_2 \le w \]
then the solution must be the same. Consider for example Cobb-Douglas preference, then we must obtain the same solution whether we maximize $u(x) = x_1^a x_2^b$ or we maximize $\hat{u}(x) = a \ln x_1 + b \ln x_2$.
In the above I first introduced the notion of maximal element without using
utility maximization, because the description of utility maximization is only
for convenience to the analyst. Having said that, hereafter we will borrow the
description by utility maximization in order to be operational.
5.8
subject to
\[ u(x) \ge \bar{u} . \]
[Figure. Horizontal axis: Good 1; vertical axis: Good 2; marked points x, y, z.]
\[ MRS(x^*) = \frac{p_1}{p_2} . \]
On the other hand, we are following the constraint that consumption at least satisfies the given utility level $\bar{u}$. Since it is enough to satisfy the minimally necessary level of utility here, we obtain another equation
\[ u(x^*) = \bar{u} . \]
By solving these two equations we obtain the expenditure-minimizing point. This is called compensated demand under $p = (p_1, p_2)$ given utility level $\bar{u}$.
Let us do this with Cobb-Douglas preference which is represented by u(x) =
a ln x1 + b ln x2 .
As obtained in the previous arguments, marginal rate of substitution here is
\[ MRS(x) = \frac{a x_2}{b x_1} . \]
h2 (p, u) =
h1 (p, u)
=
=
Proposition 5.4 Let $h(p, u)$ and $e(p, u)$ denote the compensated demand function and the expenditure function defined for utility representation $u(x)$, respectively. Let $f$ be any monotone transformation, and let $\hat{h}(p, v)$ and $\hat{e}(p, v)$ denote the compensated demand function and the expenditure function defined for representation $v(x) = f(u(x))$. Then it holds
\[ \hat{h}_1(p, v) = h_1(p, f^{-1}(v)), \]
\[ \hat{h}_2(p, v) = h_2(p, f^{-1}(v)), \]
\[ \hat{e}(p, v) = e(p, f^{-1}(v)) . \]
5.9
Exercises
Chapter 6
Demand analysis
In this chapter let us consider how demand responds to price and income
changes, and welfare evaluation of such changes.
6.1
We can classify goods according to how demands for them respond to income
change.
Definition 6.1 A good is said to be normal if higher (lower) income leads to
larger (smaller) demand for it. A good is said to be inferior if higher (lower)
income leads to smaller (larger) demand for it.
More mathematically speaking, let's say Good 1 is normal if for a slight amount of extra income $\Delta w$ it holds
\[ x_1(p, w + \Delta w) - x_1(p, w) > 0 . \]
Now by making this slight amount $\Delta w$ tend to be infinitesimally small, the condition means that the partial derivative of the demand function for Good 1 by $w$ is positive, that is,
\[ \frac{\partial x_1(p, w)}{\partial w} > 0 . \]
Likewise, the condition for an inferior good is that the inequality above is reversed.
For example, consider the demand function generated by Cobb-Douglas preference represented by $u(x) = a \ln x_1 + b \ln x_2$,
\[ x_1(p, w) = \frac{a}{a+b} \frac{w}{p_1}, \qquad x_2(p, w) = \frac{b}{a+b} \frac{w}{p_2} . \]
Since income $w$ appears in the numerator in the formula for $x_1(p, w)$, and similarly for $x_2(p, w)$, when it increases the demand for both goods increases. Hence both Good 1 and Good 2 are normal.
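[Figure 6.1. Horizontal axis: Good 1; vertical axis: Good 2; budget lines with intercepts w/p1 and w/p2 before, and (w+Δw)/p1 and (w+Δw)/p2 after, the income increase.]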
It sounds puzzling that when one has more income he consumes less of some good, but it is well possible under standard preferences satisfying completeness, transitivity, monotonicity and convexity. See for example Figure 6.1, in which as income increases from $w$ to $w + \Delta w$ the demand for Good 1 decreases. This is intuitive for a good that you consume only when you are poor and do not consume when you are rich. Note, however, that not all goods can be inferior simultaneously (why?).
Let me give an example of a demand function allowing the existence of an inferior good.
\[ x_1(p, w) = \frac{a}{a+b} \frac{w (1 - e^{-w})}{p_1}, \qquad x_2(p, w) = \frac{b}{a+b} \frac{w e^{-w}}{p_2} \]
When you take the derivative of the demand function for Good 1 by income you get
\[ \frac{\partial x_1(p, w)}{\partial w} = \frac{a}{a+b} \frac{1 - e^{-w} + w e^{-w}}{p_1} > 0, \]
which implies Good 1 is always normal. On the other hand, when you take the derivative of the demand function for Good 2 by income you get
\[ \frac{\partial x_2(p, w)}{\partial w} = \frac{b}{a+b} \frac{(1 - w) e^{-w}}{p_2}, \]
which is negative when $w > 1$, meaning Good 2 is inferior there.
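The sign change of the income derivative can be checked numerically. The sketch below reads the example as $x_2(p, w) = \frac{b}{a+b} \frac{w e^{-w}}{p_2}$ (the exponent sign is an assumption about the intended formula) and uses illustrative values $a = b = p_2 = 1$; the helper names are hypothetical.

```python
import math

def x2_demand(w, a=1.0, b=1.0, p2=1.0):
    """Assumed reading of the example: x2 = b/(a+b) * w * exp(-w) / p2."""
    return b / (a + b) * w * math.exp(-w) / p2

def income_slope(w, h=1e-6):
    """Central-difference approximation of the derivative of x2 with respect to w."""
    return (x2_demand(w + h) - x2_demand(w - h)) / (2 * h)

slope_low = income_slope(0.5)    # below w = 1: demand rises with income
slope_turn = income_slope(1.0)   # at w = 1: the derivative vanishes
slope_high = income_slope(2.0)   # above w = 1: demand falls with income
```

Under this reading Good 2 behaves as a normal good at low income and as an inferior good once income exceeds 1.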
6.2
Next let us consider how demand for each good responds to own price change.
That is, we consider how demand for Good 1 changes as the price of Good 1
changes, and how demand for Good 2 changes as the price of Good 2 changes.
6.3
Next let us consider how demand for each good responds to price change of
another good.
Definition 6.3 A good is said to be a gross substitute of another one if the
demand for it increases as the price of the second increases.
A good is said to be a gross complement of another one if the demand for it
decreases as the price of the second increases.
More mathematically speaking, let's say Good 1 is a gross substitute of Good 2 if for a slight increase $\Delta p_2$ in the price of Good 2 it holds
\[ x_1(p_1, p_2 + \Delta p_2, w) - x_1(p_1, p_2, w) > 0 . \]
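[Figure 6.2. Horizontal axis: Good 1; vertical axis: Good 2; budget lines with vertical intercept w/p2 and horizontal intercepts w/p1 and w/(p1+Δp1).]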
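Consider preference represented by $u(x) = a\sqrt{x_1} + b\sqrt{x_2}$. Let us derive the demand function, as it is the first time to see this one. Since this preference is smooth we can use the tangency condition. As the marginal utility of Good 1 is $\frac{\partial (a\sqrt{x_1} + b\sqrt{x_2})}{\partial x_1} = \frac{a}{2\sqrt{x_1}}$ and that of Good 2 is $\frac{\partial (a\sqrt{x_1} + b\sqrt{x_2})}{\partial x_2} = \frac{b}{2\sqrt{x_2}}$, the marginal rate of substitution is
\[ MRS(x) = \frac{a/(2\sqrt{x_1})}{b/(2\sqrt{x_2})} = \frac{a\sqrt{x_2}}{b\sqrt{x_1}} . \]
Since $MRS(x) = \frac{p_1}{p_2}$ at the maximal element, we have
\[ \frac{a\sqrt{x_2}}{b\sqrt{x_1}} = \frac{p_1}{p_2} . \]
Solve this for $x_2$, then we obtain $x_2 = \frac{b^2 p_1^2}{a^2 p_2^2} x_1$. Plug this into the budget equation $p_1 x_1 + p_2 x_2 = w$ and solve for $x_1$. Plug this again into the relationship between $x_2$ and $x_1$, then we obtain $x_2$. Thus we obtain
\[ x_1(p, w) = \frac{a^2 w}{p_1 \left( a^2 + b^2 \frac{p_1}{p_2} \right)}, \qquad x_2(p, w) = \frac{b^2 w}{p_2 \left( a^2 \frac{p_2}{p_1} + b^2 \right)} . \]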
\[ x_1(p, w) = \frac{a w}{a p_1 + b p_2}, \qquad x_2(p, w) = \frac{b w}{a p_1 + b p_2} . \]
Since $p_2$ is in the denominator in the formula for $x_1(p, w)$, as Good 2 becomes more expensive the demand for Good 1 decreases. Likewise, since $p_1$ is in the denominator in the formula for $x_2(p, w)$, as Good 1 becomes more expensive the demand for Good 2 decreases. Hence Good 1 and Good 2 are gross complements of each other.
Also, let's go back to the demand function given by Cobb-Douglas preference, then $p_2$ is not in the formula for $x_1(p, w)$. Hence the demand for Good 1 is not affected by the price of Good 2. Likewise, $p_1$ is not in the formula for $x_2(p, w)$. Hence the demand for Good 2 is not affected by the price of Good 1.
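The three cases discussed in this section can be compared side by side by raising the price of Good 2 slightly and looking at the sign of the change in demand for Good 1. This is a minimal sketch with illustrative values $a = b = 1$, $p_1 = p_2 = 1$, $w = 10$; the function names are hypothetical, and the demand formulas are the ones for the square-root, perfect-complementarity and Cobb-Douglas representations.

```python
def sqrt_x1(p1, p2, w, a=1.0, b=1.0):
    """Demand for Good 1 under u(x) = a*sqrt(x1) + b*sqrt(x2)."""
    return a**2 * w / (p1 * (a**2 + b**2 * p1 / p2))

def complements_x1(p1, p2, w, a=1.0, b=1.0):
    """Demand for Good 1 under u(x) = min{x1/a, x2/b}."""
    return a * w / (a * p1 + b * p2)

def cobb_douglas_x1(p1, p2, w, a=1.0, b=1.0):
    """Demand for Good 1 under u(x) = a ln x1 + b ln x2 (p2 does not enter)."""
    return a / (a + b) * w / p1

def cross_price_effect(demand, p1=1.0, p2=1.0, w=10.0, dp2=0.1):
    """Change in demand for Good 1 when the price of Good 2 rises by dp2."""
    return demand(p1, p2 + dp2, w) - demand(p1, p2, w)

effect_sqrt = cross_price_effect(sqrt_x1)                 # positive: gross substitute
effect_complements = cross_price_effect(complements_x1)   # negative: gross complement
effect_cobb_douglas = cross_price_effect(cobb_douglas_x1) # zero: neither
```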
6.4
Elasticity of demand
By the way, we have to adjust units in order to compare how demands are
sensitive to income and price change across goods, otherwise such comparison
will be meaningless. In economics we use a concept called elasticity.
Elasticity of demand for some good to the change in some variable is the percentage change in quantity of the good demanded in response to a one percent change in the variable.
To illustrate let me continue to focus on Good 1. Then income elasticity of demand for Good 1 is the percentage change in quantity of Good 1 demanded in response to a one percent change in income, which is given by
\[ e_{1,w} = \frac{\Delta x_1 / x_1}{\Delta w / w}, \]
or, for an infinitesimal change,
\[ e_{1,w} = \frac{\partial x_1}{\partial w} \frac{w}{x_1} . \]
For example, when demand for Good 1 increases from 100 to 110 as income increases from 500 to 600 the income elasticity is
\[ \frac{(110 - 100)/100}{(600 - 500)/500} = \frac{0.1}{0.2} = 0.5 . \]
\[ e_{1,p_1} = \frac{\Delta x_1 / x_1}{\Delta p_1 / p_1} \]
For example, when demand for Good 1 decreases from 100 to 90 as its price rises from 20 to 30 the price elasticity is
\[ \frac{(90 - 100)/100}{(30 - 20)/20} = \frac{-0.1}{0.5} = -0.2 . \]
By the way, when the (absolute value of) price elasticity is smaller than 1
the good is said to be a necessity and when it is greater than 1 it is said to be
a luxury.
Likewise, cross price elasticity of demand for Good 1 against Good 2 is the percentage change in quantity of Good 1 demanded in response to a one percent change in the price of Good 2, which is given by
\[ e_{1,p_2} = \frac{\Delta x_1 / x_1}{\Delta p_2 / p_2}, \]
or, for an infinitesimal change,
\[ e_{1,p_2} = \frac{\partial x_1}{\partial p_2} \frac{p_2}{x_1} . \]
For example, when demand for Good 1 increases from 100 to 120 as the price of Good 2 rises from 20 to 22 the cross price elasticity is
\[ \frac{(120 - 100)/100}{(22 - 20)/20} = \frac{0.2}{0.1} = 2 . \]
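The three numerical examples of this section share one computation, the ratio of two percentage changes. A minimal sketch follows; the helper name `arc_elasticity` is hypothetical.

```python
def arc_elasticity(x_old, x_new, z_old, z_new):
    """Percentage change in quantity x divided by percentage change in variable z."""
    return ((x_new - x_old) / x_old) / ((z_new - z_old) / z_old)

income_elasticity = arc_elasticity(100, 110, 500, 600)     # quantity vs income
own_price_elasticity = arc_elasticity(100, 90, 20, 30)     # quantity vs own price
cross_price_elasticity = arc_elasticity(100, 120, 20, 22)  # quantity vs other price
```

Because both the numerator and the denominator are unit-free ratios, the result does not depend on the units in which quantity, income or price are measured.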
6.5
Now let me come back to the reason why I put the word "gross" on substitute and complement in the previous definition.
First of all, a price change of a good has two effects: one is the change of relative price against other goods and the other is the change of real income. When some good becomes more expensive, first it becomes relatively more expensive compared to other goods (rise of relative price) and second the quantity of it you can buy within your income decreases (fall of real income). The first effect is called the substitution effect and the second is called the income effect. Gross demand change now means the whole change without making a distinction between the two.
See Figure 6.3. Let $w$ be the income, let $p = (p_1, p_2)$ denote the price pair before the price change, and let $x = (x_1, x_2)$ denote the consumption choice before the price change. Suppose the price of Good 1 goes up from $p_1$ to $p_1 + \Delta p_1$. Hence the price pair after the change is $(p_1 + \Delta p_1, p_2)$. Then the new budget line passes through $\left( \frac{w}{p_1 + \Delta p_1}, 0 \right)$ and $\left( 0, \frac{w}{p_2} \right)$, and its slope is $-\frac{p_1 + \Delta p_1}{p_2}$. Note that the $x_2$-intercept has not changed. Denote the demand after the price change by $x' = (x_1', x_2')$.
Gross demand change for Good 1 is $x_1' - x_1$ and for Good 2 it is $x_2' - x_2$.
Now in order to separate the substitution effect from the income effect, draw a line with slope $-\frac{p_1 + \Delta p_1}{p_2}$ which is tangent to the indifference curve passing through $x$, the demand before the price change.
Let $\hat{x}$ denote the point at which the line with slope $-\frac{p_1 + \Delta p_1}{p_2}$ is tangent to the corresponding indifference curve. Then it is the expenditure-minimizing point which yields the same welfare level as $x$ does. That is, $\hat{x}$ is the compensated demand under $(p_1 + \Delta p_1, p_2)$ corresponding to the welfare level given by $x$.
Since the change from the original demand $x$ to the compensated demand $\hat{x}$ is made by suitably compensating income so that the consumer can maintain the same welfare level after the price change, it reflects only the change of relative price. Therefore the move from $x$ to $\hat{x}$, that is $\hat{x} - x = (\hat{x}_1 - x_1, \hat{x}_2 - x_2)$, is the substitution effect of the price change.
Figure 6.3: Substitution and income effect of the price change of Good 1
On the other hand, since the line with slope $-\frac{p_1 + \Delta p_1}{p_2}$ which passes through $\hat{x}$ is parallel to the budget line after the price change, the move from $\hat{x}$ to $x'$ reflects only the change of real income. Therefore, the vector $x' - \hat{x} = (x_1' - \hat{x}_1, x_2' - \hat{x}_2)$ is the income effect of the price change.
Notice that the substitution effect of a price increase of Good 1 is always negative (non-positive, precisely) on Good 1 and positive (non-negative, precisely) on Good 2. Since indifference curves are downward-sloping, the increase of the relative price of Good 1 moves the point meeting the tangency condition in the upper-left direction. Therefore the increase of the price of Good 1 always decreases the compensated demand for Good 1 and increases the compensated demand for Good 2.
More generally, we have
Proposition 6.1 When the price of some good goes up, its substitution effect on itself is non-positive and that on the other goods is non-negative.
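Let me explain this analytically. From the duality relationship stated in the last chapter the compensated demand for Good 1, $h_1(p, u)$, is given by
\[ h_1(p, u) = x_1(p, e(p, u)) . \]
By taking the partial derivative of this with regard to $p_1$ we obtain
\[ \frac{\partial h_1}{\partial p_1} = \frac{\partial x_1}{\partial p_1} + \frac{\partial x_1}{\partial w} \frac{\partial e}{\partial p_1} . \]
From the definition of expenditure function $e(p, u) = p_1 h_1(p, u) + p_2 h_2(p, u)$ we have
\begin{align*}
\frac{\partial e}{\partial p_1} &= \frac{\partial (p_1 h_1 + p_2 h_2)}{\partial p_1} \\
&= h_1 + p_1 \frac{\partial h_1}{\partial p_1} + p_2 \frac{\partial h_2}{\partial p_1} \\
&= h_1 + p_2 \left( \frac{p_1}{p_2} \frac{\partial h_1}{\partial p_1} + \frac{\partial h_2}{\partial p_1} \right) .
\end{align*}
Since the tangency condition $MRS(x) = \frac{p_1}{p_2}$ is met at $x(p, e(p, u)) = x$ we can rewrite the above formula into
\[ \frac{\partial e}{\partial p_1} = h_1 + p_2 \left( MRS(x) \frac{\partial h_1}{\partial p_1} + \frac{\partial h_2}{\partial p_1} \right) . \]
Now notice that $(MRS(x), 1)$ is the normal vector to the corresponding indifference curve at $x$, since $(1, -MRS(x))$ is the tangent vector of the curve at $x$ from the definition of marginal rate of substitution. Because the compensated demand moves along this indifference curve, we see that vector $(MRS(x), 1)$ and vector $\left( \frac{\partial h_1}{\partial p_1}, \frac{\partial h_2}{\partial p_1} \right)$ are orthogonal to each other. Hence the inner product of the two is zero, that is, $MRS(x) \frac{\partial h_1}{\partial p_1} + \frac{\partial h_2}{\partial p_1} = 0$. Thus we obtain
\[ \frac{\partial e}{\partial p_1} = h_1 . \]
Plugging this into the derivative of $h_1$ above and rearranging, we obtain
\[ \frac{\partial x_1}{\partial p_1} = \frac{\partial h_1}{\partial p_1} - \frac{\partial x_1}{\partial w} h_1, \]
in which $\frac{\partial x_1}{\partial p_1}$ explains the gross effect of price change, $\frac{\partial h_1}{\partial p_1}$ explains the substitution effect, and $-\frac{\partial x_1}{\partial w} h_1$ explains the income effect. This formula is called the Slutsky equation.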
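Likewise, we obtain Slutsky equations for the other combinations
\begin{align*}
\frac{\partial x_2}{\partial p_1} &= \frac{\partial h_2}{\partial p_1} - \frac{\partial x_2}{\partial w} h_1, \\
\frac{\partial x_1}{\partial p_2} &= \frac{\partial h_1}{\partial p_2} - \frac{\partial x_1}{\partial w} h_2, \\
\frac{\partial x_2}{\partial p_2} &= \frac{\partial h_2}{\partial p_2} - \frac{\partial x_2}{\partial w} h_2 .
\end{align*}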
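The Slutsky decomposition can be verified numerically for a concrete preference. The sketch below assumes Cobb-Douglas preference $u(x) = a \ln x_1 + b \ln x_2$ with illustrative values $a = 1$, $b = 2$, obtains the expenditure function by inverting the indirect utility function, and builds the compensated demand through the duality relation $h(p, u) = x(p, e(p, u))$. All function names and numbers are assumptions made for illustration.

```python
import math

a, b = 1.0, 2.0  # illustrative Cobb-Douglas weights (assumed)

def x(p1, p2, w):
    """Demand function."""
    return a / (a + b) * w / p1, b / (a + b) * w / p2

def v(p1, p2, w):
    """Indirect utility: utility evaluated at the demanded bundle."""
    x1, x2 = x(p1, p2, w)
    return a * math.log(x1) + b * math.log(x2)

def e(p1, p2, u):
    """Expenditure function: v(p, w) = (a + b) ln w + v(p, 1), solved for w."""
    return math.exp((u - v(p1, p2, 1.0)) / (a + b))

def h(p1, p2, u):
    """Compensated demand via duality: h(p, u) = x(p, e(p, u))."""
    return x(p1, p2, e(p1, p2, u))

def d(f, t, step=1e-5):
    """Central-difference derivative of a one-variable function."""
    return (f(t + step) - f(t - step)) / (2 * step)

p1, p2, w = 2.0, 3.0, 12.0
u0 = v(p1, p2, w)

dx1_dp1 = d(lambda t: x(t, p2, w)[0], p1)    # gross effect
dh1_dp1 = d(lambda t: h(t, p2, u0)[0], p1)   # own substitution effect
dh2_dp1 = d(lambda t: h(t, p2, u0)[1], p1)   # cross substitution effect
dx1_dw = d(lambda t: x(p1, p2, t)[0], w)     # income response
slutsky_gap = dx1_dp1 - (dh1_dp1 - dx1_dw * h(p1, p2, u0)[0])
```

In this example `slutsky_gap` should be zero up to numerical error, the own substitution effect is negative and the cross substitution effect is positive, in line with Proposition 6.1.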
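[Figure 6.4. Horizontal axis: Good 1; vertical axis: Good 2; budget lines with vertical intercept w/p2 and horizontal intercepts w/p1 and w/(p1+Δp1); marked points include the compensated demand.]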
Example 6.1 (Giffen good): As depicted in Figure 6.4, consider the case that Good 1 is a Giffen good, that is, demand for it is larger even though its price goes up.
As stated above, the substitution effect on Good 1 is always negative (at least in the weak sense), that is, $\hat{x}_1 - x_1 \le 0$. However, $|x_1' - \hat{x}_1| > |\hat{x}_1 - x_1|$ shows that the income effect is large enough to overcome the substitution effect. Here Good 1 is an inferior good, which exhibits a negative income effect in the sense that as income is smaller it is consumed more. Moreover, the negativity of the income effect is so severe that it overcomes the substitution effect and leads to the gross demand increase.
Thus, a Giffen good is characterized as an inferior good such that its negativity of income effect is severe enough to overcome the substitution effect on itself.
Example 6.2 (Gross complement): I wrote that the substitution effect of a price increase of some good on another good is always greater than or equal to zero. That is, in a "pure" sense every good is a substitute of any other good. How can it be consistent with the existence of gross complements?
Reconsider the example of preference exhibiting perfect complementarity. As depicted in Figure 6.5, consider that the price of Good 1 goes up from $p_1$ to $p_1 + \Delta p_1$. Then the compensated demand under $(p_1 + \Delta p_1, p_2)$ which gives the same welfare level as $x$ does is $x$ itself. That is, the substitution effect is zero and the gross demand change consists only of the income effect.
More generally, one is a gross complement of another if the income effect of a price increase of the other is greater than the substitution effect of it, and one is a gross substitute of another otherwise.
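[Figure 6.5. Horizontal axis: Good 1; vertical axis: Good 2; budget lines with vertical intercept w/p2 and horizontal intercepts w/p1 and w/(p1+Δp1); the compensated demand coincides with x.]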
6.6
Consumers may gain or lose from price changes. If the government should compensate consumers' loss due to a price change or tax consumers' gain due to a price change in a lump-sum manner, how much should it be? Here we consider how to evaluate such gain and loss in terms of income.
6.6.1
Let me illustrate for the case that the price of Good 1 increases by $\Delta p_1$, using Figure 6.6. Let $x = x(p, w)$ denote the demand under the price before the change $p = (p_1, p_2)$, and let $x' = x(p', w)$ denote the demand under the price after the change $p' = (p_1 + \Delta p_1, p_2)$. Then compensated demand is $\hat{x} = h(p', u(x(p, w)))$, which minimizes expenditure under the price after the change as far as it yields the same welfare level as before. Then the budget line corresponding to the minimal necessary income is the dotted line passing through $\hat{x}$, and the income corresponding to this is $e(p', u(x(p, w))) = w + CV$. That is, this budget line is obtained by shifting the budget line after the price change upward by $CV/p_2$ (or to the right by $CV/(p_1 + \Delta p_1)$).
The other one is
Under the current price, if we are to maintain the given consumer's welfare level the same as after the potential price change, how much of income should be taken away or compensated?
Such value is called equivalent variation.
Again, let $p = (p_1, p_2)$ be the current price and let $p' = (p_1', p_2')$ be the one after the potential change. Denote the income by $w$.
Then the minimal income which is necessary for purchasing consumption under the current price in order to maintain the same welfare level as after the potential price change, $u(x(p', w))$, is given using the expenditure function in the form
\[ e(p, u(x(p', w))) . \]
Hence the equivalent variation is the difference between the original income and this,
\[ EV = w - e(p, u(x(p', w))) . \]
Using the indirect utility function it is given in the form
\[ v(p, w - EV) = v(p', w) . \]
Let me illustrate for the case that the price of Good 1 increases by $\Delta p_1$, using Figure 6.6. Let $x = x(p, w)$ denote the demand under the current price $p = (p_1, p_2)$, and let $x' = x(p', w)$ denote the demand under the price after the potential change $p' = (p_1 + \Delta p_1, p_2)$. Then compensated demand is $\tilde{x} = h(p, u(x(p', w)))$, which minimizes expenditure under the current price as far as it yields the same welfare level as after the potential price change. Then the budget line corresponding to the minimal necessary income is the dotted line passing through $\tilde{x}$, and the income corresponding to this is $e(p, u(x(p', w))) = w - EV$. Here $EV$ is positive, and this budget line is obtained by shifting the budget line under the current price downward by $EV/p_2$ (or to the left by $EV/p_1$).
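Both measures can be computed explicitly for a concrete preference. The sketch below assumes Cobb-Douglas preference $u(x) = a \ln x_1 + b \ln x_2$ with illustrative values $a = b = 1$ and a doubling of the price of Good 1, and adopts the sign conventions $e(p', u(x(p, w))) = w + CV$ and $EV = w - e(p, u(x(p', w)))$, so that both are positive for a price increase. Function names and numbers are assumptions made for illustration.

```python
import math

a, b = 1.0, 1.0  # illustrative Cobb-Douglas weights (assumed)

def demand(p1, p2, w):
    return a / (a + b) * w / p1, b / (a + b) * w / p2

def indirect_utility(p1, p2, w):
    x1, x2 = demand(p1, p2, w)
    return a * math.log(x1) + b * math.log(x2)

def expenditure(p1, p2, u):
    # v(p, w) = (a + b) ln w + v(p, 1), so solve v(p, w) = u for w
    return math.exp((u - indirect_utility(p1, p2, 1.0)) / (a + b))

w, p2 = 100.0, 1.0
p1_old, p1_new = 1.0, 2.0                     # price of Good 1 doubles
u_before = indirect_utility(p1_old, p2, w)
u_after = indirect_utility(p1_new, p2, w)

cv = expenditure(p1_new, p2, u_before) - w    # extra income needed after the change
ev = w - expenditure(p1_old, p2, u_after)     # income loss equivalent to the change
```

In this example `cv` is about 41.4 and `ev` is about 29.3; the two measures differ because they evaluate the same price change at different welfare levels.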
Let us compare compensated variation and equivalent variation, again for the case that the price of Good 1 increases from $p_1$ to $p_1 + \Delta p_1$. Note that CV
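Since it holds
\[ u(x(p_1 + \Delta p_1, p_2, w)) < u(x(a, p_2, w)) < u(x(p_1, p_2, w)) \]
at all $p_1 < a < p_1 + \Delta p_1$ and it holds $h_1(a, p_2, u(x(a, p_2, w))) = x_1(a, p_2, w)$ because of duality, we have
\[ CV = \int_{p_1}^{p_1 + \Delta p_1} h_1(a, p_2, u(x(p_1, p_2, w))) \, da \ge \int_{p_1}^{p_1 + \Delta p_1} x_1(a, p_2, w) \, da \]
when Good 1 is normal, since the compensated demand for a normal good increases with the utility level.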
[Figure: consumption points $x$, $x'$, $\hat{x}$ and $\tilde{x}$ with the associated budget lines; horizontal axis Good 1, vertical axis Good 2.]
Since it holds
$$u(x(p_1 + \Delta p_1, p_2, w)) < u(x(a, p_2, w)) < u(x(p_1, p_2, w))$$
for all $p_1 < a < p_1 + \Delta p_1$ and it holds $h_1(a, p_2, u(x(a, p_2, w))) = x_1(a, p_2, w)$
because of duality, we have
$$EV = \int_{p_1}^{p_1 + \Delta p_1} h_1(a, p_2, u(x(p_1 + \Delta p_1, p_2, w)))\,da \leq \int_{p_1}^{p_1 + \Delta p_1} x_1(a, p_2, w)\,da$$
6.6.2
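[Figure: demand curve $x_1$ and compensated demand curves $h_1$, with the prices $p_1$ and $p_1 + \Delta p_1$ marked; horizontal axis Good 1, vertical axis Good 1 price.]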
and we need to know the entire preference in order to know it. In practice,
however, we are given at best a demand function as data. How, then, can we
evaluate a welfare change in terms of income given only a demand function?
Again, consider a change in the price of Good 1. Given the demand function
$$x(p, w) = (x_1(p, w), x_2(p, w)),$$
fix income $w$ and the price of Good 2, $p_2$, solve the equation
$$x_1 = x_1(p_1, p_2, w)$$
for $p_1$, and denote the solution by
$$p_1 = p_1(x_1).$$
This is called the inverse demand function. Plot it on the plane as in Figure
6.8, where we take the quantity of Good 1 on the horizontal axis and its price on
the vertical axis.
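For example, the Cobb-Douglas preference represented by $u(x) = a \ln x_1 + b \ln x_2$
yields the demand function
$$x_1(p, w) = \frac{a}{a + b}\frac{w}{p_1}, \quad x_2(p, w) = \frac{b}{a + b}\frac{w}{p_2},$$
and the inverse demand function for Good 1 is
$$p_1(x_1) = \frac{a}{a + b}\frac{w}{x_1}.$$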
[Figure 6.8: Inverse demand function. Horizontal axis: Good 1; vertical axis: Good 1 price.]
This inverse demand function describes the "apparent" willingness to pay for the
given good, that is, the amount of income the consumer is willing to give up in
order to have an extra unit of it. I'll come back later to explain
why it is only an "apparent" one. Let me first illustrate this using integer
quantities: $p_1(1)$ is the willingness to pay for the first unit, $p_1(2)$ is the
willingness to pay for the second unit, and so on, and $p_1(k)$ is the willingness
to pay for the $k$-th unit.
Given price $p_1$, the consumer buys an extra unit as long as $p_1(k) > p_1$, since
the willingness to pay is greater than the price, and reduces consumption when
$p_1(k) < p_1$, since the willingness to pay is smaller than the price. Thus the
optimal consumption is $x_1$ such that $p_1(x_1) = p_1$.
Since the consumer is willing to pay $p_1(1)$ while he has to pay $p_1$, he
gains $p_1(1) - p_1$ from the first unit of purchase in the net sense. Since he
is willing to pay $p_1(2)$ while he has to pay $p_1$, he gains $p_1(2) - p_1$
from the second unit of purchase in the net sense.
By adding these up to $x_1$ we obtain the accumulated net gain
$$\sum_{k=1}^{x_1} p_1(k) - p_1 x_1.$$
Now, making the grid finer rather than taking integers, we obtain the
integral formula
$$\int_0^{x_1} p_1(z)\,dz - p_1 x_1.$$
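The passage from the unit-by-unit sum to the integral can be sketched numerically. The example below assumes a hypothetical linear inverse demand $p_1(z) = 10 - z$ and a price of 4; neither comes from the text. It evaluates the accumulated net gain first on the integer grid and then on a much finer grid.

```python
def inverse_demand(z):
    """Hypothetical linear inverse demand p1(z) = 10 - z (willingness to pay at quantity z)."""
    return 10.0 - z


def net_gain(x1, price, steps):
    """Right-endpoint Riemann sum of willingness to pay on [0, x1], minus expenditure."""
    width = x1 / steps
    gross = sum(inverse_demand(width * k) * width for k in range(1, steps + 1))
    return gross - price * x1


price = 4.0
x1 = 6.0  # the quantity at which inverse_demand(x1) equals the price

unit_grid = net_gain(x1, price, steps=6)        # one step per unit: the sum formula
fine_grid = net_gain(x1, price, steps=100_000)  # fine grid: approximates the integral
print(unit_grid, fine_grid)
```

With these numbers the integer grid gives a smaller value than the fine grid, because each whole unit is valued at the willingness to pay for its last fraction.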
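[Figure: inverse demand curve with the price $p_1$ and the quantity $x_1$ marked; horizontal axis Good 1, vertical axis Good 1 price.]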
$\Delta p_1$. Then the consumer surplus after the price change is the area surrounded
by the inverse demand curve, the horizontal line at $p_1 + \Delta p_1$ and the vertical axis in
Figure 6.10. Hence the change in consumer surplus due to the price change is
the area surrounded by the two horizontal lines at $p_1 + \Delta p_1$ and $p_1$, the inverse demand
curve and the vertical axis in Figure 6.10.
By rotating Figure 6.10 by 90 degrees, we see the integral description of the
change in consumer surplus
$$\Delta CS = \int_{p_1}^{p_1 + \Delta p_1} x_1(r)\,dr.$$
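To see how the three welfare measures relate numerically, the following sketch evaluates them for a Cobb-Douglas consumer. All numbers are hypothetical. The formula used for the compensated variation, $CV = e(p', u(x(p, w))) - w$, is the standard definition and is an assumption here, since its statement does not appear in the lines above. The integral is approximated by a midpoint rule.

```python
import math

A = 0.5                       # hypothetical weight on Good 1; Good 2 has weight 1 - A
W = 100.0                     # hypothetical income
P2 = 1.0                      # price of Good 2, held fixed
P1_OLD, P1_NEW = 1.0, 4.0     # hypothetical price increase for Good 1


def demand_good1(p1):
    """Cobb-Douglas demand for Good 1 at income W."""
    return A * W / p1


def indirect_utility(p1):
    """v(p, W) for u(x) = A ln x1 + (1 - A) ln x2."""
    return A * math.log(A * W / p1) + (1 - A) * math.log((1 - A) * W / P2)


def expenditure(p1, u):
    """e(p, u) for the same preference (weights sum to one)."""
    return math.exp(u) * (p1 / A) ** A * (P2 / (1 - A)) ** (1 - A)


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


cv = expenditure(P1_NEW, indirect_utility(P1_OLD)) - W   # assumed standard definition
ev = W - expenditure(P1_OLD, indirect_utility(P1_NEW))   # definition from the text
delta_cs = integrate(demand_good1, P1_OLD, P1_NEW)       # area to the left of demand
print(ev, delta_cs, cv)
```

In this example the change in consumer surplus lies between the equivalent and the compensated variation.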
[Figure: demand curve with the prices $p_1$ and $p_1 + \Delta p_1$ marked; horizontal axis Good 1, vertical axis Good 1 price.]
6.7
Exercises
$$x_1(p, I) = \frac{a^2 p_2 I}{a^2 p_1 p_2 + b^2 p_1^2}, \quad x_2(p, I) = \frac{b^2 p_1 I}{a^2 p_2^2 + b^2 p_1 p_2}.$$
On the demand for Good 1, find its income elasticity, own-price elasticity and
cross-price elasticity.
2
Chapter 7
7.1
You might have read the following argument in introductory books, or not. In
any case, it is what economists once seriously believed, but it does not
have any economic content. So I would like you to read it while thinking about
which part is flawed.
1. Denote the utility of $q$ units of consumption by $U(q)$.
2. Given the price $p$ of a given good, the consumer's surplus, which is the net
utility obtained by subtracting expenditure $pq$ from the gross utility $U(q)$,
is
$$U(q) - pq.$$
The consumer maximizes this.
3. At the utility-maximizing point, we have equality of marginal utility and
price,
$$U'(q) = p.$$
That is, the net utility is maximized when the incremental utility from incremental consumption of the given good is equal to its incremental cost.
4. Take $q$ on the horizontal axis and $p$ on the vertical axis, and plot the above
maximization point; then we obtain a demand curve
$$p = U'(q).$$
7.2
Because of the above difficulties, some people say that consumer surplus
analysis is dead. From this standpoint, consumer surplus can be taught in
introductory courses only as a convenience to motivate learners, and it should
be forgotten once they go up to higher levels.
Nevertheless, the concepts of benefit (how much the consumer is willing to
pay for the given good) and consumer's surplus (how much he has gained from
trade) are very intuitive. In this chapter I argue how these concepts can be defined
consistently in the framework of ordinal utility.
We should note, however, that the argument below is difficult. It is difficult in the sense that it can be applied only to a very local and specific phase
of an economy which can be isolated from the rest, and cannot be extended
to the economy as a whole. Therefore modern economics applies the concepts
of benefit and consumer surplus only when we can isolate a local and specific
phase, and calls this partial equilibrium analysis. On the other hand, directly
treating an economy as a whole without such isolation is called general
equilibrium analysis.
How can such isolation work? The market for a good under consideration in
partial equilibrium analysis is like a boat floating on the ocean. Since the
boat is negligibly small compared to the ocean, one can ignore its effect on the
ocean and focus on the movement of the boat itself. That is, when the
market for the commodity under consideration is very small compared to the
entire economy, its behavior will not affect the entire economy, and the behavior
of the markets for the other commodities will be unchanged, at least in the
approximate sense.
[Figure: a consumption bundle $x = (x_1, x_2)$; horizontal axis Good 1, vertical axis Good 2.]
[Figure 7.2: Change in income when the background income is sufficiently large. Horizontal axes: Good 1; vertical axes: Change in income, Income.]
income transfer.
Definition 7.1 The preference is said to be quasi-linear if the indifference
curves are parallel along the vertical axis.
Indifference curves for a quasi-linear preference are depicted as in Figure 7.3.
Indifference curves being parallel along the Good-2-axis means that the marginal
rate of substitution is independent of the quantity of Good 2. That is, the
amount of income the consumer is willing to give up in order to get one extra
unit of Good 1 is independent of how much income he is holding (see the dotted
line in Figure 7.3). In other words, there is no income effect on Good 1.
When is the no-income-effect assumption valid? This argument dates back
to Marshall, who thought that when the commodity under consideration is
negligibly small compared to the entire set of commodities, the income effect on
it is negligible. For example, if you don't buy a small thing such as a can of
cola for 2 dollars, it is not because you can't pay 2 dollars or because paying 2 dollars
seriously affects your consumption of the other goods, but because you judge that it
is not worth 2 dollars.
Marshall thought that in such situations, in which the income effect is negligible,
consumption is determined only by comparison between willingness to pay and
price. 2
2 Of course Marshall did not go so far as to handle this formally. It was first done by Vives [37], who
considered an increasing set of commodities and showed that when the number of commodities
becomes arbitrarily large and income becomes large as well at the same rate, each single
commodity becomes negligibly small compared to the entire set and the income effect on it converges
to zero.
[Figures: two diagrams with Good 1 on the horizontal axis; the second marks $x_1$ and $v(x_1)$.]
let us call $w_2$ the consumer surplus generated by $x$, in the sense that it is the
evaluation of the net gain from consumption in terms of income.
To find the consumer surplus, draw the indifference curve passing through
$x = (x_1, x_2)$ and look at its intercept with the vertical axis, $(0, w_2)$, as in Figure
7.5. Since the indifference curves are parallel along the vertical axis we have
$w_2 = v(x_1) + x_2$, that is, it holds
$$x \sim (0, v(x_1) + x_2).$$
Hence the consumer surplus gained from $x$ is $v(x_1) + x_2$.
Now, pick any $x = (x_1, x_2)$ and $y = (y_1, y_2)$. Since
$$x \sim (0, v(x_1) + x_2), \quad y \sim (0, v(y_1) + y_2),$$
we have $x \succsim y$ if and only if $x_2 + v(x_1) \geq y_2 + v(y_1)$. Thus the consumer surplus
$v(x_1) + x_2$ is a representation of the quasi-linear preference. Notice, however,
that any monotone transformation of a representation of a given preference is
again a representation of the same preference, so for an arbitrary monotone transformation $f$ the function $f(v(x_1) + x_2)$ represents the above quasi-linear preference.
Thus we obtain the proposition below.
Proposition 7.1 When a preference is quasi-linear with respect to Good 2 it
is represented in the form
$$u(x) = f(v(x_1) + x_2),$$
where $f$ is an arbitrary monotone transformation.
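[Figure: indifference curve through $x = (x_1, x_2)$ with vertical-axis intercept $w_2 = v(x_1) + x_2$; horizontal axis Good 1, vertical axis Good 2.]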
Notice that while the whole utility representation $u$ has no quantitative meaning
since $f$ is arbitrary, the consumer surplus $v(x_1) + x_2$ inside $f$ has a certain quantitative meaning, in that it is interpreted as a measure of gains from trade
in terms of income. While there can be arbitrarily many utility representations
for a given preference, the function describing one's willingness to pay, that is $v$,
is uniquely determined when the preference is quasi-linear in income. This $v$ is
a component of a utility representation, not a utility representation by itself. To
emphasize the distinction, let me call $v$ the willingness to pay function for a
given consumer. Willingness to pay has economic content under the assumption
of quasi-linearity within the framework of partial equilibrium analysis, while the
whole utility representation again has no quantitative meaning.
Historically, the willingness to pay function was believed to be the utility
function, and this made people believe that the utility function has a quantitative
meaning. This is a confusion, although nowadays teachers very often make
use of it in introductory courses for educational
purposes.
7.3
$$\frac{\partial f(v(x_1) + x_2)/\partial x_1}{\partial f(v(x_1) + x_2)/\partial x_2} = \frac{f'(v(x_1) + x_2)\,v'(x_1)}{f'(v(x_1) + x_2)} = v'(x_1)$$
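The cancellation of $f'$ in this ratio can be checked numerically. The sketch below is only an illustration under assumed choices: $v(x_1) = \sqrt{x_1}$ and $f = \exp$ are hypothetical, and the partial derivatives are approximated by central differences.

```python
import math


def v(x1):
    """Hypothetical willingness to pay function, chosen only for illustration."""
    return math.sqrt(x1)


def u(x1, x2):
    """One representation of the preference: f(v(x1) + x2) with f = exp."""
    return math.exp(v(x1) + x2)


def mrs(x1, x2, h=1e-6):
    """Ratio of the partial derivatives of u, approximated by central differences."""
    du_dx1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    du_dx2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return du_dx1 / du_dx2


# Same quantity of Good 1, two different quantities of Good 2.
mrs_low = mrs(4.0, 1.0)
mrs_high = mrs(4.0, 3.0)
v_prime = 1 / (2 * math.sqrt(4.0))  # analytic derivative of v at x1 = 4
print(mrs_low, mrs_high, v_prime)
```

Both ratios agree with `v_prime` up to the discretization error, regardless of the quantity of Good 2.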
7.4
subject to $px_1 + x_2 = w$.
Because we have $x_2 = -px_1 + w$ from the budget constraint, the above
problem is equivalent to
$$\max_{x_1} f(v(x_1) - px_1 + w),$$
[Figure: consumption points $x$, $x'$ and $\hat{x}$; horizontal axis Good 1, vertical axis Good 2.]
because the marginal willingness to pay is smaller than the price, he will rather
reduce the quantity to buy. In the end the consumption choice of Good 1 will
be the point at which the equality holds.
Also, the tangency condition says that the price at which the consumer
continues to buy Good 1 up to $x_1$ units is $v'(x_1)$. That is, the marginal willingness to pay is interpreted as the inverse demand function, in which the
Good 1 quantity $x_1$ is the independent variable. That is, the inverse
of the demand function $x_1(p)$, which is denoted by $p(x_1)$, is given by
$$p(x_1) = v'(x_1).$$
Now let us reconfirm that the income effect on Good 1 is zero. Assuming
that the price of Good 2 is normalized to 1, consider that the price of Good 1
goes up from $p$ to $p'$ as in Figure 7.7, in which the consumption moves from
$x = (x_1, x_2)$ to $x' = (x_1', x_2')$. Let $\hat{x} = (\hat{x}_1, \hat{x}_2)$ be the compensated demand
under the price after the change which yields the same welfare level as $x$ does.
Because the indifference curves are parallel along the vertical axis, $\hat{x}$ is precisely
above $x'$, hence the income effect on Good 1, that is $x_1' - \hat{x}_1$, is zero.
Let us go over an example. Suppose that the willingness to pay function is
$$v(x_1) = \sqrt{x_1}.$$
Consumption of Good 1 is determined by the tangency condition $v'(x_1) = p$.
Since $v'(x_1) = \frac{1}{2\sqrt{x_1}}$ here, the condition leads to
$$\frac{1}{2\sqrt{x_1}} = p,$$
which gives the demand function
$$x_1(p) = \frac{1}{4p^2}.$$
Since consumer surplus is $v(x_1(p)) - p x_1(p)$, by plugging the demand function
into this we obtain $\frac{1}{4p}$.
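The closed forms in this example can be cross-checked by maximizing the net surplus numerically. The sketch below uses a simple ternary search, which is an assumed solution method chosen for brevity (it relies on the net surplus being concave), and a hypothetical price of 0.5.

```python
import math


def net_surplus(x1, p):
    """v(x1) - p * x1 with the willingness to pay function v(x1) = sqrt(x1)."""
    return math.sqrt(x1) - p * x1


def demand(p, lo=0.0, hi=100.0, iterations=200):
    """Maximize the concave net surplus over [lo, hi] by ternary search.

    The default bracket contains the optimum only for prices of at least 0.05.
    """
    for _ in range(iterations):
        left = lo + (hi - lo) / 3
        right = hi - (hi - lo) / 3
        if net_surplus(left, p) < net_surplus(right, p):
            lo = left
        else:
            hi = right
    return (lo + hi) / 2


p = 0.5                      # hypothetical price of Good 1
x1 = demand(p)               # compare with 1 / (4 p^2)
surplus = net_surplus(x1, p) # compare with 1 / (4 p)
print(x1, surplus)
```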
7.5
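With no income effect we have $\frac{\partial x_1}{\partial w} = 0$, hence
$$\frac{\partial h_1}{\partial p_1} = \frac{\partial x_1}{\partial p_1},$$
which means that the demand curve and the compensated demand curve coincide.
In Figure 6.7, it means that all three curves collapse into one curve. Hence the
three criteria coincide under the assumption of no income effect.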
Let me explain this using Figure 7.7. Assuming that the price of Good 2 is
normalized to 1, consider that the price of Good 1 goes up from $p$ to $p + \Delta p$.
Let $x$ denote the demand before the price change, $x'$ denote the demand after
the price change, $\hat{x}$ denote the compensated demand under $p + \Delta p$ which yields
the same welfare level as $x$ does, and $\tilde{x}$ denote the compensated demand under
$p$ which yields the same welfare level as $x'$ does.
Here the compensated variation is $\hat{x}_2 - x_2'$ and the equivalent variation is
$x_2 - \tilde{x}_2$, and the change in consumer surplus is the difference between the intercepts
of the two indifference curves with the vertical axis. Because the indifference
curves are parallel along the vertical axis, these three coincide.
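The coincidence of the three measures can be illustrated with numbers. The sketch below assumes the willingness to pay function $v(x_1) = \sqrt{x_1}$, an income of 10 and a price increase from 0.5 to 1; these values are hypothetical, and the variable names are introduced only for this example.

```python
import math

W = 10.0                   # hypothetical income, large enough that Good 2 stays positive
P_OLD, P_NEW = 0.5, 1.0    # hypothetical prices of Good 1 (price of Good 2 is 1)


def v(x1):
    """Assumed willingness to pay function."""
    return math.sqrt(x1)


def demand(p):
    """Bundle chosen at price p: x1 = 1 / (4 p^2) from v'(x1) = p, the rest spent on Good 2."""
    x1 = 1 / (4 * p ** 2)
    return x1, W - p * x1


x = demand(P_OLD)          # demand before the price change
x_new = demand(P_NEW)      # demand after the price change
level_old = v(x[0]) + x[1]          # vertical intercept of the old indifference curve
level_new = v(x_new[0]) + x_new[1]  # vertical intercept of the new indifference curve

# With parallel indifference curves the compensated demands keep the Good 1
# quantities of x_new and x respectively, so only their Good 2 entries are needed.
x_hat_2 = level_old - v(x_new[0])
x_tilde_2 = level_new - v(x[0])

cv = x_hat_2 - x_new[1]
ev = x[1] - x_tilde_2
delta_cs = level_old - level_new
print(cv, ev, delta_cs)
```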
[Figure: consumption points $x$, $x'$, $\hat{x}$ and $\tilde{x}$; horizontal axis Good 1, vertical axis Good 2.]
Chapter 8
Intertemporal choice
8.1
Let me repeat that even if goods are materially the same, they are treated as
different goods if they are to be consumed at different time periods. Here saving
is understood to be selling current consumption and buying future consumption,
and borrowing is understood to be buying current consumption and selling
future consumption.
In this book I describe this by the two-period model. For simplicity let us
assume that there are just two periods and that there is just one consumption good
available at each period. This is enough for our purpose, although the model can be extended
to allow many periods and many goods at each period.
Thus the consumption set for a given consumer is $\mathbb{R}^2_+$. Its element, typically
denoted by $x = (x_1, x_2)$, is called a consumption stream, where $x_1$ refers to
consumption at Period 1 and $x_2$ refers to consumption at Period 2.
In the two-period model one's initial endowment is interpreted as his
earning stream. That is, when a consumer has the initial endowment $e =
(e_1, e_2)$ it means that he earns $e_1$ units of the consumption good at Period 1 and $e_2$
units of the consumption good at Period 2.
Let $r$ denote the pure interest rate. Assume for now that there is no inflation; I will come
to it in the next section.
Now, suppose one consumes $x_1$ units in this period. Then he is saving $e_1 - x_1$
units of the consumption good available today. This can be negative, in which case he is
borrowing. The return from saving to be received in the next period is obtained by
multiplying $e_1 - x_1$ by the gross interest rate $1 + r$, that is, $(1 + r)(e_1 - x_1)$. The
disposable income in the next period is obtained by adding this
to the earning in the next period $e_2$, that is, $e_2 + (1 + r)(e_1 - x_1)$.
Therefore consumption in the next period, denoted $x_2$, obeys the constraint
$$x_2 \leq e_2 + (1 + r)(e_1 - x_1).$$
$$x_1 + \frac{1}{1+r} x_2 \leq e_1 + \frac{1}{1+r} e_2.$$
Here the right-hand side is the amount of the consumption good in the current period which you can get when you borrow against all of your future earnings
and spend all your lifetime income on current consumption. Since it is the lifetime income measured in terms of current consumption, it is called the present
value of lifetime income. Notice again that this is a special case of the standard form of the budget constraint, in which we take current consumption to be
the numeraire and take $p_1 = 1$ and $p_2 = \frac{1}{1+r}$.
The future value of lifetime income corresponds to the $x_2$-intercept of the budget
line and the present value of lifetime income corresponds to the $x_1$-intercept.
In any case the slope of the budget line is $\frac{p_1}{p_2} = 1 + r$ in absolute value,
hence the gross interest rate is the relative price of current consumption in terms of future
consumption. That is, as you increase current consumption by one unit you
have to give up $1 + r$ units of future consumption. In other words, it is the
opportunity cost of one extra unit of current consumption, measured in
terms of future consumption.
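The two intercepts and the affordability test can be written down directly. The sketch below uses a hypothetical earning stream of (50, 33) and an interest rate of 10%; the function names are introduced only for this illustration.

```python
def present_value(e1, e2, r):
    """Lifetime income in units of current consumption (the x1-intercept)."""
    return e1 + e2 / (1 + r)


def future_value(e1, e2, r):
    """Lifetime income in units of future consumption (the x2-intercept)."""
    return (1 + r) * e1 + e2


def affordable(x1, x2, e1, e2, r, tolerance=1e-12):
    """Check the present value form of the budget constraint, allowing for rounding."""
    return x1 + x2 / (1 + r) <= present_value(e1, e2, r) + tolerance


e1, e2, r = 50.0, 33.0, 0.10   # hypothetical earning stream and interest rate
pv = present_value(e1, e2, r)
fv = future_value(e1, e2, r)
print(pv, fv, fv / pv)          # the ratio of the intercepts is the slope 1 + r
```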
8.2
Now let us see what the intertemporal budget constraint looks like under inflation. Denote the inflation rate by $\pi$, and normalize the price level of the consumption good in the current period to 1; then the price level of the consumption
good in the future period is $1 + \pi$. Here $r$ is now the nominal interest rate.
Notice that under inflation there is a difference between nominal and real values.
Suppose you consume $x_1$ in the current period; then you save $e_1 - x_1$ in
terms of the current consumption good. The return from saving is obtained by
multiplying the saving $e_1 - x_1$ by the gross interest rate $1 + r$, that is, $(1 + r)(e_1 - x_1)$.
Since saving is made in terms of the current consumption good, it is not inflated
in the future period. On the other hand, as you earn $e_2$ units of the consumption
good in the future period, its value is inflated and $(1 + \pi)e_2$ is counted into the
disposable income in the future period. Summing up, the disposable income in
the future period is $(1 + \pi)e_2 + (1 + r)(e_1 - x_1)$, and since the future price level
is $1 + \pi$, consumption in the future period obeys
$$x_2 \leq e_2 + \frac{1+r}{1+\pi}(e_1 - x_1).$$
Now consider $\frac{1+r}{1+\pi}$
to be the real gross interest rate; then actually nothing changes. That is, define
the real (pure) interest rate $\rho$ by
$$\frac{1+r}{1+\pi} = 1 + \rho.$$
Then the budget constraint is rewritten into the form
$$x_2 \leq e_2 + (1 + \rho)(e_1 - x_1),$$
and we can rewrite this into the future value form of the budget constraint
$$(1 + \rho)x_1 + x_2 \leq (1 + \rho)e_1 + e_2$$
or into the present value form of the budget constraint
$$x_1 + \frac{1}{1+\rho} x_2 \leq e_1 + \frac{1}{1+\rho} e_2.$$
Therefore, when there is inflation the intertemporal budget constraint is determined by the real interest rate. Since our consumer is assumed to be rational
throughout the book, he cares only about real values. Thus, unless mentioned specifically, the interest rate $r$ is hereafter taken to be the real one, which is already
adjusted for inflation.
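As a small worked example of the definition of the real interest rate, the sketch below uses a hypothetical nominal rate of 5% and an inflation rate of 2%. The comparison with the simple difference of the two rates is a well-known approximation, added here only for orientation.

```python
def real_interest_rate(nominal_rate, inflation_rate):
    """Real rate defined by (1 + nominal) / (1 + inflation) = 1 + real."""
    return (1 + nominal_rate) / (1 + inflation_rate) - 1


nominal, inflation = 0.05, 0.02        # hypothetical values
real = real_interest_rate(nominal, inflation)
approximation = nominal - inflation    # familiar rule of thumb for small rates
print(real, approximation)
```

For rates of this size the exact value is slightly below the rule of thumb.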
8.3
$$e_1 + \frac{1}{1+r} e_2.$$
That is, holding this asset and holding $e_1 + \frac{1}{1+r} e_2$ units of the consumption good
in the current period are equivalent. This is because if you take $e_1$ out of
$e_1 + \frac{1}{1+r} e_2$ units of the consumption good in the current period and save $\frac{1}{1+r} e_2$,
then in the future period you receive $(1 + r)\frac{1}{1+r} e_2 = e_2$ units of the consumption
good. Thus you can mimic the consumption stream generated by this
asset.
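Suppose the relative price of this asset in terms of the consumption good in the
current period, denoted $P$, is smaller than its discounted present value, that is,
$$P < e_1 + \frac{1}{1+r} e_2.$$
Then one can buy the asset for $P$ and obtain an earning stream whose present value
exceeds what he paid, generating something from nothing. Suppose instead that $P$ is
larger than the discounted present value, that is,
$$P > e_1 + \frac{1}{1+r} e_2.$$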
Then one can create this asset by promising somebody to pay $e_1$ in the current
period and $e_2$ in the future period, and sell it for $P$. He can then generate
something from nothing again, as he can generate a consumption stream
$(x_1, x_2)$ satisfying
$$x_1 + \frac{1}{1+r} x_2 \leq P - \left(e_1 + \frac{1}{1+r} e_2\right).$$
Such an action of generating something from nothing is called arbitrage.
When there is an arbitrage opportunity, anybody will try to exploit it unless
he is stupid. In the first case there will be an excess demand for the
asset and the asset price will go up, and in the second there will be an excess
supply of it and its price will go down. In the end the arbitrage opportunity
will disappear and there will be no free lunch, and there the asset price will be
equal to the discounted present value of its earning stream. 1
1 In practice, there may be many people who are stupid, or even when they recognize the
arbitrage opportunity they may not be able to exploit it because of barriers such as transaction
fees. If that is the case, arbitrage opportunities may not disappear.
We can extend the discounted present value formula to three or more periods.
Given that the real interest rate per period is $r$, the discounted present value of
the stream $(e_1, e_2, e_3, \ldots, e_T)$ is
$$e_1 + \frac{1}{1+r} e_2 + \left(\frac{1}{1+r}\right)^2 e_3 + \cdots + \left(\frac{1}{1+r}\right)^{T-1} e_T = \sum_{t=1}^{T} e_t \left(\frac{1}{1+r}\right)^{t-1},$$
where the receipt/payment one period later is discounted once, the receipt/payment
two periods later is discounted twice, and so on, and the receipt/payment
$T - 1$ periods later is discounted $T - 1$ times.
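The finite-horizon formula translates into a one-line sum. In the sketch below the stream and the interest rate are hypothetical and chosen so that each term has the same present value; the asset price used at the end is also a made-up number, included only to show the gap that an arbitrageur would exploit.

```python
def discounted_present_value(stream, r):
    """Sum of e_t / (1 + r)^(t - 1) over t = 1, ..., T.

    enumerate starts at 0, so the first entry (Period 1) is not discounted.
    """
    return sum(e_t / (1 + r) ** k for k, e_t in enumerate(stream))


stream = [100.0, 110.0, 121.0]   # hypothetical earning stream (e1, e2, e3)
r = 0.10                         # hypothetical real interest rate per period
pv = discounted_present_value(stream, r)

asset_price = 280.0              # hypothetical price below the present value
gap = pv - asset_price           # positive gap: buying the asset is an arbitrage
print(pv, gap)
```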
We can think of $T$ being infinity; then the discounted present value of the stream
$(e_1, e_2, e_3, \ldots)$ is
$$e_1 + \frac{1}{1+r} e_2 + \left(\frac{1}{1+r}\right)^2 e_3 + \cdots = \sum_{t=1}^{\infty} e_t \left(\frac{1}{1+r}\right)^{t-1}.$$
Of course we die some day, but we are not sure exactly when. In order
to think of such an open-ended situation, infinity is a good approximation. Also,
if we think of monthly or weekly consumption-savings decisions, say, 30 years is an
extremely long period and we can take it to be essentially infinite.
Also, we can consider that the interest rate varies over time, while it has
been assumed to be constant in the above. Let $r_1$ denote the interest rate between
Period 1 and Period 2, $r_2$ denote that between Period 2 and Period 3, and so
on; then the discounted present value of the stream $(e_1, e_2, e_3, \ldots)$ is given by
$$e_1 + \frac{1}{1+r_1} e_2 + \frac{1}{1+r_1}\frac{1}{1+r_2} e_3 + \cdots = \sum_{t=1}^{\infty} e_t \prod_{k=0}^{t-1} \frac{1}{1+r_k},$$
where $r_0 = 0$.
8.4
8.4.1
8.4.2
$$\frac{v'(x_1)}{v'(x_2)\beta}.$$
[Figure: horizontal axis Period 1, vertical axis Period 2.]
We should note, however, that the periodwise evaluation function has quantitative meaning only in its curvature, not in its absolute level or scale. Hence it
has no meaning like how happy the consumer is, or how much happier. This
is the same as before.
While the curvature of a function may change under a monotone transformation in general, it is invariant under any positive affine transformation
(adding constants, multiplying by positive constants). Therefore, whenever you
take a positive affine transformation of a given periodwise evaluation function, it describes the same preference in the discounted utility form.
For example, when we transform $v(z) = \ln z$ into $\tilde{v}(z) = a \ln z + b$ with $a > 0$, it does not
change the family of indifference curves. This is clear from
$$\tilde{v}(x_1) + \tilde{v}(x_2)\beta = (a \ln x_1 + b) + (a \ln x_2 + b)\beta = a(\ln x_1 + \beta \ln x_2) + (1 + \beta)b = a(v(x_1) + v(x_2)\beta) + (1 + \beta)b.$$
8.4.3
$$u(x) = f\left(\sum_{t=1}^{T} v(x_t)\beta^{t-1}\right).$$
We can even think of $T$ being infinite. Of course we die some day, but
we are not sure exactly when. In order to think of such an open-ended
situation, infinity is a good approximation. Also, if we think of monthly or
weekly consumption-savings decisions, say, 30 years is an extremely long period and
we can take it to be essentially infinite.
Problem of time consistency
A potential problem in the many-period extension is the consistency of intertemporal
choice. In the two-period model the consumer's saving decision is made just
once, where the saving decision in Period 1 basically determines the level
of consumption in Period 2 automatically. 2
However, when there are three periods or more, the consumer faces trade-offs not only between today and tomorrow, but also between tomorrow and
the day after tomorrow, and so on, and he faces the problem of reoptimization.
To illustrate, let me give one stupid example.
2 This may leave a problem of reoptimization in Period 2 about which good to consume
more or less, but we are rather focusing on intertemporal trade-offs between consumption
levels at different periods.
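ality),
$$\begin{aligned}
u_1(x_1, \ldots, x_T) &= \sum_{\tau=1}^{T} v(x_\tau)\beta^{\tau-1} \\
u_2(x_2, \ldots, x_T) &= \sum_{\tau=2}^{T} v(x_\tau)\beta^{\tau-2} \\
&\;\;\vdots \\
u_t(x_t, \ldots, x_T) &= \sum_{\tau=t}^{T} v(x_\tau)\beta^{\tau-t} \\
&\;\;\vdots \\
u_T(x_T) &= v(x_T)
\end{aligned}$$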
To illustrate, let me write down the preference at Period 1 without using the
summation symbol,
$$u_1(x_1, \ldots, x_T) = v(x_1) + v(x_2)\beta + v(x_3)\beta^2 + \cdots + v(x_T)\beta^{T-1}.$$
Here we see that we evaluate consumption at each period by the same function
$v$ and take the sum after discounting by $\beta$, where we don't discount the
current consumption, discount once for the next period, discount twice for two
periods later, and so on. Since we use the same function $v$ and the same discount factor
$\beta$ over time, we call this a series of stationary discounted utility preferences.
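A series of stationary discounted utility preferences can be sketched in a few lines. The discount factor, the periodwise evaluation function and the two consumption streams below are hypothetical; the final comparison illustrates that, for streams sharing the same Period 1 consumption, the ranking at Period 1 agrees with the ranking of their continuations at Period 2.

```python
import math

BETA = 0.9   # hypothetical discount factor


def v(z):
    """Hypothetical periodwise evaluation function."""
    return math.log(z)


def utility_from(t, stream):
    """u_t evaluated on consumption from period t onward (stream[0] is Period 1)."""
    return sum(v(c) * BETA ** k for k, c in enumerate(stream[t - 1:]))


x = [5.0, 8.0, 2.0, 6.0]
y = [5.0, 3.0, 7.0, 6.0]   # same Period 1 consumption as x

prefers_x_at_1 = utility_from(1, x) >= utility_from(1, y)
prefers_x_at_2 = utility_from(2, x) >= utility_from(2, y)
print(prefers_x_at_1, prefers_x_at_2)
```

The agreement follows from the recursion $u_1 = v(x_1) + \beta u_2$, which the code reproduces up to rounding.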
Let us confirm that any series of stationary discounted utility preferences is
time-consistent. Suppose it holds at Period 1 that
$$(c_1, x_2, \ldots, x_T) \succsim_1 (c_1, y_2, \ldots, y_T).$$
Under stationary discounted utility preference this is equivalent to
8.5
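$$x_1 + \frac{x_2}{1+r} \leq e_1 + \frac{e_2}{1+r}.$$
Recall that the general form of the tangency condition is $MRS(x) = \frac{p_1}{p_2}$. Since
the relative price of current consumption in terms of future consumption is the gross
interest rate, here we have $\frac{p_1}{p_2} = 1 + r$. Hence the tangency condition is now
$$MRS(x) = \frac{v'(x_1)}{v'(x_2)\beta} = 1 + r.$$
Combining this with the budget constraint
$$x_1 + \frac{x_2}{1+r} = e_1 + \frac{e_2}{1+r},$$
we can obtain the demand for current consumption and future consumption
respectively,
$$x_1(r), \quad x_2(r).$$
Thus, saving (or borrowing) in the current period is given by
$$e_1 - x_1(r).$$
For example, when $v(z) = \ln z$ the demand is
$$x_1(r) = \frac{1}{1+\beta}\left(e_1 + \frac{1}{1+r} e_2\right), \quad x_2(r) = \frac{\beta}{1+\beta}\left((1+r)e_1 + e_2\right),$$
and saving (or borrowing) is
$$\frac{\beta}{1+\beta} e_1 - \frac{1}{1+\beta}\frac{1}{1+r} e_2.$$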
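The closed forms for the logarithmic case can be checked against the two conditions they are meant to satisfy. The sketch below assumes the periodwise evaluation function $v(z) = \ln z$ and uses a hypothetical earning stream, interest rate and discount factor; the function name is introduced only for this illustration.

```python
def log_utility_demand(e1, e2, r, beta):
    """Demand for current and future consumption when v(z) = ln z."""
    lifetime_income = e1 + e2 / (1 + r)
    x1 = lifetime_income / (1 + beta)
    x2 = beta / (1 + beta) * ((1 + r) * e1 + e2)
    return x1, x2


e1, e2, r, beta = 50.0, 22.0, 0.10, 0.9   # hypothetical values
x1, x2 = log_utility_demand(e1, e2, r, beta)
saving = e1 - x1

# Tangency: with v = ln, the marginal rate of substitution is x2 / (beta * x1).
mrs = x2 / (beta * x1)
print(x1, x2, saving, mrs)
```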
We can see the following properties from the above result, while they can be
seen more generally,
1. As interest rate r is higher (lower), assuming the other elements stay the
same, the saving is larger (smaller).
8.6
Exercises
Exercise 10 (i) When the pure interest rate is 4%, what is the present value
of an asset which yields 300 in the current period and 400 in the next period?
(ii) When the pure interest rate is 10%, what is the present value of an asset
which yields 200 in the current period, 300 in the next period and 500 two
periods later?
(iii) When the pure interest rate is 6%, what is the present value of an asset
which yields 300 every period?
Exercise 11 Consider a two-period consumption-saving problem. The consumer's
preference is represented by $u(x) = \ln x_1 + 0.95 \ln x_2$, his earning stream is
$(40, 30)$ and the pure interest rate is 4%. How much does he save in Period 1?
Chapter 9
9.2
Risk attitude
When you make a choice under risk, what do you care about? The first thing will
be the expected value of the return. But is that all? Let us think of the following
examples.
Example 9.1 Consider two options. One is a bet in which you flip a coin and
receive 100 dollars if it comes up heads and nothing if it comes up tails. The other is to
receive 50 dollars for sure. If you care only about the expected return you will be
indifferent between the two, and will strictly prefer the first one if the sure amount
in the second is slightly less, such as 49. This is not realistic, however.
Example 9.2 (St. Petersburg paradox): You can flip a coin repeatedly
until it comes up tails. If you get heads $k$ times before getting tails, you receive $2^k$
dollars. What is the expected return of this gamble?
Since the probability that you get heads in the first $k$ flips and tails in the
next one is
$$\left(\frac{1}{2}\right)^k \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^{k+1},$$
the expected return is
$$\sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^{k+1} 2^k = \sum_{k=0}^{\infty} \frac{1}{2} = \infty.$$
If you care only about the expected value of returns you are willing to pay an arbitrarily
large amount in order to take part in this gamble, say one million. But this is not
realistic.
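The divergence of the expected return can be made concrete by truncating the gamble. The sketch below computes the expected return when the game is cut off after a given maximal number of heads; the cut-off points are arbitrary, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def truncated_expected_return(max_heads):
    """Sum of (1/2)^(k+1) * 2^k for k = 0, ..., max_heads."""
    return sum(Fraction(1, 2) ** (k + 1) * 2 ** k for k in range(max_heads + 1))


# Each term contributes exactly one half, so the partial sums grow without bound.
partial_sums = [truncated_expected_return(n) for n in (0, 9, 99)]
print(partial_sums)
```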
The above examples suggest that we need to take the decision maker's risk attitudes into account. Risk attitudes can be classified into three types, although
we can think of mixtures of them.
1. Risk-averse: If I can receive a sure amount which is equal to the expected return of a given gamble, I will take the sure receipt.
2. Risk-loving: If I can receive a sure amount which is equal to the expected return of a given gamble, I will rather take the gamble.
3. Risk-neutral: I care only about expected returns.
9.3
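[Figure: graph of the vNM index over amounts from 0 to 100, with the amounts 25, 36, 50, 60 and the index values 0.5, 0.6 marked.]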
Now what about another bet $(100; 0.6, 0; 0.4)$? Again, if the decision maker is
risk-neutral the certainty equivalent of this bet for him is the expected return,
which is 60. However, because we are talking about a risk-averse decision
maker, the certainty equivalent is lower than 60, say 36. In order to be
consistent with this value of the certainty equivalent, we assign $v(36)$ so that
$$v(36) = 0.6v(100) + 0.4v(0) = 0.6.$$
By repeating this argument, in the end we obtain a function
$$v : [0, 100] \to [0, 1]$$
which exhibits a graph like Figure 9.1. Let us call this function the von Neumann/Morgenstern
index (vNM index), after the names of the founders of the theory.
When the decision maker is risk-averse the vNM index obtained exhibits a
graph which is convex to the top, that is, concave. On the other hand, if the decision maker is
risk-neutral the certainty equivalent of any bet is equal to its expected return
and the obtained vNM index is a straight line (it corresponds to the dotted line
connecting $(0, 0)$ and $(100, 1)$).
Once we obtain the vNM index of a given decision maker, we can find the
certainty equivalent of any bet for him. For example, the certainty equivalent
of $(25; 0.5, 81; 0.5)$ is $z$ such that $v(z) = 0.5v(25) + 0.5v(81)$.
In order to fit the above numerical example one can take the vNM index
$v(z) = \sqrt{z}/10$ (actually I made the numerical examples so that this is the case,
in order to make the explanation simpler). Hence the certainty equivalent of the
bet $(25; 0.5, 81; 0.5)$ is given by $z$ such that $\sqrt{z}/10 = 0.5\sqrt{25}/10 + 0.5\sqrt{81}/10$,
and we obtain $z = 49$ by solving it. Note that the expected return of the bet is
53, on the other hand.
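The computation of a certainty equivalent from a vNM index can be sketched as follows, using the index $v(z) = \sqrt{z}/10$ from the numerical example. Representing a bet as a list of (outcome, probability) pairs is a choice made only for this illustration.

```python
import math


def vnm_index(z):
    """The vNM index v(z) = sqrt(z) / 10 used in the numerical example."""
    return math.sqrt(z) / 10


def inverse_vnm_index(value):
    """Inverse of the index above: the amount z with v(z) = value."""
    return (10 * value) ** 2


def certainty_equivalent(bet):
    """bet is a list of (outcome, probability) pairs."""
    expected_index = sum(prob * vnm_index(outcome) for outcome, prob in bet)
    return inverse_vnm_index(expected_index)


def expected_return(bet):
    return sum(prob * outcome for outcome, prob in bet)


bet = [(25, 0.5), (81, 0.5)]
ce = certainty_equivalent(bet)
er = expected_return(bet)
print(ce, er)
```

The same function reproduces the value 36 assigned to the bet $(100; 0.6, 0; 0.4)$.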
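For the gamble in Example 9.2, the certainty equivalent $z$ under this vNM index satisfies
$$\frac{\sqrt{z}}{10} = \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^{k+1} \frac{\sqrt{2^k}}{10}.$$
In contrast to the previous case, the right-hand side of the above equation is
finite, since
$$\sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^{k+1} \frac{\sqrt{2^k}}{10} = \frac{1}{20} \sum_{k=0}^{\infty} \left(\frac{\sqrt{2}}{2}\right)^k = \frac{2 + \sqrt{2}}{20}.$$
Solving for $z$ gives the certainty equivalent
$$z = \frac{3 + 2\sqrt{2}}{2},$$
which is approximately 2.91.
9.4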
Expected utility represents the decision maker's preference over bets. Thus we
first need to formalize bets. A bet is given as a probability distribution over
outcomes, which is called a lottery. Let $Z$ be the set of outcomes. In a typical
application we have $Z = \mathbb{R}_+$, which consists of possible values of receipts. A
lottery is a list of possible outcomes and their probabilities. For example, a
lottery denoted by $p = (x_1; p_1, x_2; p_2, \ldots, x_n; p_n)$ gives outcome $x_k \in Z$ with
probability $p_k$ for each $k = 1, \ldots, n$, where $n$ is the number of possible outcomes
in lottery $p$. Since a lottery is a probability distribution over outcomes we must
have $\sum_{k=1}^{n} p_k = 1$. Let $\Delta(Z)$ denote the set of such lotteries over $Z$.
Note that we can mix two lotteries in $\Delta(Z)$ and make a so-called compound
lottery, again as an element of $\Delta(Z)$. For example, given $p = (x_1; p_1, \ldots, x_n; p_n)$
and $q = (y_1; q_1, \ldots, y_m; q_m)$, consider a lottery which gives $p$ with probability
$\lambda$ and $q$ with probability $1 - \lambda$. In this compound lottery the probability
of outcome $x_k$ is $\lambda p_k$ for each $k = 1, \ldots, n$, and the probability of outcome $y_j$
is $(1 - \lambda)q_j$ for each $j = 1, \ldots, m$. Hence the compound lottery is given by
$$\lambda p + (1 - \lambda)q = (x_1; \lambda p_1, \ldots, x_n; \lambda p_n, y_1; (1 - \lambda)q_1, \ldots, y_m; (1 - \lambda)q_m),$$
where duplicated outcomes are suitably recounted. For example, the compound
lottery made of $p = (A; 0.5, B; 0.3, C; 0.2)$ and $q = (C; 0.7, D; 0.3)$ with proportion $0.4 : 0.6$ is
$$0.4p + 0.6q = (A; 0.2, B; 0.12, C; 0.5, D; 0.18).$$
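The mixing operation, including the recounting of duplicated outcomes, can be sketched with dictionaries that map outcomes to probabilities. The representation and the function name are choices made only for this illustration; the numbers are those of the example above.

```python
from collections import defaultdict


def mix(p, q, weight_on_p):
    """Compound lottery giving p with probability weight_on_p and q otherwise.

    Outcomes appearing in both lotteries have their probabilities added up.
    """
    mixed = defaultdict(float)
    for outcome, prob in p.items():
        mixed[outcome] += weight_on_p * prob
    for outcome, prob in q.items():
        mixed[outcome] += (1 - weight_on_p) * prob
    return dict(mixed)


p = {"A": 0.5, "B": 0.3, "C": 0.2}
q = {"C": 0.7, "D": 0.3}
compound = mix(p, q, 0.4)
print(compound)
```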
The decision maker has a preference $\succsim$ over $\Delta(Z)$. For example, the relation
$p \succsim q$
$$u(p) = f\left(\sum_{k=1}^{n} v(x_k) p_k\right)$$
9.5
pq
which realizes $q$ and $p$ respectively he would say, "wait, I would like to receive $p$ rather
than $q$." Such a phenomenon is called dynamic inconsistency.
Mixture Independence is interpreted as requiring that such inconsistency does
not arise. Thus it is appealing at least as a normative rationality requirement
which imposes dynamic consistency.
The independence condition is often violated in experiments, however. I will
come to this later.
The following is called the expected utility representation theorem, but I
relegate its proof to an advanced book such as Mas-Colell, Whinston and Green
[21].
Theorem 9.1 A preference $\succsim$ over lotteries satisfies the above three conditions
if and only if there exists a vNM index $v$ such that the preference is represented
in the form
$$u(p) = f\left(\sum_{k=1}^{n} v(x_k) p_k\right)$$
for each lottery $p = (x_1; p_1, \ldots, x_n; p_n)$.
9.6
certainty equivalent
of $(0; 0.5, 100; 0.5)$ for each case, then the former yields $\sqrt{z} = 0.5\sqrt{0} + 0.5\sqrt{100}$,
which implies the certainty equivalent is 25, while the latter yields $z = 0.5 \cdot 0 +
0.5 \cdot 100$, which implies the certainty equivalent is 50.
When you take any monotone transformation of the whole representation
$u$ it again represents the same preference. However, if you take an arbitrary
monotone transformation of a vNM index $v$, which is the square root in the above
case, it in general changes the preference which it describes. Hence $v$ has a certain quantitative meaning, while its meaning has a certain limitation as discussed
below.
Again, let me emphasize that while $v$ is cardinal, the entire representation $u$ is ordinal, because any monotone transformation of it represents
the same preference. That's why I choose not to call $v$ a utility function but
call it a vNM index, in order to avoid confusion.
Note, however, that a vNM index has its meaning only in its curvature, and its value itself has no quantitative meaning. That is, it has no
meaning like how happy the decision maker is. This is the same as before.
The curvature of a function changes in general if you take an arbitrary monotone
transformation of the function, but it does not change under any positive affine transformation, which consists of multiplying by positive constants and adding constants.
Thus the preference described by a given vNM index and the certainty equivalents for the preference are unchanged
under positive affine transformations.
\[ z = p_1 x_1 + p_2 x_2 \]
Notice that this z satisfies
\[ a z + b = p_1 (a x_1 + b) + p_2 (a x_2 + b) = a(p_1 x_1 + p_2 x_2) + b \]
as well, which implies it is the certainty equivalent obtained from $\tilde{v}(z) = a z + b$.
The converse is true as well.
In general you can say the following.
Proposition 9.1 Suppose v is a vNM index describing some preference in the
expected utility form. Then for any constants a, b with a > 0, av + b describes
the same preference in the expected utility form.
Also, if two vNM indices $v$ and $\hat{v}$ describe the same preference over lotteries in the
expected utility form, then there exist constants a, b with a > 0 such that $\hat{v} = av + b$.
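As a numerical illustration of the proposition (not a proof), the sketch below computes certainty equivalents by bisection for an index v, for an affine transformation of it, and for a monotone transformation that is not affine. The choice v(z) = sqrt(z), the constants a = 3 and b = 7, and all function names are assumptions made for this example.

```python
import math

def certainty_equivalent(v, lottery, lo=0.0, hi=1000.0, tol=1e-10):
    """Solve v(z) = E[v(x)] for z by bisection; v is assumed increasing on [lo, hi]."""
    target = sum(prob * v(x) for x, prob in lottery)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if v(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lottery = [(0.0, 0.5), (100.0, 0.5)]      # (outcome, probability) pairs
v = math.sqrt
affine = lambda z: 3.0 * v(z) + 7.0       # a = 3 > 0, b = 7
squared = lambda z: v(z) ** 2             # monotone on z >= 0, but not affine in v

ce_v = certainty_equivalent(v, lottery)
ce_affine = certainty_equivalent(affine, lottery)
ce_squared = certainty_equivalent(squared, lottery)
print(ce_v, ce_affine, ce_squared)
```

In this instance the first two certainty equivalents coincide, while the non-affine transformation moves the certainty equivalent, which is the distinction the proposition draws.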
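Proof. ($\Rightarrow$) part: Since v is a vNM index which describes the given preference
in the expected utility form, for any monotone transformation f,
\[ f\left( \sum_{k=1}^{n} v(x_k)\, p_k \right) \]
represents the preference. Since the probabilities sum to one,
\[ f\left( \sum_{k=1}^{n} (a v(x_k) + b)\, p_k \right) = f\left( a \sum_{k=1}^{n} v(x_k)\, p_k + b \right) \]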
v(z)b
v (z) v(z)b
v (z)
v(z) v(z)
v(z) v(z)
,
v(z) v(z)
\[ \pi (a x_1 + b) + (1 - \pi)(a x_2 + b) = a(\pi x_1 + (1 - \pi) x_2) + b = v(\pi x_1 + (1 - \pi) x_2) \]
which implies the decision maker is indifferent between the bet and sure
receipt of its expected return.
2. A decision maker is risk-averse when the vNM index describing his preference in the expected utility form is concave, that is, when it holds
\[ \pi v(x_1) + (1 - \pi) v(x_2) < v(\pi x_1 + (1 - \pi) x_2). \]
This means that given a bet $(x_1; \pi, x_2; 1 - \pi)$ and its expected return
$\pi x_1 + (1 - \pi) x_2$ the decision maker chooses the latter to be received for
sure.
3. A decision maker is risk-loving when the vNM index describing his preference in the expected utility form is convex, that is, when it holds
\[ \pi v(x_1) + (1 - \pi) v(x_2) > v(\pi x_1 + (1 - \pi) x_2). \]
This means that given a bet $(x_1; \pi, x_2; 1 - \pi)$ and its expected return
$\pi x_1 + (1 - \pi) x_2$ the decision maker chooses the bet.
You can think about which of the following correspond to risk aversion or risk
loving.
\[ v(z) = \ln z \]
\[ v(z) = z^2 \]
\[ v(z) = e^{z} \]
\[ v(z) = -e^{-z} \]
Of course one may think of a vNM index which exhibits risk aversion (concavity)
in some region and risk loving (convexity) in some region, but let me omit it in
this book.
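The definitions of risk aversion and risk loving can be tested on sample points. The sketch below compares the expected index of a binary bet with the index of its expected return. It is only a finite check on a grid, not a proof of concavity or convexity. The four candidate indices are written as ln z, z^2, e^z and -e^(-z); the last two forms, the extra linear index, the grid, and the probabilities are assumptions chosen for illustration.

```python
import math

def risk_attitude(v, points, weights=(0.25, 0.5, 0.75), tol=1e-12):
    """Classify v on the sampled points by comparing the expected index of a
    binary bet with the index of the bet's expected return."""
    signs = set()
    for x1 in points:
        for x2 in points:
            if x1 == x2:
                continue
            for pi in weights:
                bet = pi * v(x1) + (1 - pi) * v(x2)
                sure = v(pi * x1 + (1 - pi) * x2)
                if bet < sure - tol:
                    signs.add("risk-averse")
                elif bet > sure + tol:
                    signs.add("risk-loving")
                else:
                    signs.add("risk-neutral")
    return signs.pop() if len(signs) == 1 else "mixed"

grid = [0.5, 1.0, 2.0, 3.0]
candidates = {
    "ln z": math.log,
    "z^2": lambda z: z ** 2,
    "e^z": math.exp,
    "-e^(-z)": lambda z: -math.exp(-z),
    "2z + 1": lambda z: 2 * z + 1,
}
for name, v in candidates.items():
    print(name, risk_attitude(v, grid))
```

An index that is concave in one region and convex in another would be reported as "mixed" if the grid covers both regions.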
9.7
Applications
In the applications below we assume that any bet can have at most two outcomes and also that the probabilities are fixed. That is, we look at the expected
utility representation applied to binary bets
\[ u(x_1; \pi, x_2; 1 - \pi) = \pi v(x_1) + (1 - \pi) v(x_2), \]
where we fix $\pi$, $1 - \pi$ and vary $x = (x_1, x_2)$ only. Thus, we simplify the notation to
\[ u(x) = \pi v(x_1) + (1 - \pi) v(x_2) \]
9.7.1
Insurance purchase
Example 9.3 The decision maker has initial income w, and his risk attitude is
described by vNM index v. If an accident happens to him he loses L, where the
accident probability is $\pi$.
There is an insurance available for purchase, and one dollar of expense on it pays
R dollars of income, where R > 1. How much does he spend on the insurance?
Let t denote the income he spends on the insurance. Then,
if an accident happens he ends up with final income $w - L - t + Rt = w - L + (R - 1)t$;
if no accident happens he ends up with final income $w - t$.
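He chooses t so as to maximize the expected utility $\pi v(w - L + (R - 1)t) + (1 - \pi) v(w - t)$, and the first-order condition is
\[ \frac{v'(w - L + (R - 1)t)}{v'(w - t)} = \frac{1 - \pi}{\pi (R - 1)} \qquad (*) \]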
We find the optimal expenditure on the insurance by solving equation (*). For
example, if v(z) = ln z the solution is
\[ t = \frac{\pi R - 1}{R - 1}\, w + \frac{1 - \pi}{R - 1}\, L. \]
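The log-utility solution can be cross-checked against a direct numerical maximization. The sketch below assumes the objective is the expected log utility pi * ln(w - L + (R - 1)t) + (1 - pi) * ln(w - t), with pi the accident probability, and compares the closed form t = ((pi R - 1) w + (1 - pi) L) / (R - 1) with a ternary search. The numbers and function names are hypothetical, and the closed form is only meaningful when it gives an interior value of t.

```python
import math

def expected_utility(t, w, L, R, pi):
    """Expected log utility when t dollars are spent on insurance."""
    return pi * math.log(w - L + (R - 1) * t) + (1 - pi) * math.log(w - t)

def optimal_t_closed_form(w, L, R, pi):
    """Solution of the first-order condition for v(z) = ln z."""
    return ((pi * R - 1) * w + (1 - pi) * L) / (R - 1)

def optimal_t_numeric(w, L, R, pi, iterations=200):
    """Ternary search on [0, w); valid because the objective is concave in t."""
    lo, hi = 0.0, w * (1 - 1e-9)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if expected_utility(m1, w, L, R, pi) < expected_utility(m2, w, L, R, pi):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

w, L, R, pi = 100.0, 50.0, 4.0, 0.3   # hypothetical numbers, with w > L
t_formula = optimal_t_closed_form(w, L, R, pi)
t_search = optimal_t_numeric(w, L, R, pi)
print(t_formula, t_search)
```

The two values should agree up to the precision of the search, which is limited because the objective is flat near its maximum.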
Now, suppose
1
()
9.7.2
Portfolio choice 1
Example 9.4 The decision maker has income w, and his risk attitude is described by vNM index v(x). There are two assets available. One is safe, and
pays $R$ per one unit of investment. The other is risky, which pays $\overline{R}$ per one
unit of investment with probability $\pi$ and pays $\underline{R}$ per one unit with probability
$1 - \pi$, where $\overline{R} > R > \underline{R}$. Let me call the first case State 1 and the second State
2.
9.7.3
Remember that even if goods are materially the same they are treated as different goods if they are to be delivered at different contingencies. For example,
1 gallon of gasoline when Republicans win the US presidential election is a different good than one gallon of gasoline when Democrats win. If you have to
make some investment decision before the election, your choice (that is, your bet) is described
in the form of state-contingent consumption.
Again, to illustrate, let's restrict attention to the case where there are just two
scenarios, such as "Republicans or Democrats" or "hot summer or cool summer." Let me call one State 1 and the other State 2, while the analysis is easily
extended to the case of many states.
Then, a vector of state-contingent consumption denoted by x = (x1 , x2 )
says that the decision maker receives x1 units of consumption good if State 1
happens and x2 units if State 2 happens.
This is nothing but a special case of the two-good model we studied in the
previous chapters, where Good 1 refers to consumption contingent on State 1
and Good 2 refers to consumption contingent on State 2.
State-contingent consumption has the following interpretation:
Good 1 = a security which gives one unit of consumption per one unit of it if
State 1 occurs and is a junk if State 2 occurs;
Good 2 = a security which is a junk if State 1 occurs and gives one unit of
consumption per one unit of it if State 2 occurs.
Such security is called Arrow security.
Any security is given as a combination of Arrow securities. To illustrate,
go back to the previous example. There, holding one unit of the safe asset, which
yields $R$ units of consumption for sure, is equivalent to holding $R$ units of Arrow
security 1 and $R$ units of Arrow security 2. Also, holding one unit of the risky asset,
which yields $\overline{R}$ units of consumption at State 1 and $\underline{R}$ units of consumption
at State 2, is equivalent to holding $\overline{R}$ units of Arrow security 1 and $\underline{R}$ units of
Arrow security 2.
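The translation between asset holdings and Arrow securities can be sketched as a small linear map. In the code below `R` is the safe return, while `R_high` and `R_low` stand for the risky asset's returns in State 1 and State 2. These names, the numerical values, and the two helper functions are hypothetical and serve only to illustrate the decomposition.

```python
def arrow_holdings(safe_units, risky_units, R, R_high, R_low):
    """State-contingent consumption (Arrow security 1, Arrow security 2)
    delivered by a portfolio of the safe and the risky asset."""
    x1 = safe_units * R + risky_units * R_high   # payoff in State 1
    x2 = safe_units * R + risky_units * R_low    # payoff in State 2
    return x1, x2

def portfolio_for(x1, x2, R, R_high, R_low):
    """Inverse map: asset holdings that reproduce (x1, x2).
    A negative number means the asset is sold short."""
    risky_units = (x1 - x2) / (R_high - R_low)
    safe_units = (x2 - risky_units * R_low) / R
    return safe_units, risky_units

R, R_high, R_low, w = 1.05, 1.30, 0.90, 100.0   # hypothetical returns and income
all_safe = arrow_holdings(w, 0.0, R, R_high, R_low)
all_risky = arrow_holdings(0.0, w, R, R_high, R_low)
# A consumption vector with more State 1 consumption than the all-risky portfolio gives.
beyond = portfolio_for(155.0, 75.0, R, R_high, R_low)
print(all_safe, all_risky, beyond)
```

With these numbers the vector (155, 75) costs exactly w but requires a negative holding of the safe asset, so it is reachable only if short-selling is allowed.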
Note, however, that this interpretation assumes that short-sales are allowed
for such real securities as long as the holdings of Arrow securities are non-negative.
To illustrate, see Figure 9.3. When you put all initial income w on the safe asset
you get the upper-left end point of the thick segment on the 45-degree line, which
gives you the holding of Arrow securities $(Rw, Rw)$. When you put all initial
income on the risky asset you get the lower-right end point of the thick segment,
which gives you the holding of Arrow securities $(\overline{R}w, \underline{R}w)$. However, without
allowing short-selling you cannot obtain every point on the standard budget
line in the sense introduced before. For example, any point on the dotted part
to the lower-right of $(\overline{R}w, \underline{R}w)$ can be obtained only by short-selling the safe asset,
and any point on the dotted part to the upper-left of $(Rw, Rw)$ can be obtained
only by short-selling the risky asset. Thus directly handling the exchange of
Arrow securities on the budget line in the standard sense requires that such
short-selling is allowed.
Let us assume that the decision maker's preference is represented in the
expected utility form where the vNM index is denoted by v. Let $\pi$ denote the
probability that State 1 occurs, where State 2 occurs with probability $1 - \pi$.
Then his preference induced over state-contingent consumptions is represented
in the form
\[ u(x) = \pi v(x_1) + (1 - \pi) v(x_2) \]
Notice that here the marginal rate of substitution of Good 2 for Good 1 is given
by
\[ MRS(x) = \frac{\pi v'(x_1)}{(1 - \pi) v'(x_2)}. \]
Figure 9.3: Short-sale constraints (horizontal axis: Security 1; vertical axis: Security 2)
Let $e = (e_1, e_2)$ denote the decision maker's initial holding of the securities.
In a security exchange market, given a vector of security prices $p = (p_1, p_2)$, he solves
\[ \max_{x} \; \pi v(x_1) + (1 - \pi) v(x_2) \]
subject to
\[ p_1 x_1 + p_2 x_2 = p_1 e_1 + p_2 e_2 \]
where his income is given by the market value of his initial security holdings.
When the vNM index is smooth, the optimal choice is given by the tangency
condition that the marginal rate of substitution is equal to the relative price. Then
we have
\[ \frac{\pi v'(x_1)}{(1 - \pi) v'(x_2)} = \frac{p_1}{p_2} \]
Combining this with the budget constraint $p_1 x_1 + p_2 x_2 = p_1 e_1 + p_2 e_2$ and solving
the equations for $x = (x_1, x_2)$, we obtain the security demand.
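To see the two-equation system at work, here is a minimal sketch that solves the tangency condition together with the budget constraint for one particular index. The choice v(z) = sqrt(z) is an assumption made for this example, as are the function names and the numbers; `pi` denotes the probability of State 1.

```python
import math

def sqrt_security_demand(pi, p1, p2, e1, e2):
    """Security demand for v(z) = sqrt(z).
    Tangency: pi * v'(x1) / ((1 - pi) * v'(x2)) = p1 / p2,
    which for the square-root index gives sqrt(x2 / x1) = (1 - pi) * p1 / (pi * p2)."""
    income = p1 * e1 + p2 * e2
    k = (1 - pi) * p1 / (pi * p2)
    x1 = income / (p1 + p2 * k ** 2)   # substitute x2 = k^2 * x1 into the budget
    x2 = k ** 2 * x1
    return x1, x2

def mrs(pi, x1, x2):
    """Marginal rate of substitution for v(z) = sqrt(z)."""
    v_prime = lambda z: 0.5 / math.sqrt(z)
    return pi * v_prime(x1) / ((1 - pi) * v_prime(x2))

pi, p1, p2, e1, e2 = 0.5, 2.0, 1.0, 10.0, 30.0   # hypothetical values
x1, x2 = sqrt_security_demand(pi, p1, p2, e1, e2)
print(x1, x2, mrs(pi, x1, x2), p1 / p2)
```

The printed marginal rate of substitution should equal the relative price, and the chosen bundle should exhaust the market value of the endowment.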
For example, when the vNM index is v(z) = ln z, his preference over state-contingent consumptions is represented by
\[ u(x) = \pi \ln x_1 + (1 - \pi) \ln x_2 \]
This is nothing but Cobb-Douglas preference, so the demand for state-contingent
consumptions is obtained by applying the previous result, which yields
\[ x_1(p) = \frac{\pi (p_1 e_1 + p_2 e_2)}{p_1}, \qquad x_2(p) = \frac{(1 - \pi)(p_1 e_1 + p_2 e_2)}{p_2}. \]
9.8
9.8.1
As noted above, the mixture independence condition is often violated in experiments. Consider the following example.
\[ u(p) = f\left( \sum_{k=1}^{n} v(x_k)\, p_k \right) \]
Example 9.5
Mom has a single indivisible item, a "treat,"
which she can give to either daughter Abigail or son Benjamin. Assume that she is indifferent between Abigail getting the treat and
Benjamin getting the treat, and strongly prefers either of these outcomes to the case where neither child gets it. However, in a violation
of the precepts of expected utility theory, Mom strictly prefers a coin
flip over either of these sure outcomes, and in particular, strictly
prefers 1/2 : 1/2 to any other pair of probabilities. This random
allocation procedure would be straightforward, except that Benjie,
who cut his teeth on Raiffa's classic Decision Analysis, behaves as
follows:
Before the coin is flipped, he requests a confirmation from
Mom that, yes, she does strictly prefer a 50:50 lottery
over giving the treat to Abigail. He gets her to put this
in writing. Had he won the flip, he would have claimed
the treat. As it turns out, he loses the flip. But as Mom
is about to give the treat to Abigail, he reminds Mom of
her preference for flipping a coin over giving it to Abigail
(producing her signed statement), and demands that she
flip again.
What would your Mom do if you tried to pull a stunt like this? She
would undoubtedly say "You had your chance!" and refuse to flip
the coin again. This is precisely what Mom does.
Each of Mom's claim and Benjamin's claim amounts to a problem. If we accept
Benjamin's claim and flip the coin again, we run into a problem of dynamic inconsistency, since Abigail wins the item with probability $1/2 \times 1/2 = 1/4$
and Benjamin wins with probability $1/2 + 1/2 \times 1/2 = 3/4$, which is unfair
in any sense from the ex-ante viewpoint. Abigail will say the same thing when
she loses. So Mom will have to flip coins forever.
If we accept Mom's claim, we have to go outside of the standard notion of
rationality called consequentialism, which says our decisions should not be
affected by bygones. Here Mom's claim "you had your chance" brings up what
Benjamin (and Abigail) could have got if he had won the first flip, which is
nothing but a bygone.
9.8.2
There is another but related implicit assumption behind the equivalence between Mixture Independence and dynamic consistency. It is that the decision
maker is indifferent to the timing of resolution of risk, so that only the probability
distributions over final outcomes matter.1
To illustrate, let us think of the following example.
1 Depending
1. Flip a coin twice, and receive 100 dollars if both flips are head, 50 if the
first is head and the second flip is tail, 30 if the first flip is tail and the
second is head, 0 if both flips are tail.
2. Throw a four-face die, and receive 100 dollars if the face is 1, 50 if it is 2,
30 if 3, 0 if 4.
Since each outcome occurs with probability one quarter in both gambles, they induce
the same probability distribution over outcomes. However, they are different
when the decision maker cares about the timing of resolution of risk, and even more
so when the second coin flip is made after a certain time.
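That the two gambles induce the same distribution over prizes can be verified by enumeration. The sketch below assumes a fair coin and a fair four-face die; the dictionary and function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Prize for each sequence of two coin flips and for each face of the die.
coin_prizes = {("H", "H"): 100, ("H", "T"): 50, ("T", "H"): 30, ("T", "T"): 0}
die_prizes = {1: 100, 2: 50, 3: 30, 4: 0}

def distribution(prizes, outcomes):
    """Probability of each prize when all listed outcomes are equally likely."""
    outcomes = list(outcomes)
    dist = Counter()
    for outcome in outcomes:
        dist[prizes[outcome]] += Fraction(1, len(outcomes))
    return dict(dist)

coin_dist = distribution(coin_prizes, product("HT", repeat=2))
die_dist = distribution(die_prizes, range(1, 5))
print(coin_dist == die_dist)
```

The comparison deliberately ignores when the uncertainty is resolved, which is exactly the information a decision maker who cares about timing would not ignore.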
Recall the explanation of Mixture Independence. There was actually a
cheat. It is the assumption that the two-stage lottery which gives p with probability $\lambda$ and q with probability $1 - \lambda$ and the compound lottery $\lambda p + (1 - \lambda) q$
are equivalent. However, the latter by definition is a one-stage lottery
\[ \lambda p + (1 - \lambda) q = (x_1; \lambda p_1, \dots, x_n; \lambda p_n, y_1; (1 - \lambda) q_1, \dots, y_m; (1 - \lambda) q_m) \]
If the decision maker cares about the timing of resolution of risk they may not be
equivalent. Then we need to think of lotteries over lotteries, lotteries over
lotteries over lotteries, and so on, so that they are treated differently across
the layers of timings. This is the idea of recursive utility theory due to Kreps and
Porteus [15].
9.8.3
Let us think more about the assumption that only probability distributions over
final outcomes matter. Consider the following example.
QA: When 50 dollars are initially given, which one do you choose?
A1= receiving 50 more dollars with probability 50% and nothing with
probability 50%
A2= receiving 25 more dollars for sure
QB: When 100 dollars are initially given, which one do you choose?
B1= losing 50 dollars with probability 50% and nothing with probability
50%
B2= losing 25 dollars for sure
It is observed in experiments using similar numbers that there are many subjects who choose A2 in QA and B1 in QB. The two choice problems are equivalent, however, under the assumption that only probability distributions over final
outcomes matter, as both A1 and B1 induce the same lottery (100; 0.5, 50; 0.5)
and A2 and B2 induce (75; 1).
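The equivalence of the two questions in terms of final outcomes is simple arithmetic, sketched below. The helper `final_lottery` is a name introduced here for illustration; it adds each gain or loss to the initial amount and sorts the result so that lotteries can be compared directly.

```python
def final_lottery(initial, changes):
    """Map (change, probability) pairs to a sorted list of (final amount, probability)."""
    return sorted((initial + change, prob) for change, prob in changes)

# QA: 50 dollars are initially given.
A1 = final_lottery(50, [(50, 0.5), (0, 0.5)])
A2 = final_lottery(50, [(25, 1.0)])
# QB: 100 dollars are initially given.
B1 = final_lottery(100, [(-50, 0.5), (0, 0.5)])
B2 = final_lottery(100, [(-25, 1.0)])

print(A1 == B1, A2 == B2)
```

Both comparisons hold, so a decision maker who looks only at final outcomes faces the same choice in QA and QB.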
The above example shows that decision makers tend to be risk averse
when the risk is about how much to gain and tend to be risk loving when the
risk is about how much to lose, in the sense that they would rather gamble
than lose something for sure. This is called loss aversion, and it is one of the key
components of the prospect theory due to Tversky and Kahneman [34].
9.9
Exercises
Chapter 10
Revealed preference
Until the previous chapter we have assumed a priori that each individual has
his preference and chooses best alternatives according to it. It is of course a
natural response to wonder, however, if that's true.
You cannot open your brain physically, however, in order to show your preference or maximization process directly as biological objects or substances or
structures or processes.1
So here we take the standpoint of asking whether observed choices can be explained consistently as the maximization of some preference, rather than asking
whether we can find preferences as physical entities. This is called the revealed preference approach.
The revealed preference approach starts with observed choice data. Here the
data is taken to be a list of pairs, each of which consists of a set of available
alternatives called an opportunity set, say denoted by B, and a subset of it
denoted by $\varphi(B)$ consisting of alternatives chosen from B. Let X be the set
of all the potentially available alternatives, which is assumed to be finite for
simplicity. We assume that an opportunity set can be any nonempty subset of
X. Thus the family of all the possible opportunity sets is given by
\[ \mathcal{B} = \{ B : B \subset X, \; B \neq \emptyset \} \]
Given an opportunity set $B \in \mathcal{B}$, let $\varphi(B)$ denote the set of alternatives
chosen from it. Here we allow that $\varphi(B)$ may consist of several elements, that
is, we leave ties as they are and do not get into how ties are broken. Thus $\varphi(B)$
is a nonempty subset of B.
In other words, $\varphi$ is a mapping from $\mathcal{B}$ into itself with the property that
$\varphi(B) \subset B$ for all $B \in \mathcal{B}$. Thus it is called a choice mapping. Any observed
data is given as a choice mapping.
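As a data structure, a choice mapping on a finite set can be stored as a dictionary from opportunity sets to chosen subsets. The sketch below is one possible encoding, not the only one. The name `phi` for the mapping, the three alternatives, and the scores used to generate the choices are all hypothetical.

```python
from itertools import combinations

def opportunity_sets(X):
    """All nonempty subsets of the finite set X."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def is_choice_mapping(phi, X):
    """Check that phi assigns to every opportunity set B a nonempty subset of B."""
    return all(B in phi and phi[B] and phi[B] <= B for B in opportunity_sets(X))

X = {"a", "b", "c"}
ranking = {"a": 2, "b": 2, "c": 1}   # hypothetical scores; a and b tie
# Choose every top-scored alternative in B, leaving ties unbroken.
phi = {B: frozenset(x for x in B if ranking[x] == max(ranking[y] for y in B))
       for B in opportunity_sets(X)}

print(len(opportunity_sets(X)), is_choice_mapping(phi, X), sorted(phi[frozenset(X)]))
```

Here the choices are generated from a ranking only to have something to store; the revealed preference question runs the other way, starting from the dictionary and asking whether some preference could have produced it.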
1 I'm aware that there have been such lines of research in recent decades, but let me take a
classical and conservative standpoint here.
What kind of choices are not rationalizable in the above sense? There is
still a certain order in such choices.
Let us think of the following example.
Example 10.1 (Minimax regret): Consider choice under uncertainty with
two states of the world, denoted by s1 and s2. Consider the following state-contingent receipt of prize, where for example y = (1, 5) denotes 1 unit if s1
occurs and 5 units if s2 occurs.
        x   y   z
s1      2   1   5
s2      2   5   1

                    x   y   z   ex-post maximum
outcome   s1        2   1   5   5
outcome   s2        2   5   1   5
regret    s1        3   4   0
regret    s2        3   0   4
maximal regret      3   4   4
The minimax regret choice violates Contraction (it violates Expansion too,
but I omit it here). For example, if you drop z and consider a binary choice
{x, y}, then the table becomes
                    x   y   ex-post maximum
outcome   s1        2   1   2
outcome   s2        2   5   5
regret    s1        0   1
regret    s2        3   0
maximal regret      3   1
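The regret tables can be reproduced with a few lines of code. The sketch below takes the rule to be: for each state, regret is the best outcome available in the menu minus the act's outcome, and the chosen acts are those with the smallest maximal regret. The function name and the dictionary encoding are illustrative.

```python
def minimax_regret(acts, menu):
    """Return the set of acts in `menu` that minimize maximal regret.
    acts maps each act to its tuple of outcomes, one entry per state."""
    n_states = len(next(iter(acts.values())))
    # Ex-post maximum in each state, taken over the acts actually in the menu.
    best = [max(acts[a][s] for a in menu) for s in range(n_states)]
    max_regret = {a: max(best[s] - acts[a][s] for s in range(n_states)) for a in menu}
    lowest = min(max_regret.values())
    return {a for a in menu if max_regret[a] == lowest}

acts = {"x": (2, 2), "y": (1, 5), "z": (5, 1)}
from_all = minimax_regret(acts, {"x", "y", "z"})
from_pair = minimax_regret(acts, {"x", "y"})
print(from_all, from_pair)
```

Act x is chosen from {x, y, z} but not from {x, y}, even though x is still available in the smaller menu, because dropping z changes the ex-post maximum in state s1.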
Part II
Chapter 11
11.1
Perfect competition
11.1.1
What do you imagine from the word "competition"? Do you imagine a situation
like everybody killing each other?
To my understanding, whether one is for or against economic competition is
largely affected by how the word "competition" sounds to him or her, that is,
whether he or she feels heroism in this word in a positive way or a negative way. It
has nothing to do with what economics is talking about.
I would like you to forget about this sound. Let me start by giving a bare-bones,
tedious definition.
Definition 11.1 A market is said to be perfectly competitive if every market
participant takes the market price as given.
11.1.2
2: It may be OK for consumers, but producers cannot be just passively responding to the price announced by the auctioneer.
A large producer would rather take price as a function of its supply, instead
of returning its supply as a function of price. Such a producer would rather try
to manipulate the price in its favor. Assuming the existence of an auctioneer
does not by itself guarantee that market participants passively respond to price.
When there are large market participants which can manipulate price we
say that the market is imperfectly competitive, and that they have market
power. When the market is imperfectly competitive such large participants will
behave strategically and try to outwit each other. This may be the "competitive"
market many of you might imagine, but in economics we call it imperfect
competition.
The above two are criticisms of the assumption of price-taking. Let us concede again for a moment and accept this. Even after this, there still remains a
problem.
3: Even if the assumption of price-taking is met it is a different question
if we reach competitive equilibrium. Demand and supply may remain
unmatched.
Even if we assume the existence of an auctioneer and consider the tatonnement
process as described, it is a different question whether the process leads the auctioneer
to announce the right price so that demand matches supply. This problem applies
to the case of imperfect competition as well, but let me first explain this in the
context of perfect competition.
11.1.3
Let me emphasize that when the market instead consists of a small number of large
traders, they in general have market power and the condition of perfect
competition fails. Then we need a theory of imperfect competition, which is
covered in Part 3.
The exact process of how imperfect competition converges to perfect competition is covered in Part 3, but let me give a brief explanation of it. First let us
resolve Problem 2. Let me assume that there is an auctioneer, as I will come to
the case without one in the next paragraph. In general, a market participant
with market power takes the effect of his decision on the market price into account. However, when the number of market participants is large, the effect of
an individual participant's change in his quantity on the market price is almost
zero. Thus, each market participant alone has to take the market price as given,
which is determined by the mass behavior of a large number of participants.
Now what about Problem 1? In order to clear this we have to dispense with
the auctioneer. Consider the simplest case that each seller sets the price of his
commodity. Then can each seller freely set his price? No. When there is a
large number of competitors if you set the price too high you will lose demand
and will lose profit. Thus even though each seller is setting his price by
himself he has to take certain market price as given, which is set by the mass
of a large number of sellers.
There are two ways to argue, though. One is to consider that the number
of actual market participants tends to be large, the other is that the number
of actual participants does not have to be large but the market is open to any
potential entrants and potentially there are indefinitely many entrants.
In either case, the word "competitive" means that since there are indefinitely
many actual or potential competitors in the market, no market participant can dominate the market by himself alone, and each has to respond to the mass
tendency in a passive manner.
Finally, what about Problem 3? The first thing we can think of is the tatonnement process as explained above. However, the tatonnement argument assumes that all trades are done only after the adjustment process finishes and
reaches an equilibrium. Like the assumption of an auctioneer, this is not a realistic
explanation either, at least as far as we take it literally.
This makes it necessary to consider if and how the market reaches competitive equilibrium through decentralized behaviors of market participants without any centralized adjustment.
When there is a large number of market participants each one is negligibly
small and has to take the mass behavior as given. Let us be content with this, as it
is already explained above. The problem here is whether such mass behavior indeed
leads to a competitive equilibrium. Since we do not rely on the auctioneer story,
we have to think of a situation in which the mass of market participants set their
prices and quantities in a decentralized manner.
However, in order that such decentralized mass behavior indeed leads to an
equilibrium in the market, as far as such a game is taken literally, each market
participant has to have a right prediction of how the others set prices and quantities. This seems to require of each participant a terribly high level of rationality.
This point applies to the case of imperfect competition as well.
There will be two views about this. One is
The market equilibrium theory is a loose association of particular
models, in which incomplete strategic reasoning and incomplete adjustment mutually complement each other. It is a detail-free argument, saying that regardless of how particular models work the
situation overall falls in a competitive equilibrium.
It is therefore rather counter-productive to bring up a model of perfect adjustment or perfect reasoning and dismiss the equilibrium
theory on the ground that these are unrealistic.
This sounds somewhat like cheating, but it is understandable: because the frameworks
of our recognition are limited, we have to make an economical choice.
The other is of course to continue to pursue a theoretical foundation of
competitive equilibrium, or the notion of equilibrium in general. This is still an
open question, but it seems to be commonly accepted that the key is learning
and imitation. We have to give up the assumption that people trade only after
complete adjustment is done, and that people perfectly read each other's minds
and reach an equilibrium in a timeless manner. Instead, this view says the market will be
able to form an equilibrium situation through the process of repeating actual
trades, which may or may not be equilibrium ones, and learning the mass
behavior and taking that into account.
In the experimental literature it is known that repeated transactions converge to competitive equilibrium pretty quickly (see for example Joyce [5], Smith
[32]). But the problem is that we don't know yet why.
11.2
Complete market
value, but you cannot borrow this amount from the bank. Then you
cannot buy current consumption by means of selling future consumption
in the fully flexible manner. In this sense the market, in particular the
market for intertemporal trading, is incomplete.
Consider that there is only a safe asset which pays constant return regardless of uncertainty. This may sound good. But if your earning is lower
at some state (State 1, say, Republicans winning) and higher at another
state (State 2, say, Democrats winning) you would like to hedge risk by
means of transferring income from State 2 to State 1. You can do this if
there is another asset, which is risky, and pays higher return than the safe
asset at State 2 and lower return than the safe asset at State 1. Then you
can transfer income from State 2 to State 1 by means of buying the risky
asset and (short-)selling the safe asset. You cannot do this when there is
only a safe asset or generally just one asset. In this sense the asset market
is incomplete.
A consumption or production activity is said to have an externality if
there is no market for it and it is not taken into account in consumption
and production decision in the markets.
A typical example is pollution. The polluter does not take the social
effect of pollution into account in its consumption or production decision,
and the other economic agents cannot stop it. If there is a market for the
right to prevent pollution, people can pay for it in order to enjoy a cleaner
environment. Or, if there is a market for the right to do the activity with
pollution, the polluter may pay for it and the other people may receive
the payment. But the current type of market incompleteness says that
there is no such market.
As a result, resource allocation in the market may be inefficient even when
it is perfectly competitive. This is called market failure.
What is important in the above definition is that market prices do
not take it into account. Even when an activity directly affects other
economic agents it is not called an externality when it is priced and traded
in markets, such as a service.
Complete autarky in which no trade is allowed is the most extreme form
of market incompleteness.
Completeness is critical for efficiency of allocation: when the market is incomplete, even when it is perfectly competitive, the resulting market outcome
may in general be inefficient, in the sense that there is another allocation which
makes everybody better off.
We start with complete markets nevertheless, because it is the best way
to find out and understand precisely what types of market incompleteness are
significant.
11.3
Complete information
such fortune if he does not act rationally, but the fortune cannot come on a
systematic, on-average basis either, and he cannot systematically outwit the
market.
The apologies have become long. Bracketing these apologies, let us see how the baseline model of market works in the next several chapters.
Chapter 12
Competitive equilibrium in
exchange economies
12.1
Exchange economy
\[ \sum_{i=1}^{n} x_{i1} \le \sum_{i=1}^{n} e_{i1}. \]
Because we are excluding the case of wasting resources, without loss of generality
we assume that this constraint is met with equality. Similarly for Good 2. Thus,
\[ \sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} e_{i1}, \qquad \sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} e_{i2}. \]
12.1.1
Edgeworth box
To illustrate, let us consider that there are just two consumers, A and B. There
must be a large number of small participants, however, in order that the assumption of perfect competition makes sense. The assumption of two consumers
apparently contradicts this when it is interpreted literally. Therefore
you should imagine a large number of consumers behind these two, since the
assumption is only for making the illustration easier to follow.
Now, when we draw the two consumers' consumption choices in separate two-dimensional graphs it is hard to see if the feasibility condition is met. So we
draw the two consumers' consumptions in one diagram, which is called the Edgeworth
box.
First, draw A's consumption set and B's one respectively, and depict their
initial endowment points on them respectively. Denote A's initial endowment by
$e_A = (e_{A1}, e_{A2})$ and B's one by $e_B = (e_{B1}, e_{B2})$. Next, rotate B's consumption
set and paste it onto A's one so that B's initial endowment point coincides with
A's one. Then you get a rectangular-shaped diagram as in Figure 12.1, such that
its horizontal length is equal to the total amount of Good 1 available, $e_{A1} + e_{B1}$,
and its vertical length is equal to the total amount of Good 2 available, $e_{A2} + e_{B2}$.
Denote the point at which the two endowment points coincide by $e = (e_A, e_B)$,
which is seen as A's initial endowment point when it is seen from A's origin,
and seen as B's initial endowment point when it is seen from B's origin.
Now pick any point in the box, denoted let's say by $x = (x_A, x_B)$; then the
sum of its horizontal coordinates across A and B is equal to the horizontal length
of the box, and the sum of its vertical coordinates across A and B is equal
to the vertical length of the box. Thus we have $x_{A1} + x_{B1} = e_{A1} + e_{B1}$ and
$x_{A2} + x_{B2} = e_{A2} + e_{B2}$, which is nothing but the feasibility condition. That is,
any feasible allocation is described as a point in this box diagram. Also, the
budget line passing through the initial endowment point is seen as A's one when
it is seen from A's origin and seen as B's one when it is seen from B's origin.
12.2
Competitive equilibrium
Figure 12.1 (Edgeworth box diagram with origins OA and OB; axes: A's Good 1, A's Good 2, B's Good 1, B's Good 2; initial endowment point e and an allocation x)
where the initial endowments are taken to be fixed and omitted from the notation.
Then the prices are determined so that demand matches supply (again, assuming that you accepted my apologies in the previous chapter...). Denote such a
price vector by $p^* = (p_1^*, p_2^*)$:
\[ \sum_{i=1}^{n} x_{i1}(p^*) = \sum_{i=1}^{n} e_{i1}, \qquad \sum_{i=1}^{n} x_{i2}(p^*) = \sum_{i=1}^{n} e_{i2}, \]
p1
p2
Figure 12.2: Competitive equilibrium (Edgeworth box with origins OA and OB, endowment point e, allocation x, and indifference curves IA and IB)
12.2.1
\[ x_{i1}(p) = \alpha_i \, \frac{p_1 e_{i1} + p_2 e_{i2}}{p_1}, \qquad x_{i2}(p) = (1 - \alpha_i) \, \frac{p_1 e_{i1} + p_2 e_{i2}}{p_2}, \]
i
i
.
i + i
(ii) Since there are just two goods here, when the market for one good is
balanced the one for the other is automatically balanced. Thus it suffices to
look at the condition on Good 1,
\[ \sum_{i=1}^{n} \alpha_i \left( e_{i1} + \frac{p_2}{p_1} e_{i2} \right) = \sum_{i=1}^{n} e_{i1} \]
By solving this we obtain
\[ \frac{p_1}{p_2} = \frac{\sum_{i=1}^{n} \alpha_i e_{i2}}{\sum_{i=1}^{n} (1 - \alpha_i) e_{i1}} \]
Note that we can only find the ratio $p_1 / p_2$, and this is sufficient. You
can see this by noting that when $p = (p_1, p_2)$ is an equilibrium price vector its
double $2p = (2p_1, 2p_2)$ is also an equilibrium price vector, because it does not
change anybody's budget constraint. That is, only relative prices matter.
(iii) By plugging the equilibrium relative price $p_1 / p_2$ into each consumer's
demand function we obtain each i's consumption in equilibrium
\[ x_{i1} = \alpha_i \left( e_{i1} + \frac{\sum_{j=1}^{n} (1 - \alpha_j) e_{j1}}{\sum_{j=1}^{n} \alpha_j e_{j2}} \, e_{i2} \right) \]
\[ x_{i2} = (1 - \alpha_i) \left( \frac{\sum_{j=1}^{n} \alpha_j e_{j2}}{\sum_{j=1}^{n} (1 - \alpha_j) e_{j1}} \, e_{i1} + e_{i2} \right) \]
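Steps (ii) and (iii) can be put together in a short computation. In the sketch below `alphas[i]` stands for consumer i's Cobb-Douglas weight on Good 1 (the symbol is an assumption about the notation), and the function computes the equilibrium price ratio and the resulting consumptions. The function name and the two-consumer numbers are hypothetical.

```python
def cobb_douglas_equilibrium(alphas, endowments):
    """alphas[i]: consumer i's weight on Good 1; endowments[i] = (e_i1, e_i2).
    Returns the relative price p1 / p2 and each consumer's equilibrium bundle."""
    numerator = sum(a * e2 for a, (_, e2) in zip(alphas, endowments))
    denominator = sum((1 - a) * e1 for a, (e1, _) in zip(alphas, endowments))
    price_ratio = numerator / denominator            # p1 / p2
    allocation = [(a * (e1 + e2 / price_ratio),      # demand for Good 1
                   (1 - a) * (price_ratio * e1 + e2))  # demand for Good 2
                  for a, (e1, e2) in zip(alphas, endowments)]
    return price_ratio, allocation

alphas = [0.5, 0.25]
endowments = [(4.0, 0.0), (0.0, 8.0)]   # A owns all of Good 1, B owns all of Good 2
price_ratio, allocation = cobb_douglas_equilibrium(alphas, endowments)
total_good1 = sum(x1 for x1, _ in allocation)
total_good2 = sum(x2 for _, x2 in allocation)
print(price_ratio, allocation, total_good1, total_good2)
```

Checking that the totals equal the aggregate endowments confirms that the computed allocation is feasible, and that the market for Good 2 clears automatically once the market for Good 1 does.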
12.3
12.3.1
subject to
\[ x_{i1} + \frac{x_{i2}}{1 + r} = e_{i1} + \frac{e_{i2}}{1 + r} \]
and we obtain his demand for consumption at each period denoted by $(x_{i1}(r), x_{i2}(r))$.
This is nothing but a special case of the market model above, where the price
of current consumption is normalized to 1 and the price of future consumption
is taken to be the inverse of the gross interest rate, that is,
\[ p_1 = 1, \quad p_2 = \frac{1}{1 + r}, \quad \text{and} \quad \frac{p_1}{p_2} = 1 + r \]
\[ \sum_{i=1}^{n} x_{i1}(r^*) = \sum_{i=1}^{n} e_{i1}, \qquad \sum_{i=1}^{n} x_{i2}(r^*) = \sum_{i=1}^{n} e_{i2} \]
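\[ 1 + r^* = \frac{p_1^*}{p_2^*} = \frac{\sum_{i=1}^{n} \frac{1}{1 + \beta_i} e_{i2}}{\sum_{i=1}^{n} \frac{\beta_i}{1 + \beta_i} e_{i1}} \]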
From this we see the following comparative statics result (while this holds
for more general class of preferences).
12.3.2
Ricardian equivalence
12.3.3
xi2
ei2
= ei1 +
1+r
1+r
bi
= ei1 +
ei2
1+r
Note that the first constraint can be written $x_{i1} \le e_{i1} + b_i$ as well, which means
that his current consumption cannot exceed the sum of current earning plus the
upper limit of borrowing.
Figure 12.3 depicts the case that the borrowing constraint does not bind and
it does not affect the consumer's intertemporal consumption. This is the case
when $e_{i1} + b_i \ge e_{i1} + \frac{e_{i2}}{1 + r}$.
Figure 12.4 depicts the case that the borrowing constraint binds and it affects
the consumer's intertemporal consumption. This is the case when $e_{i1} + b_i < e_{i1} + \frac{e_{i2}}{1 + r}$.
The assumption of complete market says that one can exchange between
any two goods. In the intertemporal context this means one can freely exchange
between current consumption and future consumption. The absence of liquidity
constraint thus guarantees market completeness.
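The effect of a borrowing limit can be illustrated with a specific preference. The sketch below assumes log utility ln(x1) + beta * ln(x2), which is not taken from the text, so the unconstrained choice spends the fraction 1 / (1 + beta) of lifetime wealth on current consumption. Because the objective is concave along the budget line, the constrained choice is the smaller of that amount and the cap e1 + b. All names and numbers are hypothetical.

```python
def consumption_with_borrowing_limit(e1, e2, r, b, beta=1.0):
    """Intertemporal choice under ln(x1) + beta * ln(x2), an assumed utility form.
    Returns (x1, x2, binds), where binds says whether the borrowing limit is active."""
    wealth = e1 + e2 / (1 + r)                 # present value of earnings
    x1_unconstrained = wealth / (1 + beta)     # choice when borrowing is unrestricted
    x1 = min(x1_unconstrained, e1 + b)         # current consumption is capped at e1 + b
    x2 = (1 + r) * (e1 - x1) + e2              # what is left for the second period
    return x1, x2, x1 < x1_unconstrained

# Hypothetical consumer with low current and high future earnings.
loose = consumption_with_borrowing_limit(e1=10.0, e2=125.0, r=0.25, b=100.0)
tight = consumption_with_borrowing_limit(e1=10.0, e2=125.0, r=0.25, b=20.0)
print(loose, tight)
```

With the loose limit the consumer borrows as much as he likes; with the tight limit he consumes exactly e1 + b today and is pushed to a point where his marginal rate of substitution no longer equals 1 + r.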
Figure 12.3 (axes: Period 1, Period 2; endowment point ei; borrowing limit ei1 + bi)
Figure 12.4 (axes: Period 1, Period 2; endowment point ei; borrowing limit ei1 + bi)
However, when there exists a liquidity constraint and it binds, the complete
market assumption fails, and one cannot freely exchange between current consumption and future consumption. Then, marginal rates of substitution of
future consumption for current consumption may not be equalized across individuals in competitive equilibrium.
This causes inefficiency, as we will show in the next chapter that equalization
of MRS is necessary for efficiency. Under liquidity constraints there may be a
consumer who wants to borrow more, and indeed has enough earning for repayment,
but cannot borrow, and a consumer who would like to save more but from whom nobody
can borrow because of the constraint, even though he is willing to lend.
Liquidity constraint is thus a friction as it is. There might be a role for such
friction, however, if there is a dimension of bounded rationality such as a self-control
problem. It is known that when there are conflicts between the current self and
future selves, as illustrated in the chapter on intertemporal choice, the borrowing
constraint plays the role of a commitment device in the sense that it prevents the
current self from over-borrowing, which makes both the current self and future
selves better off. See Laibson [17] for more details.
12.4
12.4.1
subject to
Here the budget constraint states that the market value of the portfolio after the exchange is equal to the market value of the initial portfolio.
Then the competitive equilibrium security price vector $p^* = (p_1^*, p_2^*)$ is determined so that demand for each security matches its supply (assuming that you accepted my apologies in the previous chapter), that is,
$$\sum_{i=1}^{n} x_{i1}(p^*) = \sum_{i=1}^{n} e_{i1}, \qquad \sum_{i=1}^{n} x_{i2}(p^*) = \sum_{i=1}^{n} e_{i2}.$$
If we have $x_{i1}(p^*) < e_{i1}$ and $x_{i2}(p^*) > e_{i2}$, individual $i$ is selling Arrow security 1 and buying Arrow security 2 in equilibrium, and similarly for the opposite case.
Now for example, assume that the vNM index is $v_i(z) = \ln z$ for all $i$. Then it is a special case of Cobb-Douglas preference applied over consumption streams, where $\alpha_i = \pi$ for all $i$. By applying the previous result we obtain the relative price of Security 1 for Security 2 in equilibrium,
$$\frac{p_1^*}{p_2^*} = \frac{\pi \sum_{i=1}^{n} e_{i2}}{(1-\pi) \sum_{i=1}^{n} e_{i1}}.$$
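As a rough numerical illustration of the price formula for log utility, the following Python sketch computes the relative price and checks that both security markets clear. The function names, the normalization $p_2 = 1$, and all numbers are assumptions made only for this illustration; `pi` stands for the probability of State 1.

```python
def arrow_price_ratio(pi, endowments):
    """Equilibrium p1/p2 under v_i(z) = ln z; endowments is a list of (e_i1, e_i2)."""
    total_e1 = sum(e1 for e1, _ in endowments)
    total_e2 = sum(e2 for _, e2 in endowments)
    return pi * total_e2 / ((1.0 - pi) * total_e1)


def log_utility_demand(pi, price_ratio, e1, e2):
    """Cobb-Douglas demand for the two Arrow securities, with p2 normalized to 1."""
    wealth = price_ratio * e1 + e2      # market value of the initial portfolio
    x1 = pi * wealth / price_ratio      # share pi of wealth goes to security 1
    x2 = (1.0 - pi) * wealth            # the remaining share goes to security 2
    return x1, x2


pi = 0.25                               # hypothetical probability of State 1
endowments = [(4.0, 1.0), (2.0, 2.0)]   # hypothetical (e_i1, e_i2) pairs
ratio = arrow_price_ratio(pi, endowments)
demands = [log_utility_demand(pi, ratio, e1, e2) for e1, e2 in endowments]
total_x1 = sum(x1 for x1, _ in demands)
total_x2 = sum(x2 for _, x2 in demands)
print(ratio, total_x1, total_x2)
```

Total demand for each security should coincide with the total endowment of that security, which is what market clearing requires.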
12.4.2
Let us consider the simplest case, the so-called no aggregate risk case. It is the case where the total amount of resources in the economy is riskless, and the uncertainty is only about who will get how much, which is described by
$$\sum_{i=1}^{n} e_{i1} = \sum_{i=1}^{n} e_{i2}.$$
Imagine a situation in which some fixed amount of resources falls from heaven and there is uncertainty about who catches how much. This is realistic in some situations, however. For example, the number of car accidents is quite constant every year in the society as a whole, while there is of course uncertainty about who will be involved in a crash.
Recall that the marginal rate of substitution given by a preference having an expected utility representation takes the form
$$MRS_i(x_i) = \frac{\pi v_i'(x_{i1})}{(1-\pi) v_i'(x_{i2})}.$$
When consumption is riskless, that is $x_{i1} = x_{i2}$, this reduces to
$$MRS_i(x_i) = \frac{\pi}{1-\pi}$$
for each $i$. In competitive equilibrium all consumers' MRSs are equalized through the relative price of Good 1 for Good 2, and the above condition implies that this is met when all the consumers' consumptions are riskless. Thus we can take the equilibrium relative price of Arrow security 1 for Arrow security 2 to be
$$\frac{p_1^*}{p_2^*} = \frac{\pi}{1-\pi}.$$
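The claim that consumption becomes riskless at this price can be checked numerically. The sketch below assumes log utility (so that demand has the Cobb-Douglas form), normalizes $p_2 = 1$, and uses made-up endowments with no aggregate risk; the names are hypothetical.

```python
def demand_at_fair_price(pi, e1, e2):
    """Log-utility demand when p1/p2 = pi / (1 - pi) and p2 is normalized to 1."""
    p1 = pi / (1.0 - pi)
    wealth = p1 * e1 + e2
    x1 = pi * wealth / p1
    x2 = (1.0 - pi) * wealth
    return x1, x2


pi = 0.3                                 # hypothetical probability of State 1
endowments = [(5.0, 1.0), (1.0, 5.0)]    # totals are 6 in both states
demands = [demand_at_fair_price(pi, e1, e2) for e1, e2 in endowments]
for (x1, x2), (e1, e2) in zip(demands, endowments):
    # Each consumer ends up with the expected value of his endowment in both states.
    print(x1, x2, pi * e1 + (1.0 - pi) * e2)
```

Each consumer's demand is the same in both states, and the demands add up to the riskless total endowment.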
Figure 12.5: Competitive equilibrium under no aggregate risk
[Edgeworth box with origins $O_A$ and $O_B$, endowment point $e$, allocation $x$, and indifference curves $I_A$ and $I_B$.]
12.4.3
Next consider risk sharing between a risk-neutral agent and a risk-averse individual. This situation appears very often in applied analysis. Here the risk-neutral agent is seen as a large body, such as a financial institution, which pools a large number of independent idiosyncratic risks so that in aggregate its performance is statistically certain thanks to the law of large numbers.
Let A be the small individual and B be the large body. Maintain the assumption that there are two states of the world, State 1 and State 2. Let $\pi$ be the probability of State 1, then the probability of State 2 is $1 - \pi$. Let me depict the situation as in Figure 12.6. At the initial endowment point $e$, B is holding a large and riskless wealth, and A is facing an earnings risk which is small from B's viewpoint. Hence trade between them is limited to a small region, such as the little Edgeworth box taken around A's origin, in which B's origin $O_B$ is suitably adjusted.
Notice that smooth curves are locally straight. Therefore B's indifference curves are almost straight within the smaller Edgeworth box. Now recall that along B's
Figure 12.6: Risk sharing between a large body and a small individual
[Edgeworth box with origins $O_A$ and $O_B$, the adjusted origin for B, endowment point $e$, and B's indifference curve $I_B$.]
12.4.4
The above argument presumes that a consumer can freely transfer his resources across states by trading securities. This is what the complete market assumption says in the current context. The security market is said to be incomplete when consumers cannot freely transfer their resources across states.
To illustrate, consider that there are three states of the world, where con-
Figure 12.7: Risk sharing between risk-averse and risk-neutral agents
[Edgeworth box with origins $O_A$ and $O_B$, endowment point $e$, allocation $x$, and indifference curves $I_A$ and $I_B$.]
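$$p_1 z_{i1} + p_2 z_{i2} = 0$$
$$x_{i1} = e_{i1} + z_{i1}$$
$$x_{i2} = e_{i2} + z_{i2}$$
$$x_{i3} = e_{i3}$$
Notice that by replacing $z_{i1}$ and $z_{i2}$ in the first equation by those in the second and third, the above equations reduce to
$$p_1 x_{i1} + p_2 x_{i2} = p_1 e_{i1} + p_2 e_{i2}$$
$$x_{i3} = e_{i3}$$
This does not coincide with the corresponding budget constraint under the complete market assumption, since consumption in State 3 cannot be traded.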
In order for the complete market assumption to be met, there has to be another security. Let's say that one unit of Security 3 pays one unit of consumption if State 3 occurs and nothing otherwise.
Again let $p_1$ denote the price of Security 1, $p_2$ the price of Security 2 and $p_3$ the price of Security 3. Let $z_{i1}$ denote consumer $i$'s holding of Security 1, $z_{i2}$ his holding of Security 2 and $z_{i3}$ his holding of Security 3, where negative holdings are allowed, which means short-selling. Then his portfolio choice $z_i = (z_{i1}, z_{i2}, z_{i3})$ has to obey the budget constraint
$$p_1 z_{i1} + p_2 z_{i2} + p_3 z_{i3} = 0$$
where the right-hand side is zero since the initial income is zero.
By an argument similar to the one above we obtain
$$p_1 z_{i1} + p_2 z_{i2} + p_3 z_{i3} = 0$$
$$x_{i1} = e_{i1} + z_{i1}$$
$$x_{i2} = e_{i2} + z_{i2}$$
$$x_{i3} = e_{i3} + z_{i3}$$
Now this reduces to the standard budget constraint for three goods,
$$p_1 x_{i1} + p_2 x_{i2} + p_3 x_{i3} = p_1 e_{i1} + p_2 e_{i2} + p_3 e_{i3},$$
in which one can freely exchange between Good 1 and Good 2, Good 2 and Good 3, and Good 3 and Good 1.
When the security market is incomplete the consumers cannot share risks in an efficient manner (I'll come to the definition of efficiency in the next chapter). For example, consider the simplest case in which there are two states of the world but there is no security, or just one which doesn't help at all for diversification purposes. Then the only thing consumers can do is to eat their earnings without any trades across states, which is typically inefficient from the risk-hedging viewpoint.
A reader might think it is always better to have more kinds of securities available to trade in order to help risk-hedging. The answer is in general NO! It is always better to go from nothing to some, but it is known that everyone may lose when more securities become available to trade in addition to some existing ones. See Hart [12] if you are interested.
12.5
Exercise
Chapter 13
Efficiency of competitive
allocation
13.1
Now, is the competitive market (if it really exists) a good way of allocating resources? Of course it depends on what we mean by good.
It is at least known that a competitive equilibrium allocation satisfies so-called Pareto efficiency. Some books call it Pareto optimality, but for a reason I will state later it is far from what we imagine from the word optimality, so throughout this book we use the term Pareto efficiency.
Let me start with the two-person case. See Figure 13.1 and look at point $x = (x_A, x_B)$. Here $I_A$ (resp. $I_B$) is the indifference curve passing through $x_A$ (resp. $x_B$). Is this a good allocation? It is not desirable in the following sense.
Here you can take a point, let's say $y = (y_A, y_B)$, from the lens-shaped region surrounded by $I_A$ and $I_B$. Since $y_A$ is above A's indifference curve passing through $x_A$, and $y_B$ is above B's indifference curve passing through $x_B$ from B's viewpoint, allocation $y$ is better than $x$ for both A and B. Since both say that $y$ is better than $x$, there will be no reason to object to it.1
Also, when you take a point on the boundary of the lens such as $z = (z_A, z_B)$, this makes A better off without hurting B, because B is indifferent between $z_B$ and $x_B$ and A strictly prefers $z_A$ to $x_A$. Since we are not hurting B here there will be no reason to object to ranking $z$ over $x$ socially.2
Such a change, which makes everybody better off or makes somebody better off without hurting anybody else, is called a Pareto improvement. Formally,
Definition 13.1 An allocation $y = (y_1, \dots, y_n)$ is a Pareto improvement of $x = (x_1, \dots, x_n)$ if it holds
$$y_i \succsim_i x_i$$
for all $i = 1, \dots, n$, and
$$y_i \succ_i x_i$$
for at least one $i$.
Figure 13.1: Pareto improvement and Pareto efficiency
[Edgeworth box with origins $O_A$ and $O_B$, indifference curves $I_A$ and $I_B$, and points $x$, $y$, $z$, and $q$.]
1 Im
2 Again,
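The definition of a Pareto improvement can be turned into a small check. The sketch below is only an illustration: it assumes that each preference is represented by a utility function (here a hypothetical log utility), and the allocations are made-up numbers.

```python
import math


def is_pareto_improvement(y, x, utilities):
    """True if allocation y is a Pareto improvement of allocation x.

    y and x are lists of consumption bundles, one per consumer, and
    utilities is a list of utility functions representing the preferences.
    """
    gains = [u(yi) - u(xi) for u, yi, xi in zip(utilities, y, x)]
    # Nobody is worse off, and at least one consumer is strictly better off.
    return all(g >= 0 for g in gains) and any(g > 0 for g in gains)


def log_utility(bundle):
    """Hypothetical utility u(x1, x2) = ln x1 + ln x2."""
    return math.log(bundle[0]) + math.log(bundle[1])


utilities = [log_utility, log_utility]
x = [(1.0, 3.0), (3.0, 1.0)]    # initial allocation
y = [(2.0, 2.0), (2.0, 2.0)]    # candidate allocation with the same totals
print(is_pareto_improvement(y, x, utilities))
print(is_pareto_improvement(x, y, utilities))
```

Only the ranking matters here: replacing each utility function by an increasing transformation of it would not change the answers.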
$e = (e_1, e_2) = \left(\sum_{i=1}^{n} e_{i1}, \sum_{i=1}^{n} e_{i2}\right)$,
$$\sum_{i=1}^{n} x_{i1} = e_1, \qquad \sum_{i=1}^{n} x_{i2} = e_2$$
keeping the welfare level of anybody else the same. For, the amount of Good 2 that $i$ has to give up is smaller than the amount he is willing to give up, and the amount of Good 2 that $j$ receives is larger than the minimum amount he would like to receive. Hence there is room for a Pareto improvement of $x$, which means $x$ is not Pareto efficient.
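(Equalization of MRS $\Rightarrow$ Efficiency): Suppose MRSs are equalized at $x$, that is, we have
$$MRS_i(x_i) = \lambda > 0, \quad i = 1, \dots, n$$
Now suppose $x$ is not Pareto efficient. Then there is another feasible allocation
$$((x_{11} + \Delta x_{11}, x_{12} + \Delta x_{12}), \dots, (x_{n1} + \Delta x_{n1}, x_{n2} + \Delta x_{n2}))$$
such that
$$(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2}) \succsim_i x_i$$
for all $i = 1, \dots, n$ and
$$(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2}) \succ_i x_i$$
for at least one $i$, where $\sum_{i=1}^{n} \Delta x_{i1} = \sum_{i=1}^{n} \Delta x_{i2} = 0$.
When preferences are smooth and convex, the consumption $(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2})$ is on or above the tangent line passing through $x_i$ for all $i = 1, \dots, n$, so we have
$$\Delta x_{i2} \ge -MRS_i(x_i) \Delta x_{i1}$$
Also, since $(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2})$ is strictly above the tangent line passing through $x_i$ for at least one $i$, we have
$$\Delta x_{i2} > -MRS_i(x_i) \Delta x_{i1}$$
for such $i$.
Denote the equalized value of MRSs by $\lambda$, then we have $MRS_i(x_i) = \lambda$ for all $i = 1, \dots, n$. Then by adding up the above inequalities we obtain
$$\sum_{i=1}^{n} \Delta x_{i2} > -\lambda \sum_{i=1}^{n} \Delta x_{i1}.$$
However, this contradicts feasibility, which requires
$$\sum_{i=1}^{n} \Delta x_{i1} = \sum_{i=1}^{n} \Delta x_{i2} = 0.$$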
Notice here that there may be arbitrarily many Pareto-efficient allocations. As illustrated in Figure 13.2, we can draw arbitrarily many pairs of indifference curves which are tangent to each other. We can actually draw a continuous curve by tracing the points at which such tangency holds. In the current setting, in which the goods are continuously divisible, there is actually a continuum of Pareto-efficient allocations.
Figure 13.2: Set of Pareto efficient allocations
[Edgeworth box with origins $O_A$ and $O_B$, indifference curves $I_A$ and $I_B$, and several tangency points.]
13.2
We see that a competitive equilibrium allocation is Pareto-efficient when the preferences are smooth and the allocation is interior, by seeing that the MRSs of all consumers are equal to the relative price, that is,
$$MRS_i(x_i^*) = MRS_j(x_j^*) = \frac{p_1^*}{p_2^*}$$
for all $i$ and $j$. We see this in the Edgeworth box diagram from the fact that the corresponding indifference curves of A and B are tangent to each other and to the budget line at the equilibrium allocation point.
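The equalization of MRSs at equilibrium can be illustrated numerically. The sketch below assumes a hypothetical two-consumer exchange economy with Cobb-Douglas utilities $u_i = \alpha_i \ln x_{i1} + (1 - \alpha_i) \ln x_{i2}$, normalizes $p_2 = 1$, and uses made-up parameters; none of the names come from the text.

```python
def equilibrium_price_ratio(alphas, endowments):
    """p1/p2 that clears both markets in a Cobb-Douglas exchange economy."""
    num = sum(a * e2 for a, (_, e2) in zip(alphas, endowments))
    den = sum((1.0 - a) * e1 for a, (e1, _) in zip(alphas, endowments))
    return num / den


def demand(alpha, price_ratio, e1, e2):
    """Cobb-Douglas demand with p2 normalized to 1."""
    wealth = price_ratio * e1 + e2
    return alpha * wealth / price_ratio, (1.0 - alpha) * wealth


def mrs(alpha, x1, x2):
    """Marginal rate of substitution of Good 2 for Good 1."""
    return alpha * x2 / ((1.0 - alpha) * x1)


alphas = [0.5, 0.25]                       # hypothetical preference parameters
endowments = [(1.0, 2.0), (3.0, 1.0)]      # hypothetical (e_i1, e_i2) pairs
ratio = equilibrium_price_ratio(alphas, endowments)
allocation = [demand(a, ratio, e1, e2) for a, (e1, e2) in zip(alphas, endowments)]
mrs_values = [mrs(a, x1, x2) for a, (x1, x2) in zip(alphas, allocation)]
print(ratio, allocation, mrs_values)
```

Both MRS values should coincide with the relative price, and the allocation should exhaust the total endowment of each good.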
We can actually show the Pareto efficiency of a competitive equilibrium allocation without relying on the assumptions that preferences are smooth and that the allocation is interior. The following is called the first fundamental theorem of welfare economics.
Theorem 13.1 Suppose the individuals' preferences are monotone, at least in the weak sense. Then the competitive equilibrium allocation is Pareto-efficient.
Proof. Let $p^* = (p_1^*, p_2^*)$ be the equilibrium price, and let $x^* = (x_1^*, \dots, x_n^*)$ denote the equilibrium allocation.
Suppose $x^*$ is not Pareto-efficient. Then there is a feasible allocation $x = (x_1, \dots, x_n)$ such that
$$x_i \succsim_i x_i^*$$
for all $i = 1, \dots, n$, and
$$x_i \succ_i x_i^*$$
for at least one $i$, where $\sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} x_{i1}^* = \sum_{i=1}^{n} e_{i1}$ and $\sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} x_{i2}^* = \sum_{i=1}^{n} e_{i2}$ hold because of feasibility.
Recall that for each $i$ his consumption in equilibrium $x_i^*$ is his optimal choice given the budget constraint $p_1^* x_{i1} + p_2^* x_{i2} \le p_1^* e_{i1} + p_2^* e_{i2}$. We obtain the following lemma because of this.
Lemma 13.1 When $x_i \succsim_i x_i^*$ it holds
$$p_1^* x_{i1} + p_2^* x_{i2} \ge p_1^* e_{i1} + p_2^* e_{i2}$$
Proof. Suppose $p_1^* x_{i1} + p_2^* x_{i2} < p_1^* e_{i1} + p_2^* e_{i2}$. Then we can slightly increase the quantities of Good 1 and Good 2, by $\Delta x_1 > 0$ and $\Delta x_2 > 0$ respectively, so that the budget constraint is still met, that is, $p_1^* (x_{i1} + \Delta x_1) + p_2^* (x_{i2} + \Delta x_2) < p_1^* e_{i1} + p_2^* e_{i2}$.
By the weak monotonicity of preference we have $(x_{i1} + \Delta x_1, x_{i2} + \Delta x_2) \succ_i x_i$. By transitivity of preference we get $(x_{i1} + \Delta x_1, x_{i2} + \Delta x_2) \succ_i x_i^*$, but this contradicts the assumption that $x_i^*$ is an optimal choice for $i$ under the given budget constraint.
Recall that we have $x_i \succ_i x_i^*$ for at least one $i$, but since $x_i^*$ is an optimal choice for $i$ under the given budget constraint, the strictly better consumption $x_i$ must not have been affordable to him. Hence it must hold
$$p_1^* x_{i1} + p_2^* x_{i2} > p_1^* e_{i1} + p_2^* e_{i2}$$
for such $i$.
By adding up the inequalities above we obtain
$$p_1^* \sum_{i=1}^{n} x_{i1} + p_2^* \sum_{i=1}^{n} x_{i2} > p_1^* \sum_{i=1}^{n} e_{i1} + p_2^* \sum_{i=1}^{n} e_{i2},$$
which contradicts the feasibility conditions $\sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} e_{i1}$ and $\sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} e_{i2}$.
The name first fundamental theorem suggests that there is a second one. It says that any Pareto-efficient allocation can be obtained through competitive equilibrium after a suitable redistribution of income.
Let $x^* = (x_1^*, \dots, x_n^*)$ be any Pareto-efficient allocation. Since MRSs are equalized under efficiency, we have
$$MRS_i(x_i^*) = MRS_j(x_j^*)$$
for all $i$ and $j$. This equalized MRS is taken to be the targeted price to be obtained in competitive equilibrium. Thus one can take the targeted price $(p_1^*, p_2^*)$ such that
$$\frac{p_1^*}{p_2^*} = MRS_i(x_i^*)$$
for all $i$.
13.3
13.4
Exercises
Exercise 16 There are two consumers, A and B, whose preferences are represented by $u_A(x_A) = \alpha_A \ln x_{A1} + \beta_A \ln x_{A2}$ and $u_B(x_B) = \alpha_B \ln x_{B1} + \beta_B \ln x_{B2}$, respectively. Let $e_1$ denote the total amount of Good 1 and $e_2$ denote the total amount of Good 2. Find the set of Pareto efficient allocations.
Chapter 14
Production technology
Let me now turn to the production economy. Here I start with production technology.
14.1
1-input/1-output case
First let us consider the simplest case, in which there is one input good and one output good. Denote the quantity of input by $x$ and the quantity of output by $y$. Then production technology is described by a production function which relates $x$ and $y$ in the form
$$y = f(x)$$
This simply means that $y = f(x)$ units of output are produced from $x$ units of input.
The minimally necessary property of a production function is that
$$f(0) = 0$$
That is, if nothing is input nothing is output. Also, it is necessary that
$$x > x' \implies f(x) \ge f(x')$$
That is, when you increase input the output does not decrease (it is possible that the output does not increase, but at least it does not decrease).
Below you will see some mathematical analogies between production functions and utility functions. They are conceptually different, however, since a utility function is only a representation of a ranking and its value has no quantitative meaning, whereas the value of a production function is the physical quantity of the output good and has quantitative meaning.
One can classify production technology roughly into three. Production technology is said to exhibit
Figure 14.1: Production function
[Curves labeled increasing returns, constant returns, and decreasing returns, plotted against input $x$.]
14.2
2-input/1-output case
14.2.1
Production function
Now we consider the case of two inputs and one output. While this may be extended to the case of many inputs, the two-input illustration is enough for our purpose.
Here a combination of inputs is denoted by a vector $x = (x_1, x_2)$, which means the amount of Input 1 is $x_1$ units and the amount of Input 2 is $x_2$ units. The amount of output is denoted by $y$ as before. Then production technology is described by a production function which relates $y$ and $x$
$$y = f(x)$$
The graph of a production function may be depicted as in Figure 14.2. Here a curve obtained by connecting the points which yield the same output level is called an isoquant curve, which resembles a contour line of a mountain.
Such treatment of a production function looks mathematically analogous to the treatment of a utility function, but they are conceptually different. While a utility function is no more than a representation of a ranking and its value has no quantitative meaning, the value of a production function is a measurable quantity of output which has quantitative meaning.
While in the analysis of preferences only indifference curves have economic content, in the analysis of production the values of the production function and marginal products have economic content as well as the shapes of isoquant curves.
Let me give you some examples of production functions.
Figure 14.2: Production function and isoquant curves
[Three-dimensional graph with input axes $x_1$ and $x_2$ and output axis $y$.]
$$f(x) = \min\left\{\frac{x_1}{a}, \frac{x_2}{b}\right\}$$
exhibits parallel and L-shaped isoquant curves, but the mountain becomes
flatter and flatter as you go to the north-east direction.
Example 14.3 (Cobb-Douglas production function): It is given by
$$f(x) = A x_1^a x_2^b$$
I will come to the details of this production function below.
14.2.2
Returns to scale
14.2.3
Marginal product
Again let me introduce the notion of marginal product, which is the amount of additional output obtained as you add one extra unit of input. However, because there is more than one input here, we have to restate precisely that the marginal product of one input is the amount of additional output obtained as you add one extra unit of it while the amounts of the other inputs stay the same.
14.2.4
Given that the amount of output to produce is fixed, if I have one extra unit of Input 1, how many units of Input 2 can I dispense with?
This is equivalent to asking
Given that the amount of output to produce is fixed, if I lose one unit of Input 1, how many units of Input 2 do I need to add in order to maintain the given output level?
In any case, it measures how productive Input 1 is relative to Input 2. For example, when Input 1 is capital and Input 2 is labor, it measures how much labor can be dispensed with when you have one extra unit of capital while still achieving the given level of output.
Such a measure is given by the (absolute value of the) slope of an isoquant curve. Let us look at the simplest case of linear production
$$f(x) = a x_1 + b x_2$$
Then any combination of inputs $x = (x_1, x_2)$ which yields a given level of output $y$ is on the isoquant curve described by $y = a x_1 + b x_2$, which is a straight line in this simplest example. By solving this for $x_2$, we obtain $x_2 = \frac{y}{b} - \frac{a}{b} x_1$, which implies that the absolute value of the slope of the isoquant curve is $\frac{a}{b}$. Thus as you add one unit of Input 1 you can dispense with $\frac{a}{b}$ units of Input 2 while maintaining the given output level $y$. Or, it means that when you lose one unit of Input 1 for some reason you need to add $\frac{a}{b}$ units of Input 2 in order to maintain the given level of output $y$. This is called the technological rate of substitution of Input 2 for Input 1. One can think of the reverse definition by switching between Input 1 and 2, but I will fix the order between the two throughout the book for definiteness.
In general, the slope of an isoquant curve is not constant. Hence we need to look at its local slope. Given a combination of inputs $x = (x_1, x_2)$, the technological rate of substitution measured there, denoted $TRS(x)$, is the amount of Input 2 you can dispense with per unit of Input 1 when you add a slight amount of Input 1, which is given by the local slope of the given isoquant curve. Thus we have
$$TRS(x) = -\frac{\Delta x_2}{\Delta x_1}$$
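When the inputs change by small amounts $\Delta x_1$ and $\Delta x_2$, the output changes by approximately
$$\Delta y = \frac{\partial f(x)}{\partial x_1} \Delta x_1 + \frac{\partial f(x)}{\partial x_2} \Delta x_2$$
Recall that we are moving along the given isoquant curve, on which the change in the output level $\Delta y$ is kept at zero, hence we have
$$0 = \frac{\partial f(x)}{\partial x_1} \Delta x_1 + \frac{\partial f(x)}{\partial x_2} \Delta x_2.$$
Rearranging gives
$$-\frac{\Delta x_2}{\Delta x_1} = \frac{\partial f(x)/\partial x_1}{\partial f(x)/\partial x_2}$$
Since the left-hand side above is nothing but the technological rate of substitution, we obtain
$$TRS(x) = \frac{\partial f(x)/\partial x_1}{\partial f(x)/\partial x_2}$$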
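The ratio-of-marginal-products formula can be checked with numerical derivatives. The following sketch is only an illustration: the parameter values are made up, and the central-difference step `h` is an arbitrary choice.

```python
def trs(f, x1, x2, h=1e-6):
    """Technological rate of substitution as the ratio of marginal products,
    with each marginal product approximated by a central difference."""
    mp1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2.0 * h)
    mp2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2.0 * h)
    return mp1 / mp2


def linear(x1, x2, a=3.0, b=2.0):
    """Linear production function a*x1 + b*x2 (hypothetical a and b)."""
    return a * x1 + b * x2


def cobb_douglas(x1, x2, A=2.0, a=0.3, b=0.5):
    """Cobb-Douglas production function A * x1^a * x2^b (hypothetical parameters)."""
    return A * x1 ** a * x2 ** b


trs_linear = trs(linear, 4.0, 2.0)     # should be close to a / b = 1.5
trs_cd = trs(cobb_douglas, 4.0, 2.0)   # should be close to a*x2 / (b*x1) = 0.3
print(trs_linear, trs_cd)
```

For the linear technology the value is the same at every input combination, while for the Cobb-Douglas technology it depends on the ratio of the inputs.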
$$\frac{\partial f(x)/\partial x_1}{\partial f(x)/\partial x_2} = \frac{A a x_1^{a-1} x_2^{b}}{A b x_1^{a} x_2^{b-1}} = \frac{a x_2}{b x_1}.$$
14.3
Exercises
1
Chapter 15
15.1.1
One-input/one-output case
Again let me start with the case of one input and one output in order to convey the point first.
Let $p$ denote the given output price and $q$ denote the given input price. Then the profit earned when the firm uses $x$ units of input, produces $y$ units of output and sells it is
$$py - qx$$
Since the production follows $y = f(x)$, we obtain the profit as a one-variable function of $x$,
$$pf(x) - qx$$
The firm maximizes this in the range $x \ge 0$.
I would point out four assumptions behind this modeling (maybe more?). The first one is
1: the firm is a small participant in both the output market and the input markets, and decides its production activity taking the prices in those markets as given, assuming that all its products are sold.
The assumption of price-taking, as before, says that each market participant is very small compared to the entire economy and cannot manipulate the market
Figure 15.1: Profit maximization
first-order condition
$$pf'(x) - q = 0$$
$$x(p, q) = \begin{cases} 0 & \text{if } pA < q \\ \text{any value} & \text{if } pA = q \end{cases}$$
15.1.2
Two-input/one-output case
This can be extended to the case of more than two inputs, but the current specification is enough for our purpose.
Let $p$ denote the given price of output, and let $q = (q_1, q_2)$ denote the pair of given input prices. Then the profit earned by producing (and selling) $y$ units of output from the combination of inputs $x = (x_1, x_2)$ bought in the input markets is
$$py - q_1 x_1 - q_2 x_2$$
Since technology follows $y = f(x)$, it reduces to
$$pf(x) - q_1 x_1 - q_2 x_2$$
Having made the same apologies as in the one-input case, I say that the firm maximizes its profit by choosing $x$.
When the production function is smooth and exhibits decreasing returns to scale, the profit-maximizing combination of inputs is determined by taking the partial derivatives of
$$pf(x) - q_1 x_1 - q_2 x_2$$
with respect to $x_1$ and $x_2$ respectively, and setting them equal to zero. That is, the profit maximization condition is
$$p \frac{\partial f(x)}{\partial x_1} = q_1 \quad (1)$$
$$p \frac{\partial f(x)}{\partial x_2} = q_2 \quad (2)$$
For the Cobb-Douglas production function $f(x) = A x_1^a x_2^b$ with $a + b < 1$, these conditions read
$$p a A x_1^{a-1} x_2^{b} = q_1$$
$$p b A x_1^{a} x_2^{b-1} = q_2$$
As stated above, divide the first formula by the second one, then we obtain $\frac{a x_2}{b x_1} = \frac{q_1}{q_2}$. Solve this equation for $x_2$, then we get a linear relationship between the inputs, $x_2 = \frac{b q_1}{a q_2} x_1$. Plug this into the first formula above and solve it for $x_1$, then we get the factor demand for Input 1. By plugging this into the linear relationship we obtain $x_2$. After some algebra, we obtain
$$x_1(p, q) = (Ap)^{\frac{1}{1-(a+b)}} \left(\frac{a}{q_1}\right)^{\frac{1-b}{1-(a+b)}} \left(\frac{b}{q_2}\right)^{\frac{b}{1-(a+b)}}$$
$$x_2(p, q) = (Ap)^{\frac{1}{1-(a+b)}} \left(\frac{a}{q_1}\right)^{\frac{a}{1-(a+b)}} \left(\frac{b}{q_2}\right)^{\frac{1-a}{1-(a+b)}}$$
Plug these into the production function, then we obtain the supply function
$$y(p, q) = A^{\frac{1}{1-(a+b)}} p^{\frac{a+b}{1-(a+b)}} \left(\frac{a}{q_1}\right)^{\frac{a}{1-(a+b)}} \left(\frac{b}{q_2}\right)^{\frac{b}{1-(a+b)}}$$
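A quick way to gain confidence in such closed-form expressions is to plug them back into the first-order conditions. The sketch below does this for the Cobb-Douglas case with $a + b < 1$; the function name and the parameter values are hypothetical.

```python
def cobb_douglas_solution(p, q1, q2, A, a, b):
    """Factor demands and supply for f(x) = A * x1^a * x2^b with a + b < 1."""
    d = 1.0 - (a + b)
    if d <= 0:
        raise ValueError("the closed form requires decreasing returns, a + b < 1")
    x1 = (A * p) ** (1.0 / d) * (a / q1) ** ((1.0 - b) / d) * (b / q2) ** (b / d)
    x2 = (A * p) ** (1.0 / d) * (a / q1) ** (a / d) * (b / q2) ** ((1.0 - a) / d)
    y = A ** (1.0 / d) * p ** ((a + b) / d) * (a / q1) ** (a / d) * (b / q2) ** (b / d)
    return x1, x2, y


p, q1, q2, A, a, b = 2.0, 1.0, 1.0, 1.0, 0.25, 0.25   # made-up numbers
x1, x2, y = cobb_douglas_solution(p, q1, q2, A, a, b)

# Value of the marginal product of each input, to be compared with its price.
vmp1 = p * a * A * x1 ** (a - 1.0) * x2 ** b
vmp2 = p * b * A * x1 ** a * x2 ** (b - 1.0)
output = A * x1 ** a * x2 ** b
print(x1, x2, y, vmp1, vmp2, output)
```

The two values of marginal product should equal the input prices, and the supply should equal the output produced from the two factor demands.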
What about the Cobb-Douglas production function with constant returns to scale? That is, suppose $a + b = 1$, so that $f(x) = A x_1^a x_2^{1-a}$, where we replace $b$ by $1 - a$.
When we take the analogue of the previous argument, the profit maximization condition would be
$$p a A x_1^{a-1} x_2^{1-a} = q_1$$
$$p (1-a) A x_1^{a} x_2^{-a} = q_2$$
Then by dividing the first formula by the second we obtain the equalization of the technological rate of substitution to the relative price between the inputs
$$\frac{a x_2}{(1-a) x_1} = \frac{q_1}{q_2}$$
Substituting this proportion, $x_2 = \frac{(1-a) q_1}{a q_2} x_1$, into the profit makes the profit linear in $x_1$:
$$\left( pA \left( \frac{(1-a) q_1}{a q_2} \right)^{1-a} - \frac{q_1}{a} \right) x_1$$
1. When $p < \frac{q_1^a q_2^{1-a}}{A a^a (1-a)^{1-a}}$, the coefficient on $x_1$ in the above profit formula is negative. Hence any positive level of production activity yields a deficit. Hence the profit-maximizing choice is $x_1 = 0$. Likewise, we get $x_2 = 0$. Thus, inaction is profit-maximizing.
2. When $p = \frac{q_1^a q_2^{1-a}}{A a^a (1-a)^{1-a}}$, the coefficient on $x_1$ in the above profit formula is zero. Hence any level of activity is equally profit-maximizing and yields zero profit. Since the two inputs must be combined in the proportion $\frac{a x_2}{(1-a) x_1} = \frac{q_1}{q_2}$, the solution is any $(x_1, x_2)$ satisfying
$$\frac{a x_2}{(1-a) x_1} = \frac{q_1}{q_2}.$$
3. When $p > \frac{q_1^a q_2^{1-a}}{A a^a (1-a)^{1-a}}$, the coefficient on $x_1$ in the above profit formula is positive. Hence the firm can earn arbitrarily large profit by making the level of production activity arbitrarily high. Thus there is no solution, or I would say the solution is $x_1 = \infty$ and $x_2 = \infty$.
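The three cases can be illustrated by computing the break-even price and the profit earned per unit of Input 1 when the inputs are combined in the cost-efficient proportion. This is a sketch with made-up parameter values and hypothetical function names.

```python
def break_even_price(q1, q2, A, a):
    """Output price at which a constant-returns Cobb-Douglas firm earns zero profit."""
    return q1 ** a * q2 ** (1.0 - a) / (A * a ** a * (1.0 - a) ** (1.0 - a))


def profit_per_unit_of_input_1(p, q1, q2, A, a):
    """Coefficient on x1 in the profit when x2 / x1 = (1 - a) q1 / (a q2)."""
    proportion = (1.0 - a) * q1 / (a * q2)      # x2 per unit of x1
    revenue = p * A * proportion ** (1.0 - a)   # output value per unit of x1
    cost = q1 + q2 * proportion                 # equals q1 / a
    return revenue - cost


q1, q2, A, a = 1.0, 4.0, 1.0, 0.5               # made-up numbers
p_star = break_even_price(q1, q2, A, a)
below = profit_per_unit_of_input_1(0.9 * p_star, q1, q2, A, a)
at = profit_per_unit_of_input_1(p_star, q1, q2, A, a)
above = profit_per_unit_of_input_1(1.1 * p_star, q1, q2, A, a)
print(p_star, below, at, above)
```

The coefficient is negative below the break-even price, zero at it, and positive above it, which corresponds to inaction, indeterminacy, and unbounded production respectively.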
15.2
Let us look at another level of production decision, which asks us to solve the following problem.
Given input prices $q = (q_1, q_2)$, what combination of inputs $x = (x_1, x_2)$ minimizes the cost of producing a given level of output $y$?
$$\min_{x_1, x_2} q_1 x_1 + q_2 x_2$$
subject to
$$f(x) = y$$
Notice that in contrast to the profit maximization problem in competitive markets this is a problem with a constraint. Hence it has a solution even when the technology exhibits increasing returns to scale.
How do we find the cost-minimizing point? See Figure 15.2. First, we draw the isoquant curve which corresponds to the given level of output $y$. Now pick an arbitrary point on the curve, let's say $x = (x_1, x_2)$. Is it cost-minimizing? To see if it is, draw the line passing through $x$ with slope $-\frac{q_1}{q_2}$. Since all the points on the line yield the same cost, let us call it an iso-cost line. If this line crosses the isoquant curve at $x$, we can pick a point on the isoquant curve which is strictly below the current iso-cost line, let's say $x' = (x_1', x_2')$. Then, since $q_1 x_1 + q_2 x_2 > q_1 x_1' + q_2 x_2'$, the point $x$ is not cost-minimizing.
Likewise, no point on the isoquant curve is cost-minimizing as long as the corresponding iso-cost line crosses the curve. Thus, the cost-minimizing point must be the point at which the corresponding iso-cost line is tangent to the isoquant curve, like $x^* = (x_1^*, x_2^*)$ in the figure. When the isoquant curve is smooth, this means that the local slope of the curve is equal to the slope of the iso-cost line, that is, the technological rate of substitution is equal to the relative price between the inputs. Hence the cost minimization condition is
$$TRS(x) = \frac{q_1}{q_2}$$
[Figure 15.2: isoquant curve and iso-cost lines in the $(x_1, x_2)$ plane, with three points marked on the isoquant curve.]
x2 (q, y)
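For a Cobb-Douglas technology the tangency condition and the output constraint can be solved by hand, and the result compared with a brute-force search along the isoquant curve. The sketch below assumes $f(x) = A x_1^a x_2^b$ and made-up numbers; the closed form in the first function is derived under that assumption and is not quoted from the text.

```python
def conditional_factor_demand(q1, q2, y, A, a, b):
    """Cost-minimizing inputs for f(x) = A * x1^a * x2^b, from TRS = q1/q2 and f(x) = y."""
    x1 = (y / A) ** (1.0 / (a + b)) * (a * q2 / (b * q1)) ** (b / (a + b))
    x2 = (b * q1 / (a * q2)) * x1
    return x1, x2


def cost_along_isoquant(x1, q1, q2, y, A, a, b):
    """Cost of the input combination on the isoquant for y that uses x1 units of Input 1."""
    x2 = (y / (A * x1 ** a)) ** (1.0 / b)
    return q1 * x1 + q2 * x2


q1, q2, y, A, a, b = 1.0, 4.0, 2.0, 1.0, 0.5, 0.5    # made-up numbers
x1_star, x2_star = conditional_factor_demand(q1, q2, y, A, a, b)
min_cost = q1 * x1_star + q2 * x2_star

# Brute-force comparison: no point on the isoquant should be cheaper.
grid_costs = [cost_along_isoquant(0.1 * k, q1, q2, y, A, a, b) for k in range(1, 201)]
print(x1_star, x2_star, min_cost, min(grid_costs))
```

At the computed point the technological rate of substitution $\frac{a x_2}{b x_1}$ equals the input price ratio, and no grid point on the isoquant yields a lower cost.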
15.3
Here long-run means that the firm can vary its inputs in a fully flexible manner. On the other hand, short-run means that the firm cannot change the level of some inputs. For example, once you build a factory you cannot immediately expand it or scrap it for a certain period. In such cases we have to take some inputs to be fixed quantities.
Short-run profit maximization
For illustration, let me assume that the quantity of Input 2 is fixed in the short run and the firm can vary the quantity of Input 1 only. Denote the fixed quantity of Input 2 by $\bar{x}_2$. Then the short-run profit maximization problem is
$$\max_{x_1} \; pf(x_1, \bar{x}_2) - q_1 x_1 - q_2 \bar{x}_2$$
Since the firm has just one variable input, this is a maximization problem with one variable. Hence the short-run profit maximization condition is
$$p \, MP_1(x_1, \bar{x}_2) = q_1.$$
By solving this we obtain the short-run factor demand function for Input 1
$$x_1 = x_1(p, q, \bar{x}_2)$$
By plugging this into the production function we obtain the short-run supply function
$$y(p, q, \bar{x}_2) = f(x_1(p, q, \bar{x}_2), \bar{x}_2)$$
15.3.1
Now consider cost minimization in the short run. As before, consider that the quantity of Input 2 is fixed at $\bar{x}_2$. Then the cost minimization problem given output level $y$ is
$$\min_{x_1} \; q_1 x_1 + q_2 \bar{x}_2$$
subject to
$$f(x_1, \bar{x}_2) = y$$
The solution is
$$x_1 = x_1^S(q, \bar{x}_2, y)$$
$$x_2 = \bar{x}_2$$
and the short-run cost function is
$$C^S(q, \bar{x}_2, y) = q_1 x_1^S(q, \bar{x}_2, y) + q_2 \bar{x}_2$$
Notice that the first term in the above short-run cost function is variable as output $y$ varies, but the second term is just a constant. The cost term which is variable as the output level varies is called variable cost, and the constant cost which does not vary with the output level is called fixed cost. Denote the variable cost term by $VC(y)$ and the fixed term by $FC$, and omit $q, \bar{x}_2$ from the notation as they are given throughout. Then we can write the cost function as
$$C^S(y) = VC(y) + FC$$
For the Cobb-Douglas production function, from the constraint $A x_1^a \bar{x}_2^b = y$ we have $x_1^S = A^{-\frac{1}{a}} \bar{x}_2^{-\frac{b}{a}} y^{\frac{1}{a}}$, and the short-run cost function is given by
$$C^S(y) = q_1 A^{-\frac{1}{a}} \bar{x}_2^{-\frac{b}{a}} y^{\frac{1}{a}} + q_2 \bar{x}_2$$
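The short-run cost formula for the Cobb-Douglas case can be evaluated directly. The sketch below is an illustration with hypothetical names and parameters; with $A = 1$, $a = b = \frac{1}{2}$, $q_1 = q_2 = 1$ and $\bar{x}_2 = 1$ the formula gives variable cost $y^2$ and fixed cost 1.

```python
def short_run_cost(y, q1, q2, x2_bar, A, a, b):
    """Variable and fixed cost for f(x) = A * x1^a * x2^b with Input 2 fixed at x2_bar."""
    # Amount of Input 1 needed to produce y when Input 2 is stuck at x2_bar.
    x1_short_run = A ** (-1.0 / a) * x2_bar ** (-b / a) * y ** (1.0 / a)
    variable_cost = q1 * x1_short_run
    fixed_cost = q2 * x2_bar
    return variable_cost, fixed_cost


vc, fc = short_run_cost(y=3.0, q1=1.0, q2=1.0, x2_bar=1.0, A=1.0, a=0.5, b=0.5)
print(vc, fc, vc + fc)

# Doubling the fixed input lowers the variable cost of the same output.
vc_big, fc_big = short_run_cost(y=3.0, q1=1.0, q2=1.0, x2_bar=2.0, A=1.0, a=0.5, b=0.5)
print(vc_big, fc_big)
```

The second call shows the trade-off behind the short-run restriction: a larger fixed input raises fixed cost but reduces the variable cost of producing the same output.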
$$\min_{x_1, x_2} \; q_1 x_1 + q_2 x_2 + q_3 \bar{x}_3$$
subject to
$$f(x_1, x_2, \bar{x}_3) = y$$
Here the minimization condition is, as before, given by the equality of the technological rate of substitution to the relative price between Inputs 1 and 2.
Chapter 16
$$\frac{C(y)}{y} = \frac{VC(y)}{y} + \frac{FC}{y} = AVC(y) + AFC(y)$$
where $AVC(y)$ is called average variable cost, which is the variable cost per unit of production, and $AFC(y)$ is called average fixed cost, which is the fixed cost per unit of production. It is obvious from the definition that average fixed cost falls as the firm produces more.
Average cost is relevant to the firm's entry/exit decision, and average variable cost is relevant to the firm's shut-down/operation decision.
Also, in order to think about how much to produce we need to look at the incremental cost of an incremental unit of production rather than average costs. This is called marginal cost.
The marginal cost of one extra unit of production added to a given level of production $y$ is
$$MC(y) = C'(y) = \lim_{\Delta y \to 0} \frac{C(y + \Delta y) - C(y)}{\Delta y}$$
Note that since the fixed cost part disappears after taking the derivative, fixed cost does not affect marginal cost.
When the marginal cost curve is upward-sloping we say that the technology exhibits increasing marginal cost. This says that as you produce more the additional cost tends to be more expensive, which corresponds to technology exhibiting decreasing returns to scale (with regard to variable inputs).
When the marginal cost curve is downward-sloping we say that the technology exhibits decreasing marginal cost. This says that as you produce more the additional cost tends to be cheaper, which corresponds to technology exhibiting increasing returns to scale (with regard to variable inputs).
When the marginal cost curve is horizontal we say that the technology exhibits constant marginal cost. This says that as you produce more the additional cost stays the same, which corresponds to technology exhibiting constant returns to scale (with regard to variable inputs).
When you plot average cost, average variable cost and marginal cost, it looks like Figure 16.1. Here the marginal cost curve crosses the bottom of the average cost curve and the bottom of the average variable cost curve, respectively. This is not just a coincidence, since
$$AC'(y) = \left( \frac{C(y)}{y} \right)' = \frac{C'(y) y - C(y)}{y^2} = \frac{1}{y} \left( C'(y) - \frac{C(y)}{y} \right) = \frac{1}{y} \left( MC(y) - AC(y) \right)$$
and it holds that $AC'(y) = 0$ if and only if $MC(y) = AC(y)$. Similarly it holds that $AVC'(y) = 0$ if and only if $MC(y) = AVC(y)$.
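The crossing property can be checked numerically for a particular cost function. The sketch below uses $C(y) = y^2 + 1$ purely as an illustrative assumption, locates the bottom of the average cost curve on a grid, and compares marginal and average cost there.

```python
def cost(y):
    """Illustrative cost function C(y) = y^2 + 1 (variable cost y^2, fixed cost 1)."""
    return y ** 2 + 1.0


def average_cost(y):
    return cost(y) / y


def marginal_cost(y, h=1e-6):
    """Central-difference approximation of C'(y)."""
    return (cost(y + h) - cost(y - h)) / (2.0 * h)


grid = [0.01 * k for k in range(10, 501)]     # output levels 0.10, 0.11, ..., 5.00
y_min = min(grid, key=average_cost)           # bottom of the average cost curve
gap = marginal_cost(y_min) - average_cost(y_min)
print(y_min, average_cost(y_min), marginal_cost(y_min), gap)
```

To the left of the bottom marginal cost is below average cost, and to the right it is above, in line with the sign of $AC'(y)$ in the formula.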
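Let us go over two examples.
Example 16.1 A short-run cost function $C(y) = y^2 + 1$ can come, for example, from cost minimization by a firm with production function $f(x_1, x_2) = x_1^{\frac{1}{2}} x_2^{\frac{1}{2}}$ under input prices $w_1 = w_2 = 1$ and the short-run restriction $\bar{x}_2 = 1$. Given this cost function we have the following (see Figure 16.2).
1. Variable cost $VC(y) = y^2$
2. Fixed cost $FC = 1$
3. Average cost $AC(y) = \frac{y^2 + 1}{y} = y + \frac{1}{y}$
4. Average variable cost $AVC(y) = \frac{VC(y)}{y} = \frac{y^2}{y} = y$
5. Average fixed cost $AFC(y) = \frac{FC}{y} = \frac{1}{y}$
Figure 16.1: Cost curves
[Cost curves, including MC and AVC, plotted against output $y$.]
For the cost function $C(y) = y^3 - 2y^2 + 3y + 1$ we have
Average cost $AC(y) = \frac{y^3 - 2y^2 + 3y + 1}{y} = y^2 - 2y + 3 + \frac{1}{y}$
Average variable cost $AVC(y) = \frac{VC(y)}{y} = \frac{y^3 - 2y^2 + 3y}{y} = y^2 - 2y + 3$
Average fixed cost $AFC(y) = \frac{FC}{y} = \frac{1}{y}$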
16.2
Here let me describe the firm's profit maximization behavior in a perfectly competitive market, in which there is a large number of market participants so that each one is small and has to take the market price as given.
[Two figure panels: curves AC, MC, and AVC plotted against output $y$, each with the value 2 marked.]
Denote the given price of output by $p$, then the profit maximization problem is formulated as
$$\max_{y \ge 0} \; py - C(y)$$
Then,
1. Average variable cost $AVC(y) = y$ has minimal value zero. Hence there is no need to worry about the shut-down decision.
2. Marginal cost is $MC(y) = 2y$. Hence $p = MC(y)$ yields $S(p) = \frac{1}{2} p$.
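The supply function obtained from $p = MC(y)$ can be compared with a direct search for the profit-maximizing output. The sketch below assumes the cost function $C(y) = y^2 + 1$, which is the one with $AVC(y) = y$ and $MC(y) = 2y$; the grid step and the upper bound on output are arbitrary choices.

```python
def profit(p, y):
    """Profit p*y - C(y) with C(y) = y^2 + 1."""
    return p * y - (y ** 2 + 1.0)


def supply_by_search(p, step=0.001, y_max=10.0):
    """Profit-maximizing output found by searching a grid of output levels."""
    n = int(y_max / step)
    grid = [step * k for k in range(n + 1)]
    return max(grid, key=lambda y: profit(p, y))


prices = [1.0, 3.0, 5.0]
supplies = [supply_by_search(p) for p in prices]
for p, s in zip(prices, supplies):
    print(p, s, 0.5 * p)     # grid search versus the formula S(p) = p / 2
```

The search and the formula should agree up to the grid step, for low and high prices alike.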
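$$S(p) = \frac{1}{2} p \quad \text{for every } p > 0$$
For the cost function $C(y) = y^3 - 2y^2 + 3y + 1$, the minimal value of average variable cost is 2, and $p = MC(y) = 3y^2 - 4y + 3$ yields
$$S(p) = \begin{cases} 0, & \text{when } p < 2 \\ \frac{2 + \sqrt{3p - 5}}{3}, & \text{when } p \ge 2 \end{cases}$$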
Let's consider two more examples. What if the cost function is $C(y) = \sqrt{y} + 2$? Here the average variable cost is $AVC(y) = \frac{1}{\sqrt{y}}$, which is downward-sloping and gets closer to zero as $y$ gets large, so you don't have to worry about the shut-down decision. However, the marginal cost $MC(y) = \frac{1}{2\sqrt{y}}$ exhibits a downward-sloping curve, so the point at which $p = MC(y)$ holds minimizes rather than maximizes profit, no matter what the output price is. Hence there is no profit-maximizing point.
Notice that such a cost structure is incompatible with the assumption of price-taking, since this firm can produce the product at a cheaper cost per unit as it produces more, and such a firm will in the end dominate the market and have market power, so it cannot be a price-taker.
Since perfectly competitive markets refer to markets with a large number of small participants, the analysis of firms with decreasing marginal cost should be handled by models of monopoly and oligopoly, or by models of government regulation.
Finally, consider the case of constant marginal cost C(y) = 2y.
1. Average variable cost is AV C(y) = 2y
y = 2, which exhibits a horizontal
straight AVC curve. Hence when p < 2 the firm should shut down.
2. When p = 2 any output level is maximizing the profit.
199
when p < 2
0,
any non-negative number, when p = 2
S(p) =
no solution, ot infinity,
when p > 2
This looks tricky, but again, when this firm is seen as a representative firm
of a large number of firms, this is a good description of their mass behavior
under the condition of free and costless entry and exit.
For, when $p > 2$ the industry is profitable, and under the condition of free
entry arbitrarily many firms will enter the market in order to exploit this profit
opportunity, and entry will not stop until the profit goes down to zero. Also, when
$p < 2$ the industry is unprofitable, and under the condition of free exit firms
will exit from the market, and exit will not stop as long as the profit is negative.
When $p = 2$, all the firms are on the knife-edge of breaking even and are indifferent
between staying and exiting.
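The three supply rules discussed in this section can be checked numerically. The following is a minimal Python sketch, not part of the original text. The function names and the convention of returning an interval `(low, high)` for a supply correspondence are assumptions made here for illustration, and the quadratic cost $C(y) = y^2$ is inferred from $AVC(y) = y$ and $MC(y) = 2y$.

```python
import math


def supply_quadratic_cost(p):
    """Supply when C(y) = y**2 (assumed): solving p = MC(y) = 2y gives y = p / 2."""
    return p / 2.0


def supply_constant_mc(p, c=2.0):
    """Supply correspondence when C(y) = c*y, returned as an interval (low, high).

    Returns None when no profit-maximizing output exists.
    """
    if p < c:
        return (0.0, 0.0)        # every unit loses money, so the firm shuts down
    if p == c:
        return (0.0, math.inf)   # profit is zero at every output level
    return None                  # profit (p - c)*y grows without bound


def profit(p, y, cost):
    """Profit p*y - C(y) of a price-taking firm."""
    return p * y - cost(y)


if __name__ == "__main__":
    quadratic = lambda y: y ** 2
    y_star = supply_quadratic_cost(6.0)
    print(y_star, profit(6.0, y_star, quadratic))
    print(supply_constant_mc(1.0), supply_constant_mc(2.0), supply_constant_mc(3.0))
```

Returning `None` for $p > 2$ mirrors the "no solution" branch: a single firm with constant marginal cost has no finite optimum there, which is why the text reads the case through free entry instead.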
16.3 Exercises
Chapter 17
Competitive equilibrium in production economies
17.1
\[ \sum_{i=1}^{n} \theta_{ik} = 1 \quad \text{for all } k = 1, \dots, m. \]
To describe production in the current setting we use a slightly different notation. The production activity of firm $k$ is typically denoted by $y_k = (y_{k1}, y_{k2})$, which
means the firm produces $y_{k1}$ units of Good 1 and $y_{k2}$ units of Good 2. Since it
is technologically impossible to have positive output from nothing, it must be
either that $y_{k1} < 0$ and $y_{k2} > 0$ or that $y_{k1} > 0$ and $y_{k2} < 0$. In the first case
Good 1 is taken to be the input and Good 2 is taken to be the output, whereas
in the second case Good 1 is taken to be the output and Good 2 is taken to
be the input. That is, input is treated as negative output in this notation, and
factor demand is treated as negative supply. We adopt this notation since it is
more unified and actually convenient in formulating our arguments, at least in
this chapter.
[Figure 17.1: a transformation curve in the (Good 1, Good 2) plane, with two production activities marked on it.]
Firm $k$'s production activity $y_k = (y_{k1}, y_{k2})$ obeys its technological constraint, which is described by an equation
\[ T_k(y_k) = 0 \]
This function $T_k$ is called firm $k$'s transformation function, which is a more
general form of production function. When you depict the above equation on the
two-dimensional plane as in Figure 17.1, you obtain a transformation curve.
Transformation curves always pass through the origin. Also, it is impossible
to have both $y_{k1} > 0$ and $y_{k2} > 0$. In general, it is possible that one produces
Good 1 from Good 2 and Good 2 from Good 1. For example, at $y$ in Figure
17.1 Good 2 is produced from Good 1, and at $y'$ Good 1 is produced from Good
2.
Technology which is described by a production function is a special case of
this, in the sense that the roles of input and output are fixed and there is no
reverse production. For example, when Good 1 is always the input and Good 2 is
always the output, the technology described by output $= f(\text{input})$ is written by
means of a transformation function in the form
\[ y_{k2} - f_k(-y_{k1}) = 0, \]
[Figure: a transformation curve in the (Good 1, Good 2) plane.]
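\[ \sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} e_{i1} + \sum_{k=1}^{m} y_{k1} \]
\[ \sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} y_{k2} \]
\[ T_k(y_k) = 0 \quad \text{for all } k = 1, \dots, m \]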
Here the first and second formula mean that total consumption must be equal
to the total amount of initial endowment plus total amount of output (including
the negative amount, which is factor demand). The third formula means that
production activities must obey the technological constraints.
[Figure: a production activity $y = (y_1, y_2)$ on the transformation curve in the (Good 1, Good 2) plane.]
17.1.1
In a perfectly competitive market each firm maximizes its profit by taking the
price $p = (p_1, p_2)$ as given. It is formulated by
\[ \max_{y_k} \; p_1 y_{k1} + p_2 y_{k2} \quad \text{subject to } T_k(y_k) = 0 \]
for each $k$. How is the profit-maximizing production activity determined then?
See Figure 17.4. It is immediate that the origin $(0, 0)$ provides zero
profit. Then what about $y_k' = (y_{k1}', y_{k2}')$ on the transformation curve? When
you draw a line passing through it with slope $-\frac{p_1}{p_2}$, it passes above the origin,
hence it is generating positive profit. Let me call this line an iso-profit line,
since all the points on it yield the same profit.
It is not maximal, however, since you can find another point on the transformation
curve, say $y_k''$, such that the iso-profit line passing through it is above the previous one, which means $y_k''$ is generating more profit. Thus
$y_k'$ is not profit-maximizing.
Having said that, the profit maximization point must be such that the iso-profit line passing through it is tangent to the transformation curve, such as
$y_k^* = (y_{k1}^*, y_{k2}^*)$ in Figure 17.4. This means that the local slope of the transformation curve is equal to the slope of the corresponding iso-profit line. Thus,
the marginal rate of transformation must be equal to the relative price between
the goods.
Thus the profit maximization condition is
\[ MRT_k(y_k^*) = \frac{p_1}{p_2} \]
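The tangency condition can be illustrated numerically. The sketch below is only an illustration under assumptions that are not in the passage above: it uses a square-root technology $y_2 = A\sqrt{-y_1}$, writes the input as $z = -y_1 \ge 0$, and searches a grid for the profit-maximizing input. The function names and the grid parameters are hypothetical.

```python
import math


def profit(p1, p2, A, z):
    """Profit when z >= 0 units of Good 1 are used as input (y1 = -z, y2 = A*sqrt(z))."""
    return -p1 * z + p2 * A * math.sqrt(z)


def best_input_on_grid(p1, p2, A, z_max=10.0, steps=20000):
    """Grid search for the input level with the highest profit."""
    candidates = [z_max * n / steps for n in range(steps + 1)]
    return max(candidates, key=lambda z: profit(p1, p2, A, z))


def mrt(A, z):
    """Slope of the transformation curve: extra output per extra unit of input."""
    return A / (2.0 * math.sqrt(z))


if __name__ == "__main__":
    z_star = best_input_on_grid(p1=1.0, p2=1.0, A=2.0)
    # At the grid optimum the marginal rate of transformation is close to p1/p2.
    print(z_star, mrt(2.0, z_star), 1.0 / 1.0)
```

The grid search does not use the first-order condition at all, so the fact that the marginal rate of transformation at its answer is close to the relative price is an independent check of the tangency argument.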
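[Figure 17.4: iso-profit lines and the transformation curve in the (Good 1, Good 2) plane, with three production activities marked.]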
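17.1.2 Consumers' choice
In addition to the value of his initial endowment, consumer $i$ receives his share of the firms' profits,
\[ \sum_{k=1}^{m} \theta_{ik} \pi_k(p) \]
Thus, consumer $i$ chooses his optimal consumption under the budget constraint
\[ p_1 x_{i1} + p_2 x_{i2} \le p_1 e_{i1} + p_2 e_{i2} + \sum_{k=1}^{m} \theta_{ik} \pi_k(p) \]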
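17.1.3 Competitive equilibrium
$i = 1, \dots, n$
$k = 1, \dots, m$
\[ \sum_{i=1}^{n} x_{i1}(p^*) = \sum_{i=1}^{n} e_{i1} + \sum_{k=1}^{m} y_{k1}(p^*) \]
\[ \sum_{i=1}^{n} x_{i2}(p^*) = \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} y_{k2}(p^*) \]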
In competitive equilibrium, since each consumer is making optimal consumption under the budget constraint, we have equality between the marginal rate of
substitution and the relative price
\[ MRS_i(x_i^*) = \frac{p_1^*}{p_2^*} \]
for every $i$. Since each firm is maximizing its profit, we have equality between
the marginal rate of transformation and the relative price
\[ MRT_k(y_k^*) = \frac{p_1^*}{p_2^*} \]
for every $k$. Hence
\[ MRS_i(x_i^*) = MRT_k(y_k^*) = \frac{p_1^*}{p_2^*} \]
for all $i$ and $k$.
17.2
Here let us consider the case in which there is just one representative consumer
and just one representative firm. It is called the representative agent
model. Since the assumption of a competitive market substantially requires that
there is a large number of small traders, we should imagine a large number
of consumers and producers behind the representative consumer and producer
respectively.
The representative consumer is characterized by
1. initial holding of goods: e = (e1 , e2 ),
2. ownership of the representative firm
3. preference
and the representative firm is characterized by the transformation curve
T (y) = 0
Note that here the representative consumer has the 100% ownership of the
representative firm.
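A combination of consumption activity and production activity $(x, y)$ is said
to be feasible if
\[ x_1 = e_1 + y_1 \qquad (1) \]
\[ x_2 = e_2 + y_2 \qquad (2) \]
\[ T(y) = 0 \qquad (3) \]
The representative firm's profit maximization problem is
\[ \max_{y} \; p_1 y_1 + p_2 y_2 \quad \text{subject to } T(y) = 0 \]
The profit maximization condition is given by the equality between the marginal
rate of transformation and the relative price
\[ MRT(y) = \frac{p_1}{p_2} \]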
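[Figure: the initial endowment $e$ in the (Good 1, Good 2) plane.]
\[ x_1(p^*) = e_1 + y_1(p^*) \]
\[ x_2(p^*) = e_2 + y_2(p^*) \]
[Figure: the points $y + e$, $x$ and $e$ in the (Good 1, Good 2) plane.]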
\[ MRS(x^*) = \frac{p_1^*}{p_2^*}. \]
Since the firm is maximizing its profit, the marginal rate of transformation is
equal to the relative price, hence it holds
\[ MRT(y^*) = \frac{p_1^*}{p_2^*} \]
Summing up, it holds
\[ MRS(x^*) = MRT(y^*) = \frac{p_1^*}{p_2^*} \]
In Figure 17.7, this means that the indifference curve and the production possibility frontier are tangent to the budget line/iso-profit line at the common
point.
17.2.1
Let us go over an example. Fix Good 1 to be the input and Good 2 to be the
output.
[Figure 17.7: the equilibrium point $x^* = y^* + e$ and the endowment $e$ in the (Good 1, Good 2) plane.]
\[ T(y) = y_2 - A\sqrt{-y_1}, \]
where $y_1 \le 0$.
The firm's profit maximization, taking the market price as given, yields the
supply function
\[ y_1(p) = -\frac{A^2 p_2^2}{4 p_1^2}, \quad y_2(p) = \frac{A^2 p_2}{2 p_1} \]
and the profit function
\[ \pi(p) = \frac{A^2 p_2^2}{4 p_1} \]
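On the other hand, by taking into account that the representative consumer's income is $p_1 e_1 + p_2 e_2 + \pi(p_1, p_2) = p_1 e_1 + \frac{A^2 p_2^2}{4 p_1}$, we can derive the
demand function generated by the Cobb-Douglas preference
\[ x_1(p) = \gamma\left(e_1 + \frac{A^2 p_2^2}{4 p_1^2}\right), \quad x_2(p) = (1 - \gamma)\left(\frac{p_1}{p_2} e_1 + \frac{A^2 p_2}{4 p_1}\right), \]
where $\gamma = \frac{\alpha}{\alpha + \beta}$.
The equilibrium relative price is then
\[ \frac{p_1^*}{p_2^*} = \frac{A}{2}\sqrt{\frac{1 + \gamma}{(1 - \gamma) e_1}} \]
and the production activity in the equilibrium is
\[ y_1^* = -\frac{1 - \gamma}{1 + \gamma} e_1, \quad y_2^* = A\sqrt{\frac{1 - \gamma}{1 + \gamma} e_1} \]
and the consumption activity in the equilibrium is
\[ x_1^* = \frac{2\gamma}{1 + \gamma} e_1, \quad x_2^* = A\sqrt{\frac{1 - \gamma}{1 + \gamma} e_1} \]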
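As a sanity check on this example, the closed-form equilibrium can be compared with the supply and demand functions it is supposed to clear. The Python sketch below is an illustration only. It assumes the technology $y_2 = A\sqrt{-y_1}$, the endowment $e = (e_1, 0)$ and a Cobb-Douglas utility $\alpha \ln x_1 + \beta \ln x_2$ whose expenditure share on Good 1 is called `gamma` here; the symbol names are assumptions, since the original symbols are not legible in this copy.

```python
import math


def equilibrium(A, e1, alpha, beta):
    """Closed-form equilibrium: relative price p1/p2, production y, consumption x."""
    gamma = alpha / (alpha + beta)
    rel_price = (A / 2.0) * math.sqrt((1 + gamma) / ((1 - gamma) * e1))
    y1 = -(1 - gamma) / (1 + gamma) * e1
    y2 = A * math.sqrt(-y1)
    x1 = 2 * gamma / (1 + gamma) * e1
    x2 = y2
    return rel_price, (y1, y2), (x1, x2)


def excess_demand(A, e1, alpha, beta, p1, p2=1.0):
    """Excess demand for Good 1 and Good 2 at prices (p1, p2)."""
    gamma = alpha / (alpha + beta)
    s1 = -(A ** 2 * p2 ** 2) / (4 * p1 ** 2)   # input demand as negative supply
    s2 = A ** 2 * p2 / (2 * p1)                # output supply
    firm_profit = A ** 2 * p2 ** 2 / (4 * p1)
    income = p1 * e1 + firm_profit             # the consumer owns the firm
    d1 = gamma * income / p1
    d2 = (1 - gamma) * income / p2
    return d1 - (e1 + s1), d2 - s2


if __name__ == "__main__":
    rel_price, y, x = equilibrium(A=2.0, e1=3.0, alpha=1.0, beta=1.0)
    print(rel_price, y, x)
    print(excess_demand(2.0, 3.0, 1.0, 1.0, p1=rel_price))
```

Normalizing $p_2 = 1$ loses nothing, since only the relative price matters in this model.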
17.2.2
Next let us see the case of constant returns to scale. Maintain the assumption
that Good 1 is the input and Good 2 is the output. As explained before, this
gives us a reasonable description of the collective behavior of firms in the situation
where there is a large number of potential firms and free entry and exit are
allowed.
The representative consumer is again assumed to have a preference represented
by $u(x) = \alpha \ln x_1 + \beta \ln x_2$ and an initial holding $e = (e_1, 0)$.
On the other hand, production technology for the representative firm exhibits
constant returns to scale, and it is described by a linear transformation function
\[ T(y) = y_2 + A y_1, \]
where $y_1 \le 0$. That is, the marginal rate of transformation is constant and
equal to $A$.
When the technology exhibits constant returns to scale the supply is given as
a correspondence rather than a function. Here the supply correspondence
is given by
\[ y(p) = \begin{cases} \text{no solution} & \text{when } \frac{p_1}{p_2} < A \\ \{(y_1, -A y_1) : y_1 \le 0\} & \text{when } \frac{p_1}{p_2} = A \\ (0, 0) & \text{when } \frac{p_1}{p_2} > A \end{cases} \]
That is, (i) when the relative price of the input is lower than the marginal
rate of transformation the representative firm can earn arbitrarily large profit by
making the production scale arbitrarily large; (ii) it chooses inaction in the
opposite case; and (iii) any level of production is profit-maximizing and
the maximized profit is zero in the case of equality.
The profit function is then
\[ \pi(p) = \begin{cases} \infty & \text{when } \frac{p_1}{p_2} < A \\ 0 & \text{when } \frac{p_1}{p_2} \ge A \end{cases} \]
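Since the maximized profit is zero whenever it is finite, the representative consumer's income is $p_1 e_1$ and his demand is
\[ x_1(p) = \gamma e_1, \quad x_2(p) = (1 - \gamma)\frac{p_1}{p_2} e_1, \]
where $\gamma = \frac{\alpha}{\alpha + \beta}$.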
Since there is always a positive and finite demand, in equilibrium there must
be a positive and finite supply. Hence it must be that the relative price in
equilibrium is
\[ \frac{p_1^*}{p_2^*} = A \]
Thus, when technology exhibits constant returns to scale the equilibrium relative price is determined by the production technology alone
(which is not true in general).
Here the production activity in equilibrium is
\[ y_1^* = -(1 - \gamma) e_1, \quad y_2^* = (1 - \gamma) A e_1 \]
which is profit maximizing since any production level is equally profit maximizing under $\frac{p_1^*}{p_2^*} = A$, where the maximized profit is zero. On the other hand, the
consumption activity in equilibrium is
\[ x_1^* = \gamma e_1, \quad x_2^* = (1 - \gamma) A e_1. \]
17.3
\[ \max_{y} \; y_1 + \frac{y_2}{1 + r} \quad \text{subject to } T(y) = 0 \]
Denote the maximized profit by $\pi(r)$.
The representative consumer's problem is given by
\[ \max_{x} \; v(x_1) + v(x_2) \quad \text{subject to } x_1 + \frac{x_2}{1 + r} = e_1 + \frac{e_2}{1 + r} + \pi(r). \]
Note that he takes the firm's profit into his income in
present-value form.
This is nothing but a special case of the previous model, in which the prices
are written in the form
\[ p_1 = 1, \quad p_2 = \frac{1}{1 + r}, \quad \text{and} \quad \frac{p_1}{p_2} = 1 + r. \]
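When the transformation function is
\[ T(y) = y_2 - A\sqrt{-y_1}, \]
by applying Example 1 above we obtain
\[ 1 + r^* = \frac{p_1^*}{p_2^*} = \frac{A}{2}\left(\frac{1 + \gamma}{(1 - \gamma) e_1}\right)^{\frac{1}{2}} \]
with $\gamma$ the Cobb-Douglas share as in Example 1.
On the other hand, when the technology exhibits constant returns to scale
and the transformation function is
\[ T(y) = y_2 + A y_1 \]
by applying Example 2 above we obtain
\[ 1 + r^* = \frac{p_1^*}{p_2^*} = A \]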
Now, there was once a long historical debate about the source of the interest
rate. The main views can be summarized as follows:
17.4
xi i xi
\[ \sum_{i=1}^{n} x_{i1} = e_1 + \sum_{k=1}^{m} y_{k1} \]
\[ \sum_{i=1}^{n} x_{i2} = e_2 + \sum_{k=1}^{m} y_{k2} \]
\[ T_k(y_{k1}, y_{k2}) = 0, \quad k = 1, \dots, m \]
Again here is the definition of Pareto efficiency. Note that for welfare comparison itself only the consumption allocation should matter.
one unit of Good 2 as input. Without loss of generality, assume here that Good
1 is input and Good 2 is output.
Under the current interpretation the above inequality means that the marginal
product in Firm $k$ is greater than that in Firm $l$. Then consider moving a
slight amount of Good 1 used as input, denoted by $\Delta y_1 > 0$, from Firm $l$ to Firm $k$. Since the
transformation curves are locally straight, Firm $k$ can do the production activity
\[ (y_{k1} - \Delta y_1, \; y_{k2} + MRT_k(y_k)\Delta y_1) \]
and Firm $l$ can do the production activity
\[ (y_{l1} + \Delta y_1, \; y_{l2} - MRT_l(y_l)\Delta y_1) \]
By summing up the two production activities we have
\[ (y_{k1} + y_{l1}, \; y_{k2} + y_{l2} + (MRT_k(y_k) - MRT_l(y_l))\Delta y_1) \]
which means that we have an extra $(MRT_k(y_k) - MRT_l(y_l))\Delta y_1$ units of Good 2
produced from the same $y_{k1} + y_{l1}$ units of Good 1 as input. By suitably distributing these extra units of Good 2 among the consumers we can improve
everybody's welfare. Thus there is room for Pareto improvement and
the current resource allocation is not Pareto efficient.
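The reallocation argument can be made concrete with a small numerical sketch. The Python code below is illustrative only: it assumes two firms with square-root production functions $A_k\sqrt{z}$ (a hypothetical choice, with $z$ the amount of Good 1 used as input) and moves a small amount of input from the firm with the lower marginal rate of transformation to the firm with the higher one.

```python
import math


def output(A, z):
    """Output of Good 2 from z units of Good 1 as input (assumed technology)."""
    return A * math.sqrt(z)


def mrt(A, z):
    """Marginal product of the input, i.e. the marginal rate of transformation."""
    return A / (2.0 * math.sqrt(z))


def gain_from_transfer(A_k, z_k, A_l, z_l, dz):
    """Change in total output when dz units of input move from firm l to firm k."""
    before = output(A_k, z_k) + output(A_l, z_l)
    after = output(A_k, z_k + dz) + output(A_l, z_l - dz)
    return after - before


if __name__ == "__main__":
    # Firm k has MRT 1.0 and firm l has MRT 0.25 at these input levels.
    print(mrt(2.0, 1.0), mrt(1.0, 4.0))
    print(gain_from_transfer(2.0, 1.0, 1.0, 4.0, dz=0.01))
```

For a small transfer the gain is close to the difference in marginal rates of transformation times the amount moved, which is the extra Good 2 described in the text.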
Finally let me show that
\[ MRS_i(x_i) = MRT_k(y_k) \]
holds for all $i, k$.
Suppose for example
\[ MRS_i(x_i) > MRT_k(y_k) \]
Again, let me assume without loss of generality that Good 1 is input and Good
2 is output. Then the above inequality means that the amount of Good 2 which
consumer $i$ is willing to give up in order to get one extra unit of Good 1 is
greater than the amount of Good 2 which Firm $k$ can produce from one unit of
Good 1. This suggests that we can make $i$ better off without hurting anybody
else, by reducing the production activity of Firm $k$, moving its input Good 1 to
him and letting him compensate for the reduction of production suitably.
Now consider reducing Firm $k$'s input by a slight amount $\Delta y_1 > 0$. Since the
transformation curve is locally straight, its production activity becomes
\[ (y_{k1} + \Delta y_1, \; y_{k2} - MRT_k(y_k)\Delta y_1). \]
Note that $y_{k1} < 0$, so adding $\Delta y_1 > 0$ means a reduction of input.
Now give $\Delta y_1$ units of Good 1 to Consumer $i$ and make him pay $MRT_k(y_k)\Delta y_1$
units of Good 2 to Firm $k$ as compensation. Since his indifference curve is locally straight we have
\[ (x_{i1} + \Delta y_1, \; x_{i2} - MRS_i(x_i)\Delta y_1) \sim_i x_i \]
i = 1, , n
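Now suppose $x$ is not Pareto efficient. Then there is another feasible allocation
\[ ((x_{11} + \Delta x_{11}, x_{12} + \Delta x_{12}), \dots, (x_{n1} + \Delta x_{n1}, x_{n2} + \Delta x_{n2})) \]
with production activity
\[ ((y_{11} + \Delta y_{11}, y_{12} + \Delta y_{12}), \dots, (y_{m1} + \Delta y_{m1}, y_{m2} + \Delta y_{m2})) \]
such that
\[ (x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2}) \succsim_i x_i \]
for all $i = 1, \dots, n$ and
\[ (x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2}) \succ_i x_i \]
for at least one $i$, where
\[ \sum_{i=1}^{n} \Delta x_{i1} = \sum_{k=1}^{m} \Delta y_{k1} \]
\[ \sum_{i=1}^{n} \Delta x_{i2} = \sum_{k=1}^{m} \Delta y_{k2} \]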
When preferences are smooth, since the consumption $(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2})$ is
on or above the tangent line passing through $x_i$ for all $i = 1, \dots, n$, we have
\[ \Delta x_{i2} \ge -MRS_i(x_i)\Delta x_{i1} \]
Also, since $(x_{i1} + \Delta x_{i1}, x_{i2} + \Delta x_{i2})$ is strictly above the tangent line passing
through $x_i$ for at least one $i$, we have
\[ \Delta x_{i2} > -MRS_i(x_i)\Delta x_{i1} \]
for such $i$.
Denote the equalized value of MRSs by $\lambda$, then we have $MRS_i(x_i) = \lambda$ for
all $i = 1, \dots, n$. Then by adding up the above inequalities we obtain
\[ \sum_{i=1}^{n} \Delta x_{i2} > -\lambda \sum_{i=1}^{n} \Delta x_{i1}. \]
yk2
k=1
Since
n
i=1
xi1 =
m
k=1
i=1
yk1 .
k=1
yk1 and
n
n
i=1
xi2
xi2 =
n
m
k=1
xi2 ,
i=1
p1
p2
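and $y^* = (y_1^*, \dots, y_m^*)$ denote the consumption allocation and production activity in equilibrium respectively. Now suppose $x^*$ is not Pareto-efficient. Then
there exist another feasible consumption allocation and corresponding production activity, $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_m)$, such that
\[ x_i \succsim_i x_i^* \]
holds for all $i = 1, \dots, n$ and
\[ x_i \succ_i x_i^* \]
holds for at least one $i$, and the feasibility condition
\[ \sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} e_{i1} + \sum_{k=1}^{m} y_{k1} \]
\[ \sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} y_{k2} \]
is met.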
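Recall that for each $i$ his consumption $x_i^*$ is optimal for him under the
budget constraint $p_1^* x_{i1} + p_2^* x_{i2} \le p_1^* e_{i1} + p_2^* e_{i2} + \sum_{k=1}^{m} \theta_{ik} \pi_k(p^*)$. Just like in
the argument on exchange economies we have the following lemma.
Lemma 17.1 $x_i \succsim_i x_i^*$ implies
\[ p_1^* x_{i1} + p_2^* x_{i2} \ge p_1^* e_{i1} + p_2^* e_{i2} + \sum_{k=1}^{m} \theta_{ik} \pi_k(p^*) \]
and $x_i \succ_i x_i^*$ implies
\[ p_1^* x_{i1} + p_2^* x_{i2} > p_1^* e_{i1} + p_2^* e_{i2} + \sum_{k=1}^{m} \theta_{ik} \pi_k(p^*) \]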
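Summing these inequalities over all consumers, and noting that $\sum_{i=1}^{n} \theta_{ik} = 1$ for each $k$, we obtain
\[ p_1^* \sum_{i=1}^{n} x_{i1} + p_2^* \sum_{i=1}^{n} x_{i2} > p_1^* \sum_{i=1}^{n} e_{i1} + p_2^* \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} \pi_k(p^*) \]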
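On the other hand, since each firm $k$ is maximizing its profit at $y_k^*$, it holds
\[ \pi_k(p^*) = p_1^* y_{k1}^* + p_2^* y_{k2}^* \ge p_1^* y_{k1} + p_2^* y_{k2} \]
By summing this up across firms and combining it with the previous inequality we obtain
\[ p_1^* \sum_{i=1}^{n} x_{i1} + p_2^* \sum_{i=1}^{n} x_{i2} > p_1^* \left( \sum_{i=1}^{n} e_{i1} + \sum_{k=1}^{m} y_{k1} \right) + p_2^* \left( \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} y_{k2} \right), \]
which contradicts the feasibility condition
\[ \sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} e_{i1} + \sum_{k=1}^{m} y_{k1} \]
\[ \sum_{i=1}^{n} x_{i2} = \sum_{i=1}^{n} e_{i2} + \sum_{k=1}^{m} y_{k2} \]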
subject to $T_k(y_k) = 0$
Let $\pi_k(p)$ denote the maximized profit.
Now for each consumer $i$ we need to give him income equal to $p_1 x_{i1} + p_2 x_{i2}$.
Let the proportion of his income to that of the entire society be given by
\[ \theta_i = \frac{p_1 x_{i1} + p_2 x_{i2}}{p_1 \sum_{j=1}^{n} x_{j1} + p_2 \sum_{j=1}^{n} x_{j2}} \]
ik k (p)
k=1
This result does not depend on the smoothness of preferences and technologies.
17.5
Chapter 18
The assumption of cost minimization on the producer side states that the
firms take input prices as given, meaning that they are small buyers in
the input markets while they may be big and have market power in the output
market, which essentially presumes that the market for the good is sufficiently small
compared to the entire economy.
The market for the good under partial equilibrium analysis is thus like a
boat floating on the ocean. Since the boat is negligibly small compared to the
ocean, one can ignore its effect on the ocean, and can focus on the movement
of the boat itself. When the market for the good under consideration is very
small compared to the entire economy its behavior will not affect the entire
economy, and the behavior of the markets for the other commodities will
remain unchanged, at least in the approximate sense.
18.1
Hereafter let Good 1 be the object of partial equilibrium analysis and let Good
2 be the income transfer to be spent on the rest of goods. The amount of Good
2 can be either positive or negative, and it is taken to be a payment when it is
negative. Fix the price of Good 2 equal to one, that is, take the normalization
p2 = 1. Then denote the price of Good 1 by p instead of p1 .
On the production side, Good 1 is produced as output whereas Good 2 is
used as input. I apologize for switching the roles of Good 1 and Good 2 from
before, but here I would like to put priority on describing consumers' willingness to
pay for Good 1, which is the marginal rate of substitution of Good 2 for Good 1.
From the assumption of no income effect we assume that consumers' preferences are linear in Good 2, that is, each consumer $i = 1, \dots, n$ has a preference
represented in the form
\[ u_i(x_{i1}, x_{i2}) = f(v_i(x_{i1}) + x_{i2}) \]
Since consumption of Good 1 does not depend on income under the assumption of
no income effect, without loss of generality assume that initial income is zero,
and consider that consumer $i$ obeys his budget constraint
\[ p x_{i1} + x_{i2} = 0 \]
That is, the income transfer here is $x_{i2} = -p x_{i1}$, which is the payment for the
purchase of Good 1.
As seen before, consumption of Good 1 is independent of income and it is
determined by the condition of equality between marginal willingness to pay
and (relative) price
\[ v_i'(x_{i1}) = p \]
From this we obtain the inverse demand function
\[ p_i(x_{i1}) = v_i'(x_{i1}) \]
By inverting it we obtain the individual demand function $x_{i1}(p)$. Summing these up gives the aggregate demand function
\[ x_1(p) = \sum_{i=1}^{n} x_{i1}(p), \]
and by taking its inverse we obtain the aggregate inverse demand function
$p(x_1)$. It is helpful later to understand that the buyer who is willing
to buy the $x_1$-th unit of the good is willing to pay $p(x_1)$. Such an $x_1$-th
consumer is said to be the marginal consumer at $x_1$.
On the other hand, each firm $k = 1, \dots, m$ has technology which is summarized in the form of a cost function $C_k(y_{k1})$, where $y_{k1}$ denotes its output level of
Good 1. In a competitive market it solves
\[ \max_{y_{k1}} \; p y_{k1} - C_k(y_{k1}) \]
Here we assume that the shut-down condition does not bind, and the equality
between marginal cost and price holds,
\[ MC_k(y_{k1}) = p \]
From this we obtain the inverse supply function
\[ p_k(y_{k1}) = MC_k(y_{k1}) \]
and the supply function $y_{k1}(p) = (MC_k)^{-1}(p)$. Likewise, we obtain the aggregate supply function
\[ y_1(p) = \sum_{k=1}^{m} y_{k1}(p) \]
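The construction of aggregate demand and supply from the individual marginal conditions can be sketched in code. The following Python example is an illustration under assumed functional forms that do not come from the text: $v_i(x) = a_i x - x^2/2$, so that $v_i'(x) = a_i - x$, and $C_k(y) = c_k y^2/2$, so that $MC_k(y) = c_k y$. It also locates by bisection the price at which aggregate demand equals aggregate supply.

```python
def demand_i(p, a):
    """Individual demand from v'(x) = a - x = p, truncated at zero."""
    return max(a - p, 0.0)


def supply_k(p, c):
    """Individual supply from MC(y) = c*y = p."""
    return p / c


def aggregate_demand(p, a_list):
    return sum(demand_i(p, a) for a in a_list)


def aggregate_supply(p, c_list):
    return sum(supply_k(p, c) for c in c_list)


def clearing_price(a_list, c_list, iterations=100):
    """Bisection on the price: demand falls and supply rises in p."""
    low, high = 0.0, max(a_list)
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if aggregate_demand(mid, a_list) > aggregate_supply(mid, c_list):
            low = mid      # excess demand: the price must rise
        else:
            high = mid     # excess supply: the price must fall
    return (low + high) / 2.0


if __name__ == "__main__":
    a_list, c_list = [10.0, 6.0], [1.0, 2.0]
    p_star = clearing_price(a_list, c_list)
    print(p_star, aggregate_demand(p_star, a_list), aggregate_supply(p_star, c_list))
```

Bisection is used because it needs only the monotonicity of the two aggregate curves, not a closed form for their inverses.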
\[ x_1(p^*) = \sum_{i=1}^{n} x_{i1}(p^*) = \sum_{k=1}^{m} y_{k1}(p^*) = y_1(p^*) \]
18.2
In order to talk about efficiency let me restate the definition of feasible allocation in the setting of partial equilibrium.
Definition 18.1 Consumption allocation $x = (x_1, \dots, x_n)$ is said to be feasible if there exists $(y_{11}, \dots, y_{m1})$ such that
\[ \sum_{i=1}^{n} x_{i1} = \sum_{k=1}^{m} y_{k1} \]
\[ \sum_{i=1}^{n} x_{i2} = -\sum_{k=1}^{m} C_k(y_{k1}) \]
Note that the second equality says that total payment by the consumers equals
the total cost of production.
In the framework of partial equilibrium Pareto efficiency is equivalent to
maximizing social surplus. Sometimes this is confused with the idea of the
greatest happiness of the greatest number. Since we are maintaining the viewpoint of ordinal utility we cannot simply take the sum of utilities across individuals, since we cannot compare them without bringing in a particular faith.
On the other hand, willingness to pay and consumer surplus have quantitative
meanings and it has an economic meaning to take the sum of them. Indeed, as
is discussed below, maximizing social surplus is totally silent about how much
surplus each consumer should gain.
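Proposition 18.1 Consumption allocation $x = (x_1, \dots, x_n)$ is Pareto-efficient
if and only if $(x_{11}, \dots, x_{n1})$ is given by the solution to
\[ \max_{x, y} \; \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{k=1}^{m} C_k(y_{k1}) \quad \text{subject to } \sum_{i=1}^{n} x_{i1} = \sum_{k=1}^{m} y_{k1} \]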
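Proof. Only if Part: Suppose there exist $(x_{11}', \dots, x_{n1}')$ and $(y_{11}', \dots, y_{m1}')$ such that
\[ \sum_{i=1}^{n} v_i(x_{i1}') - \sum_{k=1}^{m} C_k(y_{k1}') > \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{k=1}^{m} C_k(y_{k1}) \]
Let
\[ S = \left( \sum_{i=1}^{n} v_i(x_{i1}') - \sum_{k=1}^{m} C_k(y_{k1}') \right) - \left( \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{k=1}^{m} C_k(y_{k1}) \right) \]
then by assumption we have $S > 0$. Divide this among all individuals so that
$s_i > 0$ for each $i$ and $\sum_{i=1}^{n} s_i = S$.
Now for each $i = 1, \dots, n$ let
\[ x_{i2}' = x_{i2} + v_i(x_{i1}) - v_i(x_{i1}') + s_i \]
Then, from
\[ \sum_{i=1}^{n} x_{i2}' = \sum_{i=1}^{n} x_{i2} + \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{i=1}^{n} v_i(x_{i1}') + \sum_{i=1}^{n} s_i \]
\[ = -\sum_{k=1}^{m} C_k(y_{k1}) + \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{i=1}^{n} v_i(x_{i1}') + S \]
\[ = -\sum_{k=1}^{m} C_k(y_{k1}') \]
we see that the allocation $x' = (x_1', \dots, x_n')$ is feasible.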
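By feasibility we have $\sum_{i=1}^{n} x_{i2} = -\sum_{k=1}^{m} C_k(y_{k1})$ and $\sum_{i=1}^{n} x_{i2}' = -\sum_{k=1}^{m} C_k(y_{k1}')$
for the corresponding productions. Hence we have
\[ \sum_{i=1}^{n} v_i(x_{i1}') - \sum_{k=1}^{m} C_k(y_{k1}') > \sum_{i=1}^{n} v_i(x_{i1}) - \sum_{k=1}^{m} C_k(y_{k1}) \]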
Remark 18.1 In the framework of partial equilibrium analysis, Pareto efficiency pins down the allocation of Good 1 basically uniquely through maximization
of social surplus, but we should notice that it is totally silent about how
the maximized surplus (i.e., Good 2) should be distributed among
individuals. Any distribution of maximized social surplus is efficient, indeed.
How we should distribute social surplus is a question which is orthogonal to
the notion of efficiency.
To clarify the above point, consider an exchange economy between two individuals. Consumer A has an initial holding of Good 1 denoted by $e_{A1}$, and
consumer B has $e_{B1}$. Without loss of generality let the initial holdings of Good
2 be $e_{A2} = e_{B2} = 0$. Then a feasible allocation $(x_A, x_B)$ must satisfy
\[ x_{A1} + x_{B1} = e_{A1} + e_{B1} \]
\[ x_{A2} + x_{B2} = 0 \]
[Figure 18.1: Set of efficient allocations under quasi-linear preferences]
\[ \sum_{i=1}^{n} (v_i(x_{i1}) - p x_{i1}) = \sum_{i=1}^{n} v_i(x_{i1}) - p \sum_{i=1}^{n} x_{i1} \]
\[ \sum_{k=1}^{m} (p y_{k1} - C_k(y_{k1})) \]
Note that since the firms are ultimately owned by individuals the producers'
surplus is in the end paid back to them, while it remains a question
who should receive how much. The point here is that in a competitive market
maximization of social surplus is achieved through each individual's pursuit of
maximizing his consumer surplus and each firm's maximizing its producer surplus
(i.e., profit) in a decentralized manner.
18.3 Exercises
Exercise 20 Let Good 1 be the object of partial equilibrium analysis and Good
2 be the income transfer to be spent on the other goods. Assume that
each consumer $i = 1, \dots, n$ has a quasi-linear preference represented by ui (x) =
Part III
Chapter 19
Monopoly
Recall that perfect competition refers to situations in which the markets consist
of a large number of small participants, each of which is negligibly small compared to the entire economy and has to take the market price as given, which he
cannot manipulate by himself alone.
This assumption is pretty innocuous for the demand side, but it can be a
problem for the producer side. There are many instances in which the markets
consist of a small number of large firms. In such cases, sellers must be aware
that their actions directly and indirectly affect the market price. We call such
a situation imperfect competition, in which certain market participants have
market power.
Analysis of imperfect competition is mostly carried out in the framework
of partial equilibrium analysis. For, the concept of imperfect competition and
the concept of general equilibrium are actually very hard to make compatible with
each other. First, when a given firm has market power and is able to manipulate the market price, it may use this power either to maximize its profit
or to manipulate the market price in favor of its shareholders' interests in their
consumption, and these are not necessarily compatible with each other. This conflict
leaves it ambiguous what the objective of a firm is. Second, while in general
equilibrium analysis only relative prices matter and the analysis does
not depend on how prices are normalized (which one is set equal to 1), the price
manipulated by firms in imperfect competition in the market for a particular output is an absolute one, or in other words it is a particular kind of
relative price obtained by taking the price of income equal to 1. The third point
is a mathematical one: the profit functions to be used in general
equilibrium analysis are not well-behaved (technically speaking, not concave) and
cannot guarantee the existence of general equilibrium.
Having said that, I explain imperfect competition in the framework of partial
equilibrium analysis. I start with monopoly, the most extreme case of imperfect
competition.
19.1 Monopoly equilibrium
Here the firm's marginal revenue, denoted by $MR(y)$, which is the additional revenue obtained from selling one extra unit of output, is given by the
derivative of $R(y)$, that is, $MR(y) = R'(y)$.
Note that when you plot the marginal revenue curve it must be below the
inverse demand curve, as depicted in Figure 19.1. This follows from
\[ MR(y) = p(y) + p'(y) y \]
and the fact that the inverse demand curve is downward-sloping, meaning $p'(y) < 0$,
which implies $MR(y) < p(y)$.
[Figure 19.1: inverse demand $p(y)$, marginal revenue $MR(y)$ and marginal cost $MC(y)$ plotted against output $y$, with $(y^M, p^M)$ and $(y^{CE}, p^{CE})$ marked.]
Since the marginal cost is $MC(y) = C'(y)$ as before, the profit-maximization
condition is given by the equality between marginal revenue and marginal cost
\[ MR(y^M) = MC(y^M) \]
Let us call such $y^M$ the monopoly equilibrium. Then the monopoly price is given
by $p^M = p(y^M)$.
Because the marginal revenue curve is below the inverse demand curve, it
holds that $y^M < y^{CE}$ and therefore $p^M > p^{CE}$. That is, the output level is lower and
the price is higher in monopoly equilibrium than in competitive
equilibrium.
To illustrate, consider the case of linear inverse demand and constant marginal
cost. The inverse demand function is given by $p(x) = a - bx$, then the revenue
takes the form $R(y) = p(y)y = ay - by^2$ and the marginal revenue is
\[ MR(y) = a - 2by \]
On the other hand, the cost function is given by $C(y) = cy$, in which the
marginal cost is
\[ MC(y) = c \]
If this firm behaves as if it is the representative firm in a perfectly competitive
market, its supply curve is flat and given by $p = c$. Hence the competitive
equilibrium is
\[ p^{CE} = c, \quad y^{CE} = \frac{a - c}{b} \]
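For the linear example, the competitive and monopoly outcomes follow directly from the two conditions $p = MC$ and $MR = MC$. The Python sketch below is an illustration only; the function names are hypothetical, and the surplus is computed as the area between the inverse demand $a - bx$ and the constant marginal cost $c$.

```python
def competitive_equilibrium(a, b, c):
    """Price equals marginal cost: a - b*y = c."""
    return c, (a - c) / b


def monopoly_equilibrium(a, b, c):
    """Marginal revenue equals marginal cost: a - 2*b*y = c."""
    y = (a - c) / (2 * b)
    return a - b * y, y


def social_surplus(a, b, c, y):
    """Area between inverse demand a - b*x and marginal cost c, from 0 to y."""
    return (a - c) * y - b * y ** 2 / 2


if __name__ == "__main__":
    a, b, c = 10, 1, 4
    p_ce, y_ce = competitive_equilibrium(a, b, c)
    p_m, y_m = monopoly_equilibrium(a, b, c)
    print(p_ce, y_ce, social_surplus(a, b, c, y_ce))
    print(p_m, y_m, social_surplus(a, b, c, y_m))
```

The difference between the two surplus values is the loss caused by the monopolist restricting output below the competitive level.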
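In this competitive equilibrium the consumer surplus is $\frac{(a - c)^2}{2b}$ and the profit is zero, so the social surplus is $\frac{(a - c)^2}{2b}$.
The monopoly equilibrium, on the other hand, is
\[ y^M = \frac{a - c}{2b}, \quad p^M = \frac{a + c}{2}. \]
19.2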
y CE
19.3
One might argue, however, that monopoly itself is not the culprit of inefficiency,
and that the culprit is the constraint that the monopolist cannot exercise price
discrimination, that is, charge differently across different consumers and
different units of purchase.
What do I mean? I wrote in the previous explanation that the Pareto
improvement at $y^M$ is done by trading at a lower price $p$ such that $p(y^M) >
p > MC(y^M)$. However, if price discrimination is not allowed the monopolist
has to sell every unit at this lower price $p$, which is not acceptable for it.
Inefficiency due to such a constraint may be removed when the monopolist
can charge a higher amount to those with higher willingness to pay and a lower
amount to those with lower willingness to pay, so that the economy can fully
exploit the opportunity of gains from trade. Then the monopolist can extract
as much social surplus as possible, and if such extracted surplus is suitably
redistributed among people we can make everybody better off.
Of course, if this suitable redistribution of extracted surplus is not guaranteed, price discrimination may be regarded as undesirable from certain social
viewpoints, and that's why it is sometimes made illegal.
So here I keep myself silent about how the extracted surplus should
be distributed among people, and focus on how we can fully exploit
the potential opportunity of gains from trade.
19.3.1
Let me start with the most extreme case in order to convey the point. That is,
consider that the monopolist can charge differently across different consumers,
consumer by consumer, and differently across units, unit by unit.
Discrete case illustration
To illustrate, consider that there are two types of consumers, A and B, with
populations $N_A$ and $N_B$ respectively, where $N_A < 2N_B$. Assume for
Continuous case
Now consider the case of continuous quantity. The consumer side is summarized by an aggregate inverse demand function $p(x)$, which says the marginal
consumer at the $x$-th unit is willing to pay $p(x)$. Here first-degree price discrimination says "if you are the marginal consumer at the $x$-th unit, please pay
$p(x)$", and the monopolist stops selling at $\hat{y}$ such that
\[ p(\hat{y}) = MC(\hat{y}) \]
which is the surplus-maximizing output level, that is, $\hat{y} = y^{CE}$. Since each
consumer pays his willingness to pay for each unit, the consumer surplus is zero,
and the entire maximal social surplus is extracted by the monopolist, which is
given by
\[ \int_0^{y^{CE}} p(x)\,dx - C(y^{CE}). \]
Graphically, the maximized social surplus is the area surrounded by the inverse
demand curve, the marginal cost curve and the vertical axis in Figure 19.1.
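The extracted surplus can be approximated numerically. The Python sketch below is an illustration under assumptions not taken from this passage: it reuses the linear inverse demand $p(x) = a - bx$ and the cost $C(y) = cy$ of the earlier example with made-up parameter values, and it integrates with a simple midpoint rule. The function name is hypothetical.

```python
def extracted_surplus(inverse_demand, cost, y_ce, steps=10000):
    """Midpoint-rule approximation of the integral of p(x) from 0 to y_ce, minus C(y_ce)."""
    width = y_ce / steps
    revenue = sum(inverse_demand((n + 0.5) * width) for n in range(steps)) * width
    return revenue - cost(y_ce)


if __name__ == "__main__":
    a, b, c = 10.0, 1.0, 4.0
    y_ce = (a - c) / b                      # output where p(y) = MC(y)
    surplus = extracted_surplus(lambda x: a - b * x, lambda y: c * y, y_ce)
    print(surplus, (a - c) ** 2 / (2 * b))  # numerical value next to the triangle area
```

With linear demand and constant marginal cost the region is a triangle, so the numerical value can be compared with $(a - c)^2 / (2b)$.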
The problem of misreporting
First-degree price discrimination is hardly observed in reality, however. The
reason is that willingness to pay is in general each consumer's private information, which is not verifiable by others. Even if others know one's willingness
to pay, it is a different question whether it is verifiable. There is a distinction between
knowing something and being able to verify it.
Go back to the leading example. There a Type A consumer is supposed to
pay more than Type B and receives no consumer surplus. This gives a Type
A consumer an incentive to misreport his type and pretend to be a
Type B consumer. He would say, "No, I'm not Type A but Type B." Since the
seller cannot verify the buyer's type, he cannot argue against it and has to
accept the claim.
19.3.2
This leads us to think of charging differently across different units, unit by unit,
while the same payment schedule must apply to all buyers. For, the seller can
track and verify how many units a given consumer is purchasing. This is called
second-degree price discrimination.
Discrete case illustration
Let me explain using the same numerical example as before. The natural candidates are as follows. Note that no consumer can buy the second unit
without buying the first unit.
1. Charge 8 for each unit (profit-maximizing price when no price discrimination is allowed): The profit is $4N_A + 4N_B$.
2. Charge 10 for the 1st unit, 5 for the 2nd unit: The profit is $10N_A + 5N_A - 4 \cdot 2N_A = 7N_A$.
3. Charge 8 for the 1st unit, 3 for the 2nd unit: The charge for the 2nd unit
is obviously unprofitable.
4. Charge 10 for the 1st unit, 3 for the 2nd unit: The charge for the 2nd unit
is obviously unprofitable.
5. Charge 8 for the 1st unit, 5 for the 2nd unit: The profit is $8(N_A + N_B) + 5N_A - 4(2N_A + N_B) = 5N_A + 4N_B$.
Under the assumption $N_A < 2N_B$ the last one is profit maximizing and the
profit is $5N_A + 4N_B$. This is lower than the profit earned if first-degree price
discrimination is possible, but it is higher than the profit earned when no price
discrimination is allowed.
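The candidate schedules can be compared mechanically. The Python sketch below is an illustration under inferred assumptions: the willingness-to-pay figures (10 and 5 for Type A, 8 and 3 for Type B) and the marginal cost of 4 are read off from the profit expressions in the list and are not stated in this passage. It also assumes, as a simplification, that a consumer buys units one by one as long as the charge does not exceed his willingness to pay for that unit.

```python
MARGINAL_COST = 4                                   # inferred from the profit formulas
WILLINGNESS_TO_PAY = {"A": (10, 5), "B": (8, 3)}    # inferred values for 1st and 2nd unit


def units_bought(wtp, schedule):
    """Number of units bought when units must be purchased in order."""
    units = 0
    for value, charge in zip(wtp, schedule):
        if value < charge:
            break
        units += 1
    return units


def profit_per_head(schedule):
    """Profit earned from one consumer of each type under a (1st unit, 2nd unit) schedule."""
    result = {}
    for consumer_type, wtp in WILLINGNESS_TO_PAY.items():
        n = units_bought(wtp, schedule)
        result[consumer_type] = sum(schedule[:n]) - MARGINAL_COST * n
    return result


def total_profit(schedule, n_a, n_b):
    per_head = profit_per_head(schedule)
    return per_head["A"] * n_a + per_head["B"] * n_b


if __name__ == "__main__":
    for schedule in [(8, 8), (10, 5), (8, 3), (10, 3), (8, 5)]:
        print(schedule, profit_per_head(schedule), total_profit(schedule, n_a=3, n_b=2))
```

The per-head figures are the coefficients on $N_A$ and $N_B$ in the list, so the ranking for any populations with $N_A < 2N_B$ can be read off from them.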
Second-degree price discrimination does not make consumers fully reveal
their willingness to pay, but it induces each consumer to self-select his type so
that a heavy user buys more and a light user buys less. The charging system for
mobile phones is one such example.
Continuous case
In the setting of continuous quantity second-degree price discrimination is
also called non-linear pricing. This is because, while the total charge is linear
in purchased quantity when the price per unit is constant, it is non-linear when the
price per unit varies unit by unit.
In Figure 19.2, the graph of total charge is linear when the price per unit is
constant. In general, non-linear pricing can yield any curve, but how to choose
such a curve in order to maximize the profit is a hard problem (technically speaking it is
maximization in an infinite-dimensional space). So practically we restrict attention to piecewise-linear payment schedules in which the price per unit is constant
up to some quantity and shifts to another constant after that, and so on.
19.3.3
On the other hand, when consumer types are physically verifiable to some
extent the seller may group the consumers and charge dierently across the
groups. Student discount and senior discount are such examples. Sellers can
ask customers to show their student IDs, and also can ask them to show their
ID cards for age verification. However, this time the seller cannot charge dierently between dierent units, unit by unit. This is called third-degree price
discrimination.
Sellers offer student discounts not because they are sympathetic to students.
If they charge uniformly, students will not buy when their willingness to pay
is lower than that of business persons. Given that, it can be better for the
monopolist to sell the good at a cheaper price to students rather than losing them.
However, if a business person can pretend to be a student, such an attempt simply
leads to selling the good uniformly at the cheaper price. Thus it is important
here that a business person cannot pretend to be a student.
[Figure 19.2: total payment as a function of quantity]
Third-degree price discrimination is also known as multi-market monopoly.
When several markets are geographically and institutionally separated so that
consumers in one market cannot pass as consumers in another market,
the monopolist can price differently between the different markets.
Discrete case illustration
Let us consider the previous example again. Suppose the monopolist can distinguish
between Type A and Type B, and can charge differently between the groups.
However, each unit must be sold at the same price to consumers in a given
group. Then the natural candidates are as follows.
1. Charge 8 per unit to both A and B (the profit-maximizing price when no price
discrimination is allowed): The profit is $4N_A + 4N_B$.
2. Charge 10 per unit to A and 8 per unit to B: The profit is $10N_A +
8N_B - 4(N_A + N_B) = 6N_A + 4N_B$.
3. Charge 10 per unit to A and 3 per unit to B: Sales to B are obviously
unprofitable.
4. Charge 5 per unit to A and 8 per unit to B: The profit is $5 \cdot 2N_A + 8N_B -
4(2N_A + N_B) = 2N_A + 4N_B$.
5. Charge 5 per unit to A and 3 per unit to B: Sales to B are obviously unprofitable.
Here the second one is profit maximizing. The maximized profit is $6N_A +
4N_B$, which is less than the $7N_A + 4N_B$ earned when first-degree
discrimination is possible, but larger than the profit earned when no price
discrimination is possible.
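The five candidates can be checked mechanically. The sketch below is only an illustration: the willingness-to-pay values (10 and 5 for Type A, 8 and 3 for Type B) and the unit cost of 4 are assumptions inferred from the profit expressions above, since the original example is stated elsewhere, and the function names are hypothetical.

```python
# Assumed data, inferred from the profit expressions in the text:
# willingness to pay for the 1st and 2nd unit, and a constant unit cost of 4.
WTP = {"A": [10, 5], "B": [8, 3]}
UNIT_COST = 4


def units_bought(ctype, price):
    """A consumer buys every unit whose willingness to pay is at least the price."""
    return sum(1 for w in WTP[ctype] if w >= price)


def profit_per_consumer(ctype, price):
    """Profit from one consumer of the given type at a uniform per-unit price."""
    return (price - UNIT_COST) * units_bought(ctype, price)


def profit(price_a, price_b, n_a, n_b):
    """Total profit when group A is charged price_a and group B price_b."""
    return (n_a * profit_per_consumer("A", price_a)
            + n_b * profit_per_consumer("B", price_b))


# The five candidate price pairs (price to A, price to B) from the list above.
candidates = [(8, 8), (10, 8), (10, 3), (5, 8), (5, 3)]

# Coefficients on N_A and N_B in the profit expression of each candidate.
coeffs = [(profit_per_consumer("A", pa), profit_per_consumer("B", pb))
          for pa, pb in candidates]

for (pa, pb), (ca, cb) in zip(candidates, coeffs):
    print(f"charge A {pa}, charge B {pb}: profit = {ca} N_A + ({cb}) N_B")
```

Under these assumed numbers the coefficients agree with the list: the second candidate gives 6 per Type A consumer and 4 per Type B consumer, and selling to B at price 3 yields a negative coefficient.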
Remember that third-degree price discrimination is possible only for verifiable types. For example, suppose there is another type of consumer C with
population NC and the seller cannot distinguish between B and C. Then the
monopolist can price-discriminate between Group A and Group BC, but cannot
price-discriminate between Type B and Type C.
Continuous case
Now consider the case of continuous quantity. Suppose the monopolist can
group consumers into two groups, A and B. The aggregate inverse demand in Market
A is given by $p_A(x_A)$ and that in Market B is given by $p_B(x_B)$.
Thus, when the monopolist provides $y_A$ units to Market A and $y_B$ units to Market
B, each unit is sold at price $p_A(y_A)$ in Market A and at price $p_B(y_B)$ in
Market B. Since the total cost is $C(y_A + y_B)$, the profit for the monopolist is
$$p_A(y_A)\,y_A + p_B(y_B)\,y_B - C(y_A + y_B).$$
The monopolist varies both $y_A$ and $y_B$ in order to maximize its profit. The
profit maximization condition is then
$$MR_A(y_A) = MC(y_A + y_B),$$
$$MR_B(y_B) = MC(y_A + y_B).$$
Let us go over an example of linear demand and constant marginal cost.
Suppose demand functions in the two markets are given by
$$x_A(p) = 100 - p,$$
$$x_B(p) = 100 - 2p,$$
respectively. Then the inverse demand functions in the two markets are
$$p_A(x_A) = 100 - x_A,$$
$$p_B(x_B) = 50 - \frac{1}{2} x_B,$$
respectively. Suppose the cost function is $C(y) = 20y$, that is, the marginal cost
is constant and given by $MC = 20$.
When no price discrimination is allowed the monopolist faces a single market,
in which the aggregate demand function is
$$x(p) = x_A(p) + x_B(p) = 200 - 3p,$$
and its inverse demand function is $p(y) = \frac{200}{3} - \frac{1}{3} y$.
Since marginal revenue in this single market is $MR(y) = \frac{200}{3} - \frac{2}{3} y$, the
profit maximizing condition is $\frac{200}{3} - \frac{2}{3} y = 20$, which implies $y = 70$. Thus the
resulting price in the single market is $p(70) = \frac{130}{3}$ and the monopolist's profit
is $\frac{4900}{3} = 1633\frac{1}{3}$.
On the other hand, when third-degree price discrimination is allowed the
monopolist solves
$$\max_{y_A, y_B} \; (100 - y_A)\,y_A + \left(50 - \frac{1}{2} y_B\right) y_B - 20(y_A + y_B).$$
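Because marginal cost is constant, the two first-order conditions $MR_A(y_A) = 20$ and $MR_B(y_B) = 20$ can be solved market by market. The following sketch works this out with exact fractions and compares it with the single-market case; the helper name `monopoly_linear` is hypothetical and the code assumes linear inverse demand of the form $p(y) = a - by$.

```python
from fractions import Fraction as F

MC = F(20)  # constant marginal cost from C(y) = 20y


def monopoly_linear(a, b, mc):
    """Monopoly optimum for inverse demand p(y) = a - b*y and constant marginal cost.

    Marginal revenue is a - 2*b*y; setting it equal to mc gives y = (a - mc) / (2*b).
    Returns (quantity, price, profit).
    """
    y = (a - mc) / (2 * b)
    p = a - b * y
    return y, p, (p - mc) * y


# Market A: p_A = 100 - y_A.  Market B: p_B = 50 - y_B / 2.
yA, pA, profitA = monopoly_linear(F(100), F(1), MC)
yB, pB, profitB = monopoly_linear(F(50), F(1, 2), MC)

# Single market without discrimination: p = 200/3 - y/3.
yU, pU, profitU = monopoly_linear(F(200, 3), F(1, 3), MC)

print("discriminating:", (yA, pA), (yB, pB), profitA + profitB)
print("uniform price: ", (yU, pU), profitU)
```

With these demand functions the discriminating monopolist sells 40 units at price 60 in Market A and 30 units at price 35 in Market B, for a total profit of 2050, compared with $4900/3$ under a uniform price.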
19.3.4 Two-part tariff
Consider the case of linear inverse demand $p(x) = a - bx$ and constant marginal
cost $c$, where the linearity of inverse demand is not essential here but the constancy of marginal cost is.
Then the competitive equilibrium quantity is $y^{CE} = \frac{a-c}{b}$, which is Pareto
efficient, and the price is $p^{CE} = c$. Here the maximal social surplus is $\frac{(a-c)^2}{2b}$, all
of which is taken by the consumer side. On the other hand, in monopoly equilibrium without price discrimination we already saw that there is an efficiency
loss.
Price discrimination has already been discussed as a method to recover the efficiency
loss, but there is another way here: charge the entire social surplus as an entry
fee and charge the marginal cost for each unit of purchase. In the above
example the entry fee is $\frac{(a-c)^2}{2b}$ and the charge per unit is $c$. By doing this the
monopolist can extract the entire maximal social surplus.
Of course, the monopolist needs to know every consumer's inverse demand
function, and the marginal cost has to be constant, for the two-part
tariff to be carried out precisely. It is doable in a milder manner, however, and it
is quite often used in practice. Familiar examples are warehouse clubs and sports
gyms.
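A small numerical sketch may make the two-part tariff concrete. The values $a = 100$, $b = 1$, $c = 20$ are illustrative assumptions, not taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction as F


def two_part_tariff(a, b, c):
    """Entry fee, unit price and quantity for inverse demand p(x) = a - b*x, marginal cost c."""
    quantity = (a - c) / b        # the consumer buys until p(x) = c
    fee = (a - c) ** 2 / (2 * b)  # consumer surplus at unit price c, charged as entry fee
    return fee, c, quantity


def uniform_monopoly_profit(a, b, c):
    """Profit of a monopolist charging a single per-unit price (no entry fee)."""
    y = (a - c) / (2 * b)
    return (a - b * y - c) * y


a, b, c = F(100), F(1), F(20)  # illustrative numbers (assumption)
fee, unit_price, quantity = two_part_tariff(a, b, c)
uniform = uniform_monopoly_profit(a, b, c)

# Units are sold at cost, so the monopolist's profit is exactly the entry fee.
print("entry fee:", fee, "unit price:", unit_price, "quantity:", quantity)
print("profit with uniform pricing:", uniform)
```

In this linear case the entry fee, which equals the whole social surplus, is twice the profit obtained under uniform monopoly pricing.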
19.3.5 Bundling
I explain this through an example. Microsoft has two products, Word and Excel.
We assume for simplicity that it can produce copies of them at no cost. Also
we assume it has a technology to prevent piracy.
Consumers A and B have the following willingness to pay.
          A     B
Word     12     8
Excel     8    12
Suppose Microsoft prices Word and Excel separately. When the price of Word is 12
only consumer A buys it and the profit is 12; when it is 8 both A and B buy
and the profit is 16. Hence it is profit-maximizing to set the price of Word equal
to 8. Likewise, it is profit-maximizing to set the price of Excel equal to 8. Then
the total profit is $16 \times 2 = 32$.
Now suppose that Microsoft can bundle Word and Excel into Office and
sell it. For simplicity assume that willingness to pay for Office is equal to the
sum of willingness to pay for Word and willingness to pay for Excel. Then
both A and B are willing to pay 20 for Office. Thus, by selling Office at
price 20 instead of selling Word and Excel separately, Microsoft can earn profit
$20 \times 2 = 40$, which is greater than the above.
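The comparison between separate pricing and pure bundling can be reproduced in a few lines. This is a minimal sketch under the assumptions stated in the text (zero production cost, one consumer of each type, and a consumer who buys when indifferent); the helper name is hypothetical.

```python
# Willingness to pay of the two consumers, as in the table above.
WTP = {"A": {"Word": 12, "Excel": 8}, "B": {"Word": 8, "Excel": 12}}


def best_single_price(values):
    """Best uniform price for one zero-cost product.

    It suffices to try each willingness to pay as a price. A consumer buys
    when his willingness to pay is at least the price. Returns (price, profit).
    """
    return max(((p, p * sum(v >= p for v in values)) for p in set(values)),
               key=lambda pair: pair[1])


word_price, word_profit = best_single_price([w["Word"] for w in WTP.values()])
excel_price, excel_profit = best_single_price([w["Excel"] for w in WTP.values()])
separate = word_profit + excel_profit

# Pure bundling: willingness to pay for Office is the sum of the two.
office_values = [w["Word"] + w["Excel"] for w in WTP.values()]
office_price, bundle = best_single_price(office_values)

print("separate pricing:", (word_price, excel_price), "profit", separate)
print("bundling:", office_price, "profit", bundle)
```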
Let us proceed one step further and suppose that Microsoft sells Word and Excel
separately as well as Office. Then each consumer chooses one of (i) buying
Word only; (ii) buying Excel only; (iii) buying Office; (iv) buying both Word
and Excel separately; and (v) buying nothing.
Suppose there are four equally populated groups of consumers, A, B, C and
D. Their willingness to pay is as follows.
           A     B     C     D
Word      12     8    15     0
Excel      8    12     0    15
Office    20    20    15    15
Here A and B are modest consumers and C and D are extreme consumers.
Then the profit-maximizing choice is to let A and B buy Office, let C buy Word
only and let D buy Excel only, which is to set the price of Word equal to 15,
the price of Excel equal to 15 and the price of Office equal to 20.
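The claim can be verified by letting each group pick its best option at these prices. The sketch below assumes one consumer per group, zero cost, and that a consumer who is indifferent buys and picks the option with the larger payment; these tie-breaking rules are assumptions made for illustration.

```python
WTP = {
    "A": {"Word": 12, "Excel": 8, "Office": 20},
    "B": {"Word": 8, "Excel": 12, "Office": 20},
    "C": {"Word": 15, "Excel": 0, "Office": 15},
    "D": {"Word": 0, "Excel": 15, "Office": 15},
}
PRICES = {"Word": 15, "Excel": 15, "Office": 20}


def choice(wtp, prices):
    """Return (option, payment) maximizing the consumer's surplus.

    Each option maps to (value to the consumer, payment to the seller).
    Ties are broken in favour of the larger payment (an assumption).
    """
    options = {
        "nothing": (0, 0),
        "Word": (wtp["Word"], prices["Word"]),
        "Excel": (wtp["Excel"], prices["Excel"]),
        "Office": (wtp["Office"], prices["Office"]),
        "Word+Excel": (wtp["Word"] + wtp["Excel"],
                       prices["Word"] + prices["Excel"]),
    }
    name = max(options,
               key=lambda k: (options[k][0] - options[k][1], options[k][1]))
    return name, options[name][1]


choices = {group: choice(wtp, PRICES) for group, wtp in WTP.items()}
total_profit = sum(payment for _, payment in choices.values())

# No pricing scheme can collect more than each group's highest willingness to pay.
upper_bound = sum(max(wtp.values()) for wtp in WTP.values())

print(choices)
print("profit:", total_profit, "upper bound:", upper_bound)
```

At these prices the profit equals the sum of each group's highest willingness to pay, so under the stated assumptions no other price combination can do better.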
The problem is more complicated when willingness to pay for Office is not
equal to the sum of willingness to pay for Word and willingness to pay for Excel,
since if the price of Office is too high consumers may buy both Word and Excel
separately. Solving such cases would require more sophisticated techniques
of combinatorial optimization.
19.4 Exercises
(i) Suppose this firm behaves as a price-taker, and find the quantity, price, consumer surplus, producer surplus and social surplus in competitive equilibrium.
(ii) Find the quantity, price, consumer surplus, producer surplus, social surplus
and deadweight loss in monopoly equilibrium.
(iii) Describe the first-degree price discrimination which achieves full surplus
extraction here.
Exercise 22 A firm is the monopolist in two markets. Its cost
function is $c(y) = 0.5y^2$. The market inverse demand functions in Markets A and
B are respectively $p_A(y_A) = 90 - y_A$ and $p_B(y_B) = 120 - 2y_B$.
(i) Suppose the firm cannot price-discriminate and also behaves as a price-taker.
Find the quantity and price in competitive equilibrium.
(ii) Suppose the firm cannot price-discriminate. Find the quantity and price in
monopoly equilibrium.
(iii) Suppose the firm can price-discriminate. Find the quantity and price in
monopoly equilibrium.
Chapter 20
20.1
Let me start with the simplest case, in which there are two players. There a normal-form game is described in the form of a payoff matrix.
Let me explain through an example.
Example 20.1 Market entry: There are two firms, A and B. Each chooses
whether to enter the market or not. Payoffs are given in the following table,
                      B
                Entry     Non-entry
A  Entry        5, 5      10, 0
   Non-entry    0, 10     0, 0
where the number on the left in each cell refers to A's payoff and that on the
right in each cell refers to B's payoff. The table is read as follows.
If both A and B enter, each of them gets 5.
If A enters and B does not enter, A gets 10 and B gets 0.
If A does not enter and B enters, A gets 0 and B gets 10.
If neither A nor B enters, each of them gets 0.
Now let me formalize this in a more general manner. A normal-form game
consists of a set of players, strategy sets and payoff functions. The set of players
is a finite set $I = \{1, \dots, n\}$. For each player $i = 1, \dots, n$, let $S_i$ denote the set
of his strategies, where its generic element is denoted, let's say, by $s_i \in S_i$. Let
$S = \prod_{i=1}^{n} S_i$ denote the set of all the combinations of all the players' strategies,
where its element, denoted by $s = (s_1, \dots, s_n)$, is called a strategy profile. Given
a strategy profile $s = (s_1, \dots, s_n)$, the payoff received by player $i$ is denoted
by $v_i(s)$. Since this is defined for all $s \in S$, player $i$'s payoff is described by a
function $v_i : S \to \mathbb{R}$, which is called the payoff function for player $i$.
Let us apply this formalization to the leading example. The set of players is
$I = \{A, B\}$. The strategy sets are $S_A = \{E, N\}$ and $S_B = \{E, N\}$ respectively,
where E denotes Entry and N denotes Non-entry. The payoff functions are
given by
$$v_A(E, E) = 5, \quad v_A(E, N) = 10, \quad v_A(N, E) = 0, \quad v_A(N, N) = 0$$
and
$$v_B(E, E) = 5, \quad v_B(E, N) = 0, \quad v_B(N, E) = 10, \quad v_B(N, N) = 0,$$
where the first argument in the functions is A's strategy and the second is B's
strategy.
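The three ingredients of a normal-form game (players, strategy sets, payoff functions) map directly onto simple data structures. The following sketch encodes the market entry example; the variable names are illustrative choices, not notation from the text.

```python
from itertools import product

players = ["A", "B"]
strategies = {"A": ["E", "N"], "B": ["E", "N"]}  # E = Entry, N = Non-entry

# All strategy profiles s = (s_A, s_B), i.e. the product set S = S_A x S_B.
profiles = list(product(*(strategies[i] for i in players)))

# Payoff functions v_i : S -> R, stored as one dictionary per player.
v = {
    "A": {("E", "E"): 5, ("E", "N"): 10, ("N", "E"): 0, ("N", "N"): 0},
    "B": {("E", "E"): 5, ("E", "N"): 0, ("N", "E"): 10, ("N", "N"): 0},
}


def payoff(i, s):
    """Payoff v_i(s) of player i at strategy profile s = (s_A, s_B)."""
    return v[i][s]


for s in profiles:
    print(s, payoff("A", s), payoff("B", s))
```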
I guess the notion of payoff function may not be fully convincing to you at
this point, but I will come back to this after explaining one more example.
Example 20.2 Prisoner's dilemma: Two gang members are arrested for a minor crime
of which they are already convicted. They are suspected of having committed a
serious crime, however. The prosecutor offers the following legal deal: if one
confesses while the other does not, the one who confessed goes free and the one who
did not confess is imprisoned for 10 years. If both confess, each of them is imprisoned
for 5 years. If neither confesses, each of them is imprisoned for 1 year for the crime
they are already convicted of.
For illustration, let me count payoffs by the negatives of years in prison (it
doesn't have to be so, though, as I'll explain in the next subsection). Then the
payoff matrix is
               B
            C           N
A   C    -5, -5      0, -10
    N    -10, 0      -1, -1
where C refers to confess and N refers to not confess.
Let us apply the formalization to this example. The set of players is
$I = \{A, B\}$. The strategy sets are $S_A = \{C, N\}$ and $S_B = \{C, N\}$ respectively.
The payoff functions are given by
$$v_A(C, C) = -5, \quad v_A(C, N) = 0, \quad v_A(N, C) = -10, \quad v_A(N, N) = -1$$
and
$$v_B(C, C) = -5, \quad v_B(C, N) = -10, \quad v_B(N, C) = 0, \quad v_B(N, N) = -1.$$
20.1.1 On payoff functions
Let me now get into the detailed arguments on the payoff functions. Throughout
this book I maintain the standpoint that utility representation of preference is only an ordinal notion and has no quantitative meaning. In the above
specifications, however, I assigned particular numbers to strategy profiles.
It may be OK in the first example, in which the firms' payoffs are described by
their profits, but how can we describe individuals' payoffs numerically, as in
the second example?
In order to establish the precise definition of payoff function, game theory
borrows expected utility theory as introduced in Chapter 9. As is discussed
later, strategies taken in games may in general be stochastic. Thus we consider
the set of probability distributions over the set of strategy profiles and apply
expected utility theory there.
That is, each player $i$ has a preference $\succsim_i$ over the set of probability distributions over the set of strategy profiles, denoted by $\Delta(S)$, and it is represented in
the form
$$p \succsim_i q \iff f\left(\sum_{s \in S} v_i(s)\, p_s\right) \ge f\left(\sum_{s \in S} v_i(s)\, q_s\right)$$
for $p, q \in \Delta(S)$, where $v_i$ is the vNM index which describes player $i$'s risk attitude
and $f$ is an arbitrary monotone transformation.
The vNM index $v_i$ which forms the expected utility representation here is
the payoff function we are talking about now. Thus we should understand
that the numbers which appear in payoff matrices are already adjusted to the players' risk
attitudes.
The vNM index is cardinal in the sense that it forms a representation of preference in the expectation form, in which we take summation operations over the
values of the index. It has already been discussed, however, that only the curvature of the index has quantitative meaning, and that the absolute amount of utility
change or the absolute level of utility has no meaning.
Thus, if $v_i$ is a vNM index which forms an expected utility representation
for some preference, any positive affine transformation of it is a vNM index which forms an
expected utility representation for the same preference. For example, in the
Prisoner's dilemma
               B
            C           N
A   C    -5, -5      0, -10
    N    -10, 0      -1, -1
transform A's payoffs by $v_A \mapsto 2v_A + 8$ and B's payoffs by $v_B \mapsto 3v_B - 5$. Then we
obtain
               B
            C            N
A   C    -2, -20      8, -35
    N    -12, -5      6, -8
which describes the same game as the above. It is merely for simplicity that
we use the first one instead of the second.
Note again that the overall representation of preference allows arbitrary
monotone transformation (denoted f here), which is consistent with the standpoint that representation of preference is ordinal.
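The invariance under positive affine transformations can be checked numerically. The sketch below uses A's payoffs in the Prisoner's dilemma and the illustrative transformation $v \mapsto 2v + 8$ (the coefficients are an assumption chosen for the example), and compares the ranking of a few hand-picked lotteries over strategy profiles under both indices.

```python
from fractions import Fraction as F
from itertools import combinations

profiles = [("C", "C"), ("C", "N"), ("N", "C"), ("N", "N")]
v_A = {("C", "C"): -5, ("C", "N"): 0, ("N", "C"): -10, ("N", "N"): -1}


def affine(v, a, b):
    """Positive affine transformation a*v + b of a vNM index (requires a > 0)."""
    assert a > 0
    return {s: a * x + b for s, x in v.items()}


def expected(v, lottery):
    """Expected utility of a lottery, given as probabilities in the order of `profiles`."""
    return sum(v[s] * pr for s, pr in zip(profiles, lottery))


def sign(x):
    return (x > 0) - (x < 0)


w_A = affine(v_A, 2, 8)

# A few illustrative probability distributions over the four profiles.
lotteries = [
    (F(1), F(0), F(0), F(0)),
    (F(1, 4), F(1, 4), F(1, 4), F(1, 4)),
    (F(0), F(1, 2), F(1, 2), F(0)),
    (F(1, 10), F(2, 10), F(3, 10), F(4, 10)),
]

# The two indices rank every pair of lotteries in the same way.
same_ranking = all(
    sign(expected(v_A, p) - expected(v_A, q))
    == sign(expected(w_A, p) - expected(w_A, q))
    for p, q in combinations(lotteries, 2)
)
print(w_A, same_ranking)
```

The check succeeds for any lotteries, since the expected value of $2v + 8$ is twice the expected value of $v$ plus 8, which preserves every comparison.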
20.2 Dominant strategy
                      B
                Entry      Non-entry
A  Entry        -2, -2     10, 0
   Non-entry    0, 10      0, 0
20.3
What can we think of next when there is no dominant strategy? Let us consider
the following question.
Suppose each player is rational in the sense that they can process
given information in a logically correct manner and that he maximizes his expected utility under the given information. Then how
much can players narrow down the strategies?
Let me start by explaining with an example.
              A
          X        Y        Z
   F    3, 0     4, 3     2, 2
B  G    4, 1     6, 2     5, 0
   H    2, 5     3, 1     8, -1
We see in the above table that strategy X is worse than Y from A's viewpoint
no matter what B chooses. Then we say that X is strictly dominated by Y
for A. A strictly dominated strategy can never be an optimal choice no matter
what the opponent does. Hence, if
A1-1: A knows the payoff matrix correctly,
A1-2: A is rational in the sense explained above,
X is never chosen by A.
Now, if it holds
              A
          Y        Z
   F    4, 3     2, 2
B  G    6, 2     5, 0
   H    3, 1     8, -1
In the game after the elimination, we see that H is worse than G for B, no
matter what A chooses (except when A chooses X, which is the case already
eliminated). That is, H is strictly dominated by G for B. Thus, if
B1-3: B is rational in the sense explained above,
H is never chosen by B.
Now, if it holds
A2: A knows B1-1, B1-2 and B1-3,
in addition to A1-1 and A1-2, then A knows B knows A never chooses X, and
because of this B never chooses H. Thus A eliminates H from the possibility.
The game after the elimination is now
              A
          Y        Z
B  F    4, 3     2, 2
   G    6, 2     5, 0
In the game after the elimination, we see that Z is worse than Y for A, no
matter what B chooses (except when B chooses H, which has been eliminated
because X had been eliminated). That is, Z is strictly dominated by Y for A.
Thus, if A is rational in the sense explained above, Z is never chosen by A.
Now, if it holds
B2: B knows A2,
in addition to B1-1, B1-2, B1-3, then B knows A knows B knows A never
chooses X, and because of this B never chooses H, and because of this A
never chooses Z. Thus B eliminates Z from the possibility. The game after the
elimination is now
              B
          F        G
A  Y    4, 3     6, 2
In the game after the elimination, we see that F is optimal for B. As a result,
the only possibility is that A chooses Y and B chooses F.
This is called iterated elimination of dominated strategies. Formally,
it is defined as follows.
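for all $s_{-i} \in \prod_{j \neq i} S_j^0 = S_{-i}^0$, say that $s_i$ is strictly dominated by $s_i'$.
Thus, let $S_i^1$ denote the set of $i$'s strategies in $S_i^0$ which are not strictly
dominated by anything else in $S_i^0$. That is,
$$S_i^1 = \{s_i \in S_i^0 : \nexists\, s_i' \in S_i^0,\ \forall s_{-i} \in S_{-i}^0,\ u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})\}$$
$$= \{s_i \in S_i^0 : \forall s_i' \in S_i^0,\ \exists\, s_{-i} \in S_{-i}^0,\ u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})\}$$
$$S_i^{k+1} = \{s_i \in S_i^k : \nexists\, s_i' \in S_i^k,\ \forall s_{-i} \in S_{-i}^k,\ u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})\}$$
$$= \{s_i \in S_i^k : \forall s_i' \in S_i^k,\ \exists\, s_{-i} \in S_{-i}^k,\ u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})\}$$
4. Repeat this.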
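The procedure can be sketched in code for two-player games. The example below applies it to the three-by-three game discussed earlier in this section, with A choosing among X, Y, Z and B among F, G, H. Two caveats: B's payoff at (Z, H) is taken to be -1, which is an assumption consistent with the statement that H is dominated by G, and each round removes all currently dominated strategies of both players at once, which may take fewer rounds than the step-by-step narration but reaches the same end result.

```python
def strictly_dominated(player, s, own, others, u):
    """True if some other remaining strategy of `player` does strictly better
    than s against every remaining strategy of the opponent."""
    return any(all(u(player, t, o) > u(player, s, o) for o in others)
               for t in own if t != s)


def iterated_elimination(SA, SB, payoff):
    """Iterated elimination of strictly dominated pure strategies.

    payoff[(a, b)] = (payoff to A, payoff to B).
    """
    def u(player, own, other):
        if player == "A":
            return payoff[(own, other)][0]
        return payoff[(other, own)][1]

    SA, SB = list(SA), list(SB)
    while True:
        newA = [s for s in SA if not strictly_dominated("A", s, SA, SB, u)]
        newB = [s for s in SB if not strictly_dominated("B", s, SB, SA, u)]
        if newA == SA and newB == SB:
            return SA, SB
        SA, SB = newA, newB


# Keys are (A's strategy, B's strategy); values are (A's payoff, B's payoff).
payoff = {
    ("X", "F"): (3, 0), ("Y", "F"): (4, 3), ("Z", "F"): (2, 2),
    ("X", "G"): (4, 1), ("Y", "G"): (6, 2), ("Z", "G"): (5, 0),
    ("X", "H"): (2, 5), ("Y", "H"): (3, 1), ("Z", "H"): (8, -1),  # -1 is assumed
}

survivors = iterated_elimination(["X", "Y", "Z"], ["F", "G", "H"], payoff)
print(survivors)
```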
If this process leads to a unique strategy profile we say that the game is
dominance-solvable. However, it is not in general the case that the iterated
elimination leads to a unique strategy profile. Here is an example.
                  A
          X        Y        Z        W
   F    3, 0     6, 1     2, 3     0, 2
B  G    1, 1     1, 2     0, 1     2, 4
   H    4, 2     2, 0     1, 2     3, 1
   I    2, 1     0, 5     1, 4     1, 0
            A
          X        Y
B  F    1, 1     0, 0
   G    2, 0     2, 2
20.4 Rationalizable strategies
              A
          X        Y        Z
   F    3, 2     0, 2     2, 4
B  G    0, 0     5, 3     2, 0
   H    3, 5     0, 0     2, 0
Here there is no strict dominance relation between any two strategies from
either player's viewpoint. However, strategy Z cannot be an optimal choice for
A no matter what B chooses. Hence we may eliminate it for the same reason
as before, since it is never used.
Then the game reduces to
            A
          X        Y
   F    3, 2     0, 2
B  G    0, 0     5, 3
   H    3, 5     0, 0
Here there is no strict dominance relation between any remaining strategies from
either player's viewpoint. However, strategy F cannot be an optimal choice for
B no matter what A chooses. Hence we may eliminate it for the same reason
as before, since it is never used.
            A
          X        Y
B  G    0, 0     5, 3
   H    3, 5     0, 0
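opponents' strategies in $\prod_{j \neq i} \tilde{S}_j^0 = \tilde{S}_{-i}^0$. If a strategy is not in this subset
it cannot be optimal against any opponents' strategies. That is,
$$\tilde{S}_i^1 = \{s_i \in \tilde{S}_i^0 : \exists\, s_{-i} \in \tilde{S}_{-i}^0,\ \forall s_i' \in \tilde{S}_i^0,\ u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})\}$$
$$\tilde{S}_i^{k+1} = \{s_i \in \tilde{S}_i^k : \exists\, s_{-i} \in \tilde{S}_{-i}^k,\ \forall s_i' \in \tilde{S}_i^k,\ u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})\}$$
for each $i$.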
4. Repeat this.
A strategy is said to be rationalizable if it survives the above iterated
eliminations. Here, rationalizable just means that there is a reason to use it,
and has nothing to do with other meanings.
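The iterated elimination of never-optimal strategies can be sketched as follows for two-player games. The code follows the definition as written above, in which a strategy survives if it is a best response to some pure strategy of the opponent; treatments that allow beliefs over mixed strategies may give different sets, so this is only an illustration. The game is the three-by-three example of this section, with A choosing among X, Y, Z and B among F, G, H.

```python
def best_replies(player, opp_strategy, own_set, u):
    """Strategies in own_set that are optimal against one opponent strategy."""
    best = max(u(player, s, opp_strategy) for s in own_set)
    return {s for s in own_set if u(player, s, opp_strategy) == best}


def iterate_best_reply_sets(SA, SB, payoff):
    """Repeatedly keep only strategies that are a best reply to some
    remaining pure strategy of the opponent.

    payoff[(a, b)] = (payoff to A, payoff to B).
    """
    def u(player, own, other):
        if player == "A":
            return payoff[(own, other)][0]
        return payoff[(other, own)][1]

    SA, SB = set(SA), set(SB)
    while True:
        newA = set().union(*(best_replies("A", b, SA, u) for b in SB))
        newB = set().union(*(best_replies("B", a, SB, u) for a in SA))
        if newA == SA and newB == SB:
            return SA, SB
        SA, SB = newA, newB


# Keys are (A's strategy, B's strategy); values are (A's payoff, B's payoff).
payoff = {
    ("X", "F"): (3, 2), ("Y", "F"): (0, 2), ("Z", "F"): (2, 4),
    ("X", "G"): (0, 0), ("Y", "G"): (5, 3), ("Z", "G"): (2, 0),
    ("X", "H"): (3, 5), ("Y", "H"): (0, 0), ("Z", "H"): (2, 0),
}

surviving = iterate_best_reply_sets({"X", "Y", "Z"}, {"F", "G", "H"}, payoff)
print(surviving)
```

In this game Z is removed in the first round and F in the second, after which X and Y remain for A and G and H remain for B.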
Let me go over one more example.
                  B
          X        Y        Z        W
   P    4, 4     2, 0     3, 1     3, 2
A  Q    4, 3     5, 4     4, 2     2, 3
   R    1, 3     2, 5     3, 2     1, 3
   S    1, 0     2, 0     2, 3     3, 4
20.5 Nash equilibrium
Given that a dominant strategy does not always exist, and that neither iterated elimination
of dominated strategies nor rationalizability always leads to a unique strategy profile,
what strategies should we think are chosen?
In game theory, a solution concept called Nash equilibrium is taken to
be the most standard one. In two-player games, Nash equilibrium refers to a
strategy profile such that
A's strategy is optimal for him given B's strategy, and
B's strategy is optimal for him given A's strategy.
In other words, it is a situation such that "I do this because you do that, and
you do that because I do this."
Compare this to the definition of dominant strategy equilibrium:
A's strategy is optimal for him no matter what B does, and
B's strategy is optimal for him no matter what A does.
Then you might notice a kind of jump or circularity in the definition of
Nash equilibrium.
Let me first finish the formal definition of Nash equilibrium.
Definition 20.2 A strategy profile $s^* = (s_1^*, \dots, s_n^*)$ is said to be a Nash
equilibrium if it holds
$$v_i(s_i^*, s_{-i}^*) \ge v_i(s_i, s_{-i}^*)$$
for all $i$ and $s_i \in S_i$.
One can restate the definition of Nash equilibrium in the following way. For
each player $i$, given a profile of the other players' strategies $s_{-i}$, let $BR_i(s_{-i})$
denote the set of strategies which are optimal for $i$ against $s_{-i}$, which is called the
best response. We take a set in general rather than a point because there
may be multiple optima. Then a strategy profile $s^* = (s_1^*, \dots, s_n^*)$ is said to be
a Nash equilibrium if it holds
$$s_i^* \in BR_i(s_{-i}^*)$$
for all $i$.
Now let us find Nash equilibria in specific examples.
Example 20.4 Market entry 2: Consider the version of the market entry game
in which both lose when both enter,
                      B
                Entry      Non-entry
A  Entry        -2, -2     10, 0
   Non-entry    0, 10      0, 0
First let me look at A's best response. Suppose B enters; then the best
response for A is not to enter, hence $BR_A(E) = \{N\}$. Suppose B does not
enter; then the best response for A is to enter, hence $BR_A(N) = \{E\}$. Do the
same exercise for B; then we obtain $BR_B(E) = \{N\}$, $BR_B(N) = \{E\}$.
Since it is not immediate to see which strategy profile satisfies the condition
$s_A \in BR_A(s_B)$, $s_B \in BR_B(s_A)$, let us do the following exercise. For each
player and each possible opponent strategy, draw lines under the payoffs given
by the best responses. For example, since A's best response when B enters is
not to enter, draw a line under A's payoff 0 in the lower-left cell $(0, 10)$, which
corresponds to the strategy profile $(N, E)$. Be careful not to do it the
reverse way. Then we obtain
                      B
                Entry       Non-entry
A  Entry        -2, -2      10, 0
   Non-entry    _0_, 10     0, 0
(underlined payoffs are rendered as _x_)
Likewise, since A's best response when B does not enter is to enter, draw a
line under A's payoff 10 in the upper-right cell $(10, 0)$, which corresponds to the
strategy profile $(E, N)$. Then we obtain
                      B
                Entry       Non-entry
A  Entry        -2, -2      _10_, 0
   Non-entry    _0_, 10     0, 0
Do the same exercise for B for all possible strategies of A; then we obtain
                      B
                Entry         Non-entry
A  Entry        -2, -2        _10_, _0_
   Non-entry    _0_, _10_     0, 0
In Nash equilibrium each player's strategy must be a best response against the
other's strategy, hence it corresponds to a cell in which both players' payoffs are
underlined. Thus in the current example there are two Nash equilibria, $(E, N)$
and $(N, E)$. In the first equilibrium, because A enters B does not enter, and
because B does not enter A enters. In the second equilibrium, because B
enters A does not enter, and because A does not enter B enters.
This example also shows that there may be multiple Nash equilibria.
Here it says only that one of the players concedes, and does not tell us anything
about which player concedes. I will come to the problem of multiple equilibria
later.
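The underlining exercise amounts to checking, cell by cell, whether each player's payoff is maximal given the other's strategy. A minimal sketch for two-player games in pure strategies, applied to the market entry game above (the function name is hypothetical):

```python
def pure_nash(SA, SB, payoff):
    """All pure-strategy Nash equilibria of a two-player game.

    payoff[(a, b)] = (payoff to A, payoff to B). A cell is an equilibrium
    when both payoffs would be underlined in the best-response exercise.
    """
    equilibria = []
    for a in SA:
        for b in SB:
            # A's payoff is maximal in the column where B plays b.
            a_best = all(payoff[(a, b)][0] >= payoff[(a2, b)][0] for a2 in SA)
            # B's payoff is maximal in the row where A plays a.
            b_best = all(payoff[(a, b)][1] >= payoff[(a, b2)][1] for b2 in SB)
            if a_best and b_best:
                equilibria.append((a, b))
    return equilibria


# Market entry 2: E = Entry, N = Non-entry, keys are (A's strategy, B's strategy).
market_entry = {
    ("E", "E"): (-2, -2), ("E", "N"): (10, 0),
    ("N", "E"): (0, 10), ("N", "N"): (0, 0),
}

equilibria = pure_nash(["E", "N"], ["E", "N"], market_entry)
print(equilibria)
```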
Let us go over three more examples for practice.
          X        Y        Z
F       2, 4     3, 1     1, 7
G       3, 5     2, 0     4, 8
H       8, 3     5, 1     7, 9
                           B
           c        p1       p2       p3       p4
   c     0, 0     0, 0     0, 0     0, 0     0, 0
   p1    0, 0     4, 4     8, 0     8, 0     8, 0
A  p2    0, 0     0, 8     6, 6     12, 0    12, 0
   p3    0, 0     0, 8     0, 12    10, 10   20, 0
   p4    0, 0     0, 8     0, 12    0, 20    15, 15
Thus there are two Nash equilibria, $(c, c)$ and $(p_1, p_1)$. It looks better for both
that they cooperate and play $(p_4, p_4)$, but they don't do so in equilibrium. For,
if your opponent is setting a high price then it is better for you to slightly undercut
the price and get all the demand than to set the same price. Your opponent
will do the same thing as well. Thus they end up undercutting prices down to either
$p_1$ or $c$.
Example 20.7 Battle of sexes: Boy A and Girl B are a couple, and their
problem is where to go for a date. There are two places, one is boxing and the
other is opera. Since the main objective is dating, if they go to different places
they get nothing. If they go to the same place, A prefers boxing to opera and
B prefers opera to boxing. Such a situation can be described by a payoff matrix
like the one below.
                   B
             Boxing    Opera
A  Boxing    2, 1      0, 0
   Opera     0, 0      1, 2
Go over the underlining exercise; then we obtain
                   B
             Boxing       Opera
A  Boxing    _2_, _1_     0, 0
   Opera     0, 0         _1_, _2_
(underlined payoffs are rendered as _x_)
Thus there are two Nash equilibria, (Boxing, Boxing) and (Opera, Opera).
Again we have the multiple equilibria problem, since Nash equilibrium tells
us only that they go to the same place and nothing about where they go.
20.5.1
20.5.2
Confess
Not to Confess
Confess
5, 5
-10, 1
Not to Confess
1, -10
2, 2
Proof. Let $s^* = (s_1^*, \dots, s_n^*)$ be the unique strategy profile that survives the
iterated elimination. Suppose it is not a Nash equilibrium. Then there exist $i$
and $s_i \in S_i$ such that
$$u_i(s_i, s_{-i}^*) > u_i(s_i^*, s_{-i}^*).$$
Since $s_i$ has been eliminated as a dominated strategy in a previous round,
let's say $k$, there is $s_i' \in S_i^k$ such that
$$u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})$$
for all $s_{-i} \in S_{-i}^k$.
Since $s_{-i}^*$ has survived the elimination it must be that $s_{-i}^* \in S_{-i}^k$. Hence we
obtain
$$u_i(s_i', s_{-i}^*) > u_i(s_i, s_{-i}^*)$$
                  A
          X        Y        Z        W
   F    3, 0     6, 1     2, 3     0, 2
B  G    1, 1     1, 2     0, 1     2, 4
   H    4, 2     2, 0     1, 2     3, 1
   I    2, 1     0, 5     1, 4     1, 0
As we saw there, $\{X, W\}$ is the set of A's strategies and $\{G, H\}$ is the set of
B's strategies which survive the iterated elimination. Now there are
two Nash equilibria, $(X, H)$ and $(W, G)$.
20.5.3
Let us go back to the definition of Nash equilibrium illustrated for the two-player
case,
A's strategy is optimal for him given B's strategy, and
B's strategy is optimal for him given A's strategy.
In other words, it is a situation such that A does this because B does that, and
B does that because A does this.
However, in order to say "because B does that", it must be that A is correctly
predicting that B does that, and also in order to say "because A does this", it must
be that B is correctly predicting that A does this.
This correct prediction cannot be reached just by the iterated elimination
of dominated strategies or by the rationalizability argument. It is known that
when they lead to a unique strategy profile it is a Nash equilibrium, but in general
they do not narrow down to a unique strategy profile. Thus there is a leap from there
to Nash equilibrium, which must be based on mutually correct predictions about
each other's choices.
Since even the iterated elimination of dominated strategies or the rationalizability argument seems to require a pretty high ability of logical reasoning, such
a further leap toward Nash equilibrium may seem to require that players are
super-rational.
One might argue to the contrary: "No, Nash equilibrium looks like requiring
super-rationality because you are looking at the situation in too static a manner. If you look at the situation from a dynamic viewpoint, you will see that
players learn to play Nash equilibrium through learning and imitation without
having such super-rationality." From this viewpoint players may not have correct predictions about each other's actions initially, but as the game is played
repeatedly they learn about each other and gradually form correct predictions
about each other's actions, which converge to Nash equilibrium in the long run.
See for example Kalai and Lehrer [14] for one such theoretical result.
20.6 Mixed strategies
Does Nash equilibrium always exist, by the way? It is easy to find an example in
which it does not exist. Rock-scissors-paper is a representative example. There it
is impossible for both players to take actions which are optimal against
each other.
Let us consider an even simpler example.
Example 20.8 Matching pennies: A and B simultaneously show their coins.
A wins if their faces match and B wins if they don't. For simplicity, let me assume
that if you win you get 1 and if you lose you lose 1. Then the payoff matrix is
                B
            Head       Tail
A  Head     1, -1      -1, 1
   Tail     -1, 1      1, -1
It is easy to see that there is no Nash equilibrium. You can see this by doing
the underlining exercise, which results in
                B
            Head         Tail
A  Head     _1_, -1      -1, _1_
   Tail     -1, _1_      _1_, -1
(underlined payoffs are rendered as _x_)
Head is called a pure strategy. Note, however, that a pure strategy is a special
case of a mixed strategy, because showing Head is nothing but showing Head
with probability 1 and showing Tail with probability 0.
How should we interpret mixed strategies? We can think of two interpretations. One is literal, which says we literally randomize actions. For example,
if you want to implement a mixed strategy showing Head with probability 0.3
and showing Tail with probability 0.7, bring cards numbered from 1 to 10, and
show Head if you draw one of the numbers 1 to 3 and show Tail if you draw one
of the numbers 4 to 10. For the other interpretation, imagine that players are
randomly drawn and matched from a large population. Consider for example
which side of the road you walk on. Then a mixed strategy like walking Left with
probability 0.3 and walking Right with probability 0.7 is interpreted as saying that 30%
of the people you encounter walk Left and 70% of the people you encounter walk Right.
The second interpretation takes mixed strategies as such collective behaviors.
The following result is known. If you are curious about its proof see an
advanced textbook such as Mas-Colell, Whinston and Green [21].
Theorem 20.1 When the sets of strategies are finite, a Nash equilibrium always
exists in mixed strategies.
Now how do we find mixed-strategy Nash equilibria? You can do it by
extending the best response argument to mixed strategies. Let me illustrate it
using the example of matching pennies. Let $p_A$ denote the probability that A
shows Head and let $p_B$ denote the probability that B shows Head.
Then, A's expected utility given a combination of mixed strategies $(p_A, p_B)$
is
$$u_A(p_A, p_B) = p_A p_B - p_A(1 - p_B) - (1 - p_A)p_B + (1 - p_A)(1 - p_B) = 2(2p_B - 1)\,p_A - 2p_B + 1.$$
Hence A's best response is
$$BR_A(p_B) = \begin{cases} \{0\}, & \text{when } p_B < 0.5, \\ [0, 1], & \text{when } p_B = 0.5, \\ \{1\}, & \text{when } p_B > 0.5. \end{cases}$$
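When $p_B < 0.5$ the coefficient on A's own probability of Head $p_A$, given by
$2(2p_B - 1)$, is negative. Hence A's expected utility is linearly decreasing in
$p_A$. Since $p_A$ moves between 0 and 1, the maximal expected utility is obtained
at the left end-point, which is $p_A = 0$. When $p_B > 0.5$ the coefficient on
A's own probability of Head $p_A$, given by $2(2p_B - 1)$, is positive. Hence A's
expected utility is linearly increasing in $p_A$, and the maximal expected utility is obtained
at the right end-point, which is $p_A = 1$. When $p_B = 0.5$ the coefficient is zero, so every $p_A$ is optimal.
Likewise, B's expected utility given $(p_A, p_B)$ is
$$u_B(p_A, p_B) = -p_A p_B + p_A(1 - p_B) + (1 - p_A)p_B - (1 - p_A)(1 - p_B) = (2 - 4p_A)\,p_B - 1 + 2p_A,$$
and B's best response is
$$BR_B(p_A) = \begin{cases} \{1\}, & \text{when } p_A < 0.5, \\ [0, 1], & \text{when } p_A = 0.5, \\ \{0\}, & \text{when } p_A > 0.5. \end{cases}$$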
Now, the best responses of the two players are depicted in Figure 20.1. In
a mixed-strategy Nash equilibrium, the combination of each player's probability
of showing Head $(p_A, p_B)$ satisfies
$$p_A \in BR_A(p_B),$$
$$p_B \in BR_B(p_A).$$
That is, it is the point at which the two best response graphs intersect. Here the
intersection is $(p_A, p_B) = (0.5, 0.5)$. Thus the mixed-strategy Nash equilibrium
is
((Head 0.5, Tail 0.5), (Head 0.5, Tail 0.5)).
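For two-by-two games, the interior intersection of the best response graphs can be computed from the indifference conditions: each player's mixing probability must make the other player indifferent between his two pure strategies. The sketch below is an illustration under the assumption that such an interior equilibrium exists (the denominators are nonzero); the function name is hypothetical.

```python
from fractions import Fraction as F


def mixed_equilibrium_2x2(uA, uB):
    """Interior mixed equilibrium of a 2x2 game via indifference conditions.

    uA[i][j], uB[i][j]: payoffs when A plays strategy i and B plays strategy j
    (index 0 is each player's first strategy). Returns (pA, pB), the
    probabilities of the first strategy. Assumes nonzero denominators.
    """
    # pB makes A indifferent between his two strategies.
    dA = uA[0][0] - uA[0][1] - uA[1][0] + uA[1][1]
    pB = F(uA[1][1] - uA[0][1], dA)
    # pA makes B indifferent between her two strategies.
    dB = uB[0][0] - uB[0][1] - uB[1][0] + uB[1][1]
    pA = F(uB[1][1] - uB[1][0], dB)
    return pA, pB


# Matching pennies: strategy 0 = Head, 1 = Tail.
pennies = mixed_equilibrium_2x2([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])

# Market entry with losses when both enter: strategy 0 = Entry, 1 = Non-entry.
entry = mixed_equilibrium_2x2([[-2, 10], [0, 0]], [[-2, 0], [10, 0]])

# Battle of sexes: strategy 0 = Boxing, 1 = Opera.
battle = mixed_equilibrium_2x2([[2, 0], [0, 1]], [[1, 0], [0, 2]])

print(pennies, entry, battle)
```

This only finds the interior crossing point; pure-strategy equilibria, when they exist, have to be found separately, for instance by the underlining exercise.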
Let us go over two more examples. Consider a version of the market entry game
we saw before,
                      B
                Entry      Non-entry
A  Entry        -2, -2     10, 0
   Non-entry    0, 10      0, 0
We already know that there are two pure-strategy Nash equilibria, (Entry, Non-entry) and (Non-entry, Entry). There is one more when we allow mixed strategies, however.
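[Figure 20.1: best responses $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane for matching pennies, crossing at $(0.5, 0.5)$]
Let $p_A$ and $p_B$ denote the probabilities that A and B enter, respectively. A's expected utility given a combination of mixed strategies $(p_A, p_B)$ is
$$u_A(p_A, p_B) = -2p_A p_B + 10p_A(1 - p_B) = (10 - 12p_B)\,p_A,$$
so A's best response is
$$BR_A(p_B) = \begin{cases} \{1\}, & \text{when } p_B < 5/6, \\ [0, 1], & \text{when } p_B = 5/6, \\ \{0\}, & \text{when } p_B > 5/6. \end{cases}$$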
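Likewise, B's expected utility given a combination of mixed strategies $(p_A, p_B)$
is
$$EU_B(p_A, p_B) = -2p_A p_B + 10(1 - p_A)p_B = (10 - 12p_A)\,p_B,$$
so B's best response is
$$BR_B(p_A) = \begin{cases} \{1\}, & \text{when } p_A < 5/6, \\ [0, 1], & \text{when } p_A = 5/6, \\ \{0\}, & \text{when } p_A > 5/6. \end{cases}$$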
The two players' best responses are depicted in Figure 20.2, and their
graphs cross at three points. Thus there are three Nash equilibria,
((Entry 1, Non-entry 0), (Entry 0, Non-entry 1))
((Entry 0, Non-entry 1), (Entry 1, Non-entry 0))
((Entry 5/6, Non-entry 1/6), (Entry 5/6, Non-entry 1/6))
two of which are pure-strategy Nash equilibria which have already been obtained. The nice thing about this method is that you obtain all equilibria at once.
[Figure 20.2: best responses $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane for the market entry game, with kinks at 5/6]
Consider the battle of sexes,
                   B
             Boxing    Opera
A  Boxing    2, 1      0, 0
   Opera     0, 0      1, 2
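We already know that there are two pure-strategy Nash equilibria, (Boxing, Boxing)
and (Opera, Opera). There is one more when we allow mixed strategies, however.
Let $p_A$ and $p_B$ denote the probabilities that A and B choose Boxing, respectively. A's expected utility given a combination of mixed strategies $(p_A, p_B)$ is
$$u_A(p_A, p_B) = 2p_A p_B + (1 - p_A)(1 - p_B) = (3p_B - 1)\,p_A - p_B + 1,$$
so A's best response is
$$BR_A(p_B) = \begin{cases} \{0\}, & \text{when } p_B < 1/3, \\ [0, 1], & \text{when } p_B = 1/3, \\ \{1\}, & \text{when } p_B > 1/3. \end{cases}$$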
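Likewise, B's expected utility given a combination of mixed strategies $(p_A, p_B)$
is
$$EU_B(p_A, p_B) = p_A p_B + 2(1 - p_A)(1 - p_B) = (3p_A - 2)\,p_B - 2p_A + 2,$$
so B's best response is
$$BR_B(p_A) = \begin{cases} \{0\}, & \text{when } p_A < 2/3, \\ [0, 1], & \text{when } p_A = 2/3, \\ \{1\}, & \text{when } p_A > 2/3. \end{cases}$$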
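The two players' best responses are depicted in Figure 20.3, and their
graphs cross at three points. Thus there are three Nash equilibria,
((Boxing 1, Opera 0), (Boxing 1, Opera 0))
((Boxing 0, Opera 1), (Boxing 0, Opera 1))
((Boxing 2/3, Opera 1/3), (Boxing 1/3, Opera 2/3))
[Figure 20.3: best responses $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane for the battle of sexes, with kinks at $p_B = 1/3$ and $p_A = 2/3$]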
20.7
Reconsider the example which I raised in order to say that a weakly dominated
strategy should not necessarily be eliminated.
            A
          X        Y
B  F    1, 1     0, 0
   G    2, 0     2, 2
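First let us find all Nash equilibria in this game. Let $p_A$ denote the probability
that A chooses X, and let $p_B$ denote the probability that B chooses F.
Then, A's expected utility given $(p_A, p_B)$ is
$$u_A(p_A, p_B) = p_A p_B + 2(1 - p_B) = p_A p_B - 2p_B + 2,$$
so A's best response is
$$BR_A(p_B) = \begin{cases} [0, 1], & \text{when } p_B = 0, \\ \{1\}, & \text{when } 0 < p_B \le 1. \end{cases}$$
Likewise, B's expected utility is
$$u_B(p_A, p_B) = p_A p_B + 2(1 - p_A)(1 - p_B) = (3p_A - 2)\,p_B - 2p_A + 2,$$
so B's best response is
$$BR_B(p_A) = \begin{cases} \{0\}, & \text{when } 0 \le p_A < 2/3, \\ [0, 1], & \text{when } p_A = 2/3, \\ \{1\}, & \text{when } 2/3 < p_A \le 1. \end{cases}$$
[Figure 20.4: best responses $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane, with a kink at $p_A = 2/3$]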
When we depict the best responses we obtain Figure 20.4. The set of Nash
equilibria consists of the points at which the two graphs intersect, which is actually
a continuum. It consists of two components,
$$\{(p_A, p_B) : 0 \le p_A \le 2/3,\ p_B = 0\}$$
and
$$(p_A, p_B) = (1, 1).$$
Thus, Y is a weakly dominated strategy, but it can be played in Nash equilibria.
However, it is optimal for A to choose Y only when B chooses G for sure.
It is an unreliable choice, given that there might be an error in the opponent's
choice.
This leads us to think of a game in which each of A and B chooses any
action at least with probability whether he wants it or not. This is called a
perturbed game
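Recall that A's expected utility is
\[ u_A(p_A, p_B) = p_A p_B - 2p_B + 2. \]
In the perturbed game every probability is restricted to $[\varepsilon, 1 - \varepsilon]$, so A's best response is
\[ BR_A^{\varepsilon}(p_B) = \{1 - \varepsilon\}, \quad \text{when } \varepsilon \le p_B \le 1 - \varepsilon, \]
and B's best response is
\[ BR_B^{\varepsilon}(p_A) = \begin{cases} \{\varepsilon\}, & \text{when } \varepsilon \le p_A < 2/3, \\ [\varepsilon, 1 - \varepsilon], & \text{when } p_A = 2/3, \\ \{1 - \varepsilon\}, & \text{when } 2/3 < p_A \le 1 - \varepsilon. \end{cases} \]
[Figure 20.5: best responses BRA and BRB in the perturbed game; horizontal axis pA (BRB switches at 2/3), vertical axis pB.]
When we depict the best responses of the two we obtain Figure 20.5. Let $(p_A^{\varepsilon}, p_B^{\varepsilon})$ denote the corresponding choice probabilities of X and F in Nash equilibrium in the perturbed game; then it is given by the conditions
\[ p_A^{\varepsilon} \in BR_A^{\varepsilon}(p_B^{\varepsilon}), \qquad p_B^{\varepsilon} \in BR_B^{\varepsilon}(p_A^{\varepsilon}). \]
For $\varepsilon < 1/3$ the only solution is $(p_A^{\varepsilon}, p_B^{\varepsilon}) = (1 - \varepsilon, 1 - \varepsilon)$, which approaches $(1, 1)$ as $\varepsilon$ goes to 0.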
20.8
Which one is played when there are multiple Nash equilibria? The argument of equilibrium refinement excludes unlikely equilibria by eliminating non-robust choices such as weakly dominated strategies. It leaves many multiple-equilibria problems unsolved, however.
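Consider for example a version of the market entry game

                        A
                  Entry     Non-entry
B   Entry        -2, -2       0, 10
    Non-entry    10, 0        0, 0

in which the first number in each cell is A's payoff. There are three Nash equilibria, (E, N), (N, E) and ((E 5/6, N 1/6), (E 5/6, N 1/6)), and all of them are trembling-hand perfect. Likewise, in the battle of sexes

                    A
              Boxing    Opera
B   Boxing     2, 1      0, 0
    Opera      0, 0      1, 2

there are three Nash equilibria, (Box, Box), (Ope, Ope) and ((Box 2/3, Ope 1/3), (Box 1/3, Ope 2/3)), and all of them are trembling-hand perfect.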
There is a literature called equilibrium selection, which proposes criteria for selecting an equilibrium positively rather than trying to eliminate unlikely equilibria as in equilibrium refinement.
While the equilibrium refinement literature resorts to each player's individual and intellectual rationality, wanting strategic choice to be robust to certain kinds of errors, the equilibrium selection literature has a flavor of bringing in factors other than individual and intellectual rationality from outside.
Consider the following example.

                  A
              Stag      Rabbit
B   Stag     10, 10      6, 0
    Rabbit    0, 6       6, 6

There are three Nash equilibria, (S, S), (R, R) and ((S 3/5, R 2/5), (S 3/5, R 2/5)), and all of them are trembling-hand perfect.
According to payoff dominance (S, S) is played, but it is a risky choice to play this equilibrium in the sense that it depends critically on the opponent surely following the same equilibrium play. Since it is unclear which pure-strategy equilibrium the opponent will follow, let us say that he takes each action with even chance. Then if you go to hunt stag your expected utility is 10 × 0.5 = 5. On the other hand, if you go to hunt rabbit you get 6 no matter what the opponent does. Thus we may say that going to hunt rabbit is the safe choice and that (R, R) is the safe equilibrium. This criterion is called risk dominance.
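The even-chance comparison above is a one-line expected-value calculation. The following Python sketch merely repeats it; the helper name `expected_payoff` and the ordering of the lists are assumptions made for the illustration.

```python
def expected_payoff(own_payoffs, belief):
    """Expected payoff of one action given a belief over the opponent's actions."""
    return sum(prob * payoff for prob, payoff in zip(belief, own_payoffs))

# Lists are ordered as (opponent hunts Stag, opponent hunts Rabbit).
even_chance = [0.5, 0.5]
stag_value = expected_payoff([10, 0], even_chance)    # own action: Stag
rabbit_value = expected_payoff([6, 6], even_chance)   # own action: Rabbit
print(stag_value, rabbit_value)  # 5.0 6.0
```

Changing `even_chance` shows how the comparison depends on the belief: hunting stag becomes the better choice once the opponent is believed to hunt stag with probability above 3/5.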
20.9
Exercises
        X       Y       Z       V       W
F     0, 2    2, 1    1, 3    2, 1    3, 4
G     3, 1    0, 3    2, 0    4, 1    1, 3
H     0, 2    2, 4    1, 1    1, 3    3, 0
I     5, 3    1, 1    3, 2    4, 1    2, 2
(1) Find the set of strategies which survive the iterated elimination of dominated
strategies.
(2) Find the set of rationalizable strategies.
(3) Find all pure-strategy Nash equilibria.
Exercise 24 Consider a game in which players simultaneously choose integers from 0 to 100, and one wins 100 dollars if his number is closest to half of the average of the chosen numbers. Find the set of rationalizable strategies.
Exercise 25 There are three players A, B, and C. A chooses between X and Y, B chooses between F and G, and C chooses between K and L. The payoff matrix when A chooses X is on the left below, and the one when A chooses Y is on the right, where in each cell the number on the left is A's payoff, the one in the middle is B's payoff and the one on the right is C's payoff.
sA = X
                B
            F           G
C   K    2, 1, 4     7, 2, 1
    L    4, 3, 8     2, 1, 3

sA = Y
                B
            F           G
C   K    5, 1, 8     4, 9, 3
    L    1, 3, 2     3, 4, 5
Chapter 21
Extensive-form games deal with sequential decisions with turns, which are described by game trees. It will be better to start with an example.
Example 21.1 Market entry: There are two firms, A and B. A is a potential entrant to the market and B is an incumbent monopoly firm. First, A decides whether to enter (E) the market or not (N). Then B decides whether to fight (F) or compromise (C) after seeing A's action.
[Game tree: A chooses E or N. After N the game ends with payoffs (A, B) = (0, 20). After E, B chooses F, giving (-20, -5), or C, giving (10, 10).]
Payoffs are explained as follows. If A does not enter, A receives payoff 0 and B receives the monopoly profit 20. If A enters and B fights, A loses 20 and B loses 5 as well. If A enters and B compromises, each receives profit 10.
We say that the above extensive-form game is with perfect information because the second mover can monitor the first mover's action.
First let us look into this extensive-form game by representing it as a normal-form game. It is called a normal-form expression of the extensive-form game.

               A
            E           N
B   F   -20, -5      0, 20
    C    10, 10      0, 20

This normal-form game has two (pure-strategy) Nash equilibria. One is (E, C) and the other is (N, F).
The latter Nash equilibrium is unrealistic, however. The story behind it is that the entrant refrains from entry because of the incumbent's threat saying "if you enter I will fight." However, once entry is done the decision problem for the incumbent is
[Subgame after entry: B chooses F, giving payoffs (A, B) = (-20, -5), or C, giving (10, 10).]
Such a game which follows after preceding actions is called a subgame. In the above subgame after entry the incumbent firm never fights since it is simply a waste of resources. Therefore, the threat saying "if you enter I will fight" is not credible. Such a threat which is never carried out is called an empty threat.
21.2
In extensive-form games with finitely many rounds and perfect information, the subgame-perfect Nash equilibrium is found by backward induction. Backward induction tells us here that we start with solving the subgame after A's entry. Here it is optimal for B to choose Compromise. Next we look at A's decision, in which A is supposed to foresee what B does in the game after A's entry. Under the foresight that B chooses Compromise there, A's payoff is 10 if he chooses Entry and 0 if he chooses Non-entry. Thus it is optimal for A to choose Entry. Thus backward induction yields (E, C), which is the subgame-perfect Nash equilibrium in this game.
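The backward induction procedure can be sketched as a short recursive routine in Python. The tree encoding below (tuples for terminal payoffs, dictionaries for decision nodes) and the function name `backward_induction` are assumptions chosen for this illustration; ties, which do not arise in Example 21.1, are broken in favor of the action listed first.

```python
def backward_induction(node):
    """Return (payoffs, plan) for a finite game tree with perfect information.

    A terminal node is a tuple of payoffs (u_A, u_B). A decision node is a dict
    {"player": 0 or 1, "name": str, "moves": {action: child}}, where player 0
    is A and player 1 is B. plan maps each decision node's name to the action
    chosen there, including nodes that are never reached.
    """
    if isinstance(node, tuple):
        return node, {}
    plan, best_action, best_payoffs = {}, None, None
    mover = node["player"]
    for action, child in node["moves"].items():
        payoffs, sub_plan = backward_induction(child)  # solve the subgame first
        plan.update(sub_plan)
        if best_payoffs is None or payoffs[mover] > best_payoffs[mover]:
            best_action, best_payoffs = action, payoffs
    plan[node["name"]] = best_action
    return best_payoffs, plan

# Example 21.1: A chooses E or N; after E, B chooses F or C.
entry_game = {
    "player": 0, "name": "A",
    "moves": {
        "E": {"player": 1, "name": "B after E",
              "moves": {"F": (-20, -5), "C": (10, 10)}},
        "N": (0, 20),
    },
}
payoffs, plan = backward_induction(entry_game)
print(payoffs, plan)  # (10, 10) {'B after E': 'C', 'A': 'E'}
```

The returned plan lists an action at every decision node, which is exactly what a strategy profile in an extensive-form game has to specify.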
Example 21.2 Market entry 2: Here A the entrant chooses whether to enter (E) or not to enter (N). Then B the incumbent chooses whether to take an aggressive pricing strategy (A) or a conservative pricing strategy (C), after seeing A's action.
[Game tree: A chooses E or N, and B then chooses A or C. Payoffs (A, B): after E, (-20, -5) for aggressive and (10, 10) for conservative; after N, (0, 5) for aggressive and (0, 20) for conservative.]
Like before, let me start with the normal-form expression. Notice that here the number of B's strategies is not two, but four. It is not simply "aggressive" and "conservative", but it should be "aggressive whether there is an entry or not" (denoted AA), "aggressive if there is entry, conservative if there is no entry" (AC), "conservative if there is entry, aggressive if there is no entry" (CA), and "conservative whether there is an entry or not" (CC).
In extensive-form games there is a distinction between a strategy and an action. A strategy is not a single action. Instead, it is a list of actions conditional on histories, such as "I will be conservative if there is entry, aggressive if there is no entry."
To understand, imagine for example that this game is played online and the two players submit their programs to the mediator beforehand and the mediator runs the submitted programs. Then, what A needs to submit is simply the name of an action (E or N), but what B needs to submit is an entire code saying for example "output C if input is E; output A if input is N."
Now here is the payoff matrix for the normal-form expression.

                A
             E           N
B   AA   -20, -5       0, 5
    AC   -20, -5       0, 20
    CA    10, 10       0, 5
    CC    10, 10       0, 20
There are three pure-strategy Nash equilibria in the above normal-form game:
(E, CA)
(E, CC)
(N, AC)
The third Nash equilibrium is not subgame-perfect, which is the case of empty threat. Also, the first one is not subgame-perfect. Here the incumbent firm takes an aggressive action if there is no entry, but this is not an optimal choice if there is no entry.
Thus the subgame-perfect Nash equilibrium is the second one,
(E, CC)
Let us verify this by backward induction. If A enters, B chooses Conservative which yields payoff 10 over Aggressive which yields payoff -5. If A does not enter, B chooses Conservative which yields payoff 20 over Aggressive which yields payoff 5.
Next we look at A's decision, in which A is supposed to foresee what B does after A's choice. Under the foresight that B chooses Conservative after A chooses Entry and that B chooses Conservative after A chooses Non-entry, A's payoff is 10 if he chooses Entry and 0 if he chooses Non-entry. Thus it is optimal for A to choose Entry. Thus backward induction yields
(E, CC),
which is the subgame-perfect Nash equilibrium in this game.
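The "submitted program" view of B's strategy can be made concrete with a small Python sketch for Example 21.2. The names `OUTCOME`, `B_STRATEGIES`, `payoff` and `pure_nash_equilibria` are hypothetical; each of B's strategies is stored as a lookup table from A's observed action to B's reply, and the normal-form game is then searched for pure-strategy Nash equilibria by brute force.

```python
from itertools import product

# Terminal payoffs (u_A, u_B) of Example 21.2, indexed by (A's action, B's action).
OUTCOME = {("E", "A"): (-20, -5), ("E", "C"): (10, 10),
           ("N", "A"): (0, 5), ("N", "C"): (0, 20)}

A_STRATEGIES = ["E", "N"]
# A strategy of B is a complete contingent plan: one reply for each move of A.
B_STRATEGIES = {after_e + after_n: {"E": after_e, "N": after_n}
                for after_e, after_n in product("AC", repeat=2)}

def payoff(s_a, s_b):
    """Run B's 'program' on A's move and look up the resulting payoffs."""
    return OUTCOME[(s_a, B_STRATEGIES[s_b][s_a])]

def pure_nash_equilibria():
    """All strategy pairs from which neither player gains by deviating alone."""
    equilibria = []
    for s_a, s_b in product(A_STRATEGIES, B_STRATEGIES):
        u_a, u_b = payoff(s_a, s_b)
        a_ok = all(payoff(d, s_b)[0] <= u_a for d in A_STRATEGIES)
        b_ok = all(payoff(s_a, d)[1] <= u_b for d in B_STRATEGIES)
        if a_ok and b_ok:
            equilibria.append((s_a, s_b))
    return equilibria

print(pure_nash_equilibria())  # [('E', 'CA'), ('E', 'CC'), ('N', 'AC')]
```

The search finds the same three equilibria as the payoff matrix; it does not by itself tell which of them is subgame-perfect, which requires checking B's reply at both of his decision nodes.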
Note that here (Entry, Conservative) is not a right description of strategies while it is right as a description of the observed path of actions. You might wonder, why do we care whether the incumbent is taking an optimal action when there is no entry, given that the entrant is entering? It does matter whether the second mover is taking an optimal action at every node, even if such a node is not reached.
To illustrate, consider the following extensive-form game.
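[Game tree: A chooses X or Y, after which B moves; the payoffs are not recoverable from the source.]
[Game tree: A proposes X1, X2 or X3, and B then answers Yes or No. Payoffs (A, B) after Yes are (99, 1) for X1, (50, 50) for X2 and (1, 99) for X3; after No both receive 0.]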
The normal-form expression of the above game is below, where for example NYY denotes B's strategy "No if X1, Yes if X2 and Yes if X3", and similarly for the other ones.
                  A
            X1        X2        X3
B   YYY   99, 1    50, 50     1, 99
    YYN   99, 1    50, 50     0, 0
    YNY   99, 1     0, 0      1, 99
    YNN   99, 1     0, 0      0, 0
    NYY    0, 0    50, 50     1, 99
    NYN    0, 0    50, 50     0, 0
    NNY    0, 0     0, 0      1, 99
    NNN    0, 0     0, 0      0, 0
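[Figure 21.1: Centipede game. A moves at turns 1, 3 and 5 and B at turns 2, 4 and 6; at each turn the mover chooses C (continue) or S (stop). Payoffs (A, B) when the game is stopped at turns 1 to 6 are (1, 1), (0, 21), (20, 20), (19, 40), (39, 39) and (38, 59); if nobody stops they are (58, 58).]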
In experiments it has been observed that when the first mover makes a greedy proposal the second mover rejects it even though it is profitable to accept it. How should we interpret this? One explanation is that the above numbers are no more than material payoffs given by the experimenter, and they do not have to be the payoffs perceived by the subjects. When this is the case, payoffs perceived by the subjects are something which cannot be controlled by the experimenter and hence have to be estimated. Also, such perception depends severely on how experiments are carried out (e.g., whether it is face-to-face or not).
Let us go over one more example.
Example 21.4 The game has six turns. First A chooses whether to continue the game (C) or to stop the game (S). If A continues then B chooses whether to continue or to stop the game. If B continues then A chooses whether to continue or to stop the game, and so on. As in Figure 21.1 each of A and B has at most three turns. Payoffs are as listed in the figure. From the shape of the game tree we call it a centipede game.
The normal-form expression of the above is below, where A's strategy, let's say SCS, says "choose S in the first turn, C in the third turn and S in the fifth turn", and similarly for the other ones.
                                          A
           CCC      CCS      CSC      CSS      SCC     SCS     SSC     SSS
B   CCC  58, 58   39, 39   20, 20   20, 20    1, 1    1, 1    1, 1    1, 1
    CCS  38, 59   39, 39   20, 20   20, 20    1, 1    1, 1    1, 1    1, 1
    CSC  19, 40   19, 40   20, 20   20, 20    1, 1    1, 1    1, 1    1, 1
    CSS  19, 40   19, 40   20, 20   20, 20    1, 1    1, 1    1, 1    1, 1
    SCC   0, 21    0, 21    0, 21    0, 21    1, 1    1, 1    1, 1    1, 1
    SCS   0, 21    0, 21    0, 21    0, 21    1, 1    1, 1    1, 1    1, 1
    SSC   0, 21    0, 21    0, 21    0, 21    1, 1    1, 1    1, 1    1, 1
    SSS   0, 21    0, 21    0, 21    0, 21    1, 1    1, 1    1, 1    1, 1
You might wonder why we need to think of what A would do in the third and fifth turns even if he stops the game in the first turn. We need to think, however, about "though I stop in the first turn, what should I do in the third and fifth turns if I mistakenly continue the game for some reason?"
Here all the strategy profiles corresponding to the sixteen cells in the lower-right are Nash equilibria. However, only (SSS, SSS) is subgame-perfect, for backward induction leads to:
1. B stops the game if the sixth turn comes to him;
2. since A foresees it he stops the game if the fifth turn comes to him;
3. since B foresees it he stops the game if the fourth turn comes to him;
4. since A foresees it he stops the game if the third turn comes to him;
5. since B foresees it he stops the game if the second turn comes to him;
6. since A foresees it he stops the game in the first turn.
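The six steps above can be replayed mechanically. The following Python sketch walks backward through the turns of the centipede game using the payoffs listed in the normal-form table; the names `STOP_PAYOFFS`, `CONTINUE_TO_END` and `solve_centipede` are assumptions for the illustration, and a mover who is indifferent is assumed to stop.

```python
# Payoffs (u_A, u_B) if the game is stopped at turn 1, ..., 6,
# and the payoffs if nobody ever stops.
STOP_PAYOFFS = [(1, 1), (0, 21), (20, 20), (19, 40), (39, 39), (38, 59)]
CONTINUE_TO_END = (58, 58)

def solve_centipede(stop_payoffs, end_payoffs):
    """Backward induction: return the action at each turn and the outcome."""
    continuation = end_payoffs           # outcome if the last mover continues
    actions = []
    for turn in range(len(stop_payoffs), 0, -1):
        mover = (turn - 1) % 2           # 0 = A at odd turns, 1 = B at even turns
        stop = stop_payoffs[turn - 1]
        if stop[mover] >= continuation[mover]:
            actions.append("S")
            continuation = stop          # earlier movers now foresee this stop
        else:
            actions.append("C")          # continuation value is unchanged
    actions.reverse()                    # list actions from turn 1 to the last turn
    return actions, continuation

actions, outcome = solve_centipede(STOP_PAYOFFS, CONTINUE_TO_END)
print(actions, outcome)  # ['S', 'S', 'S', 'S', 'S', 'S'] (1, 1)
```

Every turn ends with S, so the strategies are SSS for both players and the game stops immediately with payoffs (1, 1), far below the (58, 58) available if both always continued.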
21.3
So far we have assumed that the succeeding players can observe the actions taken by the preceding players. This assumption is called perfect information. On the other hand, the game is called an extensive-form game with imperfect information when the succeeding players cannot necessarily observe the preceding players' actions.
Figure 21.2 is such an example. Here the second mover cannot monitor the action taken by the first mover when he chooses his action. Thus, this game does not have a subgame which is strictly smaller than itself; that is, the entire game itself is the only subgame of it. Therefore its subgame-perfect Nash equilibrium and its Nash equilibrium are the same.
Thus in its normal-form expression

                 A
             X            Y
B   F    100, 30       80, 10
    G     60, 10      180, 5

we find that its Nash equilibrium is (X, F), which is also subgame-perfect vacuously.
Now suppose that the game in Figure 21.2 actually has a preceding stage in which B chooses whether to play this game or not. Thus let us consider an extensive-form game as in Figure 21.3.
Here B's action E is interpreted as entry and N is interpreted as non-entry. Then let NG for example denote B's strategy such that he does not enter but if he enters he chooses G in the game after the entry.
[Figure 21.2: game tree in which the second mover does not observe the first mover's action. Payoffs (A, B) are (100, 30) at (X, F), (80, 10) at (Y, F), (60, 10) at (X, G) and (180, 5) at (Y, G).]
[Figure 21.3: Imperfect information game 2. B first chooses E or N; N ends the game with payoffs (A, B) = (200, 40), and E leads to the game in Figure 21.2.]
Then the normal-form expression of this extensive-form game is given by
                  A
              X            Y
B   EF    100, 30       80, 10
    EG     60, 10      180, 5
    NF    200, 40      200, 40
    NG    200, 40      200, 40
Here this normal-form expression has four pure-strategy Nash equilibria, (X, NF), (X, NG), (Y, NF) and (Y, NG). We can still apply the backward induction argument in a generalized sense here, however. Consider that B has already chosen action E for whatever reason and the game after the entry as in Figure 21.2 is already here. Then the Nash equilibrium in the game after the entry is (X, F), which yields payoffs (100, 30). Now consider that B is choosing between E and N. Since he foresees that the consequence of choosing E is (100, 30) and that of choosing N is (200, 40), he chooses N over E. Hence the subgame-perfect Nash equilibrium is (X, NF).
Again let me emphasize that even when B chooses not to enter and the game after entry is not played we need to describe what they will do in there, since B needs to reason about the consequence of choosing entry when he decides whether to enter or not.
21.4
Bargaining game
21.4.1
One-period bargaining
First consider the one-period case like before, where A proposes and B either accepts or rejects. Denote A's receipt by $x$, and denote an arbitrary proposal by $(x, 1 - x)$, where $x$ can be any number from 0 to 1. If B accepts then A's offer goes through, and if he rejects all the money disappears.
As we solve this extensive-form game by backward induction, it is optimal for B to accept any proposal $0 \le x \le 1$. It is strictly better for B to accept the proposal when $0 \le x < 1$. When $x = 1$ he is indifferent between accepting and rejecting, so we can take accepting to be optimal in case of ties.
Since A then expects that any proposal is accepted he makes the greediest proposal $x = 1$.
21.4.2
Next consider that the two players alternately make proposals over two periods.
They bargain in the following procedure:
21.4.3
Now let us extend the above argument to a general even number of periods, $2T$.
They bargain in the following procedure:
1. At Period 1, A gives a proposal to B. Denote the proposed amount to be given to A by $x_1$, then the proposed allocation is $(x_1, 1 - x_1)$. B either accepts or rejects the proposal. If B accepts then A's proposal goes through, and if he rejects they go to Period 2.
2. At Period 2, B gives a proposal to A. Denote the proposed amount to be given to A by $x_2$, then the proposed allocation is $(x_2, 1 - x_2)$. A either accepts or rejects the proposal. If A accepts then B's proposal goes through, and if he rejects they go to Period 3.
3. After that, if they have not reached agreement by the previous period, A makes a proposal in odd periods and B does in even periods.
4. If they come to Period $2T$ and they do not reach agreement the money disappears.
As we solve this extensive-form game by backward induction, from the previous argument A accepts any proposal $0 \le x_{2T} \le 1$ at Period $2T$. Since B expects that any proposal is accepted he makes the greediest proposal
\[ (x_{2T}, 1 - x_{2T}) = (0, 1). \]
Given this, look into B's acceptance/rejection decision at Period $2T-1$. If B accepts A's proposal $(x_{2T-1}, 1 - x_{2T-1})$ he receives $1 - x_{2T-1}$. On the other hand, if B rejects A's proposal they go to Period $2T$, in which B proposes $(0, 1)$ and A accepts it. Since B's payoff 1 to be received at Period $2T$ is seen to be $\delta$ from the viewpoint of Period $2T-1$, where $\delta$ is the discount factor, B accepts A's proposal as far as $1 - x_{2T-1} \ge \delta$, that is, $x_{2T-1} \le 1 - \delta$, and rejects when $x_{2T-1} > 1 - \delta$. Note that since B is indifferent between accepting and rejecting when $x_{2T-1} = 1 - \delta$, we can take accepting to be optimal in case of ties.
Since A expects this he makes the greediest proposal $x_{2T-1} = 1 - \delta$, which yields
\[ (x_{2T-1}, 1 - x_{2T-1}) = (1 - \delta, \delta). \]
Given this, look into A's acceptance/rejection decision at Period $2T-2$. If A accepts B's proposal $(x_{2T-2}, 1 - x_{2T-2})$ he receives $x_{2T-2}$. On the other hand, if A rejects B's proposal they go to Period $2T-1$, in which A proposes $(1 - \delta, \delta)$ and B accepts it. Since A's payoff $1 - \delta$ to be received at Period $2T-1$ is seen to be $\delta(1 - \delta)$ from the viewpoint of Period $2T-2$, A accepts B's proposal as far as $x_{2T-2} \ge \delta - \delta^2$, and rejects when $x_{2T-2} < \delta - \delta^2$. Note that since A is indifferent between accepting and rejecting when $x_{2T-2} = \delta - \delta^2$, we can take accepting to be optimal in case of ties.
Since B expects this he makes the greediest proposal $x_{2T-2} = \delta - \delta^2$, which yields
\[ (x_{2T-2}, 1 - x_{2T-2}) = (\delta - \delta^2, 1 - \delta + \delta^2). \]
Given this, look into B's acceptance/rejection decision at Period $2T-3$. If B accepts A's proposal $(x_{2T-3}, 1 - x_{2T-3})$ he receives $1 - x_{2T-3}$. On the other hand, if B rejects A's proposal they go to Period $2T-2$, in which B proposes $(\delta - \delta^2, 1 - \delta + \delta^2)$ and A accepts it. Since B's payoff $1 - \delta + \delta^2$ to be received at Period $2T-2$ is seen to be $\delta(1 - \delta + \delta^2)$ from the viewpoint of Period $2T-3$, B accepts A's proposal as far as $1 - x_{2T-3} \ge \delta - \delta^2 + \delta^3$, that is, $x_{2T-3} \le 1 - \delta + \delta^2 - \delta^3$, and rejects when $x_{2T-3} > 1 - \delta + \delta^2 - \delta^3$. Note that since B is indifferent between accepting and rejecting when $x_{2T-3} = 1 - \delta + \delta^2 - \delta^3$, we can take accepting to be optimal in case of ties.
Since A expects this he makes the greediest proposal $x_{2T-3} = 1 - \delta + \delta^2 - \delta^3$, which yields
\[ (x_{2T-3}, 1 - x_{2T-3}) = (1 - \delta + \delta^2 - \delta^3,\ \delta - \delta^2 + \delta^3). \]
By repeating this, we obtain that in Period 2 B proposes
\[ (x_2, 1 - x_2) = (\delta - \delta^2 + \cdots + \delta^{2T-3} - \delta^{2T-2},\ 1 - \delta + \delta^2 - \cdots - \delta^{2T-3} + \delta^{2T-2}). \]
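The period-by-period recursion in the bargaining argument can be checked numerically. The sketch below is an illustration only: it assumes a common discount factor `delta` for both players and the tie-breaking rule used in the text (an indifferent responder accepts), and the function name `period_shares` is hypothetical.

```python
from fractions import Fraction

def period_shares(total_periods, delta):
    """A's share x_t in the proposal made at each period t = 1..total_periods.

    A proposes in odd periods and B in even periods; the money disappears if
    the last proposal is rejected; both players discount by delta per period.
    """
    shares = {}
    next_a, next_b = 0, 0               # payoffs after rejecting the final offer
    for t in range(total_periods, 0, -1):
        if t % 2 == 1:                  # A proposes and leaves B just delta * next_b
            x = 1 - delta * next_b
        else:                           # B proposes and gives A just delta * next_a
            x = delta * next_a
        shares[t] = x
        next_a, next_b = x, 1 - x       # what each side gets if play reaches period t
    return shares

shares = period_shares(4, Fraction(1, 2))
print(shares[4], shares[3], shares[2], shares[1])  # 0 1/2 1/4 5/8
```

With four periods and delta = 1/2 the values at periods 4, 3 and 2 match $0$, $1 - \delta$ and $\delta - \delta^2$ from the derivation, and one further step of the same recursion gives the proposal made in Period 1.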
21.5
                A
            N            C
B   N    vo, vo       v-, v+
    C    v+, v-       vc, vc

where N stands for non-cooperation, C stands for cooperation, and $v_+ > v_c > v_o > v_-$. Beware that I switched the notation between N and C from the original version of the prisoners' dilemma. Here the unique Nash equilibrium is (N, N), which is even a dominant strategy equilibrium. That is, they try to cheat each other and together reach an outcome which is worse for both.
On the other hand, in our usual life we seem to be sustaining cooperation (more or less) instead of cheating each other. Where does the difference come from?
The difference comes from how to take the time horizon. Note that the above game is played just once. Since the game is played just once, even when you cheat your opponent you are not punished unless there is a third party who punishes.
Consider on the other hand that this game is played repeatedly, indefinitely many times. Then a player who cheats and destroys the cooperation may be punished privately by the opponent in the future by means of non-cooperative actions. Thus each player may choose to sustain cooperation even when there is no third party which enforces it, if such punishment is a credible threat. This is the basic idea of repeated game theory.
Hereafter we consider that a game is played repeatedly infinitely many times. It is hard to take this assumption literally as we die sometime, but since we don't know when we die and we don't know when the repetition ends, the assumption of infinite repetition is a reasonable description of such an open-ended situation in which there is always the possibility that you are punished privately by your opponent or partner in the future.
Here it is important that there is no terminal date, for, if the repetition is just for finitely many periods we can apply backward induction, implying that neither cooperates on the last day, implying that neither cooperates on the second last day, implying that neither cooperates on the third last day, and so on, implying that neither cooperates on the first day.
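In repeated games we need to consider how the players evaluate streams of payoffs rather than payoffs on a given day. Here we assume that the players care about the discounted present values of payoff streams.
Suppose player A receives $v_{A1}$ in Period 1, $v_{A2}$ in Period 2, and so on, and $v_{At}$ in Period $t$ in general; his payoff stream is denoted by $(v_{A1}, v_{A2}, v_{A3}, \dots)$. Let $\delta_A$ denote A's discount factor, which measures how patient A is; then the discounted present value of $(v_{A1}, v_{A2}, v_{A3}, \dots)$ is
\[ \sum_{t=1}^{\infty} v_{At}\, \delta_A^{t-1}. \]
Likewise, with $\delta_B$ denoting B's discount factor, the discounted present value of B's payoff stream $(v_{B1}, v_{B2}, v_{B3}, \dots)$ is
\[ \sum_{t=1}^{\infty} v_{Bt}\, \delta_B^{t-1}. \]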
That is, given that B is taking the trigger strategy it is optimal for A to follow the trigger strategy as well after any history, and vice versa.
In the following argument I assume the so-called one-shot deviation principle, which says that in order to check subgame-perfection of a given strategy profile it is enough to check whether the prescribed action is optimal for each player at each time, supposing that all the players (including the specified player himself) follow the given strategy profile in the future.
In the current case the principle says it is enough to check whether playing C is optimal for, let's say, A when cooperation has been sustained and whether playing N is optimal for him when cooperation has been broken, supposing that both A and B follow the trigger strategy in the future. More specifically, it means we don't have to consider for example A's deviation from the trigger strategy such that he plays N when cooperation has been sustained and plays C after cooperation has already been broken, which forms a "double deviation" from the trigger strategy.
For the proof of the one-shot deviation principle see for example Osborne and Rubinstein [24].
Here we look at A without loss of generality.
1. When cooperation has been broken before: Given that B is taking the trigger strategy, he chooses N forever including the current period. Also, given that A is following the trigger strategy from the next period, he chooses N forever from the next period. Then, A gets only $v_-$ by choosing C and he gets $v_o$ by choosing N in this period, where this choice does not change the fact that he gets $v_o$ forever from the next period. Hence it is optimal for him to choose N.
2. When cooperation has been sustained so far (including the first period): If A chooses N he gets $v_+$ since B is choosing C in the current period. From Case 1, both play N forever from the next period and A gets $v_o$ at every period in the future. Hence the payoff stream he obtains is $(v_+, v_o, v_o, \dots)$, and its discounted present value is $v_+ + \frac{\delta v_o}{1-\delta}$.
On the other hand, if A plays C he gets $v_c$ since B is choosing C in the current period. Given that B is taking the trigger strategy and that A is following the trigger strategy from the next period, A gets $v_c$ at every period in the future. Hence the payoff stream he obtains is $(v_c, v_c, \dots)$, and its discounted present value is $\frac{v_c}{1-\delta}$.
Therefore, when $\frac{v_c}{1-\delta} \ge v_+ + \frac{\delta v_o}{1-\delta}$, that is, when $\delta \ge \frac{v_+ - v_c}{v_+ - v_o}$, it is optimal for each player to follow the trigger strategy given that the opponent is taking the trigger strategy.
The condition $\delta \ge \frac{v_+ - v_c}{v_+ - v_o}$ says that both players are sufficiently patient.
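The patience condition can be checked with a few lines of Python. This is a sketch under assumptions: the trigger strategy is taken to be "play C as long as nobody has played N, and play N forever afterwards", both players share one discount factor `delta`, and the payoff numbers in the example are hypothetical values satisfying $v_+ > v_c > v_o$.

```python
from fractions import Fraction

def critical_discount_factor(v_plus, v_c, v_o):
    """Smallest delta at which following the trigger strategy is optimal."""
    return Fraction(v_plus - v_c, v_plus - v_o)

def cooperation_is_optimal(v_plus, v_c, v_o, delta):
    """Compare the two discounted present values from Case 2."""
    cooperate = v_c / (1 - delta)                  # stream (v_c, v_c, ...)
    deviate = v_plus + delta * v_o / (1 - delta)   # stream (v_+, v_o, v_o, ...)
    return cooperate >= deviate

# Hypothetical payoffs: v_+ = 5, v_c = 3, v_o = 1.
threshold = critical_discount_factor(5, 3, 1)
print(threshold)                                        # 1/2
print(cooperation_is_optimal(5, 3, 1, Fraction(3, 5)))  # True
print(cooperation_is_optimal(5, 3, 1, Fraction(2, 5)))  # False
```

Exactly at the threshold the two present values coincide, so cooperation is (weakly) optimal there as well.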
When the players' discount factor is given by $\delta = \frac{1}{1+r}$ like firms, in which $r$ denotes the pure interest rate, the condition says that the interest rate is sufficiently low. In any case, when the players are not myopic but value long-term gains
21.6
Exercise
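[Game tree for an exercise: A chooses X or Y, after which B moves; the remaining labels and payoffs are not recoverable from the source.]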
Exercise 28 A and B split a piece of land in the following procedure. First A cuts the land into two pieces. Then B takes one of the two, and A receives the remaining one. Suppose the area of the land is simply 1 and both A and B care only about the area they receive respectively. Then what is the subgame-perfect equilibrium here?
Exercise 29 Find the subgame-perfect Nash equilibria (may or may not be unique) in the game below.
[Game tree: as in Figure 21.3, with B first choosing E or N and N giving (200, 40), except that the payoff pair (180, 5) is replaced by (180, 60).]
Chapter 22
Oligopoly
In monopoly there is only one firm which has market power. Now we consider oligopoly, in which there are several firms that have market power. I start with duopoly, the case of two firms, and later extend the argument to the case of more firms.
When there are several firms the consequence of one's action depends not only on it but also on others' actions. For example, Firm A's profit depends not only on its action but also on Firm B's action, and vice versa. Hence A needs to read what B will choose and B needs to read what A will choose. Such a situation is called strategic interdependence. Discussion on strategic interaction in general is relegated to the chapters on game theory, and we focus on its implications in oligopoly markets.
Models of oligopoly are classified by
1. what firms compete by,
2. the timing of their decisions.
According to the first category we think of two kinds in this chapter; one is quantity and the other is price. According to the second category, we think of two cases, one in which the firms simultaneously choose their quantities or prices, the other in which there is a first mover and a second mover.
Thus we will cover four kinds of competition:
1. simultaneous quantity setting
2. sequential quantity setting
3. simultaneous price setting
4. sequential price setting
Again, we adopt the partial equilibrium framework for the reason I discussed in the chapter on monopoly.
22.1
First, let us consider that the firms simultaneously choose how much to provide to the market. This is called Cournot competition after the name of the economist who first analyzed this.
Consider a market in which there is a certain market maker or auctioneer who sets the price so that the given provided units are sold.
The demand side is summarized in the form of an inverse demand function $p(y)$. That is, when $y$ units of output are provided to the market one unit of it is sold for $p(y)$.
Denote firm A's cost function by $C_A$, that is, when firm A produces $y_A$ units its cost is $C_A(y_A)$. Likewise, let $C_B$ denote firm B's cost function. Suppose A provides $y_A$ units and B provides $y_B$ units; then the resulting market price is
\[ p(y_A + y_B), \]
A's profit is
\[ p(y_A + y_B)\, y_A - C_A(y_A), \]
and B's profit is
\[ p(y_A + y_B)\, y_B - C_B(y_B). \]
The two firms simultaneously choose their quantities. Here we assume that they play Nash equilibrium.
Here is how to find Nash equilibrium. First, find each player's best response to any possible combination of the other players' strategies. In the current setting, given any quantity choice $y_B$ of B, A solves its profit maximization problem
\[ \max_{y_A}\ p(y_A + y_B)\, y_A - C_A(y_A). \]
Denote the solution to the above problem by $BR_A(y_B)$, which is A's best response to $y_B$.
Likewise, given any quantity choice $y_A$ of A, B solves its profit maximization problem
\[ \max_{y_B}\ p(y_A + y_B)\, y_B - C_B(y_B). \]
Denote the solution to the above problem by $BR_B(y_A)$, which is B's best response to $y_A$.
A Nash equilibrium $(y_A^*, y_B^*)$ is a pair of quantities that are best responses to each other:
\[ y_A^* = BR_A(y_B^*), \qquad y_B^* = BR_B(y_A^*). \]
We have now two equations with two unknowns, which we can solve.
Let us do this in a simple example of linear inverse demand and constant marginal cost. The inverse demand is given by $p(y) = a - by$, A's cost function is $C_A(y_A) = c_A y_A$, and B's cost function is $C_B(y_B) = c_B y_B$.
First let us solve for A's best response. Given an arbitrary level of B's quantity $y_B$, solve A's profit maximization problem
\[ \max_{y_A}\ [a - b(y_A + y_B)]\, y_A - c_A y_A. \]
The first-order condition $a - 2b y_A - b y_B - c_A = 0$ gives
\[ BR_A(y_B) = \frac{a - c_A - b y_B}{2b}. \]
Notice that as the opponent's quantity is larger the own optimal quantity is smaller, and as the opponent's quantity is smaller the own optimal quantity is larger. When the opponent is aggressive and provides more, the market price is lower, and the firm should be more conservative. When the opponent is conservative and provides less, the market price is maintained higher, and the firm should be more aggressive and take advantage of it. This is called strategic substitutes.
Now let us solve for B's best response. Given an arbitrary level of A's quantity $y_A$, solve B's profit maximization problem
\[ \max_{y_B}\ [a - b(y_A + y_B)]\, y_B - c_B y_B, \]
which gives
\[ BR_B(y_A) = \frac{a - c_B - b y_A}{2b}. \]
Plot the two players' best response functions as in Figure 22.1. Then in Nash equilibrium $(y_A^*, y_B^*)$ we have
\[ y_A^* = \frac{a - c_A - b y_B^*}{2b}, \qquad y_B^* = \frac{a - c_B - b y_A^*}{2b}, \]
and solving these two equations gives
\[ y_A^* = \frac{a - 2c_A + c_B}{3b}, \qquad y_B^* = \frac{a - 2c_B + c_A}{3b}. \]
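[Figure 22.1: best responses BRA and BRB; horizontal axis yA, vertical axis yB. BRA runs from (a - cA)/b on the vertical axis to (a - cA)/(2b) on the horizontal axis, BRB runs from (a - cB)/(2b) on the vertical axis to (a - cB)/b on the horizontal axis, and they intersect at the Nash equilibrium.]
The total quantity is
\[ y_A^* + y_B^* = \frac{2a - c_A - c_B}{3b}, \]
the market price is
\[ p(y_A^* + y_B^*) = \frac{a + c_A + c_B}{3}, \]
and the profits are
\[ \text{A's profit} = \frac{(a - 2c_A + c_B)^2}{9b}, \qquad \text{B's profit} = \frac{(a - 2c_B + c_A)^2}{9b}. \]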
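As a numerical cross-check of the closed-form solution, the following Python sketch iterates the two best-response functions and compares the result with the formulas above. The parameter values are hypothetical, and the function names are my own; the iteration is used here only as a convenient way to locate the intersection of the two best-response lines in this linear example.

```python
def best_response(a, b, c_own, y_other):
    """Profit-maximizing quantity under p(y) = a - b*y and unit cost c_own."""
    return max(0.0, (a - c_own - b * y_other) / (2 * b))

def cournot_by_iteration(a, b, c_a, c_b, rounds=200):
    """Apply both best responses repeatedly, starting from (0, 0)."""
    y_a = y_b = 0.0
    for _ in range(rounds):
        y_a, y_b = best_response(a, b, c_a, y_b), best_response(a, b, c_b, y_a)
    return y_a, y_b

def cournot_closed_form(a, b, c_a, c_b):
    """Equilibrium quantities from the formulas derived in the text."""
    return (a - 2 * c_a + c_b) / (3 * b), (a - 2 * c_b + c_a) / (3 * b)

# Hypothetical numbers for illustration only.
a, b, c_a, c_b = 100.0, 1.0, 10.0, 20.0
y_a, y_b = cournot_by_iteration(a, b, c_a, c_b)
price = a - b * (y_a + y_b)
profit_a = (price - c_a) * y_a
print(round(y_a, 4), round(y_b, 4), round(price, 4), round(profit_a, 4))
```

With these numbers the iteration settles at the quantities given by `cournot_closed_form`, and the price and A's profit agree with $(a + c_A + c_B)/3$ and $(a - 2c_A + c_B)^2/(9b)$.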
[Figure: best responses BRA and BRB as in Figure 22.1, with the lower bounds $\underline{y}_A$ and $\underline{y}_B$ marked on the axes.]
that a quantity smaller than $\underline{y}_A$ can be optimal only when B chooses a quantity larger than $\frac{a - c_B}{2b}$. Since B never chooses a quantity larger than $\frac{a - c_B}{2b}$, any quantity smaller than $\underline{y}_A$ can never be optimal for A.
Summing up, the set of A's quantity levels which A has a reason to choose is $\left[\underline{y}_A, \frac{a - c_A}{2b}\right]$, and the set of B's quantity levels which B has a reason to choose is $\left[\underline{y}_B, \frac{a - c_B}{2b}\right]$.
By repeating this argument, the set of A's quantity levels which A has a
22.2
Next consider the case that there is a first mover and a second mover. Let A be the first mover and B be the second mover, and consider that B can set its quantity after seeing A's quantity. This is called Stackelberg competition.
Except for timing, Stackelberg competition takes the same setting as Cournot competition. Consider a market in which there is a certain market maker or auctioneer who sets the price so that the given provided units are sold.
The demand side is summarized in the form of an inverse demand function $p(y)$. That is, when $y$ units of output are provided to the market one unit of it is sold for $p(y)$.
Denote firm A's cost function by $C_A$, that is, when firm A produces $y_A$ units its cost is $C_A(y_A)$. Likewise, let $C_B$ denote firm B's cost function. Suppose A provides $y_A$ units and B provides $y_B$ units; then the resulting market price is
\[ p(y_A + y_B), \]
A's profit is
\[ p(y_A + y_B)\, y_A - C_A(y_A), \]
and B's profit is
\[ p(y_A + y_B)\, y_B - C_B(y_B). \]
In the simultaneous-move case A's strategy was A's quantity $y_A$ and B's strategy was B's quantity $y_B$, which means a strategy is simply a quantity, or a strategy is simply an action.
However, in the sequential setting there is generally a distinction between a strategy and an action. A strategy is not a single action. Instead, it is a list of actions conditional on histories.
For the first mover A it remains the same that his strategy is a quantity itself. For the second mover B, his strategy takes the form of a function $f_B$ which assigns a quantity $f_B(y_A)$ to each possible quantity $y_A$ of A.
The subgame-perfect Nash equilibrium is denoted by $(y_A^*, f_B^*)$. We can find it by backward induction, which works as follows.
1. For each possible $y_A$, the second mover B solves
\[ \max_{y_B}\ p(y_A + y_B)\, y_B - C_B(y_B), \]
and $f_B^*(y_A)$ denotes the solution.
2. Foreseeing this, the first mover A solves
\[ \max_{y_A}\ p(y_A + f_B^*(y_A))\, y_A - C_A(y_A), \]
and $y_A^*$ denotes the solution.
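Let us find the backward induction solution in the example of linear inverse demand and constant marginal cost. The inverse demand function is given by $p(y) = a - by$, A's cost function is given by $C_A(y_A) = c_A y_A$ and B's cost function is given by $C_B(y_B) = c_B y_B$.
Because it is backward induction, first we solve the second mover B's problem. For each possible $y_A$, solve B's profit maximization problem
\[ \max_{y_B}\ [a - b(y_A + y_B)]\, y_B - c_B y_B, \]
which gives
\[ f_B^*(y_A) = \frac{a - c_B - b y_A}{2b}. \]
Foreseeing this, A maximizes $[a - b(y_A + f_B^*(y_A))]\, y_A - c_A y_A = \left[\frac{a + c_B}{2} - c_A - \frac{b}{2} y_A\right] y_A$, which gives
\[ y_A^* = \frac{a - 2c_A + c_B}{2b}. \]
Hence the subgame-perfect Nash equilibrium is
\[ y_A^* = \frac{a - 2c_A + c_B}{2b}, \qquad f_B^*(y_A) = \frac{a - c_B - b y_A}{2b} \ \text{ for each } y_A. \]
The quantities chosen on the equilibrium path are
\[ y_A^* = \frac{a - 2c_A + c_B}{2b}, \qquad f_B^*(y_A^*) = \frac{a - 3c_B + 2c_A}{4b}, \]
the total quantity is
\[ y_A^* + f_B^*(y_A^*) = \frac{3a - 2c_A - c_B}{4b}, \]
the market price is
\[ p(y_A^* + f_B^*(y_A^*)) = \frac{a + 2c_A + c_B}{4}, \]
and the profits are
\[ \text{A's profit} = \frac{(a - 2c_A + c_B)^2}{8b}, \qquad \text{B's profit} = \frac{(a - 3c_B + 2c_A)^2}{16b}. \]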
Recall that As profit in the simultaneous move case under the same demand2
A +cB )
cost assumption was (a2c9b
, which implies that in sequential quantity
setting the first mover has an advantage compared to the simultaneous move
quantity setting. This is because the first mover can drive out the second mover
by providing its quantity aggressively so that the the second mover has to be
cautious in order not to destroy the market price any longer.
Of course, this does not mean that in any sequential decisions the first mover
has an advantage. Imagine for example the rock-scissors-paper game played
sequentially.
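The backward induction steps for the linear example can be checked numerically. The following is a minimal sketch, not the book's own procedure: the function names, the grid search, and the parameter values ($a = 10$, $b = 1$, $c_A = c_B = 1$) are all assumptions chosen for illustration. It lets B react optimally to each $y_A$, searches over A's quantity, and compares A's profit with the simultaneous-move profit $(a - 2c_A + c_B)^2/(9b)$.

```python
from fractions import Fraction


def follower_best_response(y_a, a, b, c_b):
    """B's best response f_B(y_A) = (a - c_B - b*y_A) / (2b), never negative."""
    return max(Fraction(0), (a - c_b - b * y_a) / (2 * b))


def leader_profit(y_a, a, b, c_a, c_b):
    """A's profit when B reacts to y_A with his best response."""
    y_b = follower_best_response(y_a, a, b, c_b)
    price = a - b * (y_a + y_b)
    return (price - c_a) * y_a


def solve_by_backward_induction(a, b, c_a, c_b, steps=1000):
    """Grid search over y_A in [0, a/b], anticipating B's reaction."""
    a, b, c_a, c_b = (Fraction(x) for x in (a, b, c_a, c_b))
    grid = [a / b * Fraction(k, steps) for k in range(steps + 1)]
    y_a = max(grid, key=lambda y: leader_profit(y, a, b, c_a, c_b))
    y_b = follower_best_response(y_a, a, b, c_b)
    price = a - b * (y_a + y_b)
    return {
        "y_a": y_a,
        "y_b": y_b,
        "price": price,
        "profit_a": (price - c_a) * y_a,
        "profit_b": (price - c_b) * y_b,
    }


# Hypothetical parameters, chosen only for illustration.
result = solve_by_backward_induction(a=10, b=1, c_a=1, c_b=1)
# A's profit in the simultaneous-move (Cournot) case for the same parameters.
cournot_profit_a = Fraction((10 - 2 * 1 + 1) ** 2, 9 * 1)
print(result, cournot_profit_a)
```

The grid search is only a check on the closed-form solution; with exact fractions the maximizer coincides with $y_A^* = (a - 2c_A + c_B)/(2b)$ whenever that value lies on the grid, as it does for these parameters.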
22.3
Next let us consider the oligopoly situation in which the firms simultaneously set their prices. This is called Bertrand competition. We will focus on the case of two firms.
22.3.1
No product differentiation
First let us consider that the two firms sell identical products.
The case of identical marginal costs
Also assume that the two firms have the same and constant marginal cost c.
Denote the market demand function by $x(p)$. Denote A's price by $p_A$ and B's price by $p_B$; then the consequences are classified as follows:
1. When $p_A < p_B$, A takes all the demand $x(p_A)$.
2. When $p_A > p_B$, B takes all the demand $x(p_B)$.
3. When $p_A = p_B$, they split the demand by half.
22.3.2
Let us think of the case with product differentiation. Here even if one firm sets a higher price than the other does, it does not immediately lose its demand.
Let $p_A$ denote A's price and $p_B$ denote B's price. The market demand function for A's product is given by $x_A(p_A, p_B)$ and that for B's product is given by $x_B(p_A, p_B)$. Let $C_A$ denote A's cost function and $C_B$ denote B's cost function.
Then A's profit is given by
$$p_A x_A(p_A, p_B) - C_A(x_A(p_A, p_B)),$$
and B's is given by
$$p_B x_B(p_A, p_B) - C_B(x_B(p_A, p_B)).$$
Again we consider that the two firms play a Nash equilibrium in the game of simultaneously setting their prices.
Let me explain how to find a Nash equilibrium in the present context. Given any possible price $p_B$ taken by B, consider A's profit maximization problem
$$\max_{p_A}\; p_A x_A(p_A, p_B) - C_A(x_A(p_A, p_B)).$$
Denote its solution by $BR_A(p_B)$, which forms a best response function as $p_B$ varies.
Likewise, given any possible price $p_A$ taken by A, consider B's profit maximization problem
$$\max_{p_B}\; p_B x_B(p_A, p_B) - C_B(x_B(p_A, p_B)).$$
Denote its solution by $BR_B(p_A)$, which forms a best response function as $p_A$ varies.
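[Figure 22.3: Best response curves $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane. $BR_A$ starts from $\frac{d_A + c_A e_{AA}}{2e_{AA}}$ on the $p_A$-axis, $BR_B$ starts from $\frac{d_B + c_B e_{BB}}{2e_{BB}}$ on the $p_B$-axis, and the two curves intersect at $(p_A^*, p_B^*)$.]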
In a Nash equilibrium $(p_A^*, p_B^*)$ each firm's price-setting choice is a best response to the other's. Thus we have
$$p_A^* = BR_A(p_B^*)$$
$$p_B^* = BR_B(p_A^*)$$
Let us look for the Nash equilibrium in the example of linear demand and linear costs. Demand for each firm's product is $x_A(p_A, p_B) = d_A - e_{AA} p_A + e_{AB} p_B$ for A and $x_B(p_A, p_B) = d_B - e_{BB} p_B + e_{BA} p_A$ for B. A's cost function is $C_A(y_A) = c_A y_A$ and B's cost function is $C_B(y_B) = c_B y_B$.
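First let us find A's best response. Given any possible $p_B$, solve A's profit maximization problem
$$\max_{p_A}\; p_A [d_A - e_{AA} p_A + e_{AB} p_B] - c_A [d_A - e_{AA} p_A + e_{AB} p_B],$$
which yields
$$BR_A(p_B) = \frac{d_A + c_A e_{AA} + e_{AB} p_B}{2e_{AA}}.$$
Likewise, B's best response is
$$BR_B(p_A) = \frac{d_B + c_B e_{BB} + e_{BA} p_A}{2e_{BB}}.$$
Hence the Nash equilibrium $(p_A^*, p_B^*)$ solves
$$p_A^* = \frac{d_A + c_A e_{AA} + e_{AB} p_B^*}{2e_{AA}}, \qquad p_B^* = \frac{d_B + c_B e_{BB} + e_{BA} p_A^*}{2e_{BB}}.$$
Solving these, we obtain
$$p_A^* = \frac{d_A + c_A e_{AA} + \frac{e_{AB}}{2e_{BB}}(d_B + c_B e_{BB})}{2e_{AA} - \frac{e_{AB} e_{BA}}{2e_{BB}}}, \qquad p_B^* = \frac{d_B + c_B e_{BB} + \frac{e_{BA}}{2e_{AA}}(d_A + c_A e_{AA})}{2e_{BB} - \frac{e_{AB} e_{BA}}{2e_{AA}}},$$
which is the intersection of the two best response curves as in Figure 22.3.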
Nash equilibrium in Bertrand competition can be reached by the rationalizability criterion alone. See Figure 22.4.
In this argument let us assume that there is an upper bound for the prices they can set, denoted by $\bar{p}$. This looks like an ad hoc assumption but it is reasonable: here A's best response becomes arbitrarily large as B's price becomes arbitrarily high, and vice versa, but this is rather an artifact of the linear demand function, which is assumed just for simplicity of calculation. The demand for A's product $x_A(p_A, p_B) = d_A - e_{AA} p_A + e_{AB} p_B$ becomes arbitrarily large as $p_B$ becomes arbitrarily high here, but this won't be the case in reality.
Here the $p_A$-intercept of A's best response curve $BR_A$ is $\frac{d_A + c_A e_{AA}}{2e_{AA}}$. That is, any price lower than this cannot be optimal for A no matter what $p_B$ is. Hence A never chooses a price lower than $\frac{d_A + c_A e_{AA}}{2e_{AA}}$.
Let $\tilde{p}_A = BR_A(\bar{p})$. Then any price of A higher than this can never be optimal, since B does not set its price higher than $\bar{p}$. Hence A never chooses a price higher than $\tilde{p}_A$.
Likewise, the $p_B$-intercept of B's best response curve $BR_B$ is $\frac{d_B + c_B e_{BB}}{2e_{BB}}$. That is, any price lower than this cannot be optimal for B no matter what $p_A$ is. Hence B never chooses a price lower than $\frac{d_B + c_B e_{BB}}{2e_{BB}}$.
Let $\tilde{p}_B = BR_B(\bar{p})$. Then any price of B higher than this can never be optimal, since A does not set its price higher than $\bar{p}$. Hence B never chooses a price higher than $\tilde{p}_B$.
Summing up, the set of A's price levels which A has a reason to choose is $\left[\frac{d_A + c_A e_{AA}}{2e_{AA}}, \tilde{p}_A\right]$, and the set of B's price levels which B has a reason to choose is $\left[\frac{d_B + c_B e_{BB}}{2e_{BB}}, \tilde{p}_B\right]$.
By repeating this argument, the set of A's price levels which A has a reason to choose converges to the single point $p_A^*$, and the set of B's price levels which B has a reason to choose converges to the single point $p_B^*$.
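The shrinking of the two price intervals can be traced numerically. The sketch below is only an illustration under assumed parameter values ($d_A = d_B = 20$, $c_A = c_B = 1$, $e_{AA} = e_{BB} = 2$, $e_{AB} = e_{BA} = 1$, upper bound 100); the function names are hypothetical. It applies the linear best responses repeatedly to the end points of the intervals and compares the limit with the closed-form equilibrium prices (written here with a common denominator $4e_{AA}e_{BB} - e_{AB}e_{BA}$).

```python
def best_response(d, c, e_own, e_cross, p_other):
    """Linear best response (d + c*e_own + e_cross*p_other) / (2*e_own)."""
    return (d + c * e_own + e_cross * p_other) / (2 * e_own)


def nash_prices(d_a, c_a, e_aa, e_ab, d_b, c_b, e_bb, e_ba):
    """Intersection of the two linear best response curves."""
    denom = 4 * e_aa * e_bb - e_ab * e_ba
    p_a = (2 * e_bb * (d_a + c_a * e_aa) + e_ab * (d_b + c_b * e_bb)) / denom
    p_b = (2 * e_aa * (d_b + c_b * e_bb) + e_ba * (d_a + c_a * e_aa)) / denom
    return p_a, p_b


def surviving_intervals(d_a, c_a, e_aa, e_ab, d_b, c_b, e_bb, e_ba,
                        p_bar, rounds):
    """Iteratively keep only prices that are best responses to surviving prices."""
    lo_a, hi_a, lo_b, hi_b = 0.0, p_bar, 0.0, p_bar
    for _ in range(rounds):
        # Best responses are increasing, so interval end points map to end points.
        new_a = (best_response(d_a, c_a, e_aa, e_ab, lo_b),
                 min(p_bar, best_response(d_a, c_a, e_aa, e_ab, hi_b)))
        new_b = (best_response(d_b, c_b, e_bb, e_ba, lo_a),
                 min(p_bar, best_response(d_b, c_b, e_bb, e_ba, hi_a)))
        (lo_a, hi_a), (lo_b, hi_b) = new_a, new_b
    return (lo_a, hi_a), (lo_b, hi_b)


# Hypothetical parameters, chosen only for illustration.
params = dict(d_a=20, c_a=1, e_aa=2, e_ab=1, d_b=20, c_b=1, e_bb=2, e_ba=1)
p_a_star, p_b_star = nash_prices(**params)
interval_a, interval_b = surviving_intervals(**params, p_bar=100.0, rounds=40)
print(p_a_star, p_b_star, interval_a, interval_b)
```

After the first round the lower end points are exactly the intercepts of the best response curves, as in the argument above. The intervals shrink to a point here because $e_{AB}e_{BA} < 4e_{AA}e_{BB}$ for the assumed parameters.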
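[Figure 22.4: Best response curves $BR_A$ and $BR_B$ in the $(p_A, p_B)$ plane, showing the intercepts $\frac{d_A + c_A e_{AA}}{2e_{AA}}$ and $\frac{d_B + c_B e_{BB}}{2e_{BB}}$, the upper bound $\bar{p}$ on each axis, the bounds $\tilde{p}_A$ and $\tilde{p}_B$, and the equilibrium prices $p_A^*$ and $p_B^*$.]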
22.4
22.4.1
Let us first assume that products are homogeneous and the two firms have the same and constant marginal cost $c$.
Denote the market demand function by $x(p)$. Denote A's price by $p_A$ and B's price by $p_B$; then the consequences are classified as follows:
1. When $p_A < p_B$, A takes all the demand $x(p_A)$.
2. When $p_A > p_B$, B takes all the demand $x(p_B)$.
3. When $p_A = p_B$, they split the demand by half.
Now suppose A sets its price first and B sets its price after seeing A's price.
Then, if $p_A > c$, B can grab the whole demand by setting a slightly lower price so that $p_A > p_B > c$. However, B can make $p_B$ as close to $p_A$ as it likes, but the profit suddenly drops when $p_B$ is exactly equal to $p_A$. Hence B does not always have an optimal choice in the precise sense.
So like before let us consider that there is a discrete grid for price setting, let's say an integer grid, and assume that the marginal cost $c$ is an integer as well.
Then for all $p_A \ge c + 2$ the optimal choice of B is $p_B = p_A - 1$. For all $p_A \le c - 1$ the optimal choice of B is any $p_B \ge c$. When $p_A = c + 1$ the optimal choice of B is $p_B = c + 1$. When $p_A = c$ the optimal choice of B is any $p_B \ge c$.
Since A earns no positive profit except when $p_A = c + 1$ and $p_B = c + 1$, the backward induction solution is
$$p_A^* = c + 1, \qquad f_B^*(p_A) = \begin{cases} \text{any } p_B \ge c, & \text{when } p_A \le c \\ c + 1, & \text{when } p_A = c + 1 \\ p_A - 1, & \text{when } p_A \ge c + 2 \end{cases}$$
22.4.2
Notice that here B's price $p_B$ has been replaced by $f_B^*(p_A)$. This is a one-variable maximization problem, and A's optimal strategy $p_A^*$ is obtained by solving this.
Plugging A's strategy, the price $p_A^*$, into B's strategy, the function $f_B^*$, we obtain B's price choice $f_B^*(p_A^*)$. Here the sequence of prices $(p_A^*, f_B^*(p_A^*))$ is called
$$\frac{d_B + c_B e_{BB} + e_{BA} p_A}{2e_{BB}}$$
$$\frac{e_{AB} e_{BA}}{2e_{BB}}$$
On the other hand, in the sequential-move case the resulting prices are
$$p_A^* = 17, \qquad p_B^* = f_B^*(p_A^*) = 14.5$$
and their profits are
A's profit = 112.5, B's profit = 156.25,
which shows that the second mover has an advantage (you can show this in a more general manner, of course).
22.5
22.5.1
Suppose there are $n$ firms, and each firm $k = 1, \dots, n$ is given its cost function $C_k$. Denote the inverse demand function by $p(y)$. Then, given a profile of supplied quantities of the $n$ firms denoted by $(y_1, \dots, y_n)$, the resulting market price is
$$p\left(\sum_{j=1}^{n} y_j\right),$$
the price at the total quantity $\sum_{j=1}^{n} y_j$.
Let me verify the above claim with a simpler example. Assume the linear inverse demand function $p(y) = a - by$. Consider that there are $n$ identical firms with a constant marginal cost $c$.
Note that in a perfectly competitive market the equilibrium quantity follows from $a - by = c$, which is $y^{CE} = \frac{a - c}{b}$, and that the competitive equilibrium price of output is $p^{CE} = c$.
Now pick one firm arbitrarily, and denote its quantity by $y$, whereas the sum of the other firms' quantities is denoted by $Y$. Then this firm's profit maximization problem is
$$\max_{y}\; (a - b(y + Y))\,y - cy,$$
which yields
$$y = \frac{a - c - bY}{2b}.$$
Now, consider that all the other firms behave just like this generic firm.
Although it is a symmetric environment in which all the firms have the same
cost structure, it is not necessary that they behave in a symmetric manner, so
the assumption of symmetry of behavior is an additional assumption. This is
called symmetric Nash equilibrium.
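Denote a symmetric Nash equilibrium with $n$ firms by $(y^n, \dots, y^n)$. Since all the other firms consequently provide the same quantity as the above generic firm does, we have $Y = (n - 1)y^n$. Hence it holds
$$y^n = \frac{a - c - bY}{2b} = \frac{a - c - b(n - 1)y^n}{2b},$$
which yields
$$y^n = \frac{a - c}{(n + 1)b}.$$
The total quantity is $ny^n = \frac{n(a - c)}{(n + 1)b}$ and the resulting price is $p(ny^n) = a - \frac{n(a - c)}{n + 1}$. As $n$ tends to infinity we have
$$y^n = \frac{a - c}{(n + 1)b} \to 0,$$
$$ny^n = \frac{a - c}{(1 + 1/n)b} \to \frac{a - c}{b},$$
$$p(ny^n) = a - \frac{a - c}{1 + 1/n} \to c.$$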
Here each individual firm's supply tends to be negligibly small, the total quantity $ny^n$ converges to that in the competitive equilibrium, and the price converges to the competitive equilibrium price level, that is, the marginal cost.
However, the model of quantity-setting competition relies on the existence of an auctioneer. We need to think of price-setting competition à la Bertrand if we want to get a strategic foundation of the perfect competition assumption without relying on an auctioneer.
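The convergence of the symmetric Cournot outcome can be tabulated directly from the formulas above. This is a small sketch with assumed parameter values ($a = 10$, $b = 1$, $c = 1$) and a hypothetical function name; it also checks that $y^n$ is a fixed point of the best response $y = (a - c - bY)/(2b)$ with $Y = (n-1)y^n$.

```python
from fractions import Fraction


def symmetric_cournot(a, b, c, n):
    """Per-firm quantity, total quantity and price with n identical firms."""
    y_n = Fraction(a - c) / ((n + 1) * b)
    total = n * y_n
    price = a - b * total
    return y_n, total, price


# Hypothetical parameters, chosen only for illustration.
a, b, c = 10, 1, 1
for n in (1, 2, 10, 100, 1000):
    y_n, total, price = symmetric_cournot(a, b, c, n)
    # Best response to the other n - 1 firms each producing y_n.
    response = (a - c - b * (n - 1) * y_n) / (2 * b)
    assert response == y_n
    print(n, float(y_n), float(total), float(price))
```

With $n = 1$ the formula gives the monopoly outcome and with $n = 2$ the duopoly outcome; as $n$ grows the gap between price and marginal cost is $(a - c)/(n + 1)$.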
22.5.2
We know that when firms have identical and constant marginal costs the competitive outcome is realized under Bertrand competition with just two firms, which set their prices equal to the marginal cost in Nash equilibrium.
This argument relies heavily on the assumption that marginal costs are constant and identical across firms. However, it is known that each firm has to behave as a price-taker when there are indefinitely many potential firms with decreasing returns to scale and when the freedom of entry is guaranteed. For, if existing firms are earning profits with their prices being higher than the marginal cost, there are arbitrarily many firms which can produce the good at a cheaper cost and can set lower prices. Thus if the freedom of entry is guaranteed, even the existing firms have to lower their prices to the marginal cost.
In the sense that it does not need to assume the existence of an auctioneer, this argument may be more convincing than the convergence argument for Cournot competition.
22.6
Collusion
22.6.1
Let us now consider the case in which the firms form a cartel and manipulate quantities or prices cooperatively, so that they maximize their joint profit and share it nicely.
To illustrate, assume linear inverse demand $p(y) = a - by$ and also assume that the two firms have constant and identical marginal cost given by $c$. If they have different marginal costs, let the more efficient firm produce everything and let the less efficient firm do nothing and receive a reward for doing nothing.
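Suppose they set quantities non-cooperatively in Cournot competition. The Nash equilibrium quantities are
$$y_A^* = \frac{a - c}{3b}, \qquad y_B^* = \frac{a - c}{3b},$$
the resulting price is
$$p(y_A^* + y_B^*) = \frac{a + 2c}{3},$$
and the profits are
$$\text{A's profit} = \frac{(a - c)^2}{9b}, \qquad \text{B's profit} = \frac{(a - c)^2}{9b}.$$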
On the other hand, what if they cooperatively manipulate the total quantity $y = y_A + y_B$ in order to maximize the joint profit
$$(a - by)\,y - cy,$$
where they behave together as a monopolist? Here the joint monopolist's profit maximization yields
$$\hat{y} = \frac{a - c}{2b}.$$
This is smaller than the total quantity in the Cournot competition, and results in a higher price
$$p(\hat{y}) = \frac{a + c}{2},$$
which yields the joint profit
$$\frac{(a - c)^2}{4b}.$$
This is greater than the sum of profits in the Cournot competition.
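So we can think of the following cartel arrangement.
1. Each of A and B produces half of $\hat{y}$, that is, $\frac{a - c}{4b}$.
Under this arrangement each firm's profit rises from the Cournot level $\frac{(a - c)^2}{9b}$ to $\frac{(a - c)^2}{8b}$, at the price $\frac{a + c}{2}$.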
22.6.2
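By forming a cartel the two firms can earn higher profits. But do they keep the promise?
Suppose B keeps the promise and produces $\hat{y}/2$ while A produces the Cournot quantity $y_A^*$. Then the resulting price is
$$p\left(y_A^* + \frac{\hat{y}}{2}\right) = \frac{5a + 7c}{12},$$
and the profits are
$$\text{A's profit} = \frac{5(a - c)^2}{36b}, \qquad \text{B's profit} = \frac{5(a - c)^2}{48b}.$$
The payoffs are summarized in the following table, in which rows are A's quantities, columns are B's quantities, and each cell lists A's profit followed by B's profit.

| A \ B | $y_B^*$ | $\hat{y}/2$ |
|---|---|---|
| $y_A^*$ | $\frac{(a-c)^2}{9b}$, $\frac{(a-c)^2}{9b}$ | $\frac{5(a-c)^2}{36b}$, $\frac{5(a-c)^2}{48b}$ |
| $\hat{y}/2$ | $\frac{5(a-c)^2}{48b}$, $\frac{5(a-c)^2}{36b}$ | $\frac{(a-c)^2}{8b}$, $\frac{(a-c)^2}{8b}$ |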
From the above table we see that it is always better for A to provide the Cournot quantity, and similarly for B. Thus the upper-left cell will occur. However, this outcome is unanimously worse for A and B than the cartel outcome. It's a prisoner's dilemma situation.
This remains the case in the Bertrand price competition as well. Even when they promise to set the monopoly price together, it is better for each one to break the promise and set a lower price.
However, when the competition is repeated indefinitely many times, as we saw in the section on repeated games, the cartel can be sustained when the firms are sufficiently patient, which means in this context that the interest rate is sufficiently low.
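The prisoner's dilemma structure of the cartel problem can be confirmed by computing the four cells of the payoff table. The sketch below uses assumed parameter values ($a = 10$, $b = 1$, $c = 1$) and hypothetical names; it restricts each firm to the two quantities discussed above, the Cournot quantity $(a-c)/(3b)$ and the cartel quantity $(a-c)/(4b)$.

```python
from fractions import Fraction


def profits(y_a, y_b, a, b, c):
    """Profits of A and B under p(y) = a - b*y and common marginal cost c."""
    price = a - b * (y_a + y_b)
    return (price - c) * y_a, (price - c) * y_b


# Hypothetical parameters, chosen only for illustration.
a, b, c = Fraction(10), Fraction(1), Fraction(1)
cournot = (a - c) / (3 * b)
cartel = (a - c) / (4 * b)

# table[(A's quantity, B's quantity)] = (A's profit, B's profit)
table = {(y_a, y_b): profits(y_a, y_b, a, b, c)
         for y_a in (cournot, cartel)
         for y_b in (cournot, cartel)}

# Within this 2x2 table the Cournot quantity is better for A whatever B does ...
a_prefers_cournot = all(table[(cournot, y_b)][0] > table[(cartel, y_b)][0]
                        for y_b in (cournot, cartel))
# ... yet both firms are better off in the cartel cell than in the Cournot cell.
cartel_is_better = all(table[(cartel, cartel)][k] > table[(cournot, cournot)][k]
                       for k in (0, 1))
print(table, a_prefers_cournot, cartel_is_better)
```

The comparison is limited to the two quantities in the table; it does not compute a firm's best possible deviation from the cartel quantity.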
22.7
Exercises
Exercise 30 Market inverse demand is given by $p(y) = 100 - 2y$. There are two firms A and B, which have the same and constant marginal cost 4.
(i) Find the Nash equilibrium in the simultaneous quantity setting competition.
(ii) Find the subgame-perfect Nash equilibrium in the sequential quantity setting competition, in which A moves first and B moves second.
Exercise 31 Consider the case with product differentiation, in which demand functions for A's and B's products are given by $x_A(p_A, p_B) = 50 - 2p_A + p_B$ and $x_B(p_A, p_B) = 50 - 2p_B + p_A$, respectively. Assume that the two firms have the same and constant marginal cost. (i) Find Nash equilibrium in the simultaneous
Part IV
Chapter 23
23.1
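In the tables below rows are A's actions, columns are B's actions, and each cell lists A's payoff followed by B's payoff.

B = Strong

| A \ B | E | N |
|---|---|---|
| E | -2, 5 | 10, 0 |
| N | 0, 10 | 0, 0 |

B = Weak

| A \ B | E | N |
|---|---|---|
| E | 5, -2 | 10, 0 |
| N | 0, 10 | 0, 0 |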
Since B knows his type he can condition his action on his type. Thus B's strategy takes the form, for example, "enter if he is strong and do not enter if he is weak." The game extended in such a way is called a Bayesian game, and Nash equilibrium in a Bayesian game is called Bayesian Nash equilibrium.
Since A cannot know B's type when he chooses his action, he cannot condition his action on B's type. Thus A just chooses between entry and non-entry in an unconditional way.
Now let me denote B's strategy, for example, "enter if he is strong and do not enter if he is weak" by EN in short, and similarly for the other ones. Then the payoffs in the Bayesian game are given by ex-ante expected values of payoffs in the original games. Hence the payoff matrix in the Bayesian game is
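
| A \ B | EE | EN | NE | NN |
|---|---|---|---|---|
| E | 1.5, 1.5 | 4, 2.5 | 7.5, -1 | 10, 0 |
| N | 0, 10 | 0, 5 | 0, 5 | 0, 0 |

These numbers correspond to B being strong with probability 1/2. More generally, when B is strong with probability $p$, the payoff matrix in the Bayesian game is

| A \ B | EE | EN | NE | NN |
|---|---|---|---|---|
| E | 5 - 7p, -2 + 7p | 10 - 12p, 5p | 5 + 5p, -2 + 2p | 10, 0 |
| N | 0, 10 | 0, 10p | 0, 10 - 10p | 0, 0 |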
Notice that EN strictly dominates NN since $5p > 0$ and $10p > 0$. Also, EE strictly dominates NE since $-2 + 7p > -2 + 2p$ and $10 > 10 - 10p$. Thus by eliminating strictly dominated strategies we obtain

| A \ B | EE | EN |
|---|---|---|
| E | 5 - 7p, -2 + 7p | 10 - 12p, 5p |
| N | 0, 10 | 0, 10p |
Since it holds $-2 + 7p < 5p$ and $10 > 10p$, if there exists a pure-strategy Nash equilibrium it must be either (E, EN) or (N, EE). When $10 - 12p \ge 0$, (E, EN) can be an equilibrium, and when $5 - 7p \le 0$, (N, EE) can be an equilibrium.
Summing up, we have
When $0 < p < 5/7$, (E, EN) is the unique equilibrium.
When $5/7 \le p \le 5/6$, both (E, EN) and (N, EE) are equilibria.
When $5/6 < p < 1$, (N, EE) is the unique equilibrium.
Here (E, EN) refers to an equilibrium in which B enters only when he is strong and A enters, in which B is "timid." This case can be an equilibrium when B's probability of being strong is sufficiently low, that is, when $p \le 5/6$.
On the other hand, (N, EE) refers to an equilibrium in which B enters no matter what his type is and A does not enter, in which B is "bluffing." This case can be an equilibrium when B's probability of being strong is sufficiently high, that is, when $p \ge 5/7$.
In the intersection of the two cases, $5/7 \le p \le 5/6$, there are two Bayesian Nash equilibria, one in which B is timid and the other in which B is bluffing. Thus we run into the multiple equilibria problem again.
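The three ranges of $p$ can be checked by brute force. The following sketch (function names are hypothetical) encodes the ex-ante payoff matrix of the entry game as a function of $p$, using the entries stated above, and lists all pure-strategy profiles from which neither player gains by deviating.

```python
from fractions import Fraction

B_STRATEGIES = ("EE", "EN", "NE", "NN")  # action if strong, action if weak


def entry_game_matrix(p):
    """Ex-ante payoffs (A, B) when B is strong with probability p."""
    return {
        ("E", "EE"): (5 - 7 * p, -2 + 7 * p),
        ("E", "EN"): (10 - 12 * p, 5 * p),
        ("E", "NE"): (5 + 5 * p, -2 + 2 * p),
        ("E", "NN"): (10, 0),
        ("N", "EE"): (0, 10),
        ("N", "EN"): (0, 10 * p),
        ("N", "NE"): (0, 10 - 10 * p),
        ("N", "NN"): (0, 0),
    }


def pure_equilibria(p):
    """All profiles from which neither A nor B gains by deviating."""
    m = entry_game_matrix(p)
    found = []
    for (a_act, b_strat), (u_a, u_b) in m.items():
        a_ok = all(m[(x, b_strat)][0] <= u_a for x in ("E", "N"))
        b_ok = all(m[(a_act, s)][1] <= u_b for s in B_STRATEGIES)
        if a_ok and b_ok:
            found.append((a_act, b_strat))
    return found


for p in (Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)):
    print(p, pure_equilibria(p))
```

The three sample probabilities are picked to fall in the three ranges; exact fractions are used so that the boundary cases $p = 5/7$ and $p = 5/6$ can also be tested without rounding issues.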
Example 23.2 Now suppose this time that both A and B have possibilities of being strong or weak. Assume for simplicity that the types of A and B are independently and identically distributed (I.I.D.), where the probability of being strong is 2/3.
The payoff matrix for each combination of types of A and B is given by
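(rows are A's actions, columns are B's actions, and each cell lists A's payoff followed by B's payoff)

A = S, B = S

| A \ B | E | N |
|---|---|---|
| E | -2, -2 | 10, 0 |
| N | 0, 10 | 0, 0 |

A = W, B = S

| A \ B | E | N |
|---|---|---|
| E | -8, 5 | 10, 0 |
| N | 0, 10 | 0, 0 |

A = S, B = W

| A \ B | E | N |
|---|---|---|
| E | 5, -8 | 10, 0 |
| N | 0, 10 | 0, 0 |

A = W, B = W

| A \ B | E | N |
|---|---|---|
| E | -5, -5 | 10, 0 |
| N | 0, 10 | 0, 0 |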
Now each of A and B can condition his action on his type (but not on the opponent's one, as before), and the payoff matrix of the Bayesian game is

| A \ B | EE | EN | NE | NN |
|---|---|---|---|---|
| EE | -19/9, -19/9 | 2/3, 2/9 | 65/9, -7/3 | 10, 0 |
| EN | 2/9, 2/3 | 4/3, 4/3 | 50/9, -2/3 | 20/3, 0 |
| NE | -7/3, 65/9 | -2/3, 50/9 | 5/3, 5/3 | 10/3, 0 |
| NN | 0, 10 | 0, 20/3 | 0, 10/3 | 0, 0 |
On the other hand, when the probability of being strong is 1/3 the payoff matrix for the Bayesian game is

| A \ B | EE | EN | NE | NN |
|---|---|---|---|---|
| EE | -28/9, -28/9 | 14/3, 8/9 | 20/9, -4 | 10, 0 |
| EN | 8/9, 14/3 | 2, 2 | 20/9, 8/3 | 10/3, 0 |
| NE | -4, 20/9 | 8/3, 20/9 | 0, 0 | 20/3, 0 |
| NN | 0, 10 | 0, 10/3 | 0, 20/3 | 0, 0 |

The pure-strategy Bayesian Nash equilibria are (EE, EN) and (EN, EE).
Here in each equilibrium one is bluffing and the other is timid. There emerges the multiple equilibrium problem again.
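The ex-ante payoff matrices of Example 23.2 can be recomputed from the four type-contingent payoff tables. The sketch below is one way to do so, with hypothetical names; it assumes, as in the example, that types are independent with a common probability $q$ of being strong, weights each type pair by its probability, and then lists the pure-strategy equilibria.

```python
from fractions import Fraction
from itertools import product

STRATEGIES = ("EE", "EN", "NE", "NN")  # action if strong, action if weak
TYPES = ("S", "W")

# Payoffs (A, B) when both enter, keyed by (A's type, B's type).
BOTH_ENTER = {("S", "S"): (-2, -2), ("W", "S"): (-8, 5),
              ("S", "W"): (5, -8), ("W", "W"): (-5, -5)}


def stage_payoff(type_a, type_b, act_a, act_b):
    """Payoffs (A, B) in the entry game for given types and actions."""
    if act_a == "E" and act_b == "E":
        return BOTH_ENTER[(type_a, type_b)]
    if act_a == "E":
        return (10, 0)
    if act_b == "E":
        return (0, 10)
    return (0, 0)


def bayesian_matrix(q):
    """Ex-ante expected payoffs when each player is strong with probability q."""
    prob = {"S": q, "W": 1 - q}
    matrix = {}
    for s_a, s_b in product(STRATEGIES, repeat=2):
        u_a = u_b = Fraction(0)
        for t_a, t_b in product(TYPES, repeat=2):
            act_a = s_a[TYPES.index(t_a)]
            act_b = s_b[TYPES.index(t_b)]
            pay_a, pay_b = stage_payoff(t_a, t_b, act_a, act_b)
            weight = prob[t_a] * prob[t_b]
            u_a += weight * pay_a
            u_b += weight * pay_b
        matrix[(s_a, s_b)] = (u_a, u_b)
    return matrix


def pure_equilibria(matrix):
    """Profiles from which neither player gains by changing his Bayesian strategy."""
    found = []
    for (s_a, s_b), (u_a, u_b) in matrix.items():
        a_ok = all(matrix[(x, s_b)][0] <= u_a for x in STRATEGIES)
        b_ok = all(matrix[(s_a, y)][1] <= u_b for y in STRATEGIES)
        if a_ok and b_ok:
            found.append((s_a, s_b))
    return found


for q in (Fraction(2, 3), Fraction(1, 3)):
    print(q, pure_equilibria(bayesian_matrix(q)))
```

Under these payoffs the enumeration returns the two asymmetric profiles for $q = 1/3$, while for $q = 2/3$ it returns the single symmetric profile (EN, EN), in which each player enters only when strong.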
Now let me give you the formal definitions of games with incomplete information, corresponding Bayesian games and Bayesian Nash equilibria. A game with incomplete information consists of a set of players, sets of types, a common prior distribution, sets of strategies and payoff functions. The set of players $I = \{1, \dots, n\}$ is a finite set. The set of types of each player $i = 1, \dots, n$ is denoted by $\Theta_i$. The set of all the players' type profiles is given by $\Theta = \prod_{i=1}^{n} \Theta_i$, where its element is generically denoted by $\theta = (\theta_1, \dots, \theta_n)$.
In the last example, the sets of types of A and B are given by $\Theta_A = \{S, W\}$ and $\Theta_B = \{S, W\}$, respectively.
Let $p$ denote an ex-ante probability distribution over $\Theta$, which is commonly held by all the players as their probabilistic belief about $\theta$. This is called the common prior assumption. This says
From the ex-ante viewpoint all the players share the same belief, and any difference between their beliefs in the meantime arises only from the difference in the information they receive.
In other words, this excludes a "stubborn" attitude such that one believes what he believes no matter what information he receives and no matter what others believe. I will explain later why we need to make such an assumption.
Let $S_i$ denote the set of strategies of player $i = 1, \dots, n$, where its generic element is denoted by $s_i \in S_i$. Let $S = \prod_{i=1}^{n} S_i$ denote the set of strategy profiles, where its generic element is denoted by $s = (s_1, \dots, s_n)$.
Let $v_i(\theta, s)$ denote player $i$'s payoff realized when the type profile is $\theta$ and the strategy profile is $s$. That is, player $i$'s payoff function is given by $v_i : \Theta \times S \to \mathbb{R}$.
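The Bayesian game corresponding to a given game with incomplete information consists of the set of players, the sets of Bayesian strategies and ex-ante expected payoff functions. The set of players is $I$ just as before. A Bayesian strategy of each player is a function from the set of his types to the set of his strategies. Let $\Sigma_i$ denote the set of all functions from $\Theta_i$ to $S_i$; then it is the set of $i$'s Bayesian strategies. Let $\sigma_i$ denote its generic element; then he chooses strategy $\sigma_i(\theta_i)$ if his type is $\theta_i$. Let $\Sigma = \prod_{i=1}^{n} \Sigma_i$ denote the set of Bayesian strategy profiles, where its generic element is denoted by $\sigma = (\sigma_1, \dots, \sigma_n)$. Also, let $\sigma(\theta) = (\sigma_1(\theta_1), \dots, \sigma_n(\theta_n))$ denote the strategy profile induced by a given Bayesian strategy profile $\sigma$ when the realized type profile is $\theta$. It is also denoted in the form $\sigma(\theta) = (\sigma_i(\theta_i), \sigma_{-i}(\theta_{-i}))$ for any given $i$.
Given a Bayesian strategy profile $\sigma = (\sigma_1, \dots, \sigma_n)$, player $i$'s ex-ante expected payoff is given by
$$\sum_{\theta \in \Theta} v_i(\theta, \sigma(\theta))\,p(\theta).$$
A Bayesian strategy profile $\hat{\sigma}$ is a Bayesian Nash equilibrium if for all $i$ and $\sigma_i \in \Sigma_i$ it holds (see footnote 1)
$$\sum_{\theta \in \Theta} v_i(\theta, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta) \ge \sum_{\theta \in \Theta} v_i(\theta, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta).$$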
The next proposition says that Bayesian Nash equilibrium defined above, in which every player is optimizing his Bayesian strategy from the ex-ante viewpoint, is equivalent to an interim situation in which each player optimizes his action given that he knows his type only.
For example, it states the equivalence between the two conditions,
"A enters if A is strong and does not enter if A is weak" is optimal for A from the ex-ante viewpoint.
It is optimal for A to enter once A knows he is strong, and it is optimal for A not to enter once A knows he is weak.
This allows us to find Bayesian Nash equilibrium in either of two ways, one based on ex-ante optimality of Bayesian strategies and the other based on interim optimality of them. In auction games to be covered later the second way is actually easier (see footnote 2).
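1 When types are continuous quantities such as willingness to pay, the common prior distribution is a probability measure or probability distribution which is mathematically suitably defined, and the definition of equilibrium is that for all $i$ and $\sigma_i \in \Sigma_i$ it holds
$$\int v_i(\theta, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,dp(\theta) \ge \int v_i(\theta, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,dp(\theta).$$
2 When types are continuous quantities the equivalent condition is stated as: for all $i$, $\theta_i \in \Theta_i$ and $s_i \in S_i$ it holds
$$\int_{\Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,dp(\theta_{-i}|\theta_i) \ge \int_{\Theta_{-i}} v_i(\theta_i, \theta_{-i}, s_i, \hat{\sigma}_{-i}(\theta_{-i}))\,dp(\theta_{-i}|\theta_i).$$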
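Proposition. A Bayesian strategy profile $\hat{\sigma}$ is a Bayesian Nash equilibrium if and only if for all $i$, $\theta_i \in \Theta_i$ and $s_i \in S_i$ it holds
$$\sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i) \ge \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, s_i, \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i),$$
where every type $\theta_i$ is taken to have positive prior probability $p(\theta_i)$.
Proof. Suppose that $\hat{\sigma}$ is a Bayesian Nash equilibrium but for some $i$, $\theta_i \in \Theta_i$ and $s_i \in S_i$ it holds
$$\sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, s_i, \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i) > \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i).$$
Let $\sigma_i$ be the Bayesian strategy which chooses $s_i$ at this type $\theta_i$ and coincides with $\hat{\sigma}_i$ at all the other types. Then
$$\sum_{\theta \in \Theta} v_i(\theta_i, \theta_{-i}, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta)$$
$$= \sum_{\theta_i \in \Theta_i} \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i)\,p(\theta_i)$$
$$> \sum_{\theta_i \in \Theta_i} \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i)\,p(\theta_i)$$
$$= \sum_{\theta \in \Theta} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta),$$
which contradicts that $\hat{\sigma}$ is a Bayesian Nash equilibrium.
Conversely, suppose that for all $i$, $\theta_i \in \Theta_i$ and $s_i \in S_i$ it holds
$$\sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i) \ge \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, s_i, \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i).$$
Pick any $\sigma_i \in \Sigma_i$. By taking $s_i = \sigma_i(\theta_i)$ for each $\theta_i$ we have
$$\sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i) \ge \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i).$$
By multiplying both sides by $p(\theta_i)$ and summing over $\theta_i$, it holds
$$\sum_{\theta \in \Theta} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta)$$
$$= \sum_{\theta_i \in \Theta_i} \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \hat{\sigma}_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i)\,p(\theta_i)$$
$$\ge \sum_{\theta_i \in \Theta_i} \sum_{\theta_{-i} \in \Theta_{-i}} v_i(\theta_i, \theta_{-i}, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta_{-i}|\theta_i)\,p(\theta_i)$$
$$= \sum_{\theta \in \Theta} v_i(\theta_i, \theta_{-i}, \sigma_i(\theta_i), \hat{\sigma}_{-i}(\theta_{-i}))\,p(\theta).$$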
23.2
The assumption that all the players share an identical belief over the set of type profiles sounds absurd. However, we cannot just drop it.
Suppose for example that A and B have different beliefs. Now, if A is "rational" in the sense that he is not stubborn, he must think about what B's belief is like, and as a result A must have a belief about B's belief. Likewise, if B is "rational" in the sense that he is not stubborn, he must think about what A's belief is like, and as a result B must have a belief about A's belief.
Suppose that these "second-order" beliefs held by A and B respectively are different. Then, if A is "rational" in the sense that he is not stubborn, he must think about what B's second-order belief is like, and as a result A must have a "third-order" belief about B's second-order belief. Likewise, if B is "rational" in the sense that he is not stubborn, he must think about what A's second-order belief is like, and as a result B must have a "third-order" belief about A's second-order belief.
Suppose that these "third-order" beliefs held by A and B respectively are different. Then, if A is "rational" in the sense that he is not stubborn, he must think about what B's third-order belief is like, and as a result A must have a "fourth-order" belief about B's third-order belief. Likewise, if B is "rational" in the sense that he is not stubborn, he must think about what A's third-order belief is like, and as a result B must have a "fourth-order" belief about A's third-order belief. And so on.
If you want to drop the common prior assumption and do not want the model to fall into an ad hoc one, you have to consider a set of types which contains all the above-mentioned infinite stairways as its elements. Such a set is called the universal type space.
23.3
Exercises
C
N
C
10, 10
12, 1
N
1, 12
0, 0
C
N
C
10, 10
0, 5
B
N
5, 0
0, 0
Write down the Bayesian game and find Bayesian Nash equilibrium assuming
0 < p < 1.
Exercise 33 Consider the game in which A is a potential buyer and B is a potential seller of a stock. With probability $p$ the stock is a good one and will go up by 10, and with probability $1 - p$ it is a bad one and will go down by 10.
Buy
Not to Buy
Good
Sell Not to Sell
10, 0
0, 10
0, 10
0, 10
Buy
Not to Buy
Bad
Sell
Not to Sell
10, 0
0, 10
0, 10
0, 10
Write down the corresponding Bayesian game and find all the Bayesian Nash
equilibria assuming that 0 < p < 1.
Chapter 24
Auction
I focus on the case in which a single item is being sold and buyers bid. As in procurement auctions, we can also consider the case in which the sellers bid, but it can be treated by flipping the direction in the arguments below.
If the seller knows the buyers' willingness to pay, he simply points to the buyer with the highest willingness to pay and charges him a price equal to his willingness to pay. However, in general the seller does not know the buyers' willingness to pay. How can the seller make the buyers reveal their willingness to pay?
Let them bid. If the bidding mechanism is a nice one the bids will nicely reveal their willingness to pay. But what auction format is better?
24.1
24.2
In general, not only does the seller not know the bidders' willingness to pay, but also each bidder does not know how much the other bidders are willing to pay. Also, it is often the case that an individual bidder does not know even his own willingness to pay, since the value of the item is uncertain to him.
To understand, consider that the auction is conducted in the following timeline.
1. Bidders are collected. At this point no bidder knows even his own valuation, and they only know the prior probability distribution of values. This is called the ex-ante stage.
2. The bidders go to the viewing event or go to do research, and receive certain information (called a signal in this literature). However, each bidder does not know what signals the other bidders have received. This is called the interim stage.
3. The bidders bid. After the winner receives the item its actual value is realized. This is called the ex-post stage.
The natures of values are classified as follows.
1. Independent private values: It's just like what's said by the proverb "there is no accounting for taste." Each bidder knows his valuation of the item at the interim stage, since it is just a matter of his own taste. Also, the prior distributions of individual bidders' values are independent at the ex-ante stage, that is, how likely my value is to be higher or lower is not correlated with how likely your value is to be higher or lower.
2. Common value: When the value of a stock, for example, becomes 100 dollars in the future, it is 100 dollars for everybody. Here the difference in bidders' valuations can come only from the difference in the information they receive at the interim stage. Another example is bidding for an oil field. Here all bidders' interest is solely in how much oil it contains, and the difference in their valuations can come only from the difference in the information received at the interim stage about how much oil is contained. Here each bidder does not necessarily know the true value, and may learn about it from others' valuations.
I focus on the case of independent private values, since the common value case is actually harder and beyond the level of this book.
Under the assumption of independent private values, the second-price auction and the English auction are equivalent in the sense that the second highest bid in the second-price auction is equal to the price at which the last bidder but one gives up in the English auction. Also, under expected utility theory the first-price auction and the Dutch auction are known to be equivalent in the sense that the highest bid in the first-price auction is equal to the price at which the winning bidder announces to buy in the Dutch auction. Thus we focus on the first-price auction and the second-price auction.
24.3
Preferences
Before getting into the details of auction formats let me first describe bidders' preferences. Since winning and losing are uncertain in general, and also payment or any income transfer is uncertain as well, we need to consider bidders' preferences over risky prospects.
Consider that there are $n$ bidders. We assume that each bidder's preference is quasi-linear in income and each one is risk-neutral. The quasi-linearity assumption means that there is no income effect on the item to be sold. That is, we assume both no income effect and risk neutrality.
Let $v_i$ denote $i$'s willingness to pay. Let $p_i = ((x_{i1}, a_{i1}); p_{i1}, (x_{i2}, a_{i2}); p_{i2}, \dots, (x_{im}, a_{im}); p_{im})$ denote a lottery over the pairs of the item and income transfer, which says $i$ receives $x_{i1}$ units of the item (which is either 0 or 1) and $a_{i1}$ units of income (payment if it is negative) with probability $p_{i1}$, $x_{i2}$ units of the item and $a_{i2}$ units of income with probability $p_{i2}$, and so on, and $x_{im}$ units of the item and $a_{im}$ units of income with probability $p_{im}$. Then his preference over such lotteries is assumed to be represented in the form
$$u_i(p_i) = \sum_{k=1}^{m} (v_i x_{ik} + a_{ik})\,p_{ik}.$$
24.4
First-price auction
The first-price auction is probably one of the most popular auction formats. It may appear that the seller can expect more revenue in the first-price auction since the winner pays the highest bid. But this is not immediate.
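[Figure 24.1: Bidder $i$'s net gain plotted against his bid $b_i$, with $\max_{j \ne i} b_j$ and $v_i$ marked on the horizontal axis and $v_i - \max_{j \ne i} b_j$ marked on the vertical axis.]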
24.4.1
Let me start with the case of complete information, in the sense that bidders know how strong or weak each other bidder is, while the seller still doesn't know the bidders' willingness to pay.
Problem of dealing with complete information and continuous bids
It is hard to deal with the case of complete information in the setting of continuous bids. See Figure 24.1. Pick bidder $i$ and let $\max_{j \ne i} b_j$ denote the highest bid of the others. Suppose $i$'s value $v_i$ is greater than this, so that he should win, but how much should he bid?
In order to win, his bid $b_i$ must be greater than $\max_{j \ne i} b_j$, and his net gain $v_i - b_i$ is larger as his bid is closer to it. However, once his bid $b_i$ is exactly equal to $\max_{j \ne i} b_j$ he loses the sure win and has to obey some tie-breaking rule, so that his expected utility suddenly drops down to somewhere between 0 and $v_i - \max_{j \ne i} b_j$. Also, when his bid $b_i$ is below the highest bid of the others $\max_{j \ne i} b_j$, he gets nothing. Thus his maximization problem has no solution.
There are two ways to think of this problem. One is to assume that bids are discrete, so that you don't have any such discontinuity problem. I do this just below. The other is to introduce random noise which smoothes out the discontinuity, which is the case of incomplete information I cover in the next subsection.
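Now consider the discrete-bid case. To illustrate, assume that there are two bidders and five possible bids, 10, 20, 30, 40 and 50. Also, assume let's say that A's willingness to pay is 40 and B's willingness to pay is 20. Ties are broken by a coin flip. In the payoff matrix below rows are A's bids, columns are B's bids, and each cell lists A's expected net gain followed by B's.

| A \ B | 10 | 20 | 30 | 40 | 50 |
|---|---|---|---|---|---|
| 10 | 30/2, 10/2 | 0, 0 | 0, -10 | 0, -20 | 0, -30 |
| 20 | 20, 0 | 20/2, 0/2 | 0, -10 | 0, -20 | 0, -30 |
| 30 | 10, 0 | 10, 0 | 10/2, -10/2 | 0, -20 | 0, -30 |
| 40 | 0, 0 | 0, 0 | 0, 0 | 0/2, -20/2 | 0, -30 |
| 50 | -10, 0 | -10, 0 | -10, 0 | -10, 0 | -10/2, -30/2 |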
Let me give some explanations of where the above numbers come from. Consider for example the case where A bids 20 and B bids 10. Then A wins and pays 20. Since A's willingness to pay is 40 his net gain is $40 - 20 = 20$. B loses; on the other hand he receives or pays nothing, hence his net gain is zero. Consider for another example the case where both bid 30. Since the tie is broken by a coin flip and both bidders are assumed to be risk-neutral, A's expected utility is $\frac{1}{2}(40 - 30) + \frac{1}{2} \cdot 0 = \frac{10}{2}$ and B's expected utility is $\frac{1}{2}(20 - 30) + \frac{1}{2} \cdot 0 = -\frac{10}{2}$.
There are three pure-strategy Nash equilibria, (20, 10), (20, 20) and (30, 20). Notice that whichever equilibrium is played A is bidding strictly lower than his willingness to pay. Here A is the stronger bidder and he is aware of it, so he tries to lower his payment by bidding lower, as far as it does not risk his winning too much.
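The three equilibria of the discrete first-price auction can be found by enumerating all bid pairs. The sketch below uses the values from the example (A's willingness to pay 40, B's 20, bids 10 to 50, coin-flip tie-breaking); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BIDS = (10, 20, 30, 40, 50)


def first_price_payoffs(bid_a, bid_b, value_a=40, value_b=20):
    """Expected net gains (A, B): the winner pays his own bid, ties by coin flip."""
    if bid_a > bid_b:
        return (Fraction(value_a - bid_a), Fraction(0))
    if bid_a < bid_b:
        return (Fraction(0), Fraction(value_b - bid_b))
    return (Fraction(value_a - bid_a, 2), Fraction(value_b - bid_b, 2))


def pure_nash(payoff):
    """Bid pairs from which neither bidder gains by changing his own bid."""
    found = []
    for bid_a, bid_b in product(BIDS, repeat=2):
        u_a, u_b = payoff(bid_a, bid_b)
        a_ok = all(payoff(x, bid_b)[0] <= u_a for x in BIDS)
        b_ok = all(payoff(bid_a, y)[1] <= u_b for y in BIDS)
        if a_ok and b_ok:
            found.append((bid_a, bid_b))
    return found


print(pure_nash(first_price_payoffs))
```

Because `pure_nash` takes the payoff rule as an argument, the same enumeration can be reused for other auction formats on the same bid grid by passing a different payoff function.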
24.4.2
Let us now consider the case of incomplete information, in which each bidder knows only his own value and does not know others' values, which he knows only in the form of a prior probability distribution.
Now consider that there are $n$ bidders who have preferences quasi-linear in income and are risk-neutral. Let $v_i$ denote $i$'s willingness to pay.
Assume that the bidders' valuations are distributed independently and each one is drawn from an identical distribution, which is called IID (independently and identically distributed) in the context of probability and statistics. Let $F$ denote the cumulative distribution function for the identical distribution and $f$ denote its density function. Note that given $F$ and a fixed number $v$ the probability that any drawn number is less than or equal to $v$ is $F(v)$. Note also that $F$ and $f$ have the relationship $F'(v) = f(v)$.
Denote bidder $i$'s value by $v_i$. Denote the profile of bids other than $i$'s by $b_{-i} = (b_1, \dots, b_{i-1}, b_{i+1}, \dots, b_n)$; then the entire profile of bids is denoted by $(b_i, b_{-i})$.
Then bidder $i$'s net gain is
$$U_i(b_i, b_{-i}) = \begin{cases} v_i - b_i, & \text{if } b_i > \max_{j \ne i} b_j \\ 0, & \text{if } b_i < \max_{j \ne i} b_j \end{cases}$$
We ignore the case of ties since its probability is zero in the setting of continuous bids.
Here we consider a situation such that
all the bidders know how each bidder conditions his bid on his value while nobody knows any other bidder's realized value, and the way in which one conditions his bid on his value is identical across bidders.
This is called symmetric Bayesian Nash equilibrium. Since the concept of Bayesian Nash equilibrium itself does not imply that the bidding behaviors are symmetric even when the distributions of values are symmetric, we are imposing the symmetry of bidding behavior as an additional assumption. Denote the common function (called the bidding function) by $\beta : [0, 1] \to \mathbb{R}$, where $\beta(v)$ denotes one's bid when his value is $v$.
Suppose now that all the bidders except i with his value vi are following the
bidding function . Then bidder is expected utility of bidding bi is
(vi bi ) Probability of winning
since if he wins his net gain is his value minus his pay and if he loses he gets
nothing and pays nothing. As the probability of winning is the probability that
all the other bidders bids are lower than bi , it is P rob(maxj=i (vj ) < bi ).
Since it is equal to the probability that all the other bidders values are lower
than 1 (bi ), the value which yields bid bi under the bidding function , it is
F ( 1 (bi ))n1 . Thus is expected utility is given by
(vi bi )F ( 1 (bi ))n1
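Take the first-order condition for maximization with respect to $b_i$; then we have

$$-F(\beta^{-1}(b_i))^{n-1} + (v_i - b_i)(n-1) F(\beta^{-1}(b_i))^{n-2} f(\beta^{-1}(b_i)) \frac{1}{\beta'(\beta^{-1}(b_i))} = 0.$$

In the symmetric equilibrium bidder $i$ follows $\beta$ as well, so $b_i = \beta(v_i)$ and
$\beta^{-1}(b_i) = v_i$. The condition then becomes
$\frac{d}{dv_i}\left[\beta(v_i) F(v_i)^{n-1}\right] = (n-1) v_i F(v_i)^{n-2} f(v_i)$,
which leads to

$$\beta(v_i) = \frac{\int_0^{v_i} (n-1)\, v\, F(v)^{n-2} f(v)\, dv}{F(v_i)^{n-1}} = E\left[\max_{j \neq i} v_j \,\Big|\, \max_{j \neq i} v_j < v_i\right]$$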
Here the right-hand side is the expected value of the highest valuation among the
others, conditional on the event that $i$'s valuation (and hence his bid) is the highest.
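As a rough numerical check of the bidding function above, the following sketch approximates the integral with a midpoint rule. The function names, the step count and the choice of the uniform distribution $F(v) = v$ on $[0, 1]$ are assumptions made for illustration; in that case the integral can be done by hand and gives $\frac{n-1}{n} v_i$, which the code compares against.

```python
def uniform_cdf(v):
    """CDF of the uniform distribution on [0, 1] (illustrative choice)."""
    return v


def uniform_pdf(v):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0


def equilibrium_bid(v_i, n, cdf, pdf, steps=20000):
    """Approximate the first-price equilibrium bid with a midpoint rule.

    Computes  integral_0^{v_i} (n-1) v F(v)^(n-2) f(v) dv / F(v_i)^(n-1).
    """
    if v_i <= 0:
        return 0.0
    h = v_i / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += (n - 1) * v * cdf(v) ** (n - 2) * pdf(v) * h
    return total / cdf(v_i) ** (n - 1)


if __name__ == "__main__":
    for n in (2, 3, 5):
        numeric = equilibrium_bid(0.8, n, uniform_cdf, uniform_pdf)
        closed_form = (n - 1) / n * 0.8
        print(n, round(numeric, 6), round(closed_form, 6))
```

Other distributions can be tried by passing a different `cdf` and `pdf` pair, as long as the values lie in an interval starting at 0.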
For example, when the distribution is uniform, that is, when $F(v) = v$ and
$f(v) = 1$, the above solution reduces to

$$\beta(v_i) = \frac{n-1}{n} v_i$$

24.5
Second-price auction
"Second-price" sounds tricky, but it actually has a nice property. Let me explain
this by an example first. Take the same numerical example as before, in which
A's willingness to pay is 40 and B's is 20; then the payoff matrix for the
second-price auction game is
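Rows are B's bids and columns are A's bids; each cell lists A's net gain first and B's net gain second.

B \ A     10            20            30             40             50
10        30/2, 10/2    30, 0         30, 0          30, 0          30, 0
20        0, 10         20/2, 0/2     20, 0          20, 0          20, 0
30        0, 10         0, 0          10/2, -10/2    10, 0          10, 0
40        0, 10         0, 0          0, -10         0/2, -20/2     0, 0
50        0, 10         0, 0          0, -10         0, -20         -10/2, -30/2
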
The difference from the previous one is that here, if you win, you pay the
second-highest bid (which is the opponent's bid in the present illustration with two
bidders). Thus in the second-price auction the incentive to lower one's payment after
winning is eliminated from the beginning, by conceding the payment down to the
second-highest bid.
Consider for example that A bids 20 and B bids 10; then A wins and pays
10, not 20. Since A's willingness to pay is 40, his net gain is $40 - 10 = 30$.
It is immediate that B's net gain is then 0. Also, when both bid, let's say, 30,
the winner pays 30, since the first price and the second price coincide. Since
both are risk-neutral, A's expected utility is $\frac{1}{2}(40 - 30) + \frac{1}{2} \cdot 0 = \frac{10}{2}$ and B's
expected utility is $\frac{1}{2}(20 - 30) + \frac{1}{2} \cdot 0 = -\frac{10}{2}$.
You can see that there are many Nash equilibria because of the many ties in
payoffs, but there is an obvious one. Notice that bidding 40 is always optimal
for A and bidding 20 is always optimal for B, no matter what the
opponent bids. Thus (40, 20) is a dominant strategy equilibrium here, in which
each bidder bids his willingness to pay.
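The dominance claim can be checked by brute force on the bid grid of the example. The sketch below rebuilds the second-price payoffs for values 40 and 20 and searches for bids that are weakly best against every opposing bid. The function names are hypothetical, and ties are split by a fair coin as in the payoff matrix.

```python
from fractions import Fraction

BIDS = [10, 20, 30, 40, 50]
VALUE_A, VALUE_B = 40, 20  # willingness to pay in the running example


def second_price_payoffs(bid_a, bid_b):
    """Return (A's net gain, B's net gain) in the two-bidder second-price auction."""
    if bid_a > bid_b:  # A wins and pays B's bid
        return Fraction(VALUE_A - bid_b), Fraction(0)
    if bid_a < bid_b:  # B wins and pays A's bid
        return Fraction(0), Fraction(VALUE_B - bid_a)
    # Tie: each wins with probability 1/2 and pays the common bid.
    return Fraction(VALUE_A - bid_a, 2), Fraction(VALUE_B - bid_b, 2)


def weakly_dominant_bids_for_a():
    """Bids of A that are at least as good as any other bid, whatever B bids."""
    return [a for a in BIDS
            if all(second_price_payoffs(a, b)[0] >= second_price_payoffs(other, b)[0]
                   for b in BIDS for other in BIDS)]


def weakly_dominant_bids_for_b():
    """Bids of B that are at least as good as any other bid, whatever A bids."""
    return [b for b in BIDS
            if all(second_price_payoffs(a, b)[1] >= second_price_payoffs(a, other)[1]
                   for a in BIDS for other in BIDS)]


if __name__ == "__main__":
    print(weakly_dominant_bids_for_a(), weakly_dominant_bids_for_b())
```

On this grid the search singles out 40 for A and 20 for B, that is, each bidder's own willingness to pay.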
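This property holds generally in the second-price auction. Consider that there
are $n$ bidders. Denote bidder $i$'s willingness to pay by $v_i$ and his bid by $b_i$. Also, denote
the profile of bids other than $i$'s by $b_{-i} = (b_1, \dots, b_{i-1}, b_{i+1}, \dots, b_n)$; then the
entire profile of bids is denoted by $(b_i, b_{-i})$.

[Figure 24.2: net gain of bidder $i$ plotted against his bid $b_i$, with $\max_{j \neq i} b_j$ and $v_i$ marked on the horizontal axis and $v_i - \max_{j \neq i} b_j$ on the vertical axis]
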
Then bidder $i$'s net gain is

$$U_i(b_i, b_{-i}) = \begin{cases} 0, & \text{when } b_i < \max_{j \neq i} b_j \\ v_i - \max_{j \neq i} b_j, & \text{when } b_i > \max_{j \neq i} b_j \end{cases}$$
Again, we ignore the case of ties in the present setting of continuous bids.
Notice that one's own bid does not appear in the expression for his net gain
given above; it affects only whether he wins or not. Thus no bidder
can manipulate his payment by bidding either higher or lower than
his willingness to pay, which means he does not lose anything by bidding his
willingness to pay as it is. Thus we obtain the following result.
Theorem 24.1 In second-price auctions it is always a dominant strategy to
bid one's willingness to pay.
Proof. Case 1: When $v_i > \max_{j \neq i} b_j$, bidder $i$ should win. Then the relation
between his bid $b_i$ and his net gain is depicted as in Figure 24.2. There is a
continuum of optimal choices here, but since the graph is flat over them the bidder does
not lose anything by bidding $v_i$.
Case 2: When $v_i < \max_{j \neq i} b_j$, bidder $i$ should not win. Then the relation
between his bid $b_i$ and his net gain is depicted as in Figure 24.3. There is a
continuum of optimal choices here, but since the graph is flat over them the bidder does
not lose anything by bidding $v_i$.
[Figure 24.3: net gain of bidder $i$ plotted against his bid $b_i$, with $v_i$ and $\max_{j \neq i} b_j$ marked on the horizontal axis and $v_i - \max_{j \neq i} b_j$ on the vertical axis]

Weakness to collusion
The second-price auction has a weakness in the sense that it may be manipulated
by collusion. Recall that in the previous example A bids 40 and B bids 20 in
the dominant strategy equilibrium, where A's net gain is $40 - 20 = 20$ and B's
net gain is 0. If A and B can communicate, however, A can ask B to bid 10
instead of 20 by offering to pay him, let's say, 5. Since B knows he cannot win
the auction anyway, it is better for him to accept the offer. A gains as well, since
even after paying 5 his net gain is $40 - 10 - 5 = 25$, which is greater than 20.
24.6
Now, which auction format is good from the viewpoint of the seller?
Consider the following timeline.
1. Bidders are collected. At this point no bidder knows even his own valuation, and they only know the prior probabilistic distribution of valuations.
2. Auction format is chosen.
3. The bidders go to the viewing event, in which each one realizes his valuation. However, each bidder does not know the realized values for the
others.
4. The bidders bid.
More realistically, bidders may decide whether to bid after the auction
format is set or after seeing their values, but let me omit this as it is an advanced topic.
Thus, at the point of choosing an auction format the seller wants to maximize
the ex-ante expected revenue, since he is given only the prior distribution
over the bidders' valuations.
In this setting the following result is known, which is called the revenue equivalence theorem.
Theorem 24.2 Assume the following.
1. Risk-neutral bidders
2. Independent private values
Then, any auction format such that the bidder with the highest valuation always
wins the item in Bayesian Nash equilibrium yields the same ex-ante expected
revenue.
Since the bidder with the highest valuation always wins in all of the first-price,
second-price, English and Dutch auctions, they yield the same expected revenue
ex-ante. One can also think of the all-pay auction, in which bidders pay their bids
regardless of winning or losing and the highest bidder wins the item, the third-price
auction, in which the highest bidder wins and pays the third-highest bid, and
many others. As long as these maintain the property that the bidder with
the highest valuation always wins, they yield the same expected revenue ex-ante.
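A small Monte Carlo sketch can illustrate the theorem for the first-price and second-price formats. It assumes values drawn uniformly from $[0, 1]$, that first-price bidders use the uniform-case bidding function $\frac{n-1}{n} v_i$ from the first-price analysis, and that second-price bidders bid their values. The function name, seed and number of trials are arbitrary choices for illustration.

```python
import random


def simulate_revenues(n=3, trials=100000, seed=1):
    """Average seller revenue under first-price and second-price auctions.

    Assumes IID uniform[0, 1] values (requires n >= 2).
    First price: the winner pays his own bid (n-1)/n * v.
    Second price: the winner pays the second-highest value.
    """
    rng = random.Random(seed)
    first_total = 0.0
    second_total = 0.0
    for _ in range(trials):
        values = sorted(rng.random() for _ in range(n))
        first_total += (n - 1) / n * values[-1]
        second_total += values[-2]
    return first_total / trials, second_total / trials


if __name__ == "__main__":
    first, second = simulate_revenues()
    print(round(first, 3), round(second, 3))
```

With uniform values both averages should be close to $\frac{n-1}{n+1}$, up to simulation noise.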
The revenue equivalence theorem plays the role of a benchmark in the auction
literature, in the sense that it suggests which auction format is more favorable
when we depart from its assumptions.
It is known that when the bidders are risk-averse the first-price auction
yields higher expected revenue than the second-price auction. When a risk-averse
bidder knows he is strong, he wants to secure his win more than
a risk-neutral bidder does, and is willing to bid higher. On the other hand, it
holds regardless of risk attitudes that bidding one's own willingness to pay is always
a dominant strategy in the second-price auction.
Also, it is known that in the common-value case the English auction yields higher
expected revenue than the second-price auction. This is because the English auction
is a dynamic process and bidders learn about the common value as the auction
progresses. Even when a bidder initially plans to give up early, as he sees that
other bidders are still remaining in the room he learns that the value of the item
is actually higher than his initial expectation, updates his valuation, and stays
longer as well.
24.7
Exercises
Exercise 34 There are $n$ bidders, who have quasi-linear preferences over
consumption-income pairs and are risk-neutral.
Their willingness to pay is independently and identically distributed, where
the cumulative distribution function is denoted by $F$ and the density is denoted
by $f$.
Find the bidding function for Bayesian Nash equilibrium in the all-pay
auction game.
Chapter 25
25.1
Adverse selection
Adverse selection refers to a situation in which one side of the market cannot
observe or verify the types of the other side of the market.
25.1.1
"Lemon" is slang in used car markets for a defective car. There
are 100 sellers in the used car market, each of whom has one used car. Out of
the 100 sellers, 50 own cars of good quality and 50 own cars of bad quality. The
owners of good ones are willing to sell when they are paid 10000 dollars; the
owners of bad ones are willing to sell when they are paid 3000 dollars. There are
buyers, who are willing to pay 11000 dollars for a good one and 3500 for a bad
one. For simplicity assume that all the market participants are risk-neutral.
Now let me proceed by assuming the following.
Assumption on information
1: All the market participants know the above numbers.
2: But the buyers cannot observe or verify the quality of an individual car.
You can know its quality only after driving it for several weeks or so. When
you realize the car is defective and return to the seller claiming some
compensation, the seller will say "that'll be because you wore it out," and
you don't have any counter-evidence against it.
3: Also, the owners of good ones cannot differentiate their cars from the bad
ones, for the owners of bad ones can claim the same thing.
In such a situation, what will the resulting trading pattern be?
If the buyers could distinguish between the good ones and the bad ones, they
would simply be traded as different commodities. Then a good car would be traded for
a price between 10000 and 11000 and a bad car would be traded for a price
between 3000 and 3500.
Since the buyers cannot observe or verify the qualities, however, they have
to accept a price common to the good ones and the bad ones. Denote
such a price by $p$. Below we think step by step.
1. When $11000 < p$: Since the price is higher even than the willingness to
pay for the good ones, nobody buys.
2. When $10000 \le p \le 11000$: In this price range both types are provided to
the market. The buyers know this. That is, they know 50 cars in the
market are good and 50 are bad, hence they know that if they buy they draw a
good one with probability one half and a bad one with probability one half. Hence
each buyer's expected utility of buying is

$$0.5 \times 11000 + 0.5 \times 3500 - p = 7250 - p$$

from the assumption of risk neutrality. Notice that under the present assumption $10000 \le p \le 11000$ it is negative. Hence nobody buys.
3. When $3500 < p < 10000$: Since the price is lower than the reserve price
for the good ones, their owners do not sell. Thus only the bad ones are
provided to the market. The buyers know this. That is, they know that
all the cars in the market are bad. Hence each buyer's expected utility of
buying is

$$3500 - p,$$

which is negative under the current assumption $3500 < p < 10000$. Hence
nobody buys.
4. When $p < 3000$: Since the price is lower even than the reserve price for
the bad ones, nobody sells.
5. The remaining case is $3000 \le p \le 3500$: Only the bad ones are provided
to the market. The buyers know that all the cars in the market are
bad, but they are willing to buy. Thus only the bad cars are traded.
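The five steps above can be sketched as a small function that reports what happens at a given common price. The function name and the outcome labels are hypothetical; the numbers are those of the example, and buyers are assumed to know which types are offered at each price.

```python
def lemons_market_outcome(p, good_share=0.5,
                          reserve_good=10000, reserve_bad=3000,
                          wtp_good=11000, wtp_bad=3500):
    """Describe trade at common price p in the used-car example."""
    good_offered = p >= reserve_good   # owners of good cars are willing to sell
    bad_offered = p >= reserve_bad     # owners of bad cars are willing to sell
    if not bad_offered:
        return "no trade: nobody sells"
    if good_offered:
        # Both types are on the market; a buyer draws a good car with
        # probability good_share (risk-neutral expected value).
        expected_value = good_share * wtp_good + (1 - good_share) * wtp_bad
    else:
        # Only bad cars are on the market and buyers know it.
        expected_value = wtp_bad
    if expected_value - p < 0:
        return "no trade: nobody buys"
    return "both types traded" if good_offered else "only bad cars traded"


if __name__ == "__main__":
    for price in (12000, 10500, 5000, 3200, 2000):
        print(price, lemons_market_outcome(price))
```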
So-called adverse selection refers to such a situation, in which items of good
quality are driven out of the market and only bad ones remain in the market
and are traded. Whether this is the only possible outcome is a quantitative question, so
let me go over the argument in a slightly more general model.
The proportion of sellers owning good ones is denoted by $\lambda$; then the proportion
of sellers having bad ones is $1 - \lambda$. The reserve price for a good one
is denoted by $v_H$, that for a bad one is denoted by $v_L$, where $v_H > v_L$. Also,
the willingness to pay for a good one is denoted by $w_H$, and that for a bad one by
$w_L$, where $w_H > w_L$.
Let us focus on the case that $v_H \le w_H$ and $v_L \le w_L$, since if $v_H > w_H$ or
$v_L > w_L$ trades are not made even under complete information.
Again all the traders are assumed to be risk-neutral.
Case 1: When $\lambda w_H + (1 - \lambda) w_L \ge v_H$, there are two kinds of equilibria. One
is such that the price is in the range

$$\lambda w_H + (1 - \lambda) w_L \ge p \ge v_H$$

and both good ones and bad ones are traded. Here both types are provided
and the buyers accept the risk of drawing a bad one with probability
$1 - \lambda$, since the gamble is sufficiently attractive.
The other is such that the price is in the range

$$w_L \ge p \ge v_L$$

and only the bad ones are traded, which is the adverse selection situation
in the original example.
There exist multiple equilibria here, in the sense that even though they
come from the same fundamentals, in one equilibrium a higher price induces
the items of higher quality to be provided and this substantiates buyers'
optimistic expectations in a self-fulfilling manner, and in the other
equilibrium a lower price prevents the items of higher quality from being
provided and this substantiates buyers' pessimistic expectations in a
self-fulfilling manner.
Case 2: When $\lambda w_H + (1 - \lambda) w_L < v_H$, the only price range which can support
equilibrium is

$$w_L \ge p \ge v_L$$

in which only the bad ones are traded, which is the original adverse selection situation.
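The two cases can be sketched as a function that lists the price ranges supporting each kind of equilibrium. The function and argument names are hypothetical: `good_share` stands for the proportion of sellers owning good ones, and the other arguments for $v_H$, $v_L$, $w_H$, $w_L$. The sketch assumes, as in the original example, that $w_L < v_H$, so that good cars are not offered in the low price range.

```python
def lemons_equilibria(good_share, v_high, v_low, w_high, w_low):
    """Return {equilibrium kind: (lowest price, highest price)}.

    Assumes v_high <= w_high, v_low <= w_low and w_low < v_high.
    """
    # Risk-neutral willingness to pay for a car drawn from the whole population.
    pooled_value = good_share * w_high + (1 - good_share) * w_low
    equilibria = {}
    if pooled_value >= v_high:
        # Case 1: a high price draws good cars in and buyers still accept.
        equilibria["both types traded"] = (v_high, pooled_value)
    if w_low >= v_low:
        # Available in both cases: only bad cars are offered and bought.
        equilibria["only bad ones traded"] = (v_low, w_low)
    return equilibria


if __name__ == "__main__":
    print(lemons_equilibria(0.5, 10000, 3000, 11000, 3500))
    print(lemons_equilibria(0.9, 10000, 3000, 11000, 3500))
```

With the numbers of the original example the pooled value is 7250, below the good owners' reserve price, so only the second kind of equilibrium appears; a share of good cars of 0.9 (a made-up number) is enough to bring back the first kind.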
25.1.2
Insurance market
income denoted by $w$. Each smoker dies of lung cancer with probability
$\pi_S$ and each non-smoker dies of lung cancer with probability $\pi_N$ (forget about
other causes of death). If one dies, his dependent loses income $L$, where $L < w$.
On the other hand, the insurance company can issue life insurance such that
the insuree's dependent can receive the entire loss $L$. Also, the insurees and
their dependents are risk-averse and their vNM indices are given by $v(z) = \ln z$.
Assume that the insurance company is risk-neutral.
Again, as before assume the following.
Assumption on information
1: All the market participants know the above numbers.
2: But the insurance company cannot observe or verify whether each individual is
smoking or not (though I heard it is technically possible nowadays).
3: Also, a non-smoker cannot differentiate himself from the smokers, for the
smokers can claim that they are not smoking.
Here we assume that an insuree and his dependent are considered as one. What
does the insurance premium look like?
If the insurance company can observe and verify whether a given insuree is smoking
or not, it can charge different insurance premia to smokers and non-smokers.
In order to make the point clearer let us first think of this case. Let $p_S$
denote the insurance premium for smokers and $p_N$ denote that for non-smokers.
1. In order that a smoker buys the insurance, his expected utility of buying
must be at least as large as his expected utility of not buying, that is,

$$\ln(w - p_S) \ge (1 - \pi_S) \ln w + \pi_S \ln(w - L)$$

has to be met. Here the left-hand side is $\ln(w - p_S)$ because the insurance is a full-coverage
one, which guarantees the insuree's final income $w - p_S$.
On the other hand, in order that the insurance company sells the insurance
its expected profit must be non-negative, that is,

$$p_S - \pi_S L \ge 0$$

has to be met. Hence the price range in which an insurance contract can
be made between a smoker and the insurance company is

$$w - w^{1 - \pi_S} (w - L)^{\pi_S} \ge p_S \ge \pi_S L$$

For notational simplicity, let $w - w^{1 - \pi_S} (w - L)^{\pi_S} = v_S$.
2. Likewise, the price range in which an insurance contract can be made
between a non-smoker and the insurance company is

$$w - w^{1 - \pi_N} (w - L)^{\pi_N} \ge p_N \ge \pi_N L$$

Again for notational simplicity, let $w - w^{1 - \pi_N} (w - L)^{\pi_N} = v_N$.
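The premium ranges under observable types can be computed directly. The sketch below uses made-up numbers (income 100, loss 50, death probabilities 0.3 for smokers and 0.1 for non-smokers); the function names are hypothetical. The upper end of each range is the highest premium the insuree accepts under the logarithmic vNM index, and the lower end is the expected payout.

```python
import math


def max_premium(w, loss, death_prob):
    """Largest p with ln(w - p) >= (1 - q) ln(w) + q ln(w - loss), q = death_prob."""
    return w - w ** (1 - death_prob) * (w - loss) ** death_prob


def premium_range(w, loss, death_prob):
    """(lowest premium the insurer accepts, highest premium the insuree accepts)."""
    lower = death_prob * loss           # zero expected profit for the insurer
    upper = max_premium(w, loss, death_prob)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the text.
    print("smoker:", premium_range(100.0, 50.0, 0.3))
    print("non-smoker:", premium_range(100.0, 50.0, 0.1))
```

Because the insuree is risk-averse, the upper end exceeds the lower end, so a mutually acceptable premium exists for each type when types are observable.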
Now let's consider that the insurance company cannot observe or verify whether
a given buyer is smoking or not. Then the company has to set an insurance
premium common across smokers and non-smokers, which is denoted by $p$. Let $\lambda$
denote the proportion of smokers. We can think of two cases.
Case 1: When $v_N \ge (\lambda \pi_S + (1 - \lambda) \pi_N) L$, there are two kinds of equilibria.
One is such that the insurance premium is in the range

$$v_N \ge p \ge (\lambda \pi_S + (1 - \lambda) \pi_N) L$$

and both smokers and non-smokers buy the insurance. Here both
types are buying the insurance and the insurance company accepts the
risk of drawing a smoker with probability $\lambda$, since the gamble is
sufficiently attractive.
The other equilibrium is such that the insurance premium is in the range

$$v_S \ge p \ge \pi_S L$$

and only smokers buy the insurance.
Multiple equilibria exist here. In the first kind, both types are willing
to buy the insurance since the premium is sufficiently cheap, and the
insurance company accepts the risk since the proportion of
smokers is sufficiently low.
Case 2: When $v_N < (\lambda \pi_S + (1 - \lambda) \pi_N) L$, the only price range which can
support equilibrium is

$$v_S \ge p \ge \pi_S L$$

in which only smokers buy the insurance, which is the adverse selection
situation.
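The case distinction can be sketched numerically as follows. The function name, the argument names and the numbers are hypothetical; `smoker_share` stands for the proportion of smokers, and the ranges returned are exactly the ones stated in the two cases above.

```python
def insurance_equilibria(w, loss, prob_smoker, prob_nonsmoker, smoker_share):
    """Return {equilibrium kind: (lowest premium, highest premium)}."""
    # Highest premium each type accepts under the logarithmic vNM index.
    v_s = w - w ** (1 - prob_smoker) * (w - loss) ** prob_smoker
    v_n = w - w ** (1 - prob_nonsmoker) * (w - loss) ** prob_nonsmoker
    # Expected payout per contract when both types buy.
    pooled_cost = (smoker_share * prob_smoker
                   + (1 - smoker_share) * prob_nonsmoker) * loss
    equilibria = {"only smokers buy": (prob_smoker * loss, v_s)}
    if v_n >= pooled_cost:
        # Case 1: non-smokers still accept a premium that covers the pooled risk.
        equilibria["both types buy"] = (pooled_cost, v_n)
    return equilibria


if __name__ == "__main__":
    # Hypothetical numbers: few smokers, then many smokers.
    print(insurance_equilibria(100.0, 50.0, 0.3, 0.1, 0.05))
    print(insurance_equilibria(100.0, 50.0, 0.3, 0.1, 0.5))
```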
25.2
Moral hazard
Moral hazard refers to a situation in which one side of the market cannot
observe actions taken by the other side of the market. The term originated in the
insurance industry, where it refers to a situation in which insurees neglect to protect their
assets when they are fully insured.
25.2.1
Insurance market
Consider a market for car insurance. Assume that the driver is risk-averse and
his risk attitude is described by a vNM index $v(z) = \ln z$ for simplicity. The
insurance company is assumed to be risk-neutral. The driver's initial income is
denoted by $w$.
If the driver drives carefully the accident probability is $\pi_1$, and otherwise it
is $\pi_2$, where $0 < \pi_1 < \pi_2 < 1$. If he has an accident he loses $L$.
Being careful is not free; it incurs a certain cost. Such cost is given by $c > 0$
in monetary terms. Its size does not have much quantitative meaning. The whole
point is that it is not free.
Assume that $c$ is positive but sufficiently small, $\pi_1$ is positive but sufficiently
small, and $\pi_2$ is sufficiently large but less than 1. Then, as is discussed in the chapter
on competitive markets, the efficient risk-sharing is such that the risk-neutral
agent, the insurance company here, takes all the risk, and the risk-averse agent
receives riskless consumption, where the driver drives carefully because the cost
of doing so is sufficiently cheap. Thus, as a benchmark consider an insurance
with complete coverage, which pays $L$ if the driver has an accident.
Now consider, however, that the insurance company cannot observe or verify
whether the driver was driving carefully. We assume the following.
Assumption on information
1: Both parties know the above setting and numbers.
2: However, the insurance company cannot observe or verify whether an individual
driver was driving carefully.
Here the insurance company cannot make the premium or payment conditional
on whether the insuree was driving carefully. Thus, consider the unconditional
insurance premium $p$, where the insurance is assumed to be full-coverage, paying
$L$ upon accident.
1. Suppose the driver is not insured. Then his expected utility when he drives
carefully is

$$(1 - \pi_1) \ln(w - c) + \pi_1 \ln(w - L - c)$$

On the other hand, his expected utility when he does not drive carefully is

$$(1 - \pi_2) \ln w + \pi_2 \ln(w - L)$$

Since $\pi_1$ is sufficiently low and $\pi_2$ is sufficiently high, the driver drives
carefully when he is not insured.
2. Suppose the driver is insured. Then, given that the insurance is full-coverage,
his expected utility when he drives carefully is

$$\ln(w - c - p)$$

On the other hand, his expected utility when he does not drive carefully
is

$$\ln(w - p)$$

since the insurance is full-coverage. Thus it is a waste of effort for the
driver to drive carefully under the full-coverage insurance (unless he dies
of the accident).
3. The condition that the driver chooses to buy the insurance (called a participation constraint) is

$$(1 - \pi_1) \ln(w - c - p) + \pi_1 \ln(w - L + R - c - p) \ge (1 - \pi_1) \ln(w - c) + \pi_1 \ln(w - L - c)$$

Here $R$ appears as the amount the insurance pays upon an accident.
4. Given that the above constraints are met, the insurance company can assume
that the driver makes the effort and the accident probability is $\pi_1$; hence
the condition for the company to sell the insurance is

$$p - \pi_1 R \ge 0$$
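The comparisons in the list above can be checked with a few lines of arithmetic. All numbers below are hypothetical (income 100, loss 60, cost of care 1, accident probabilities 0.05 and 0.5, payout 40, premium 2), and the function names are made up; `payout` plays the role of $R$, read here as the payment received upon an accident.

```python
import math


def eu_uninsured(w, loss, cost, p_careful, p_careless, careful):
    """Expected log utility without insurance."""
    if careful:
        return (1 - p_careful) * math.log(w - cost) + p_careful * math.log(w - loss - cost)
    return (1 - p_careless) * math.log(w) + p_careless * math.log(w - loss)


def eu_insured(w, loss, cost, premium, payout, accident_prob, careful):
    """Expected log utility with premium `premium` and payment `payout` upon accident."""
    effort_cost = cost if careful else 0.0
    no_accident = w - effort_cost - premium
    accident = w - loss + payout - effort_cost - premium
    return (1 - accident_prob) * math.log(no_accident) + accident_prob * math.log(accident)


if __name__ == "__main__":
    w, loss, cost, pi1, pi2 = 100.0, 60.0, 1.0, 0.05, 0.5  # hypothetical numbers
    # Item 1: without insurance, being careful is better.
    print(eu_uninsured(w, loss, cost, pi1, pi2, True) > eu_uninsured(w, loss, cost, pi1, pi2, False))
    # Item 2: with full coverage (payout = loss), care only costs money.
    print(eu_insured(w, loss, cost, 3.0, loss, pi1, True) < eu_insured(w, loss, cost, 3.0, loss, pi2, False))
    # Items 3 and 4: participation and non-negative profit for payout 40, premium 2.
    print(eu_insured(w, loss, cost, 2.0, 40.0, pi1, True) >= eu_uninsured(w, loss, cost, pi1, pi2, True))
    print(2.0 - pi1 * 40.0 >= -1e-12)
```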
25.2.2
Reward contract
Consider that there is one employer and one employee. There is a project
whose outcome (revenue) is affected by the level of effort of the employee, in
the sense that the probability of a good outcome is high (but not 1) when he
makes an effort and low (but not 0) when he does not.
Assume that the employee is risk-averse and his risk attitude is described by
a vNM index $v(z)$, where $z$ denotes final income. The employer is assumed to
be risk-neutral.
If the employee makes an effort the probability of a good outcome is $\pi_H$, and
otherwise it is $\pi_L$, where $0 < \pi_L < \pi_H < 1$. If the outcome is good the employer
receives revenue $R_G$ and if it is bad he receives $R_B$.
Effort is not free. The cost of effort is given by $c > 0$ in monetary terms.
Assume that $c$ is positive but sufficiently small compared to its effect on
the expected increase of revenue, $\pi_H$ is sufficiently large but not 1, and $\pi_L$ is sufficiently
small but not 0. Then, as is discussed in the chapter on competitive markets,
the efficient risk-sharing is such that the risk-neutral agent, the employer here,
bears all the risk, and the risk-averse agent receives a riskless reward, where the
employee makes an effort because the cost of doing so is sufficiently cheap.
However, the riskless reward is impossible under moral hazard. The employer
cannot observe or verify how the employee is working. Then,
1. Since the level of effort is not observable or verifiable, the employer cannot
make pay depend on the employee's effort.
2. On the other hand, when a fixed reward payment is made regardless of
outcome the employee will not make an effort. When there is even a small
uncertainty about outcomes, in particular, the employer cannot identify
the employee's effort level from the outcome. One can always say, "I tried
hard but unfortunately the outcome wasn't good."
In such a situation the employer has to condition the reward payment on the
outcome.
Let $w_G$ denote the reward payment for the good outcome and $w_B$ that for the bad
outcome. Then the incentive condition for the employee to make an effort is

$$\pi_H v(w_G - c) + (1 - \pi_H) v(w_B - c) \ge \pi_L v(w_G) + (1 - \pi_L) v(w_B)$$

Let $\bar{w}$ be the income which the employee receives when he does not sign
the contract and works elsewhere. Then the participation condition, so that the
employee signs the contract by his own choice, is

$$\pi_H v(w_G - c) + (1 - \pi_H) v(w_B - c) \ge v(\bar{w})$$
25.2.3
Now consider that the insurance company or the employer tries to design a contract
in order to maximize the expected profit. This is called a principal-agent
problem, where the principal offers a contract to the agent but cannot monitor
or verify the agent's action. Typical examples are that the principals are an
insurance company, an employer and a shareholder of a firm, and the agents are
a driver, an employee and a manager, respectively.
In general it is not clear whether inducing the agent to make an effort is the best thing,
since it might be too expensive to do so and it might be better to be content
with lower effort from the agent. Hence the problem takes the following form. Let
$e$ denote the agent's effort level, which is assumed here for simplicity to be either $H$
or $L$. Let $c_e$ denote the cost of making effort level $e$, which is measured in terms
of income. There are two possible outcomes, good or bad. Let $R_G$ denote the
principal's revenue when the outcome is good, and $R_B$ be the revenue when the
outcome is bad. Let $\pi_e$ be the probability of the good outcome when the agent's
effort level is $e$.
The principal is risk-neutral and cares about the expected profit. The agent is
risk-averse and his risk attitude is described by the vNM index $v(z)$, where $z$
denotes the final income. Also, let $\bar{v}$ denote the level of the vNM index given to the
agent when he does not sign the contract and goes somewhere else.
Then the principal-agent problem is formulated as

$$\max_{w_G, w_B, e} \; \pi_e (R_G - w_G) + (1 - \pi_e)(R_B - w_B)$$

subject to

$$\pi_e v(w_G - c_e) + (1 - \pi_e) v(w_B - c_e) \ge \pi_{e'} v(w_G - c_{e'}) + (1 - \pi_{e'}) v(w_B - c_{e'}) \quad \text{for all } e' = H, L$$

$$\pi_e v(w_G - c_e) + (1 - \pi_e) v(w_B - c_e) \ge \bar{v}$$

where the first constraint is the incentive constraint and the second is the participation constraint.
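To get a feel for the problem, one can solve a small instance by brute force. The sketch below assumes a logarithmic vNM index, searches over integer wages on a grid, and uses made-up numbers; none of these choices come from the text, and the function name is hypothetical. A small tolerance is used in the comparisons to avoid spurious failures from floating-point rounding.

```python
import math


def solve_principal_agent(prob, cost, revenue_good, revenue_bad,
                          reservation_utility, wage_grid, tol=1e-12):
    """Grid search for the profit-maximizing (effort, w_good, w_bad).

    prob[e] is the probability of the good outcome and cost[e] the effort cost
    for e in {"H", "L"}; v(z) = ln z is assumed for the agent.
    """
    def agent_utility(effort, w_good, w_bad):
        return (prob[effort] * math.log(w_good - cost[effort])
                + (1 - prob[effort]) * math.log(w_bad - cost[effort]))

    best = None
    for effort in prob:
        for w_good in wage_grid:
            for w_bad in wage_grid:
                own = agent_utility(effort, w_good, w_bad)
                # Incentive constraint: no other effort level gives the agent more.
                incentive_ok = all(own >= agent_utility(other, w_good, w_bad) - tol
                                   for other in prob)
                # Participation constraint: at least the outside option.
                participation_ok = own >= reservation_utility - tol
                if not (incentive_ok and participation_ok):
                    continue
                profit = (prob[effort] * (revenue_good - w_good)
                          + (1 - prob[effort]) * (revenue_bad - w_bad))
                if best is None or profit > best["profit"]:
                    best = {"effort": effort, "w_good": w_good,
                            "w_bad": w_bad, "profit": profit}
    return best


if __name__ == "__main__":
    # Hypothetical numbers; wages on the grid must exceed the largest effort cost.
    result = solve_principal_agent(
        prob={"H": 0.8, "L": 0.4}, cost={"H": 2.0, "L": 0.0},
        revenue_good=100.0, revenue_bad=20.0,
        reservation_utility=math.log(10.0), wage_grid=range(3, 61))
    print(result)
```

In this instance the high effort is worth inducing, and the selected contract pays more after the good outcome than after the bad one, which is how the incentive constraint is met.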
25.3
Signaling
25.3.1
Education as a signaling
Consider two types of workers, high and low in terms of productivity. Let $a_1$
denote the marginal productivity of a high-type worker and $a_2$ denote that of a
low-type worker, where $a_1 > a_2$.
The proportion of the high type is known to be $\lambda$. Hence the prior expectation of
a worker's marginal productivity is $\lambda a_1 + (1 - \lambda) a_2$.
When the labor market is perfectly competitive and the employers can
observe and verify individual workers' types, the high-type worker is paid $a_1$ and
the low-type worker is paid $a_2$. However, when the employers cannot observe or verify
individual workers' types we run into the adverse selection problem.
Now consider that the workers can send signals to the employers. Here let
us consider an academic degree of a given level as the signal. Receiving the
signal from a given worker, the employers pay the conditional expectation of
his productivity.
Let $c_1$ be the cost of acquiring the degree for the high type, which is measured
in income, and let $c_2$ be that for the low type, where $c_1 < c_2$. This means
that it is easier for the high type to acquire the degree.
Here the cost is not just the direct monetary cost, but rather about how painful
it is. The assumption $c_1 < c_2$ says that it is relatively more painful for the low
type to acquire the degree.
We can think of three cases.
Case 1: Consider that only the high type acquires the degree. This is called
a separating equilibrium. Then the employers know that the type of
a given worker is high if he has the degree, and low otherwise. Hence they
pay $a_1$ to him if he has the degree and $a_2$ otherwise.
In order that this is the case, it has to be optimal for the high-type worker
to acquire the degree and it has to be optimal for the low-type worker not
to acquire the degree. The first condition is

$$a_1 - c_1 \ge a_2,$$

where the left-hand side is the net gain of the high-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
The second condition is

$$a_1 - c_2 < a_2,$$

where the left-hand side is the net gain of the low-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
Summing up, the separating equilibrium occurs when $c_1 \le a_1 - a_2 < c_2$.
Case 2: Consider that both types acquire the degree. This is called a pooling
equilibrium. Then the employers receive no new information when
they receive the signal, hence the expected marginal productivity of any
individual worker who has the degree is $\lambda a_1 + (1 - \lambda) a_2$, where $\lambda$ is the
prior proportion of the high type.
A problem occurs when they see a worker who does not have the degree.
Since all the workers acquire the degree here, the existence of a worker
without the degree is supposed to be "impossible." What would the employers
believe when an "impossible" thing has happened?
Since the standard Bayes rule does not apply here, their belief after seeing
the "impossible" is taken to be another parameter. Let $\mu$ denote the
employers' belief that a worker without the degree is the high type. Then the
expected marginal productivity is $\mu a_1 + (1 - \mu) a_2$.
Here it has to be the case that it is optimal for both types to acquire the
degree. For the high type the required condition is

$$\lambda a_1 + (1 - \lambda) a_2 - c_1 \ge \mu a_1 + (1 - \mu) a_2$$

where the left-hand side is the net gain of the high-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
For the low type the required condition is

$$\lambda a_1 + (1 - \lambda) a_2 - c_2 \ge \mu a_1 + (1 - \mu) a_2$$

where the left-hand side is the net gain of the low-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
From the above two inequalities the pooling equilibrium occurs when

$$\mu \le \lambda - \frac{c_1}{a_1 - a_2} \quad \text{and} \quad \mu \le \lambda - \frac{c_2}{a_1 - a_2}.$$

Since $c_1 < c_2$ the condition reduces to

$$\mu \le \lambda - \frac{c_2}{a_1 - a_2}.$$

The condition says that when the employers see a worker without the
degree (which is "impossible") they believe that the worker is less likely to
be the high type.1
Case 3: Consider that neither type acquires the degree. This is
another kind of pooling equilibrium. Then the employers receive no new
information when they receive the signal of no degree, hence the expected
marginal productivity of any individual worker who does not have the
degree is $\lambda a_1 + (1 - \lambda) a_2$, where $\lambda$ is the prior proportion of the high type.
A problem occurs when they see a worker who does have the degree. Since
no worker acquires the degree here, the existence of a worker having the
degree is supposed to be "impossible." Again, what would the employers
believe when an "impossible" thing has happened?
1 Technically,
Since the standard Bayes rule does not apply here either, their belief after
seeing the "impossible" is taken to be another parameter again. Let $\mu$ denote
the employers' belief that a worker having the degree is the high type. Then
the expected marginal productivity is $\mu a_1 + (1 - \mu) a_2$.
Here it has to be the case that it is optimal for both types not to acquire
the degree. Writing $\lambda$ for the prior proportion of the high type, the required
condition for the high type is

$$\mu a_1 + (1 - \mu) a_2 - c_1 < \lambda a_1 + (1 - \lambda) a_2$$

where the left-hand side is the net gain of the high-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
For the low type the required condition is

$$\mu a_1 + (1 - \mu) a_2 - c_2 < \lambda a_1 + (1 - \lambda) a_2$$

where the left-hand side is the net gain of the low-type worker when he
acquires the degree and the right-hand side is his net gain when he doesn't.
From the above two inequalities the pooling equilibrium occurs when

$$\mu < \lambda + \frac{c_1}{a_1 - a_2} \quad \text{and} \quad \mu < \lambda + \frac{c_2}{a_1 - a_2}.$$

Since $c_1 < c_2$ the condition reduces to

$$\mu < \lambda + \frac{c_1}{a_1 - a_2}.$$

The condition says that when the employers see a worker having the degree
(which is "impossible") their belief that the worker is the high type must
not be too high.
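The conditions for the three cases can be collected in a small checker. The function and argument names are hypothetical, and the numbers in the usage lines are made up. `prior_high` stands for the prior proportion of the high type and `belief_off_path` for the employers' belief after seeing the "impossible" signal; for simplicity the sketch uses the same number for that belief in Case 2 (no degree observed) and Case 3 (degree observed).

```python
def signaling_equilibria(a1, a2, c1, c2, prior_high, belief_off_path):
    """List which of the three kinds of equilibria the parameters support.

    Assumes a1 > a2 and c1 < c2 as in the text.
    """
    gap = a1 - a2
    found = []
    # Case 1: only the high type acquires the degree.
    if c1 <= gap < c2:
        found.append("separating")
    # Case 2: both types acquire the degree (binding condition is the low type's).
    if belief_off_path <= prior_high - c2 / gap:
        found.append("pooling with degree")
    # Case 3: neither type acquires the degree (binding condition is the high type's).
    if belief_off_path < prior_high + c1 / gap:
        found.append("pooling without degree")
    return found


if __name__ == "__main__":
    print(signaling_equilibria(10, 4, 2, 8, 0.5, 0.0))
    print(signaling_equilibria(10, 4, 1, 2, 0.6, 0.1))
```

The second call shows that when the degree is cheap for both types, separation fails while both kinds of pooling can be supported by suitable off-path beliefs.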
Notice that here the effort made for the signaling purpose has nothing to do
with enhancing productivity. It is a waste. In Japan, a new university graduate
who seeks a job at an entry level is often supposed to submit his CV in handwriting.
No correction marker is allowed. Here the employers are testing whether a job
candidate is "crazy" enough (and revealing that they want such "crazy"
workers).
Of course I'm not saying that education is totally a waste of resources. But
I would say education certainly has such an aspect.
25.4
Speculative trade
Under complete information, trades occur because of the distribution of preferences
and initial holdings, such as "I have what you like and you have what I
like." Now, consider that such factors have been eliminated and ask if trades can
happen solely because of differences in information. That is, ask if speculative
trades are possible.
It sounds easy. When I receive information that the price of some stock is
likely to go up and you receive information that it is likely to go down, I might
want to buy and you might want to sell.
However, if I am "rational" in the sense that I am not stubborn and do
not have baseless confidence, I should be surprised when I encounter a trader
who is willing to sell the stock: "why is this guy willing to sell?" And if I am
"rational" in the sense that I can process information in a logically correct
manner, I should update my prediction about the stock price based on
probability theory, taking the emergence of such a seller (you) into account.
Similarly for you. If you are "rational" in the sense that you are not stubborn
and do not have baseless confidence, you should be surprised when you
encounter a trader who is willing to buy the stock: "why is this guy willing to
buy?" And if you are "rational" in the sense that you can process information
in a logically correct manner, you should update your prediction about the
stock price based on probability theory, taking the emergence of such a buyer
(me) into account.
Can such "rational" individuals "agree to disagree" so that they trade? Let
me explain using the following example, which is an economic version of the so-called
muddy children's puzzle.
First let me explain the muddy children's puzzle.

Two children are playing in a muddy playground. Their faces may
become muddy, but each child cannot see whether his own face is muddy or not,
while he can see whether the other child's face is muddy or not.
Now let's say that both children's faces are muddy.
The teacher comes and says, "At least one of your faces is
muddy. If you see that your face is muddy go to the bathroom
immediately, and otherwise stay here."
The two children stared at each other for a while, and as soon as each
confirmed that the other was staying they immediately went to
the bathroom.

What happened here? When they are told by the teacher that at least one
of their faces is muddy, they cannot figure out whether just one of them is muddy or
both are muddy. So each child cannot tell whether the other's face alone is muddy
or his own face is muddy as well. Thus each of them stays.
However, once each of them sees that the other is staying he reasons,
"if my face were not muddy the other would have gone to the bathroom immediately,
since he would have realized that his face alone is muddy when the
teacher said that at least one of our faces is muddy."
        Q1      Q2      Q3
P1      0.7     0.6     0.4
P2      0.3     0.55    0.6
P3      0.8     0.1     0.45
Here the number in the cell (P1, Q1) shows that the stock price rises with
probability 0.7 and falls with probability 0.3 if P1 and Q1 are true. Similarly
for the other combinations.
Assume that both traders are risk neutral, without loss of generality. Then
A likes to hold the stock if the probability of a rise is greater than 0.5 and likes to
sell if it is below 0.5. B likes to buy the stock if the probability of a rise is greater
than 0.5 and likes not to if it is below 0.5.
Now suppose A has received P2 and B has received Q3.
A mediator asks the two traders questions by the following procedure. Each trader's response is observed by the other.
Step 1: Ask A, "Would you like to sell?" If the response is NO, stop. If the
response is YES, go to Step 2.
Step 2: Ask B, "Would you like to buy?" If the response is NO, stop. If the
response is YES, go to Step 3.
Step 3: Ask A, "Would you still like to sell?" If the response is NO, stop. If
the response is YES, go to Step 4.
Step 4: Ask B, "Would you still like to buy?" If the response is NO, stop. If
the response is YES, go to Step 5.
Repeat as long as they say YES.
As you repeat this, after finitely many rounds one of them says NO.
Why? Let us see this step by step. Initially, A knows P2 is true but he
does not know what B knows. Also B knows Q3 but does not know what
A knows. Hence the knowledge commonly known to them is only the trivial one:
either P1 or P2 or P3 is true, and either Q1 or Q2 or Q3 is true. Let me denote this
by $CK_0 = \{P_1, P_2, P_3\} \times \{Q_1, Q_2, Q_3\}$.
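In Step 1, since A knows that P2 is true, the probability of price rise conditional on his current information is
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2\} \times \{Q_1, Q_2, Q_3\}) = \frac{0.3 + 0.55 + 0.6}{3} = \frac{1.45}{3} < 0.5$$
In Step 2, B knows that Q3 is true and, having observed A's YES, that P1 is not true, so the probability of price rise conditional on his current information is
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2, P_3\} \times \{Q_3\}) = \frac{0.6 + 0.45}{2} = \frac{1.05}{2} > 0.5$$
In Step 3, A knows that P2 is true and, having observed B's YES, that Q2 is not true, so the probability of price rise conditional on his current information is
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2\} \times \{Q_1, Q_3\}) = \frac{0.3 + 0.6}{2} = \frac{0.9}{2} < 0.5$$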
Since A still believes the stock price is likely to fall he says YES.
Observing this response, B knows that P3 is not true. For, if P3 were true
A would have said NO, because the conditional probability of price rise is then
$$\mathrm{Prob}(\mathrm{up} \mid \{P_3\} \times \{Q_1, Q_3\}) = \frac{0.8 + 0.45}{2} = \frac{1.25}{2} > 0.5$$
After Step 3 it is known to both A and B that P3 is not true, so the common
knowledge is updated into $CK_3 = \{P_2\} \times \{Q_1, Q_3\}$.
In Step 4, since B knows that Q3 is true and also that P2 is true, the
probability of price rise conditional on his current information is
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2\} \times \{Q_3\}) = 0.6 > 0.5$$
Since B knows the stock price is likely to rise he says YES.
Observing this response, A knows that Q1 is not true. For, if Q1 were true
B would have said NO, because the conditional probability of price rise is then
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2\} \times \{Q_1\}) = 0.3 < 0.5$$
But B said YES. Therefore A knows Q1 is not true.
After Step 4 it is known to both A and B that Q1 is not true, so the common
knowledge is updated into $CK_4 = \{P_2\} \times \{Q_3\}$. Now both know the truth.
Now in Step 5, since A knows that P2 is true and that Q3 is true, the
probability of price rise conditional on his current information is
$$\mathrm{Prob}(\mathrm{up} \mid \{P_2\} \times \{Q_3\}) = 0.6 > 0.5$$
Since A knows the stock price is likely to rise he says NO.
Thus, it is impossible for rational agents to agree to disagree. You may
have a concern about the assumption here that each agent responds YES or
NO sincerely at each step, while in a more realistic trading situation they may
lie, and the trade takes place only when both say YES simultaneously. It is known,
however, that it is impossible for them to agree to disagree even in such
situations. See Geanakoplos [7] for more details.
This result relies on the common-prior assumption, which in the current
example says that both A and B know that the probability distributions over
P1, P2, P3 and Q1, Q2, Q3 are uniform and independent. As I discussed in the
chapter on games with incomplete information, however, we cannot just drop
or relax it.
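The step-by-step reasoning in the example above can be replayed mechanically. The following is only a sketch under the stated assumptions (uniform and independent prior, sincere answers, and a trader who is exactly indifferent answering NO); the names TABLE, prob_up and run_mediator are my own and do not come from the text.

```python
from fractions import Fraction

# Probability of a price rise for each pair of news items
# (outer key: A's news, inner key: B's news), taken from the table in the example.
TABLE = {
    "P1": {"Q1": "0.7", "Q2": "0.6", "Q3": "0.4"},
    "P2": {"Q1": "0.3", "Q2": "0.55", "Q3": "0.6"},
    "P3": {"Q1": "0.8", "Q2": "0.1", "Q3": "0.45"},
}
HALF = Fraction(1, 2)


def prob_up(p_set, q_set):
    """Average rise probability over the cells still possible (uniform, independent prior)."""
    cells = [Fraction(TABLE[p][q]) for p in p_set for q in q_set]
    return sum(cells) / len(cells)


def run_mediator(a_news, b_news, max_steps=20):
    """Return the answers given in Steps 1, 2, 3, ... until somebody says NO."""
    p_possible = set(TABLE)            # commonly known candidates for A's news
    q_possible = set(TABLE[a_news])    # commonly known candidates for B's news
    answers = []
    for step in range(1, max_steps + 1):
        if step % 2 == 1:
            # A's turn: with news p he wants to sell if a rise looks unlikely.
            def wants(p):
                return prob_up({p}, q_possible) < HALF
            says_yes = wants(a_news)
            # Rule out the news items under which A would have answered differently.
            p_possible = {p for p in p_possible if wants(p) == says_yes}
        else:
            # B's turn: with news q he wants to buy if a rise looks likely.
            def wants(q):
                return prob_up(p_possible, {q}) > HALF
            says_yes = wants(b_news)
            q_possible = {q for q in q_possible if wants(q) == says_yes}
        answers.append("YES" if says_yes else "NO")
        if not says_yes:
            break
    return answers


print(run_mediator("P2", "Q3"))
```

With A holding P2 and B holding Q3 the sketch reproduces the sequence of answers derived above: YES in Steps 1 to 4 and NO in Step 5.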
25.5 Exercises
Exercise 35 There are 100 people who want to sell used cars and 100 people
who want to buy a used car. Assume they are risk-neutral. 75 cars are plums
and 25 cars are lemons. The owner of a plum is willing to part with it for
4000. The owner of a lemon is willing to part with it for 1000. The buyers are
willing to pay 5000 for a plum and 1500 for a lemon.
When the buyers cannot verify whether a given car is a plum or a lemon, they
have to accept a common price. Describe all the trading patterns that can arise
in this setting.
Exercise 36 There is one employer and one employee. The employee has two
choices of effort, high or low. When she makes high effort, it generates a good
outcome with probability 0.95 and a bad outcome with probability 0.05. When
she makes low effort, it generates a good outcome with probability 0.2 and a
bad outcome with probability 0.8. The cost of high effort for the employee is 3,
and the cost of low effort is 0. Assume that the employee is risk-neutral.
The employer cannot observe or verify whether the employee made high effort
or not, although she knows the cost of high effort for the employee. She can
pay different wages for different outcomes. Denote the wage for the good outcome
by $w_G$ and the wage for the bad outcome by $w_B$. To induce the employee to make high
effort, what condition do $w_G$ and $w_B$ have to satisfy?
Exercise 37 A is holding a stock. B is a potential buyer of the stock. A
receives some news that may be related to the stock price. She receives one
of P1, P2, P3, with equal probability. B receives another type of news that
may be related to the stock price. She receives either Q1 or Q2, with equal
probability. The news items they receive are independently
distributed. Also, A cannot see what B receives and B cannot see what A
receives.
The table below shows how the stock price is related to the news.
        Q1      Q2
P1      0.55    0.4
P2      0.3     0.9
P3      0.6     0.3
For example, the number in the cell (P3, Q2) tells that if you know both P3 and
Q2 then you know that the stock price goes up with probability 0.3 and goes
down with probability 0.7.
Assume that both individuals are risk neutral. That is, A wants to sell the
stock if the probability of its price going up is smaller than 0.5 based on what
she knows (and does not want to sell if it is greater than 0.5). B wants to buy
the stock if the probability of its price going up is greater than 0.5 based on
what she knows (and does not want to buy if it is smaller than 0.5).
Now suppose that A receives P1 and B receives Q1.
You successively ask questions to the two individuals in the following way.
Every answer is immediately observable to the other.
Step 1: You ask A, "Would you like to sell the stock?"
Step 2: If A says Yes in Step 1, you ask B, "Would you like to buy the stock?"
Step 3: If B says Yes in Step 2, you ask A, "Would you still like to sell the stock?"
Step 4: If A says Yes in Step 3, you ask B, "Would you still like to buy the stock?"
And so on. You continue this until one of them says No.
Show that after finitely many rounds one of them ends up saying No. Describe
the logical process of how.
Part V
Chapter 26
Externality
Consumption or production activity is said to have an externality if its effect
is not taken into account in the determination of market prices.
For example, when the consumption of a good harms or annoys others,
the consumer himself would not take that into account and
the others cannot stop it unless they have a legal right to do so; hence the price
of the consumption good does not take the harm or annoyance into account. For
example, even if we know that consumption activity harms ourselves through
the emission of CO2, each consumer need not take this into account in his
consumption activity, and we cannot force each consumer to do so either.
Or, when there is a production activity which causes social costs (such
as pollution), the corresponding firm does not take them into account in its profit
maximization decision, and the other economic agents cannot stop it unless they
have a legal right to do so; hence the prices of the output and inputs do
not take such social costs into account. As a result, resource allocation in the
market may be inefficient even when it is perfectly competitive. This is called
market failure.
What is important in the above definition is that market prices do not
take the effect into account. Even when an activity directly affects other economic
agents, it is not called an externality when it is priced and traded in markets,
as a service is.
26.1 Market failure
To illustrate, I will focus on the case in which one's consumption activity causes an externality on
others. Externalities between firms, and between firms and consumers, can
be dealt with in a similar way.
Also for illustration purposes let me assume the quasi-linear environment,
in which Good 1 is the consumption good which causes the externality and Good 2 is
income transfer to be spent on the other goods. Each consumer i cares not only
about his consumption of Good 1, $x_{i1}$, and his income transfer $x_{i2}$, but also about the others'
consumption of Good 1, $x_{-i1} = (x_{j1})_{j \neq i}$. His preference is represented in the form
$$u_i(x_{i1}, x_{-i1}, x_{i2}) = v_i(x_{i1}) + e_i\Big(\sum_{j \neq i} x_{j1}\Big) + x_{i2},$$
where $v_i(x_{i1})$ is the benefit from his own consumption of Good 1
and $e_i\big(\sum_{j \neq i} x_{j1}\big)$ is the benefit (or cost if it is negative) caused by the others'
consumption of Good 1. Let me call the first one internal benefit and the
latter external benefit (or external cost if negative).
Remark 26.1 One might think that a situation in which externality matters is
typically a situation in which the income effect does matter. True, but here I like
to extract the problem of externality alone, and the assumption of no income
effect is just for the convenience of doing it.
Denote the relative price of Good 1 for Good 2 (income transfer) by $p$. Then
each consumer i solves
$$\max_{x_{i1}} \; v_i(x_{i1}) + e_i\Big(\sum_{j \neq i} x_{j1}\Big) - p x_{i1}$$
Note that in this individual choice the others' consumptions $(x_{j1})_{j \neq i}$ are taken as
given. That is, the above maximization problem is equivalent to the maximization problem in which only the internal benefit is taken into account:
$$\max_{x_{i1}} \; v_i(x_{i1}) - p x_{i1}$$
Therefore the condition for optimal consumption, in which only the internal benefit
is taken into account, is
$$v_i'(x_{i1}) = p$$
By solving this we obtain i's demand $x_{i1}(p)$.
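On the other hand, the representative firm solves its profit maximization
problem
$$\max_{y} \; py - C(y)$$
The optimality condition is $p = MC(y)$, and by solving it we obtain the supply $y(p)$. The competitive equilibrium price $p^*$ is determined by the market-clearing condition
$$\sum_{i=1}^{n} x_{i1}(p^*) = y(p^*)$$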
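Let $(x^*_{11}, \dots, x^*_{n1})$ denote the allocation of Good 1 in the competitive equilibrium.
Then, since the internal marginal benefit of each individual and the marginal cost are
all equal, that is, since
$$p^* = v_i'(x^*_{i1}) = MC\Big(\sum_{j=1}^{n} x^*_{j1}\Big)$$
holds for all $i = 1, \dots, n$, the allocation maximizes the difference between the sum
of internal benefits and the production cost,
$$\sum_{i=1}^{n} v_i(x_{i1}) - C\Big(\sum_{i=1}^{n} x_{i1}\Big)$$
The social surplus, however, includes the external benefits as well:
$$\sum_{i=1}^{n} \Big\{ v_i(x_{i1}) + e_i\Big(\sum_{j \neq i} x_{j1}\Big) \Big\} - C\Big(\sum_{i=1}^{n} x_{i1}\Big)$$
An allocation $(\hat{x}_{11}, \dots, \hat{x}_{n1})$ which maximizes the social surplus satisfies, for each i,
$$v_i'(\hat{x}_{i1}) + \sum_{j \neq i} e_j'\Big(\sum_{k \neq j} \hat{x}_{k1}\Big) = MC\Big(\sum_{j=1}^{n} \hat{x}_{j1}\Big)$$
which in general differs from the condition satisfied in the competitive equilibrium.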
26.2 Solutions
The market failure as illustrated above suggests that some intervention is necessary. To illustrate, let me further simplify the setting as below.
There are two consumers, A and B. A likes consuming Good 1, but B
dislikes it. Also, B dislikes that A consumes Good 1, while A does not care about this.
Their preferences are represented by
$$v_A(x_{A1}) + x_{A2}$$
$$-l_B(x_{A1}) + x_{B2}$$
Here $v_A$ denotes A's internal benefit from his own consumption of Good 1, and $l_B$ denotes B's loss from
A's consumption of Good 1, where $l_B' \geq 0$ and $l_B''(x_{A1}) > 0$.
First let us look at the outcome of the competitive market. Since B never consumes Good 1 we only look at A's consumption. A's individually optimal consumption solves
$$\max_{x_{A1}} \; v_A(x_{A1}) - p x_{A1}$$
26.2.1 Rationing
The first thing we can think of as the government's policy is to maximize the social
surplus
$$\max_{x_{A1}} \; \{ v_A(x_{A1}) - l_B(x_{A1}) - C(x_{A1}) \}$$
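26.2.2 Pigovian tax
Let $\hat{x}_{A1}$ denote the level of A's consumption which maximizes the social surplus. Suppose the government imposes on A a per-unit tax equal to the marginal external cost at that level,
$$t = l_B'(\hat{x}_{A1})$$
Then A's optimization condition becomes
$$v_A'(x_{A1}) = p + t$$
and, since $p = MC(x_{A1})$ holds in equilibrium,
$$v_A'(x_{A1}) - MC(x_{A1}) = l_B'(\hat{x}_{A1})$$
which is satisfied at $x_{A1} = \hat{x}_{A1}$.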
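To see the logic of the Pigovian tax with numbers, here is a small sketch. The functional forms are my own illustrative assumptions, not taken from the text: $v_A(x) = 10x - x^2/2$, $l_B(x) = x^2/2$ and $C(x) = 2x$. The helper bisect is likewise only a hypothetical convenience.

```python
def bisect(f, lo, hi, tol=1e-10):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical functional forms (assumptions for illustration only).
def marginal_benefit_A(x):   # v_A'(x) for v_A(x) = 10x - x^2/2
    return 10 - x


def marginal_loss_B(x):      # l_B'(x) for l_B(x) = x^2/2
    return x


def marginal_cost(x):        # MC(x) for C(x) = 2x
    return 2.0


# Competitive market: v_A'(x) = p = MC(x), the external cost is ignored.
x_market = bisect(lambda x: marginal_benefit_A(x) - marginal_cost(x), 0, 10)

# Surplus maximization: v_A'(x) - l_B'(x) - MC(x) = 0.
x_efficient = bisect(
    lambda x: marginal_benefit_A(x) - marginal_loss_B(x) - marginal_cost(x), 0, 10
)

# Pigovian tax: the marginal external cost at the efficient level.
tax = marginal_loss_B(x_efficient)

# Market outcome under the tax: v_A'(x) = p + t with p = MC(x).
x_taxed = bisect(lambda x: marginal_benefit_A(x) - marginal_cost(x) - tax, 0, 10)

print(round(x_market, 6), round(x_efficient, 6), round(tax, 6), round(x_taxed, 6))
```

Under these assumed forms the untaxed market consumption is twice the efficient level, and the tax brings the market outcome back to the efficient level.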
26.2.3
Another idea is to create a right to consume the good, or a right to stop the
consumption, and allow people to trade it.
Case 1: When B is given the right
First, let us consider the case that the government creates a right to consume
the good and gives it to B. Then A can consume one unit of the good only
by buying one unit of the right and exercising it. A does not have to exercise
the right which he has bought from B, but here it is innocuous to assume he
does. Let $r$ denote the price of one unit of the right and $p$ denote the price of
the good.
Let us look at how many units of the right A demands. A's demand for the
right $z_A$ is determined by solving
$$\max_{z_A} \; v_A(z_A) - r z_A - p z_A$$
where $r z_A$ is the cost of purchasing the right and $p z_A$ is the cost of exercising
the right. Then the optimization condition is given by
$$v_A'(z_A) = p + r$$
By solving this we obtain A's demand for the right $z_A(p, r)$, which is also A's
demand for the good in the output market.
On the other hand, B's supply of the right $z_B$ is determined by solving
$$\max_{z_B} \; -l_B(z_B) + r z_B$$
where $r z_B$ is the revenue from selling the right. Then the optimization condition
is given by
$$l_B'(z_B) = r$$
By solving this we obtain B's supply of the right $z_B(r)$.
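The representative firm as before solves
$$\max_{y} \; py - C(y)$$
In the equilibrium $(p^*, r^*)$ the market for the right and the market for the good clear:
$$z_A(p^*, r^*) = z_B(r^*), \qquad z_A(p^*, r^*) = y(p^*)$$
Let $z^*$ denote the amount of the right traded in the equilibrium, which is also
the amount of the good traded. Then it satisfies
$$v_A'(z^*) = p^* + r^*, \qquad l_B'(z^*) = r^*, \qquad MC(z^*) = p^*$$
implying
$$v_A'(z^*) - l_B'(z^*) - MC(z^*) = 0$$
Now we see that $z^*$ is the efficient level. Here A pays $r^* z^*$ units of income to B.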
where $w s_A$ is the revenue from selling the right and $p(x - s_A)$ is the cost of consumption. Then the optimization condition is given by
$$v_A'(x - s_A) = w + p$$
By solving this we obtain A's supply of the right $s_A(p, w)$ and A's demand for
the good $x - s_A(p, w)$.
On the other hand, B's demand for the right $s_B$ is determined by solving
$$\max_{s_B} \; -l_B(x - s_B) - w s_B$$
where $w s_B$ is the cost of purchasing the right to stop A's consumption of the
good. Then the optimization condition is
$$l_B'(x - s_B) = w$$
In the equilibrium $(p^*, w^*)$ the market for the right and the market for the good clear:
$$s_A(p^*, w^*) = s_B(w^*), \qquad x - s_A(p^*, w^*) = y(p^*)$$
Denote the amount of the right traded in the equilibrium by $s^*$. Then it satisfies
$$v_A'(x - s^*) = p^* + w^*, \qquad l_B'(x - s^*) = w^*, \qquad MC(x - s^*) = p^*$$
This implies
$$v_A'(x - s^*) - l_B'(x - s^*) - MC(x - s^*) = 0$$
Chapter 27
Public goods
A good with the following two properties is called a public good:
1. Non-rivalry: One's using it does not prevent anybody else from using it. For
example, one's using a road does not prevent anybody else from using it (here
we abstract away the case of congestion). On the other hand, when one
uses or consumes a private good in the standard sense nobody else can
use or consume the identical object.
2. Non-excludability: Everybody can use it and we cannot exclude any
person from using it. For example, everybody can use an open road and
nobody can be excluded for the reason that he is not paying for it. On the
other hand, a toll road meets the condition of non-rivalry but fails to meet
non-excludability, since it has a toll gate so that one who does not pay for
it cannot enter.
Some adopt the definition with non-rivalry only, but in this chapter I focus on
public goods with both non-rivalry and non-excludability.
27.2
To illustrate, suppose that there are one private good and one public good. Denote
the quantity of the public good by $g$, and consumer i's private consumption by
$x_i$. Then consumer i's consumption space is the non-negative quadrant of the
two-dimensional space $\mathbb{R}^2_+$, in which the public good is taken to be Good 1 and
the private good is taken to be Good 2. Let $(g, x_i)$ denote a combination of the
public good and the private good for consumer i.
Let $\succsim_i$ denote i's preference over pairs of the public good and the private
good. For example, when $(g, x_i) \succsim_i (g', x_i')$ holds it means that i weakly
prefers $(g, x_i)$ to $(g', x_i')$. Let $u_i(g, x_i)$ denote a utility representation of i's
preference $\succsim_i$.
Here, let us consider how much of the private good each consumer is willing
to sacrifice in order to increase the public good by one extra unit. It is given by
the slope of a given indifference curve, where Good 1 is taken to be the public
good and Good 2 is taken to be the private good.
Given that the current combination is $(g, x_i)$, let $\Delta x_i$ denote the change in the amount of the
private good which consumer i is willing to accept in order to increase
the public good by $\Delta g$ units. Then the amount of the private good
he is willing to give up in order to increase the public good by one extra unit is
approximately
$$\left| \frac{\Delta x_i}{\Delta g} \right|$$
where the absolute value is taken because $\Delta x_i$ is negative. As we make
$\Delta g$ tend to zero, we obtain the marginal rate of substitution of the private good for
the public good at $(g, x_i)$,
$$MRS_i(g, x_i) = \left| \frac{dx_i}{dg} \right|$$
which corresponds to the slope of the tangent line to the given indifference curve
at $(g, x_i)$.
As we did for the case of two private goods, the marginal rate of substitution is
described as the ratio between marginal utilities (it is the same as before that
marginal utilities themselves have no economic content). Let $\frac{\partial u_i(g, x_i)}{\partial g}$ denote the
marginal utility of the public good at $(g, x_i)$ and let $\frac{\partial u_i(g, x_i)}{\partial x_i}$ denote the marginal
utility of the private good at $(g, x_i)$. Then the marginal rate of substitution of
the private good for the public good is given by
$$MRS_i(g, x_i) = \frac{\partial u_i(g, x_i) / \partial g}{\partial u_i(g, x_i) / \partial x_i}$$
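An allocation $(g, x_1, \dots, x_n)$ is feasible if
$$\sum_{i=1}^{n} x_i + C(g) = \sum_{i=1}^{n} e_i$$
where $e_i$ denotes i's initial endowment of the private good and $C(g)$ denotes the cost of providing $g$ units of the public good in terms of the private good.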
Let us consider Pareto-efficient allocation of the public good and the private
good. First let us define efficiency. An allocation $(g', x_1', \dots, x_n')$ is said to be a
Pareto improvement of an allocation $(g, x_1, \dots, x_n)$ if
$(g', x_i') \succsim_i (g, x_i)$ holds for all i and
$(g', x_i') \succ_i (g, x_i)$ holds for at least one i.
A feasible allocation $(g, x_1, \dots, x_n)$ is said to be Pareto efficient if no feasible
allocation can be a Pareto improvement of it.
The following result states that efficiency requires that the sum of willingness to
sacrifice the private good across individuals be equal to the marginal cost.
Theorem 27.1 An (interior) allocation $(g, x_1, \dots, x_n)$ is Pareto efficient if and
only if
$$\sum_{i=1}^{n} MRS_i(g, x_i) = MC(g)$$
Proof. "Only if" part: Suppose the equality does not hold.
Suppose $\sum_{i=1}^{n} MRS_i(g, x_i) > MC(g)$. This is the case that more public
good can be produced at a cost lower than what people are willing to sacrifice. Since the cost function is locally linear,
the cost of producing an extra $\Delta g$ units of the public good is $MC(g)\Delta g$. Let $\Delta x_i$
denote the amount of private good to be sacrificed by each i. Then we must have
$\sum_{i=1}^{n} \Delta x_i = MC(g)\Delta g$.
From the above inequality we can divide $MC(g)\Delta g$ so that $\Delta x_i < MRS_i(g, x_i)\Delta g$
holds for all i. Since the amount one needs to sacrifice is smaller than the amount
he is willing to sacrifice, each i prefers $(g + \Delta g, x_i - \Delta x_i)$ to $(g, x_i)$. Hence
$(g + \Delta g, x_1 - \Delta x_1, \dots, x_n - \Delta x_n)$ is a Pareto improvement of $(g, x_1, \dots, x_n)$.
Suppose $\sum_{i=1}^{n} MRS_i(g, x_i) < MC(g)$ on the other hand. This is the case
that the current level of public good is too costly. Hence by reducing the level
of the public good and suitably distributing the saved cost back to the
consumers we can make everybody better off.
Since the cost function is locally linear, the cost saved by reducing the public good by $\Delta g$
units is $MC(g)\Delta g$. Let $\Delta x_i$ denote the amount of private
good to be received by each i. Then we must have $\sum_{i=1}^{n} \Delta x_i = MC(g)\Delta g$.
From the above inequality we can divide $MC(g)\Delta g$ so that $\Delta x_i > MRS_i(g, x_i)\Delta g$
holds for all i. Since the amount one receives is more than the amount one demands as compensation for the reduction, each i prefers $(g - \Delta g, x_i + \Delta x_i)$ to
$(g, x_i)$. Hence $(g - \Delta g, x_1 + \Delta x_1, \dots, x_n + \Delta x_n)$ is a Pareto improvement of
$(g, x_1, \dots, x_n)$.
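"If" part: Suppose $\sum_{i=1}^{n} MRS_i(g, x_i) = MC(g)$ holds and $(g, x_1, \dots, x_n)$ is not
Pareto efficient. Then there is an allocation
$$(g + \Delta g, x_1 + \Delta x_1, \dots, x_n + \Delta x_n)$$
with
$$\sum_{i=1}^{n} (x_i + \Delta x_i) + C(g + \Delta g) = \sum_{i=1}^{n} e_i$$
which is a Pareto improvement of it. Subtracting
$$\sum_{i=1}^{n} x_i + C(g) = \sum_{i=1}^{n} e_i$$
we obtain
$$\sum_{i=1}^{n} \Delta x_i + C(g + \Delta g) - C(g) = 0.$$
Since $C(g + \Delta g) - C(g) \geq MC(g)\Delta g$ (which holds when the cost function is convex), we have
$$\sum_{i=1}^{n} \Delta x_i + MC(g)\Delta g \leq 0.$$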
27.3
$$\sum_{i=1}^{n} x_i + C(g) = 0$$
In the quasi-linear environment Pareto efficiency is characterized as the maximization of social surplus. Note that it determines only the level of public good
and it says nothing about how the private good should be allocated.
That is, it is totally silent about who should pay how much.
Proposition 27.1 Allocation $(g, x_1, \dots, x_n)$ is Pareto efficient if and only if $g$
maximizes
$$\sum_{i=1}^{n} v_i(g) - C(g)$$
Proof.
"Only if" part: Suppose there exists $g'$ with $\sum_{i=1}^{n} v_i(g') - C(g') > \sum_{i=1}^{n} v_i(g) - C(g)$. Then, by taking $\sum_{i=1}^{n} x_i + C(g) = 0$ into account we obtain
$C(g') < \sum_{i=1}^{n} \big( v_i(g') - v_i(g) - x_i \big)$. In order to cover the cost $C(g')$ we need
to make each i contribute $-x_i'$ units of the private good, and they must satisfy
$C(g') = \sum_{i=1}^{n} (-x_i')$. From the above inequality it is possible to divide $C(g')$ so
that $-x_i' < v_i(g') - v_i(g) - x_i$ holds for all i. Thus we obtain $v_i(g') + x_i' > v_i(g) + x_i$
for all i, which is a Pareto improvement.
"If" part: Suppose $(g, x_1, \dots, x_n)$ is not Pareto efficient. Then there exists a
feasible allocation $(g', x_1', \dots, x_n')$ such that $v_i(g') + x_i' \geq v_i(g) + x_i$ for all i and
$v_i(g') + x_i' > v_i(g) + x_i$ for at least one i. Then by summing up the inequalities
we obtain
$$\sum_{i=1}^{n} \{ v_i(g') + x_i' \} > \sum_{i=1}^{n} \{ v_i(g) + x_i \}$$
By feasibility, $C(g) = -\sum_{i=1}^{n} x_i$ and $C(g') = -\sum_{i=1}^{n} x_i'$. Hence
$$\sum_{i=1}^{n} v_i(g') - C(g') > \sum_{i=1}^{n} v_i(g) - C(g).$$
27.3.1 Continuous case
Suppose the amount of public good is continuous and preference/cost are smooth.
Then the marginal rate of substitution reduces to the marginal willingness to pay, so
that we have $MRS_i(g, x_i) = v_i'(g)$. Thus the Samuelson condition reduces to the
following.
Theorem 27.2 In quasi-linear environments $(g, x_1, \dots, x_n)$ is Pareto efficient
if and only if
$$\sum_{i=1}^{n} v_i'(g) = MC(g)$$
holds.
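As a numerical illustration of this condition, here is a small sketch. The functional forms are my own assumptions, not taken from the text: $v_i(g) = a_i \log(1 + g)$ with taste parameters $a = (1, 2, 3)$, and $C(g) = g$.

```python
# Hypothetical taste parameters: v_i(g) = a_i * log(1 + g), so v_i'(g) = a_i / (1 + g).
A = [1.0, 2.0, 3.0]


def sum_marginal_wtp(g):
    """Sum over individuals of the marginal willingness to pay v_i'(g)."""
    return sum(a_i / (1 + g) for a_i in A)


def marginal_cost(g):
    """MC(g) for the assumed cost function C(g) = g."""
    return 1.0


def solve_samuelson(lo=0.0, hi=100.0, tol=1e-10):
    """Bisection on sum_i v_i'(g) - MC(g), which is decreasing in g here."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum_marginal_wtp(mid) > marginal_cost(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


g_star = solve_samuelson()
print(round(g_star, 6))
```

Under these assumptions the condition reads $(a_1 + a_2 + a_3)/(1 + g) = 1$, so the efficient level is the sum of the taste parameters minus one. An individual equating only his own $v_i'(g)$ to the marginal cost would choose a smaller level.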
27.3.2 Discrete case
Provision of a public good is quite often discrete, and quite often the choice is just whether to provide it
or not.
For simplicity, assume $g$ is either 0 or 1. For the private good still assume
that its quantity is continuous. Again $(g, x_i)$ denotes a combination of the
public good and the private good for i. Maintain the assumption of quasi-linear
preference, which in this case reduces to the representation
$$u_i(g, x_i) = v_i g + x_i$$
When $g = 1$ it is $u_i(1, x_i) = v_i + x_i$, and when $g = 0$ it is $u_i(0, x_i) = x_i$. That
is, $v_i$ is i's willingness to pay for the public good.
Let $C$ be the cost of the public good. Then a feasible allocation $(g, x_1, \dots, x_n)$ must satisfy
$$\sum_{i=1}^{n} x_i + Cg = 0$$
Efficiency requires
$$\sum_{i=1}^{n} v_i \geq C \implies g = 1$$
$$\sum_{i=1}^{n} v_i \leq C \implies g = 0$$
That is, when the sum of willingness to pay exceeds the cost it is efficient to
provide the public good, and otherwise not. Again, this condition is totally silent
about who should pay how much.
27.4
So far we have put aside the problem of who should pay how much. Now let's
think about it. The problem is very hard, since it is likely that each individual
tries to keep his contribution as low as possible, hoping that the others will
contribute; as everybody does so, nothing is contributed as a result.
This is called the free-rider problem. As is shown in the next section, it is known
that efficiency of allocation has to be given up partially in order to resolve the
problem.
Here let me explain two examples in order to make the point clear.
Example 27.1 Towns A and B are planning to build a library jointly. Once
a library is built anybody in either town can use it. If one of the two towns
contributes 5 million a small library is built and each town's benefit evaluated
in terms of income is 3 million. If both towns contribute 5 million each,
a large library is built and each town's benefit is 7 million.
If just one of the two towns contributes and a small library is built the social
surplus is $3 \times 2 - 5 = 1$. If both towns contribute and a large library is built
the social surplus is $7 \times 2 - 5 \times 2 = 4$. Hence the surplus-maximizing solution
is that both towns contribute.
Is this implementable? Let's say that each town decides independently whether to contribute (C) or not (N). Then the payoff matrix is
                 B
             C        N
A    C     2, 2    -2, 3
     N     3, -2    0, 0
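Whether mutual contribution can be sustained can be checked by enumerating the pure-strategy Nash equilibria of this matrix. The following is just a sketch; the names PAYOFF and pure_nash_equilibria are my own, and the payoffs are read as (A's payoff, B's payoff) with A choosing the row.

```python
# Payoffs (A's payoff, B's payoff); "C" = contribute, "N" = do not contribute.
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "N"): (-2, 3),
    ("N", "C"): (3, -2),
    ("N", "N"): (0, 0),
}
ACTIONS = ["C", "N"]


def pure_nash_equilibria(payoff, actions):
    """Return all action pairs from which neither town gains by deviating alone."""
    equilibria = []
    for a in actions:
        for b in actions:
            a_is_best = all(payoff[(a, b)][0] >= payoff[(a2, b)][0] for a2 in actions)
            b_is_best = all(payoff[(a, b)][1] >= payoff[(a, b2)][1] for b2 in actions)
            if a_is_best and b_is_best:
                equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFF, ACTIONS))
```

The only pair that survives is the one in which neither town contributes, although both would be better off if both contributed.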
A pays $\frac{v_A}{v_A + v_B}$ and B pays $\frac{v_B}{v_A + v_B}$.
Can it be a Nash equilibrium in this mechanism that each reports his true
willingness to pay? The answer is NO.
Let's say A's willingness to pay is 1.2 and B's is 0.4. Then, if A is reporting
$v_A = 1.2$ the best thing for B is to report $v_B = 0$, so that A pays the whole cost
and B pays nothing. Also, if B is reporting $v_B = 0.4$ the best thing for A is to
report $v_A = 0.6$, so that he needs to pay only the minimal necessary amount in
order to get the project undertaken. Thus they do not report their true willingness to pay in
Nash equilibrium.
There are many Nash equilibria in this game: a pair of reports $(v_A, v_B)$ is
a Nash equilibrium when
$$v_A + v_B = 1, \quad v_A \geq 0.6, \quad v_B \leq 0.4.$$
The second example shows that unless you design a mechanism nicely it may
be manipulated by misreporting preferences. Now, how can we let people report
their true preferences by their choices?
27.5 Strategy-proof mechanism
In order to resolve the free-rider problem we have to exclude the case that one
can gain from misreporting. Here let us consider a mechanism such that it is a
dominant strategy to report one's true preference/willingness to pay.
The requirement that it is always optimal for everybody to report truthfully
no matter what the others say might be too strong. Indeed it leads to an
impossibility result that we have to give up efficiency at least partially. I will
come to mechanisms with milder requirements in the last chapter.
Let me introduce the Vickrey-Clarke-Groves mechanism, which is a prominent
one among strategy-proof mechanisms.
Let $G$ denote the set of possible choices of public good provision. In the first
example, since the choice is whether to build a big one or a small one or nothing
we can write $G = \{0, 1, 2\}$, where 0 refers to nothing, 1 refers to small and 2 refers to
big. In the second example, since the choice is just whether to undertake the project or not, we
can write $G = \{0, 1\}$. When the public good provision is continuous we would
write $G = \mathbb{R}_+$. Given $g \in G$, denote its cost by $C(g)$.
We assume that individuals' preferences are quasi-linear, and each i's preference over pairs of the level of public good provision and the amount of income
transfer to him is represented in the form
$$u_i(g, x_i) = v_i(g) + x_i$$
Here $v_i(g)$ is i's willingness to pay for the level of public good provision $g$, which
is private information.
Given this, the VCG mechanism (actually its special case called the pivotal
mechanism) is defined as follows.
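1. Given the reports $v = (v_1, \dots, v_n)$, the level of public good provision $g(v)$ is chosen so as to maximize
$$\sum_{i=1}^{n} v_i(g) - C(g)$$
over $g \in G$.
2. Each individual i pays
$$t_i(v) = \frac{1}{n} C(g(v)) + \max_{g \in G} \Big\{ \sum_{j \neq i} v_j(g) - \frac{n-1}{n} C(g) \Big\} - \Big\{ \sum_{j \neq i} v_j(g(v)) - \frac{n-1}{n} C(g(v)) \Big\}$$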
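The term $\frac{1}{n} C(g(v))$ is easy to understand: it is the equal division of the cost.
How should the term
$$\max_{g \in G} \Big\{ \sum_{j \neq i} v_j(g) - \frac{n-1}{n} C(g) \Big\} - \Big\{ \sum_{j \neq i} v_j(g(v)) - \frac{n-1}{n} C(g(v)) \Big\}$$
be understood? Here
$$\max_{g \in G} \Big\{ \sum_{j \neq i} v_j(g) - \frac{n-1}{n} C(g) \Big\}$$
is the maximal surplus attainable for those other than i, and
$$\sum_{j \neq i} v_j(g(v)) - \frac{n-1}{n} C(g(v))$$
is the actual surplus for those other than i. Thus the difference between them
is viewed as the loss for those other than i which is caused by i's report. In this
mechanism i is required to pay this amount in addition to the equal division of
the direct cost. This is called the Clarke tax.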
Note that this Clarke tax is not paid to other individuals in the society but
to a third party outside of the society. In other words, you have to
throw that amount of income into the garbage!
When we take the sum of payments we have
$$\sum_{i=1}^{n} t_i(v) \geq C(g(v))$$
which does not in general hold with equality, and the society has to throw
$\sum_{i=1}^{n} t_i(v) - C(g(v))$ units of income to the outside. It is in a sense the social
cost of acquiring the right information.
Can we avoid such cost? The answer is known to be NO. See for example
the corresponding chapter in Mas-Colell, Whinston and Green [21].
Let us solve for the Clarke tax in the second example. As before, let
$$(v_A(0), v_A(1)) = (0, 1.2), \quad (v_B(0), v_B(1)) = (0, 0.4)$$
be the true willingness to pay by A and B, respectively.
1. Since $v_A(1) + v_B(1) - C(1) = 1.2 + 0.4 - 1 > 0$ the public good provision
is $g(v_A, v_B) = 1$.
2. The payments are
$$t_A(v_A, v_B) = \frac{1}{2} \cdot 1 + \max\Big\{ 0.4 - \frac{1}{2} \cdot 1, \; 0 \Big\} - \Big( 0.4 - \frac{1}{2} \cdot 1 \Big) = 0.6$$
$$t_B(v_A, v_B) = \frac{1}{2} \cdot 1 + \max\Big\{ 1.2 - \frac{1}{2} \cdot 1, \; 0 \Big\} - \Big( 1.2 - \frac{1}{2} \cdot 1 \Big) = 0.5$$
We see that the sum of payments exceeds 1 and A is paying 0.1 units of Clarke
tax to a third party.
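The payments in this example can be recomputed mechanically. The following is only a sketch of the pivotal mechanism for a finite set of provision levels; the function name pivotal_mechanism and the data layout (one dictionary of reported valuations per individual) are my own assumptions, and B's willingness to pay is taken to be 0.4, the value used in the payment formulas.

```python
from fractions import Fraction as F


def pivotal_mechanism(reports, cost, choices):
    """reports[i][g]: i's reported willingness to pay for level g; cost[g]: cost of g."""
    n = len(reports)

    def surplus(g):
        return sum(v[g] for v in reports) - cost[g]

    # Provision level maximizing the reported social surplus.
    g_star = max(choices, key=surplus)

    payments = []
    for i in range(n):
        def others(g, i=i):
            # Surplus of everybody but i when they bear the share (n-1)/n of the cost.
            return sum(v[g] for j, v in enumerate(reports) if j != i) - F(n - 1, n) * cost[g]

        # Loss imposed on the others by i's report.
        clarke_tax = max(others(g) for g in choices) - others(g_star)
        payments.append(cost[g_star] / n + clarke_tax)
    return g_star, payments


reports = [
    {0: F(0), 1: F("1.2")},  # A
    {0: F(0), 1: F("0.4")},  # B
]
cost = {0: F(0), 1: F(1)}
g, payments = pivotal_mechanism(reports, cost, [0, 1])
print(g, [float(t) for t in payments])
```

Exact fractions are used so that the comparison of surpluses is not affected by rounding; the resulting payments can be compared with $t_A$ and $t_B$ above.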
27.6 Exercises
Chapter 28
Indivisibility and heterogeneity
This chapter covers allocation of indivisible and heterogeneous objects.
If there is a divisible and homogeneous good besides the indivisible and
heterogeneous objects, we can use it as the means of payment and we can use
some kind of auction mechanism. Here I'm talking about situations in which
there is no such good, or in which even if it exists we are not allowed to pay with it
for ethical or institutional reasons.
Let me give you one example, which sounds extreme but is already quite real.
Suppose there are two patients A and B who need kidney transplantations. A
has a brother who is willing to donate one of his kidneys but unfortunately his
blood type does not match A's. B has a sister who is willing to donate one
of her kidneys but unfortunately her blood type does not match B's. So
neither A nor B can get a transplantation as things are.
However, what if A's donor's blood type matches B's and B's donor's
blood type matches A's? If A and B swap their donors both can have kidney
transplantations.
Here using money or goods for payment is not allowed ethically or institutionally. Of course it is not ethically obvious either whether we should allow swapping
donors. One may say, "What's wrong with this, as they both gain and hurt
nobody else?" Another may say, "Even if both gain and hurt nobody else, is
that all fine?" It is an open question, but in any case it is admitted in some
states in the US, and put in practice.
Anyhow, here I will argue how to obtain a nice allocation when such kind of
exchange is allowed.
28.1
Organ transplantation is a somewhat extreme example, but allocation of indivisible objects arises in many places in our life, such as which house to live in, which
organization to work in, which position to take in a given organization, and so on.
Here let us restrict attention to a model in which each individual gets
exactly one unit.
Example 28.1 There are 6 individuals. Each i initially holds an object denoted
by $e_i$. Each individual has a preference over the objects $e_1, e_2, \dots, e_6$, and they
are given as follows (each column lists the objects from best to worst).
Individual    1     2     3     4     5     6
1st          e3    e3    e5    e5    e1    e1
2nd          e5    e4    e4    e2    e3    e2
3rd          e2    e1    e1    e4    e2    e3
4th          e4    e5    e6    e6    e5    e6
5th          e1    e6    e2    e3    e4    e5
6th          e6    e2    e3    e1    e6    e4
Step k: Each remaining individual points to his best object among the remaining ones. When the arrows of pointing form a cycle, the individuals
forming the cycle trade according to the arrow directions and leave with
the objects.
Repeat the above procedure until all the individuals leave. It ends after
finitely many rounds.
Let us do it using the above example.
Step 1: When each individual points to his best object we have
1 → e3, 2 → e3, 3 → e5, 4 → e5, 5 → e1, 6 → e1
There is one cycle 1 → 3 → 5 → 1, hence 1 receives e3, 3 receives e5, 5
receives e1 and they leave.
Step 2: After Step 1, three individuals 2, 4, 6 are remaining. When each of
them points to his best object among the remaining ones we have
2 → e4, 4 → e2, 6 → e2
There is one cycle 2 → 4 → 2, hence 2 receives e4, 4 receives e2 and they
leave.
Step 3: Only 6 remains after Step 2. Then he points to his own object and we have
a self-cycle
6 → e6
Thus 6 receives e6 and leaves.
END: The outcome is
(e3, e4, e5, e2, e1, e6)
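The procedure above can be sketched in a few lines of code. This is only an illustration: the function name top_trading_cycles is my own, object $e_k$ is identified with the index $k$ of its initial owner, and for simplicity the sketch removes one cycle per round rather than all cycles at once, which does not change the final outcome.

```python
def top_trading_cycles(prefs):
    """prefs[i] lists objects from best to worst; individual i initially owns object i."""
    remaining = set(prefs)
    assignment = {}
    while remaining:
        # Each remaining individual points to his best object among the remaining ones.
        points_to = {i: next(o for o in prefs[i] if o in remaining) for i in remaining}
        # Follow the arrows until somebody is visited twice; that closes a cycle.
        path = []
        current = min(remaining)
        while current not in path:
            path.append(current)
            current = points_to[current]
        cycle = path[path.index(current):]
        # The individuals in the cycle trade along the arrows and leave.
        for i in cycle:
            assignment[i] = points_to[i]
        remaining -= set(cycle)
    return assignment


# Preferences of Example 28.1 (object e_k is written as k).
PREFS = {
    1: [3, 5, 2, 4, 1, 6],
    2: [3, 4, 1, 5, 6, 2],
    3: [5, 4, 1, 6, 2, 3],
    4: [5, 2, 4, 6, 3, 1],
    5: [1, 3, 2, 5, 4, 6],
    6: [1, 2, 3, 6, 5, 4],
}
print(top_trading_cycles(PREFS))
```

Run on the preferences of Example 28.1, the sketch returns the same outcome as the hand computation above.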
There is another nice property of the core allocation here. Consider a mechanism
in which each individual submits his preference and the above algorithm is run
based on the submitted preferences. It is known that it is always a dominant
strategy for every individual to submit his true preference in this mechanism.
In other words, it is impossible to gain by telling a lie in this mechanism. Since
each individual tries for his best remaining object according to his submitted preference, going
down his list as the algorithm proceeds, there is no point in trying for a less preferable
one first by telling a lie.
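28.2 Matching
Men's preferences over women (each column from best to worst):
        m2    m3    m4
1st     w2    w4    w2
2nd     w1    w2    w3
3rd     w3    w1    w1
4th     w4    w3    w4
Women's preferences over men (each column from best to worst):
        w1    w2    w3    w4
1st     m4    m1    m2    m2
2nd     m2    m3    m1    m3
3rd     m3    m2    m4    m1
4th     m1    m4    m3    m4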
run away from the current matching arrangement. Here we say that m1 and w4
block the current matching. Likewise, m4 and w1 block the current matching.
A matching which is blocked will not last long. Moreover, even though it does not
last long, we have to go through a costly and painful process in order to
dissolve a relationship once it is formed, so it is better to avoid such an unfortunate
situation beforehand. Thus we consider the following concept.
Definition 28.1 A matching is said to be stable if there is no blocking pair.
A stable matching exists. It is not uniquely determined, however. The set of
stable matchings is segment-like, though, and its two extremes are the one most favorable for men, which all men unanimously prefer, and
the one most favorable for women, which all women unanimously prefer. These two are called the men-optimal matching and the women-optimal
matching, respectively.
Each of the two extreme stable matchings is found by the deferred-acceptance algorithm. There are two ways to run this algorithm, one in
which men propose and the other in which women propose. The men-proposing
version obtains the men-optimal stable matching and the women-proposing version obtains the women-optimal stable matching.
Let me first explain the men-proposing version.
Step 1: (a) Each man proposes to his best woman.
(b) Each woman picks the best man among those who propose to her (if
any), keeps him and rejects all the other proposers.
Step 2: (a) Each man who got rejected in the previous step proposes to his
second best woman.
(b) Each woman picks the best man among those who propose to her (if
any) and the man she has kept from the previous step (if any), keeps him
and rejects all the others.
Step k: (a) Each man who got rejected in the previous step proposes to his
best woman among those to whom he has not yet proposed.
(b) Each woman picks the best man among those who propose to her (if
any) and the man she has kept from the previous step (if any), keeps him
and rejects all the others.
Repeat this and stop when no man is rejected, that is, when there are no longer men competing for the same woman.
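The men-proposing steps above can be sketched as follows. The names `deferred_acceptance` and `blocking_pairs`, and the three-by-three preference lists, are hypothetical and chosen here only for illustration; the sketch assumes equally many men and women, each ranking everybody on the other side.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposer-proposing deferred acceptance; returns {proposer: receiver}."""
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # index of next receiver to try
    held = {}                                      # receiver -> proposer kept so far
    rejected = list(proposer_prefs)                # everybody proposes in Step 1
    while rejected:
        # (a) Each rejected proposer proposes to his best untried receiver.
        proposals = {}
        for p in rejected:
            if next_choice[p] < len(proposer_prefs[p]):
                r = proposer_prefs[p][next_choice[p]]
                next_choice[p] += 1
                proposals.setdefault(r, []).append(p)
        # (b) Each receiver keeps the best among new proposers and the one she holds.
        rejected = []
        for r, suitors in proposals.items():
            candidates = suitors + ([held[r]] if r in held else [])
            best = min(candidates, key=lambda p: rank[r][p])
            rejected.extend(p for p in candidates if p != best)
            held[r] = best
    return {p: r for r, p in held.items()}


def blocking_pairs(matching, proposer_prefs, receiver_prefs):
    """List pairs (p, r) who both prefer each other to their current partners."""
    partner = {r: p for p, r in matching.items()}
    pairs = []
    for p, prefs in proposer_prefs.items():
        for r in prefs[:prefs.index(matching[p])]:   # receivers p likes better
            if receiver_prefs[r].index(p) < receiver_prefs[r].index(partner[r]):
                pairs.append((p, r))
    return pairs


# Hypothetical preferences, best first.
men = {"m1": ["w1", "w2", "w3"], "m2": ["w1", "w3", "w2"], "m3": ["w2", "w1", "w3"]}
women = {"w1": ["m2", "m1", "m3"], "w2": ["m1", "m3", "m2"], "w3": ["m1", "m2", "m3"]}

matching = deferred_acceptance(men, women)
print(matching, blocking_pairs(matching, men, women))
```

Swapping the two arguments, `deferred_acceptance(women, men)`, gives the women-proposing version.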
28.3 Exercises
2: e6, e3, e1, e5, e2, e4
3: e4, e3, e2, e5, e6, e1
4: e6, e5, e1, e4, e3, e2
5: e2, e5, e1, e4, e6, e3
6: e5, e3, e6, e1, e4, e2

w2: f1, f3, f2, f4
w3: f1, f2, f4, f3
w4: f2, f3, f4, f1

f1: w1, w2, w3, w4
f2: w3, w4, w1, w2
f3: w1, w4, w2, w3
f4: w3, w1, w4, w2
(ii) Find the firm-optimal stable matching by using the firm-proposing deferred-acceptance algorithm.
Exercise 41 There are 6 students and 3 schools. Students are denoted by
i1, i2, i3, i4, i5, i6, and schools are denoted by s1, s2, s3. Each student can enroll
in at most one school, and each school has 2 seats.
Students' preferences over schools and schools' priority rankings over students are as follows.
i1: s3, s2, s1
i2: s3, s2, s1
i3: s2, s3, s1
i4: s2, s1, s3
i5: s2, s3, s1
i6: s3, s1, s2

s1: i3, i1, i6, i5, i4, i2
s2: i2, i1, i4, i5, i3, i6
s3: i3, i4, i5, i6, i1, i2
(i) Run the so-called Boston mechanism, assuming that each student submits
his preference truthfully. The Boston mechanism works as follows: First, each
student applies to his first-best school, and each school admits students according to its priority ranking as far as its capacity allows. Second, each student
rejected in the previous step applies to his second-best school, but if the seats
there have already been filled in the previous step he is automatically rejected. Repeat
this.
Show that in this mechanism students can gain by misreporting their preferences.
(ii) Run the student-proposing deferred-acceptance algorithm.
Chapter 29
Efficiency, welfare comparison and fairness
I have used Pareto efficiency as a criterion for welfare judgment in many places
in this book, but I also emphasized that the Pareto principle alone is silent about
whether a change in economic activity is desirable when it does not improve all
individuals' welfare, and about who should gain and who should lose, and how
much. I also emphasized that Pareto efficiency has nothing to do with the notion
of fairness in any sense.
Thus I would like to spend one chapter on what are often referred to as welfare
criteria, which can say that some change is better even when it is not a Pareto improvement, and on some discussion of fairness in resource allocation.
29.1
A change in economic activity does not always make all individuals better off.
Can we think of a criterion which supports a change even in such situations?
The so-called Kaldor criterion says that a change should be accepted if,
by reallocating the allocation obtained by the change, we can make everybody
better off than in the allocation before the change. It basically says that making the
pie larger is better. Formally it says
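Definition 29.1 An allocation $y = (y_1, \dots, y_n)$ is a Kaldor-improvement of
$x = (x_1, \dots, x_n)$ if there exists an allocation $y' = (y'_1, \dots, y'_n)$ with
$$\sum_{i=1}^n y'_{i1} = \sum_{i=1}^n y_{i1}, \qquad \sum_{i=1}^n y'_{i2} = \sum_{i=1}^n y_{i2}$$
such that $y'_i \succsim_i x_i$ for all $i$ and $y'_i \succ_i x_i$ for at least one $i$.
[Figure 29.1: Good 1 on the horizontal axis and Good 2 on the vertical axis, showing $x_A$, $x_B$, $y_A$, $y_B$, $y'_A$, $y'_B$]
vectors $y'_A - y_A$ and $y'_B - y_B$ are exactly opposite of each other.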
Or, one can explain this by using so-called utility possibility frontiers. Fix a
representation $u_A$ of A's preference and also a representation $u_B$ of B's preference.
Given a vector of aggregate resources available to the society $e = (e_1, e_2)$,
let
$$I(e) = \{(u_A(x_A), u_B(x_B)) : x_{A1} + x_{B1} = e_1,\ x_{A2} + x_{B2} = e_2\}$$
be the set of pairs of A's utility and B's utility which are obtained by allocating
$e$. Of course, this is only for describing trade-offs between A's gain and B's gain,
and utility numbers themselves have no quantitative meanings.
See Figure 29.2, in which two utility possibility frontiers are drawn, $I(e)$
obtained from $e$ and $I(e')$ obtained from $e'$. Then $y$ on $I(e')$ makes a Kaldor-improvement of $x$ on $I(e)$ since we can pick $y'$ on $I(e')$ which is in the upper-right
of $x$.
Let me state two problems of the Kaldor criterion. One is,
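[Figure 29.2: utility possibility frontiers $I(e)$ and $I(e')$, with A's utility on the horizontal axis and B's utility on the vertical axis, showing $x$, $y$, $y'$]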
reallocation to $(y'_A, y'_B)$, and $(x_A, x_B)$ is a Kaldor-improvement of $(y_A, y_B)$ through
the potential reallocation to $(x'_A, x'_B)$. Hence the Kaldor criterion cannot rank
properly between allocations in general.
One can explain this by using the utility possibility frontiers. See Figure
29.4, in which two utility possibility frontiers are drawn, $I(e)$ obtained from $e$
and $I(e')$ obtained from $e'$. Then $y$ on $I(e')$ makes a Kaldor-improvement of $x$
on $I(e)$ since we can pick $y'$ on $I(e')$ which is in the upper-right of $x$. However,
$x$ makes a Kaldor-improvement of $y$ as well, since we can pick $x'$ on $I(e)$ which
is in the upper-right of $y$.
Let me introduce a criterion which is the complement of the Kaldor criterion. The so-called Hicks criterion says that a change should be accepted if
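[Figure 29.3: Good 1 on the horizontal axis and Good 2 on the vertical axis, showing $x_A$, $x_B$, $x'_A$, $x'_B$, $y_A$, $y_B$, $y'_A$, $y'_B$]
[Figure 29.4: utility possibility frontiers $I(e)$ and $I(e')$, with A's utility on the horizontal axis and B's utility on the vertical axis, showing $x$, $x'$, $y$, $y'$]
[Figure 29.5: Hicks improvement]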
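Definition 29.2 An allocation $y = (y_1, \dots, y_n)$ is a Hicks-improvement of
$x = (x_1, \dots, x_n)$ if there is no allocation $x' = (x'_1, \dots, x'_n)$ with
$$\sum_{i=1}^n x'_{i1} = \sum_{i=1}^n x_{i1}, \qquad \sum_{i=1}^n x'_{i2} = \sum_{i=1}^n x_{i2}$$
such that $x'_i \succsim_i y_i$ for all $i$ and $x'_i \succ_i y_i$ for at least one $i$.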
The same comments as above apply to the Hicks criterion. Besides the
ethical issue, an allocation which Hicks-improves upon another may be Hicks-improved upon by the latter. See Figure 29.5 again. There we can never go
to the upper-right of $y$ on $I(e')$ by reallocating $x$ on $I(e)$. Hence $y$ is a Hicks-improvement
of $x$. However, it is also the case that we can never go to the upper-right of $x$ on $I(e)$ by reallocating $y$ on $I(e')$. Hence $x$ is a Hicks-improvement
of $y$.
Since the Kaldor criterion and Hicks criterion are the complement of each
other, if we impose both we can avoid the problem that two allocations dominate
each other under the Kaldor criterion alone and under the Hicks criterion alone
respectively. The idea is due to Scitovsky.
Definition 29.3 An allocation $y = (y_1, \dots, y_n)$ is a Scitovsky-improvement
of $x = (x_1, \dots, x_n)$ if $y$ is both a Kaldor-improvement and a Hicks-improvement
of $x$.
If one is a Hicks-improvement of another it means the latter is not a Kaldor-improvement of the former. Hence it is always the case that if one is a Scitovsky-improvement of another the latter is not a Scitovsky-improvement of the former.
However, the ranking by the Scitovsky-improvement may be intransitive,
that is, it may have a cycle. See Figure 29.6. There $y$ is a Scitovsky-improvement
of $x$, $z$ is a Scitovsky-improvement of $y$, $w$ is a Scitovsky-improvement of $z$, but
$x$ is a Scitovsky-improvement of $w$. This is called the Gorman paradox. So the
Scitovsky-improvement doesn't help, unfortunately.¹
The above problems of inconsistency do not occur in the quasi-linear environment, in which there is no income effect, while the ethical problem I discussed
above still remains. There the Kaldor criterion and the Hicks criterion coincide
and the comparisons are determined by the amount of social surplus, that is,
making the pie larger is better.
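Proposition 29.1 Assume the quasi-linear environment. Then, if $y = (y_1, \dots, y_n)$
and $x = (x_1, \dots, x_n)$ satisfy
$$\sum_{i=1}^n \bigl(v_i(y_{i1}) + y_{i2}\bigr) > \sum_{i=1}^n \bigl(v_i(x_{i1}) + x_{i2}\bigr),$$
then, with $w > 0$ denoting the difference between the two sums, the reallocation $y'$ of $y$ given by $y'_{i1} = y_{i1}$ and
$$y'_{i2} = v_i(x_{i1}) + x_{i2} - v_i(y_{i1}) + \frac{w}{n}$$
for all $i$ makes everybody better off than at $x$, that is, $y$ is a Kaldor-improvement of $x$.
[Figure 29.6: A's utility on the horizontal axis and B's utility on the vertical axis, showing $x$, $y$, $z$, $w$]
¹ Samuelson considered a weakening of the condition that one allocation is better than
another when the entire utility possibility frontier given by the former is above the entire
utility possibility frontier given by the latter. Let me call this Samuelson improvement. This
leads to a cycle again when it is combined with the Pareto criterion, however. The same
example works. In Figure 29.6, $y$ is a Pareto-improvement of $x$, $z$ is a Samuelson-improvement
of $y$, $w$ is a Pareto-improvement of $z$, but $x$ is a Samuelson-improvement of $w$.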
Again, let me emphasize that it is still totally silent about how we should cut
the pie, that is, how we should distribute the maximized surplus. Not only
that. If we want to rank between allocations so as to satisfy completeness, the
above result implies that we should be indifferent between any distributions
of surplus. Thus it excludes any fairness or equity concerns.
29.2
due to luck, and nobody is responsible for it, hence it carries no information. In
such a primitive situation, only the total amount of resources should be the
relevant information.
On the other hand, we maintain the assumption that each person is responsible for his preference, that is, everybody is responsible for how he is.
In the two-good illustration, the initial holding of each $i = 1, \dots, n$, denoted
by $e_i = (e_{i1}, e_{i2})$, comes only from luck and he has no responsibility for
it. Thus what is relevant is only the sum of initial endowments
$E = (E_1, E_2) = \left(\sum_{i=1}^n e_{i1}, \sum_{i=1}^n e_{i2}\right)$.
Then a feasible allocation is any $x = (x_1, \dots, x_n)$ such that
$$\sum_{i=1}^n x_{i1} \le E_1, \qquad \sum_{i=1}^n x_{i2} \le E_2.$$
29.2.1 Equal division
29.2.2 Equal utility?
What about the idea that people should be equally happy, while they don't
have to consume the same thing? That is, given a utility representation $u_i(x_i)$
for each $i = 1, \dots, n$, consider that
$$u_i(x_i) = u_j(x_j)$$
should hold for all $i, j$.
This needs a faith in interpersonal comparison of utility, however, and
we have to go outside the framework of ordinal utility, in which a utility representation is only an ordinal representation of a preference ranking: it has no
quantitative meaning and is not comparable across individuals. We don't even
have a definition of "equally happy" in a well-grounded sense.
[Figure 29.7: Allocations with envy and no envy]
When $u_i(x_i)$ is a representation of $i$'s preference $\succsim_i$, its double, say
$\hat{u}_i(x_i) = 2u_i(x_i)$, is also a representation of the same preference, and we cannot
say which one is the "right" representation. Also, when $u_j(x_j)$ is a representation
of $j$'s preference $\succsim_j$, its cubic root, say $\hat{u}_j(x_j) = u_j(x_j)^{1/3}$, is also
a representation of the same preference, and we cannot say which one is the
"right" representation.
Here the condition of fairness implied by $u_i(x_i) = u_j(x_j)$ does not agree
with the condition of fairness implied by $\hat{u}_i(x_i) = \hat{u}_j(x_j)$, and as long as we
don't have a particular faith we cannot judge which one is the "right" condition for equal happiness.
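A small numerical check may make the point concrete. The utility function `u` (the product of the two quantities), the list `bundles` and the allocation below are hypothetical choices made only for this illustration; the two transformations are the doubling and the cubic root discussed above.

```python
import math

def u(x):
    """A hypothetical common representation: the product of the two quantities."""
    return x[0] * x[1]

def u_hat_i(x):
    """Doubling: represents the same preference as u."""
    return 2 * u(x)

def u_hat_j(x):
    """Cubic root: also represents the same preference as u."""
    return u(x) ** (1 / 3)

# All three functions rank these bundles in exactly the same order.
bundles = [(1, 8), (2, 2), (4, 4), (3, 1)]
same_ranking = (sorted(bundles, key=u)
                == sorted(bundles, key=u_hat_i)
                == sorted(bundles, key=u_hat_j))

# An allocation that is "equally happy" under u ...
x_i, x_j = (2, 4), (4, 2)
equal_before = math.isclose(u(x_i), u(x_j))
# ... but not under the transformed representations.
equal_after = math.isclose(u_hat_i(x_i), u_hat_j(x_j))

print(same_ranking, equal_before, equal_after)
```

The preferences are untouched by the transformations, yet the "equal utility" condition holds in one pair of representations and fails in the other.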
29.2.3
Now, is it possible to define a notion of fairness based only on preference relations, without bringing in a faith in interpersonal comparison of utility? Here
let me introduce the idea of fairness as absence of envy.
First let us define envy. Say that $i$ envies $j$ at allocation $x = (x_1, \dots, x_n)$
if $i$ prefers $j$'s consumption to his own, that is,
$$x_j \succ_i x_i$$
holds. Since we are talking about a primitive situation in which nobody is
responsible for anything, this is a "justifiable" envy and we should avoid it.
Then an allocation is said to be envy-free if nobody envies anybody there.
See Figure 29.7. Here at $x$, A envies B and B does not envy A. At $x'$, A
does not envy B but B envies A. At $x''$, both envy each other. And, at $x'''$
there is no envy.
The equal division allocation is obviously envy-free, but it is not the only
envy-free allocation.
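The definition can be checked mechanically. In the sketch below the function name `envy_pairs`, the Cobb-Douglas utility functions and the three allocations are hypothetical; utility functions are used only as a convenient way of encoding each individual's own ranking, so no interpersonal comparison is involved.

```python
def envy_pairs(allocation, utility):
    """Return all pairs (i, j) such that i envies j at the given allocation."""
    return [
        (i, j)
        for i in allocation
        for j in allocation
        if i != j and utility[i](allocation[j]) > utility[i](allocation[i])
    ]

# Hypothetical preferences: A cares more about Good 1, B about Good 2.
utility = {
    "A": lambda x: x[0] ** 0.75 * x[1] ** 0.25,
    "B": lambda x: x[0] ** 0.25 * x[1] ** 0.75,
}

equal_division = {"A": (2, 2), "B": (2, 2)}
unequal_but_envy_free = {"A": (3, 1), "B": (1, 3)}
lopsided = {"A": (1, 1), "B": (3, 3)}

for name, alloc in [("equal division", equal_division),
                    ("unequal", unequal_but_envy_free),
                    ("lopsided", lopsided)]:
    print(name, envy_pairs(alloc, utility))
```

The second allocation illustrates that equal division is not the only envy-free allocation.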
29.2.4
The answer is YES when goods are continuously divisible. Indeed, the allocation
in competitive equilibrium from equal division is Pareto efficient and
envy-free.
Competitive equilibrium from equal division is such that all individuals are
initially given the equal division $\frac{E}{n} = \left(\frac{E_1}{n}, \frac{E_2}{n}\right)$ and they exchange in a competitive
market so as to obtain their final consumptions.
Since a competitive equilibrium allocation is Pareto-efficient for arbitrary initial holdings, the allocation in competitive equilibrium from equal division is
Pareto-efficient. It is envy-free, since all the individuals have the same initial
holding and hence face the same budget constraint. Thus anybody could buy
what anybody else is consuming. If he has nevertheless chosen what he has chosen, it is because he likes his own bundle at least as much as the others' bundles. Hence there is no room
for envy here.
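As a hedged illustration of this argument, the sketch below computes the competitive equilibrium from equal division in closed form for a hypothetical economy with Cobb-Douglas preferences $u_i = x_{i1}^{a_i} x_{i2}^{1-a_i}$, taking Good 2 as numeraire. The function name `ceei_cobb_douglas`, the exponents and the totals are assumptions made here, not part of the text.

```python
import math

def ceei_cobb_douglas(alphas, E1, E2):
    """Equilibrium from equal division with utilities x1**a * x2**(1 - a).

    alphas maps each individual to his exponent a (0 < a < 1).
    Returns (price of Good 1, allocation), with the price of Good 2 set to 1.
    """
    n = len(alphas)
    a_sum = sum(alphas.values())
    # Market clearing for Good 1: sum_i a_i * income / p1 = E1.
    p1 = a_sum * E2 / ((n - a_sum) * E1)
    income = (p1 * E1 + E2) / n          # value of the equal share E/n
    allocation = {i: (a * income / p1, (1 - a) * income)
                  for i, a in alphas.items()}
    return p1, allocation

alphas = {"A": 0.75, "B": 0.25}          # hypothetical preferences
p1, allocation = ceei_cobb_douglas(alphas, E1=4, E2=4)

def utility(i, x):
    return x[0] ** alphas[i] * x[1] ** (1 - alphas[i])

# Everybody could afford anybody else's bundle, yet chose his own.
same_cost = all(math.isclose(p1 * x[0] + x[1], (p1 * 4 + 4) / 2)
                for x in allocation.values())
envy = [(i, j) for i in allocation for j in allocation
        if i != j and utility(i, allocation[j]) > utility(i, allocation[i])]

print(p1, allocation, same_cost, envy)
```

Every bundle costs the same as the equal share, so each bundle was affordable to everybody, which is exactly why the list of envy pairs comes out empty.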
The answer is NO in general, however, when goods are indivisible. Consider
the simplest case in which there is just one indivisible object (which everybody
wants) and nothing else. Since you cannot divide it, if somebody receives it
everybody else envies him. Hence the only way to avoid envy is to throw it into
the garbage, but this is obviously inefficient.
One can think of, say, dividing the time to use the object when it is
durable, though. Or, one can think of drawing a lottery in order to decide who
should get it. Since probabilities are divisible, efficiency and envy-freeness are
compatible with each other at least in the ex-ante sense, while ex-post they
are obviously incompatible. Or, if there is another good which is continuously
divisible we can use it as a means of compensation so that the two notions are
compatible. In any case, you will see that divisibility of goods or objects is
crucial for the compatibility.
29.3
21
22 .
2xB + 1 2xA
Again together with xA + xB = 1 this implies xA 34 , which is incompatible
with the previous inequality.
Intuitively, A prefers more leisure compared to consumption than B does
and B is willing to work more than A does. However, A is skillful and B has
almost no skill. This leads A to envy B on the ground that B is enjoying more
leisure, which A likes, and leads B to envy A on the ground that A is enjoying
more consumption, which B likes.
This suggests that we have to at least partially give up one of efficiency and
envy-freeness as defined above.
Next, let us consider that each individual is responsible for his skill. Assume that each individual can produce the consumption good with a constant
marginal productivity of labor which is independent of the others' labor. Let
$$\frac{A_j(1 - l_j)}{A_i}.$$
29.4 Exercises
$$u_A(x_A) = \ln x_{A1} + x_{A2}, \qquad u_B(x_B) = 2\ln x_{B1} + x_{B2}$$
There is no production, and the total amount of Good 1 is 12; the total of Good
2 is taken to be 0 without loss of generality.
(i) Find the set of Pareto-efficient allocations.
(ii) Find the set of Pareto-efficient and envy-free allocations.
Chapter 30
Aggregation of preferences and social choice
30.1
The argument in the last chapter suggests that we have to have a serious theory about who should gain or lose and how much when a change in resource
allocation, or in the state of the society in general, from one to another cannot make
everybody better off. The problem of modern economics is rather that it is
not even utilitarian.
What do I mean? If you are a classical (and naive) utilitarian who believes in
interpersonal comparability of utilities and the law of diminishing marginal utility, you would claim you can justify income redistribution from the rich to the
poor on the ground that the marginal utility of wealth for the poor is greater
than that for the rich.
Since such claims need to bring in some faith, it would be a natural thing
to do for "scientific" economists to try to dispense with such things. The so-called new welfare economics, which advocates the criteria due to Pareto,
Kaldor, Hicks and Scitovsky as illustrated in the last chapter, is viewed as an
attempt to dispense with the concept of interpersonally comparable cardinal
utility.
As we saw above it was a failure, in the sense that it cannot in general induce
a consistent ranking. Or, even in the special case with no income effect in which
the rankings can be consistent, it cannot accommodate any equity concern.
This leads us to start with thinking of a complete and transitive ranking over
all the possible social alternatives, and to make it clear how making a consistent
ranking requires us to make a judgment about who should gain or lose and
how much.
$$W(x) = \sum_{i=1}^n \alpha_i u_i(x)$$
in which $F$ is additive.
2. Nash social welfare function
$$W(x) = \prod_{i=1}^n u_i(x)^{\alpha_i}$$
in which $F$ is multiplicative.
3. Rawlsian social welfare function
$$W(x) = \min_{i=1,\dots,n} u_i(x)$$
We saw that in the partial equilibrium environment an efficient allocation maximizes the sum of surplus. Historically and educationally, this seems to have
caused a confusion with the idea of the greatest happiness of the greatest number. However, surplus is measurable through the marginal rate of substitution and
does not require the notion of cardinal utility. Thus, taking the sum of surplus
is consistent with the framework of ordinal utility theory. Indeed, maximization
of the sum of surplus is totally silent about how the maximized surplus should
be distributed.
The above social welfare function approach seems to need a departure from
the ordinal utility framework. But how exactly should the departure be described?
Moreover, it is not clear to us how $F$ indeed describes the ethical views
expressed in the social welfare function. For example, consider the simplest
case of two individuals and consider a social welfare function of the form
$$W(x) = u_A(x) + u_B(x).$$
Should it be interpreted as treating the two individuals equally since it puts 1
vs. 1 weights on A and B respectively? You see that this is nonsense, as we can
take another pair of representations $\tilde{u}_A(x) = 2u_A(x)$ and $\tilde{u}_B(x) = 0.5u_B(x)$;
then we have
$$W(x) = 0.5\tilde{u}_A(x) + 2\tilde{u}_B(x),$$
which is now interpreted as giving 0.5 vs. 2 weights to A and B respectively.
Also, if we take another pair of representations $\tilde{u}_A(x) = e^{u_A(x)}$ and $\tilde{u}_B(x) = e^{u_B(x)}$
we have
$$W(x) = \ln \tilde{u}_A(x) + \ln \tilde{u}_B(x) = \ln \tilde{u}_A(x)\tilde{u}_B(x),$$
which is equivalent to taking a Nash social welfare function with respect to $\tilde{u}_A$
and $\tilde{u}_B$.
Such ambiguities motivate us to take an axiomatic approach. It starts
with a set of axioms each of which clearly expresses a normatively meaningful
view by itself, instead of playing with a functional form.
Another motivation comes from political science, which has a long literature
of formal analysis of voting dating back to Condorcet. There the interest is in
how, or whether, voting nicely aggregates people's diverse opinions in order to make
consistent social decisions.
The theory of social welfare functions established by Arrow stands more directly on this line. The subsequent part of the chapter follows Arrow's argument
and its sequels, in which the set of social alternatives is abstract. Also, we
assume that X is a finite set, for the sake of simplicity.
Such an apparent difference of set-up seems to have given Samuelson room
for an excuse that Arrow's negative result on aggregation does not affect the
new welfare economics. However, similar negative results have been obtained in
the setting of resource allocation. So without loss of generality we focus on the
case that X is abstract and finite.
30.2
30.2.1 Completeness
It is known that the Pareto principle, or unanimity rule, alone cannot always rank between two given social alternatives. Thus it fails to satisfy completeness, our first
postulate, which says that we can always rank between (including indifference)
any two alternatives.
Let us confirm this with an example.
Example 30.1 There are two individuals, A and B. There are six social alternatives x, y, z, u, v, w. Preferences of A and B are as follows.
A: y, z, u, w, x, v
B: w, x, v, z, y, u
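The incompleteness in Example 30.1 can be counted directly. The sketch below is only an illustration; the helper name `unanimous` and the variable names are chosen here, and each list is read from best to worst.

```python
from itertools import combinations

# Preferences from Example 30.1, best first.
prefs = {
    "A": ["y", "z", "u", "w", "x", "v"],
    "B": ["w", "x", "v", "z", "y", "u"],
}

def unanimous(a, b, prefs):
    """True if every individual ranks a over b."""
    return all(p.index(a) < p.index(b) for p in prefs.values())

ranked, unranked = [], []
for a, b in combinations("xyzuvw", 2):
    if unanimous(a, b, prefs):
        ranked.append((a, b))        # a is ranked over b by unanimity
    elif unanimous(b, a, prefs):
        ranked.append((b, a))
    else:
        unranked.append((a, b))      # A and B disagree: the pair is not ranked

print(len(ranked), "pairs ranked:", ranked)
print(len(unranked), "pairs left unranked")
```

Only the pairs on which A and B agree are ranked; the remaining pairs are left unranked, which is the failure of completeness.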
30.2.2 Transitivity
Another and even more familiar rule is majority rule. It says the society ranks
x over y if a majority of people rank x over y. Here let me assume there is
an odd number of people and everybody has a strict preference in the sense that
he is not indifferent between any two alternatives.
However, the majority rule has the following problem, called the voting paradox.
Example 30.2 There are three individuals, A, B and C. There are three social
alternatives, x, y, z. Preferences of A, B and C are given as follows.
A: x, y, z
B: z, x, y
C: y, z, x
Then, the majority rule ranks x over y, y over z but z over x, which forms a
cycle and violates transitivity.
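The cycle in Example 30.2 can be verified with a few lines. The function name `majority_prefers` is a label chosen here for illustration.

```python
# Preferences from Example 30.2, best first.
prefs = {
    "A": ["x", "y", "z"],
    "B": ["z", "x", "y"],
    "C": ["y", "z", "x"],
}

def majority_prefers(a, b, prefs):
    """True if more than half of the individuals rank a over b."""
    votes = sum(p.index(a) < p.index(b) for p in prefs.values())
    return votes > len(prefs) / 2

x_over_y = majority_prefers("x", "y", prefs)
y_over_z = majority_prefers("y", "z", prefs)
z_over_x = majority_prefers("z", "x", prefs)

# All three hold at once, so the social ranking cycles.
print(x_over_y, y_over_z, z_over_x)
```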
30.2.3
Now let us come back to the idea of social welfare function. Since it is numerical-valued, the social ranking given by a social welfare function is complete and
transitive. What's the problem then?
To illustrate, let us think of a specific example called the Borda rule. Again
assume strict preferences. It works as follows: Let $m$ be the number of alternatives. For individual $i$, let $(x_{i1}, x_{i2}, \dots, x_{im})$ denote the list of the alternatives
in descending order, in which $x_{i1}$ is his best and $x_{im}$ is his worst. Assign $m$
to $x_{i1}$, $m - 1$ to $x_{i2}$, and so on, and assign 1 to $x_{im}$. Then for each alternative
take the sum of the assigned numbers across individuals.
Such scoring is viewed as a social welfare function
$$W(x) = \sum_{i=1}^n u_i(x)$$
in which $u_i(x) = m - k + 1$ with $x$ being $i$'s $k$-th best alternative. That is, this
rule fixes a class of representations in which everybody's highest utility is $m$
and the lowest is 1, and the width of the grid of utility is 1.
Let us do it with the following example.
A: x, y, z, q
B: y, x, q, z
C: z, y, x, q
From this we obtain
$$W(x) = 9, \quad W(y) = 10, \quad W(z) = 7, \quad W(q) = 4$$
and y is given the largest score.
Now what is the problem? To see it, suppose A's preference is different
and is $\succsim'_A$ as follows (where the preferences of B and C remain the same),
A': x, z, q, y
B: y, x, q, z
C: z, y, x, q
B: y, x
C: y, x
which remain unchanged across the cases, but in the former case y wins and
in the latter case x wins. Thus, here the ranking between x and y depends not
only on how individuals rank just these two but also on how they rank
z and q together with those.
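The two Borda computations can be reproduced as follows. The function name `borda_scores` and the profile labels are chosen here for illustration; the rankings are those of the example, with A's second ranking being the changed one.

```python
def borda_scores(profile):
    """Sum over individuals of m - k + 1 for each individual's k-th best alternative."""
    m = len(next(iter(profile.values())))
    scores = {}
    for ranking in profile.values():
        for k, alternative in enumerate(ranking, start=1):
            scores[alternative] = scores.get(alternative, 0) + (m - k + 1)
    return scores

original = {"A": ["x", "y", "z", "q"],
            "B": ["y", "x", "q", "z"],
            "C": ["z", "y", "x", "q"]}
# Same as above except that A moves y to the bottom.
changed = dict(original, A=["x", "z", "q", "y"])

scores_original = borda_scores(original)
scores_changed = borda_scores(changed)
winner_original = max(scores_original, key=scores_original.get)
winner_changed = max(scores_changed, key=scores_changed.get)

# Everybody ranks x against y in the same way in both profiles.
same_xy_rankings = all(
    (original[i].index("x") < original[i].index("y"))
    == (changed[i].index("x") < changed[i].index("y"))
    for i in original
)

print(scores_original, winner_original)
print(scores_changed, winner_changed, same_xy_rankings)
```

The pairwise rankings between x and y are identical in the two profiles, yet the social ranking between them flips.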
The requirement that how the society ranks between x and y should depend
only on how individuals rank between x and y, and should be independent of
how they rank other alternatives (such as z and q), is called independence of
irrelevant alternatives or IIA. Note that this is an inter-profile axiom, in
the sense that it is a property connecting different preference profiles.
IIA says in other words that in constructing the aggregate preference we
should take individual preferences in a purely ordinal way and should not read
any cardinal meaning into them. To see this, think of why the Borda rule violates
IIA. The Borda rule takes into account not only how one ranks between x and y but also how
much he likes x over y, say through the comparisons with z and q. Here
such cardinal information is obtained by setting all individuals' highest
scores and lowest scores to be the same and the width of the grid of scores to
be the same and constant as well.
Why is IIA, or the requirement that preference aggregation should take a
purely ordinal standpoint, important? See the first case in which y is the top
element. A won't like it. Then if he reports $\succsim'_A$ instead of $\succsim_A$ he can manipulate
30.3 Arrow's theorem
Now consider a preference aggregation rule which satisfies the following three axioms;
the more formal presentation is relegated to the appendix.
1. Order: The aggregate preference satisfies completeness and transitivity;
2. Pareto: If everybody ranks x over y the aggregate preference ranks x
over y.
3. Independence of Irrelevant Alternatives (IIA): How the aggregate
preference ranks between x and y depends only on how the individuals
rank between x and y and not on how they rank any other alternatives.
Does there exist a preference aggregation rule which satisfies the above axioms, and what does it look like? Arrow [1] gave a negative answer.
Theorem 30.1 (Arrow's theorem): When there are at least three individuals and at least three social alternatives, the only preference aggregation rule
satisfying the above axioms is dictatorship, in the sense that there is one individual and the aggregate preference always follows his preference.
Given our intuition that dictatorship is obviously bad, the above result suggests
that we need to give up or weaken at least one of the axioms.
The proof is relegated to the appendix, but there you'll see that the IIA
axiom is repeatedly and heavily used. When you see how the IIA axiom
is used there, you will see how thoughtless it is to draw an interpretation like "the
impossibility of democracy" from Arrow's theorem.
30.4 May's theorem
When there are only two alternatives we don't have to worry about violating
transitivity. IIA is also met, vacuously. Hence the majority rule works, let's say.
Not only that, it is known to be the only rule which satisfies quite appealing
axioms.
Theorem 30.2 (May's theorem): Suppose there are just two social alternatives and an odd number of individuals. Then the only preference aggregation rule
satisfying the following axioms is the majority rule.
1. Anonymity: Every individual is treated equally.
2. Neutrality: The two alternatives are treated equally.
3. Monotonicity: Suppose x is winning under the current voting profile.
Then if some people switch their voting from y to x, the winner is again
x. Similarly for the opposite case.
30.5
If you give up IIA you can allow the use of the Borda rule and other rules based
on social welfare functions. Violation of IIA leads to manipulability, though.
However, the problem of manipulability may not be serious when the society
is so large that the room for each individual to manipulate the outcome by
himself alone tends to be negligibly small. This is an empirical question.
The Borda rule is characterized by the following set of axioms, in the setting in
which the set of individuals is variable (see Young [39] for example).
Theorem 30.3 The only rule satisfying the five axioms below is the Borda
rule.
1. Order: The aggregate preference is complete and transitive.
2. Neutrality: All the alternatives are treated equally.
3. Monotonicity: When the group consists of a single individual the aggregate preference is just his preference.
4. Combination: Suppose the aggregate preference of group A ranks x at
least as good as y and the aggregate preference of group B, which does
not overlap with A, ranks x at least as good as y. Then the aggregate
preference of the group made by merging A and B again ranks x at least
as good as y. The same assertion holds also when we replace "at least as
good as" by "over".
5. Cancellation: If the number of people who rank x over y is equal to the
number of people who rank y over x then the aggregate preference takes
them to be equally preferable.
30.6
I did not write it explicitly in the above explanation, but there is an implicit
assumption in Arrow's theorem. It is that the social welfare function has to
take any combinatorially possible preferences into account. This assumption is
called Unrestricted Domain. It will be better to explain it using an example.
There are three alternatives, Left, Center and Right, which are denoted by
L, C and R. Combinatorially, there are $3 \times 2 \times 1 = 6$ preferences as follows.
P1: L, C, R
P2: L, R, C
P3: C, L, R
P4: C, R, L
P5: R, L, C
P6: R, C, L
Look at P2 and P5. Here P2 ranks Left to be the best but ranks Right over
Center. It is possible combinatorially, but we would say it is unlikely. Likewise, P5 ranks Right to be the best but ranks Left over Center. It is possible
combinatorially as well, but we would say it is unlikely. Arrow's theorem
assumes that even such preferences should be taken into account.
However, when we are allowed to exclude such "unlikely" preferences and
narrow down to a restricted domain, Arrow's theorem does not necessarily hold.
A prominent case of restricted domain is single-peakedness. Assume that
social alternatives are ordered in one dimension. Then preference is said to be
single-peaked if there is a bliss point (peak) and any alternative gets worse
as it goes far to the left from the peak and also as it goes far to the right from
the peak. This includes the cases that the peak is the left endpoint or the right
endpoint.
In the above example, single-peaked preferences are limited to four,
P1: L, C, R
P3: C, L, R
P4: C, R, L
P6: R, C, L
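The list of four can be recovered by enumerating all six strict rankings and testing each one against the definition. The function name `is_single_peaked` is an illustrative choice made here; `axis` is the left-to-right order of the alternatives.

```python
from itertools import permutations

axis = ["L", "C", "R"]    # alternatives ordered in one dimension

def is_single_peaked(ranking, axis):
    """ranking lists alternatives from best to worst."""
    rank = {a: k for k, a in enumerate(ranking)}     # 0 = best
    peak = axis.index(ranking[0])
    # Ranks met when walking away from the peak, to the left and to the right.
    left = [rank[axis[k]] for k in range(peak, -1, -1)]
    right = [rank[axis[k]] for k in range(peak, len(axis))]
    # Alternatives must get strictly worse in both directions.
    return (all(a < b for a, b in zip(left, left[1:]))
            and all(a < b for a, b in zip(right, right[1:])))

single_peaked = [r for r in permutations(axis) if is_single_peaked(r, axis)]
excluded = [r for r in permutations(axis) if not is_single_peaked(r, axis)]

print(single_peaked)
print(excluded)
```

The two excluded rankings are exactly P2 and P5, in which Center is ranked below both extremes.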
Group  G1  G2  G3  G4  G5  G6  G7
Peak    0  10  30  50  70  90  100
#       2   3   3   4   3   6    4
Group G2 for example consists of three individuals who prefer 10 the best. In
contrast to the previous example, the above table describes only peaks. Hence
individuals in the same group may have different preferences. For example,
some in G2 may prefer 0 to 30 but some others may prefer 30 to 0. In any case,
however, such differences do not affect the outcome of majority voting.
This is because of the following. Out of these seven groups the median is G5,
which is the group needed for making the majority from either side. The
sum of G1, G2, G3 and G4 is 12 out of 25, so they cannot form the majority by themselves
alone and need G5. Likewise, the sum of G6 and G7 is 10, so they also
need G5 in order to form the majority.
In the majority rule the peak for G5, which is 70, is chosen. When you compare 70 and 50, G1-4 prefer 50, but since G5-7 prefer 70 the majority
prefer 70 to 50. Similarly for the comparison between 70 and 30, 10, 0, respectively. When you compare 70 and 90, G6-7 prefer 90, but since G1-5
prefer 70 the majority prefer 70 to 90. Similarly for the comparison between 70
and 100. Thus, 70 beats everything else in the majority rule.
In general, we have the following result.
Theorem 30.4 (Median Voter Theorem): When individuals have single-peaked preferences the majority rule selects the median of the peaks of preferences.
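The counting argument behind the theorem can be sketched as follows for the seven groups above. The names `median_peak` and `sure_votes` are illustrative choices made here. The function `sure_votes` counts only those voters whose preference between the two policies is pinned down by single-peakedness, namely those whose peak is at the first policy or on its far side from the second, so it is a lower bound on the first policy's support. The second list of groups is a hypothetical skewed distribution, added to contrast the median with the mean.

```python
def median_peak(groups):
    """groups is a list of (peak, number of voters); returns the median peak."""
    total = sum(n for _, n in groups)
    cumulative = 0
    for peak, n in sorted(groups):
        cumulative += n
        if cumulative > total / 2:
            return peak

def sure_votes(a, b, groups):
    """Voters who prefer a to b for sure under single-peakedness."""
    if b < a:
        return sum(n for peak, n in groups if peak >= a)
    return sum(n for peak, n in groups if peak <= a)

groups = [(0, 2), (10, 3), (30, 3), (50, 4), (70, 3), (90, 6), (100, 4)]
total = sum(n for _, n in groups)
median = median_peak(groups)
median_beats_all = all(sure_votes(median, other, groups) > total / 2
                       for other, _ in groups if other != median)

# Hypothetical skewed distribution: the mean and the median differ a lot.
skewed = [(0, 50), (10, 0), (30, 0), (50, 0), (70, 0), (90, 0), (100, 51)]
skewed_mean = sum(p * n for p, n in skewed) / sum(n for _, n in skewed)
skewed_median = median_peak(skewed)

print(median, median_beats_all, round(skewed_mean, 1), skewed_median)
```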
Note that the median is different from the mean. For example, in the distribution
of peaks below the mean is almost 50 but the median is 100.
Group  G1  G2  G3  G4  G5  G6  G7
Peak    0  10  30  50  70  90  100
#      50   0   0   0   0   0   51
30.6.1
There are two candidates running for the office. Policy is a point on the real line,
and each candidate commits to the policy he announces. Voters' preferences are
30.7
for all $i \in G$,
for all $j \notin G$.
Then, since $G$ is semi-decisive for $x$ over $y$ we have $xP(\succsim')y$. Also, since everybody ranks $y$ over $z$ we have $yP(\succsim')z$ from Pareto. By Transitivity, we have
$xP(\succsim')z$.
Since the only restriction on $\succsim'$ above is that $x \succ'_i z$ for all $i \in G$ and there
is no restriction on how the individuals other than $G$ rank between $x$ and $z$, we
can take $\succsim'$ in the above argument so that
$$x \succsim_i z \iff x \succsim'_i z$$
holds for all $i = 1, \dots, n$. Hence by IIA we obtain $xP(\succsim)z$.
Thus we have shown that $G$ is decisive for $x$ over $z$ for any $z \neq x, y$. When
$G$ is decisive for $x$ over $z$ it immediately implies $G$ is semi-decisive for $x$ over $z$.
Hence by switching between $y$ and $z$ in the above argument we can show that
$G$ is decisive for $x$ over $y$.
Lemma 30.2 If a group is semi-decisive for x over y, for all z it is decisive for
y over z.
Proof. Suppose group G is semi-decisive for x over y. Consider any preference
profile = (1 , , n ) such that y i z holds for all i G.
Suppose first that z = x, y, and consider any preference profile = (1
, , n ) such that
y i x i z
z
x, y
for all i G,
for all j
/G
Then, from Lemma 30.1 G is decisive for x over z, implying $x P(\succsim') z$. Also,
since everybody ranks y over x we have $y P(\succsim') x$ from Pareto. By Transitivity,
we have $y P(\succsim') z$.
Since the only restriction on $\succsim'$ above is that $y \succ'_i z$ for all $i \in G$, and there
is no restriction on how the individuals outside G rank y and z, we
can take $\succsim'$ in the above argument so that
$y \succsim_i z \iff y \succsim'_i z$
holds for all $i = 1, \dots, n$. Hence by IIA we obtain $y P(\succsim) z$.
Thus we have shown that G is decisive for y over z for any $z \neq x, y$. When
G is decisive for y over z, it immediately follows that G is semi-decisive for y over
z. Hence by applying Lemma 30.1 with taking x = y, y = z and z = x we can
show that G is decisive for y over x.
Chapter 31
Implementability of social choice objectives
If you are a benevolent policy maker, you would like to satisfy people's voices
as much as possible. However, the argument in the previous chapter shows that
it is a non-obvious problem to aggregate people's voices into "the people's voice."
In this chapter, on the other hand, I shift the interest to how to design
a mechanism so that people's voices are transmitted without lying. It may
make me look bad to say that people may lie. However, we already saw in
the chapter on public goods that a badly designed mechanism induces people to
lie about the benefits from a project.
Normative welfare criteria or fairness are meaningful only when the reported
information is true. It is empty to boast of achieving some wonderful
normative criterion by carrying out a policy which is based on fake information.
Hence this chapter considers the possibility that people may lie in order to
manipulate social choice, and what form of social choice is implementable under
such a possibility.
31.1
31.2
First, let us think of implementation such that the message to be sent by each
individual is optimal for him no matter what messages the others send. In
other words, it is to design a mechanism so that nobody needs to worry about
strategic interdependence. This is the most demanding kind of requirement.
In this type of implementation, each individual has to be able to make a decision without knowing the other individuals' preferences. Thus, each individual's
strategy on how to send his message depends only on his own preference and not on
the others'.
Definition 31.2 Given a mechanism $\Gamma = (S_1, \dots, S_n, g)$, individual i's private-information-dependent message is a function $\sigma_i : D \to S_i$.
That is, we are thinking of a game with incomplete information in which a strategy
takes the form
"if my true preference is $\succsim_i$, I will send message $\sigma_i(\succsim_i)$."
Given a profile of preferences of those other than i, denoted by $\succsim_{-i} = (\succsim_j)_{j \neq i}$,
let $\sigma_{-i}(\succsim_{-i}) = (\sigma_j(\succsim_j))_{j \neq i}$ denote the profile of messages to be sent by those
other than i.
Definition 31.3 Given mechanism $\Gamma = (S_1, \dots, S_n, g)$, a profile of private-information-dependent messages $\sigma = (\sigma_1, \dots, \sigma_n)$ forms a dominant strategy equilibrium if for all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$, for all $i = 1, \dots, n$ and all
$s_i \in S_i$ it holds
$$g(\sigma_i(\succsim_i), \sigma_{-i}(\succsim_{-i})) \succsim_i g(s_i, \sigma_{-i}(\succsim_{-i})).$$
Definition 31.4 Say that a social choice function $f : D^n \to X$ is implementable in dominant strategy equilibrium if there exists a mechanism
$\Gamma = (S_1, \dots, S_n, g)$ and a profile of private-information-dependent messages
$\sigma = (\sigma_1, \dots, \sigma_n)$ forming a dominant strategy equilibrium such that for all
$\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$ it holds
$$g(\sigma(\succsim)) = f(\succsim).$$
Definition 31.5 Say that a social choice function $f : D^n \to X$ is truthfully implementable in dominant strategy equilibrium if for all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$, for all $i = 1, \dots, n$ and all $\succsim'_i \in D$ it holds
$$f(\succsim_i, \succsim_{-i}) \succsim_i f(\succsim'_i, \succsim_{-i}).$$
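The condition in Definition 31.5 can be checked by brute force on a small example. The sketch below is only an illustration under stated assumptions: three alternatives, strict preferences written as rankings from best to worst, and two hypothetical rules (`dictatorship` and `borda`) that do not appear in the text.

```python
from itertools import permutations, product

ALTERNATIVES = ("a", "b", "c")
# A strict preference is a tuple listing alternatives from best to worst.
STRICT_PREFS = list(permutations(ALTERNATIVES))


def prefers_weakly(pref, x, y):
    """True if x is ranked at least as high as y under the ranking pref."""
    return pref.index(x) <= pref.index(y)


def is_truthfully_implementable(f, n):
    """Check f(pref_i, pref_-i) >=_i f(lie_i, pref_-i) for every profile, agent and lie."""
    for profile in product(STRICT_PREFS, repeat=n):
        truthful = f(profile)
        for i in range(n):
            for lie in STRICT_PREFS:
                misreport = profile[:i] + (lie,) + profile[i + 1:]
                if not prefers_weakly(profile[i], truthful, f(misreport)):
                    return False  # agent i gains by reporting `lie`
    return True


def dictatorship(profile):
    """Hypothetical rule: always pick agent 0's top alternative."""
    return profile[0][0]


def borda(profile):
    """Hypothetical rule: Borda count, ties broken alphabetically."""
    score = {x: 0 for x in ALTERNATIVES}
    for pref in profile:
        for rank, x in enumerate(pref):
            score[x] += len(ALTERNATIVES) - 1 - rank
    return max(sorted(ALTERNATIVES), key=lambda x: score[x])


print(is_truthfully_implementable(dictatorship, 3))
print(is_truthfully_implementable(borda, 3))
```

With three individuals, the dictatorial rule passes the check and the Borda rule fails it, since some individual can gain by misreporting at some profile.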
Let us prove a result called the revelation principle. This states that if a
social choice function is implementable in dominant strategy equilibrium it is
always truthfully implementable in dominant strategy equilibrium, which means
that it is without loss of generality to focus on the direct mechanism.
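Theorem 31.1 (Revelation principle for implementation in dominant
strategy equilibrium) If a social choice function $f : D^n \to X$ is implementable
in dominant strategy equilibrium then it is truthfully implementable in dominant strategy equilibrium.
Proof. Let $\Gamma = (S_1, \dots, S_n, g)$ be a mechanism by which f is implementable
in dominant strategy equilibrium, and let $\sigma = (\sigma_1, \dots, \sigma_n)$ denote a profile of
private-information-dependent messages which forms a dominant strategy equilibrium.
By the definition of implementability in dominant strategy equilibrium, for
all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$ it holds $g(\sigma(\succsim)) = f(\succsim)$, and for all $i = 1, \dots, n$
and for all $s_i \in S_i$ it holds
$$g(\sigma_i(\succsim_i), \sigma_{-i}(\succsim_{-i})) \succsim_i g(s_i, \sigma_{-i}(\succsim_{-i})).$$
In particular, taking $s_i = \sigma_i(\succsim'_i)$ for an arbitrary $\succsim'_i \in D$ gives
$$g(\sigma_i(\succsim_i), \sigma_{-i}(\succsim_{-i})) \succsim_i g(\sigma_i(\succsim'_i), \sigma_{-i}(\succsim_{-i})).$$
Since $g(\sigma(\succsim)) = f(\succsim)$ holds for all $\succsim$, we obtain that
$$f(\succsim_i, \succsim_{-i}) \succsim_i f(\succsim'_i, \succsim_{-i})$$
holds for all $i = 1, \dots, n$ and all $\succsim'_i \in D$.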
We already saw the difficulty of implementability in dominant strategy equilibrium in the chapter on public goods. Here I introduce the Gibbard-Satterthwaite
theorem in the abstract setting (Gibbard [10], Satterthwaite [30]).
Definition 31.6 Social choice function $f : D^n \to X$ is weakly Pareto
efficient if for all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$ there is no $y \in X$ such that $y \succ_i f(\succsim)$
holds for all $i = 1, \dots, n$.
"Weakly" means that it leaves open the possibility that we can make somebody
strictly better off while keeping the others indifferent. In this sense it is a weak
requirement.
It doesn't really matter here, since we put an additional restriction (which
is rather technical) that one is never indifferent between any two alternatives.
Let $\mathcal{P}$ be the subset of $\mathcal{R}$ consisting of preferences which are never indifferent
between any two distinct alternatives. Such preferences are called strict preferences.
In the domain of strict preferences $D = \mathcal{P}$ we have the following result.
Theorem 31.2 (Gibbard-Satterthwaite theorem): If a social choice function $f : \mathcal{P}^n \to X$ is truthfully implementable in dominant strategy equilibrium
and satisfies weak Pareto efficiency then it is dictatorial. That is, there is some
i such that for all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$ and for all $x \in X$ it holds $f(\succsim) \succsim_i x$.
How should we interpret this negative result? In general, the implementation problem is harder when it is easier to say whatever you want, and it is
easier to say whatever you want when informational incompleteness is more severe.
Since implementation in dominant strategy equilibrium (via the direct mechanism) requires that it is always optimal for everybody to report his preference
truthfully no matter what the others say, it is the hardest thing to do.
Thus, the range of what can be implemented becomes larger as the individuals know more about each other, as in the Bayesian situation in which they don't
know the actual realization of others' preferences but they know the ex-ante probability distribution of those. For implementability in Bayesian Nash equilibrium,
see the corresponding chapter in Mas-Colell, Whinston and Green [21].
Also, it is harder to implement social choice objectives in abstract settings
without relying on the concrete nature of the objects to be handled, as you have
to take logically possible but less likely preferences seriously, and this makes it easier
to say whatever you want. However, when the concrete nature of the objects to
be handled is known it is harder to say whatever you want, and this makes it
easier to implement social choice objectives.
For example, in the problem of allocating indivisible objects in which everybody can get just one item, we already know that the core allocation is weakly
Pareto efficient and is obtained by the top-trading-cycle mechanism, in which it
is a dominant strategy for everybody to submit his true preference. Also, when
individual preferences are single-peaked it is known that selecting the median of
the peaks of reported preferences is truthfully implementable in dominant strategy
equilibrium.
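The claim about the median of reported peaks can be illustrated by brute force. This is a sketch under assumptions that are not in the text: a hypothetical policy grid 0 to 4, three voters, and symmetric single-peaked utilities (distance from the peak), which are only a subset of all single-peaked preferences. The mean rule is included purely as a contrast.

```python
from itertools import product
from statistics import median

POLICIES = range(5)  # hypothetical policy grid 0, 1, ..., 4


def utility(peak, x):
    """Symmetric single-peaked utility: closer to the peak is better (an assumption)."""
    return -abs(x - peak)


def is_truthful(rule, n):
    """True if no voter can ever gain by misreporting his peak under `rule`."""
    for peaks in product(POLICIES, repeat=n):
        honest = rule(peaks)
        for i, lie in product(range(n), POLICIES):
            reports = peaks[:i] + (lie,) + peaks[i + 1:]
            if utility(peaks[i], rule(reports)) > utility(peaks[i], honest):
                return False
    return True


def median_rule(reports):
    return median(reports)  # n is odd here, so this is one of the reported peaks


def mean_rule(reports):
    return sum(reports) / len(reports)


print(is_truthful(median_rule, 3))
print(is_truthful(mean_rule, 3))
```

Under the mean rule a voter with peak 3 facing two reports of 0 gains by exaggerating to 4, which pulls the outcome from 1 to 4/3; under the median rule no such exaggeration moves the outcome toward the voter's own peak.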
31.3
Definition 31.9 Social choice correspondence $F : D^n \twoheadrightarrow X$ is said to be monotonic if for all $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$, $x \in F(\succsim)$ and for all $\succsim' = (\succsim'_1, \dots, \succsim'_n) \in D^n$ such that
$$x \succsim_i y \implies x \succsim'_i y$$
holds for all $i = 1, \dots, n$ and $y \in X$, it holds $x \in F(\succsim')$.
Example 31.1 The correspondence which maps each preference profile to the
set of weakly Pareto-efficient outcomes given that profile is monotonic.
Given an arbitrary preference profile $\succsim = (\succsim_1, \dots, \succsim_n)$, denote the set of
weakly Pareto-efficient outcomes under it by $P(\succsim)$.
Pick any $x \in P(\succsim)$, and consider any $\succsim' = (\succsim'_1, \dots, \succsim'_n)$ such that
$$x \succsim_i y \implies x \succsim'_i y$$
holds for all $i = 1, \dots, n$ and $y \in X$.
Now suppose x is not weakly Pareto-efficient under $\succsim'$, that is, there exists
y such that $y \succ'_i x$ for all $i = 1, \dots, n$. Then from the above assumption (by
taking the contrapositive) we have $y \succ_i x$ for all $i = 1, \dots, n$, which contradicts
x being weakly Pareto-efficient under $\succsim$.
Thus we obtain $x \in P(\succsim')$.
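Example 31.1 can be verified mechanically on a small domain. The following is a sketch under assumptions not made in the text: two individuals, three alternatives, and strict rankings only. The rule `second_best_of_first` is a hypothetical correspondence added just to show what a failure of monotonicity looks like.

```python
from itertools import permutations, product

X = ("a", "b", "c")
PREFS = list(permutations(X))              # strict rankings, best first
PROFILES = list(product(PREFS, repeat=2))  # two individuals


def weakly_prefers(pref, x, y):
    return pref.index(x) <= pref.index(y)


def weak_pareto_set(profile):
    """Outcomes x such that no y is strictly preferred to x by everybody."""
    return {
        x for x in X
        if not any(all(p.index(y) < p.index(x) for p in profile) for y in X)
    }


def second_best_of_first(profile):
    """Hypothetical correspondence: the second-ranked alternative of individual 0."""
    return {profile[0][1]}


def is_monotonic(F):
    for old, new in product(PROFILES, repeat=2):
        for x in F(old):
            # x does not fall relative to any y in anybody's ranking
            no_fall = all(
                weakly_prefers(q, x, y)
                for p, q in zip(old, new)
                for y in X
                if weakly_prefers(p, x, y)
            )
            if no_fall and x not in F(new):
                return False
    return True


print(is_monotonic(weak_pareto_set))
print(is_monotonic(second_best_of_first))
```

The second rule fails because raising its chosen outcome to the top of individual 0's ranking removes it from the choice set, which is exactly what monotonicity rules out.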
Theorem 31.3 If social choice correspondence $F : D^n \twoheadrightarrow X$ is Nash implementable then it is monotonic.
Proof. Pick any $\succsim = (\succsim_1, \dots, \succsim_n) \in D^n$ and $x \in F(\succsim)$. Pick any $\succsim' = (\succsim'_1, \dots, \succsim'_n) \in D^n$ such that for all $i = 1, \dots, n$ and $y \in X$ it holds
$$x \succsim_i y \implies x \succsim'_i y.$$
Take $s \in NE(\Gamma, \succsim)$ such that $x = g(s)$; then for all $i = 1, \dots, n$ and $s'_i \in S_i$
it holds
$$g(s_i, s_{-i}) \succsim_i g(s'_i, s_{-i}).$$
By assumption we have
$$g(s_i, s_{-i}) \succsim'_i g(s'_i, s_{-i}).$$
Since $i = 1, \dots, n$ and $s'_i \in S_i$ are arbitrary, we obtain $s \in NE(\Gamma, \succsim')$, implying
$x \in F(\succsim')$.
It is shown by Maskin [22] that together with the following condition monotonicity is sufficient for Nash implementability.
Definition 31.10 Social choice correspondence $F : D^n \twoheadrightarrow X$ is said to allow
no veto power if for all $\succsim$ and $x \in X$ such that x is a maximal element for $n - 1$
individuals it holds $x \in F(\succsim)$ regardless of the remaining one's preference.
Theorem 31.4 Assume $n \geq 3$ and $D = \mathcal{R}$. Then if social choice correspondence $F : D^n \twoheadrightarrow X$ satisfies monotonicity and no veto power it is Nash implementable.
Proof. We follow the proof by Repullo [28].
For each individual $i = 1, \dots, n$ let the set of his messages be $S_i = \mathcal{R}^n \times X \times \mathbb{N}$. That is, each i announces a list of all the individuals' preferences,
a social outcome and a natural number. Denote the list of all the individuals'
preferences which i submits by $\succsim^i = (\succsim^i_1, \dots, \succsim^i_n)$, the social outcome he
submits by $x^i$, and the number he submits by $k^i$; then his message is denoted
by $s_i = (\succsim^i, x^i, k^i)$.
Outcome function $g : S \to X$ is defined as follows. Given any message profile
$s = ((\succsim^1, x^1, k^1), \dots, (\succsim^n, x^n, k^n))$,
1. if all but one individual, denoted i, report the same message $(\succsim, x, k)$ and
if $x \in F(\succsim)$, then let $g(s) = x^i$ if $x \succsim_i x^i$, based on the preference profile
reported by those $n - 1$ individuals, and let $g(s) = x$ otherwise.
2. Otherwise, pick the i who reported the largest number and let $g(s) = x^i$, with
ties broken according to a priority ranking determined
beforehand.
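The two rules defining the outcome function can be sketched in code. This is an illustration, not the construction in Repullo [28] verbatim: preferences are simplified to strict rankings, ties in Rule 2 go to the lowest index as a hypothetical priority ranking, and `tops` is a hypothetical correspondence used only to exercise the rules.

```python
from collections import Counter


def weakly_prefers(ranking, x, y):
    """True if x is ranked at least as high as y (ranking lists best first)."""
    return ranking.index(x) <= ranking.index(y)


def outcome(messages, F):
    """Outcome function g; messages[i] = (reported profile, outcome x^i, integer k^i)."""
    n = len(messages)
    common, count = Counter(messages).most_common(1)[0]
    if count == n - 1:
        profile, x, _ = common
        if x in F(profile):
            # Rule 1: the lone dissenter i gets x^i only if the others'
            # reported profile says i weakly prefers x to x^i.
            i = next(j for j, m in enumerate(messages) if m != common)
            x_i = messages[i][1]
            return x_i if weakly_prefers(profile[i], x, x_i) else x
    # Rule 2: the largest integer wins; ties go to the lowest index
    # (a hypothetical priority ranking). A unanimous profile lands here
    # and simply returns the common outcome x.
    winner = max(range(n), key=lambda j: (messages[j][2], -j))
    return messages[winner][1]


def tops(profile):
    """Hypothetical F: alternatives ranked first by at least one individual."""
    return {ranking[0] for ranking in profile}


truth = (("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c"))
agree = (truth, "a", 1)
print(outcome([agree, agree, agree], tops))            # unanimous report
print(outcome([agree, agree, (truth, "b", 1)], tops))  # deviation toward a preferred outcome
print(outcome([agree, agree, (truth, "c", 1)], tops))  # deviation toward a worse outcome
```

In the second call individual 3 asks for b, which he prefers to a according to the commonly reported profile, so Rule 1 keeps the outcome at a; in the third call he asks for c, which is worse for him, so the request is granted. This is the sense in which Rule 1 makes deviations unprofitable.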
First we show $F(\succsim) \subseteq g(NE(\Gamma, \succsim))$. Pick any $\succsim$ and $x \in F(\succsim)$. Consider a
message profile such that all the individuals submit the same message $(\succsim, x, 1)$.
Then for any individual Rule 1 applies if he deviates, implying that he cannot
change x by deviation, or can change it only to $x^i$ such that $x \succsim_i x^i$, which is
unprofitable. Hence this message profile is a Nash equilibrium, and implements
x.
Next we show $g(NE(\Gamma, \succsim)) \subseteq F(\succsim)$. Let $s \in NE(\Gamma, \succsim)$.
Case 1: Suppose all the individuals report the same message $(\succsim', x, k)$ and that
$x \in F(\succsim')$.
Pick an arbitrary i and consider a message $s'_i = (\succsim', y, k)$ with y being an
arbitrary outcome satisfying $x \succsim'_i y$. Then from Rule 1 in the definition of the
outcome function we have $g(s'_i, s_{-i}) = y$.
Since $s \in NE(\Gamma, \succsim)$ we have $x \succsim_i y$. Hence by monotonicity it follows
$x \in F(\succsim)$.
Case 2: Suppose all the individuals report the same message $(\succsim', x, k)$ and that
$x \notin F(\succsim')$.
Pick an arbitrary i and consider a message $s'_i = (\succsim', y, k')$ with y being
an arbitrary outcome and $k' > k$. Then from Rule 2 in the definition of the
outcome function we have $g(s'_i, s_{-i}) = y$.
Since $s \in NE(\Gamma, \succsim)$ we have $x \succsim_i y$. Hence x is a maximal element according
to $\succsim_i$. Since i was arbitrary, from the no-veto-power condition it follows $x \in F(\succsim)$.
Case 3: Suppose $s_i \neq s_j$ for some i, j. Without loss of generality, let $s_1 \neq s_2$.
31.4
Proof. Pick any $\succsim$ and x, y, and suppose $x \succ_i y$ for all i. Take any {x, y}-majorization of $\succsim$, denoted by $\succsim'$; then from weak Pareto efficiency it follows that
$f(\succsim') = x$, which implies $x R_f(\succsim) y$. Since $R_f(\succsim)$ does not allow indifference, we
obtain $x P_f(\succsim) y$.
Lemma 31.5 $R_f$ satisfies Independence of Irrelevant Alternatives.
Proof. Pick any $\succsim$ and x, y, and take any $\succsim'$ which agrees with $\succsim$ over {x, y}.
Take any {x, y}-majorization of $\succsim$, denoted by $\succsim''$. Without loss of generality,
let $f(\succsim'') = x$, meaning $x R_f(\succsim) y$.
Since $\succsim''$ is a {x, y}-majorization of $\succsim'$ as well, $x R_f(\succsim) y$ and $x R_f(\succsim') y$ are
equivalent.
Lemma 31.6 If a social choice function $f : \mathcal{P}^n \to X$ satisfies monotonicity and
weak Pareto efficiency it is dictatorial.
Proof. From Arrow's theorem, the preference aggregation rule $R_f : D^n \to \mathcal{R}$
generated by f is dictatorial. Let i be the dictator there, and for any $\succsim$ let x
denote the maximal element for $\succsim_i$; then for all $y \neq x$ it holds $x R_f(\succsim) y$.
Suppose now that $f(\succsim) \neq x$. Then by taking any $\{x, f(\succsim)\}$-majorization
of $\succsim$, denoted by $\succsim'$, we obtain $f(\succsim') = f(\succsim)$, which means $f(\succsim) R_f(\succsim) x$, a
contradiction.
Postscripts
There are many issues which I could not cover in this book, partly because it
is intended to be an intermediate textbook, partly because of the limitation on
my knowledge and ability. Here I list such issues and raise relevant books and
articles, in order to help you to go to the next step. The choice is subjective
and by no means exhaustive.
First, let me refer to an advanced textbook which should be read after this.
Mas-Colell, Whinston and Green (MWG), Microeconomic Theory [21]
This has been the most popular textbook at the graduate level in the last two
decades. It's a fat book even by American standards, but it is the most
comprehensive one.
Game theory
The general textbook on game theory I would recommend is Osborne and Rubinstein
(OR) [24] or Fudenberg and Tirole (FT) [6]. It's pretty much a matter of taste
which one you choose: OR is more concise and FT may be more exhaustive.
Also, FT covers more about incomplete information while OR focuses more on
complete information. That said, the game theory chapters of MWG may be enough
to cover the first-year graduate course materials.
Brief illustrations of specific fields of game theory and recommended readings
follow.
Equilibrium refinement
As discussed in the text there may be many Nash equilibria in a game, and
some of them are unlikely. For example, Nash equilibrium allows a weakly
dominated strategy to be played, which is unrealistic if you worry that the opponents may not play precisely the strategies against which such a dominated strategy
is optimal. Also, Nash equilibria in the normal-form expression of an extensive-form game may not be subgame-perfect.
The theory of equilibrium refinement proposes concepts of robustness of
equilibria to various kinds of errors or perturbations, and attempts to narrow
down the set of equilibria based on them. After reading the corresponding
chapters in OR or FT, you might want to go into a comprehensive book on this
literature such as Van Damme [35].
Equilibrium selection
The theory of equilibrium selection has a similar objective to the theory of
equilibrium refinement, in the sense that both propose criteria to narrow down
the set of equilibria, but it leans more toward bringing in selection criteria from
outside the purely formal arguments on players' optimization in games.
Focal point is one such notion. Also, risk dominance as explained in the text
brings in risk attitudes from somewhat outside the description of the
game, since the standard kind of risk attitude is already taken into account in
the description of payoffs. The most reputable book in the literature is Harsanyi
and Selten [11].
Epistemic game theory
Epistemic game theory attempts to make precise what level of rationality
indeed allows us to play a course of actions such as Nash equilibrium or other
solutions.
In the first chapter on game theory I gave a crude explanation of the relationship between iterated elimination of dominated strategies (or rationalizability) and common knowledge of rationality.
In the chapter on incomplete information, I gave a crude explanation of what
is behind the common prior assumption and any possible departure from it.
I also covered some remarkable implications of common knowledge under the
common prior assumption, such as the impossibility of agreeing to disagree
and the impossibility of speculative trades.
In order to go further we need precise definitions of knowledge and common
knowledge. You can start with the chapters on knowledge
in OR and FT.
Repeated games
In the text I gave an illustration of how cooperation is sustained when the
game is played repeatedly indefinitely many times, by means of a pair of trigger
strategies. This argument is generalized into the so-called folk theorem, where "folk"
means it had been informally known for a long time: any payoff profile that is
Pareto superior to the equilibrium payoffs in the one-shot setting is sustainable
by means of a strategy profile which consists of a more sophisticated version of
the trigger strategy.
The basic argument as illustrated in the text assumes so-called perfect monitoring, meaning that every player observes and remembers all of what all players
did before. It also assumes complete information, meaning that every player
knows all the players' characteristics. The literature proceeds by considering
imperfect monitoring and/or incomplete information.
You can start with the chapters on repeated games in MWG, OR and FT.
Then you might want to proceed to a reputable advanced textbook such as
Mailath and Samuelson [20].
Evolutionary games
One can interpret equilibrium in games as a consequence of an imitation or learning
process with trial and error at the population level, which is called evolutionary dynamics, rather than a consequence of deductive reasoning by individuals. Then
equilibrium is thought of as a course of action which is stable against invasions.
Evolutionary game theory, which originated in the field of mathematical
biology, analyzes population dynamics of replication and characterizes equilibria
that are stable against invasions. Since it was imported into economics, economics-based game theorists have worked on evolutionary dynamics which are closer
to human responses and learning processes than to direct biological processes. Such notions of stability are also related to equilibrium refinement and
equilibrium selection.
For the basics of evolutionary game theory you can consult Weibull [38].
Also, Vega-Redondo [36] puts more emphasis on implications for economic behavior.
Cooperative games
The games covered in the book are called non-cooperative games.
On the other hand, there is a literature called cooperative game theory. It
starts with describing what is attainable for any possible coalition, rather than
with describing strategies, proposes solution concepts, and provides (axiomatic) characterizations. The key notion there is coalitional stability. When a
coalition is to receive an outcome which is worse than what its members can achieve by
themselves, they will block the current proposal. A stable outcome is such that
no coalition can block it. The literature also considers additional distributional
conditions and obtains sharper solutions.
There are two interpretations of such cooperative concepts. One is descriptive, in the sense that if an outcome is blocked or faces a certain kind of objection
by a coalition it will not last, although this is a somewhat detail-free argument,
as it does not provide explicit descriptions of strategies and the equilibrium course
of actions by which a coalition blocks the current proposal.
The other interpretation is that it is a normative goal. Even though an
unstable outcome does not last in the long run, it takes time to dissolve and it
is very costly: once you get married you cannot easily get divorced even if it
was a wrong one. Thus it is desirable to look for a stable outcome beforehand,
in order to avoid such tragedies, by designing the mechanism nicely.
Mechanism design
In general, the party with private information is called the agent and the party
which cannot observe the agent's private information (either his type or action)
is called the principal. For example, in the moral hazard problem the principal
is the employer and the agent is the employee, and in the auction problem the
seller is the principal and the bidders are the agents.
Since the principal cannot observe or verify the agents' private information,
he has to design a mechanism so that it is profitable for them to reveal their
private information by their choices. This constraint is called the incentive constraint.
In addition to the incentive constraint, the principal is subject to the participation
constraint that it is profitable for agents to sign the contract rather than opting
out (otherwise the contract is not made). Subject to these constraints, the principal designs the mechanism
in order to maximize his payoff. This is what contract theory is about.
An important subject to which the mechanism design approach is relevant
is auctions. Here the seller is to design the auction format in order to maximize
his expected revenue. Since the seller does not know the bidders' willingness to
pay, he has to design the auction so that bids reveal willingness to pay nicely.
For this direction you can start with the corresponding chapters in MWG and
FT. Then you might move on to reputable books such as Bolton and Dewatripont
[4] on contract theory and Krishna [16] on auction theory.
The mechanism design approach is adopted not only in analyzing the principal's
profit maximization but also in looking for a solution to problems in which
the principal is interpreted to be a planner whose objective is to achieve given
normative requirements.
Recall for example that the efficient level of provision of a public good is characterized by the Samuelson condition if individuals' preferences are known.
However, there is a gap between this and how to implement it, for the policy
maker, the principal here, does not know those preferences.
Hence he has to learn people's preferences either directly or indirectly, but
there is no guarantee that people truthfully report their preferences. Thus it is
necessary to design a game in which people choose to reveal their true preferences
by their choices, which is the incentive constraint. The pivotal mechanism is one
such example, but we already saw that we have to give up efficient resource
allocation at least partially there.
Mechanism design theory in this direction takes the incentive constraint as the
basic condition and investigates whether efficiency and other normative postulates are
implementable, and what the implementing mechanisms look like.
For this direction of mechanism design, I recommend reading the corresponding chapters in MWG, OR and FT.
Mechanism design theory also has a role in complementing the standard market
theory.
As is discussed in the book, the theory of competitive markets leaves it unspecified who sets the prices and by what procedure. In other words, it is a theory
about situations in which, whoever sets the prices by whatever procedure, the
resulting prices have to take certain values. However, as we saw in the section
on the social calculation debate, such a model has no formal distinction from
the model in which a central planner plays the roles of both competitive firms
and auctioneer.
This makes it necessary to provide an explicit description of price setting, either
as players' strategies or as a part of the rules of the game, that is, as a "visible
hand" instead of an "invisible hand." It makes it necessary to investigate what type
of mechanism indeed implements a competitive equilibrium allocation or an efficient
allocation, and which one is more informationally efficient if there exist
several such mechanisms.
Political economics
It is often said, "Economists very often lament that politicians do not choose
the right economic policy, but isn't it naive to say so without thinking about why
such a policy is not chosen in the arena of politics?"
Of course this is a problem. Political Economics (not Political Economy) is
hence a popular research program now, which approaches the interaction between political process and economic dynamics from a positive rather
than normative viewpoint.
Persson and Tabellini [26] is a representative textbook in this direction.
response with 5% probability when he does not have the disease (which is the
false positive case).
Now, given that one gets a positive response, what is the probability that
he indeed has the disease?
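The correct answer follows from Bayes' rule:
$$
\begin{aligned}
P(\text{Disease} \mid \text{Positive})
&= \frac{P(\text{Disease and Positive})}{P(\text{Positive})} \\
&= \frac{P(\text{Disease}) P(\text{Positive} \mid \text{Disease})}{P(\text{Disease}) P(\text{Positive} \mid \text{Disease}) + P(\text{Non-disease}) P(\text{Positive} \mid \text{Non-disease})} \\
&= \frac{0.0001 \times 0.9}{0.0001 \times 0.9 + 0.9999 \times 0.05} \\
&\approx 0.0018
\end{aligned}
$$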
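The arithmetic can be reproduced in a few lines. This is a minimal sketch; the three inputs are read off the formula above, and the variable names are illustrative only.

```python
prior = 0.0001          # P(Disease)
sensitivity = 0.9       # P(Positive | Disease)
false_positive = 0.05   # P(Positive | Non-disease)

# Total probability of a positive response.
p_positive = prior * sensitivity + (1 - prior) * false_positive

# Bayes' rule.
posterior = prior * sensitivity / p_positive
print(round(posterior, 4))
```

The posterior is below 0.2 percent because the false positives from the large healthy population swamp the true positives from the small diseased one.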
now that the posterior probability of Red after knowing that the drawn ball is
not Green is 2/5, instead of the 1/2 derived from Bayes' rule.
Then, assuming risk neutrality for simplicity, although the decision maker
chooses "90 dollars if Red is drawn" over "75 dollars if Blue is drawn" since
$9000 \times \frac{1}{3} = 3000 > 2500 = 7500 \times \frac{1}{3}$, after knowing that the ball drawn is not
Green he changes his mind since $9000 \times \frac{2}{5} = 3600 < 4500 = 7500 \times \frac{3}{5}$, meaning
that he himself ex post does not follow the plan he made ex ante.
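The reversal can be checked with exact fractions. The sketch below uses the figures 9000 and 7500 exactly as they appear in the arithmetic above (they match the stated 90 and 75 dollars if read as cents, which is an assumption), and the function name `plan` is illustrative only.

```python
from fractions import Fraction

RED_PRIZE, BLUE_PRIZE = 9000, 7500  # amounts as used in the text's arithmetic


def plan(p_red, p_blue):
    """Risk-neutral choice between the bet on Red and the bet on Blue."""
    return "Red" if RED_PRIZE * p_red > BLUE_PRIZE * p_blue else "Blue"


ex_ante = plan(Fraction(1, 3), Fraction(1, 3))  # before any information
ex_post = plan(Fraction(2, 5), Fraction(3, 5))  # non-Bayesian posterior after "not Green"
bayes = plan(Fraction(1, 2), Fraction(1, 2))    # Bayesian posterior after "not Green"

print(ex_ante, ex_post, bayes)
```

With the Bayesian posterior of 1/2 the ex-ante choice of Red survives the news, so the reversal comes entirely from the non-Bayesian updating to 2/5.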
Again there emerges the puzzle of why the current self makes a plan
which has to be overturned by his future selves.
The second problem is that if we attempt to describe the behavior of a boundedly
rational agent as "problem solving," such a problem looks more difficult than the
problem being solved by a rational agent. For example, a problem being
solved by multiple selves looks more complicated than a problem being solved
by a consistent individual. Also, a problem with constraints on the decision
maker's ability of reasoning and computation looks more complicated (since
there are more constraints!) than a problem for the decision maker with
unlimited ability of reasoning and computation.
Because of this, the models of satisficing tend to be an apparently more
sophisticated model of the optimal stopping problem, in which the decision maker
optimally decides when to stop searching.
I guess I'm confused or misunderstanding something about the notion of "solving" at
some point, in the sense, let's say, that there are various levels of "solving" and
I am confusing them. I guess such problems are to be resolved by
borrowing help from neighboring disciplines such as computer science.²
The third problem is that under bounded rationality it is no longer
clear what is better or worse, even for a single individual, and nevertheless
economics has to do welfare analysis as long as it is economics. From the standpoint of rationality, what is better or worse for an individual is simply what
is revealed from his choice data. However, under departures from Criterion 1
above there may be disagreements among multiple selves about what is good.
Under departures from Criterion 2, an individual does not have enough knowledge and
information to judge what is good for him. Under departures from
Criteria 3 and 4, due to bias or limitation of capacity in reasoning and computation, one may not be able to draw a judgment about whether something is good
or bad.
This leads us to ask the following question: Should an external authority now be
allowed to intervene in the judgment on what is good for an individual?
If so, to what extent?
In recent years, Thaler and Sunstein [33] have advocated a concept they call
libertarian paternalism. This says that, given that human choices depend heavily
on how choice frames are given, which is called framing effects, we should induce
people's choices toward better ones by means of manipulating frames. For example,
in enrollment in a 401(k)-type pension plan it is known that the enrollment rate
is significantly higher when the default choice is enrollment and one has to sign
up when he likes to opt out, than when the default choice is non-enrollment
and one has to sign up when he likes to enroll. Libertarian paternalism says the
former type of framing should be given.
² Salant [29] recently shows that a decision rule which uses a smaller amount of computation
than the fully rational one has to obey a framing effect. This proves that I'm indeed confused.
It is a paternalism in the sense that an authority intervenes to manipulate
framing, but they say it is still libertarian in the sense that it is not enforcing a
particular alternative and it is still the individual who chooses.
It is nothing but an external authority's subjective judgment, however, that
the object toward which the individual's choice is induced is good, although
the judgment has been relatively straightforward in the existing applications
of libertarian paternalism so far, such as health choices with costs being given
constants.
Final words
I guess now is a good time to summarize (if I may) the mode of thinking
underlying the rationality approach. From the style of the arguments demonstrated in the book, you will notice the following mode of thinking:
1. Take things in a symmetric manner.
2. If you have to break symmetry, imagine an infinite hierarchy behind the
apparent asymmetry.
The symmetry principle tells you that if you have a reason to do something the
others will do so as well, that the level of thinking or knowledge you reach must
have been reached by the others as well, and that if your reasoning is disturbed by
some noise so will be the others'. The principle is sometimes criticized for
blinding us to real asymmetries in society. However, it is effective
in restraining the temptation to introduce asymmetries too easily in an ad hoc
manner.
As with informational asymmetries and bounded rationality, sometimes we
have to depart from the symmetry principle. The principle of infinite hierarchy,
however, tells us that even if you have more information than the others you
face uncertainty at a deeper level, with regard to what you believe about what
the others believe about the world, and so on. It also tells us that even if you
are smarter than the others you may be fooled at another level. The arguments
in bounded rationality will tempt you to think that you can outwit the market
consistently, while it is impossible when everybody is rational. However, even
if you know well about some dimension of irrationality and know well
how to manipulate people based on that, you may actually be manipulated in
another dimension. Nobody is free from that.
I bet you got frustrated: "Why are you still sticking to the rationality approach, knowing that it goes nowhere? You seem to be just toying with what you
don't believe, or toying with impossibilities. Why are you so cynical? How can
it be anything other than intellectual decadence?" If I'm allowed to use cheap
rhetoric, I would say that enduring such contradictions and tensions within is
the only way to escape from the dualism of alienated mechanical application of
a theory and uncritically accepting or reacting to reality as it is.
Bibliography
[1] Arrow, Kenneth J. Social choice and individual values. Vol. 12. Yale University Press, 2012.
[2] David Austen-Smith and Jeffrey S. Banks, Positive Political Theory I: Collective Preference, University of Michigan Press, Ann Arbor, 1999.
[3] Shlomo Benartzi and Richard H. Thaler, Naive Diversification Strategies in
Defined Contribution Saving Plans, American Economic Review 91 (2001),
79-98.
[4] Bolton, Patrick, and Mathias Dewatripont. Contract theory. MIT Press,
2005.
[5] Patrick Joyce, The Walrasian tâtonnement mechanism and information,
RAND Journal of Economics, Vol. 15, No. 3 (1984), pp. 416-425.
[6] Fudenberg, Drew, and Jean Tirole. Game theory. MIT Press, 1991.
[7] John Geanakoplos, Common knowledge, in Handbook of Game Theory with
Economic Applications Volume 2 (1994), 1437-1496.
[8] John Geanakoplos and James Sebenius, Don't Bet On It: A Note on Contingent Agreements with Asymmetric Information, Journal of the American
Statistical Association (1983), 78(382): 224-226.
[9] P. Ghirardato, Revisiting Savage in a conditional world, Economic Theory,
Vol. 20 (2002), pp. 83-92.
[10] Gibbard, Allan. Manipulation of voting schemes: a general result. Econometrica: journal of the Econometric Society (1973): 587-601.
[11] Harsanyi, John C., and Reinhard Selten. A general theory of equilibrium
selection in games. MIT Press, 1988.
[12] Oliver Hart, On the optimality of equilibrium when the market structure
is incomplete, Journal of Economic Theory 11 (1975), 418-443.
[13] Geoffrey A. Jehle and Philip J. Reny, Advanced Microeconomic Theory,
Prentice Hall; 2nd edition, 2000.
[14] Ehud Kalai and Ehud Lehrer, Rational learning leads to Nash equilibrium,
Econometrica Vol. 61, No. 5, 1993, pp. 1019-1045.
[15] David M. Kreps and Evan L Porteus, Temporal Resolution of Uncertainty
and Dynamic Choice Theory, Econometrica, vol. 46 (1978), pages 185-200.
[16] Krishna, Vijay. Auction theory. Academic press, 2009.
[17] David Laibson, Golden eggs and hyperbolic discounting, Quarterly Journal of Economics 112.2 (1997): 443-478.
[18] Lars Ljungqvist and Thomas J. Sargent, Recursive Macroeconomic Theory,
The MIT Press; 2nd edition, 2004.
[19] Mark Machina, Dynamic consistency and non-expected utility models of
choice under uncertainty, Journal of Economic Literature, 28 (1989), 1622-1668.
[20] Mailath, George J., and Larry Samuelson. Repeated games and reputations: long-run relationships. OUP Catalogue (2006).
[21] Andrew Mas-Colell, Michael Whinston and Jerry Green, Microeconomic
Theory, Oxford University Press, 1995.
[22] Maskin, Eric. Nash equilibrium and welfare optimality. The Review of
Economic Studies 66.1 (1999): 23-38.
[23] Herve Moulin, Axioms of Cooperative Decision Making, Cambridge University Press, 1991.
[24] Martin J. Osborne and Ariel Rubinstein, A Course in Game Theory, MIT
Press, 1994.
[25] Pazner, E., and D. Schmeidler, 1974. A difficulty in the concept of equity.
Review of Economic Studies 41, 441-443.
[26] Torsten Persson and Guido E. Tabellini, Political Economics: Explaining
Economic Policy, MIT Press, 2000.
[27] Ariel Rubinstein, Modeling Bounded Rationality, MIT Press, 1998.
[28] Rafael Repullo, A simple proof of Maskin's theorem on Nash implementation, Social Choice and Welfare, (1987), 4, 39-41.
[29] Salant, Yuval. Procedural analysis of choice rules with applications to
bounded rationality. The American Economic Review 101.2 (2011): 724-748.
[30] Satterthwaite, Mark Allen. Strategy-proofness and Arrow's conditions:
Existence and correspondence theorems for voting procedures and social
welfare functions. Journal of economic theory 10.2 (1975): 187-217.
[31] Shapley, Lloyd and Scarf, Herbert, On cores and indivisibility, Journal of
Mathematical Economics, vol. 1 (1974), pages 23-37.
[32] Vernon L. Smith, Experimental auction markets and the Walrasian hypothesis, Journal of Political Economy, Vol. 73 (1965), pp. 387-393.
[33] Richard H. Thaler and Cass R. Sunstein, Nudge: Improving Decisions
About Health, Wealth, and Happiness, Yale University Press, 2008.
[34] Kahneman, Daniel, and Amos Tversky, Prospect Theory: An Analysis of
Decision under Risk, Econometrica, 47 (1979), 263-291.
[35] Van Damme, Eric. Stability and perfection of Nash equilibria. Springer,
1991.
[36] Vega-Redondo, Fernando. Evolution, games, and economic behaviour. Oxford University Press, 1996.
[37] Xavier Vives, Small income effects: A Marshallian theory of consumer surplus and downward sloping demand, Review of Economic Studies 54 (1)
(1987) 87-103.
[38] Weibull, Jorgen W. Evolutionary game theory. MIT press, 1997.
[39] H. Peyton Young, An axiomatization of Borda's rule, Journal of Economic
Theory, 1974, vol. 9, issue 1, pages 43-52.
Exercise 43 Let Good 1 be the consumption good at Period 1, and Good 2 be the consumption good at Period 2.
(i) Describe by means of indifference curves the preference of a consumer who cares
only about consumption at Period 1.
(ii) Describe by means of indifference curves the preference of a consumer exhibiting
perfect substitution between consumption in the two periods, such that he cares more
about current consumption.
(iii) Describe by means of indifference curves the preference of a consumer exhibiting
perfect substitution between consumption in the two periods, such that he cares more
about future consumption.
Answer 2 Take consumption in Period 1 on the horizontal axis and consumption in
Period 2 on the vertical axis. Then,
(i) Vertical and parallel straight lines.
(ii) Downward-sloping parallel straight lines, which are steeper than negative 45-degree
lines.
(iii) Downward-sloping parallel straight lines, which are flatter than negative 45-degree
lines.
Answer 3 Take consumption at State 1 on the horizontal axis and consumption at
State 2 on the vertical axis. Then,
(i) This is the case of perfect substitution. Indifference curves are downward-sloping
parallel straight lines, with the absolute value of the slope being (2/3)/(1/3) = 2.
(ii) This is the case of perfect complementarity. Indifference curves are parallel and L-shaped, with their kinks aligned along the 45-degree line $x_1 = x_2$.
Answer 4 Let x1 denote consumption in Period 1 and x2 denote consumption in
Period 2.
(i) u(x) = x1 or its arbitrary monotone transformation.
(ii) u(x) = ax1 + bx2 with a > b > 0, or its arbitrary monotone transformation.
(iii) u(x) = ax1 + bx2 with b > a > 0, or its arbitrary monotone transformation.
Answer 5 Let x1 denote consumption at State 1 and x2 denote consumption at State
2.
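Answer 6 (i) $\frac{\partial}{\partial x_1}\left(x_1^{2/5} x_2^{3/5}\right) = \frac{2}{5} x_1^{-3/5} x_2^{3/5}$.
(ii) $\frac{\partial}{\partial x_2}\left(x_1^{2/5} x_2^{3/5}\right) = \frac{3}{5} x_1^{2/5} x_2^{-2/5}$.
(iii)
$$MRS(x) = \frac{\frac{\partial}{\partial x_1}\left(x_1^{2/5} x_2^{3/5}\right)}{\frac{\partial}{\partial x_2}\left(x_1^{2/5} x_2^{3/5}\right)} = \frac{\frac{2}{5} x_1^{-3/5} x_2^{3/5}}{\frac{3}{5} x_1^{2/5} x_2^{-2/5}} = \frac{2x_2}{3x_1}.$$
(iv) (i) $\frac{\partial}{\partial x_1}(2\ln x_1 + 3\ln x_2) = \frac{2}{x_1}$.
(ii) $\frac{\partial}{\partial x_2}(2\ln x_1 + 3\ln x_2) = \frac{3}{x_2}$.
(iii)
$$MRS(x) = \frac{\frac{\partial}{\partial x_1}(2\ln x_1 + 3\ln x_2)}{\frac{\partial}{\partial x_2}(2\ln x_1 + 3\ln x_2)} = \frac{2/x_1}{3/x_2} = \frac{2x_2}{3x_1}.$$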
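The tangency condition $MRS(x) = p_1/p_2$ reads
$$\frac{2x_2}{3x_1} = \frac{p_1}{p_2},$$
implying $x_2 = \frac{3p_1}{2p_2} x_1$. Plugging this into the budget equation gives
$$p_1 x_1 + p_2 \cdot \frac{3p_1}{2p_2} x_1 = \frac{5p_1}{2} x_1 = I.$$
Thus we obtain $x_1(p, I) = \frac{2I}{5p_1}$. By plugging this into the previous formula we obtain
$x_2(p, I) = \frac{3I}{5p_2}$. By putting these into the utility representation we obtain the indirect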
2
3p1
x1
2p2
3p1
2p2
)3
3p1
2p2
)3
5
x1
2
5
2
5
e1,p1
e1,p2
1
a2 p2 + 2b2 p1
a2 p2 + b2 p1
b2 p 1
a2 p2 + b2 p1
2
5
3
5
2
5
3
5
2I
5p1
and x2 (p, I) =
((
Hence
p1 + p1
p1
p1 +p1
CS =
p1
(
1
)2
5
1 I
p1
p1 + p1
)2 )
5
2I
p1 + p1
2I
dq =
ln
5q
5
p1
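Answer 10 (i) $300 + \frac{1}{1.04} \cdot 400 \approx 684.6$
(ii) $200 + \frac{1}{1.1} \cdot 300 + \frac{1}{1.1^2} \cdot 500 \approx 885.95$
(iii) $\sum_{t=1}^{\infty} \frac{300}{1.06^{t-1}} = \frac{300}{1 - \frac{1}{1.06}} = \frac{1.06}{0.06} \cdot 300 = 5300$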
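The discounting arithmetic in Answer 10 can be checked numerically. The following is only a minimal Python sketch: the helper names `present_value` and `perpetuity_due` are my own illustrative choices, and I assume that the first payment of each stream arrives today, as in the formulas of the answer.

```python
def present_value(cash_flows, rate):
    """Discount a list of payments; cash_flows[t] arrives t periods from now."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(cash_flows))


def perpetuity_due(payment, rate):
    """Value of `payment` received every period forever, starting today."""
    # Geometric series: payment / (1 - 1/(1+rate)) = payment * (1+rate) / rate
    return payment * (1 + rate) / rate


pv_i = present_value([300, 400], 0.04)        # part (i)
pv_ii = present_value([200, 300, 500], 0.10)  # part (ii)
pv_iii = perpetuity_due(300, 0.06)            # part (iii)

print(round(pv_i, 2), round(pv_ii, 2), round(pv_iii, 2))
```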
Answer 11 Since the marginal utility of current consumption is $\frac{1}{x_1}$ and the marginal utility of future consumption is $\frac{0.95}{x_2}$, the MRS of future consumption for current consumption is
$$MRS(x) = \frac{x_2}{0.95 x_1}.$$
From the tangency condition $MRS(x) = 1 + r$ we obtain $\frac{x_2}{0.95 x_1} = 1 + r = 1.04$, implying
$x_2 = 1.04 \times 0.95\, x_1$.
Plugging this into the lifetime budget equation $x_1 + \frac{1}{1.04} x_2 = 40 + \frac{1}{1.04} \cdot 30 = 68.85$
and solving for $x_1$, we obtain $x_1 = 35.3$ and $x_2 = 34.88$. Saving is thus $40 - 35.3 = 4.7$.
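The steps of Answer 11 can be traced in a few lines of Python. This is a sketch under the assumption, suggested by the marginal utilities in the answer, that lifetime utility is $\ln x_1 + 0.95 \ln x_2$; the function name `two_period_choice` and its parameter names are hypothetical.

```python
def two_period_choice(y1, y2, r, delta):
    """Optimal plan for u = ln(x1) + delta * ln(x2) under the lifetime budget.

    Tangency gives x2 = (1 + r) * delta * x1; substituting into
    x1 + x2 / (1 + r) = y1 + y2 / (1 + r) gives x1 = wealth / (1 + delta).
    """
    wealth = y1 + y2 / (1 + r)
    x1 = wealth / (1 + delta)
    x2 = (1 + r) * delta * x1
    saving = y1 - x1
    return x1, x2, saving


x1, x2, saving = two_period_choice(y1=40, y2=30, r=0.04, delta=0.95)
print(round(x1, 2), round(x2, 2), round(saving, 2))
```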
Answer 12 (i) Let $z$ denote the certainty equivalent. Then from $\sqrt{z} = 0.4\sqrt{256} +$
(ii) Let $\pi$ denote the probability of rain. Then from $\sqrt{225} = \pi\sqrt{361} + (1 - \pi)\sqrt{64}$ we
obtain $\pi = 7/11$.
Answer 13 Let $t$ denote the investment in A. Then the final income at State 1 is
$1.2t + 0.8(100 - t) = 80 + 0.4t$, and that at State 2 is $0.9t + 1.5(100 - t) = 150 - 0.6t$.
Hence the expected utility is
$$0.6 \ln(80 + 0.4t) + 0.4 \ln(150 - 0.6t).$$
Taking the derivative with respect to $t$, the first-order condition is
$$0.6 \cdot \frac{0.4}{80 + 0.4t} + 0.4 \cdot \frac{-0.6}{150 - 0.6t} = 0.$$
By solving the above we obtain $t = 70$. Hence the investment in A is 70 and that in
B is 30.
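As a cross-check of Answer 13, the expected utility can be maximized numerically instead of through the first-order condition. The sketch below uses a plain ternary search, which is my own choice of method (it relies on the objective being concave in $t$); the function names are illustrative.

```python
import math


def expected_utility(t):
    """Expected log utility when t is invested in A and 100 - t in B."""
    return 0.6 * math.log(80 + 0.4 * t) + 0.4 * math.log(150 - 0.6 * t)


def maximize_concave(f, lo, hi, tol=1e-9):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1   # the maximizer lies to the right of m1
        else:
            hi = m2   # the maximizer lies to the left of m2
    return (lo + hi) / 2


t_star = maximize_concave(expected_utility, 0.0, 100.0)
print(round(t_star, 2), round(100 - t_star, 2))
```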
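Answer 14 The MRS of Good 2 for Good 1 for consumer $i$ is
$$MRS_i(x_i) = \frac{a_i / (2\sqrt{x_{i1}})}{b_i / (2\sqrt{x_{i2}})} = \frac{a_i \sqrt{x_{i2}}}{b_i \sqrt{x_{i1}}}.$$
From the tangency condition $MRS_i(x_i) = \frac{p_1}{p_2}$ we obtain
$$x_{i2} = \frac{b_i^2 p_1^2}{a_i^2 p_2^2}\, x_{i1}.$$
Plugging this into the budget equation $p_1 x_{i1} + p_2 x_{i2} = p_1 e_{i1} + p_2 e_{i2}$ and solving for
$x_{i1}$, we obtain the demand function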
xi1 (p) =
p1
e
p2 i1
p1
p2
+ ei2
a2
i
b2
i
p2
1
p2
2
xi2 (p) =
b2i p21
a2i p22
p1
e
p2 i1
p1
p2
+ ei2
a2
i
b2
i
p2
1
p2
2
i=1
p
1
p
2
a2
i
b2
i
n
+ ei2
ei1
( )2 =
p
i=1
p1
2
B xB2
B xB1
A xA2
A xA1
Combining this with the feasibility conditions $x_{A1} + x_{B1} = e_1$ and $x_{A2} + x_{B2} = e_2$, we
obtain
B (e2 xA2 )
A xA2
=
A xA1
B (e1 xA1 )
Hence the set of Pareto-efficient allocations is
{
}
B (e2 xA2 )
A xA2
(xA , xB ) :
=
, xA1 + xB1 = e1 , xA2 + xB2 = e2
A xA1
B (e1 xA1 )
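$MP_1(x_1, x_2) = \frac{1}{3} x_1^{-2/3} x_2^{1/5}$, $MP_2(x_1, x_2) = \frac{1}{5} x_1^{1/3} x_2^{-4/5}$ and $TRS(x_1, x_2) = \frac{MP_1(x_1, x_2)}{MP_2(x_1, x_2)} = \frac{5x_2}{3x_1}$.
(i) From the profit maximization condition $p\,MP_1 = w_1$ and $p\,MP_2 = w_2$, we have
$$p \cdot \frac{1}{3} x_1^{-2/3} x_2^{1/5} = w_1, \qquad p \cdot \frac{1}{5} x_1^{1/3} x_2^{-4/5} = w_2.$$
Dividing the first equation by the second gives
$$\frac{5x_2}{3x_1} = \frac{w_1}{w_2},$$
implying
$$x_2 = \frac{3w_1}{5w_2}\, x_1.$$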
3/7 15/7
w2
5/7
3/7 10/7
w2
$$\frac{5x_2}{3x_1} = \frac{w_1}{w_2},$$
hence we have
$$x_2 = \frac{3w_1}{5w_2}\, x_1.$$
3/8
w2 y 15/8
(i) $AC(y) = \frac{C(y)}{y} = \frac{y^3 - 4y^2 + 7y + 18}{y} = y^2 - 4y + 7 + \frac{18}{y}$.
(ii) $AVC(y) = \frac{VC(y)}{y} = \frac{y^3 - 4y^2 + 7y}{y} = y^2 - 4y + 7$.
(iii) $MC(y) = (y^3 - 4y^2 + 7y + 18)' = 3y^2 - 8y + 7$.
(iv) Since $AC'(y) = 2y - 4 - \frac{18}{y^2}$ becomes 0 at $y = 3$, the minimum of average cost is
$AC(3) = 10$. Hence the break-even point is $p = 10$.
(v) Since $AVC(y) = y^2 - 4y + 7 = (y - 2)^2 + 3$, the minimum of average variable cost
is 3. Hence the shut-down point is $p = 3$.
(vi) Shut down and produce nothing when $p < 3$. When $p \ge 3$, from $p = MC(y)$
we have $p = 3y^2 - 8y + 7$. There are two solutions to this quadratic equation, $y = \frac{4 \pm \sqrt{3p - 5}}{3}$,
but since the marginal cost curve must be upward-sloping at the profit maximization
point we have $S(p) = \frac{4 + \sqrt{3p - 5}}{3}$.
Summing up, we obtain
$$S(p) = \begin{cases} 0, & \text{when } p < 3 \\ \frac{4 + \sqrt{3p - 5}}{3}, & \text{when } p \ge 3. \end{cases}$$
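The break-even point, the shut-down point and the supply function can be verified numerically. The sketch below assumes a fixed cost of 18, which is the value consistent with $AC'(y) = 2y - 4 - 18/y^2$ and $AC(3) = 10$ in part (iv); the function names are my own.

```python
import math

FIXED_COST = 18.0  # assumption: the value implied by AC'(3) = 0 and AC(3) = 10


def avg_cost(y):
    return y**2 - 4 * y + 7 + FIXED_COST / y


def avg_var_cost(y):
    return y**2 - 4 * y + 7


def marginal_cost(y):
    return 3 * y**2 - 8 * y + 7


def supply(p):
    """Supply: zero below the shut-down price, else the larger root of p = MC(y)."""
    if p < 3:
        return 0.0
    return (4 + math.sqrt(3 * p - 5)) / 3


print(avg_cost(3), avg_var_cost(2), supply(3), supply(10))
```

At the shut-down price the firm supplies $y = 2$, the minimizer of average variable cost, and at the break-even price it supplies $y = 3$, the minimizer of average cost.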
Answer 20 Let $p$ denote the relative price of Good 1 for Good 2. Then consumer $i$'s
optimal consumption of Good 1 is determined by $MRS_i(x_i) = \frac{a_i}{2\sqrt{x_{i1}}} = p$ alone here, which gives
$$x_{i1} = \frac{a_i^2}{4p^2}.$$
2c
k
i=1
k=1
71.43.
7
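(iii) Since revenue in A is $R_A(y_A) = (90 - y_A) y_A$, marginal revenue there is $MR_A(y_A) = 90 - 2y_A$. Since revenue in B is $R_B(y_B) = (120 - 2y_B) y_B$, marginal revenue there is
$MR_B(y_B) = 120 - 4y_B$. Since marginal cost is $MC(y_A + y_B) = y_A + y_B$, the profit
maximization condition is
$$90 - 2y_A = y_A + y_B, \qquad 120 - 4y_B = y_A + y_B.$$
Solving these we obtain $y_A = \frac{165}{7} \approx 23.57$, $y_B = \frac{135}{7} \approx 19.29$ and $p_A = \frac{465}{7} \approx 66.43$, $p_B = \frac{570}{7} \approx 81.43$.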
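The two profit maximization conditions in (iii) form a linear system, $3y_A + y_B = 90$ and $y_A + 5y_B = 120$, which can be solved exactly. The sketch below uses Cramer's rule with exact fractions; the helper `solve_2x2` is a hypothetical name, and the prices are read off the inverse demands $90 - y_A$ and $120 - 2y_B$ implied by the revenue functions.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y


# MR_A = MC  ->  3*y_A +   y_B = 90
# MR_B = MC  ->    y_A + 5*y_B = 120
y_a, y_b = solve_2x2(3, 1, 90, 1, 5, 120)
p_a = 90 - y_a       # price in market A
p_b = 120 - 2 * y_b  # price in market B

print(y_a, y_b, p_a, p_b)
```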
Answer 23 (1)
1. For A, Y is strictly dominated by W. Hence eliminate Y.
2. For B, G is strictly dominated by F. Hence eliminate G.
3. For A, V is strictly dominated by X. Hence eliminate V.
4. For B, H is strictly dominated by I. Hence eliminate H.
We cannot eliminate any further. Hence {X, Z, W} is the set of A's strategies that
survive the elimination, and {F, I} is the set of B's strategies that survive the elimination.
(2)
1. For A, Y and Z can never be a best response, hence they are eliminated.
2. For B, G can never be a best response, hence it is eliminated.
3. For A, V can never be a best response, hence it is eliminated.
4. For B, H can never be a best response, hence it is eliminated.
We cannot eliminate any further. Hence {X, W} is the set of A's rationalizable
strategies, and {F, I} is the set of B's rationalizable strategies.
(3) There are two pure-strategy Nash equilibria, (X, I) and (W, F ).
Answer 24 Since nobody can announce any number greater than 100, it is impossible
that half of the average of the announced numbers is greater than 50. Hence any
number greater than 50 can never be a best response. Thus nobody announces any
number greater than 50, and therefore it is impossible that half of the average of the
announced numbers is greater than 25. Hence any number greater than 25 can never
be a best response. Thus nobody announces any number greater than 25, and therefore it
is impossible that half of the average of the announced numbers is greater than 12.5.
And so on.
By repeating this argument, 0 is the only rationalizable strategy for anybody.
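The shrinking upper bound in this argument can be tabulated directly. The following is a small illustrative Python sketch; the function name `iterate_upper_bounds` is my own, and each round simply halves the largest number that can still be a best response.

```python
def iterate_upper_bounds(initial_bound=100.0, rounds=10):
    """Upper bound on announcements surviving each round of elimination."""
    bounds = [initial_bound]
    for _ in range(rounds):
        # Half of the average cannot exceed half of the current upper bound.
        bounds.append(bounds[-1] / 2)
    return bounds


bounds = iterate_upper_bounds()
print(bounds[:4], bounds[-1])
```

The bound after $k$ rounds is $100 / 2^k$, which converges to 0 as $k$ grows.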
Answer 25 To explain A's best response, consider for example that B chooses F and
C chooses K. Then compare A's payoffs in the upper-left cells of the two matrices.
If A chooses X he gets 2, and if he chooses Y he gets 5; hence the best response is Y, and we
underline 5. Do the same for A in the other cells, and for B and C do the usual underlining
exercise. Then we obtain

s_A = X
               C: K        C: L
   B: F      2, 1, 4     4, 3, 8
   B: G      7, 2, 1     2, 1, 3

s_A = Y
               C: K        C: L
   B: F      5, 1, 8     1, 3, 2
   B: G      4, 9, 3     3, 4, 5

Thus, there are two pure-strategy Nash equilibria, (X, F, L) and (Y, G, L).
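The underlining exercise amounts to checking, for every strategy profile, that no player gains from a unilateral deviation. A brute-force Python sketch of that check, using the payoffs of the two matrices in this answer (the dictionary layout and the function name `is_nash` are my own):

```python
from itertools import product

STRATEGIES = (("X", "Y"), ("F", "G"), ("K", "L"))  # players A, B, C

# Payoffs (u_A, u_B, u_C) for each profile (s_A, s_B, s_C).
PAYOFFS = {
    ("X", "F", "K"): (2, 1, 4), ("X", "F", "L"): (4, 3, 8),
    ("X", "G", "K"): (7, 2, 1), ("X", "G", "L"): (2, 1, 3),
    ("Y", "F", "K"): (5, 1, 8), ("Y", "F", "L"): (1, 3, 2),
    ("Y", "G", "K"): (4, 9, 3), ("Y", "G", "L"): (3, 4, 5),
}


def is_nash(profile):
    """True if no player can strictly gain by deviating unilaterally."""
    for player, options in enumerate(STRATEGIES):
        for alternative in options:
            deviation = list(profile)
            deviation[player] = alternative
            if PAYOFFS[tuple(deviation)][player] > PAYOFFS[profile][player]:
                return False
    return True


equilibria = [p for p in product(*STRATEGIES) if is_nash(p)]
print(equilibria)
```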
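Answer 26 Since A's expected payoff given $(p_A, p_B)$ is
$u_A(p_A, p_B)$,
A's best response to $p_B$ is
$$BR_A(p_B) = \begin{cases} \{1\}, & \text{when } p_B < 3/5 \\ [0, 1], & \text{when } p_B = 3/5 \\ \{0\}, & \text{when } p_B > 3/5. \end{cases}$$
B's best response to $p_A$ is
$$BR_B(p_A) = \begin{cases} \{1\}, & \text{when } p_A < 3/7 \\ [0, 1], & \text{when } p_A = 3/7 \\ \{0\}, & \text{when } p_A > 3/7. \end{cases}$$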
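$$\max_{y_A}\ 96y_A - 2y_A^2 - 2y_B y_A$$
The first-order condition gives A's best response $y_A = \frac{48 - y_B}{2}$, and in the same way B's best response is $y_B = \frac{48 - y_A}{2}$. A Nash equilibrium $(y_A^*, y_B^*)$ therefore satisfies
$$y_A^* = \frac{48 - y_B^*}{2}, \qquad y_B^* = \frac{48 - y_A^*}{2}.$$
Hence we obtain $y_A^* = y_B^* = 16$.
(ii) B's optimal choice given $y_A$ is solved in the same way as above, at least mathematically. Then B's strategy is a function $f_B$ given by
$$f_B(y_A) = \frac{48 - y_A}{2}.$$
Anticipating this, A maximizes $96y_A - 2y_A^2 - 2f_B(y_A)\, y_A = 48y_A - y_A^2$, implying $y_A = 24$.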
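Both the simultaneous-move and the leader-follower outcomes can be checked numerically from the best-response function $(48 - y)/2$. The sketch below is only illustrative: it finds the simultaneous-move equilibrium by iterating best responses (which converges here because the map is a contraction) and the leader's output by a grid search; these methods and all function names are my own choices.

```python
def best_response(rival_output):
    """Best response derived from the first-order condition 96 - 4y - 2*rival = 0."""
    return (48 - rival_output) / 2


def cournot_equilibrium(rounds=100):
    """Iterate simultaneous best responses starting from zero output."""
    y_a = y_b = 0.0
    for _ in range(rounds):
        y_a, y_b = best_response(y_b), best_response(y_a)
    return y_a, y_b


def leader_profit(y_a):
    """A's objective when B reacts according to f_B(y_A) = (48 - y_A) / 2."""
    return 96 * y_a - 2 * y_a**2 - 2 * best_response(y_a) * y_a


def stackelberg_leader_output():
    """Grid search over outputs 0.00, 0.01, ..., 48.00."""
    candidates = [i / 100 for i in range(4801)]
    return max(candidates, key=leader_profit)


y_a_star, y_b_star = cournot_equilibrium()
y_a_leader = stackelberg_leader_output()
print(y_a_star, y_b_star, y_a_leader, best_response(y_a_leader))
```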
58 + pB
4
58 + pA
4
58 + pB
,
4
pB =
58 + pA
4
pA = pB = 11.6
(ii) B's optimal choice given $p_A$ is solved in the same way as above, at least mathematically. Then B's strategy is a function $f_B$ given by
fB (pA ) =
58 + pA
4
=
=
7
p
2 A
143
2
= 0 we
C
N
CC
10, 10
12p, 5 6p
CN
5 + 5p, 10p
12p, p
NC
10 11p, 10 + 2p
0, 5 5p
NN
5 6p, 12p
0, 0
and as long as $0 < p < 1$, NC is the dominant strategy for B. Hence the Bayesian-Nash
equilibrium is (C, NC) when $p \le 10/11$ and (N, NC) when $p \ge 10/11$.
Answer 33 The corresponding Bayesian game is
B
N
S/G, S/B
20p 10, 0
0, 20p 10
S/G, N/B
10p, 10 + 10p
0, 20p 10
B
N/G, S/B
10 + 10p, 10p
0, 20p 10
N/G, N/B
0, 20p 10
0, 20p 10
Note that since $0 < p < 1$, B's only best response to A's buying is (N/G, S/B), and
anything is optimal for B when A is not buying.
There are two BNE for all 0 < p < 1, (N, (N/G, S/B)) and (N, (N/G, N/B)).
When 0 < p < 1/2 there is one more BNE, (N, (S/G, S/B)).
Answer 34 Let $v_i$ denote bidder $i$'s willingness to pay. Let $b_{-i} = (b_1, \dots, b_{i-1}, b_{i+1}, \dots, b_n)$
denote a profile of bids except for $i$'s. Then an entire bidding profile is denoted
by $(b_i, b_{-i})$.
Then bidder $i$'s payoff in the all-pay auction game is
$$U_i(b_i, b_{-i}) = \begin{cases} v_i - b_i, & \text{if } b_i > \max_{j \ne i} b_j \\ -b_i, & \text{if } b_i < \max_{j \ne i} b_j. \end{cases}$$
Ignore the case of ties since it is of probability zero here.
Denote the bidding function in the symmetric Bayesian-Nash equilibrium by $\beta : [0, 1] \to \mathbb{R}$.
Suppose all bidders other than $i$ are following $\beta$. Then if $i$ bids $b_i$ his expected
payoff is
$$v_i \,\mathrm{Prob}\left(\max_{j \ne i} \beta(v_j) < b_i\right) - b_i = v_i F(\beta^{-1}(b_i))^{n-1} - b_i.$$
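Maximizing the total surplus
$$-\sum_{i=1}^{n} (v_i - g)^2,$$
from the first-order condition $\sum_{i=1}^{n} 2(v_i - g) = 0$ we have
$$g(v) = \frac{1}{n} \sum_{i=1}^{n} v_i.$$
Since surplus maximization for those other than $i$ is determined similarly, the
Clarke tax to be paid by $i$ is
$$t_i(v) = \max_g \left\{ -\sum_{j \ne i} (v_j - g)^2 \right\} + \sum_{j \ne i} (v_j - g(v))^2$$
$$= -\sum_{j \ne i} \left( v_j - \frac{1}{n-1} \sum_{k \ne i} v_k \right)^2 + \sum_{j \ne i} \left( v_j - \frac{1}{n} \sum_{k=1}^{n} v_k \right)^2$$
$$= (n-1) \left( \frac{1}{n-1} \sum_{k \ne i} v_k - \frac{1}{n} \sum_{k=1}^{n} v_k \right)^2.$$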
Answer 41 (i) Round 1: i1 applies for s3, i2 applies for s3, i3 applies for s2, i4 applies
for s2, i5 applies for s2, and i6 applies for s3.
s2 admits i4, i5, and s3 admits i1, i6.
Round 2: i2 applies for s2, but the seats are already full and he gets rejected. i3 applies
for s3, but the seats are already full and he gets rejected.
Round 3: i2 and i3 apply for s1, and get admitted.
Here, for example, i2 can report s2 ≻ s3 ≻ s1 instead of his true preference. Given that
the others are reporting truthfully, he is then admitted to s2, which is better for him
than s1.
(ii) Round 1: i1 applies for s3, i2 applies for s3, i3 applies for s2, i4 applies for s2, i5
applies for s2, and i6 applies for s3.
s2 keeps i4, i5 and rejects i3. s3 keeps i1, i6 and rejects i2.
Round 2: i2 applies for s2, and i3 applies for s3.
s2 keeps i4, i2 and rejects i5. s3 keeps i6, i3 and rejects i1.
Round 3: i1 applies for s2, and i5 applies for s3.
s2 keeps i2, i1 and rejects i4. s3 keeps i3, i5 and rejects i6.
Round 4: i4 applies for s1, and i6 applies for s1.
s1 keeps i4, i6, s2 keeps i2, i1, and s3 keeps i3, i5.
Summing up, we obtain
i1 → s2, i2 → s2, i3 → s3, i4 → s1, i5 → s3, i6 → s1.
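The rounds in (ii) follow the student-proposing deferred acceptance procedure, which can be sketched in Python as below. The preference lists and priority orders in the code are assumptions: I reconstructed them so that they are consistent with the applications and rejections described in this answer, and wherever the answer does not pin down a ranking (for example the order of s1's priorities) the choice is arbitrary. Each school is assumed to have two seats.

```python
def deferred_acceptance(student_prefs, school_priority, capacity):
    """Student-proposing deferred acceptance; returns {student: school}."""
    rank = {s: {stu: k for k, stu in enumerate(order)}
            for s, order in school_priority.items()}
    next_choice = {stu: 0 for stu in student_prefs}
    held = {s: [] for s in school_priority}
    free = list(student_prefs)
    while free:
        stu = free.pop()
        school = student_prefs[stu][next_choice[stu]]
        next_choice[stu] += 1
        held[school].append(stu)
        held[school].sort(key=lambda x: rank[school][x])
        if len(held[school]) > capacity[school]:
            free.append(held[school].pop())  # reject the lowest-priority applicant
    return {stu: s for s, students in held.items() for stu in students}


# Assumed data, chosen to be consistent with the rounds described in the answer.
student_prefs = {
    "i1": ["s3", "s2", "s1"], "i2": ["s3", "s2", "s1"],
    "i3": ["s2", "s3", "s1"], "i4": ["s2", "s1", "s3"],
    "i5": ["s2", "s3", "s1"], "i6": ["s3", "s1", "s2"],
}
school_priority = {
    "s1": ["i1", "i2", "i3", "i4", "i5", "i6"],
    "s2": ["i1", "i2", "i4", "i5", "i3", "i6"],
    "s3": ["i3", "i5", "i6", "i1", "i2", "i4"],
}
capacity = {"s1": 2, "s2": 2, "s3": 2}

matching = deferred_acceptance(student_prefs, school_priority, capacity)
print(sorted(matching.items()))
```

The code processes one applicant at a time rather than in simultaneous rounds; for student-proposing deferred acceptance the final matching does not depend on that order.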
Answer 42 (i) Since MRSs are equalized at Pareto-efficient allocations, we have
$$MRS_A(x_A) = \frac{1}{x_{A1}} = \frac{2}{x_{B1}} = MRS_B(x_B),$$
which implies $x_{B1} = 2x_{A1}$. Combining this with the resource constraint $x_{A1} + x_{B1} = 12$,
we obtain $x_{A1} = 4$, $x_{B1} = 8$.
Since the efficiency condition is silent about the allocation of Good 2 here, the set of
efficient allocations is
$$\{(x_A, x_B) : x_{A1} = 4,\ x_{B1} = 8,\ x_{A2} + x_{B2} = 0\}.$$
(ii) Since the allocation of Good 1 is $x_{A1} = 4$, $x_{B1} = 8$ from efficiency, the condition
that A does not envy B is
$$\ln 4 + x_{A2} \ge \ln 8 + x_{B2}$$
and the condition that B does not envy A is
$$2\ln 8 + x_{B2} \ge 2\ln 4 + x_{A2}.$$
Combining these with the constraint $x_{A2} + x_{B2} = 0$, we obtain $\frac{1}{2}\ln 2 \le x_{A2} \le \ln 2$.
Hence the set of efficient and envy-free allocations is
$$\left\{(x_A, x_B) : x_{A1} = 4,\ x_{B1} = 8,\ x_{A2} + x_{B2} = 0,\ \frac{1}{2}\ln 2 \le x_{A2} \le \ln 2\right\}.$$
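The envy-freeness bounds can be checked by evaluating each consumer's utility at both bundles. The sketch below assumes the quasi-linear utilities $u_A = \ln x_{A1} + x_{A2}$ and $u_B = 2\ln x_{B1} + x_{B2}$, which are the forms implied by the two no-envy conditions in this answer; the function names are my own.

```python
import math


def u_a(x1, x2):
    return math.log(x1) + x2


def u_b(x1, x2):
    return 2 * math.log(x1) + x2


def is_envy_free(x_a2):
    """Check no-envy at the efficient split of Good 1, with x_B2 = -x_A2."""
    bundle_a = (4.0, x_a2)
    bundle_b = (8.0, -x_a2)
    a_ok = u_a(*bundle_a) >= u_a(*bundle_b)  # A does not envy B
    b_ok = u_b(*bundle_b) >= u_b(*bundle_a)  # B does not envy A
    return a_ok and b_ok


lower, upper = 0.5 * math.log(2), math.log(2)
print(round(lower, 4), round(upper, 4), is_envy_free(0.5))
```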