Vague Language and Context-Dependence:
An Experimental Study.∗
Wooyoung Lim† and Qinggong Wu‡
May 22, 2017
Abstract
In this paper we broaden the existing notion of vagueness to account for linguistic
ambiguity due to language used in a context-dependent way. This broadened notion,
termed literal vagueness, necessarily arises in any Pareto-optimal equilibrium in
many standard conversational situations. With controlled laboratory experiments we
find that people can make use of literally vague language to effectively transmit information.
Keywords: Communication Games, Context-Dependence, Vagueness, Laboratory Experiments
JEL classification numbers: C91, D03, D83
∗ This study is supported by a grant from the Research Grants Council of Hong Kong (Grant No. GRF16502015).
† Department of Economics, The Hong Kong University of Science and Technology. Email: wooyoung@ust.hk
‡ Department of Decision Sciences and Managerial Economics, Chinese University of Hong Kong. Email: wqg@baf.cuhk.edu.hk
1 Introduction
Language is vague, as this very sentence exemplifies: what does “vague” mean here,
after all? Despite economists’ persistent study of language and communication, we still
lack a good explanation for vagueness. On the contrary, as Lipman (2009)
forcefully argues, there is no benefit to vague language in a typical conversation. To explain vagueness, as Lipman (2009) notes, we need to relax the standard “full rationality”
assumption.
In this paper, we offer our explanation of why language is vague. Instead of deviating
from the full-rationality paradigm, we take an alternative perspective. In particular, we
broaden the notion of vagueness and show that vagueness in this broader sense has an
advantage in many standard conversational situations.
Vague language admits borderline cases. For example, no one can claim that there
is a clear cutoff between red and orange that allows him/her to separate the color spectrum
into two disjoint sets. Thus, if a speaker uses a vague language, the listener’s posterior beliefs
cannot be concentrated on a precise subset of the state space. Such vague languages cannot
create an efficiency improvement over a non-vague language.
We argue that in many cases, there is an exogenous relation between messages and
a particular sub-dimension of states they conventionally signify. The other dimension
with which the literal meaning of messages does not have any exogenous relation can be
regarded as context. One important observation from our daily conversations is that
the presence of context and the use of a language dependent upon the context may give
a listener the impression that the language is vague. We call such context-dependent use of language
and the vagueness that results from it literal vagueness. We show that literal vagueness can
indeed be a fundamental property of the optimal language in many standard conversations
in which the context — either directly payoff-relevant or not — matters. Thus literal
vagueness has a solid efficiency foundation, which makes it a plausible explanation for
why language is often vague.
We then experimentally investigate whether people could make use of literally vague
languages to efficiently communicate. We consider a conversational environment in which
two speakers speak sequentially to a listener, and the way the later speaker talks may rely
on what the earlier speaker has said. In this simple environment in which the optimal
language is necessarily literally vague, we observed that subjects indeed tended to use such a language.
Several variations of the environment with varying degrees of complexity in coordinating
on the optimal, literally vague language provide further supporting evidence. Although
our experimental results alone do not suffice to identify the efficiency advantage of literal
vagueness as the explanation for the omnipresence of vague languages, we view them as
an important first step toward understanding the vagueness of language.
In the rest of the introduction, we shall discuss the theoretical aspects of vagueness
and introduce our notion of literal vagueness.
Vagueness
Take the simplest case of communication: A speaker wishes to tell the state of the world
ω ∈ Ω to a listener by sending the latter a message m ∈ M , where Ω is the set of all states
and M the set of all available messages. For simplicity assume Ω and M are finite.
The language of the speaker is formally his message-sending strategy λ ∶ Ω → ∆(M )
where ∆(M ) is the set of all lotteries over M . Thus, if the speaker knows ω is the state,
he uses the lottery λ(ω) to draw a message.
Multiple states can be grouped together if the same lottery is used in them. In other
words, language λ induces a partition Πλ on Ω such that for any two states ω and ω ′ , Πλ (ω) =
Πλ (ω ′ ) if and only if λ(ω) = λ(ω ′ ). Language λ is not vague if the following is satisfied:
V. For any two states ω and ω ′ , if Πλ (ω) ≠ Πλ (ω ′ ) then λ(ω) and λ(ω ′ ) have disjoint
supports.
Otherwise λ is vague. What is vagueness meant to capture? Note that if a language
λ is not vague then for each message m ever used there is a unique block π(m) ∈ Πλ such
that π(m) contains the states in which m is drawn with positive probability. Therefore
the message m helps the listener precisely narrow down the possible states to π(m) in the
sense that the listener’s posterior is simply his prior concentrated on π(m). If λ is vague
then it does not induce such sharp demarcation of Πλ : Upon receiving some message m
that is used with positive probability in distinct blocks π ∈ Πλ and π ′ ∈ Πλ the listener
remains uncertain which block obtains. Theorem 1 of Lipman (2009) establishes that
under common interest there is always a Pareto-optimal language that is not vague, thus
negating the necessity of vague language.1
1 The definition we give here is weaker than that given in Lipman (2009), as the latter rules out any
randomization. The difference is minimal, though, because Condition V implies that the weaker version only
admits randomization among messages that are entirely synonymous. It is straightforward to verify that
all results in Lipman (2009) hold under the weaker definition as well.
Literal Vagueness
We would like to use an example to demonstrate that vagueness as defined above may
exclude languages that typically impress us as vague.
Example 1. At a conference, Bob, a graduate student, asks Alice, a distinguished scholar,
“How do you like my talk?” Alice’s reply depends on: 1) whether she likes Bob’s talk (L) or
not (NL), and 2) whether she has time for further conversation (T) or is in a hurry to get to the
next session (NT). If Alice does not have time she says “It’s interesting” (l). If she has time
she says “It’s interesting” if she likes the talk, or “The research can benefit from some further
development” (nl) if she does not like the talk.
Is Alice’s language vague? We can formalize the example with state space Ω = {L, NL} ×
{T, NT} and M = {l, nl}. Alice’s language is technically not vague as defined above, because
she uses only nl in state (NL, T) and only l in other states. However, Alice’s language may
strike Bob as vague. For instance, if Alice says “It’s interesting”, Bob is uncertain whether
Alice indeed likes the talk, or is just in a hurry to end the conversation.
Taking a closer look, we see that Bob’s confusion is rooted in the relation of the messages “It’s interesting” and “The research can benefit from some further development” to
the states. These messages have a literal interpretation that only concerns Alice’s opinion
about Bob’s talk and is irrelevant to whether Alice has time. Literal interpretation represents an exogenous relation between messages and the states they conventionally signify.
Thus, an impression of vagueness would result if the language does not precisely demarcate the “literal state space” {L, NL} as a non-vague language does for the true state space.
Languages like Alice’s are deemed non-vague according to the definition above because in the standard model of conversation, Ω and M are taken as abstract sets lacking
structure and relation. No exogenous literal interpretation exists outside the speaker’s
idiosyncratic use of the messages, and thus interpretation is purely endogenous. In the
background of everyday conversation, on the other hand, there is a focal language which
endows messages with exogenous literal interpretation, and the sense of vagueness often occurs because of the inconsistency between the common literal interpretation and
an individual’s idiosyncratic language use, as is the case in Example 1. Thus it would be
beneficial to study vagueness in a model into which literal interpretation can be built.
For this purpose we alter the standard model as follows: Suppose the state space has
a two-dimensional product structure such that Ω = F × C, where a typical f ∈ F is called
a feature and c ∈ C a context. The message space M has an exogenous literal relation
with F , but not C, that is, messages in M are conventionally used to describe features but
have no literal relation with the context.2 For instance “I’m well” is conventionally used
to describe wellness but not hurriedness. Given this setup we propose a new notion of
vagueness. Specifically, a language λ is said to be not literally vague if Πλ satisfies the
following conditions:
L1. For any two states ω and ω ′ , if Πλ (ω) ≠ Πλ (ω ′ ) then λ(ω) and λ(ω ′ ) have disjoint
supports.
L2. For any π ∈ Πλ there exist Fπ ⊂ F and Cπ ⊂ C such that π = Fπ × Cπ .
Otherwise λ is said to be literally vague.
If a language is not literally vague then it is not vague, because Condition L1 is identical to Condition V. Thus literal vagueness is broader than vagueness. The additional
Condition L2 further demands that, for a language to be not literally vague, every message m ever sent precisely narrows down the possible features for the listener in the sense
that upon receiving m the listener’s posterior about F is his prior concentrated on Fπ(m) ,
where π(m) is the unique block of Πλ in which m is used with positive probability.
Alice’s language in Example 1 is literally vague, because, given the message “It’s interesting”, Bob’s posterior about F = {L, NL} is not his prior concentrated on any subset of F. Indeed, the block of states in which she says “It’s interesting”, {(L, T), (L, NT), (NL, NT)}, is not a product Fπ × Cπ , so Condition L2 fails.
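The two definitions can be checked mechanically on a finite example. The following Python sketch is only an illustration, not part of the formal analysis: the function names (`partition`, `satisfies_V`, `satisfies_L2`) and the encoding of a language as a dictionary from states (f, c) to lotteries over messages are our own assumptions. It applies Condition V (which coincides with L1) and Condition L2 to Alice's language from Example 1.

```python
from itertools import product

def support(lottery):
    """Messages drawn with positive probability."""
    return frozenset(m for m, p in lottery.items() if p > 0)

def partition(language):
    """Group states that use the same lottery (the partition induced by the language)."""
    blocks = {}
    for state, lottery in language.items():
        key = frozenset((m, p) for m, p in lottery.items() if p > 0)
        blocks.setdefault(key, set()).add(state)
    return [frozenset(b) for b in blocks.values()]

def satisfies_V(language):
    """Condition V (identical to L1): distinct blocks use disjoint sets of messages."""
    blocks = partition(language)
    for i, b1 in enumerate(blocks):
        for b2 in blocks[i + 1:]:
            s1 = support(language[next(iter(b1))])
            s2 = support(language[next(iter(b2))])
            if s1 & s2:
                return False
    return True

def satisfies_L2(language):
    """Condition L2: every block is a product of a feature set and a context set."""
    for block in partition(language):
        features = {f for f, _ in block}
        contexts = {c for _, c in block}
        if set(product(features, contexts)) != set(block):
            return False
    return True

def is_vague(language):
    return not satisfies_V(language)

def is_literally_vague(language):
    return not (satisfies_V(language) and satisfies_L2(language))

# Example 1: Alice says "nl" only when she dislikes the talk and has time.
alice = {
    ("L", "T"): {"l": 1.0},
    ("L", "NT"): {"l": 1.0},
    ("NL", "T"): {"nl": 1.0},
    ("NL", "NT"): {"l": 1.0},
}
print(is_vague(alice), is_literally_vague(alice))  # False True
```

Under this encoding Alice's language passes Condition V but fails Condition L2, which is the sense in which it is literally vague without being vague.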
The Efficiency Advantage of Literal Vagueness
We wish to demonstrate that, unlike strictly vague languages, literally vague languages
may have an efficiency advantage. To show this, we adopt the standard cheap talk model
with common interest as Lipman (2009) does:3 The speaker is informed of the state ω ∈ Ω
and the listener is not. The listener’s job is to choose an action a from a set of available
actions A, and consequently both players receive a payoff of u(ω, a). To help the listener, the
speaker sends a message m ∈ M . To incorporate literal vagueness we impose the product
structure Ω = F × C on the state space as discussed above. We do not require a common
prior. However, we do assume that each state is believed to be possible by at least one of the players,
i.e., at least one player assigns it positive probability.
The following proposition establishes the potential efficiency advantage of literally
vague languages by showing that if the choice problem is not trivial and if messages are
limited then there are payoff functions u given which the optimal language must be literally vague.
2 The notion of context in our framework resembles the “prior collateral information” in Quine (1960).
3 An earlier, non-formal study on language use in this situation is due to Lewis (1969).
Proposition 1. If ∣F ∣ ≥ 2, ∣C∣ ≥ 2, ∣A∣ ≥ 2 and min(∣F ∣∣C∣ − 1, ∣A∣) ≥ ∣M ∣ ≥ 2 then there is
a payoff function u ∶ Ω × A ↦ R given which the sender’s language in any Pareto-optimal
perfect Bayesian equilibrium must be literally vague.
Proof. Pick distinct f, f ′ ∈ F , c, c′ ∈ C and a, a′ ∈ A. Let A∗ be an arbitrary subset of A ∖ {a, a′ }
such that ∣A∗ ∣ = ∣M ∣ − 2. Note that A∗ can be the empty set. Such A∗ exists because ∣A∣ ≥
∣M ∣ ≥ 2 implies ∣A ∖ {a, a′ }∣ = ∣A∣ − 2 ≥ ∣M ∣ − 2 ≥ 0.
Consider the utility function u such that:
1. a is the unique optimal action in states (f, c) and (f ′ , c′ ).
2. For any action a∗ ∈ A∗ there is a unique state (f ∗ , c∗ ) ∉ {(f, c), (f ′ , c′ ), (f, c′ )} such that
a∗ is the unique optimal action in (f ∗ , c∗ ).
3. a′ is the unique optimal action in the rest of the states, including (f, c′ ).
Such u exists because ∣A∗ ∣ ≤ ∣F ∣∣C∣ − 3. By construction of u there is a partition P on Ω such
that:
1. For any p ∈ P there is some α(p) ∈ A∗ ∪ {a, a′ } where α(p) is the unique optimal action
in any state ω ∈ p.
2. α(p) ≠ α(p′ ) if p ≠ p′ .
Let λ be the speaker’s language in a Pareto-optimal perfect Bayesian equilibrium.
Clearly λ can be described by the bijection µ ∶ P ↦ M such that λ(ω) = µ(P (ω)). Message µ(P (ω)) can be interpreted as the recommendation “the optimal action is α(P (ω)) in
the current state”. Therefore Πλ = P . Thus {(f, c), (f ′ , c′ )} ∈ Πλ since only for these two
states the optimal action is a. λ is literally vague because {(f, c), (f ′ , c′ )} is not a product
of any subsets of F and C.
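The construction in the proof can be illustrated on the smallest admissible case, ∣F∣ = ∣C∣ = ∣A∣ = ∣M∣ = 2, so that A∗ is empty. The Python sketch below is a brute-force check under simplifying assumptions that are ours and not part of the proposition: a uniform common prior, payoffs of 1 for the optimal action and 0 otherwise, and deterministic languages only. The labels `m1`, `m2` and all function names are hypothetical.

```python
from itertools import product

F, C = ["f", "f'"], ["c", "c'"]
A, M = ["a", "a'"], ["m1", "m2"]
states = list(product(F, C))

# Payoff from the proof: a is optimal in (f, c) and (f', c'), a' elsewhere.
optimal = {s: "a" if s in {("f", "c"), ("f'", "c'")} else "a'" for s in states}

def u(state, action):
    return 1 if optimal[state] == action else 0

def value(language):
    """Expected common payoff (uniform prior) when the listener best-responds."""
    total = 0
    for m in M:
        block = [s for s in states if language[s] == m]
        total += max(sum(u(s, a) for s in block) for a in A)
    return total / len(states)

def is_product(block):
    feats = {f for f, _ in block}
    ctxs = {c for _, c in block}
    return set(product(feats, ctxs)) == set(block)

def literally_vague(language):
    """A deterministic language satisfies L1 automatically, so only L2 can fail."""
    blocks = [[s for s in states if language[s] == m] for m in M]
    return any(b and not is_product(b) for b in blocks)

# Enumerate all 2^4 deterministic languages and keep the payoff-maximizing ones.
languages = [dict(zip(states, msgs)) for msgs in product(M, repeat=len(states))]
best = max(value(lang) for lang in languages)
best_languages = [lang for lang in languages if value(lang) == best]
print(best, len(best_languages), all(literally_vague(lang) for lang in best_languages))
# 1.0 2 True
```

In this small case the only languages that reach the maximal payoff are the two that pool (f, c) with (f ′, c′), and both violate Condition L2, in line with the argument of the proof.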
Remark 1. The conditions ∣F ∣ ≥ 2, ∣C∣ ≥ 2 and ∣A∣ ≥ 2 reflect that the state space and
the decision problem are not trivial. ∣M ∣ ≥ 2 reflects that communication is not trivial.
∣F ∣∣C∣ − 1 ≥ ∣M ∣ implies that available messages are not adequate to fully describe the
state space, that is, it is not possible to use a distinct message to denote each state. It
is noteworthy that the whole issue of vagueness would be less of a concern if messages
were not limited, for obviously full description is best if feasible.
Remark 2. The condition ∣A∣ ≥ ∣M ∣ is not necessary. The assumption is made to
simplify the proof, which is based on a particular construction. Although the construction
may strike us as contrived and peculiar, the spirit of the idea underlying the optimality of
a literally vague language does not crucially rely on ∣A∣ ≥ ∣M ∣, and is quite general.
The spirit is the following: if the context-dimension of the state is also payoff-relevant,
and the messages are not abundant enough, then in the optimal language a message may
be used in a context-dependent way, because the messages, albeit literally descriptive
only of features, need to contain information about the context as well. Therefore, to
correctly interpret which subset of features a message points to, it is necessary to know
the corresponding context; in contrast, without knowledge of the context, the message is
vague in terms of which subset of features it points to.
Remark 3. The result holds without the common prior assumption because
the constructed payoff function u implies that ex post optimality is achieved in a Pareto-optimal equilibrium. Thus the potential conflict of interest due to heterogeneous priors
becomes irrelevant for the Pareto-optimal equilibrium, because it is the strategy profile most preferred by both players regardless of prior beliefs.
Context-dependence
As mentioned in Remark 2 above, the optimality of literal vagueness is due to the nature of the decision problem being context-dependent, that is, the payoff depends on
the context-dimension of the state of which the focal language is not literally descriptive.
Consequently, the language may also be used in context-dependent ways. The same message refers to different subsets of features depending on the context that obtains, because
information about the context is also worth communicating. It is not unusual that the
meaning of a word depends on the context. Indeed, the whole linguistic field of pragmatics is dedicated to the study of context-dependence.
It is important to distinguish between two types of context-dependence embedded in a
language. The first type refers to the case in which the sender’s choice of message depends on
the context. For instance, there are two common Cantonese expressions for “thank you”,
“m-goi” and “do-ze”. People say “m-goi” to thank someone for something non-material,
e.g., “m-goi” is used when people ask for help or express gratitude for a favor. In contrast, “do-ze” is mostly used to thank someone for something material, e.g., a gift.4 This type of
context-dependent use of language need not be literally vague, because distinct messages
may simply convey distinct information about the context without differentiating between
information about the feature. The second type of context-dependence in a language refers
to the case in which the correct interpretation of which subset of features is signified by
a message depends on the context. In this case, not only does the speaker’s choice of message
depend on the context, but he also chooses the same message to signify different subsets
of features depending on the context. In Example 1, Alice’s use of “It’s interesting” corresponds to this case. Thus the second type is stronger than the first type, and it necessarily
implies literal vagueness. Hereafter, when we discuss context-dependence in a language
we mean the second type.
4 However, no such distinction in expressing gratitude exists in Mandarin Chinese.
A crucial source of context-dependence and thus literal vagueness is that the focal
language is essentially one-dimensional: It only has terms literally relating to the feature-dimension of the state. Why would people use such an overly parsimonious language? One
important reason is its simplicity. After all, using a richer language is more costly, and
often the richer language may not be feasible at all, for instance if the common language
shared by people from different linguistic backgrounds is only descriptive of one aspect of
the state. People with different native tongues can use facial expressions in person, or
emoji and emoticons such as :-) on the internet, as a common language to describe emotion, but
this common language lacks terms descriptive of any other aspect of the world.
In Example 1 the context is payoff-relevant. It is possible that even if the context is
not directly payoff-relevant, people still prefer to use a literally vague language. This is
particularly the case if the context is meta-linguistic. Here we give two examples, both
variants of the standard two-person cheap talk model with common interest.
Example 2. Alice wishes to describe to Bob, a tailor, the color she has in mind for her next
dress. Alice may have a small vocabulary for colors, only with typical terms like “blue”,
“red”, or she may have a large vocabulary for colors, which in addition also includes terms
denoting subtle colors like “maroon” and “turquoise”. Alice’s vocabulary is unknown to Bob,
and can be interpreted as the context-dimension of the state. Alice’s optimal language may
be literally vague.
This is an example of the more general model studied in Blume and Board (2013). In
their model, the available messages for the speaker may vary depending on the speaker’s
language competence. Consequently, in optimal communication speakers of different language competence may use the same message to indicate different sets of payoff-relevant
states, implying that the optimal language is vague in their model. Example 2 shows that
language competence can be incorporated as the context-dimension of the state, and in
the model with the enriched state space the optimal language is no longer vague, but is
instead literally vague.
Example 3. Alice wishes to describe the height of Charlie to Bob, so that Bob can recognize
Charlie at the airport and pick him up. Moreover,
• Alice knows whether Charlie is a professional basketball player or not.
• Bob may or may not know whether Charlie is a professional basketball player or not.
• Alice may or may not know whether Bob knows whether Charlie is a professional
basketball player or not.
Alice can describe Charlie as either “tall” or “short”. Her optimal choice of word can
depend both on Charlie’s height and on whether she believes Bob knows Charlie is a professional basketball player or not. For instance, if she believes Bob knows Charlie is a
professional basketball player then she says Charlie is “tall” only if his height is above 6
foot 10, whereas if she believes Bob does not know then she says Charlie is “tall” only if his height
is above 6 foot 2. If we take Alice’s belief about Bob’s knowledge as the context, then her
language is literally vague.
This example is adapted from a more casual discussion in Lipman (2009). A more formal model of the example is as follows: The listener’s knowledge about the payoff-relevant
state space is not common knowledge. In particular, prior to the conversation the listener
may receive with some probability an informative private signal which narrows down the
possible set of payoff-relevant states. Moreover, the speaker may also receive a private
(possibly noisy) signal which tells him whether the listener has received that informative
signal or not. It is very easy to construct a typical decision problem under which, in the
optimal equilibrium, the meaning of a message from the speaker depends on the signal
that he receives. If we do not incorporate the speaker’s signal as part of the state then
the optimal language is vague. However, in the enriched model in which the speaker’s signal is considered as the context and the payoff-relevant state as the feature, the optimal
language is not vague but literally vague.
Finally, we show an example demonstrating that when there are multiple speakers who
speak sequentially, the way later speakers talk may rely on what earlier speakers have
said. Earlier messages become the context on which later messages depend – the context
of the bilateral conversation between a later speaker and the listener is then endogenously
generated in the larger-scale multilateral conversation. The following example shows one
such situation.
Example 4. Alice and Bob interview a job candidate. Alice observes the candidate’s
ability A and Bob observes his personality B. A and B are independently and uniformly
distributed over [0, 100]. The best decision is to hire if and only if A + B ≥ 100.
Alice and Bob sequentially report their observations in a binary fashion to the committee
chair Charlie, who is responsible for the recruitment decision, with Alice speaking first. The
best strategy is the following: Alice reports “A is high” if A ≥ 50 and “A is low” otherwise.
If Alice reports “A is high” then Bob reports “B is high” if B ≥ 25 and “B is low” otherwise;
if Alice reports “A is low” then Bob reports “B is high” if B ≥ 75 and “B is low” otherwise.
Charlie hires the candidate if and only if Bob reports “B is high”.
Alice’s report provides the context with which Bob’s report is to be correctly interpreted.
Bob’s language is literally vague because, for instance, the correct interpretation of “B is
high” depends on Alice’s report.
This example will serve as the benchmark model for our experiments.
Related Literature
Economists have been studying strategic language use since the canonical “cheap talk”
model proposed by Crawford and Sobel (1982). Despite the formidable literature since
generated on this subject, only recently was linguistic vagueness given the academic
attention it deserves, when Lipman (2009), posing the question “why is language vague?”,
argued that vague language is not optimal if all parties in the conversation à la Crawford
and Sobel (1982) have 1) common interest and 2) full rationality. Pursuing this question,
Blume and Board (2013) explore a situation in which the linguistic capability of some
of the conversing parties is unknown and show that this uncertainty could make the
optimal language vague. The same authors further investigate the effect of higher-order
uncertainty about linguistic ability on communication in Blume and Board (2014a), and
find that, in the common interest case, vagueness persists but the efficiency loss due to higher-order uncertainty is small. The relation between our paper and the above papers has been
discussed in depth in the Introduction.
It is well known that even in the presence of conflict of interest endogenous vague
language still has no efficiency advantage in the cheap talk framework. However, Blume
et al. (2007) show that exogenous noise in communication, which forces vagueness upon the
language, can bring Pareto improvement. Blume and Board (2014b) further confirm that
the speaker may intentionally take advantage of the noise to introduce more vagueness
in the language. These papers differ from ours in that we focus on the common interest
environment, and moreover we study the efficiency-foundation of endogenous vagueness.
Context-dependence is the source of literal vagueness in our paper. Given common
interest, context-dependence arises when available messages are not sufficient to fully
communicate the complexity of the situation. Within the cheap talk framework, Tian
(2016) discusses how, when the message space is small, the optimal language changes
with the common prior, that is, the context. In the framework of experimentation, formally
similar to a model of sequential communication with limited messages and/or memories,
Smith et al. (2016) and Wilson (2014) investigate how a participant optimally uses his
language conditioning on contexts, which are messages received from earlier participants.
Indeed, an implication of the characterization of the Pareto-ranking of communication
mechanisms due to Wu (2016) is that a mechanism which allows the participants to have
more flexibility in using context-dependent language dominates another one which allows
less.
On the experimental side, a few recent papers investigate how the availability of vague
messages improves or preserves efficiency. Serra-Garcia et al. (2011) show that vague
languages help players preserve efficiency in a two-player sequential-move public goods
game with asymmetric information. Wood (2016) explores the efficiency-enhancing role of
vague languages in a discretized version of canonical sender-receiver games à la Crawford
and Sobel (1982). Agranov and Schotter (2012) show that vague messages are useful in
concealing conflict between a sender and a receiver that, if it is publicly known, would
prevent them from coordinating and achieving an efficient outcome. All these papers take
the availability of vague messages as given and study how it affects players’ coordination
behavior. In contrast, we explore how a message endogenously obtains its vague
meaning. For a more comprehensive discussion of the experimental literature on vague
languages, see the recent survey by Blume et al. (2017).
2 Experimental Games
We would like to know whether people actually do use languages in a literally vague fashion. This is a very important step towards understanding whether literal vagueness, due
to its efficiency advantage, stands firm as an explanation for some linguistic vagueness
that we experience every day, because the whole efficiency foundation for vagueness is
pointless if people cannot make use of literally vague languages effectively. Of course,
the finding that people can effectively make use of literally vague language alone does not
sufficiently prove that the prevalence of linguistic vagueness is founded on the efficiency
advantage of literal vagueness in our sense, yet it is a worthwhile first step. It is with this
question in mind that we design our experiments.
We use the situation described in Example 4 to examine whether, and if so, how, people
use literally vague language to converse. This example has a number of attractive features which make it particularly suitable for our purpose. Firstly, as we shall show
below, there is a moderate degree of efficiency advantage to the optimal literally vague
language, which renders literal vagueness potentially useful but not entirely crucial for
communication. Secondly, the situation is simple and straightforward, and thus should in
principle shorten the period of learning and bring the experimental results closer to the
eventual stable language, or, in the terminology of Lewis (1969), the “convention”. Thirdly, the situation covers the most general environment in which the context, being the message from
Alice, is endogenously generated. Fourthly, given that for optimal communication Alice
does not need to use a literally vague language whereas Bob does, having both senders in
the same game gives us an additional comparison regarding how people’s use of language
depends on the conversational environment they face.
The Benchmark Game
There are three players: two senders, Alice and Bob, and one receiver, Charlie. Alice
privately observes a number A and Bob a number B. A and B are independently and
uniformly drawn from [0, 100]. Alice sends a message to Bob, where the message is either
“A is Low” or “A is High”. Alice’s message is unobservable to Charlie. Then Bob sends
a message to Charlie, where the message is either “B is Low” or “B is High”. Charlie
receives Bob’s message and chooses an action: UP or DOWN. Players’ preferences are
perfectly aligned. If A + B ≥ 100 and UP is chosen, or if A + B ≤ 100 and DOWN is chosen,
then all receive a payoff of 1. Otherwise all receive a payoff of 0.
Equilibria of the game can be classified into three categories:5
1. Bob babbles: Bob uses a strategy given which Charlie’s posterior about A + B remains the same as the prior upon seeing any message chosen by Bob with positive
probability. Whether Alice babbles or not has no bearing on the outcome. No
information is transmitted and Charlie is indifferent between the two actions regardless of the message he receives. Accordingly, the success rate, which is the probability
that Charlie chooses the optimal action, is 50%.
2. Only Alice babbles: Bob sends message “B is High” if B > 50 or “B is Low” if B < 50.6
Charlie chooses UP seeing “B is High” and DOWN otherwise. Only Bob’s information
is transmitted. Accordingly, the success rate is 75%.
3. Neither babbles: Alice sends “A is High” if A > 50 and “A is Low” if A < 50.
5 A similar categorization of equilibria persists for variations of the benchmark game to be introduced.
For those variations we will skip the analysis of equilibria in which someone babbles, because they are of no
theoretical consequence and do not correspond to the experimental results.
6 Of course there are outcome-equivalent equilibria in which Bob uses the messages in the opposite way.
We do not explicitly itemize such equilibria here and thereafter.
• If Alice’s message is “A is High” then Bob sends “B is High” if B > 25 and “B is
Low” if B < 25.
• If Alice’s message is “A is Low” then Bob sends “B is High” if B > 75 and “B is
Low” if B < 75.
Charlie chooses UP seeing “B is High” and DOWN otherwise. Accordingly, the success rate is 87.5%.
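The success rates of the two informative categories can be approximated numerically. The Python sketch below is an illustration only: it replaces the continuous uniform distribution with an evenly spaced grid of midpoints (our assumption), and the function name `success_rate` and its parameters are hypothetical. Charlie is assumed to play UP after “B is High” and DOWN after “B is Low”.

```python
def success_rate(alice_cut, bob_cut_if_high, bob_cut_if_low, n=400):
    """Share of (A, B) grid points at which Charlie's action is the optimal one.

    Alice says "A is High" iff A > alice_cut; Bob says "B is High" iff B exceeds
    the cutoff that applies after Alice's message.
    """
    step = 100 / n
    hits = 0
    for i in range(n):
        a = (i + 0.5) * step
        bob_cut = bob_cut_if_high if a > alice_cut else bob_cut_if_low
        for j in range(n):
            b = (j + 0.5) * step
            says_high = b > bob_cut          # Charlie then plays UP
            up_is_optimal = a + b >= 100
            hits += says_high == up_is_optimal
    return hits / (n * n)

# Category 2: Bob uses cutoff 50 whatever Alice says, so her message is ignored.
only_bob = success_rate(50, 50, 50)
# Category 3: Bob's cutoff is 25 after "A is High" and 75 after "A is Low".
neither_babbles = success_rate(50, 25, 75)
print(only_bob, neither_babbles)
```

Up to discretization error, the two printed values should be close to the 75% and 87.5% stated above; the 50% of the babbling category follows directly from P(A + B ≥ 100) = 1/2.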
Clearly any equilibrium in which no one babbles is Pareto-optimal. Because the messages available to Bob explicitly refer to the value of B alone, Bob’s language in any Pareto-optimal equilibrium is literally vague: the set of values of B that a particular
message, say “B is High”, describes depends on Alice’s message, which serves as the context.
To best test whether the players indeed effectively use the optimal literally vague language, we propose to consider the counterfactual in which they do not. This can be formally
modeled as Bob being constrained to use the same messaging strategy regardless of Alice’s
message. In this case, the best Bob can do is to always use the cutoff of 50, and the corresponding success rate is 75%. To examine the results and test whether the counterfactual
holds, we thus should pay particular attention to how Bob’s use of language depends on
Alice’s message.
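The claim that a cutoff of 50 is the best context-independent strategy for Bob can be checked with a short calculation. The closed form used below is our own derivation under the stated uniform distribution, assuming Charlie plays UP after “B is High”: for a fixed cutoff t, the success rate is P(B > t, A + B ≥ 100) + P(B ≤ t, A + B < 100) = (5000 + 100t − t²)/10000. The function name `fixed_cutoff_success` is hypothetical, and the scan is restricted to integer cutoffs.

```python
from fractions import Fraction

def fixed_cutoff_success(t):
    """Success rate when Bob says "B is High" iff B > t, whatever Alice said.

    P(B > t, A + B >= 100) = (100^2 - t^2) / (2 * 100^2)
    P(B <= t, A + B < 100) = (100 t - t^2 / 2) / 100^2
    Their sum simplifies to (5000 + 100 t - t^2) / 10000.
    """
    t = Fraction(t)
    return (5000 + 100 * t - t * t) / 10000

# Scan integer cutoffs from 0 to 100 and pick the best one.
rates = {t: fixed_cutoff_success(t) for t in range(0, 101)}
best_t = max(rates, key=rates.get)
print(best_t, float(rates[best_t]))  # 50 0.75
```

The expression is a concave quadratic in t with its maximum at t = 50, where it equals 75%, which matches the counterfactual success rate used as the point of comparison.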
In addition, for a literally vague language to be effectively used, the listener should also
be aware of the underlying context-dependence and correctly take that into consideration
when making decisions. However, Charlie’s strategy in the counterfactual would be the
same as that in an optimal equilibrium so that considering the benchmark only will not
allow us to identify whether the listener fully understands the optimal, context-dependent
language or not. We thus need further variations for sharper identification.
Variation 1 (Charlie hears Alice).
Consider a variation of the Benchmark: the only difference is that now Alice’s messages
are also observable to Charlie. The equilibria in which no one babbles remain the same as
in the Benchmark. In particular, it is notable that Charlie’s strategy does not depend on
Alice’s messages despite their being available.
In this variation, if Charlie believes that Bob uses his language in the optimal, literally
vague way, Charlie’s choice should not depend on Alice’s messages, because the information contained in Alice’s messages is fully incorporated into Bob’s messages through
context-dependence. Thus we can tease out whether Charlie correctly interprets Bob’s
messages according to the optimal language by checking whether Charlie’s choice depends
on Alice’s messages.
Variation 2 (Charlie chooses from three actions).
In this variation, Charlie has three actions to choose from: UP, MIDDLE and DOWN.
UP is optimal if A + B ≥ 120, MIDDLE if 80 ≤ A + B ≤ 120, and DOWN if A + B ≤ 80. If the
optimal action is chosen the players all receive a payoff of 1, otherwise 0.
In equilibria in which no one babbles, Alice sends “A is High” if A > 50 and “A is Low”
if A < 50.
• If Alice’s message is “A is High” then Bob sends “B is High” if B > 25 and “B is Low”
if B < 25.
• If Alice’s message is “A is Low” then Bob sends “B is High” if B > 75 and “B is Low”
if B < 75.
Charlie chooses UP seeing “B is High” and DOWN otherwise. Accordingly, the success
rate is 63.5%. It should be noted that MIDDLE is never chosen in this equilibrium.
The purpose of introducing the third action is to make the conversational environment
more complex, in particular for Bob and Charlie. The Benchmark and Variation 1 are relatively simple environments in which it is not especially difficult to “compute” the optimal
cutoffs.7 On the other hand, when people talk in real life they do not typically derive the
optimal language consciously. Thus we want to see, when it is more difficult to explicitly derive the optimal language, whether people can still arrive at the optimal, literally
vague language and use it to effectively communicate, or whether they instead revert to
context-independent language, which is simpler to use for the speaker and to understand
for the listener. Thus it is crucial to examine whether in this variation Bob’s message is
context-dependent or not, and whether Charlie best responds or not.
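To illustrate why the cutoffs 25 and 75 are the natural candidates in this variation, the sketch below scans Bob's cutoff while holding Charlie's rule fixed (UP after “B is High”, DOWN otherwise). It assumes continuous uniform A and B, Alice's cutoff of 50, and symmetric cutoffs c and 100 − c; the helper names `overlap` and `success_rate` are hypothetical.

```python
def overlap(lo1, hi1, lo2, hi2):
    """Length of the intersection of the intervals [lo1, hi1] and [lo2, hi2]."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))


def success_rate(cut_high, cut_low, n=200):
    """Success rate when Charlie plays UP on "B is High" and DOWN on "B is Low".

    Assumes A and B are uniform on [0, 100], Alice uses the cutoff 50, and
    Bob uses cut_high after "A is High" and cut_low after "A is Low".
    For each slice of A, the length of B on which the action is ideal is
    computed exactly; the slices are then summed by the midpoint rule.
    """
    step = 100.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * step
        cutoff = cut_high if a > 50 else cut_low
        # "B is High" leads to UP, which is ideal when B >= 120 - A.
        up_ok = overlap(cutoff, 100.0, 120.0 - a, 100.0)
        # "B is Low" leads to DOWN, which is ideal when B <= 80 - A.
        down_ok = overlap(0.0, cutoff, 0.0, 80.0 - a)
        total += (up_ok + down_ok) * step
    return total / 100.0 ** 2


# Scan symmetric cutoffs (c after "A is High", 100 - c after "A is Low").
best_cut = max(range(0, 51), key=lambda c: success_rate(c, 100 - c))
print(best_cut, success_rate(best_cut, 100 - best_cut), success_rate(50, 50))
```

The scan only covers symmetric integer cutoffs and takes Charlie's two-action rule as given, so it is a consistency check on the stated numbers rather than a full equilibrium computation.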
Variation 3 (Charlie chooses from three actions and hears Alice).
This variation differs from Variation 2 in that Alice’s message is now observable to
Charlie. In an optimal equilibrium, Alice sends “A is High” if A > 50 and “A is Low” if
A < 50. Bob’s strategy is qualitatively similar to that in Variation 2, but with different
optimal cutoffs.
• If Alice’s message is “A is High” then Bob sends “B is High” if B > 45 and “B is Low”
if B < 45.
7 The key logic, which one can easily come up with, is that Alice would use a cutoff of 50 because of the symmetry of the problem. The optimal cutoffs of 25 and 75 can then be deduced mentally or, at most, by a back-of-the-envelope calculation.
• If Alice’s message is “A is Low” then Bob sends “B is High” if B > 55 and “B is Low”
if B < 55.
Charlie chooses UP seeing (“A is High”, “B is High”), DOWN seeing (“A is Low”, “B is
Low”), and MIDDLE otherwise. The success rate is 78.5%. It should be noted that MIDDLE is chosen in this equilibrium when the messages from Alice and Bob disagree.
This variation serves two purposes. First, it allows us to study how a quantitative
change in the optimal cutoff values affects the use of language. Optimal cutoff values that are substantially closer to each other generate only a minimal benefit of context-dependence relative to the context-independent counterfactual: the success rate
from the context-independent counterfactual is 78%, compared with 78.5% in the optimal equilibrium. Second, this variation enables us to
understand how individuals’ choice of context-dependent language is guided by the
salience of incentives.
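The two success rates for this variation can be reproduced by elementary geometry. The worked computation below is a sketch under the assumption that A and B are continuous and uniform on [0, 100]; by symmetry it treats only the half with A > 50 and doubles the result. Each subtracted term is the area of a right triangle on which Charlie's action is not ideal.

```latex
% Areas are measured inside the 100 x 100 square; only the half A > 50 is computed.
\begin{align*}
\text{Cutoffs } (45, 55):\quad
  & \underbrace{50 \cdot 55 - \tfrac{25^2}{2}}_{\text{UP after (High, High)}}
    + \underbrace{50 \cdot 45 - \tfrac{30^2}{2} - \tfrac{25^2}{2}}_{\text{MIDDLE after (High, Low)}}
    = 2437.5 + 1487.5 = 3925, \\
  & \text{success rate} = \frac{2 \cdot 3925}{100^2} = 78.5\%. \\[4pt]
\text{Cutoff } 50:\quad
  & \Bigl(50 \cdot 50 - \tfrac{20^2}{2}\Bigr)
    + \Bigl(50 \cdot 50 - \tfrac{30^2}{2} - \tfrac{30^2}{2}\Bigr)
    = 2300 + 1600 = 3900, \\
  & \text{success rate} = \frac{2 \cdot 3900}{100^2} = 78\%.
\end{align*}
```

For example, after (High, High) with Bob's cutoff 45, UP fails only where A + B < 120, a triangle with legs of length 25 (B from 45 to 70).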
Variation 4 (Bob’s Messages are Imperative).
Consider a variation of the Benchmark in which we replace Bob’s messages “B is High”
and “B is Low” by “Take UP” and “Take DOWN”. Clearly this change eradicates any possibility of literal vagueness because Bob’s messages, now imperative, have unambiguous
literal interpretations with respect to the decision problem at hand. Apart from the difference in literal interpretation of the messages the variation is the same as the Benchmark.
Hence the variation serves as a nice control version of the Benchmark. In particular, it
allows us to test whether literal vagueness may deter players from using the optimal
language.
In the experiments, we create not only an “imperative messages” version of the Benchmark, but also that of Variation 2.
3 Experimental Implementation
3.1 Experimental Design and Hypotheses
The benchmark game and its variants introduced in the previous section constitute our
experimental treatments. Our experiment features a (2 × 2) + (2 × 1) treatment design
(Table 1). The first treatment variable concerns the number of actions available to the
receiver (Charlie) and the second treatment variable concerns whether Alice’s messages
are observed by Charlie or not. The third treatment variable concerns whether Bob’s
messages are framed to be indicative or imperative.8 We consider the treatments with
imperative messages as a robustness check and therefore omit the corresponding treatments
in which Alice’s messages are observed by Charlie.
Table 1: Experimental Treatments

Indicative Messages from Bob
                   Alice’s messages
# of Actions   Unobservable   Observable
Two            2A-U-IND       2A-O-IND
Three          3A-U-IND       3A-O-IND

+

Imperative Messages from Bob
                   Alice’s messages
# of Actions   Unobservable
Two            2A-U-IMP
Three          3A-U-IMP
Our first experimental hypothesis concerns the overall outcome of the communication
games represented by the success rate. Let S(T ) denote the average success rate of Treatment T . Postulating that the optimal equilibria are played in each game, we have the
following hypothesis.
Hypothesis 1 (Success Rate). S(2A-O-IND) = S(2A-U-IND) = S(2A-U-IMP) > S(3A-O-IND)
> S(3A-U-IND) = S(3A-U-IMP)
This hypothesis can be decomposed into two sub-hypotheses. First, the observability of
Alice’s message to Charlie does not affect the success rate in the treatments with two
actions (S(2A-O-IND) = S(2A-U-IND)) while the very same observability increases the
success rate in the treatments with three actions (S(3A-O-IND) > S(3A-U-IND)). Second,
the imperativeness of Bob’s messages does not affect the success rate (S(2A-U-IND) = S(2A-U-IMP) and S(3A-U-IND) = S(3A-U-IMP)).
Our second hypothesis considers the counterfactual in which Bob is constrained to be
context-independent and thus always uses the cutoff of 50 (see footnote 9). In the counterfactual scenario, the success rates are 75% in Treatments 2A-O-IND, 2A-U-IND, and 2A-U-IMP,
78% in Treatment 3A-O-IND, and 55% in Treatments 3A-U-IND and 3A-U-IMP. If the
players effectively use the optimal, literally vague language, the success rates should be
significantly above the levels predicted by the counterfactual. Thus, we have the following
hypothesis.
Hypothesis 2 (Counterfactual Comparison).
1. S(2A-O-IND), S(2A-U-IND), S(2A-U-IMP) > 75%
2. S(3A-O-IND) > 78%
3. S(3A-U-IND), S(3A-U-IMP) > 55%
8 For example, Bob’s message spaces in Treatments 2A-U-IND and 2A-U-IMP are {“B is HIGH”, “B is LOW”} and {“Take UP”, “Take DOWN”}, respectively.
9 Charlie’s optimal strategy remains the same regardless of whether Bob is constrained to be context-independent or not.
Note that the success rate predicted by the optimal, literally vague language in Treatment 3A-O-IND is 78.5%, which is not substantially different from the 78% predicted by
the counterfactual. The net benefit of the context-dependent language, measured by the success rate, is only 0.5 percentage points (= 78.5% - 78%) in Treatment 3A-O-IND. The net
benefit of context-dependence is substantially larger in the other treatments:
12.5 percentage points (= 87.5% - 75%) in Treatments 2A-O-IND, 2A-U-IND, and 2A-U-IMP, and 8.5 percentage points (=
63.5% - 55%) in Treatments 3A-U-IND and 3A-U-IMP.
Our third hypothesis concerns Alice’s message choices. For all treatments, the optimal
equilibrium play predicts that Alice employs the simple cutoff strategy in which she sends
“A is High” if A > 50 and “A is Low” if A < 50. Let $P_{\mathrm{Alice}}(m \mid A)$ denote the proportion of
Alice’s message $m$ given the realized number $A$. Then we have the following hypothesis.
Hypothesis 3 (Alice’s Messages). Alice’s message choices observed in all treatments are
the same. Moreover, $P_{\mathrm{Alice}}(\text{“A is Low”} \mid A) = 1$ for any $A < 50$ and $P_{\mathrm{Alice}}(\text{“A is High”} \mid A) = 1$ for
any $A > 50$.
Our next hypothesis concerns Bob’s message choices. The optimal equilibrium play predicts that Bob’s message choices depend on the context, i.e., which message he received
from Alice. To state our hypothesis clearly, let $P_{\mathrm{Bob}}^{m'}(m \mid B)$ denote the proportion of Bob’s
message $m$ given the realized number $B$ and Alice’s message $m' \in \{\text{“A is High”}, \text{“A is Low”}\}$, abbreviated by the superscripts $H$ and $L$.
Define
$$CD(B) = P_{\mathrm{Bob}}^{L}(m \mid B) - P_{\mathrm{Bob}}^{H}(m \mid B)$$
where m is “B is Low” for the treatments with indicative messages and “Take DOWN”
for the treatments with imperative messages. CD(B) measures the degree of context-dependence of Bob’s message choices given the realized number B. Bob’s optimal, context-dependent strategy implies that there is a range of values of B over which Bob’s message
choices differ depending on Alice’s messages, i.e., CD(B) = 1. Such intervals are [45, 55]
for Treatment 3A-O-IND and [25, 75] for all other treatments. It is worthwhile to note that
CD(B) = 0 for any B ∈ [0, 100] if Bob uses a context-independent strategy. Thus, we have
the following hypothesis:
Hypothesis 4 (Bob’s Messages). Bob’s messages are context-dependent in such a way that
is predicted by the optimal equilibrium of each game. More precisely,
1. For each treatment $T$, there exists an interval $[X^T, Y^T]$ with $X^T > 0$ and $Y^T < 100$ such
that CD(B) > 0 for any $B \in [X^T, Y^T]$ and CD(B) = 0 otherwise.
2. The length of the interval $[X^T, Y^T]$ is significantly smaller in Treatment 3A-O-IND
than in any other treatment.
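The predicted shape of CD(B) can be sketched as follows. The snippet assumes that Bob follows a pure cutoff rule and takes CD(B) as the share of “B is Low” after “A is Low” minus the share after “A is High”, which is the sign convention under which the optimal strategy yields CD(B) = 1 on the interior interval; the function names are hypothetical.

```python
def says_low(b, cutoff):
    """1 if Bob sends "B is Low" (or "Take DOWN") under a cutoff rule, else 0."""
    return 1 if b < cutoff else 0


def predicted_cd(b, cut_high, cut_low):
    """Predicted CD(B) for a Bob using cut_high after "A is High" and cut_low after "A is Low"."""
    return says_low(b, cut_low) - says_low(b, cut_high)


grid = range(0, 101, 5)
wide = [predicted_cd(b, 25, 75) for b in grid]     # cutoffs 25 and 75
narrow = [predicted_cd(b, 45, 55) for b in grid]   # cutoffs 45 and 55 (Treatment 3A-O-IND)
flat = [predicted_cd(b, 50, 50) for b in grid]     # context-independent cutoff 50
print(sum(wide), sum(narrow), sum(flat))
```

On this grid the context-dependent rules produce a block of ones strictly inside the support of B, shorter for the cutoffs 45 and 55, while the context-independent rule gives zeros throughout.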
Our last hypothesis concerns whether the listener, Charlie, can correctly interpret the messages. In particular, Charlie’s strategy in the optimal equilibrium does not depend on
whether or not Alice’s messages are observable to Charlie in the treatments with two actions. In the games with three actions, however, the observability of Alice’s message to
Charlie matters. Precisely, the optimal equilibrium predicts that MIDDLE should not be
taken by Charlie in Treatments 3A-U-IND and 3A-U-IMP whereas MIDDLE is taken in
Treatment 3A-O-IND when the messages from Alice and Bob do not coincide. Thus, we
have the following hypothesis.
Hypothesis 5 (Charlie’s Action Choices). In the treatments with two actions, Charlie’s
action choices do not depend on whether Alice’s messages are observable or not. In the
treatments with three actions, MIDDLE is taken only in Treatment 3A-O-IND when the
messages from Alice and Bob do not coincide.
3.2 Procedures
Our experiment was conducted in English using z-Tree (Fischbacher (2007)) at the Hong
Kong University of Science and Technology. A total of 138 subjects who had no prior experience with our experiment were recruited from the undergraduate population of the
university. Upon arrival at the laboratory, subjects were instructed to sit at separate
computer terminals. Each received a copy of the experimental instructions. To ensure
that the information contained in the instructions was induced as public knowledge, the
instructions were read aloud, aided by slide illustrations and a comprehension quiz.
We conducted one session for each treatment. In all sessions, subjects participated in
21 rounds of play under one treatment condition. Each session had 21 or 24 participants
and thus involved 7 or 8 fixed matching groups of three subjects, one Member A (Alice), one
Member B (Bob), and one Member C (Charlie). We thus used a fixed-matching protocol
and a between-subject design. As we regard each group in each session as an independent
observation, we have seven to eight observations for each of these treatments, which provide us with sufficient power for non-parametric tests. At the beginning of a session, one
third of the subjects were randomly labeled as Member A, another one third labeled as
Member B and the remaining one third labeled as Member C. The role designation remained fixed throughout the session.
We illustrate the instructions for Treatment 2A-U-IND. The full instructions can be
found in Appendix A. For each group, the computer selected two integer numbers A and
B between 0 and 100 (uniformly) randomly and independently. Subjects were presented
with a two-dimensional coordinate system (with A in the horizontal coordinate and B in
the vertical coordinate) as in Figures 5(a) and 5(b) in Appendix A. The selected number A
was revealed only to Member A and the selected number B was revealed only to Member
B. Member A sent one of two messages, “A is Low” and “A is High”, to Member B but not
to Member C. After observing both the selected number B and the message from Member
A, Member B sent one of two messages, “B is Low” and “B is High”, to Member C who
then took one of two actions, UP and DOWN. The ideal actions for all three players were
UP when A + B > 100 and DOWN when A + B < 100 (see footnote 10). Every member in a group received
50 ECU if the ideal action was taken and 0 ECU otherwise.
For Rounds 1-20, we used the standard choice-method so that each participant first
encountered one possible contingency and specified a choice for the given contingency.
For example, Member A decided what message to send after seeing the randomly selected
number A. Member B decided what message to send after observing the randomly selected
number B and the message from Member A. Similarly, Member C decided what action to
take after receiving the message from Member B. For Round 21, however, we used the
strategy-method and elicited beliefs of players. For the belief elicitation, a small amount
of compensation (in the range between 2 ECU and 8 ECU) was provided for each correct
guess.11 For more details, see the selected sample scripts for the strategy-method and the
belief-elicitation provided in Appendix B.
We randomly selected two rounds out of the 21 total rounds for each subject’s payment.
The sum of the payoffs a subject earned in the two selected rounds was converted into
Hong Kong dollars at a fixed and known exchange rate of HK$1 per 1 ECU. In addition
to these earnings, subjects also received a show-up payment of HK$30. Subjects’ total
earnings averaged HK$103.5 (≈ US$13.3).12 The average duration of a session was about
1 hour.
4 Experimental Findings
We report our experimental results as a number of findings that address our hypotheses
as set forth in Section 3.1.
10 To make the likelihood of each action being ideal exactly equal across the two actions, we set both actions to be ideal when A + B = 100.
11 Although we were aware that an appropriate incentive-compatible mechanism is needed to elicit beliefs correctly, we adopted this elicitation procedure because of its simplicity and because the belief and strategy data were only secondary data, used mainly for robustness checks.
12 Under Hong Kong’s currency board system, the Hong Kong dollar is pegged to the US dollar at the rate of HK$7.8 = US$1.
4.1 Overall Outcome
Note: The red bars depict the theoretical predictions from the optimal,
literally vague equilibria. The red dotted lines depict the predictions from
the counterfactual in which Bob is constrained to be context-independent.
Figure 1: Average Success Rate
Figure 1 reports the average success rates aggregated across all rounds and all matching groups for each treatment. It also presents the theoretical predictions from the optimal
equilibrium, depicted by the red bars, and the predictions from the counterfactual in which
Bob is constrained to be context-independent, depicted by the dotted lines. A few observations are apparent. First, a non-parametric Mann-Whitney test reveals that the success
rates in Treatment 2A-O-IND and in Treatment 2A-U-IND were not statistically different (81.6% vs. 83.6%, two-sided, p-value = 0.6973). By contrast, the success rate in
Treatment 3A-O-IND was 73.8%, which is significantly higher than the 54.2% in Treatment
3A-U-IND (Mann-Whitney test, p-value = 0.0267). This observation is consistent with Hypothesis 1 that the observability of Alice’s message to Charlie affects the success rate only
in the treatments with three actions.
Second, there was no significant difference in the success rates between Treatment
2A-U-IND and Treatment 2A-U-IMP (83.6% vs. 85.7%, two-sided Mann-Whitney test, p-value = 0.5989) and between Treatment 3A-U-IND and Treatment 3A-U-IMP (54.2% vs.
51.2%, two-sided Mann-Whitney test, p-value = 0.2237). This observation is also consistent
with Hypothesis 1 that imperativeness of Bob’s message does not affect the success rate
regardless of the number of available actions. Confirming Hypothesis 1, we thus have our
first finding as follows:
Finding 1. Observability of Alice’s message to Charlie affected the success rate only in the
treatments with three actions. Imperativeness of Bob’s message did not affect the success
rate regardless of the number of available actions.
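For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Mann-Whitney comparison on group-level observations, computed as an exact permutation test, which is feasible with seven or eight groups per treatment. Whether the reported p-values are exact or asymptotic is not stated in the text, and the data below are hypothetical counts of successful rounds out of 21, not the experimental data.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U for x: each pair with x_i > y_j counts 1, each tie counts 1/2."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_two_sided_p(x, y):
    """Share of relabelings whose U is at least as far from its null mean as the observed U."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    center = n * m / 2
    observed = abs(u_statistic(x, y) - center)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        xs = [pooled[k] for k in idx]
        ys = [pooled[k] for k in range(n + m) if k not in chosen]
        extreme += abs(u_statistic(xs, ys) - center) >= observed - 1e-12
        total += 1
    return extreme / total


# Hypothetical group-level data: successful rounds out of 21 for each matching group.
treatment_one = [17, 18, 16, 19, 17, 18, 15]
treatment_two = [12, 11, 13, 10, 12, 14, 11, 12]
print(exact_two_sided_p(treatment_one, treatment_two))
```

Treating each matching group as one observation mirrors the independence assumption described in Section 3.2.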
Figure 1 seems to suggest that the success rates observed in the three treatments with
two actions (hereafter Treatments 2A) are better approximated by the predictions from
the optimal, context-dependent equilibrium languages than by the predictions from the
context-independent counterfactual. Indeed, we cannot reject the null hypothesis that the
success rates equal 87.5%, the predicted value from the optimal equilibrium (two-sided Mann-Whitney tests, p-values are 0.8262, 0.5076, 1.000 for Treatments
2A-O-IND, 2A-U-IND and 2A-U-IMP, respectively). Although we can reject the null hypothesis in favor of the alternative that the success rate is higher than the predicted level
of 75% from the context-independent counterfactual only for Treatment 2A-U-IND (one-sided Mann-Whitney tests, p-values are respectively 0.2551, 0.0610, 0.2174 for Treatments
2A-O-IND, 2A-U-IND and 2A-U-IMP), the p-values from the non-parametric analysis suggest that the optimal, context-dependent equilibrium is a better predictor of the
results observed in these treatments.
On the other hand, we do not have the same observation from the three treatments
with three actions (hereafter Treatments 3A), especially those with unobservable messages from Alice. We cannot reject the null hypothesis that the success rates observed in
these treatments are the same as the success rates predicted by the context-independent
counterfactual (two-sided Mann-Whitney tests, p-values are 0.4347, 0.6936, 0.4347 respectively for Treatments 3A-O-IND, 3A-U-IND and 3A-U-IMP). Moreover, the success rates
observed in Treatments 3A-U-IND and 3A-U-IMP were respectively 54.2% and 51.2%,
which are substantially lower than the predicted level of 63.5% from the optimal, literally
vague equilibrium language. Although the difference is statistically insignificant (two-sided Mann-Whitney tests, p-values are 0.4308 and 0.2413, respectively), the p-values
generated from the non-parametric analysis suggest that the context-independent counterfactual is a better predictor of the results from Treatments 3A-U-IND and 3A-U-IMP.13
Thus, we have the following result:
Finding 2. The average success rates observed in Treatments 2A-O-IND, 2A-U-IND, and
2A-U-IMP were higher than the predicted level from the counterfactual in which Bob is
context-independent. The average success rates observed in Treatments 3A-O-IND, 3A-U-IND, and 3A-U-IMP were lower than the predicted level from the counterfactual. However,
the difference between the observed success rate and the prediction from the counterfactual
is statistically significant only in Treatment 2A-U-IND.
Among Treatments 3A, more substantial deviations from the optimal, context-dependent
equilibrium were observed in Treatments 3A-U-IND and 3A-U-IMP. This observed deviation in the average success rates may imply that the context-dependent, literally vague
languages did not emerge in those treatments, probably due to the complexity of the environment considered. However, another entirely plausible scenario is that the
observed deviation originates from a different source, such as Charlie’s choices not being
consistent with the optimal equilibrium. Without a careful look at individual
behavior, it is impossible to draw any meaningful conclusion. Hence, in the subsequent
sections, we look at individual players’ choices.
13 The success rate observed in Treatment 3A-O-IND was 73.8%, which is not significantly different from the predicted level of 78.5% from the optimal, literally vague equilibrium language (two-sided Mann-Whitney test, p-value = 0.4347).
4.2 Alice’s Behavior
Figure 2 reports Alice’s message strategies by presenting the proportion of each message
as a function of the number A where data were grouped into bins by the realization of
number A (e.g., [0, 5), [5, 10), ..., etc.). Figure 2(a) provides the aggregated data from Treatments 2A while Figure 2(b) provides the aggregated data from Treatments 3A. The same
figures separately drawn for each treatment can be found in Appendix C.
[Figure 2 plots, in two panels titled “Alice's Message - Treatments 2A” and “Alice's Message - Treatments 3A”: proportion (%) of the messages LOW and HIGH on the vertical axis against Number A on the horizontal axis, in bins [0, 5), [5, 10), ..., [95, 100].]
(a) Treatments 2A
(b) Treatments 3A
Note: The red dotted lines illustrate the optimal equilibrium strategy with cutoff of 50.
Figure 2: Alice’s Messages
From these two figures, it is immediately clear that the subjects whose designated
role was Alice in our experiments tended to use cutoff strategies well approximated by
the optimal cutoff of 50. Using the matching-group level data from all rounds for each
bin of the realized number A (e.g., [0, 5), [5, 10), etc.) as independent data points for
each treatment, a set of (two-sided) Mann-Whitney tests reveals that 1) for any bin of
A below 50, we cannot reject the null hypothesis that the proportion of message “A is
Low” being sent was 100%, and 2) for any bin of A above 50, we cannot reject the null
hypothesis that the proportion of message “A is Low” being sent was 0%. Among the 20 bins
in each treatment, the p-values for 14-17 bins were 1.0, while the lowest p-value for each
treatment ranged between 0.2636 and 0.5637. Confirming our Hypothesis 3, we thus
have the following result:
Finding 3. For any treatment, $P_{\mathrm{Alice}}(\text{“Low”} \mid A) = 1$ for any $A < 50$ and $P_{\mathrm{Alice}}(\text{“High”} \mid A) = 1$
for any $A > 50$.
The elicited strategies and beliefs reported in Figure 10 in Appendix C provide additional support for Finding 3. We cannot reject the null hypothesis that the reported
cutoff values for Alice’s strategy and other players’ reported beliefs about Alice’s strategy in
all treatments were the same as the optimal equilibrium cutoff of 50 (two-sided Mann-Whitney tests, p-values are in the range between 0.4533 and 1.000).
4.3 Bob’s Behavior
We now look at Bob’s behavior. Recall that $P_{\mathrm{Bob}}^{L}(m \mid B)$ denoted the proportion of Bob’s
message $m$ given the realized number $B$ and Alice’s message “A is Low”, and $P_{\mathrm{Bob}}^{H}(m \mid B)$
denoted that given Alice’s message “A is High”. We introduced the measure of context-dependence as $CD(B) = P_{\mathrm{Bob}}^{L}(m \mid B) - P_{\mathrm{Bob}}^{H}(m \mid B)$, where $m$ is “B is Low” for the treatments
with indicative messages and “Take DOWN” for the treatments with imperative messages.
Figures 3(a)-(f) illustrate the distributions of CD(B) over the realization of number B,
aggregated across all matching groups for each treatment. Again, the data from all rounds
were grouped into bins by the realization of number B (e.g., [0, 5), [5, 10), ..., etc.).
The optimal, context-dependent equilibrium language implies that there exists an interval (strictly in the interior of the support of B) such that the value of CD(B) is 1 if B is in
the interval and 0 otherwise. Moreover, the boundaries of such intervals are determined
by the equilibrium cutoff strategy, so that the interval is narrower in Treatment 3A-O-IND. The exact prediction of the distribution of CD(B) made by the optimal equilibrium
is illustrated by the red dotted lines in Figures 3(a)-(f).14
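The binned measure plotted in Figure 3 could be computed along the following lines. This is a sketch with a hypothetical record layout (one tuple per round) and it takes CD(B) as the share of “B is Low” after “A is Low” minus the share after “A is High”, the sign convention under which the optimal strategy gives CD(B) = 1 inside the interval.

```python
from collections import defaultdict


def binned_cd(records, width=5):
    """Empirical CD(B) per bin of B.

    records: iterable of (b, alice_msg, bob_msg), where alice_msg is "H" or "L"
    and bob_msg is "High" or "Low" (a hypothetical layout for illustration).
    Returns {bin_start: CD} for the bins in which both contexts were observed.
    """
    # For each bin and context, store [number of "Low" messages, number of observations].
    counts = defaultdict(lambda: {"H": [0, 0], "L": [0, 0]})
    for b, alice_msg, bob_msg in records:
        start = min(int(b // width) * width, 100 - width)  # B = 100 joins the last bin
        cell = counts[start][alice_msg]
        cell[0] += bob_msg == "Low"
        cell[1] += 1
    cd = {}
    for start, by_context in sorted(counts.items()):
        (low_h, n_h), (low_l, n_l) = by_context["H"], by_context["L"]
        if n_h and n_l:
            cd[start] = low_l / n_l - low_h / n_h
    return cd


# Hypothetical rounds, not experimental data.
sample = [
    (30, "H", "High"), (32, "L", "Low"),
    (10, "H", "Low"), (12, "L", "Low"),
    (100, "H", "High"), (97, "L", "High"),
]
print(binned_cd(sample))
```

Bins observed under only one of Alice's messages are left out rather than assigned a value, since CD(B) is undefined there.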
These figures convincingly visualize the fact that, in each treatment, there existed an
interval in which the value of CD(B) is strictly positive. For Treatment 2A-O-IND, for
instance, we cannot reject the null hypothesis that CD(B) = 0 for B ∈ [0, 30) and for B ∈
[60, 100] (two-sided Mann-Whitney test, both p-values = 1.00). However, for B ∈ [30, 60), we
can reject the null hypothesis that CD(B) = 0 in favor of the alternative that CD(B) > 0
(one-sided Mann-Whitney test, p-value = 0.04).15 Similarly, for Treatments 2A-U-IMP and
14 Figures 8(a)-(f) and 9(a)-(f) presented in Appendix C separately report the distributions of $P_{\mathrm{Bob}}^{L}(\text{“B is Low”} \mid B)$ and of $P_{\mathrm{Bob}}^{H}(\text{“B is Low”} \mid B)$ for each treatment.
15 To conduct Mann-Whitney tests for Hypothesis 4, we first eyeball Figures 3(a)-(f) to identify the plausible
choices of the interval with CD(B) > 0 for each treatment. For example, for Treatment 2A-O-IND, relying
(a) Treatment 2A-O-IND
(b) Treatment 3A-O-IND
(c) Treatment 2A-U-IND
(d) Treatment 3A-U-IND
(e) Treatment 2A-U-IMP
(f) Treatment 3A-U-IMP
Note: The red dotted lines present the predicted distribution from the optimal equilibrium strategy.
Figure 3: Bob’s Message Strategy
3A-U-IMP, we can reject the null hypothesis that CD(B) = 0 on the interval [25, 75)
in favor of the alternative that CD(B) > 0 at the 10% significance level (p-values are respectively 0.052 and 0.059).
Qualitatively the same but less significant results were obtained from Treatment 2A-U-IND with the non-zero interval of [25, 70) (p-value = 0.136) and from Treatment 3A-U-IND
with the non-zero interval of [25, 75) (p-value = 0.121). Thus, we have the following result:
on Figure 3(a), we divide the support of number B into three intervals – [0, 30), [30, 60), and [60, 100]. We
next calculate the value of CD(B) for each of the three intervals for each matching group. Taking those
values as group-level independent data points for each treatment, we conducted the non-parametric test.
Finding 4. In Treatments 2A-O-IND, 2A-U-IND, 2A-U-IMP, 3A-U-IND, and 3A-U-IMP,
there existed an interval [X, Y ] with X > 0 and Y < 100 such that CD(B) > 0 if B ∈ [X, Y ]
and CD(B) = 0 otherwise.
It is necessary to discuss the data from Treatment 3A-O-IND in Figure 3(b) more carefully. First, as predicted by the optimal equilibrium strategy, the interval with a non-zero value of CD(B) seemed to shrink significantly compared to every other treatment. Indeed, we cannot reject the null hypothesis that CD(B) = 0 for B ∈ [0, 45) and for B ∈ [50, 100]
(two-sided Mann-Whitney test, p-values are 1.00 and 0.7237, respectively). However, a
substantial deviation from the theoretical prediction was observed: the reported
value for the bin [50, 55) was negative.16 This observation was driven by the fact that the
observed cutoff value from Bob’s strategy conditional on Alice’s message “A is Low” was
50, which may look more focal than the theoretically optimal cutoff of 55 (see footnote 17).
The elicited strategies and beliefs reported in Figure 11 in Appendix C provide further
supporting evidence. Wilcoxon signed-rank tests reveal that the reported cutoff values
given Alice’s message “A is Low” were significantly higher than the cutoff values given
Alice’s message “A is High” in most of the treatments (p-values range between 0.0004
and 0.0204), the exceptions being Treatment 3A-O-IND and Treatment 3A-U-IND.18 The fact that the
reported cutoff values in Treatment 3A-O-IND were not significantly different (two-sided,
p-value = 0.1924) is not surprising because the optimal equilibrium cutoff values are
45 and 55, distinctly closer to each other than the predicted values for all other treatments.
The insignificant result for Treatment 3A-U-IND (two-sided, p-value = 0.3561) mainly
originated from two observations: the reported cutoffs from two Charlie-subjects were
50 and 45 given “A is Low” and 85 and 80 given “A is High”.
4.4 Charlie’s Behavior
Figure 4 reports Charlie’s action choices by presenting the proportion of each action as
a function of information available to Charlie. Figure 4(a) presents the data aggregated
across all matching groups of Treatments 2A while Figure 4(b) presents the data aggregated across all matching groups of Treatments 3A.
16 For Treatment 3A-O-IND, we cannot conduct any meaningful statistical analysis for B ∈ [45, 50) because there are only two group-level independent data points.
17 Similarly, two substantial deviations were observed in Treatment 3A-U-IND, namely in the first bin of [0, 5) and the ninth bin of [40, 45) in Figure 3(d). The first deviation was solely driven by the single data point with B ∈ [0, 5) in which Bob sent “B is High” after receiving “A is High” from Alice. The second deviation was solely driven by the single data point with B ∈ [40, 45) in which Bob sent “B is Low” after receiving “A is High” from Alice.
18 To conduct Wilcoxon signed-rank tests, we pooled the data from the reported cutoff values for Bob’s strategy and other players’ reported beliefs.
(a) Treatments 2A
(b) Treatments 3A
Note: The red bars present the theoretical predictions from the Pareto-optimal equilibria.
Figure 4: Charlie’s Actions
A few observations emerge immediately from these figures. First, Figure 4(a) reveals
that Charlie’s observed strategy depended, to a limited extent, on whether or not
Alice’s messages were observable to him in Treatments 2A. In Treatment 2A-O-IND, the
proportion of UP being chosen given Bob’s message “B is High” was about 52% and the
proportion of DOWN being chosen given Bob’s message “B is Low” was about 75%, both of which are
substantially and significantly different from the 100% predicted by the optimal equilibrium.
However, the observed proportions became 63% and 100% when we took the data from the last three rounds
only, showing that learning took place in the right direction.19
Second, Figure 4(b) reveals that MIDDLE was taken by Charlie in Treatment 3A-O-IND when the messages from Alice and Bob did not coincide. The proportions of MIDDLE
being chosen by Charlie given the message combinations (“A is High”, “B is Low”) and
(“A is Low”, “B is High”) were higher than 90%. However, inconsistent with the prediction
from the optimal equilibrium strategy, MIDDLE was taken even in Treatments 3A-U-IND
and 3A-U-IMP. The proportions of MIDDLE being taken observed in these two treatments
varied between 24% and 43% which are substantially larger than 0%. This observed deviation seemed persistent as it did not disappear even when we took the data from the last
three rounds only.20 Thus, we have the following result:
Finding 5. Charlie’s observed action choices in Treatment 2A-O-IND were not the same
as those observed in Treatment 2A-U-IND, showing that observability of Alice’s message to
Charlie mattered. For Treatments 3A, MIDDLE was taken in Treatment 3A-O-IND when
Charlie received different messages from Alice and Bob. However, a substantial proportion
of MIDDLE was observed even in Treatments 3A-U-IND and 3A-U-IMP.
19 In an early stage of the project, we conducted two sessions in which subjects participated in the first 20 rounds with the treatment condition of 2A-U-IND and in the second 20 rounds with the treatment condition of 2A-O-IND. The data from the second 20 rounds were almost perfectly consistent with the theoretical prediction, providing further convincing evidence of learning. Data from this additional treatment are available upon request.
20 The elicited strategies and beliefs presented in Figure 12 in Appendix C were highly consistent with the results in Finding 5.
This discrepancy between Charlie’s observed action choices and the prediction from the
optimal equilibrium was the main source of the lower success rates we observed in
Treatments 3A-U-IND and 3A-U-IMP (see Figure 1). The success rates observed
in Treatments 3A-U-IND and 3A-U-IMP were 54.2% and 51.2%, respectively, which are
substantially lower than the predicted level of 63.5%, although the difference is
statistically insignificant (Mann-Whitney tests, p-values of 0.4308 and 0.2413, respectively). If
we replace Charlie’s observed choices with the hypothetical choices from
the optimal strategy, the success rates in Treatments 3A-U-IND and 3A-U-IMP become
58.3% and 56.0%, respectively, both of which are substantially higher than the observed levels.
Note that MIDDLE was chosen slightly more often in Treatment 3A-U-IMP than in
Treatment 3A-U-IND (33% and 43% vs. 31% and 24%). The elicited strategies and beliefs
presented in Figure 12 in Appendix C demonstrate the difference more vividly. This
difference may arise because, in Treatment 3A-U-IMP, we framed Bob’s messages as
“Don’t take UP” and “Don’t take DOWN” to make the messages imperative.
4.5 Emergence of Context-dependent, Literally Vague Languages
In this section, we combine the findings presented in the previous sections to establish
the emergence of context-dependent, literally vague languages. Admittedly, our data do not
perfectly match the prediction from the optimal, context-dependent equilibrium language.
However, we provide convincing evidence that the overall behavior observed in our
laboratory was qualitatively consistent with the prediction. More precisely,
1. Finding 3 illustrated that Alice tended to use cut-off strategies well approximated by
the cut-off value of 50. Thus, contexts are properly defined.
2. Finding 4 showed that Bob tended to use context-dependent strategies.
3. Finding 5 suggested that Charlie understood the messages from the speaker(s) well
and took actions in a manner that is qualitatively consistent with the optimal equilibrium strategy.
4. Finding 5 also revealed that the lower success rates observed in Treatments 3A-U-IND
and 3A-U-IMP reported in Finding 2 were largely driven by Charlie’s behavior
being partially inconsistent with the theoretical predictions from the optimal equilibrium.
Taking these findings together, we establish the following result:
Finding 6. Literally vague languages emerged in our experiment. The way they were used by
the speakers and understood by the listener was consistent with the prediction from the
Pareto-optimal equilibria of the communication games.
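To illustrate why a context-dependent use of Bob's messages can help, the following Python sketch enumerates a two-action game of the 2A-U type in which A and B are integers between 0 and 100 and Charlie follows Bob's message. Alice's cut-off of 50 follows Finding 3. Bob's cut-offs of 75 and 25 are illustrative values chosen for this sketch, as are all function and variable names; none of them is taken from the experimental software or data.

```python
from itertools import product


def alice_message(a, cutoff=50):
    """Alice says LOW if A is at or below her cut-off, HIGH otherwise."""
    return "LOW" if a <= cutoff else "HIGH"


def bob_message(b, alice_msg, bob_cutoffs):
    """Bob's cut-off may depend on the message he received from Alice."""
    return "LOW" if b <= bob_cutoffs[alice_msg] else "HIGH"


def success_rate(bob_cutoffs, alice_cutoff=50):
    """Share of (A, B) pairs on the 0..100 integer grid where Charlie is right.

    Assumption: Charlie sees only Bob's message and chooses UP after HIGH
    and DOWN after LOW. A + B = 100 counts as a success for either action.
    """
    wins = 0
    total = 0
    for a, b in product(range(101), repeat=2):
        msg_a = alice_message(a, alice_cutoff)
        msg_b = bob_message(b, msg_a, bob_cutoffs)
        action = "UP" if msg_b == "HIGH" else "DOWN"
        s = a + b
        correct = s == 100 or (s > 100 and action == "UP") or (s < 100 and action == "DOWN")
        wins += int(correct)
        total += 1
    return wins / total


# Hypothetical cut-offs: Bob shifts his threshold with Alice's message.
context_dependent = success_rate({"LOW": 75, "HIGH": 25})
# Benchmark: Bob ignores Alice's message and always uses 50.
context_free = success_rate({"LOW": 50, "HIGH": 50})

print(f"context-dependent: {context_dependent:.3f}")
print(f"context-free:      {context_free:.3f}")
```

Under these assumptions the same message “B is HIGH” covers different ranges of B depending on Alice's message, and the enumeration shows a higher success rate than the context-free benchmark.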
5 Concluding Remarks
In this paper we introduce the notions of context-dependence and literal vagueness, and
offer our explanation of why language is vague. We show that literal vagueness arises in a
Pareto-optimal equilibrium in many standard conversational situations. Our experimental
data provide supporting evidence for the emergence of literally vague languages.
Although our discussion of linguistic vagueness focuses on environments in which
players’ preferences are perfectly aligned, the theoretical discussions presented in Blume
et al. (2007) and Blume and Board (2014b) suggest that the communicative advantage of
literal vagueness would extend to environments with conflicts of interest. We
believe that experimentally investigating the role of vagueness in the presence of conflicts
of interest is an interesting avenue for future research.
References
Agranov, Marina and Andrew Schotter (2012), “Ignorance is bliss: An experimental study
of the use of ambiguity and vagueness in the coordination games with asymmetric payoffs.” American Economic Journal: Microeconomics, 4, 77–103.
Blume, Andreas and Oliver Board (2013), “Language barriers.” Econometrica, 81, 781–812.
Blume, Andreas and Oliver Board (2014a), “Higher-order uncertainty about language.”
Working paper.
Blume, Andreas and Oliver Board (2014b), “Intentional vagueness.” Erkenntnis, 79, 855–899.
Blume, Andreas, Oliver J. Board, and Kohei Kawamura (2007), “Noisy talk.” Theoretical
Economics, 2, 395–440.
Blume, Andreas, Ernest K. Lai, and Wooyoung Lim (2017), “Strategic information transmission: A survey of experiments and theoretical foundations.” Working paper.
Crawford, Vincent and Joel Sobel (1982), “Strategic information transmission.” Econometrica, 50, 1431–51.
Fischbacher, Urs (2007), “z-tree: Zurich toolbox for ready-made economic experiments.”
Experimental Economics, 10, 171–178.
Lewis, David (1969), Convention: A Philosophical Study. Harvard University Press.
Lipman, Barton L. (2009), “Why is language vague?” Working paper.
Quine, Willard V. O. (1960), Word and object. Technology Press of the Massachusetts Institute of Technology, Cambridge.
Serra-Garcia, Marta, Eric van Damme, and Jan Potters (2011), “Hiding an inconvenient
truth: Lies and vagueness.” Games and Economic Behavior, 73, 244–261.
Smith, Lones, Peter N. Sørensen, and Jianrong Tian (2016), “Informational herding, optimal experimentation, and contrarianism.”
Tian, Jianrong (2016), “Monotone comparative statics for cut-offs.” Working paper.
Wilson, Andrea (2014), “Bounded memory and biases in information processing.” Econometrica, 82, 2257–2294.
Wood, Daniel H. (2016), “Communication-enhancing vagueness.” Working paper.
Wu, Qinggong (2016), “Coarse communication and institution design.” Working paper.
Appendices
A Experimental Instructions - Treatment 2A-U-IND
INSTRUCTION
Welcome to the experiment. This experiment studies decision making between three
individuals. In the following two hours or less, you will participate in 21 rounds of decision
making. Please read the instructions below carefully; the cash payment you will receive
at the end of the experiment depends on how well you make your decisions according to
these instructions.
Your Role and Decision Group
There are 24 participants in today’s session. One third of the participants will be
randomly assigned the role of Member A, another one third the role of Member B, and the
remaining one third the role of Member C. Your role will remain fixed throughout the experiment.
At the beginning of the first round, three participants, one Member A, one Member B and
one Member C, will be matched to form a group of three. The three members in a group
make decisions that will affect their rewards in all 21 rounds. That is, you will stay in the
same group so that you will interact with the same two other participants throughout the
21 rounds. You will not be told the identity of the participants in your group, nor will they
be told your identity—even after the end of the experiment.
Your Decision and Earning in Each of Round 1-20
In each round and for each group, the computer will select two integer numbers A and B
between 0 and 100 randomly and independently. Each possible number has equal chance
to be selected. The selected number A will be revealed only to Member A and the selected
number B will be revealed only to Member B. Member C, without seeing any of these
numbers, will have to choose one of two actions UP and DOWN.
The amount of Experimental Currency Unit (ECU) you earn in a round depends on the
two numbers A and B as well as the action chosen by Member C. In particular,
1. When A + B > 100, if Member C chooses
(a) UP, every member in your group will receive 50 ECU.
(b) DOWN, every member in your group will receive 0 ECU.
2. When A + B < 100, if Member C chooses
(a) DOWN, every member in your group will receive 50 ECU.
(b) UP, every member in your group will receive 0 ECU.
3. When A + B = 100, every member in your group will receive 50 ECU regardless of the
action chosen by Member C.
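The payoff rule above can be written as a short Python sketch. The function name and the string encoding of the actions are illustrative choices, not part of the experimental software.

```python
def group_payoff(a: int, b: int, action: str) -> int:
    """ECU earned by every member of the group in one round.

    a, b   -- the numbers drawn for Member A and Member B (0..100)
    action -- Member C's choice, "UP" or "DOWN"
    """
    if action not in ("UP", "DOWN"):
        raise ValueError("action must be 'UP' or 'DOWN'")
    total = a + b
    if total == 100:
        # Rule 3: everyone earns 50 ECU regardless of the action.
        return 50
    # Rules 1 and 2: UP pays when A + B > 100, DOWN when A + B < 100.
    rewarded_action = "UP" if total > 100 else "DOWN"
    return 50 if action == rewarded_action else 0


print(group_payoff(70, 45, "UP"))    # A + B = 115 > 100 and UP is chosen
print(group_payoff(70, 45, "DOWN"))  # same numbers, DOWN is chosen
```

Because all three members receive the same amount, a single return value describes the payoff of the whole group.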
Member A’s Decisions
You will be presented with a two-dimensional coordinate system on your screen as in
Figure 5(a). The horizontal axis represents the number A and the vertical axis represents
the number B. You will see a blue vertical line, which represents the actually selected
number A in the horizontal axis. The red diagonal line represents the cases with A + B =
100.
With all this information on your screen, you will be asked to send one of two messages
“A is LOW” and “A is HIGH” to Member B in your group. Once you click one of the message
buttons, your decision in the round is completed and your message will be transmitted to
Member B in your group.
Figure 5: Screen Shots. Panels: (a) Member A’s Screen; (b) Member B’s Screen.
Member B’s Decisions
You will be presented with a two-dimensional coordinate system on your screen as in
Figure 5(b). The horizontal axis represents the number A and the vertical axis represents
the number B. You will see a blue horizontal line, which represents the actually selected
number B in the vertical axis. The red diagonal line represents the cases with A + B = 100.
You will also receive a message from Member A in your group. With all this information
on your screen, you will be asked to send one of two messages “B is LOW” and “B is HIGH”
to Member C in your group. Once you click one of the message buttons, your decision in
the round is completed and your message will be transmitted to Member C in your group.
Member C’s Decisions
You will be presented with a two-dimensional coordinate system on your screen as in
Figure 6. The horizontal axis represents the number A and the vertical axis represents
the number B. The red diagonal line represents the cases with A + B = 100.
You will receive a message from Member B in your group. With all this information on
your screen, you will be asked to take one of two actions DOWN and UP. Once you click
one of the action buttons, your decision in the round is completed.
Figure 6: Member C’s Screen
Information Feedback
At the end of each round, the computer will provide a summary for the round: actually
selected numbers A and B, Member A’s message, Member B’s message, Member C’s action
choice, and your earning in ECU.
Your Decision in Round 21
After the 20th round, your screen will provide further instructions for your decisions in
Round 21. The game you are going to play in this round is essentially the same as before,
but you need to follow some new procedures. Please read the instructions carefully before
you start the 21st round. You will have an opportunity to ask questions if anything is
unclear about the new instructions.
Your Cash Payment
To calculate your cash payment, the experimenter will randomly select two rounds to
calculate your cash payment. Each round between Rounds 1 and 21 has an equal chance
to be selected. So it is in your best interest to take each round seriously. Your total cash
payment at the end of the experiment will be the sum of ECU you earned in the two
selected rounds, translated into HKD with the exchange rate of 1 ECU = 1 HKD, plus a
30 HKD show-up fee.
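As a worked illustration of the payment rule above, the following Python sketch draws the paid rounds and converts the earnings to HKD. It assumes the two paid rounds are distinct (drawn without replacement), which the instructions do not state explicitly; the function and parameter names are hypothetical.

```python
import random


def cash_payment_hkd(round_earnings_ecu, rng=None,
                     n_paid_rounds=2, ecu_to_hkd=1, show_up_fee_hkd=30):
    """Total cash payment for one participant.

    round_earnings_ecu -- list of ECU earned in each round (Rounds 1 to 21)
    rng                -- optional random.Random instance for reproducibility
    Assumption: the paid rounds are distinct, each round equally likely.
    """
    rng = rng or random.Random()
    paid_rounds = rng.sample(range(len(round_earnings_ecu)), n_paid_rounds)
    earned_ecu = sum(round_earnings_ecu[i] for i in paid_rounds)
    return earned_ecu * ecu_to_hkd + show_up_fee_hkd


# A participant who earned 50 ECU in every one of the 21 rounds.
always_50 = [50] * 21
print(cash_payment_hkd(always_50, rng=random.Random(0)))
```

With 50 ECU in every round, any two selected rounds give 100 ECU, so the payment is 100 HKD plus the 30 HKD show-up fee.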
Quiz and Practice
To ensure your understanding of the instructions, we will provide you with a quiz and
practice round. We will go through the quiz after you answer it on your own.
You will then participate in 1 practice round. The practice round is part of the instructions which is not relevant to your cash payment; its objective is to get you familiar
with the computer interface and the flow of the decisions in each round. Once the practice
round is over, the computer will tell you “The official rounds begin now!”
Administration
Your decisions as well as your monetary payment will be kept confidential. Remember
that you have to make your decisions entirely on your own; please do not discuss your
decisions with any other participants. Upon finishing the experiment, you will receive
your cash payment. You will be asked to sign your name to acknowledge your receipt of
the payment. You are then free to leave. If you have any question, please raise your hand
now. We will answer your question individually.
1. Suppose you are assigned to be a Member A. The computer chooses the random numbers A = 25 and
B = 50. Which of the following is true?
(a) Both you and Member B know the chosen numbers A and B but Member C does not know any
of the numbers.
(b) Neither you nor Member B knows the chosen numbers A and B.
(c) You are the only person in your group who knows the chosen number A and Member B is the
only person in your group who knows the chosen number B.
2. Suppose that the computer chooses the random numbers A = 25 and B = 50. Member C in your group
takes action DOWN. Please calculate the earning for each player:
• Member A’s payoff:
• Member B’s payoff:
• Member C’s payoff:
3. Suppose that the computer chooses the random numbers A = 60 and B = 73. Member C in your group
takes action DOWN. Please calculate the earning for each player:
• Member A’s payoff:
• Member B’s payoff:
• Member C’s payoff:
B Scripts for Strategy-method and Belief-elicitation Treatment 2A-U-IND
1. Strategy - Member A
In this round, we ask you to report your plan. After you specify your plan below, A will be realized and your plan will be
implemented accordingly.
Your plan: Send “A is LOW” if A is less than or equal to ____, and otherwise, send “A is HIGH”.
What is the number for you in the blank above?
2. Belief - Member A
In this round, Member A is going to report his/her plan according to the following form:
Send “A is LOW” if A is less than or equal to ____, and otherwise, send “A is HIGH”.
What do you think is the number for him/her in the blank above?
If your guess is in the range of the actual value (chosen by Member A) plus-minus 5, then you will receive extra 8 ECU.
3. Strategy - Member B
In this round, we ask you to report your plan. After you specify your plan below, A and B will be realized and your plan will
be implemented accordingly.
(a) When receiving “A is LOW” from Member A, send “B is LOW” if B is less than or equal to ____, and
otherwise, send “B is HIGH”.
(b) When receiving “A is HIGH” from Member A, send “B is LOW” if B is less than or equal to ____, and
otherwise, send “B is HIGH”.
What is the number for you in the first blank above?
What is the number for you in the second blank above?
4. Belief - Member B
In this round, Member B is going to report his/her plan according to the following form:
(a) When receiving “A is LOW” from Member A, send “B is LOW” if B is less than or equal to ____, and
otherwise, send “B is HIGH”.
(b) When receiving “A is HIGH” from Member A, send “B is LOW” if B is less than or equal to ____, and
otherwise, send “B is HIGH”.
What do you think is the number for him/her in the blank in (a)?
What do you think is the number for him/her in the blank in (b)?
If each of your guesses for (a) and (b) is in the range of the actual value (chosen by Member B) plus-minus 5, then you will
receive extra 4 ECU.
5. Strategy - Member C
In this round, we ask you to report your plan. After you specify your plan below, A and B will be realized and your plan will
be implemented accordingly.
What action would you like to take if the message from Member B is
(a) B is LOW
(b) B is HIGH
6. Belief - Member C
In this round, we ask Member C to report his/her plan about what action to take for each possible message.
What action do you think would Member C like to take if the message from Member B is
(a) B is LOW
(b) B is HIGH
If each of your guesses for (a) and (b) is correct, then you will receive extra 4 ECU.
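The belief-elicitation rule for Member A's cut-off in item 2 above (a guess within plus-minus 5 of the reported value earns an extra 8 ECU) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the plus-minus 5 range includes its endpoints.

```python
def belief_bonus_ecu(guess, actual, tolerance=5, bonus=8):
    """Extra ECU for a guess about Member A's reported cut-off.

    Assumption: a guess exactly 5 away from the actual value still counts.
    """
    return bonus if abs(guess - actual) <= tolerance else 0


print(belief_bonus_ecu(guess=50, actual=53))  # within the range
print(belief_bonus_ecu(guess=50, actual=60))  # outside the range
```

The rules for Member B's cut-offs in item 4 and for Member C's actions in item 6 use a 4 ECU bonus; the sketch does not cover them because the scripts do not specify whether that bonus is paid per guess or only when every guess qualifies.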
C Figures and Tables
Figure 7: Alice’s Messages. Panels: (a) Treatment 2A-O-IND; (b) Treatment 3A-O-IND; (c) Treatment 2A-U-IND;
(d) Treatment 3A-U-IND; (e) Treatment 2A-U-IMP; (f) Treatment 3A-U-IMP.
Note: The red dotted lines indicate the optimal cut-off equilibrium strategy.
Figure 8: Bob’s Messages ∣ “Low” from Alice. Panels: (a) Treatment 2A-O-IND; (b) Treatment 3A-O-IND;
(c) Treatment 2A-U-IND; (d) Treatment 3A-U-IND; (e) Treatment 2A-U-IMP; (f) Treatment 3A-U-IMP.
Note: The red dotted lines indicate the optimal cut-off equilibrium strategy.
Figure 9: Bob’s Messages ∣ “High” from Alice. Panels: (a) Treatment 2A-O-IND; (b) Treatment 3A-O-IND;
(c) Treatment 2A-U-IND; (d) Treatment 3A-U-IND; (e) Treatment 2A-U-IMP; (f) Treatment 3A-U-IMP.
Note: The red dotted lines indicate the optimal cut-off equilibrium strategy.
Figure 10: Alice’s Elicited Strategy and Other Players’ Beliefs
Figure 11: Bob’s Elicited Strategy and Other Players’ Beliefs
Figure 12: Charlie’s Elicited Strategy and Other Players’ Beliefs. Panels: (a) Treatment 2A-O-IND;
(b) Treatment 3A-O-IND; (c) Treatment 2A-U-IND; (d) Treatment 3A-U-IND; (e) Treatment 2A-U-IMP;
(f) Treatment 3A-U-IMP.