
Sequential Language-based Decisions

Adam Bjorndahl Joseph Y. Halpern


Department of Philosophy Department of Computer Science
Carnegie Mellon University Cornell University
Pittsburgh, USA Ithaca, USA
abjorn@cmu.edu halpern@cs.cornell.edu

In earlier work, we introduced the framework of language-based decisions, the core idea of which
was to modify Savage’s classical decision-theoretic framework [6] by taking actions to be descrip-
tions in some language, rather than functions from states to outcomes, as they are defined classically.
Actions had the form “if ψ then do(ϕ )”, where ψ and ϕ were formulas in some underlying language,
specifying what effects would be brought about under what circumstances. The earlier work allowed
only one-step actions. But, in practice, plans are typically composed of a sequence of steps. Here,
we extend the earlier framework to sequential actions, making it much more broadly applicable. Our
technical contribution is a representation theorem in the classical spirit: agents whose preferences
over actions satisfy certain constraints can be modeled as if they are expected utility maximizers. As
in the earlier work, due to the language-based specification of the actions, the representation theorem
requires a construction not only of the probability and utility functions representing the agent’s be-
liefs and preferences, but also the state and outcome spaces over which these are defined, as well as
a “selection function” which intuitively captures how agents disambiguate coarse descriptions. The
(unbounded) depth of action sequencing adds substantial interest (and complexity!) to the proof.

1 Background and motivation


In earlier work, we introduced the framework of language-based decisions [2], the core idea of which
was to modify Savage’s classical decision-theoretic framework [6] by taking actions to be descriptions
in some language, rather than functions from states to outcomes, as they are defined classically. Actions
had the form “if ψ then do(ϕ )”, where ψ and ϕ were formulas in some underlying language, specifying
what effects would be brought about under what circumstances.1 For example, a statement like “If
there is a budget surplus then do(MW = 15) else no-op” would be an action in this framework, where
MW = 15 represents the minimum wage being $15, and no-op is the action of doing nothing. The effect
of the action do(MW = 15) is to bring about a state where the minimum wage is $15. But this does not
completely specify the state. (Do businesses close? Is there more automation so jobs are lost? Are no
jobs lost and more people move into the state?)
In this context, we proved a representation theorem in the classical spirit: agents whose preferences
over actions satisfy certain constraints can be modeled as if they are expected utility maximizers. This
requires constructing not only probability and utility functions (as is done classically), but also the state
and outcome spaces on which these functions are defined, and a selection function that describes which
state will result from an underspecified action like do(MW = 15). In this construction the state and out-
come spaces coincide; intuitively, this is because the tests that determine whether an action is performed
(“If ϕ then...”) and the actions themselves (“do(ψ )”) are described using the same language.
1 This work in turn extended previous work by Blume, Easley, and Halpern [3] in which the tests in actions, but not the
effects of actions, were specified in a formal language.

R. Verbrugge (Ed.): Theoretical Aspects of


Rationality and Knowledge 2023 (TARK 2023)
EPTCS 379, 2023, pp. 131–141, doi:10.4204/EPTCS.379.12

The earlier work allowed only one-step actions. But, in practice, plans are typically composed of a
sequence of steps, and we must choose among such plans: Do I prefer to walk to the cafe and then call
my friend if the cafe is open, or would it be better to call my friend first, then walk to the cafe and call
them back if it’s closed? Should I ring the doorbell once, or ring it once and then a second time if no one
replies to the first? Here, we extend the earlier framework to sequential actions, making it much more
broadly applicable.
At a technical level, a decision-theoretic framework in which the state and outcome spaces coincide is
the perfect setting in which to implement sequential actions, since—given that the actions are understood
as functions—we have an immediate and natural way to “put them in sequence”, namely, by composing
the corresponding functions.
Our contribution in this paper is, first, to lay the mathematical groundwork for reasoning about
sequential, language-based actions (Section 2), and second, to prove a representation theorem analogous
to earlier such results (Section 3): roughly speaking, agents whose preferences over sequential actions
satisfy certain axioms can be understood as if their preferences are derived by maximizing the expected
value of a suitable utility function. Proving this result is substantially harder in the present setting, owing
to the more complex nature of sequential actions (including but not limited to the fact that we allow
sequential nesting to be arbitrarily deep). The reader is thus forewarned that the main result depends on
a fairly lengthy, multi-stage proof.

2 Sequential language-based actions


The framework presented in this section is an expansion of that developed in [2]. We begin with the
same simple, formal language: let Φ denote a finite set of primitive propositions, and L the propositional
language consisting of all Boolean combinations of these primitives. A basic model (over L) is a tuple
M = (Ω, [[·]]M ) where Ω is a nonempty set of states and [[·]]M : Φ → 2Ω is a valuation function. This
valuation is recursively extended to all formulas in L in the usual way, so that intuitively, each formula
ϕ is “interpreted” as the “event” [[ϕ ]]M ⊆ Ω. We sometimes drop the subscript when the model is clear
from context, and write ω |= ϕ rather than ω ∈ [[ϕ ]]. We say that ϕ is satisfiable in M if [[ϕ ]]M ≠ ∅ and
that ϕ is valid in M if [[ϕ ]]M = Ω; we write |= ϕ to indicate that ϕ is valid in all basic models.
Given a finite set of formulas F ⊆ L, the set of (sequential) actions (over F), denoted by AF , is
defined recursively as follows:
(1) for each ϕ ∈ F, do(ϕ ) is an action (called a primitive action);
(2) no-op is an action (this is short for “no operation”; intuitively, it is a “do nothing” action);
(3) for all ψ ∈ L and α , β ∈ AF , not both no-op, if ψ then α else β is an action;
(4) for all α , β ∈ AF , not both no-op, α ; β is an action (intuitively, this is the action “do α and then do
β ”).
In [2], actions were defined only by clauses (1) and (3). The idea of “sequencing” actions is of course
not new; the semicolon notation is standard in programming languages.
It will also be useful for our main result to have a notion of the depth of an action, which intuitively
should capture how deeply nested the sequencing is. We do so by induction. The only depth-0 action is
no-op. A depth-1 action is either (1) no-op; (2) a primitive action do(ϕ ); or (3) an action of the form if
ψ then α else β , where α and β are depth-1 actions. Now suppose that we have defined depth-k actions
for k ≥ 1; a depth-(k + 1) action is either (1) a depth-k action; (2) an action of the form if ψ then α
else β , where α and β are depth-(k + 1) actions; or (3) an action of the form α ; β , where α is a depth-k1
action, β is a depth-k2 action, and k1 + k2 ≤ k + 1. Note that we have defined depth in such a way that
the depth-k actions include all the depth-k′ actions for k′ < k, and so that if...then constructions do not
increase depth—only sequencing does.
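The action grammar and the depth definition can be encoded directly. Below is a minimal Python sketch of this encoding; the names (`Formula`, `Action`, `depth`) and the use of opaque strings for formulas are our own assumptions, not part of the formal framework.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical encoding of the action grammar; formulas are opaque strings.
Formula = str

@dataclass(frozen=True)
class NoOp:
    pass

@dataclass(frozen=True)
class Do:            # primitive action do(phi), phi in F
    phi: Formula

@dataclass(frozen=True)
class IfThenElse:    # if psi then alpha else beta
    psi: Formula
    then: "Action"
    els: "Action"

@dataclass(frozen=True)
class Seq:           # alpha ; beta
    first: "Action"
    second: "Action"

Action = Union[NoOp, Do, IfThenElse, Seq]

def depth(a: Action) -> int:
    """Least k such that a is a depth-k action: if...then does not
    increase depth, only sequencing does (a sequence has depth >= 2)."""
    if isinstance(a, NoOp):
        return 0
    if isinstance(a, Do):
        return 1
    if isinstance(a, IfThenElse):
        return max(depth(a.then), depth(a.els), 1)
    if isinstance(a, Seq):
        return max(2, depth(a.first) + depth(a.second))
    raise TypeError(a)
```

For instance, `depth(Seq(Do("p"), Do("q")))` is 2, while wrapping either component in an if...then leaves the depth unchanged.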
As in [2], given a basic model M = (Ω, [[·]]M ), we want do(ϕ ) to correspond to a function from Ω to Ω
whose range is contained in [[ϕ ]]M . For this reason we restrict our attention to basic models in which each
ϕ ∈ F is satisfiable, so that [[ϕ ]]M ≠ ∅; call such models F-rich. Moreover, in order for do(ϕ ) to pick out
a function, we need some additional structure that determines, for each ω ∈ Ω, which state in [[ϕ ]]M the
function corresponding to do(ϕ ) should actually map to. This is accomplished using a selection function
sel : Ω × F → Ω satisfying sel(ω , ϕ ) ∈ [[ϕ ]]M .
The intuition for selection functions is discussed in greater detail in [2]. Briefly: do(ϕ ) says that
ϕ should be made true, but there may be many ways of making ϕ true (i.e., many states one could
transition to in which ϕ is true); sel picks out which of these ϕ -states to actually move to. In this way we
can think of sel as serving to “disambiguate” the meaning of the primitive actions, which are inherently
underspecified.
Note that selection functions are formally identical to the mechanism introduced by Stalnaker [9]
to interpret counterfactual conditionals. In our context, we can think of a selection function as another
component of an agent’s model of the world, to be constructed in the representation theorem: in addition
to a probability measure (to represent their beliefs) and a utility function (to capture their preferences),
we will also need a selection function (to specify how they interpret actions).
A selection model (over F) is an F-rich basic model M together with a selection function sel :
Ω × F → Ω satisfying sel(ω , ϕ ) ∈ [[ϕ ]]M . Given a selection model (M, sel) over F, we define the inter-
pretation of do(ϕ ) to be the function [[do(ϕ )]]M,sel : Ω → Ω given by:

[[do(ϕ )]]M,sel (ω ) = sel(ω , ϕ ).

This interpretation can then be extended to all sequential actions in AF in the obvious way:
[[if ψ then α else β ]]M,sel (ω ) = { [[α ]]M,sel (ω )  if ω ∈ [[ψ ]]
                                    { [[β ]]M,sel (ω )  if ω ∉ [[ψ ]],

and
[[α ; β ]]M,sel = [[β ]]M,sel ◦ [[α ]]M,sel .
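This interpretation is a straightforward recursion on the structure of actions, which can be sketched in a few lines of Python. The encoding below (tagged tuples for actions, a `holds` predicate standing in for ω |= ψ, and `sel` for the selection function) is our own hypothetical choice.

```python
# Hypothetical sketch of the action semantics [[.]]_{M,sel}.
# A state w is any value; holds(psi, w) plays the role of w |= psi,
# and sel(w, phi) is the selection function.

def interpret(action, w, holds, sel):
    """Return [[action]]_{M,sel}(w)."""
    kind = action[0]
    if kind == "no-op":
        return w
    if kind == "do":                 # [[do(phi)]](w) = sel(w, phi)
        _, phi = action
        return sel(w, phi)
    if kind == "if":                 # branch on whether the test holds at w
        _, psi, alpha, beta = action
        branch = alpha if holds(psi, w) else beta
        return interpret(branch, w, holds, sel)
    if kind == "seq":                # [[a ; b]] = [[b]] o [[a]]
        _, alpha, beta = action
        return interpret(beta, interpret(alpha, w, holds, sel), holds, sel)
    raise ValueError(kind)
```

For example, with two states 0 and 1, a single proposition p true only at state 1, and a selection function that maps every state to 1 on p and to 0 on ¬p, the action "do(p); if p then do(¬p)" run from state 0 first moves to state 1 and then back to state 0.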

3 Representation
Let ⪰ be a binary relation on AF , where we understand α ⪰ β as saying that α is “at least as good as”
β from the agent’s subjective perspective. Intuitively, such a binary relation is meant to be reasonably
“accessible” to observers, “revealed” by how an agent chooses between binary options. As usual, we
define α ≻ β as an abbreviation of α ⪰ β and β ⋡ α , and α ∼ β as an abbreviation of α ⪰ β and β ⪰ α ;
intuitively, these relations represent “strict preference” and “indifference”, respectively.
We assume that ⪰ is a preference order, so is complete (i.e., for all acts α , β ∈ AF , either α ⪰ β
or β ⪰ α ) and transitive. Note that completeness immediately gives reflexivity as well. While there are
good philosophical reasons to consider incomplete relations (see [4] and the references therein), for the
purposes of this paper we adopt the assumption of completeness in order to simplify the (already quite
involved) representation result.

A language-based SEU (Subjective Expected Utility) representation for a relation ⪰ on AF is a
selection model (M, sel) together with a probability measure Pr on Ω and a utility function u : Ω → R
such that, for all α , β ∈ AF ,

α ⪰ β ⇔ ∑_{ω ∈Ω} Pr(ω ) · u([[α ]]M,sel (ω )) ≥ ∑_{ω ∈Ω} Pr(ω ) · u([[β ]]M,sel (ω )). (1)

Our goal is to show that such a representation exists if the preference order satisfies one key axiom,
discussed below.

3.1 Canonical maps and canonical actions


For each a ⊆ Φ, let

ϕa = ⋀_{p∈a} p ∧ ⋀_{q∉a} ¬q,

so ϕa settles the truth value of each primitive proposition in the language L: it says that p is true iff it
belongs to a. An atom is a formula of the form ϕa .2 Since we are working with a classical propositional
logic, it follows that for all formulas ϕ ∈ L and atoms ϕa , the truth of ϕ is determined by ϕa : either
|= ϕa → ϕ , or |= ϕa → ¬ϕ . In the framework of [2], it followed that every action could be identified with
a function from atoms to elements of F, since atoms determine whether the tests in an action hold. In
our context, however, things are not so simple: actions can be put in sequence, so even though an atom
may tell us which tests at the “first layer” hold, so to speak, it may not be enough to tell us which later
tests hold. For example, in an action like “if p then do(r) else do(r′ ); if q then do(¬r)”, the atom that
currently holds determines whether p holds, but tells us nothing about whether q will hold when we get
around to doing the second action in the sequence.
To deal with this, we need an outcome space that is richer than just F (i.e., richer than the set of
all primitive actions); roughly speaking, we will instead identify actions with functions from atoms to
“canonical” ways of describing the sequential structure of actions. We now make this precise.
Suppose that |2Φ | = N, so there are N atoms; call them a1 , . . . , aN . For each subset A of atoms, let
ϕA = ⋁_{a∈A} ϕa . A basic fact of propositional logic is that for every formula ϕ , there is a unique set A of
atoms such that ϕ is logically equivalent to ϕA . Let F̃ = {ϕA : (∃ϕ ∈ F)(|= ϕA ↔ ϕ )}.
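Computing atoms and the set A associated with a formula is a small exercise in truth tables. The sketch below is hypothetical: we identify an atom with the set of primitive propositions it makes true, and stand in for a formula with a Python predicate over atoms.

```python
from itertools import chain, combinations

# Hypothetical sketch: an atom is identified with the set a of primitive
# propositions that phi_a makes true.

def atoms(props):
    """All subsets a of props; each a stands for the atom phi_a."""
    props = sorted(props)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(props, r)
                                for r in range(len(props) + 1))]

def equivalent_atom_set(phi, props):
    """The unique A with |= phi_A <-> phi: the atoms at which phi holds.
    Here phi is a predicate over atoms, standing in for a formula."""
    return {a for a in atoms(props) if phi(a)}
```

With Φ = {p, q} there are N = 4 atoms, and the formula p ∨ q is equivalent to ϕA for the three atoms that make p or q true.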
We want to associate with each action α of depth k a canonical action γα of depth k that is, intuitively,
equivalent to α . The canonical action γα makes explicit how α acts in a state characterized by an atom
a. We define γα by induction on the structure of α . It is useful in the construction to simultaneously
define the canonical map cα associated with α , a function from atoms to actions such that, for all atoms
a, cα (a) has the form no-op, do(ϕA ), or do(ϕA ); γβ for some set A of atoms and action β . Intuitively, cα
defines how α acts in a state characterized by an atom a. For example, if α is if a then do(ϕA ) else β ,
then cα (a) = do(ϕA ).
If α = no-op, then γno-op = no-op and cno-op is the constant function such that cno-op (a) = no-op for
all atoms a. If α is a depth-1 action other than no-op, then we define cα by induction on structure:

cdo(ϕ ) (a) = do(ϕA ), where A is the unique subset of atoms such that |= ϕA ↔ ϕ ;

cif ψ then α else β (a) = { cα (a)  if |= ϕa → ψ
                           { cβ (a)  if |= ϕa → ¬ψ .

2 Not to be confused with atomic propositions, which is another common name for primitive propositions.

The action γα is the depth-1 action defined as follows:

γα = if ϕa1 then cα (a1 ) else (if ϕa2 then cα (a2 ) else (· · · (if ϕaN−1 then cα (aN−1 ) else cα (aN ))· · · ))

if at least one of cα (aN−1 ) or cα (aN ) is not no-op. If both are no-op, then if ϕaN−1 then cα (aN−1 ) else
cα (aN ) is not an action according to our definitions; in this case, we take

γα = if ϕa1 then cα (a1 ) else (if ϕa2 then cα (a2 ) else (· · · (if ϕam then cα (am ) else no-op)),

where m is the least index such that cα (am ) ≠ no-op. (If cα (am ) = no-op for all m, then γα = no-op.)
If α is a depth-(k + 1) action other than no-op, then we again define cα by induction on structure:
cif ψ then α else β (a) = { cα (a)  if |= ϕa → ψ
                           { cβ (a)  if |= ϕa → ¬ψ

cα ;β (a) = { cβ (a)            if cα (a) = no-op
            { do(ϕA ); γβ       if cα (a) = do(ϕA )
            { do(ϕA ); γβ ′ ;β   if cα (a) = do(ϕA ); γβ ′ .

The canonical action γα is defined as above for the depth-1 case.


We take CAk to be the set of canonical actions of depth k, and CM k to be the set of canonical maps
that correspond to some depth-k action. Finally, let CAk,− consist of all depth-k actions of the form
no-op, do(ϕA ), or do(ϕA ); γβ , where β is a depth-(k − 1) action. Note that if α is a depth-k action and a
is an atom, then cα (a) ∈ CAk,− . Observe that since the set of atoms is finite, as is F̃, it follows that for
all k, CM k , CAk , and CAk,− are also finite. This will be crucial in our representation proof.
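For depth-1 actions, the canonical map can be computed directly from the two defining clauses. The following is a hypothetical sketch under our own encoding: atoms are frozensets of true propositions, formulas are predicates over atoms (so |= ϕa → ψ is just "ψ holds at a"), and actions are tagged tuples.

```python
# Hypothetical sketch of the canonical map c_alpha for depth-1 actions.
# An atom a is a frozenset of true propositions; a formula is a predicate
# over atoms; do(phi) is canonicalized to do(phi_A), with A the set of
# atoms satisfying phi.

def canonical_map(action, all_atoms):
    def c(a):
        kind = action[0]
        if kind == "no-op":
            return ("no-op",)
        if kind == "do":
            _, phi = action
            A = frozenset(b for b in all_atoms if phi(b))
            return ("do", A)          # c_do(phi)(a) = do(phi_A)
        if kind == "if":
            _, psi, alpha, beta = action
            branch = alpha if psi(a) else beta   # |= phi_a -> psi ?
            return canonical_map(branch, all_atoms)(a)
        raise ValueError(kind)
    return c
```

For example, with one proposition p, the action "if p then do(¬p) else no-op" has canonical map sending the atom {p} to do(ϕ_{{∅}}) and the atom ∅ to no-op.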

3.2 Cancellation
As in [2, 3], the key axiom in our representation theorem is what is known as a cancellation axiom,
although the details differ due to the nature of our actions. Simple versions of the cancellation axiom go
back to [5, 7]; our version, like those used in [2, 3], has more structure. See [3] for further discussion of
the axiom.
The axiom uses multisets. Recall that a multiset, intuitively, is a set that allows for multiple instances
of each of its elements. Thus two multisets are equal just in case they contain the same elements with
the same multiplicities. We use “double curly brackets” to denote multisets, so for instance {{a, b, b}} is
a multiset, and it is distinct from {{a, a, b}}: both have three elements, but the multiplicities of a and b
differ. With that background, we can state the axiom:
(Canc) Let α1 , . . . , αn , β1 , . . . , βn ∈ AF , and suppose that for each a ⊆ Φ we have

{{cα1 (a), . . . , cαn (a)}} = {{cβ1 (a), . . . , cβn (a)}}.

Then, if for all i < n we have αi ⪰ βi , it follows that βn ⪰ αn .


Intuitively, this says that if we get the same outcomes (counting multiplicity) using the canonical maps
for α1 , . . . , αn as for β1 , . . . , βn in each state, then we should view the collections {{α1 , . . . , αn }} and
{{β1 , . . . , βn }} as being “equally good”, so if αi is at least as good as βi for i = 1, . . . , n − 1, then, to
balance things out, βn should be at least as good as αn . How intuitive this is depends on how intuitive
one finds the association α 7→ cα defined above; if the map cα really does capture “everything decision-
theoretically relevant” about the action α , then cancellation does seem reasonable.

In particular, it is not hard to show that whenever α and β are such that γα = γβ (which of course
is equivalent to cα = cβ ), cancellation implies that α ∼ β . In other words, any information about α and
β that is lost in the transformation to canonical actions is also forced to be irrelevant to decisionmaking.
This means that (Canc) entails, among other things, that agents do not distinguish between logically
equivalent formulas (since, e.g., when |= ϕ ↔ ϕ ′ , it’s easy to see that γdo(ϕ ) = γdo(ϕ ′ ) ).
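The premise of (Canc) is a purely mechanical multiset comparison, so it can be checked in code. A hypothetical sketch, in which each action is represented directly by its canonical map (a function from atoms to opaque canonical descriptions):

```python
from collections import Counter

# Hypothetical sketch: checking the premise of (Canc). Counter gives
# multiset equality (same elements with the same multiplicities).

def canc_premise_holds(alphas, betas, atoms):
    """True iff for every atom a the multisets
    {{c_alpha(a) : alpha in alphas}} and {{c_beta(a) : beta in betas}}
    coincide; alphas and betas are lists of canonical maps."""
    return all(
        Counter(c(a) for c in alphas) == Counter(c(a) for c in betas)
        for a in atoms
    )
```

In particular, permuting a list of actions never changes the multisets, which is what lets (Canc) compare the two collections "in aggregate" rather than pairwise.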

3.3 Construction
Theorem 1. If ⪰ is a preference order on AF satisfying (Canc), then there is a language-based subjective
expected utility representation of ⪰.

Proof. As in [2], we begin by following the proof in [3, Theorem 2], which says that if a preference order
on a set of acts mapping a finite state space to a finite outcome space satisfies the cancellation axiom, then
it has a state-dependent representation. “State-dependent” here means that the utility function constructed
depends jointly on both states and outcomes, in a sense made precise below. To apply this theorem in
our setting, we first fix k and take CM k to be the set of acts. With this viewpoint, the state space is the set
of atoms and the outcome space is CAk,− ; as we observed, both are finite.
The relation ⪰ on AF induces a relation ⪰k on CM k defined in the natural way:

cα ⪰k cβ ⇔ α ⪰ β .

As noted, (Canc) implies that α ∼ α ′ whenever cα = cα ′ , from which it follows that ⪰k is well-defined.
To apply Theorem 2 in [3], it must also be the case that ⪰k is a preference order and satisfies cancellation,
which is immediate from the definition of ⪰k and the fact that ⪰ is a preference order and satisfies
cancellation. It therefore follows that ⪰k has a state-dependent representation; that is, there exists a
real-valued utility function vk defined on state-outcome pairs such that, for all depth-k actions α and β ,

cα ⪰k cβ iff ∑_{i=1}^N vk (ai , cα (ai )) ≥ ∑_{i=1}^N vk (ai , cβ (ai )). (2)

It follows from our definitions that for all depth-k actions α and β ,

α ⪰ β iff ∑_{i=1}^N vk (ai , cα (ai )) ≥ ∑_{i=1}^N vk (ai , cβ (ai )).

As we observed, we needed to restrict to depth-k actions here in order to ensure that the outcome space
is finite, which is necessary to apply Theorem 2 in [3].
Our next goal is to define a selection model M = (Ωk , [[·]]M , sel), a probability Prk on Ωk , and a utility
function uk on Ωk such that, for all actions α and β of depth k,

α ⪰ β iff ∑_{ω ∈Ωk} Prk (ω )uk ([[α ]]M,sel (ω )) ≥ ∑_{ω ∈Ωk} Prk (ω )uk ([[β ]]M,sel (ω )). (3)

Eventually, we will construct a single (state and outcome) space Ω∗ , a probability Pr∗ on Ω∗ , and a utility
u∗ on Ω∗ that we will use to provide a single representation theorem for all actions, without the restriction
to depth k, but we seem to need to construct the separate spaces first.
As a first step to defining Ωk , define a labeled k-tree to be a balanced tree of depth k whose root is
labeled by an atom such that each non-leaf node has exactly N children, labeled a1 , . . . , aN , respectively.
An ordered labeled k-tree (k-olt) is a labeled k-tree where, associated with each non-leaf node, there is a
total order on its children. We assume that in different labeled k-trees, the nodes come from the same set,
and corresponding nodes have the same label, so there is a unique labeled k-tree and k-olts differ only in

the total order associated with each non-leaf node and the label of the root. Let T k consist of all k-olts.
For k′ ≥ k, a k′ -olt sk′ extends (or is an extension of ) a k-olt sk if sk is the prefix of sk′ of depth k; we
call sk the projection of sk′ onto depth k.
The intuition behind a k-olt is the following: the atom associated with the root r describes what is
true before an action is taken. For each non-leaf node t, the total order associated with t on the children
of t describes the selection function at t (with children lower in the order considered “closer”). For
example, suppose that there are two primitive propositions, p and q. Then there are four atoms. If we
take the action do(ϕ ) starting at r, we want to “move” to the “closest” child of r satisfying ϕ , which is
the child lowest in the ordering associated with r. For example, suppose that the total order on the atoms
associated with r is ¬p ∧ q < ¬p ∧ ¬q < p ∧ ¬q < p ∧ q. Then if we take the action do(p ∨ q) starting at
r, we move to the child labeled with the atom ¬p ∧ q; if we instead do do(p ∨ ¬q), we move to the child
labeled ¬p ∧ ¬q; and if instead we do if q then do(p ∨ q) else do(p ∨ ¬q), which of these two children
we move to depends on whether q is true at the atom labeling r. For an action do(p ∨ q); do(p ∨ ¬q), we
move further down the tree. The first action, do(p ∨ q), takes us to the child t of r labeled ¬p ∧ q. We
then take the action do(p ∨ ¬q) from there, which gets us to a child of t. Which one we get to depends
on the ordering of the children of t associated with t.
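The worked example of selection via the total order at r can be reproduced in a few lines. The following hypothetical sketch encodes an atom as a pair of truth values for p and q:

```python
# Hypothetical sketch of selection via a total order on atoms, following
# the example above. An atom is a pair (p, q) of truth values, and the
# order lists atoms from closest to farthest.
order = [(False, True), (False, False), (True, False), (True, True)]
#        ~p & q      <  ~p & ~q      <  p & ~q      <  p & q

def select(phi):
    """Move to the closest (lowest in the order) atom satisfying phi."""
    return next(a for a in order if phi(a))

p_or_q     = lambda a: a[0] or a[1]
p_or_not_q = lambda a: a[0] or not a[1]

select(p_or_q)      # -> (False, True),  i.e. the atom ~p & q
select(p_or_not_q)  # -> (False, False), i.e. the atom ~p & ~q
```

This matches the text: do(p ∨ q) moves to the child labeled ¬p ∧ q, while do(p ∨ ¬q) moves to the child labeled ¬p ∧ ¬q.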
It turns out that our states must be even richer than this; they must in addition include a k-progress
function g that maps each node t in a k-olt sk to a descendant of t in sk . We give the intuition behind
progress functions shortly. We take Ωk to consist of all pairs (sk , g), where sk ∈ T k and g is a k-progress
function and for each primitive proposition p ∈ Φ, we define

[[p]] = {(sk , g) : p ∈ a, where a labels g(r) and r is the root of sk }.

We now want to associate with each depth-k action α a function fα : Ωk → Ωk ; intuitively, this is the
transition on states that we want to be induced by the selection function. To begin, we define fα only on
states of the form (sk , id), where id is the identity function. We take fα (sk , id) = (sk , gα ,sk ), where gα ,sk is
defined formally below. Intuitively, if t is a node at depth k′ of sk , then gα ,sk (t) describes the final state if
the action α were to (possibly counterfactually) end up at the node t after running for k′ steps, and then
continued running.
Given a k-olt sk whose root r is labeled a and an action α of depth at most k, we define gα ,sk (t) by
induction on the depth of α . For the base case, we take gno-op,sk = id. Now suppose inductively that α
has depth m and we have defined gα ′ ,sk for all actions α ′ of depth m − 1. There are three cases to consider.
(1) If cα (a) = no-op, then gα ,sk = id. (2) If cα (a) = do(ϕA ), then gα ,sk (r) is the “closest” (i.e., minimal)
child t ′ of r among those labeled by an atom in A, according to the total order labeling r; gα ,sk (t) = t
for all nodes t 6= r. (3) If cα (a) = do(ϕA ); γβ (which means β is an action of depth at most m − 1), then

gα ,sk (r) = gβ ,sk,t ′ (t ′ ), where t ′ = gdo(ϕA ),sk (r) and sk,t ′ is the (k − 1)-subolt of sk rooted at t ′ . The intuition
here is that gα ,sk (r) is supposed to output the descendent of r that is reached by doing α ; the fact that
cα (a) = do(ϕA ); γβ tells us that the way α works (in a state where a holds) is by first making ϕA true,
and then following up with β . This means we must first move to the “closest” child of r where ϕA holds,
which is t ′ , and subsequently move to whichever descendant of t ′ β directs us to (which is defined,
by the inductive hypothesis). Finally, if t ≠ r, let t ′′ be the first step on the (unique) path from r to t
and let sk,t ′′ be the (k − 1)-subolt of sk rooted at t ′′ . Then gα ,sk (t) = gβ ,sk,t ′′ (t) (where, once again, this is
defined by the inductive hypothesis). This essentially forces us to “follow” the unique path from r to t,
and then continue from that point by doing whatever the remaining part of the action α demands. It is
clear from this definition that if the root of sk is labeled by a, then gα ,sk = gcα (a),sk .
We now extend fα to states of the form (sk , gβ ,sk ) by setting fα (sk , gβ ,sk ) = fβ ;α (sk , id). Intuitively,

the state (sk , gβ ,sk ) is a state where β has “already happened” (i.e., it’s the state we would arrive at by
doing β in (sk , id)) so doing α in this state should be the same as doing first β then α in (sk , id).
Observe that a k-progress function gα ,sk not only tells us the node that α would reach if it started
at the root of sk , but also gives a great deal of counterfactual information about which nodes would be
reached starting from anywhere in sk . This is in the same spirit as subgame-perfect equilibrium [8],
which can depend on what happens at states that are never actually reached in the course of play, but
could have been reached if play had gone differently. Like this game-theoretic notion, our representation
theorem requires a kind of counterfactual information.
In light of (2), to prove (3), it suffices to define our selection function sel so that [[α ]]M,sel = fα , and
find Prk and uk such that for all actions α of depth k,

∑_{i=1}^N vk (ai , cα (ai )) = ∑_{(sk ,g)∈Ωk} Prk (sk , g)uk ( fα (sk , g)). (4)

Our definition of fα is set up to make defining the right selection function straightforward: we simply
set sel((sk , g), ϕ ) = fdo(ϕA ) (sk , g), where A is the unique set of atoms such that |= ϕA ↔ ϕ . It is then easy
to check that [[α ]]M,sel = fα .
Define Prk (sk , g) = 0 if g ≠ id, and Prk (sk , id) = 1/|T k | for all sk ∈ T k . Given this, to establish (4), it
suffices to define uk such that for all actions α of depth k,

|T k | ∑_{i=1}^N vk (ai , cα (ai )) = ∑_{sk ∈T k} uk ( fα (sk , id)). (5)

Given an atom a, let Tak consist of all k-olts whose root is labeled by a. By definition of fα , to prove
(5), it suffices to prove, for each atom a ∈ {a1 , . . . , aN } and all actions α of depth k, that

|T k | vk (a, cα (a)) = ∑_{sk ∈Tak} uk (sk , gα ,sk ) = ∑_{sk ∈Tak} uk (sk , gcα (a),sk ), (6)

where the second equality follows from the fact, observed above, that gα ,sk = gcα (a),sk whenever sk ∈ Tak .
Since vk is given, for each depth-k action cα (a), the left-hand side of (6) is just a number. Replace
each term uk (sk , gcα (a),sk ) for sk ∈ Tak by the variable xsk ,gcα (a),sk . This gives us a system of linear equations,
one for each action cα (a), with variables xsk ,g , where the coefficient of xsk ,g in the equation corresponding
to action α is either 1 or 0, depending on whether gα ,sk = g. We want to show that this system has a
solution.
We can describe the relevant equations as the product MX = U of matrices, where M is a matrix
whose entries are either 0 or 1, and X is a vector of variables (namely, the variables xsk ,g ). The matrix M
has one row corresponding to each action in CAk,− (since, for all actions α of depth k, cα (a) ∈ CAk,− ),
and one column corresponding to each state (sk , g) with sk ∈ Tak . The entry in M in the row corresponding
to the action γα and the column corresponding to (sk , g) is 1 if gγα ,sk = g (i.e., if fα (sk , id) = (sk , gα ,sk ) =
(sk , g)) and 0 otherwise. A basic result of linear algebra tells us that this system has a solution if the rows
of the matrix M (viewed as vectors) are independent. We now show that this is the case.
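The linear-algebra fact being invoked here (if the rows of M are independent, then MX = U is solvable for every right-hand side U) can be illustrated concretely. The following is a hypothetical sketch, not part of the proof: a tiny Gaussian elimination over the rationals that produces one solution for a full-row-rank system, even when there are more columns (states) than rows (actions).

```python
from fractions import Fraction

def solve(M, U):
    """Solve M x = U for one solution, assuming the rows of M are
    linearly independent (full row rank); columns may outnumber rows."""
    rows = [[Fraction(v) for v in r] + [Fraction(u)] for r, u in zip(M, U)]
    n, m = len(rows), len(M[0])
    pivots, r = [], 0
    for c in range(m):
        piv = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]    # normalize pivot row
        for i in range(n):                             # clear the column
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n:
            break
    # Free (non-pivot) variables are set to 0; pivot variables read off
    # from the reduced system.
    x = [Fraction(0)] * m
    for i, c in enumerate(pivots):
        x[c] = rows[i][-1]
    return x
```

For a 0/1 matrix like the one in the proof, independence of the rows guarantees that elimination finds a pivot for every row, so the system always has a solution.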
Let rα be the row of M indexed by action α ∈ CAk,− . Suppose that a linear combination of rows is
0; that is, ∑α dα rα = 0, for some scalars dα . The idea is to put a partial order ❂ on CAk,− and show by
induction on ❂ that for all α ∈ CAk,− , the coefficient dα = 0.
A. Bjorndahl & J.Y. Halpern 139

We define ❂ as follows. We take no-op to be the minimal element of ❂. For actions α = do(ϕA ); γβ
and α ′ = do(ϕA′ ); γβ ′ (where we take γβ to be no-op if α = do(ϕA ) and similarly for γβ ′ ), α ❂ α ′ iff
either (1) A ⊋ A′ , (2) A = A′ , β ≠ no-op, and β ′ = no-op, or (3) A = A′ , cβ (a) ❂ cβ ′ (a) for all atoms a.
We show that dα = 0 by induction assuming that dα ′ = 0 for all actions α ′ ∈ CAk,− such that α ❂ α ′ .
For the base case, α = no-op. Consider the k-progress function gkno-op such that gkno-op (t) = t for all nodes
t in a k-olt. Note that gno-op,sk = gkno-op for all k-olts sk . It is easy to see that if β has the form do(ϕA )
or do(ϕA ); γβ ′ , then for all k-olts sk , gβ ,sk ≠ gkno-op (since for the root r of sk , gβ ,sk (r) ≠ r). Thus, the entry
of rno-op corresponding to the column (sk , gkno-op ) is 1, while the entry of rβ for β ≠ no-op corresponding
to this column is 0. It follows that dno-op = 0.
For the general case, suppose that we have an arbitrary action α ≠ no-op in CAk,− and dα ′ = 0
for all α ′ ∈ CAk,− such that α ❂ α ′ . We now define a k-olt sk,α ∈ Tak such that if gα ′ ,sk,α = gα ,sk,α and
α ≠ α ′ , then α ❂ α ′ , so dα ′ = 0 by the induction hypothesis. Once we show this, it follows that dα = 0
(since otherwise the entry in ∑α ′ dα ′ rα ′ corresponding to gα ,sk,α would be nonzero). We construct sk,α by
induction on the depth of α . If α has depth 1 and is not no-op, it must be of the form do(ϕA ) for some
set A of atoms. Suppose that b ∈ A. Let the total order at the root of sk,α be such that the final elements
in the order are the elements in A, and b is the first of these. For example, if A = {b, c, d}, we could
consider an order where the final three elements are b, c, and d (or b, d, and c). Note that if r is the root
of sk,α , then gα ,sk,α (r) is the child tb of r labeled b. Now consider an action α ′ of the form do(ϕA′ ); β (β
may be no-op). If A′ contains an element not in A, then gα ′ ,sk,α (r) ≠ tb (because there will be an atom in
A′ that is below b in the total order at r, hence closer). If A′ ⊂ A, then α ❂ α ′ , as desired. And if A = A′
and α ≠ α ′ , then α ′ = do(ϕA ); γβ and β ≠ no-op, so it is easy to see that gα ′ ,sk,α (r) ≠ tb = gα ,sk,α (r).
Suppose that m > 1 and we have constructed sk,β for all actions β ∈ CAk,− of depth less than m. We
now show how to construct sk,α for actions α ∈ CAk,− of depth m that are not of depth m − 1. This
means that α must have the form do(ϕA ); β . We construct the total order at r as above, and at the subtree
of sk,α whose root is the child of r labeled a, we use the same orderings as in sk−1,cβ (a) , which by the
induction hypothesis we have already determined. It now follows easily from the induction hypothesis
that if gα ′ ,sk,α = gα ,sk,α and α ≠ α ′ , then α ❂ α ′ . This completes the argument for (6).
The argument above gives us a representation theorem for each k that works for actions of depth k.
However, we are interested in a single representation theorem that works for all actions of all depths
simultaneously. The first step is to make the state-dependent utility functions v1 , v2 , . . . that we began
with (one utility function for each k in the argument above) v-compatible, in the sense that if α is a
depth-k action and k′ > k, then vk′ (a, cα (a)) = vk (a, cα (a)). That is, we want to construct a sequence
(v1 , v2 , v3 , . . .) of v-compatible utility functions, each of which satisfies (2). We proceed as follows.
We can assume without loss of generality that each utility function has range in [0, 1], by applying
an affine transformation. (Doing this does not affect (2).) For each utility function vk, let vk,i, for i ≤ k,
be the restriction of vk to actions of depth i. Thus, vk,k = vk. Now consider the sequence v1,1, v2,1, v3,1, . . ..
Since these functions take values in [0, 1], it must have a convergent subsequence, say vm1,1, vm2,1, vm3,1, . . .;
say it converges to w1. Now consider the subsequence vm2,2, vm3,2, . . .. (We omit vm1,2, since we may have
m1 = 1, in which case vm1,2 is not defined.) It too has a convergent subsequence; say it converges to w2.
Continuing this process, for each k, we can find a convergent subsequence, which is a subsequence of the
sequence we found for k − 1. It is easy to check that the limits w1, w2, w3, . . . of these convergent
subsequences satisfy (2) and are v-compatible (since, in general, vk,i is v-compatible with vk,j for
i, j ≤ k). For the remainder of this discussion, we assume without loss of generality that the utility
functions in the sequence v1, v2, . . . are v-compatible.
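The nested-subsequence construction above is the standard diagonal argument. Schematically (the stage-wise index notation m^(i)_j is ours, with each stage's index set refining the previous one):

```latex
% Stage i extracts a convergent subsequence of the depth-i restrictions,
% reusing only indices that survived stage i-1 (dropping the first index
% at each stage, since it may be too small for depth i to be defined):
\begin{align*}
  v_{m^{(1)}_1,1},\ v_{m^{(1)}_2,1},\ \dots &\longrightarrow w_1,
    & \{m^{(1)}_j\}_j &\subseteq \mathbb{N},\\
  v_{m^{(2)}_1,2},\ v_{m^{(2)}_2,2},\ \dots &\longrightarrow w_2,
    & \{m^{(2)}_j\}_j &\subseteq \{m^{(1)}_j\}_{j \ge 2},\\
  &\;\;\vdots\\
  v_{m^{(i)}_1,i},\ v_{m^{(i)}_2,i},\ \dots &\longrightarrow w_i,
    & \{m^{(i)}_j\}_j &\subseteq \{m^{(i-1)}_j\}_{j \ge 2}.
\end{align*}
```

Because each wi is a limit along a subsequence of the indices used for wi−1, compatibility statements that hold along the later sequence automatically hold in the limits, which is what makes the wi jointly v-compatible.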
Note that it follows easily from our definition that probability measures in the sequence Pr1, Pr2, . . .
are Pr-compatible in the following sense: if k′ > k, (sk, id) ∈ Ωk, and Ek′(sk, id) consists of all the pairs
(tk′, id) such that sk is the projection of tk′ onto depth k, then Prk(sk, id) = Prk′(Ek′(sk, id)). We will also
want a third type of compatibility among the utility functions. To make this precise, define a k′-progress
function g to be k-bounded for k < k′ if for all nodes t of depth ≤ k, we have that g(t) has depth ≤ k, and
if the depth of t is greater than k, then g(t) = t. Note that if α is a depth-k action, then gα,sk′ is k-bounded.
If k′ > k and g is a k-bounded k′-progress function, then g has an obvious projection to a k-progress
function. We want the utility functions in the sequence u1, u2, . . . that satisfies (4) to be u-compatible
in the following sense: if g′ is a k′-progress function that is k-bounded, g is the projection of g′ to a k-
progress function, and sk is the projection of tk′ onto depth k, then uk(sk, g) = uk′(tk′, g′). We can assume
without loss of generality that the utility functions in the sequence u1, u2, . . . are u-compatible. For,
given a sequence u1, u2, . . ., we can define a u-compatible sequence w1, w2, . . . as follows. Let w1 = u1.
Suppose that we have defined w1, . . . , wk. If the (k + 1)-progress function g′ is k-bounded, define
wk+1(tk+1, g′) = wk(sk, g), where sk is the projection of tk+1 onto depth k and g is the projection of g′ to
a k-progress function; if g′ is not k-bounded, define wk+1(tk+1, g′) = uk+1(tk+1, g′). Clearly the sequence
w1, w2, . . . is u-compatible. Moreover, it is easy to check that (Prk, wk) satisfies (4).
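The recursion defining the wk can be made concrete in a toy model. Here "states" of depth k are bit-strings of length k, and a progress function is abstracted to just its bound b, the least k for which it is k-bounded; this simplification (and the names `build_w`, `u`, `w`) is ours, not the paper's formal setup:

```python
import itertools
import random

def build_w(u, K):
    """Turn utility families u[1..K] into a u-compatible family w[1..K],
    following the recursion in the text: bounded entries inherit their
    value from the previous level; unbounded ones keep u's value."""
    w = {1: dict(u[1])}
    for k in range(1, K):
        w[k + 1] = {}
        for (t, b), val in u[k + 1].items():
            if b <= k:
                # g' is k-bounded: project the state to depth k and reuse w[k]
                w[k + 1][(t, b)] = w[k][(t[:k], b)]
            else:
                # g' is not k-bounded: take u's value unchanged
                w[k + 1][(t, b)] = val
    return w

# Random toy utilities: u[k] maps (depth-k state, bound b <= k) to a utility.
random.seed(0)
u = {k: {(''.join(bits), b): random.random()
         for bits in itertools.product('01', repeat=k)
         for b in range(1, k + 1)}
     for k in range(1, 4)}

w = build_w(u, 3)

# u-compatibility: a bound-b entry at depth k' agrees with every projection
# down to any depth k with b <= k < k'.
for k2 in range(2, 4):
    for (t, b), val in w[k2].items():
        for k in range(b, k2):
            assert val == w[k][(t[:k], b)]
print("u-compatible")
```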
We are now ready to define a single state space. Define an ∞-olt just like a k-olt, except that now the
tree is unbounded, rather than having depth k. Let Ω∞ consist of all pairs (s∞, g), where s∞ is an ∞-olt and
g is a k-bounded progress function for some k. This will be our state space. Define E∞(sk, id) by obvious
analogy to Ek′(sk, id): it consists of all the pairs (t∞, id) such that t∞ extends sk. Then, by Carathéodory's
extension theorem [1], there is a measure Pr∞ on the smallest σ-algebra extending the algebra generated
by the sets E∞(sk, id) that agrees with Prk for all k (i.e., Prk(sk, id) = 1/|Tk| = Pr∞(E∞(sk, id))). Let u∞
be defined by taking u∞(s∞, g) = uk(sk, gk) if g is k-bounded and sk is the unique k-olt that s∞ extends,
where gk is the projection of g to a k-progress function. It is easy to check that this is well-defined (note
that if g is k-bounded then g is also k′-bounded for k′ > k, so there is something to check here). Finally,
it is easy to check that for a depth-k action α, the expected utility of α is

∑(sk,id)∈Ωk Pr∞(E∞(sk, id)) u∞(sk, gα,sk) = ∑a vk(a, cα(a)),

giving us the desired result.
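The consistency condition that the Carathéodory step relies on — the cylinder-set assignment agreeing across depths and being finitely additive — can be checked in a toy model where depth-k trees are replaced by bit-strings of length k, so that |Tk| = 2^k and Prk is uniform (this simplification is ours, not the paper's actual space of k-olts):

```python
from itertools import product

def pr_k(s):
    """Uniform depth-k measure on bit-strings: Pr_k(s) = 1/|T^k| = 2^-len(s)."""
    return 1 / 2 ** len(s)

def cylinder_mass_at_depth(s, k2):
    """Total depth-k2 mass of the cylinder E(s) of strings extending s."""
    k = len(s)
    return sum(pr_k(s + ''.join(ext))
               for ext in product('01', repeat=k2 - k))

s = '010'
# Pr-compatibility: the depth-k2 mass of the cylinder over s equals Pr_k(s)
# for every k2 >= k, so the cylinder assignment is depth-independent.
for k2 in range(3, 8):
    assert abs(cylinder_mass_at_depth(s, k2) - pr_k(s)) < 1e-12

# Finite additivity: the cylinders over the two one-step extensions of s
# partition E(s), and their masses sum to Pr_k(s).
assert abs(pr_k(s + '0') + pr_k(s + '1') - pr_k(s)) < 1e-12
print("consistent")
```

This depth-independence is exactly what lets the pre-measure on the algebra of cylinder sets extend to a σ-additive measure Pr∞.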

4 Conclusion and Future Work


We have extended the results of [2] to allow for actions that are composed of sequences of steps, and
proved a representation theorem in this setting. More precisely, we have shown that when an agent's
language-based preferences satisfy a suitably formulated cancellation axiom, the agent acts as if
maximizing expected utility with respect to some background state space Ω, a probability measure and
utility function over Ω, and a selection function on Ω that serves to "disambiguate" the results of actions
described in the language. Allowing for (possibly unbounded) sequences of steps made the proof
significantly more complicated.
In [2], we also considered axioms regarding the preference order ⪰ that restrict properties of the
selection function in ways that are standard in the literature on counterfactual conditionals (e.g., being
centered, so that sel(ω, ϕ) = ω whenever ω |= ϕ). Although we have not checked the details yet, we
believe it will be straightforward to provide axioms that similarly restrict the selection function in our
setting, and to extend the representation theorem appropriately.

We also believe it is of interest in some contexts to consider more complex sequential actions, such
as "do ϕ until ψ". This opens the door to potentially non-terminating actions, which will of course add
further complexity to the analysis.
Finally, and perhaps most urgently, while the cancellation axiom is remarkably powerful, it is not
particularly intuitive. As shown in [3], more intuitive axioms can be derived from cancellation, such as
transitivity of the relation ⪰ or the classic principle of independence of irrelevant alternatives (see [6]).
In order to bring the technical results of this project more in line with everyday intuitions about
preference, it would be very beneficial to "factor" the cancellation axiom into weaker, but easier to
intuit, components. This is the subject of ongoing research.

Acknowledgments
Joe Halpern was supported in part by ARO grant W911NF-22-1-0061, MURI grant W911NF-19-1-0217,
AFOSR grant FA9550-12-1-0040, and a grant from the Algorand Centres of Excellence program man-
aged by Algorand Foundation. Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the author(s) and do not necessarily reflect the views of the funders.

References
[1] R. B. Ash (1970): Basic Probability Theory. Wiley, New York.
[2] A. Bjorndahl & J. Y. Halpern (2021): Language-based decisions. In: Theoretical Aspects of Rationality
and Knowledge: Proc. Eighteenth Conference (TARK 2021). The proceedings are published as Electronic
Proceedings in Theoretical Computer Science. doi:10.4204/EPTCS.335.5
[3] L. E. Blume, D. Easley & J. Y. Halpern (2021): Constructive decision theory. Journal of Economic Theory
196. An earlier version, entitled "Redoing the Foundations of Decision Theory", appears in Principles of
Knowledge Representation and Reasoning: Proc. Tenth International Conference (KR '06). doi:10.1016/j.jet.2021.105306
[4] J. Dubra, F. Maccheroni & E.A. Ok (2004): Expected utility theory without the completeness axiom. Journal
of Economic Theory 115, pp. 118–133. doi:10.1016/S0022-0531(03)00166-2
[5] D. H. Krantz, R. D. Luce, P. Suppes & A. Tversky (1971): Foundations of Measurement, Vol 1: Additive and
Polynomial Representations. Academic Press, New York.
[6] L. J. Savage (1954): Foundations of Statistics. Wiley, New York.
[7] D. Scott (1964): Measurement structures and linear inequalities. Journal of Mathematical Psychology 1, pp.
233–247. doi:10.1016/0022-2496(64)90002-1
[8] R. Selten (1975): Reexamination of the perfectness concept for equilibrium points in extensive games. Inter-
national Journal of Game Theory 4, pp. 25–55. doi:10.1007/BF01766400
[9] R. C. Stalnaker (1968): A theory of conditionals. In N. Rescher, editor: Studies in Logical Theory, Blackwell,
Oxford, U.K., pp. 98–112. doi:10.1007/978-94-009-9117-0_2
