A model of spatial reference frames in language
Thora Tenbrink and Werner Kuhn
University of Bremen | University of Münster, Germany
tenbrink@uni-bremen.de | kuhn@uni-muenster.de
Abstract. We provide a systematic model of spatial reference frames. The
model captures concepts underlying natural language expressions in English
that represent both external and internal as well as static and dynamic
relationships between entities. Our implementation in the functional language
Haskell generates valid English sentences from situations and reference frames.
Spatial reference frames are represented by the spatial roles of locatum,
relatum, and, optionally, vantage, together with a directional system. Locatum,
relatum, and vantage can be filled by entities taking on the discourse roles of
speaker, addressee, and participant (grammatically expressed by first, second,
and third person). Each of these roles may remain unspecified in a linguistic
description.
Keywords: reference frames, spatial relations, motion, conceptual modeling,
natural language.
1 Introduction
Spatial descriptions represent a major challenge for natural language interpretation as
well as conceptual models. The literature provides a vast range of approaches
focusing on diverse aspects and pursuing a variety of aims. For example, Retz-Schmidt (1988) presents a useful account of diverse reference frames, clarifies a
number of confusions and ambiguities, and provides an outline of a dialogue system
utilizing the introduced distinctions. Herrmann (1990) as well as Levinson (1996)
both suggest (fairly equivalent) schematic approaches capturing the most frequent
types of spatial reference frames in a more systematic way than hitherto available.
Both Levelt (1996) and Frank (1998) emphasize the role of perspective choice and
mental rotation in their accounts. Frank (1998) in particular proposes a formalization
with respect to the mental operations required to assign spatial regions to objects from
various perspectives. Talmy (2000) embeds a thorough discussion of the diversity of
conceptual reference frames within his comprehensive cognitive grammar theory,
capturing a much wider range of spatial (and other conceptually crucial) terms than
most other approaches. This list could be extended considerably. Altogether, this
intricate work has provided deep insights into humans' understanding of their spatial
surroundings, which as such has been shown to be at the center of human cognition in
many crucial respects (e.g., Miller & Johnson-Laird, 1976; Lakoff & Johnson, 1980).
The present paper adds to this body of literature by introducing a systematic
conceptual model that represents conceptual reference frames underlying English
language usage by employing simple spatial relationships between entities
consistently. The framework directly builds on Levinson's (1996) approach but
extends it in various respects. It represents absolute reference frames consistently with
intrinsic and relative frames, and it integrates the systematic difference language
makes between topologically internal and external relationships. Crucially, the
framework is capable of modeling dynamic spatial concepts in just the same way as
the static reference frames usually focused on in most accounts. The model achieves
this by separating roles (such as a vantage providing a perspective) from properties
and affordances (such as having an intrinsic orientation). Furthermore, it distinguishes
spatial and discourse roles and abstracts these from concrete linguistic expressions.
This allows for the integration of those crucial distinctions and conceptions that have
been identified in the earlier literature by various authors as just cited, and many
others, in multiple ways. We thus propose a uniform and simple model that is flexible
enough to account for a wide range of interrelated concepts.
2 Spatial reference frames
2.1 Basic framework
Levinson (1996) proposed a systematic framework that distinguishes between three
basic spatial reference frames called absolute, relative, and intrinsic. This
framework is now widely adopted by researchers for the interpretation of a particular
kind of spatial description, namely the kind that involves the high degree of conceptual
complexity that is represented by these frames. Our model captures these three basic
reference frames via a uniform set of obligatory spatial roles, namely locatum,
relatum, and directional system. Additionally, there is the optional role of vantage,
providing a perspective. Figure 1 introduces these basic roles in the form of symbols
that will be used throughout this paper to represent roles within a situation. The
conceptual reference frame that underlies a spatial description is defined by the
relations between these roles. It assigns concrete entities from the situational context
to the roles of locatum and relatum (and possibly vantage), as well as a set of
linguistic terms to the directional system. For the latter, we will here be concerned
with two options only, namely the so-called projective terms (front, back, right, left),
and compass terms (north, west, south, east). Both of these sets partition a spatial
plane using four different directions, which our symbols represent as an abstract
cross.
Introducing abstract roles supports the identification of (and differentiation
between) implicit and explicit conceptual participants of a linguistic description. This
enables a consistent identification of reference frames underlying spatial descriptions.
In the case of a fully specified (explicit) description this results in a direct association
between the description and one specific type of reference frame. However, natural
discourse often leaves conceptual participants implicit, resulting in an underspecified
spatial description. In this case, our model allows for the identification of the range of
reference frames that are compatible with a description.
The spatial roles of locatum, relatum, and vantage are filled by entities (which can
be objects, people, or places). These spatial role fillers can also fill discourse roles,
namely those of speaker, addressee, and participant. These three distinct discourse
roles correspond grammatically to first person (speaker: "I"), second person
(addressee: "you"), and third person (an entity other than speaker or addressee: "he,
she, it") (Herrmann, 1990). This three-fold distinction has systematic repercussions on
the linguistic expression of the underlying reference frames, as will be shown below.
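The separation of spatial roles from discourse roles can be sketched in Haskell as follows. This is an illustrative sketch only, not the implementation mentioned in the abstract; all type and function names are hypothetical. Each spatial role is wrapped in `Maybe` so that a role left implicit in a description can be represented.

```haskell
-- Hypothetical names throughout; a sketch under stated assumptions.

-- Discourse roles, corresponding to grammatical person.
data DiscourseRole = Speaker | Addressee | Participant
  deriving (Show, Eq)

-- An entity from the situational context together with its discourse role.
data Entity = Entity
  { label     :: String
  , discourse :: DiscourseRole
  } deriving (Show, Eq)

-- The two directional systems considered in this paper.
data DirectionalSystem = Projective | Compass
  deriving (Show, Eq)

-- A reference frame assigns entities to spatial roles.
-- Nothing marks a role that the description leaves unspecified
-- (for the vantage, it may also mean that no vantage is involved).
data Frame = Frame
  { locatum :: Maybe Entity
  , relatum :: Maybe Entity
  , vantage :: Maybe Entity
  , system  :: DirectionalSystem
  } deriving (Show, Eq)

-- Grammatical person expressed by a discourse role.
person :: DiscourseRole -> Int
person Speaker     = 1
person Addressee   = 2
person Participant = 3

-- Names of the spatial roles that are not filled in a frame.
implicitRoles :: Frame -> [String]
implicitRoles f =
  [ name
  | (name, Nothing) <- [ ("locatum", locatum f)
                       , ("relatum", relatum f)
                       , ("vantage", vantage f) ]
  ]
```

Under these assumptions, a description such as "There is a box in front of me" fills all three roles (the speaker is both relatum and vantage), whereas an underspecified description leaves one or more roles as `Nothing`.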
Fig. 1. Depictions for spatial roles as basic elements in our model. The circle represents the
relatum, the square the locatum, the cross the directional system, the arrow the perspective, and
the triangle another entity within a scene (filling the role of vantage, for example). The three
roles represented on the left are obligatory for intrinsic, relative, and absolute reference frames,
while the two on the right are optional.
Apart from the basic differentiation between absolute, relative, and intrinsic
reference frames, which will be discussed in Sections 2.2 through 2.4, further options
emerge. While the standard situation in Levinson's approach is that objects are
spatially separate (external reference frames, see below), they may also be located
inside of one another (internal reference frames, Section 2.5). Further linguistic and
conceptual options arise from various motion concepts (Section 2.6).
In the static case (without the involvement of motion), the role of locatum is filled
by the entity that is currently being described (which may itself, as a shorthand, be
referred to as locatum); and the role of relatum is filled by another entity in relation
to which the locatum is being described. In intrinsic and relative reference frames,
another entity may provide the basis for determining the perspective by filling the
(optional) spatial role of vantage. Alternatively, a perspective may be conveyed by
motion. In each case, the perspective provides a vector that determines the assignment
of projective terms (e.g., which side should be referred to as left) within the
directional system, independently of whether the (actual or potential) vantage of a
person is involved.
The entities filling the roles of locatum, relatum, and vantage need not be
individual objects or persons. Tenbrink & Moratz (2003), for instance, discuss the
role of a group of similar objects filling the role of relatum in a static external relative
reference frame. This specific case exemplifies the increasing complexity, which
would be multiplied if other roles and reference frames were affected in this way. We
therefore restrict our current discussion to individual entities. Furthermore, our model
assumes the simplest spatial extension possible for the spatial roles, namely point-like
except where some extension is needed. For example, in an internal reference frame
(Section 2.5), the relatum needs to be extended in order to contain the locatum. In
practice, of course, the entities filling the spatial roles are never point-like. This,
again, leads to further complexities in the assignment of reference frames, as shown,
for instance, by Herskovits (1986) and Eshuis (2003).
The exact position of the locatum relative to the relatum (for instance, whether an
object is conceived of as being directly or rather diagonally in front of another, or
how close it is) depends on a variety of factors including the (functional) relationships
between objects (e.g., Coventry & Garrod, 2004; Carlson-Radvansky et al., 1999),
their size (Talmy, 2000), and the situational context (Bateman et al., 2007). This is
true for all types of reference frames. In the following, representations will reflect
prototypical or "ideal" spatial relationships (Herskovits, 1986): front and north are
associated with 0° relative to the relatum, right and east with 90°, back and south with
180°, and left and west with 270°. In actual discourse, this is almost never precisely
true, but this association provides a suitable abstraction of the relevant qualitative
distinctions. More precise spatial distinctions have been modelled, for instance, by
Freksa (1992), Regier and Carlson (2001), Moratz and Tenbrink (2006), and Moratz
and Ragni (2008), focusing on different psychological, formal, or discourse-related
aspects. Concerning the distance of objects to each other, it can be observed that the
use of projective terms typically implies a direct (uninterrupted) adjacency
relationship between locatum and relatum (Talmy, 2000; Pribbenow, 1991). For
example, if object A is described as left of object B, there is no further object between
A and B. In contrast, this is not the case for spatial descriptions using compass
directions. Apart from this qualitative effect of proximity with projective terms, there
are no further constraints on spatial distance.
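The prototypical association of terms with angles can be written down directly. The following sketch uses hypothetical names and assumes angles in whole degrees, measured clockwise; the rule for snapping an arbitrary angle to the nearest term (boundaries at 45°, 135°, and so on, resolved clockwise) is our simplifying assumption and not a claim about actual usage.

```haskell
-- Hypothetical names; RightSide/LeftSide avoid a clash with Prelude's Right/Left.
data Projective = Front | RightSide | Back | LeftSide
  deriving (Show, Eq, Enum, Bounded)

data Compass = North | East | South | West
  deriving (Show, Eq, Enum, Bounded)

-- Prototypical angles in degrees, clockwise: front/north = 0, right/east = 90,
-- back/south = 180, left/west = 270.
projectiveAngle :: Projective -> Int
projectiveAngle term = 90 * fromEnum term

compassAngle :: Compass -> Int
compassAngle term = 90 * fromEnum term

-- Nearest of the four terms for an arbitrary angle (assumed snapping rule).
nearestTerm :: Enum a => Int -> a
nearestTerm deg = toEnum (((deg `mod` 360) + 45) `div` 90 `mod` 4)
```

For instance, `nearestTerm 100 :: Compass` yields `East`, and `nearestTerm (-10) :: Projective` yields `Front`.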
Having clarified the general properties of our framework, we will now introduce
the specific cases, starting with Levinson's three basic reference frames: intrinsic,
relative, and absolute. All of these represent static external situations.
Fig. 2. Basic reference frames, represented schematically. a: Intrinsic case. b: Relative case
(in which the perspective is provided by the vantage, depicted by a triangle). c: Absolute case.
2.2 Static external intrinsic reference frames
In the (static) intrinsic case, the relative position of the locatum with respect to the
relatum is described by referring to the relatum's intrinsic properties such as front or
back. Therefore, one can say:
(1) There is a box in front of me.
This example represents a case in which the relatum is the speaker and the locatum is
an entity other than speaker or addressee (a box). The perspective is supplied by the
speaker's front or view direction, i.e., the speaker provides the vantage. In Figure 2a,
this idea is represented by the arrow coinciding with the relatum, with the directional
system imposed on both. The front direction is thus provided by the speaker's view
direction, the right direction by the speaker's right, and so forth, yielding the order
front-right-back-left in clockwise direction. Any entity with the potential to provide a
direction may serve as relatum in an intrinsic reference frame, including objects with
functional parts (chairs or cars) and the like (Herrmann, 1990).¹ The other options for
filling the roles are as follows.
(2) There is a box in front of you.
(3) There is a box in front of the chair.
(4) I am in front of you.
(5) You are in front of me.
Together these examples illustrate the three distinct cases of relatum (first person: (1);
second person: (2); third person: (3)); as well as the three distinct cases of locatum
(first person: (4); second person: (5); third person: (1)). Since the relatum coincides
with the vantage, there are no additional options for filling this role within an intrinsic
reference frame.
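The intrinsic case can be illustrated by a small computation. The sketch below is hypothetical in all its names and assumes point-like entities on a plane, with the relatum's intrinsic front given as a heading in degrees clockwise from the top of the page. The projective term is then obtained from the bearing of the locatum relative to that heading, in the clockwise order front-right-back-left.

```haskell
type Point = (Double, Double)  -- (x, y); y grows towards the top of the page

data Projective = Front | RightSide | Back | LeftSide
  deriving (Show, Eq, Enum)

-- Map any angle in degrees into the range [0, 360).
normalize :: Double -> Double
normalize d = d - 360 * fromIntegral (floor (d / 360) :: Integer)

-- Bearing of q as seen from p, in degrees clockwise from "up".
bearing :: Point -> Point -> Double
bearing (px, py) (qx, qy) = normalize (atan2 (qx - px) (qy - py) * 180 / pi)

-- Index of the 90-degree sector centred on 0, 90, 180, or 270 degrees.
quadrant :: Double -> Int
quadrant d = floor ((normalize d + 45) / 90) `mod` 4

-- Intrinsic frame: the relatum's own heading supplies the perspective.
intrinsicTerm :: Double -> Point -> Point -> Projective
intrinsicTerm heading relatumPos locatumPos =
  toEnum (quadrant (bearing relatumPos locatumPos - heading))
```

With a speaker at the origin facing the top of the page, `intrinsicTerm 0 (0, 0) (0, 1)` gives `Front`, corresponding to example (1); if the speaker faces right instead (heading 90), the same box is on the speaker's left.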
2.3 Static external relative reference frames
Unlike the intrinsic case, the relative case is based on a different entity (other than the
relatum) providing a perspective. In (6) the relatum (the ball) does not possess an
intrinsic front. To interpret such an utterance, the underlying perspective needs to be
identified, based on the speaker's or the addressee's vantage, or on a different entity
that provides a basis for a view direction (see Figure 2b). The perspective allows for
the assignment of a directional system to the relatum, i.e., determines where the front,
left, back, and right sides of the relatum will be.
(6) There is a box to the right of the ball (from my vantage (point)²).
The other options for filling the roles of relatum, locatum, and vantage can be spelled
out as follows. While (6) shows first person as vantage, (7) exemplifies second
person, and (8) third person, respectively. (9) provides the case of first person as
locatum, (10) gives second person as locatum, and (6) third person as locatum. The
relatum is represented by the first person in (11), second person in (12), and third
person in (6), respectively.
¹ Not all objects provide all directions (front, back, right, and left), even if they are asymmetric,
such as pencils whose tip may provide a "front" to some speakers. Furthermore, Tyler &
Evans (2003) point out that even entities which have no inherent orientation at all can
sometimes be used for reference of this kind (i.e., without an external observer), as in Sarah
stood in front of the tree. Quite exceptionally, the front side of the locatum (rather than the
relatum) is used here to determine the direction. Tyler and Evans trace this phenomenon back
to what Clark (1973) called the "canonical encounter", i.e., a face-to-face interaction
transferred, in this case, to the tree. This seems likely since the effect only appears with the
front direction; the locatum's back, left, and right sides cannot be used in this way.
² In our model, "vantage" is the technical term for a particular spatial role. In natural language,
speakers would be more likely to refer to it as "vantage point" or "point of view", if they
chose to specify it at all, which is only rarely the case (Herrmann & Grabowski, 1994).
(7) There is a box to the right of the ball (from your vantage).
(8) There is a box to the right of the ball (from the chair's vantage).
(9) I am to the right of the ball (from your vantage).
(10) You are to the right of the ball (from my vantage).
(11) There is a box to my right (from your vantage).
(12) There is a box to your right (from my vantage).
As these examples demonstrate, not all conceivable ways of filling the roles are
equally likely to occur in natural discourse. Example (6) is natural, since the speaker
of this description uses their own vantage, which is a normal thing to do. Using the
addressee's vantage, as in (7), is also natural; which of these two options is chosen
depends on various discourse factors such as the relationship between the people
involved (Herrmann & Grabowski, 1994; Schober, 1993). In contrast, describing a
scene from another entity's vantage as in (8) may need a particular reason for doing
so. Moreover, it is untypical for a speaker to describe their own position from the
addressee's vantage as in (9), or vice versa as in (10), or to describe the location of an
object in relation to one's own body from the addressee's vantage as in (11), or vice
versa as in (12). However, discourse situations exist in which these kinds of
descriptions become relevant and might be used, since they belong to the general
repertory available to speakers. In the following, we will restrict our account to a
subset of possible cases out of the general system in which the roles of locatum,
relatum, and vantage can theoretically be filled by all three options of speaker,
addressee, or participant.
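To make the combinatorics of role fillers concrete, the following sketch generates sentences of the kind shown in (6) through (12) from the fillers of locatum, relatum, and vantage. It is a minimal illustration with hypothetical names, restricted to the right/left axis, and should not be read as the generation component of our implementation; it also makes no attempt to rank the resulting sentences by naturalness.

```haskell
-- Fillers of a spatial role, by discourse role (hypothetical names).
data Filler = Speaker | Addressee | Thing String
  deriving (Show, Eq)

data Side = RightSide | LeftSide
  deriving (Show, Eq)

sideWord :: Side -> String
sideWord RightSide = "right"
sideWord LeftSide  = "left"

-- The locatum determines the subject and verb form.
locatumClause :: Filler -> String
locatumClause Speaker   = "I am"
locatumClause Addressee = "You are"
locatumClause (Thing n) = "There is a " ++ n

-- The relatum is expressed inside the prepositional phrase.
relatumPhrase :: Side -> Filler -> String
relatumPhrase s Speaker   = "to my " ++ sideWord s
relatumPhrase s Addressee = "to your " ++ sideWord s
relatumPhrase s (Thing n) = "to the " ++ sideWord s ++ " of the " ++ n

-- The vantage is spelled out explicitly here, although speakers rarely do so.
vantagePhrase :: Filler -> String
vantagePhrase Speaker   = "(from my vantage)"
vantagePhrase Addressee = "(from your vantage)"
vantagePhrase (Thing n) = "(from the " ++ n ++ "'s vantage)"

-- Arguments: side, locatum, relatum, vantage.
describe :: Side -> Filler -> Filler -> Filler -> String
describe side loc rel van =
  unwords [locatumClause loc, relatumPhrase side rel, vantagePhrase van] ++ "."
```

For example, `describe RightSide Speaker (Thing "ball") Addressee` reproduces example (9), and `describe RightSide (Thing "box") Speaker Addressee` reproduces example (11).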
There is one further complication worth mentioning. With the front-back axis used
in example (13) below, relative reference frames are somewhat ambiguous. As Hill
(1982) demonstrates, two conceptual alternatives are conceivable (see Figure 3). In
English, the relation in front of usually expresses that the locatum is closer to the
vantage than the relatum, yielding the order front-left-back-right in clockwise
direction – notably, the inverse of the order reported above for intrinsic reference
frames. However, the opposite may also be the case. In other languages such as
Hausa, the opposite is the preferred interpretation (Hill, 1982). Then the locatum is
further away from the vantage than the relatum, and the same order (front-right-back-left) is maintained as with intrinsic reference frames. In the following we will assume
inverse ordering of directions for relative reference frames as a default, which is
generally accepted as the more typical interpretation in English, leaving the
alternatives implicit.
(13) There is a box in front of the ball (from my vantage).
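The two readings of the front-back axis can be contrasted in a short sketch. All names are hypothetical, entities are assumed to be point-like, and the perspective is taken to be the direction from the vantage to the relatum. Under the reading assumed as the English default, front and back are swapped relative to the intrinsic assignment while right and left stay in place, which yields the clockwise order front-left-back-right.

```haskell
type Point = (Double, Double)  -- (x, y); y grows towards the top of the page

data Projective = Front | RightSide | Back | LeftSide
  deriving (Show, Eq, Enum)

-- The two readings of "in front of" in a relative frame (our labels).
data Reading = CloserIsFront | FartherIsFront
  deriving (Show, Eq)

normalize :: Double -> Double
normalize d = d - 360 * fromIntegral (floor (d / 360) :: Integer)

-- Bearing of q as seen from p, in degrees clockwise from "up".
bearing :: Point -> Point -> Double
bearing (px, py) (qx, qy) = normalize (atan2 (qx - px) (qy - py) * 180 / pi)

quadrant :: Double -> Int
quadrant d = floor ((normalize d + 45) / 90) `mod` 4

-- Arguments: reading, vantage position, relatum position, locatum position.
relativeTerm :: Reading -> Point -> Point -> Point -> Projective
relativeTerm reading vantagePos relatumPos locatumPos =
  case reading of
    FartherIsFront -> aligned
    CloserIsFront  -> flipFrontBack aligned
  where
    view    = bearing vantagePos relatumPos  -- perspective vector
    aligned = toEnum (quadrant (bearing relatumPos locatumPos - view))
    flipFrontBack Front = Back
    flipFrontBack Back  = Front
    flipFrontBack side  = side
```

With the vantage at `(0, -2)`, the ball at `(0, 0)`, and the box between them at `(0, -1)`, the `CloserIsFront` reading yields `Front`, as in example (13), whereas `FartherIsFront` yields `Back` for the same configuration.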
Fig. 3. Two possible interpretations of the front-back axis in a relative reference frame: With
in front of, the locatum (box) may be (a) closer to or (b) more distant from the vantage than the
relatum.
2.4 Static external absolute reference frames
In the absolute case, ubiquitous orientation systems provide a culturally shared basis
for determining the directional system (Levinson, 1996). These include compass
directions (north, east, south, west, established in clockwise order as shown on maps)
as well as, in other languages, environmental features (uphill, downhill, upriver,
downriver, which may be less stably established). For example, if the north direction
is towards the top of the page, the following is consistent with the depiction in Figure
2c:
(14) There is a box east of the ball.
Since absolute reference frames presuppose a directional system that is already
present within the discourse context via its anchoring in the culture, no further
perspective is needed to establish an assignment of directions.
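Because no perspective is involved, the absolute case reduces to a function of the two positions alone. The following sketch (hypothetical names, point-like entities, north assumed to be towards the top of the page) makes this explicit: unlike in the intrinsic and relative cases, there is no heading or vantage argument.

```haskell
type Point = (Double, Double)  -- (x, y); north is towards increasing y

data Compass = North | East | South | West
  deriving (Show, Eq, Enum)

-- Compass term for the locatum relative to the relatum.
-- Arguments: relatum position, locatum position.
compassTerm :: Point -> Point -> Compass
compassTerm (rx, ry) (lx, ly) = toEnum (floor ((deg + 45) / 90) `mod` 4)
  where
    raw = atan2 (lx - rx) (ly - ry) * 180 / pi  -- clockwise from north
    deg = if raw < 0 then raw + 360 else raw
```

For a ball at the origin and a box at `(1, 0)`, `compassTerm (0, 0) (1, 0)` gives `East`, in line with example (14).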
2.5 Internal relationships
Levinson's framework is geared toward (and typically applied to) external
relationships, i.e., relations between objects that are spatially separate, as in the
examples given so far. Does it equally account for cases in which the locatum is
positioned inside of the relatum, yielding an internal relationship? Language
sometimes distinguishes between these two topological cases grammatically (Miller
& Johnson-Laird, 1976; Talmy, 2000), as seen from the distinction between the
external example (15) and the internal relationship expressed in (16):
(15) The box is in front of the car.
(16) The box is in the front of the car.
In internal relationships, the relatum is conceptually divided into parts that are
described by projective terms, sometimes explicitly so by referring to sides (such as
"on the left/right side", Carroll, 1993:30). As with external relationships, the
directional system underlying such a description can be assigned in different ways. In
(16), represented by Figure 4a, the directional system is based on the relatum's
intrinsic parts (or perspective); this yields a clear internal intrinsic case in which the
relatum encompasses the locatum. Again, as with external intrinsic reference frames,
directions are assigned as front-right-back-left in clockwise order.
Internal relative cases are based on an observer's vantage. For instance, if the
relatum room in example (17) has no intrinsic parts of its own (e.g., a room with
several doors), a perspective may be derived from the speaker looking into the room,
imposing a directional system on the room. If Figure 4b is taken to represent example
(17), the relatum corresponds to the room and the locatum to the box. The exact
position of the vantage is not reflected linguistically in internal relative reference
frames; it may be located inside or outside the relatum (or at the borderline, standing
in the door, for example). Since intrinsic sides are typically ascribed to objects by the
way humans interact with them (Herrmann, 1990), internal relative reference frames
may sometimes not be distinguishable from internal intrinsic ones.
(17) The box is in the back of the room.
The interpretation in terms of a relative internal reference frame entails the
assignment of regions in the same way as with external relative frames, namely front-left-back-right in clockwise order (where front corresponds to the region closest to the
vantage). Furthermore, regions may also be partitioned into internal (relative) sections
by adopting a global perspective (Carroll, 1993). The observed region can be a
specific assembly of objects that are perceived as belonging together or being relevant
for the discourse situation (Gorniak & Roy, 2004), or any other kind of region that is
within the limits of perception. For example, in German it is possible to say:
(18) Dort hinten steht eine Kiste. [lit., "There in the back stands a box."]
Here, the visual field is partitioned into regions in relation to the position of the
speaker. Then, the area close to the observer is referred to as vorne (front), and the
area more distant from the speaker within the visual field is referred to as hinten
(back) (see Tenbrink, 2007, for discussion of syntactical patterns). Paralleling
example (17), example (18) also corresponds to the situation in Figure 4b if the circle
(relatum) represents the speaker's visual field, the square represents the box, and the
view direction (the arrow) is derived from the speaker.
Finally, the internal absolute case is straightforward, as it employs a ubiquitous
directional system, both within and outside of any relatum. In example (19), the town
is the locatum (represented by the square in Figure 4c) and the country is the relatum
(represented as the big circle).
(19) The town is in the east of the country.
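The grammatical contrast between external and internal relationships can be captured by treating topology as a separate parameter of the description. The sketch below uses hypothetical names and covers only the three terms needed for examples (15), (16), (17), and (19); the external form "behind" is our own addition for completeness and does not occur in the examples above.

```haskell
-- Topological relationship between locatum and relatum (hypothetical names).
data Topology = External | Internal
  deriving (Show, Eq)

-- A small subset of directional terms, sufficient for the examples.
data Term = Front | Back | East
  deriving (Show, Eq)

-- English phrase for a term, depending on the topological relationship.
phrase :: Topology -> Term -> String
phrase External Front = "in front of"
phrase External Back  = "behind"
phrase External East  = "east of"
phrase Internal Front = "in the front of"
phrase Internal Back  = "in the back of"
phrase Internal East  = "in the east of"

-- Arguments: topology, term, locatum noun, relatum noun.
locate :: Topology -> Term -> String -> String -> String
locate topo term loc rel =
  "The " ++ loc ++ " is " ++ phrase topo term ++ " the " ++ rel ++ "."
```

Here `locate External Front "box" "car"` and `locate Internal Front "box" "car"` reproduce the minimal pair in (15) and (16).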
Fig. 4. Internal relationships: the relatum (represented by the big circle) is large enough to
contain the smaller locatum (the square). a: Intrinsic case. b: Relative case, with the entity
providing a perspective (i.e., vantage) positioned either inside or outside of the relatum. c:
Absolute case.
2.6 Motion
So far, the discussion has focused on static relationships between objects, which have
been described as conceptually primary (Svorou, 1994:22). When the entities in
question are in motion, several distinct effects emerge. Motion can be expressed by a
range of spatial terms, some of which are semantically dynamic, while others
resemble static expressions (Miller & Johnson-Laird, 1976); for instance, projective
terms may be used dynamically just as well as statically (Retz-Schmidt, 1988:102).
Motion can provide an independent perspective (Svorou, 1994; Fillmore, 1997), and
motion descriptions can reflect the same three types of reference frames as static
descriptions (Levinson, 2003:96f.). However, depending on which object (or role in
the present framework) is affected by the motion event there may be quite different
effects. For example, an object may undergo change with respect to its own former
position or extension (Brugman & Lakoff, 1988). To our knowledge, these
observations have not been integrated comprehensively in any framework, nor have
the effects of motion on reference frames been explored in much detail. We propose
that the introduction of motion allows for the roles of locatum and relatum to be filled
by different entities at different times, resulting in a system of reference frame options
that is far more complex than the static situation reveals, yet utilizes the same
underlying conceptual patterns. The following account first addresses motion as
perspective; then the various effects of motion on the roles of relatum and locatum
will be spelled out.
2.6.1 Motion (or Sequence) as Perspective
Directed movement may in some cases provide a perspective for intrinsic and relative
reference frames. Then the directional system is imposed on the relatum not on the
basis of perception (a view direction), but on the basis of the direction of movement.
In such cases the roles of relatum and locatum can be specified in the same manner as
with static situations, since the movement does not affect their relative position (see
also Talmy, 2000). Example (20) represents an (external) intrinsic case that is
schematically illustrated by Figure 5a below. Here the relatum (ball) and the locatum
(mouse) remain in a stable spatial relationship to each other, without requiring an
additional vantage, as the described movement provides a basis for the directional
system, i.e., the interpretation of "in front of".
(20) The mouse is running in front of a ball rolling down the hill.
(21) The wheel is rolling towards the box placed to the right of the ball.
Example (21) represents a relative case, shown in Figure 5b below. It involves two
spatial concepts: a movement (of the wheel) towards the box, and the location of the
box (locatum) to the right of the ball (relatum). The movement description of the first
spatial concept provides the basis for assigning a directional system for the second
spatial concept. In other words, the direction of movement within the scene fills the
role of perspective in the description of a static spatial relationship. Another
possibility for a relative reference frame is that the direction of movement
encompasses both relatum and locatum (Figure 5c)³. In example (22) both the relatum
(ball) and the locatum (box) are floating at the same speed, and therefore remain in a
stable relationship to each other. As before, the direction of movement within the
scene fills the role of perspective in the description of a static spatial relationship.
Similarly, concepts of sequence (with or without movement) may provide a
(functional) direction; example (23) appears to be valid no matter how Peter and Mary
are currently oriented, and thus conceptually equivalent to example (22).
(22) The box is floating in the river, in front of the ball.
(23) Peter is in front of Mary in the queue.
Fig. 5. Movement inducing a perspective for intrinsic and relative reference frames. The
direction of movement is indicated by a thin arrow. a: Intrinsic case; the locatum is in front of
the relatum, which is currently moving and therefore capable of providing a perspective. b:
Relative case with external movement; the right side of the relatum is assigned by the
"perspective" of another entity in motion. c: Relative case with surrounding movement
(or sequence).
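Motion as perspective can be sketched by deriving the heading from two successive positions of the moving relatum rather than from a view direction. The code below is illustrative only: the names are hypothetical, entities are point-like, and only the intrinsic case of example (20) is covered. The relative cases would substitute the movement direction for the vantage-based view direction in an analogous way.

```haskell
type Point = (Double, Double)  -- (x, y); y grows towards the top of the page

data Projective = Front | RightSide | Back | LeftSide
  deriving (Show, Eq, Enum)

normalize :: Double -> Double
normalize d = d - 360 * fromIntegral (floor (d / 360) :: Integer)

-- Bearing of q as seen from p, in degrees clockwise from "up".
bearing :: Point -> Point -> Double
bearing (px, py) (qx, qy) = normalize (atan2 (qx - px) (qy - py) * 180 / pi)

quadrant :: Double -> Int
quadrant d = floor ((normalize d + 45) / 90) `mod` 4

-- The relatum's direction of movement supplies the perspective.
-- Arguments: relatum's previous position, its current position, locatum position.
motionTerm :: Point -> Point -> Point -> Projective
motionTerm previousPos currentPos locatumPos =
  toEnum (quadrant (bearing currentPos locatumPos - bearing previousPos currentPos))
```

If the ball has moved from `(0, 0)` to `(1, 0)` and the mouse is at `(2, 0)`, `motionTerm (0, 0) (1, 0) (2, 0)` gives `Front`, regardless of any intrinsic properties of the ball.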
2.6.2 Motion from anywhere to locatum: All reference frames
Spatial terms sometimes refer to the destination point (or region) of a motion
trajectory, as in the following examples:
(24) The box should be placed in front of me.
(25) Put the box to the right of the ball.
(26) Put the box to the east of the ball.
(27) Place the box in the front of the car.
(28) Place the box in the back of the room.
(29) Place the box in the east area of the town.
³ This is one way of interpreting this specific movement case. See Tenbrink (2011) for a
different interpretation within a slightly modified model.
Similar to example (21), all of these descriptions involve two spatial concepts. Here,
the first concept a) concerns a movement of the box, starting from an unknown
position, and the second b) concerns the definition of the future position of the box
relative to a relatum. In such cases, the entity in focus (the box) no longer continually
represents the locatum; the reference frame underlying spatial concept b) only holds
at time t1 after completing the movement trajectory of a), but not at time t0 before or
while the motion occurs. At time t1, reference frames are established that are
equivalent to the static reference frames described above; the difference is due to the
nature of the verb (dynamic rather than static). All three kinds of basic reference
frames can be used in this way, both externally and internally. After completing the
movement, example (24) can be interpreted in terms of a dynamic external intrinsic
reference frame, with the new location of the box representing the role of locatum as
defined by its relation to the relatum (the speaker). (25) depends on an external
perspective (which the context will provide), yielding a dynamic external relative
reference frame. The dynamic case of an absolute reference frame is shown in (26).
(27) and (28) are examples of intrinsic and relative dynamic internal reference
frames, and (29) gives the dynamic internal absolute case. All of these cases are
straightforwardly represented by the schemata depicted in Figures 2 and 4, showing in
this case the end position of the movement at time t1. The start position of the moving
object and the trajectory of movement are irrelevant in each of these cases, since the
relatum and perspective (if any) are defined independently of the motion event, and
the locatum is defined only by the end point of the trajectory.
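The temporal restriction on dynamic descriptions can be stated compactly. In the following sketch (hypothetical names; only the two time points t0 and t1 are distinguished), a static description assigns the locatum role at any time, whereas a dynamic one assigns it only at t1, once the movement is complete.

```haskell
-- Time points: before or during the motion (T0) and after it (T1).
data Time = T0 | T1
  deriving (Show, Eq)

data VerbKind = Static | Dynamic
  deriving (Show, Eq)

-- A description reduced to the parts relevant here (hypothetical record).
data Description = Description
  { verb     :: VerbKind
  , located  :: String  -- entity in focus, e.g. "box"
  , relation :: String  -- e.g. "to the right of"
  , relatum  :: String  -- e.g. "ball"
  } deriving (Show, Eq)

-- Does the reference frame of the description hold at the given time?
frameHoldsAt :: VerbKind -> Time -> Bool
frameHoldsAt Static  _ = True
frameHoldsAt Dynamic t = t == T1

-- The entity filling the locatum role at a given time, if any.
locatumAt :: Description -> Time -> Maybe String
locatumAt d t
  | frameHoldsAt (verb d) t = Just (located d)
  | otherwise               = Nothing
```

For example (25), represented as `Description Dynamic "box" "to the right of" "ball"`, `locatumAt` returns `Nothing` at `T0` and `Just "box"` at `T1`.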
2.6.3 Motion from vantage to locatum in a dynamic relative reference frame
Example (30) is similar to the examples just discussed in that, again, two spatial
concepts are involved: a) a movement (by the speaker to the box), and b) the
definition of the position of the box as being to the right of the ball in a relative
reference frame (as in example (25)). However, in this example, the speaker is also a
likely vantage⁴ for the perspective used in b), given at time t0, prior to the motion
event described in a). Then the motion event a) starts from the vantage position at
time t0. The two other objects remain unaffected by the motion in a) and can thus
straightforwardly (and without considerations of time) be described as relatum (ball)
and locatum (box). This situation is represented in Figure 6, which shows how the
entity providing a perspective at time t0 moves towards the position of the locatum.
(30) I will go to the box to the right of the ball.
Fig. 6. Dynamic relative reference frame: Movement from vantage to locatum.
⁴ Alternatively, the direction of movement itself may provide the perspective as in example (22)
above.
Now consider the following, describing basically the same situation except that the
locatum is a place (the Aristotelian notion of a location with the potential to be
occupied by an object) rather than an object:
(31) I'm going to a place to the right of the ball.
(32) I'm going to the right of the ball.
Again, the perspective can only be defined from an external position, for example the
speaker's position at time t0, prior to movement, yielding a dynamic relative reference
frame. The end point of the trajectory – the place to the right of the ball – at time t1
corresponds to the role of locatum, as in example (25) above. In example (31) this
place is linguistically represented explicitly, but the implicit case in (32) appears to be
pragmatically equivalent and perhaps more natural. Again, the trajectory of the entity
that provides the perspective leads from the position of vantage to that of the locatum
as depicted in Figure 6. Note that the moving entity may change its orientation during
the movement without changing the definition of the goal location (locatum); the
perspective relies on the position of the oriented entity at the time of the description
(t0).
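The role assignment in examples (30) to (32) can be sketched with the two functions fromTo and rotate given in the Appendix. The raster layout, the assumption that the speaker faces the ball at t0, and the helper name seenFrom are illustrative choices, not part of the model itself.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

-- Direction from one raster cell to another (as in the Appendix).
fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum (fst p2 - fst p1), signum (snd p2 - snd p1))

-- Direction d1 as seen from a perspective d2 (as in the Appendix).
rotate :: Direction -> Direction -> Direction
rotate d1 d2 = ( snd d2 * fst d1 - fst d2 * snd d1
               , fst d2 * fst d1 + snd d2 * snd d1 )

-- Hypothetical layout at time t0.
speakerPos, ballPos, boxPos :: Position
speakerPos = (2, 0)  -- vantage, and start of the motion
ballPos    = (2, 3)  -- relatum
boxPos     = (3, 3)  -- locatum, and goal of the motion

-- Assumption: at t0 the speaker faces the ball.
headingAtT0 :: Direction
headingAtT0 = fromTo speakerPos ballPos

-- Hypothetical helper: direction from relatum to locatum under a heading.
seenFrom :: Direction -> Position -> Position -> Direction
seenFrom heading relatum locatum = rotate (fromTo relatum locatum) heading

atT0, afterTurning :: Direction
atT0         = seenFrom headingAtT0 ballPos boxPos  -- (1,0), i.e. "right"
afterTurning = seenFrom (1, 0) ballPos boxPos       -- (0,1), no longer "right"
```

The second value illustrates why the perspective is fixed at t0: if the heading adopted during the movement were used instead, the same goal would no longer be described as being to the right of the ball.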
2.6.4 Motion from relatum and vantage to locatum in an external intrinsic
reference frame
So far, all examples contained an explicit relatum, rendering the underlying spatial
relationship unambiguous (except for underspecification of perspective). However,
neither in static nor in dynamic spatial descriptions does this have to be the case. In
the examples of static relationships described in Sections 2.2 and 2.3, the relatum
could unproblematically remain implicit as in example (33) below, without changing
the intended reference frame. But how can the dynamic examples (34) and (35) be
interpreted in the present model of reference frames?
(33) There is a box on the right.
(34) I'm going to the right.
(35) I'm going right.
Conceivably, the spatial relation underlying a description like (34) is the same as in
example (32), using the dynamic version of a relative reference frame, and omitting
the relatum (ball). A more likely explanation, however, may be that no additional
relatum is intended at all, and the utterance merely expresses a case of self-movement
towards the right, equivalent to example (35), which can only be interpreted
in this sense. This can be modelled as the dynamic version of an external intrinsic
reference frame: the relatum is reflexive (cf. Brugman & Lakoff, 1988) and
corresponds to the vantage, i.e., the speaker's position at time t0, providing the
direction of movement.
This idea is best illustrated by starting with the front direction, as shown in
Figure 7 (a and b). The schema in Figure 7a shows the static intrinsic case; the
locatum (square) is described with respect to the relatum (circle) which also provides
the perspective (big arrow). A corresponding description is example (1) above,
repeated here for convenience:
(36) There is a box in front of me.
This is directly mirrored by (37) and – if the end position of the movement is not
defined by an object but simply a place – also by (38) (schematically depicted by
Figure 7b). Again, these two utterances involve two spatial descriptions each: the goal
of the speaker's movement is specified by a noun ("the box" in (37); "a position" in
(38)), and the location of these goals is then defined by a static spatial description ("in
front of me"). However, essentially the same spatial situation as in (38) can in English
be addressed in a shorter form, namely by (39) using an expression that is
semantically dynamic (also called directional, cf. Winterboer et al., in press), leaving
the end point of the trajectory implicit. Figure 7c shows the situation for a movement
towards the right with respect to the start position of the mover, as in examples (34)
and (35) above.
(37) I'm going to the box in front of me.
(38) I'm going to a position in front of me.
(39) I'm going forward.
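The reflexive case, in which the mover's own start position serves as both relatum and vantage, can be sketched as follows. The helper toRaster (an inverse of the Appendix function rotate for axis-aligned unit headings), the function goalPlace, and the choice of a goal exactly one cell away are assumptions for illustration; the expressions in (39), (34), and (35) leave the end point implicit.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

-- As in the Appendix.
fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum (fst p2 - fst p1), signum (snd p2 - snd p1))

rotate :: Direction -> Direction -> Direction
rotate d1 d2 = ( snd d2 * fst d1 - fst d2 * snd d1
               , fst d2 * fst d1 + snd d2 * snd d1 )

-- Hypothetical inverse of rotate: turns a direction given relative to a
-- heading (front = (0,1), right = (1,0)) back into raster coordinates.
-- Only meant for the four axis-aligned unit headings.
toRaster :: Direction -> Direction -> Direction
toRaster (lx, ly) (hx, hy) = (hy * lx + hx * ly, hy * ly - hx * lx)

-- Hypothetical goal place: one cell from the start position (relatum and
-- vantage at t0) in the named direction.
goalPlace :: Position -> Direction -> Direction -> Position
goalPlace (x, y) heading local = (x + dx, y + dy)
  where (dx, dy) = toRaster local heading

forward, right :: Direction
forward = (0, 1)
right   = (1, 0)
```

For a speaker at (2,2) heading (1,0), goalPlace (2,2) (1,0) forward gives (3,2) and goalPlace (2,2) (1,0) right gives (2,1). Applying rotate to the direction from start to goal recovers forward and right respectively, so the goal can then be treated as a locatum in the usual way.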
Movement from the vantage and relatum as described so far may or may not involve a
re-orientation of the moving entity. Example (40), in contrast, gives an explicit
description of a re-orientation; this is expressed by the verb turn. Here, the situation is
reversed in that the re-orientation may or may not also imply a movement to a new
position. If uttered in a route context, it usually expresses re-orientation combined
with a continued movement straight on, yielding a trajectory resembling a quarter of a
circle.
(40) I'm turning (to the) right.
Fig. 7. Intrinsic case: Movement from start position (vantage and relatum) to end position
(locatum). a: Static intrinsic reference frame (for comparison). b: Forward movement. c:
Movement to the right of the view direction at the start of the movement.
2.6.5 Motion from relatum (not vantage) to locatum: Dynamic relative and
absolute reference frames
Another kind of dynamic relative reference frame (distinct from the kind described in
Section 2.6.3 above) emerges if directionals are used to describe the movement of
objects relative to their own previous position as described from an external vantage.
In example (41), both the vantage and the relatum are unspecified and need to be
derived from the context (see Jörding & Wachsmuth, 2002, for an inspiring study
exploiting this underdeterminacy). The context may provide possible interpretations
for a relatum similar to examples (25) and (27) above. However, the object's original
position may also serve this role; then the object is moved to the right of its own
position at time t0. As for perspective, it is perhaps most likely that the speaker is
using their own vantage (which remains unchanged through the time of the
movement), which then yields a situation as depicted in Figure 8a. Other vantages are
equally possible. The end position of the movement at time t1 then again corresponds
to the role of locatum.
(41) Move the box to the right.
If a non-oriented entity is moved in a forward direction as in example (42), the moved
object (the box) might move from its own position at time t0 (the relatum) to a
position (the locatum at time t1) forward (or: in front) of the relatum, using a
perspective provided by a different entity (possibly the speaker in example (42)), as
shown in Figure 8b.
(42) Move the box forward.
Fig. 8. Dynamic relative reference frames. a. Movement of an object from the relatum (start
point) to the locatum (end point). b. The object is moved forward with respect to its own earlier
position, using an external vantage determining the directional system.
Alternatively, the perspective (which in this case determines the direction of
movement) may be provided by an externally defined type of sequence or movement,
as in Figure 5c, example (22) above. Example (43) illustrates that, in this case, no
further entity (such as the speaker) is required for interpretation of the direction of
forward movement. The box moves to a new position that is further in the front of the
ordered sequence or conveyor belt than its previous position (cf. Figure 9).
(43) The box is moved forward in the ordered sequence / on the conveyor belt.
Fig. 9. Relative reference frame providing a direction of movement. An entity is moved from
the position of the relatum (its own earlier position, which the new position is related to) to that
of the locatum, based on the encompassing perspective given by external movement or
sequence.
However, as with the lateral axis, other interpretations are available as well, filling
the lexically unspecified roles of relatum and perspective in different ways. Imagine,
for instance, a situation in which objects are arranged in order to be photographed.
Then an instruction to move the box forward could be interpreted to mean moving the
object towards the area in front of the camera, with the camera filling the roles of
vantage and relatum, yielding a dynamic intrinsic reference frame similar to example
(24) above. The end position of the movement then again becomes the locatum at
time t1.
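The underdeterminacy of examples (41) and (43) can be made concrete with a small sketch that takes the object's own position at t0 as relatum and searches its neighbouring cells for an end position (locatum at t1) lying in the requested direction under a given perspective. The functions neighbours and moveTo, the one-cell step, and all coordinates are hypothetical; fromTo and rotate are the Appendix functions.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

-- As in the Appendix.
fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum (fst p2 - fst p1), signum (snd p2 - snd p1))

rotate :: Direction -> Direction -> Direction
rotate d1 d2 = ( snd d2 * fst d1 - fst d2 * snd d1
               , fst d2 * fst d1 + snd d2 * snd d1 )

-- The four cells adjacent to a position.
neighbours :: Position -> [Position]
neighbours (x, y) = [(x, y + 1), (x + 1, y), (x, y - 1), (x - 1, y)]

-- Hypothetical: end positions one cell from the start (the relatum at t0)
-- that lie in direction `local` as seen from `perspective`.
moveTo :: Direction -> Direction -> Position -> [Position]
moveTo local perspective start =
  [ p | p <- neighbours start, rotate (fromTo start p) perspective == local ]

boxAtT0 :: Position
boxAtT0 = (3, 3)

-- "Move the box to the right" under two different vantages.
fromSpeaker, fromAddressee :: [Position]
fromSpeaker   = moveTo (1, 0) (0, 1)  boxAtT0  -- speaker heading (0,1)
fromAddressee = moveTo (1, 0) (0, -1) boxAtT0  -- addressee facing the speaker

-- "forward" with the belt's motion direction (1,0) as perspective.
onBelt :: [Position]
onBelt = moveTo (0, 1) (1, 0) boxAtT0
```

Here fromSpeaker is [(4,3)] while fromAddressee is [(2,3)], so the same instruction yields opposite end positions depending on whose vantage is assumed. In the conveyor belt reading, onBelt is [(4,3)], and no speaker or addressee enters the computation.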
2.7 Summary of spatial reference frames
Spatial reference frames have been distinguished in the present framework along the
following lines:
• intrinsic, relative, or absolute concepts
• external or internal relationships between entities
• static or dynamic situations
• For dynamic situations:
  o Movement direction as perspective
  o Movement from anywhere to locatum
  o Movement from vantage to locatum
  o Movement from relatum to locatum
The distinctions can be combined almost without restriction. Further complexities arise
from the choice of axis (frontal vs. lateral) as well as perspective (speaker, addressee, or
other) and type of relatum (an object or person, a group of objects, etc.). Each of these
kinds of variability deserves attention in its own right, as reflected in the vast amount
of research literature in this area (see Tenbrink, 2007 for an overview). For instance,
if the relatum consists of several objects (such as a group of same-class objects), this
may have several repercussions on the language used (cf. Tenbrink & Moratz, 2003).
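As a rough aid to reading the summary, the listed distinctions can be written down as Haskell types and enumerated. All type and constructor names below are hypothetical and do not appear in the simulation code. Since the combinations are described only as almost unrestricted, the enumeration is an upper bound rather than an inventory of attested frames.

```haskell
-- Hypothetical encoding of the distinctions listed above.
data FrameType = Intrinsic | Relative | Absolute
  deriving (Show, Eq, Enum, Bounded)

data Relationship = External | Internal
  deriving (Show, Eq, Enum, Bounded)

data Motion
  = DirectionAsPerspective
  | AnywhereToLocatum
  | VantageToLocatum
  | RelatumToLocatum
  deriving (Show, Eq, Enum, Bounded)

data Dynamics = Static | Dynamic Motion
  deriving (Show, Eq)

-- The static case plus the four dynamic variants.
dynamics :: [Dynamics]
dynamics = Static : map Dynamic [minBound .. maxBound]

-- Every formally possible combination of the three distinctions.
candidates :: [(FrameType, Relationship, Dynamics)]
candidates =
  [ (f, r, d) | f <- [minBound .. maxBound]
              , r <- [minBound .. maxBound]
              , d <- dynamics ]
```

With this encoding, length candidates is 30 (3 × 2 × 5), before the choice of axis, perspective, and type of relatum is taken into account.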
3 Implementation
The goal of our implementation of the model is a simulation that generates valid
sentences in English (and ultimately other languages) from spatial situations and
discourse roles, using appropriate types of reference frames from the available set of
options. Alternatively, one might look for implementations that generate possible
reference frames from given situations and linguistic descriptions, or that generate
possible situations from linguistic descriptions along with reference frames. A
suitable formalization tool for our current purposes is the functional language Haskell
(see www.haskell.org), which has been used successfully for simulations of other
phenomena, such as transportation (Kuhn 2007) and observation (Kuhn 2009).
Haskell can capture the role-based nature of our model particularly well, as it allows
for distinguishing between types of entities and the roles they fill. The lack of this
distinction in existing models of reference frames motivated the work presented here.
In order to test the completeness and adequacy of the role-based model, we have as
a first step implemented the simulation, before producing a more refined ontology.
The simulation consists of a small set of rules to produce English sentences from
situations described as role assignments, with associated discourse roles. Situations
are records of locatum, relatum, and optionally vantage. Discourse roles are
assignments of entities to the roles of speaker, addressee, and participant. Each role
slot is filled by one or more entities, which are described as records with noun,
position, footprint, heading, and motion direction. The geometric properties are
represented internally in simple raster coordinates local to situations.
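The records described in this paragraph might look roughly as follows. The field names noun, position, footprint, heading, and motion follow the accessors used in the Appendix excerpts; the DiscourseRoles record, the use of Maybe for unspecified roles, the single filler per slot, and all example values are assumptions made for this sketch.

```haskell
type Position  = (Int, Int)
type Footprint = [Position]
type Direction = (Int, Int)

-- An entity with its geometric properties in local raster coordinates.
data Entity = Entity
  { noun      :: String
  , position  :: Position
  , footprint :: Footprint
  , heading   :: Direction  -- assumption: (0,0) for non-oriented entities
  , motion    :: Direction  -- (0,0) when the entity is not moving
  } deriving (Show, Eq)

type Locatum = Entity
type Relatum = Entity
type Vantage = Entity

data Situation = Situation Locatum Relatum (Maybe Vantage)
  deriving (Show, Eq)

-- Hypothetical record of discourse roles; Nothing marks an unspecified role.
data DiscourseRoles = DiscourseRoles
  { speaker     :: Maybe Entity
  , addressee   :: Maybe Entity
  , participant :: Maybe Entity
  } deriving (Show, Eq)

-- Example (36), "There is a box in front of me", with invented coordinates.
me, box :: Entity
me  = Entity "me"  (2, 0) [(2, 0)] (0, 1) (0, 0)
box = Entity "box" (2, 1) [(2, 1)] (0, 0) (0, 0)

example36 :: Situation
example36 = Situation box me Nothing

roles36 :: DiscourseRoles
roles36 = DiscourseRoles (Just me) Nothing (Just box)
```

Keeping Situation and DiscourseRoles apart mirrors the distinction between entities and the roles they fill: the same entity me appears as relatum in the situation and as speaker in the discourse.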
Somewhat surprisingly, the only analytical procedure required is a simple (one line
of code) function to determine the direction from relatum to locatum as seen from the
vantage of a situation. The only other interesting rule is the one to determine the
preposition (such as “in front of”) from this relative direction, using the frame of
reference type. All sentences can be generated as a field (“There is a box in front
of…”) or object (“The box is in front of…”) representation.
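A possible reading of the one-line direction function, together with the two sentence representations, is sketched below. The argument order of direction, its use of bare positions instead of a situation record, and the names fieldSentence and objectSentence are assumptions; only fromTo and rotate are taken from the Appendix.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

-- As in the Appendix.
fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum (fst p2 - fst p1), signum (snd p2 - snd p1))

rotate :: Direction -> Direction -> Direction
rotate d1 d2 = ( snd d2 * fst d1 - fst d2 * snd d1
               , fst d2 * fst d1 + snd d2 * snd d1 )

-- Hypothetical one-liner: direction from relatum to locatum as seen from a
-- perspective (the heading or motion of the relatum or vantage).
direction :: Position -> Position -> Direction -> Direction
direction relatum locatum perspective = rotate (fromTo relatum locatum) perspective

-- Field and object representations; the preposition is expected to carry
-- its own trailing space, as the strings in the Appendix do.
fieldSentence, objectSentence :: String -> String -> String -> String
fieldSentence  locatum prep relatum =
  "There is a " ++ locatum ++ " " ++ prep ++ relatum ++ "."
objectSentence locatum prep relatum =
  "The " ++ locatum ++ " is " ++ prep ++ relatum ++ "."
```

For a speaker at (2,0) heading (0,1) and a box at (2,1), direction (2,0) (2,1) (0,1) evaluates to (0,1), the frontal direction. With the preposition "in front of ", the two functions produce "There is a box in front of me." and "The box is in front of me.".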
Our implementation reproduces sentences 1 to 24 of the examples in this paper, a
few of them with minor grammatical variations (such as “to the right of me” rather
than “to my right”). The spatial referencing for the locatum of all remaining dynamic
situations (25 to 43) is also correctly reproduced, though no effort has been made to
capture their dynamic verb phrases (involving put, place, move, go, turn, etc.), as
these are independent of spatial referencing. The main point about these examples is
that a movement direction can supply a perspective, that spatial roles can be defined
at certain times (prior to or after movement), and that they can be filled by abstract
places rather than objects or people. Excerpts from the simulation code are given in
the Appendix. The current version of the complete code can be inspected and
downloaded from http://musil.uni-muenster.de/resources.
4 Conclusions
In this paper we have extended widely used accounts of spatial reference frames by
integrating dynamic cases and some further fundamental distinctions made in
language. By using abstract roles that are filled by entities in a discourse context, our
model consistently captures a wider range of spatial descriptions than has been
proposed in earlier approaches. Moreover, we have proposed an implementation in
the form of a simulation generating our example sentences.
Various applications of this framework are conceivable. Natural language
generation systems can profit from our approach just as well as computational
implementations of spatial descriptions. Moreover, a range of controversies in the
literature on this complex topic may be reconciled by realizing the diversity of spatial
concepts (static and dynamic, non-projective and projective, etc.) that may potentially
support temporal descriptions as outlined in Tenbrink (2011). This is true for the
well-researched English language, which is the basis for the current framework, but also
for other cultures and languages, which have only partly been explored so far with
respect to their spatiotemporal conceptualizations.
For future work, we therefore target an extension of the simulation to other
languages, but also to more than four directions, and to three-dimensional as well as
temporal situations. The roles will be generalized to allow for multiple fillers, such as
several relata or addressees. The simulation will also be lifted to an ontology of
spatial referencing, tied into an upper level ontology like DOLCE and/or GUM. Apart
from the theoretical insights to be gained from this, it will provide a backbone to
models of spatial referencing in areas like robotics, indoor navigation, or
choreography, where resorting to geodetic reference systems is often impractical or
insufficient.
Rather than representing an account of spatial referencing per se, this framework is
intended as a basis for further exploration. One major purpose is to facilitate further
discussion by providing a comprehensible toolbox for research within the domain of
space, based on a more flexible and integrative representation of spatial relationships
than has been available before. This toolbox may be employed and further explored
also for those cases that are not currently directly represented by the available models.
It supports systematic explorations concerning the extent to which particular spatial
models are transferred in a language to the temporal domain (cf. Tenbrink, 2011),
highlighting universal as well as idiosyncratic principles in cross-linguistic research.
As research progresses and further cognitively relevant distinctions are revealed, these
can be incrementally incorporated using the proposed roles and relations as basic
ingredients. Finally, beyond the description of general principles of conceptualization,
the framework can be used as a tool for analysis of discourse expressing concepts of
space and, furthermore, of time, contrasting speakers' pragmatic choices in actual
language usage with the generally available repertory of a language.
Acknowledgements
Funding by the DFG to the first author, project I5-[DiaSpace], SFB/TR 8 Spatial
Cognition, and to the second author, speaker of the IRTG Semantic Integration of
Geospatial Information, is gratefully acknowledged. Joana Hois has provided
invaluable advice in the development of the conceptual framework. Comments from
four anonymous reviewers and many colleagues in the SFB/TR 8 and IRTG helped us
improve the model and its presentation.
Appendix
Without further explanation of Haskell syntax (which, at this level, is largely
self-explanatory), we present illustrative excerpts from our simulation code. They
constitute more than half of the entire code (not counting the example data). First, we
list the main declarations:
Positions are cells and directions are vectors in a simple raster:
type Position = (Int, Int)
type Footprint = [Position]
type Direction = (Int, Int)
Directional systems are ordered lists of direction names:
projective = ["front", "right", "back", "left"]
inverse = ["back", "right", "front", "left"]
compass = ["north", "east", "south", "west"]
Reference frames have a type and an associated directional system:
data Frame = Intrinsic DirectionalSystem
           | Relative DirectionalSystem
           | Absolute DirectionalSystem
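The excerpt does not show the declaration of DirectionalSystem. One reading consistent with the three lists above is a plain list of direction names, as sketched here; the type alias, the accessor systemOf, and the pairing of frame types with particular systems in the examples are assumptions.

```haskell
-- Assumed declaration: a directional system is an ordered list of names.
type DirectionalSystem = [String]

projective, inverse, compass :: DirectionalSystem
projective = ["front", "right", "back", "left"]
inverse    = ["back", "right", "front", "left"]
compass    = ["north", "east", "south", "west"]

data Frame = Intrinsic DirectionalSystem
           | Relative DirectionalSystem
           | Absolute DirectionalSystem
           deriving (Show, Eq)

-- Hypothetical accessor for the directional system of a frame.
systemOf :: Frame -> DirectionalSystem
systemOf (Intrinsic s) = s
systemOf (Relative s)  = s
systemOf (Absolute s)  = s

-- Example pairings (assumed, not prescribed by the declaration itself).
exampleFrames :: [Frame]
exampleFrames = [Intrinsic projective, Relative inverse, Absolute compass]
```

The inverse system differs from the projective one only in swapping front and back, while the lateral names keep their positions.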
Spatial situations assign spatial roles to entities:
data Situation = Situation Locatum Relatum (Maybe Vantage)
type Locatum = Entity
type Relatum = Entity
type Vantage = Entity
Secondly, we show the small set of computations:
Directions (as unit vectors) can be computed from Positions:
fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum(fst p2-fst p1), signum(snd p2-snd p1))
A direction seen from another is obtained by a vector rotation:
rotate :: Direction -> Direction -> Direction
rotate d1 d2 = (snd d2 * fst d1 - fst d2 * snd d1,
                fst d2 * fst d1 + snd d2 * snd d1)
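A short worked example may help in reading these two functions. It assumes, in line with the order of the projective system, that a result of (0,1) stands for front and (1,0) for right; the names headings and northSeenFrom and the sample positions are illustrative.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

fromTo :: Position -> Position -> Direction
fromTo p1 p2 = (signum (fst p2 - fst p1), signum (snd p2 - snd p1))

rotate :: Direction -> Direction -> Direction
rotate d1 d2 = ( snd d2 * fst d1 - fst d2 * snd d1
               , fst d2 * fst d1 + snd d2 * snd d1 )

-- The four axis-aligned headings: towards +y, +x, -y, -x.
headings :: [Direction]
headings = [(0, 1), (1, 0), (0, -1), (-1, 0)]

-- The raster direction (0,1) as seen from each of the four headings.
northSeenFrom :: [Direction]
northSeenFrom = map (rotate (0, 1)) headings

-- fromTo keeps only the sign of each coordinate difference.
axisAligned, diagonal :: Direction
axisAligned = fromTo (1, 1) (4, 1)
diagonal    = fromTo (1, 1) (3, 5)
```

northSeenFrom evaluates to [(0,1), (-1,0), (0,-1), (1,0)], that is front, left, back, and right in turn. axisAligned is (1,0), whereas diagonal is (1,1), so positions that are not aligned on a raster axis yield a vector outside the four named directions.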
A situation is internal if the locatum is contained in the relatum:
internal (Situation locatum relatum vantage) =
  (position locatum) `elem` (footprint relatum)
The perspective defines the direction of the first element of the directional system. It
is taken from the heading or motion of the relatum or vantage:
perspective (Situation locatum relatum Nothing) =
  if (motion relatum) == (0,0) then heading relatum
  else motion relatum
perspective (Situation locatum relatum (Just vantage)) =
  if (motion vantage) == (0,0) then heading vantage
  else motion vantage
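The following sketch exercises internal and perspective on invented data. The Entity record follows the field names used by the excerpts; the helper headingOrMotion, which merely factors out the shared conditional, and all example entities are assumptions.

```haskell
type Position  = (Int, Int)
type Direction = (Int, Int)

data Entity = Entity
  { noun      :: String
  , position  :: Position
  , footprint :: [Position]
  , heading   :: Direction
  , motion    :: Direction
  }

data Situation = Situation Entity Entity (Maybe Entity)

internal :: Situation -> Bool
internal (Situation locatum relatum _) =
  position locatum `elem` footprint relatum

-- Hypothetical helper: motion, if any, takes precedence over heading.
headingOrMotion :: Entity -> Direction
headingOrMotion e = if motion e == (0, 0) then heading e else motion e

perspective :: Situation -> Direction
perspective (Situation _ relatum Nothing)   = headingOrMotion relatum
perspective (Situation _ _ (Just vantage))  = headingOrMotion vantage

-- Invented entities: a room covering a 4x4 area, a box inside it,
-- a non-oriented ball rolling towards +x, and a stationary speaker.
room, box, ball, me :: Entity
room = Entity "room" (0, 0) [(x, y) | x <- [0 .. 3], y <- [0 .. 3]] (0, 1) (0, 0)
box  = Entity "box"  (1, 2) [(1, 2)] (0, 0) (0, 0)
ball = Entity "ball" (5, 5) [(5, 5)] (0, 0) (1, 0)
me   = Entity "me"   (5, 0) [(5, 0)] (0, 1) (0, 0)
```

Here internal (Situation box room Nothing) is True, since the box lies within the room's footprint. For the rolling ball as relatum without a vantage, perspective returns its motion direction (1,0); once the speaker is supplied as vantage, the speaker's heading (0,1) is used instead.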
Finally, the preposition of a sentence is computed from a situation and reference
frame as follows:
preposition situation frame =
  if internal situation
    then case frame of
      (Absolute directionalSystem) -> "in the " ++
        directionalSystem!!(quadrant (direction situation)) ++ " of "
      (Intrinsic directionalSystem) -> case fst (direction situation) of
        -- frontal axis
        0 -> "in the " ++
          directionalSystem!!(quadrant (direction situation)) ++ " of "
        -- lateral axis: fst is 1 (right) or -1 (left)
        _ -> "on the " ++
          directionalSystem!!(quadrant (direction situation)) ++ " side of "
      (Relative directionalSystem) -> case fst (direction situation) of
        0 -> "in the " ++
          directionalSystem!!(quadrant (direction situation)) ++ " of "
        _ -> "on the " ++
          directionalSystem!!(quadrant (direction situation)) ++ " side of "
    else case frame of
      (Absolute directionalSystem) ->
        directionalSystem!!(quadrant (direction situation)) ++ " of "
      (Intrinsic directionalSystem) ->
        case (direction situation) of
          (0,1)  -> "in " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
          (1,0)  -> "to the " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
          -- "behind" already names the direction, so no lookup is needed
          (0,-1) -> "behind "
          (-1,0) -> "to the " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
      (Relative directionalSystem) ->
        case (direction situation) of
          (0,1)  -> "behind "
          (1,0)  -> "to the " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
          (0,-1) -> "in " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
          (-1,0) -> "to the " ++
            directionalSystem!!(quadrant (direction situation)) ++ " of "
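The function quadrant used above is not part of the excerpts. A minimal sketch of what such an index function might look like is given below, assuming that the four axis-aligned directions map to the positions 0 to 3 of a directional system in the order front, right, back, left. Unlike the function used in the excerpt, this version returns a Maybe so that diagonal directions are rejected rather than causing an error; directionName is likewise hypothetical.

```haskell
type Direction = (Int, Int)

projective, inverse, compass :: [String]
projective = ["front", "right", "back", "left"]
inverse    = ["back", "right", "front", "left"]
compass    = ["north", "east", "south", "west"]

-- Hypothetical index of an axis-aligned direction within a directional
-- system; Nothing for any other vector.
quadrant :: Direction -> Maybe Int
quadrant d = lookup d (zip [(0, 1), (1, 0), (0, -1), (-1, 0)] [0 ..])

-- Hypothetical lookup of the direction name in a four-element system.
directionName :: [String] -> Direction -> Maybe String
directionName system d = (system !!) <$> quadrant d
```

For instance, directionName projective (1,0) is Just "right", directionName inverse (0,1) is Just "back", directionName compass (-1,0) is Just "west", and directionName projective (1,1) is Nothing.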
References
Bateman, J., Hois, J., Ross, R.J., Tenbrink, T. 2010. A Linguistic Ontology of Space for
Natural Language Processing. Artificial Intelligence 174: 1027–1071.
Bateman, J., Tenbrink, T., Farrar, S., 2007. The Role of Conceptual and Linguistic Ontologies
in Discourse. Discourse Processes, 44(3), 175–213.
Brugman, C., Lakoff, G., 1988. Cognitive topology and lexical networks. In Small, S.L.,
Cottrell, G.W., Tanenhaus, M.K. (Eds.), Lexical Ambiguity Resolution, pp. 477-508. San
Mateo, CA: Morgan Kaufmann.
Carlson-Radvansky, L. A., Covey, E. S., Lattanzi, K.M., 1999. "What" effects on "where":
Functional influences on spatial relations. Psychological Science, 10, 516-521.
Carroll, M., 1997. Changing place in English and German: language-specific preferences in the
conceptualization of spatial relations. In Nuyts, J., Pederson, E. (eds.), Language and
Conceptualization. Cambridge University Press, pp. 137-161.
Clark, H.H. 1973. Space, time, semantics, and the child. In: Moore, Timothy E. (ed), Cognitive
Development and the Acquisition of Language, pp. 27-63. N.Y. Academic Press.
Coventry, K.R., Garrod, S.C., 2004. Saying, seeing and acting: The psychological semantics of
spatial prepositions. Hove and New York: Psychology Press.
Eshuis, R. 2003. Memory for Locations Relative to Objects: Axes and the Categorization of
Regions. In van der Zee, E., Slack, J. (eds.), 2003. Representing Direction in Language and
Space, pp. 226-254. Oxford: Oxford University Press.
Fillmore, C. J., 1997. Lectures on Deixis. Bloomington: Indiana.
Frank, A.U., 1998. Formal models for cognition – Taxonomy of spatial location description and
frames of reference. In Freksa, C., Habel, C., Wender, K.F. (eds.), Spatial Cognition, pp.
293-312. Berlin: Springer.
Freksa, C., 1992. Using Orientation Information for Qualitative Spatial Reasoning. In Frank,
A. U., Campari, I., Formentini, U. (eds.), Theories and Methods of Spatio-Temporal
Reasoning in Geographic Space, pp. 162-178. Berlin: Springer.
Gorniak, P., Roy, D., 2004. Grounded Semantic Composition for Visual Scenes. Journal of
Artificial Intelligence Research 21: 429-470.
Habel, C., Eschenbach, C., 1997. Abstract Structures in Spatial Cognition. In Freksa, C.,
Jantzen, M., Valk, R. (eds), Foundations of Computer Science - Potential - Theory –
Cognition, pp. 369-378. Berlin: Springer.
Halliday, M. A.K., Matthiessen, C. M.I.M., 1999. Construing experience: A language-based
approach to cognition. London, New York: Continuum.
Herrmann, T., 1990. Vor, hinter, rechts und links: das 6H-Modell. Psychologische Studien zum
sprachlichen Lokalisieren. Zeitschrift für Literaturwissenschaft und Linguistik 78. 117-140.
Herrmann, T., Grabowski, J. 1994. Sprechen. Psychologie der Sprachproduktion. Heidelberg:
Spektrum.
Herskovits, A., 1986. Language and spatial cognition. Cambridge: Cambridge University
Press.
Hill, C., 1982. Up/down, front/back, left/right. A contrastive study of Hausa and English. In
Weissenborn, J., Klein, W. (eds.), Here and There. Cross-linguistic Studies on Deixis and
Demonstration, pp. 13-42. Amsterdam: Benjamins.
Jörding, T., Wachsmuth, I., 2002. An Anthropomorphic Agent for the Use of Spatial Language.
In Coventry, K.R, Olivier, P. (eds.), Spatial Language: Cognitive and Computational
Aspects. Dordrecht: Kluwer, pp. 69-85.
Kuhn, W., 2007. An Image-Schematic Account of Spatial Categories. Spatial Information
Theory, 8th International Conference, COSIT 2007. Melbourne, Australia: Springer Lecture
Notes in Computer Science 4736: 152-168.
Kuhn, W., 2009. A Functional Ontology of Observation and Measurement. In K. Janowicz, M.
Raubal, and S. Levashkin (Eds.): GeoSpatial Semantics · Third International Conference
(GeoS 2009), Mexico City, 3-4 December 2009. Springer Lecture Notes in Computer
Science 5892: 26–43.
Lakoff, G., Johnson, M., 1980. Metaphors we live by. Chicago: University of Chicago Press.
Langacker, R.W., 1999. Grammar and Conceptualization. Berlin: Mouton de Gruyter.
Levelt, Willem J.M., 1996. Perspective Taking and Ellipsis in Spatial Descriptions. In Bloom,
P., Peterson, M.A., Nadel, L., Garrett, M.F. (eds.), Language and Space, pp. 77-107.
Cambridge, MA: MIT Press.
Levinson, S. C., 1996. Frames of reference and Molyneux's question: Crosslinguistic evidence.
In Bloom, P., Peterson, M.A., Nadel, L., Garrett, M.F. (eds.), Language and Space.
Cambridge, MA: MIT Press, pp. 109-169.
Levinson, S. C., 2003. Space in Language and Cognition. Cambridge University Press.
Miller, G.A., Johnson-Laird, Philip N., 1976. Language and Perception. Cambridge:
Cambridge University Press.
Moratz, R., Ragni, M., 2008. Qualitative spatial reasoning about relative point position. Journal
of Visual Languages and Computing 19: 75-98.
Moratz, R., Tenbrink, T., 2006. Spatial reference in linguistic human-robot interaction:
Iterative, empirically supported development of a model of projective relations. Spatial
Cognition and Computation 6:1, 63-106.
Pederson, E., 2003. How many reference frames? In: Freksa, C., Brauer, W., Habel, C.,
Wender, K.F. (Eds.), Spatial Cognition III: Routes and Navigation, Human Memory and
Learning, Spatial Representation and Spatial Learning. Berlin: Springer, pp. 287-304.
Pribbenow, S. 1991. Zur Verarbeitung von Lokalisierungsausdrücken in einem hybriden
System. Dissertation, Fachbereich Informatik der Universität Hamburg.
Regier, T., Carlson, L., 2001. Grounding spatial language in perception: An empirical and
computational investigation. Journal of Experimental Psychology: General, 130(2), 273-98.
Retz-Schmidt, G., 1988. Various views on spatial prepositions. AI Magazine 9: 2. 95-105.
Schober, M. F. 1993. Spatial Perspective-Taking in Conversation. Cognition 47: 1-24.
Svorou, S., 1994. The Grammar of Space. Amsterdam: Benjamins.
Talmy, L., 2000. Toward a Cognitive Semantics. Cambridge, MA: MIT Press.
Tenbrink, T., 2007. Space, Time, and the Use of Language. Berlin: Mouton de Gruyter.
Tenbrink, T., 2011. Reference frames of space and time in language. Journal of Pragmatics
43:3, 704-722.
Tenbrink, T., Moratz, R., 2003. Group-based Spatial Reference in Linguistic Human-Robot
Interaction. Proceedings of EuroCogSci 2003: The European Cognitive Science Conference,
September 10-13, Osnabrück, Germany, pp 325-330.
Tyler, A., Evans, V., 2003. The Semantics of English Prepositions: Spatial Scenes, Embodied
Meaning, and Cognition. Cambridge: Cambridge University Press.
Winterboer, A., Tenbrink, T., Moratz, R., in press. Spatial Directionals for Robot Navigation.
In Dimitrova-Vulchanova, M., van der Zee, E. (eds.), Motion Encoding in Spatial Language.
Oxford: Oxford University Press.