Deixis and the Interactional Foundations of Reference

Jack Sidnell

Chapter 11: Deixis and the Interactional Foundations of Reference

Jack Sidnell and N. J. Enfield

11.1 Introduction

All reference involves directing the attention of some other person to something. The something to which attention is directed may or may not be present in the immediate context of interaction. Whether the referent is a hilltop in plain view, a bird’s singing, Gottlob Frege, sorrow, the ideas of Augustine, or the concept of liberty, making reference requires bringing the recipient’s attention in line with that of the speaker. If human cognition is fundamentally intentional in the sense of being about or directed towards something, reference is a form of shared intentionality in which the cognitive focus of two or more persons is aligned and jointly focused. In deictic reference, this directing of attention is accomplished by relating an object of reference to some aspect of the event of speaking, the indexical origo (Bühler 1982 [1934]), via a ground. So for instance when I point to a book and say ‘this one’ in response to the question ‘Which are you reading?’, my recipient’s attention is directed to the book by relating it to my location, and specifying the relation as one of relative proximity (or immediacy of access; see Fillmore 1982; Hanks 1990, 2005).

In this chapter we develop an account of deixis that builds from its simplest manifestation in acts of gaze-following. For humans, gaze-following results from a basic propensity to attend to the attention of others. Because co-present others are able to control their own gaze and other visible signs of attention they can actively manipulate another’s attention such that what was a cue becomes a signal (see Krebs and Dawkins 1984). Pointing and all other forms of deixis (indeed all forms of reference) exploit this propensity by actively directing others’ attention.
Of special importance to our account is so-called lip-pointing in which a meta-communicative facial expression (conveyed by a configuration of lip and/or head; Sherzer 1972, Enfield 2009: ch. 3) indicates that a participant’s gaze direction is, at that moment, to be understood as an intentional, communicative signal.

With shared intentionality as a foundation, all languages have developed systems of deictic markers: for example, demonstratives such as English that and this. These systems display a defining semiotic property of human communication, namely the use of signs that not only have meanings in themselves, but whose meanings are enriched through relations of opposition and contrast with other elements of the system, such that each element has a composite meaning, a combination of what it is and what it is not. Simple systems in the domain of deixis feature a semantically marked form in opposition to an unmarked one. More complex systems involve multiple dimensions of contrast. A further way in which the meanings of elements of a deictic system may be enriched is through their mapping onto the local socioculturally constituted worlds of their users. Speakers use deictic forms to refer to locally relevant features of the environment and deictic systems are interwoven with the sociocultural world in complex and sometimes counter-intuitive ways.

An overarching question to be addressed is, ‘What’s special about deixis as a form of reference? How does it differ if at all from reference accomplished by non-deictic means, and what consequence does this difference have for its function or use in actual situations of social interaction?’ In order to address this question, we begin by developing an account of deixis that is rooted in basic, instinctive human propensities for (a) intentional, goal-directed behaviour and (b) the capacity for two or more individuals to share attention.
Together, these human capacities provide a basis for the collective or shared intentionality that underwrites all forms of reference, including reference accomplished via the use of deixis. We then briefly sketch the semantic domain, describing the essential elements of deictic reference and some of the documented typological variation. Much of the literature in this area focuses on just these issues and so here we do little more than provide a thumbnail sketch and point to relevant landmarks. We then consider demonstrative reference in which the recipient’s attention is directed either by talk, gesture, or gaze to some enumerable thing. Here we show that deixis is a low-cost, high-efficiency, minimally characterizing way to accomplish reference. These features surely account for many of its uses in interaction. But we suggest that referrers select deixis for reference for reasons other than efficiency. First, the semantically general character of deictic forms makes them well-suited for reference to hard-to-describe and/or nameless objects. In such a situation a deictic form can exploit features of the artifactual environment including the presence of the thing being referred to. Second, the semantically non-specific, minimally characterizing features of deixis allow speakers to avoid description where such description may be counterproductive to some interactional goal. Third, because these forms require for their interpretation the application of knowledge in common ground (shared knowledge), successful reference via such a form can be a demonstration of social proximity, an informational enactment of intimacy (see Enfield 2006).

11.2 Directing Attention in Deixis

At about nine months of age human infants begin to engage in a suite of joint attentional behaviours such as gaze-following and joint engagement with objects.
These behaviours differ markedly from those of younger infants, which are primarily dyadic. At about this age, ‘infants for the first time begin to “tune in” to the attention and behavior of adults toward outside entities …’ (Tomasello 1999: 62). We can think of gaze-following schematically as in Figure 11.1. In following the gaze of another, a human infant is attending to that other’s attentional state. Essentially the infant is treating the other’s gaze direction as a sign and their own gaze redirection is an interpretant of that sign (see Kockelman 2005). Importantly, however, gaze-following of this kind occurs at least partially independently of whether the other intended their own gaze to function as a communicative signal. The initial gaze redirection then may function to prompt an infant’s gaze redirection either as a signal or a cue.

Figure 11.1 Basic structure of gaze-following (elements: Infant, Parent, Object)

As Tomasello notes, it is at around this same age (nine months) that infants also begin to direct adult attention to things using deictic gestures such as pointing, or by holding up an object to show it to someone. So at least ontogenetically there seems to be a correlation between the emergence of gaze-following and the emergence of deictic pointing and showing. There is also a clear conceptual connection between gaze-following and deictic pointing. Milgram and colleagues (1969) showed, somewhat inadvertently, that gaze-following in adults was sensitive to the character of the stimulus. The study showed that larger crowds of gazing individuals were more likely to promote gaze-following than smaller crowds. The basic, apparently instinctive, propensity of humans to follow the gaze of others is then available for manipulation: altering aspects of the stimulus/sign will make gaze-following by others more likely. This, then, allows us to see the connection between gaze-following and ‘true’ pointing, one version of which, often referred to as ‘lip-pointing’, is done with gaze; indeed, it is essentially ‘gaze-pointing’. Figure 11.2 is a still image from video taken by Niclas Burenhult of a Jahai speaker in Malaysia lip-pointing. As Enfield (2001: 186) writes, in relation to a study of lip-pointing among speakers of Lao, the term ‘lip-pointing’ ‘should not be taken to suggest that only the lips are involved…. Additional actions of chin-raise/head-lift, gaze direction, and eyebrow raise are usually involved.’ Key for our purposes is the fact that the vector of pointing is defined by gaze while the ‘lips’ actually serve a meta-communicative purpose, signalling that the gaze is being used as a point. Enfield (2001: 185) thus writes, ‘the “vector” of lip-pointing is in fact defined by gaze, and the lip-pointing action itself (like other kinds of “pointing” involving the head area) is a “gaze-switch”, i.e. it indicates that the speaker is now pointing out something with his or her gaze.’ The example of lip-pointing thus illustrates the way that humans can accomplish intentional reference (i.e. non-natural meaning in Grice’s 1957 sense) through small manipulations of naturally meaningful behaviours (gaze direction) which exploit the human propensity to follow another’s gaze. The introduction of a meta-communicative overlay (chin/lip/head) on gaze direction transforms a cue into something another can recognize as a true intentional signal: ‘He’s referring to that thing/person/area over there.’

Figure 11.2 Still image from video of a Jahai speaker lip-pointing, provided by Niclas Burenhult

Table 11.1 Current and projected focus of attention in deixis

Gaze-following    Lip-pointing    Finger-pointing
CFA only          CFA=PFA         CFA=PFA or CFA≠PFA

We are now in a position to describe one distinctive feature of finger-pointing relative to the other forms of primitive deixis so far described.
Specifically, in finger-pointing it is possible to separate the speaker’s current focus of attention from the focus that they are proposing for a recipient. We can describe the basic elements and their combinatorial possibilities by means of the following table and figures. In Table 11.1 ‘current focus of attention’ is annotated CFA and ‘proposed focus of attention’ is annotated PFA. This can be easily seen in the frame-grabs in Figure 11.3, taken from a video-recorded interaction among speakers of Bequia creole. In the first frame Viv (in the foreground) is telling Baga (in the background) about a man she thinks he might know but whose name he does not recognize. When the description given allows Baga to identify the person Viv is talking about, he points up the hill to his right (Figure 11.3b). Notice that when Baga initially points, his own gaze is directed to the place he is indicating with his finger (i.e. CFA = PFA). In Figure 11.3c, he maintains the pointing gesture but now gazes toward Viv apparently to check whether his reference has been successful—checking, that is, on his recipient’s focus of attention (i.e. CFA≠PFA). He finds Viv pointing to the same place and the two engage in a moment of mutual gaze. Here then we can see, in the visible behaviour of the participants, how reference involves joint attention such that two persons are not only publicly projecting their attention to the same referent, but where they are, in addition, mutually aware of the current alignment, and thus sharedness, of their two lines of attention. So we can see why this possibility of separating the speaker’s/gesturer’s directing signal from the speaker’s gaze is important since a joint attentional frame crucially involves the speaker monitoring the recipient’s attention to some third object (Carpenter et al. 1998; Tomasello 1999; Liszkowski et al. 2004; Tomasello et al. 2005). 
This monitoring of the recipient transforms common attention to a THIRD into true joint attention, a basic form of shared intentionality (see also Gilbert 1989; Searle 2010). It is relevant to note here that ‘lip-pointing’ (which crucially involves gaze, as noted above) seems specialized among Lao speakers in two ways: first, ‘lip-pointing is apparently restricted to cases when the addressee is looking at the speaker’ (Enfield 2001: 192) and second, ‘to acts of direct ostension in which the location or identity of a referent in the physical environment is in focus’ (Enfield 2001: 196, emphasis added). The prior establishment of recipiency along with the already ‘in focus’ character of the referent, it can be supposed, obviates or at least alleviates the need to monitor the recipient.

Finger-pointing (Kita 2003) would also seem to allow for a higher informational load than do the forms of ‘lip/gaze-pointing’ we have considered. Thus, researchers have noted various functional contrasts here in little versus big points (Enfield et al. 2007), those that are accompanied by gaze versus those without (Streeck 1993), as well as informational possibilities associated with different hand and finger configurations (Wilkins 2003; Kendon and Versante 2003). Finger-pointing also makes ‘path descriptions’ possible, as well as illustrative combinations. At the far end of the informational scale are diagrammatic representations in which pointing gestures are used to identify positions within a virtual drawing (see Enfield 2005).

Figure 11.3 (a)–(d) Finger-pointing, Bequia, St. Vincent (see Sidnell 2005) (Still image from video recording).

We can see many of the basic features of deictic reference in another form of behaviour among infants which Kidwell and Zimmerman (2007) as well as Tomasello (1999) and Clark (2003) describe as ‘showing’.
In a typical showing sequence, a young child will approach another (typically an adult) with an outstretched arm and an object in hand (see Figure 11.4). The other might produce a response which identifies the object (‘Watermelon’), expresses a social-relational feature of the object (‘Your shoe’), or appreciates it in some way (‘Oh wow, a pretty hat’). The showing child then withdraws the object from view and/or moves out of the recipient’s line of vision, either returning to the activity she was engaged in before the showing or initiating some new activity. Such ‘showings’ are arguably one of the most basic forms which exhibit the triadic, joint attentional interaction configuration that constitutes the very foundation of reference in all its various forms (see Tomasello 1999, 2003). Clark (2003) has explicated the parallels and the key difference: in pointing, the other’s attention is made to move toward the current location of a thing, while in showing, a thing is moved into the current line of the other’s attention; either way, the other’s attention ends up directed towards the thing. In the current context, showings can be understood as an early form of demonstrative, or better, ‘presentational’ deixis akin to adult uses of French ‘voilà’ or English ‘look at this’.

Figure 11.4 Human infant showing object to camera person

Instructional activities build upon the human propensity for attending to the attention of another, and showings play an important role in their organization. Rembrandt’s Anatomy Lesson of Dr Nicolaes Tulp provides a stunning illustration (Figure 11.5). Here Tulp is presenting a part of the cadaver for the consideration of his students, some of whom look attentively at that which is being shown.
Through such presentations or showings, novices are socialized into new ways of ‘seeing’ the world around them, ways of seeing that are appropriate to some particular status or role (see Goodwin 1994; Kockelman 2007). In showings, then, we see not only the roots of reference in human action but moreover the interactional foundations of human teaching, learning, the transmission of knowledge across generations, and thus, ultimately, of culture (see Tomasello 1999).

Figure 11.5 The Anatomy Lesson of Dr Nicolaes Tulp, Rembrandt Harmenszoon van Rijn (1632)

11.3 Demonstrative Systems

In order to achieve joint attention on something, that thing must be somehow picked out from the range of possible things that one might be attending to. Often there are many possible things a person might be looking at or pointing to, and there are various ways to solve the problem of figuring out just which thing is the focus of attention. In the joint-attentional behaviours described in the previous section, details of body comportment such as eye gaze, pointing, and showing constitute relatively straightforward ways to narrow another’s attention on something to the exclusion of other possible referents in a context.¹ But when the deictic function is supplied purely by the selection of a word, there is little of inherent value in the word form itself that helps to solve this narrowing-in function. This is why demonstratives like that and this are often accompanied by some form of deictic bodily behaviour (or descriptive lexical content, e.g. ‘That blue one’). At the same time, such linguistic forms are also able to rely on the special salience of potential referents as determined by the current common ground of interlocutors; for instance, one might say ‘My brother has a car like that one’ while there are numerous cars in view, but where the car has a special salience in the scene: for example, it just drove past us, or it is painted a garish colour, or is particularly expensive-looking (Clark, Schreuder, and Buttrick 1983).

Take examples like ‘I heard that’, ‘Take this’, or ‘Were you at that party?’ These are semantically very general forms of expression, and a listener can only make sense of them by connecting the speech to something semantically much more specific such as a physical object or something in the spoken discourse or other shared knowledge, in other words, in the common ground (Clark 1992, 1996). The salience required for the successful connecting of a demonstrative to a referent may come from different sources. Certain things might be salient already because they are large, bright, central, or otherwise prominent in their surroundings (Clark et al. 1983). And one can render something salient in various ways (e.g. by pointing at it, looking at it, using a laser pointer, shining a light, holding the thing up). Ultimately, however, even where many sources of information converge to suggest a single referent, recipients of deictic expressions must infer what is being indicated.

¹ Of course, as Wittgenstein (1953) and others pointed out, all reference involves a certain degree of indeterminacy. So, for instance, in the example from Rembrandt a recipient must infer, on the basis of whatever evidence is available, whether the doctor intends to draw attention to the arm, the tendon, the flesh, or the entire body of the cadaver. Or again whether it is the colour, the size, or shape of some or all of the cadaver that is being indicated.

Syntactically, demonstratives may serve a range of different functions. For example, in English that may occur as an independent noun phrase (e.g. I saw that) or as a modifier within a noun phrase (e.g. I saw that car). Some demonstratives are ‘adverbial’ in function, in that they can be seen to relate to or modify events and actions (e.g. there in I went there). Depending on which language system we consider, demonstratives show different distributions (thus, in English I saw that/*there, I went *that/there, I saw that car/*there car). The details of such distinctions are subtle and complex and are particular to each language system (see Anderson and Keenan 1985, Diessel 1999, Dixon 2003 for reviews).

One common function of demonstratives in spoken language is ‘exophoric’. In exophoric uses, reference is made to physical things and places that can be seen and pointed to in the context of the speech event. Alongside these exophoric functions, there are also endophoric referential uses of demonstratives (Halliday and Hasan 1976). In endophoric uses, reference is made not to things that can be physically pointed to and shown, but to things in the discourse context, which often includes things that have been said (e.g. anaphoric use of that in He said it was good and I agreed with that), but could also refer to things that will be said next (e.g. cataphoric use of this in What I want to say is this: I agree). Another kind of endophoric reference points to things in the shared common ground, sometimes referred to as a ‘recognitional’ usage (Himmelmann 1996, after Sacks and Schegloff 2007b [1979], see later); this is found in cases like He reminds me of that boyfriend of Jane’s, where in order to resolve the reference of ‘that’, the listener consults neither the physical setting nor the current discourse, but rather the interpersonally shared common ground of the dyad. The endophoric uses of demonstratives are often regarded as secondary or derived from exophoric uses, based on arguments from both ontogeny (infants acquire exophoric functions first) and diachrony (endophoric functions often develop from exophoric ones; see Diessel 1999 for a statement of this position).
However, it is not clear that from a synchronic perspective either function is subordinate to the other. Hanks (1990) and Enfield (2003) have argued that the core meanings of demonstratives do not semantically specify an exophoric versus endophoric distinction, rather that these are simply distinct (and sometimes not-so-distinct) pragmatic contexts of use of the semantically general terms. Typological work on demonstratives indicates that there is significant and subtly complex variation across languages in terms of the semantic dimensions that are encoded, the number of distinctions, and the grammatical properties of the various elements of the systems. Here we do not attempt to give an overview of the typological properties of demonstratives and demonstrative systems (for that, see e.g. Himmelmann 1996; Diessel 1999; Dixon 2003; Huang 2014). We will simply introduce a few of the known ‘realms of possibility’, concentrating specifically on the number and semantic types of possible distinctions found in systems of ‘demonstrative adjectives’ (i.e. words like that and this as modifiers in expressions with nominal referents; e.g. that car or this book). A demonstrative system can be extremely simple in terms of the number of distinctions it makes along a given dimension such as distance from speaker. Colloquial German, for example, has essentially a one-term system of demonstrative adjectives. While grammars of German state that a noun may be modified by one of three distinct terms: der ‘that/the’, dieser ‘this’, and jener ‘yon’, in fact only der (and its variants die and das, depending on the gender of the head noun) tends to be used. So, for instance, a German speaker would be more likely to say das Buch hier for ‘this book’ (proximal to the speaker, literally ‘that/the book here’) to distinguish from das Buch ‘that/the book’. A more common and still very simple type of system features a two-way distinction. 
In English, for example, a ‘proximal’ term this stands in opposition to a ‘distal’ term that. There is an archaic term yon ‘far distal, over there’, but it is almost never used. A similar situation is found in the non-Pama-Nyungan language Kayardild, with ‘distal’ dathin- and ‘proximal’ dan-, and a ‘rarely used’ form nganikin- meaning ‘that, beyond the field of vision’ (Evans 1995: 206–210). It is surprisingly difficult to determine precisely what is the semantic distinction between the terms in such a system, though the most common characterization is ‘proximate’ versus ‘distal’. This captures the fact that, in general, things that one refers to with the word this tend to be spatially closer to the speaker than things one would refer to with that. However, there are problems with this suggestion. For one thing, these words are used in endophoric, non-spatial domains where the application of an analysis in terms of ‘proximate’ and ‘distal’ is metaphorical at best. A more parsimonious analysis would then not specify spatial distance as the operative factor (Enfield 2003; Hanks 1990; Kirsner 1979). For another thing, there is no objective measure of what would count as ‘proximal’ versus ‘distal’, yet these terms imply some kind of specifiable distance. When we observe actual usage, it turns out that spatial distance between speaker and referent does not predict which term will be used. This was demonstrated in an analysis of situated usage of a two-term system in Lao (a Southwestern Tai language of Laos; Enfield 2003). The account that best captures the observed data posits a semantic asymmetry in the system: one of the terms is semantically specified as ‘external’, ‘distal’, or more accurately ‘not here’, while the other term has no specification for ‘externality’ or ‘distance’.
This is a basic ‘informativeness scale’ (Horn 1989; Levinson 2000), by which the unmarked member of a paradigm can readily pick up extra pragmatic meaning by virtue of its opposition to the other members. In the Lao case, the semantically general form tends to imply ‘proximal’, not because it semantically specifies proximal but because it is being chosen when ‘distal/external’ could have been chosen instead. A similar solution has been implied in analyses of the English that/this opposition, though without consensus as to which term is the semantically unmarked one (Halliday and Hasan 1976: 59 say that that is basic, while Wierzbicka 1980: 27 and Dixon 2003: 81 say that this is the basic form).

Many languages have three-term systems, often described in terms of the familiar ‘proximate’ versus ‘distal’ distinction, but where there are two ‘proximate’ terms: one refers to things that are proximate to the speaker, the other to things that are proximate to the addressee. For example, in Yimas, a Lower Sepik language of Papua New Guinea, there are three deictic stems: -k ‘this (near me)’, m- ‘that (near you)’, and -n ‘that yonder (near neither you nor me)’ (Foley 1991: 112). Or in Manambu, also spoken in the East Sepik, there are the forms k- ‘close to speaker’, wa- ‘close to hearer’, and a- ‘far from both’ (Aikhenvald 2008: 201). Other three-term systems operate on different semantic principles. In Turkish, alongside a contrast between ‘proximal’ (bu) and ‘distal’ (o), there is a term (şu) that encodes ‘the absence of the addressee’s visual attention’ on the thing being referred to (Küntay and Özyürek 2006: 304).

There are also many languages with demonstrative systems that have more than three terms. Often the extra terms mark spatial contrasts associated with living in a particular kind of physical environment and lifestyle.
For example, in Kri, a Vietic language of Laos (Enfield and Diffloth 2009), there is a five-term system of exophoric demonstratives, featuring a familiar-looking proximal versus distal distinction, in addition to semantic distinctions of ‘across’, ‘up’, and ‘down’, motivated by the Kri speakers’ riverine up–down environment (this system is also used with reference to small-scale or ‘table top’ space; see further discussion of the Kri system in section 11.4):

(1) a. nìì     general (‘this’, proximal)
    b. naaq    external (‘that’, distal)
    c. seeh    external, across (‘yon’, far distal)
    d. cồồh    external, down below
    e. lêêh    external, up above

A similar system is found in Lezgian, a Nakho-Daghestanian language of the Eastern Caucasus (Haspelmath 1993: 190; note that according to Haspelmath, in ‘modern standard’ Lezgian, only the two forms glossed as ‘that’ and ‘this’ are commonly used). In the Lezgian system, yet another term (ha) is added, which has a dedicated ‘discourse anaphoric’ function:

(2) a. i       ‘this’
    b. a       ‘that’
    c. at’a    ‘yonder’
    d. ha      ‘the aforementioned’
    e. wini    ‘that up there’
    f. aǧa     ‘that down there’

These few examples can only hint towards the complexity and subtlety of different demonstrative systems in languages of the world. The list of possible semantic distinctions is long. In his typological survey of demonstrative systems, Diessel (1999: 52) summarizes all of the semantic features that are attested. These divide into ‘deixis’ and ‘quality’, subcategorized in Table 11.2.
Table 11.2 Diessel’s summary of semantic distinctions attested in demonstrative systems

(A) Semantic distinctions in demonstratives of the type ‘deixis’:
    (i) distance
    (ii) visibility
    (iii) elevation
    (iv) geography
    (v) movement (or direction)
(B) Semantic distinctions in demonstratives of the type ‘quality’:
    (i) ontology
    (ii) animacy
    (iii) humanness
    (iv) sex
    (v) number
    (vi) boundedness

Adding to the complexity and richness of the possibility space for demonstratives, the various terms may be enlisted in many different ways for endophoric usages, and in other syntactic functions (e.g. as demonstrative adverbs like English there and here). The most important future line of research is to test the proposed semantics of these systems in the context of their usage in everyday life. Since the understanding of demonstratives is so heavily context-dependent, they cannot be meaningfully studied without looking at a corpus of usage. This issue is discussed in section 11.4.

11.4 Demonstratives in the Context of Common Ground

We began this survey of deictic reference with the simplest kinds of joint-attentional scenes, the kinds that allow a 9-month-old to get started on his or her long journey of socialization. It is a years-long path of countless moments of joint attention, countless instances of learning and guidance, of gradual convergence in knowledge and stance with elders and peers, first through simple gestures and shared participation frames, and soon within the increasingly rich matrices of language, kinship, ritual, livelihood, and material culture. These aspects of the sociocultural world all form the basis of a community’s common ground, and thus are naturally caught up in the elements of demonstrative systems, dependent as they are on whatever sources of ‘mutual salience’ happen to be at hand.
Most previous work on deixis, such as the research on demonstrative systems outlined in the previous section, has approached the task as a search for the right ‘gloss’ of each form’s meaning. However, deictic terms like demonstratives are especially hard to gloss in the abstract since interpreters are so heavily dependent on context in figuring out what they refer to on any given occasion. Research such as that by Hanks (1990) and Enfield (2003) has shown that the situated dynamics of social interaction, both spatial and social-relational, bear directly on how a simple demonstrative distinction, e.g. between that and this in English, is to be interpreted. The key to interpreting deictic expressions is the common ground that pertains between interlocutors (Clark 1996; cf. Hanks 2006b). In a study of Lao, Enfield (2003) shows how the rapidly changing common ground arising from fluidly evolving participation frames in marketplace interactions can affect the differential selection of demonstratives for picking out referents that are all proximate and in common view. In other kinds of context, we see how common ground of the more enduring kind, that is, cultural common ground (Clark 1992), also has a bearing on the selection and interpretation of demonstratives. Let us consider an example from research on speakers of Kri, an Austroasiatic language of Laos (Enfield and Diffloth 2009).

Figure 11.6 Kri house

In the Kri-speaking community of Mrka village in upland central Laos, houses are built to a precise plan, by which the physical layout of the house is a diagram of certain social-relational asymmetries, on two axes (see Enfield 2009 for detailed discussion). Running laterally across the house is an ‘in–out’ axis, where ‘in’ maps onto ‘private, family, women, children, storage room, food preparation’ and ‘out’ maps onto ‘public, non-kin, men, adults, guests, drinking, public ritual’.
Orthogonal to this is an axis that runs from what we would call in English the ‘front’ of the house, where one enters, to the ‘back’ of the house. In Kri, this is referred to as a ‘below–above’ axis, where ‘below’ maps onto socially lower rank and ‘above’ onto socially higher rank, with relative ‘height’ determined primarily by relative age, often attenuated by classificatory kinship. See Figures 11.6 and 11.7. The Kri house is therefore conceptualized spatially as a mini-version of the larger geographical environment, as coded in the demonstrative system. Recall that in that system (see (1)), beyond the ‘proximal’ and ‘distal’ forms, there are three forms in addition: ‘the one up/above/upstream/uphill’, ‘the one down/below/downstream/downhill’, and ‘the one across’ (i.e. away but neither up nor down). While the house floor is normally perfectly level, the ‘up/down/across’ scheme is nevertheless readily mapped onto it, thanks to its diagrammatic relation to the social-cultural dimensions represented as ‘in–out’ (family versus non-family) and ‘up–down’ (senior versus junior).

Figure 11.7 Kri house floor plan (labels: prùng kùùjh ‘fire pit’; roong ‘upper corner’; sùàmq ‘inner room’; tkoolq ‘giant mortar’; khraa ‘storage and work room’; cààr ‘verandah (covered)’; cààr ‘verandah (open)’; krcààngq ‘ladder’; axes: upper–lower, outer–inner; scale: 5 m approx.)

Now consider an example from a video-recorded interaction between a group of Kri-speaking women sitting on a front verandah, in which this socioculturally motivated mapping provides the solution for a simple referential problem of locating an object. Figure 11.8 shows a still image from the video recording.

Figure 11.8 Image of the speakers (Still image from video recording).

The scene is in the house of E, the elderly woman at the right of frame.
We focus on an exchange between her and B, the young woman second from left, visible in the door frame, who does not live in this house.

(3) Kri interaction
B: piin sulaaq
   give leaf
   Pass some leaf.
E: sulaaq quu kuloong lêêh, sulaaq, quu khraa seeh
   leaf LOC inside DEM.UP leaf LOC store DEM.ACROSS
   The leaf is inside up there, the leaf, in the store.
   môôc cariit hanq
   one backpack 3SG
   (There’s) a (whole) backpack.
   (5s; B walks inside)

Here, B makes a request to be given some ‘leaf’ (actually, corn husk) with which she can roll a cigarette. In E’s reply, she uses a complex combination of referential expressions to inform B of the location of the ‘leaf’ so that she can go and get some herself. First, an intrinsic spatial reference (kuloong ‘inside’) is combined with the ‘up’ demonstrative lêêh, in alignment with the up–down axis of the house. From their perspective sitting on the verandah, the ‘lowest’ part of the house, the inside area of the house is ‘up’, and, accordingly, this is coded in the demonstrative chosen. E then narrows in further on the spatial location; where they are currently sitting is the ‘outer’ edge of the house, and the ‘leaf’ in question is located inside the khraa ‘storage room’ at the ‘innermost’ side of the house: once one has entered the house going ‘up’ from where the speakers are sitting, one would then have to go ‘across’; this is specified with use of the relevant demonstrative seeh ‘the one across there’. This example has illustrated one way in which the interpretation of demonstratives depends crucially on shared background knowledge, as relevant to the context of speaking. In the case of Kri, the selection and interpretation of demonstratives draws directly on a conventional mapping of the sociocultural domain of kinship and other personal relations onto the 2D spatial array of the house floor plan.

11.5 What’s Special about Deixis as a Form of Reference?
In this final section we address the central question with which we began: what is special about deixis as a form of reference? Another way to ask the same question is: where both deictic and non-deictic formulations of a referent are possible, why might a speaker choose the deictic one? Consider the following case from the second presidential debate between John McCain and Barack Obama in 2008. Here the moderator has asked McCain the following: ‘Should we fund a Manhattan-like project that develops a nuclear bomb to deal with global energy and alternative energy or should we fund 100,000 garages across America, the kind of industry and innovation that developed Silicon Valley?’ McCain has already begun to respond when he produces the following segment:

(4) McCain²
01 JM: By the way my friends: I-I know you grow a little wea:ry
02     with this back-and-forth.
03     (.)
04     It was an energy bill on the floor of the Senate loaded down
05     with goodies. billions for the oil companies. An’ it was
06     sponsored by- Bush and Cheney.
07     (0.2)
08     You know who voted for it,
09     might never know,
10     That one.
11     You know who voted against it?
12     Me.
13     I have fought time after time against these pork barrel–these-these
       bills that come to the floor and they have all kinds of goodies an’
       all kinds of things in them for everybody and they buy off the votes,

² English examples are presented using the transcription conventions originally developed by Gail Jefferson. For present purposes, the most important symbols are the period (‘.’) which indicates falling and final intonation, the question mark (‘?’) indicating rising intonation, and brackets (‘[’ and ‘]’) marking the onset and resolution of overlapping talk between two speakers. Equal signs, which come in pairs (one at the end of a line and another at the start of the next line or one shortly thereafter), are used to indicate that the second line followed the first with no discernible silence between them, i.e. it was ‘latched’ to it. Numbers in parentheses (e.g. (0.5)) indicate silence, represented in tenths of a second. Finally, colons are used to indicate prolongation or stretching of the sound preceding them. The more colons, the longer the stretching. For an explanation of other symbols, see Enfield and Stivers (2007).

Notice then that McCain selected the deictic formulation ‘that one’ in referring to Obama, who was sitting close by at the time (see Figure 11.9). This is clearly a marked usage in contrast to ‘Obama’ or ‘Senator Obama’ and it was noted in the press, with many ordinary people as well as political pundits weighing in on what the formulation might ‘mean’. For instance, the Huffington Post reported:

    During a discussion about energy, McCain punctuates a contrast with Obama by referring to him as “that one,” while once again not looking in his opponent’s direction (merely jabbing a finger across his chest). That’s not going to win McCain any Miss Congeniality points. Nor will it reassure any voters who believe McCain is improperly trying to capitalize on Obama’s “otherness.”

David Axelrod, an Obama strategist, was reported as saying: ‘Senator Obama has a name. You’d expect your opponent to use that name’, clearly drawing attention to the marked character of ‘that one’. Other commentators suggested that the usage was disrespectful, rude, or even racist. Defenders of McCain, in contrast, argued that the press and others were making something out of nothing.

Drawing on the basic principles sketched in this chapter we can develop an analysis of how people were able to arrive at these diverse interpretations. First, the reference is accompanied by a pointing gesture in the direction of Obama (Figure 11.9); indeed, there is a prior point at Obama produced over ‘you know who voted for it?’ Second, while producing the reference (the deictic formulation ‘that one’ with point in Obama’s direction), McCain was gazing at the studio audience. Third, the reference combines the deictic ‘that’ with the characterizing ‘one’, a usage which denotes any enumerable person or thing. The combination is roughly equivalent to ‘him’ in denoting a third person, non-participant in the immediately available speech situation, i.e. not a speaker, not an addressee; and note that it is compatible with the referent being an inanimate object. Fourth, McCain can be seen to have selected ‘that’ from the pair of contrasting terms ‘this/that’: ‘that’ is what we gloss as the distal member of the pair and, in contrast to ‘this’, conveys distance from speaker (see Stivers 2007).

Figure 11.9 McCain and Obama, ‘That one’ (panels (a) and (b))

We can see that this reference positions Obama as a non-participant in a speech event comprised of McCain and the audience to whom his talk is directed. In addition, the use of ‘one’ and ‘that’ (rather than ‘this’) conveys distance. These effects, along with McCain’s use of ‘my friends’ to address and align the audience, thus work together to construe an interactional rift that divides himself and the audience on the one side from Obama on the other. At the same time, of course, these meanings are defeasible: from another perspective, McCain was simply using a highly efficient, minimally characterizing referring expression to identify who he was talking about. The availability of seemingly incompatible, even opposed interpretations is surely an outcome of the fact that so much of the meaning of these forms is inferred rather than encoded.

We are now in a position to summarize at least some of the features of deixis that distinguish it from other forms of reference and to see how these might shape a speaker’s selection of a deictic over non-deictic formulation.

1.
Deictic reference is a low-cost, highly efficient, minimally characterizing way to accomplish reference. Many of the examples we have so far discussed exemplify just this point. Simply put, there are many situations in which a deictic formulation is the most efficient way to accomplish reference. Where the intended referent is already available in the common ground and perhaps even co-present, a deictic formulation constitutes the most straightforward way of referring to it. Notice that this likely explains the universal occurrence of deictic words in the world’s languages: a language without them would be unnecessarily cumbersome. It should be noted however that there are some (perhaps many) situations in which sociocultural norms override any pressure towards efficiency. So for instance in Vietnamese, in many situations, speakers avoid minimally characterizing deictic formulations in referring to speaker and hearer (tôi/ta ‘I’, mày ‘you’) in favour of kin terms which explicitly characterize the social relationship between speaker and hearer (Luong 1990; Sidnell and Shohet 2013). So while matters of efficiency are clearly at play, their relevance may not always be paramount.

2. The semantically general character of deictic forms makes them well-suited for reference to hard-to-describe and/or nameless objects. In such a situation a deictic form can exploit features of the artifactual environment, including the presence of the thing being referred to. For instance, in the following case something hanging on the door of the small room where three children are playing is initially referred to by ‘it’. However, when the recipient initiates repair of the reference with ‘move what?’, a deictic formulation is used which locates the referent relative to landmarks in the physical environment rather than characterizing or describing it.
(5) Kids_11_24_05(2of2)T7 @11:33
01    A: ((looks at door)) Maybe R---, maybe you can move it,
02    C: °Move what?°
03 -> A: Move that thing that’s in the lock
04    C: Okay.

3. The semantically non-specific, minimally characterizing features of deixis allow speakers to avoid description where such description may be counter-productive to some interactional goal. There are situations in which a speaker may wish to avoid characterizing the thing referred to and here deictic formulations are particularly well-fitted. Sacks (1995) discussed this issue in his consideration of ‘indicator terms’ (the term used by analytic philosophers such as Russell and Goodman to talk about deictics). Sacks observed that in the context of group therapy one patient may wish to avoid saying ‘why are you in therapy?’ and prefer instead ‘why are you here?’, these questions having quite different implications. The first invites an answer that makes reference to the real or supposed psychological issues with which the recipient is struggling. The second, in contrast, can be answered with something like ‘my father sent me’ or ‘it’s a condition of my parole’ etc., i.e. practical circumstances. This points to some of the ways sociocultural rules or norms may come into play in the selection of deictic or non-deictic forms. Levinson (2005, 2007) has discussed data from Rossel Island that is also relevant here. The Rossel Islanders observe taboos on name use when the bearer of the name is recently deceased. In their attempts to observe these taboos, speakers of Yélî Dnye sometimes resort to highly circumspect reference, often involving elaborate deictic gestures or linguistic formulations: eyebrow flashes to distant locales, points, or expressions like ‘that girl’ and so on.

4.
Because these forms require for their interpretation the application of knowledge in common ground (shared knowledge), successful reference via such a form can be a demonstration of social proximity, an informational enactment of intimacy (see Enfield 2006). Schegloff (2007b) discusses how this works via the indexical meaning of a person’s voice. In the following example, Clara picks up the phone and says hello (line 6b), to which the caller, Agnes, responds with ‘Hi’ (line 6c). From this one-syllable voice sample, Clara knows it is Agnes, and demonstrates this knowledge in her subsequent utterance, by using Agnes’s name.

(6) a.        ((Ring))
    b. Clara  Hello
    c. Agnes  Hi
    d. Clara  Oh hi, how are you Agnes

This indexically-based understanding is a way of making a genuine demonstration of shared knowledge between a particular dyad. Had the caller been someone who Clara did not know, or knew less well, she would have been simply unable to make this demonstration of knowing who it was, and thus would have made explicit the greater social distance between the two. This example relates to the indexical meaning that allows us to recognize a person just from their voice, and so is not in the realm of linguistic deixis; however, we see exactly the same effect in the domain of grammatical deixis. In this example from Lao (see Enfield 2006 for more information), a man is talking about a riverine environment near his village, where villagers were once able to collect large amounts of a certain herbal medicine.
(7) 6  tè-kii4  haak5  vang2-phêêng2  nanø         tèø-kii4  khaw3   paj3  tèq2-tòòng4
       before   pcl    vp             tpc.nonprox  before    3pl.b   go    touch
       ‘Before, in Vang Phêêng weir, before, for them (the villagers) to go and touch it
    7  bòø  daj4,  paa1-dong3  man2       lèwø  dêj2
       neg  can    forest      3.nonresp  prf   fac.news
       was impossible, it was the forest of it[non-respect], you know.’

The deictic element in line 7, man2 ‘it’, has no local antecedent, and so the speaker is evidently assuming that his listeners will know who or what ‘it’ is. A couple of lines later, a woman who is listening to the man’s story asks:

(8) 8 FW  khuam2  phen1  haaj4  niø  naø
          reason  3.p    angry  tpc  tpc.periph
          ‘Owing to its[respect] being angry?’

She uses a different pronoun, this time marking respect; however, the referent is still entirely inexplicit. In the next line, the man does make the referent explicit:

(9) 9 FM  qee5 –  bòò1  mèèn2  lin5  lin5  dêj2,     phii3   vang2-phêêng2  niø
          yeah    neg   be     play  play  fac.news  spirit  V              pcl
          ‘Yeah – It’s not playing around you know, the spirit of Vang Phêêng.’

The deictic expressions man2 and phen1, both third person pronouns, were first used in this sequence in such a way as to assume certain cultural common ground; namely that ‘weirs’ and similar deep water environments have spirit owners that protect the aquatic resources and that are feared and respected. The fact that these interlocutors were able to successfully refer to these spirits with only the use of these semantically very general demonstrative expressions is a demonstration of their common membership in a particular sociocultural world, and not only in a common ‘speech community’.

In this chapter we have sketched the interactional foundations of deixis (and reference in general) in the joint attentional scenes and associated action trajectories of ordinary social life.
We then discussed two ways in which the basic features of deictic reference are elaborated: in semantically complex systems of linguistic opposition and in the way they map onto the rich, conventionally meaningful cultural systems that make up the life-world. Finally, we have tried to address the fundamental question of why any given speaker on any given occasion would select a deictic over a non-deictic expression.

Abbreviations Used

Orthography used for Lao in this book follows Enfield (2007). Orthography used for Kri follows Enfield and Diffloth (2009). Following are the conventions used for interlinear morphemic glossing:

1        1st person
2        2nd person
3        3rd person
B        bare
dem      demonstrative
dir      directional
dist     distal
fac      factive
loc      locative
neg      negation
news     news marker
nonprox  non proximal
pcl      particle
pl       plural
prf      perfect
tpc      topic
