edited by
Jost Gippert
Nikolaus P. Himmelmann
Ulrike Mosel
Editors' preface
Language documentation is concerned with the methods, tools, and theoreti-
cal underpinnings for compiling a representative and lasting multipurpose
record of a natural language or one of its varieties. It is a rapidly emerging
new field in linguistics and related disciplines working with little-known
speech communities. While in terms of its most recent history, language
documentation has co-evolved with the increasing concern for language
endangerment, it is not only of interest for work on endangered languages
but for all areas of linguistics and neighboring disciplines concerned with
setting new standards regarding the empirical foundations of their research.
Among other things, this means that the quality of primary data is carefully
and constantly monitored and documented, that the interfaces between pri-
mary data and various types of analysis are made explicit and critically
reviewed, and that provisions are taken to ensure the long-term preservation
of primary data so that it can be used in new theoretical ventures as well as
in (re-)evaluating and testing well-established theories.
This volume presents in-depth introductions to major aspects of language
documentation, including a definition of what it means to document
a language, overviews of fieldwork ethics and practicalities and data
processing, discussions on how to provide a basic annotation of digitally-
stored multimedia corpora of primary data, as well as long-term perspectives
on the preservation and use of such corpora. It combines theoretical and
practical considerations and makes specific suggestions for the most com-
mon problems encountered in language documentation.
The volume should prove to be most useful to students and researchers
concerned with documenting little-known languages and language varie-
ties. In addition to linguists and anthropologists, this includes students and
researchers in various regional studies and philologies such as African
Studies, Indology, Turkology, Semitic Studies, or South American Studies.
The book presupposes familiarity with the basic concepts and terminology
of descriptive linguistics (for example, basic units such as phoneme or lex-
eme), but most chapters will also be accessible and useful to non-
specialists, including educators, language planners, politicians, and govern-
ment officials concerned with linguistic minorities.
Nearly all chapters of this volume are based on a series of lectures and
seminars presented during the First International Summer School on Language
Documentation: Methods and Technology held in Frankfurt/Main
(Sept. 1–11, 2004). While not a textbook in the strict sense (which would include
exercises, etc.), the volume is designed to serve as the main source of
readings for a university class on language documentation (for third-year
students and above). Parts of it can also be used as readings in field methods
classes and classes in linguistic anthropology. However, it is not a guide to
linguistic fieldwork. Instead, it focuses on issues which are typically not mentioned
at all, or mentioned all too briefly, in fieldwork manuals, such as, for example,
the cooperative interaction between researcher(s) and speech community,
orthography development, the function of metadata, and the archiving of recordings
and transcripts. When used as a textbook in a language documentation class,
it should be complemented with readings on linguistic fieldwork and lin-
guistic anthropology from other sources (see further Section 5 of Chapter 1).
Of major import to documentary linguistics is the technology used in
recording and preserving linguistic primary data, most of which is IT-
related today. Since this is a rapidly changing field, we have kept the dis-
cussion of specific technological aspects and procedures to an absolute
minimum, focusing on conceptual issues and practicalities which we be-
lieve will stay with us for some time to come. Nevertheless, a considerable
number of technical standards, software programs, and institutions con-
cerned with corpus building and preservation are mentioned in this book in
order to provide examples for a given conceptual issue or a recommended
general procedure. The appendix provides an alphabetical list of all the
abbreviations used in this regard, as well as internet links providing more
up-to-date information on them. This information is continuously updated
on the book's website at:
On this website, the reader will also find video and audio files for some of
the examples given in this book as well as links and suggestions for topics
which could not be adequately dealt with here.
Finally, it bears emphasizing once again that language documentation in
many ways is still a rather new discipline where many basic concepts and
procedures are in the process of being tested and fully elaborated (see also
Section 3.2 in Chapter 1). In particular, while considerable progress has
been made in recent years with regard to the compilation and archiving
aspects of language documentation, to date there is very little experience
indeed with regard to actually working with digitally-stored multimedia
corpora of lesser-known languages. In the coming years, we expect to see
major developments here with regard to the etiquette of working with such
corpora (How are they evaluated? How are they referred to in publications?
How can work by different investigators on the same variety be combined
into a single coherent corpus?) as well as with regard to the technology
used in exploring them and extracting relevant information for a specific
project. We also expect an impact on the methodological and theoretical
debate in the subject areas working most intensively with data from such
corpora, including linguistic typology, linguistic anthropology, and oral
literature. As a part of these developments, it may well turn out that some
of the suggestions made in this book, e.g. with regard to the structuring of
the corpora or the format for annotations, will need to be revised or perhaps
even be discarded. Still, we trust that the discussion of the basic conceptual
issues as laid out here will be of continued interest and relevance for many
years to come and thus truly merit to be considered essentials of language
documentation.
Nikolaus P. Himmelmann, Bochum
Jost Gippert, Frankfurt
Ulrike Mosel, Kiel
We gratefully acknowledge the very generous support of the Volkswagen-
Stiftung ( which has been instrumental
in producing this book. The foundation not only funded the summer school
for which most chapters were drafted, but also provided the means to dis-
tribute a substantial number of copies of this book free of charge outside of
Western Europe, North America, and Japan. By granting a research fellowship
for Himmelmann in 2004–2005, it has allowed him to focus his research
on the issues dealt with in Chapters 7 and 10 and to engage in the
editing of the book in a way which otherwise would not have been possible.
Through its DoBeS Programm (Documentation of Endangered Languages
program), which started in the year 2000, it has made a major contribution
to the development of documentary linguistics as an innovative field of
study and practice within the humanities.
Our sincerest thanks are due to the contributors to the volume, who spent
a lot of time on conceiving their chapters and have always been ready to
cooperate with us in the difficult task of preparing a consistent book.
We also gratefully acknowledge much practical help we have received
in putting the volume together. Marcia Schwartz checked English and style
conventions; Judith Köhne compiled the combined list of bibliographical
references at the end. At Mouton, Ursula Kleinhenz did a great job of seeing
the book through to press. Many thanks to all of you.
Chapter 1 Language documentation: What is it and what is it good for?
Nikolaus P. Himmelmann
Chapter 2 Ethics and practicalities of cooperative fieldwork and analysis
Arienne M. Dwyer
Chapter 3 Fieldwork and community language work
Ulrike Mosel
Chapter 4 Data and language documentation
Peter K. Austin
Chapter 5 The ethnography of language and language documentation
Jane H. Hill
Chapter 6 Documenting lexical knowledge
John B. Haviland
Chapter 7 Prosody in language documentation
Nikolaus P. Himmelmann
Chapter 8 Ethnography in language documentation
Bruna Franchetto
Chapter 9 Linguistic annotation
Eva Schultze-Berndt
Chapter 10 The challenges of segmenting spoken language
Nikolaus P. Himmelmann
Chapter 11 Orthography development
Frank Seifart
Chapter 12 Sketch grammar
Ulrike Mosel
Chapter 13 Archiving challenges
Paul Trilsbeek and Peter Wittenburg
Chapter 14 Linguistic documentation and the encoding of textual materials
Jost Gippert
Chapter 15 Thick interfaces: mobilizing language documentation with multimedia
David Nathan
Chapter 1
Language documentation:
What is it and what is it good for?
Nikolaus P. Himmelmann
This chapter defines language documentation as a field of linguistic inquiry
and practice in its own right which is primarily concerned with the compi-
lation and preservation of linguistic primary data and interfaces between
primary data and various types of analyses based on these data. Further-
more, it argues (in Section 2) that while language endangerment is a major
reason for getting involved in language documentation, it is not the only
one. Language documentations strengthen the empirical foundations of
those branches of linguistics and related disciplines which heavily draw on
data of little-known speech communities (e.g. linguistic typology, cognitive
anthropology, etc.) in that they significantly improve accountability
(verifiability) and economize research resources.
The primary data which constitute the core of a language documentation
include audio or video recordings of a communicative event (a narrative, a
conversation, etc.), but also the notes taken in an elicitation session, or a
genealogy written down by a literate native speaker. These primary data are
compiled in a structured corpus and have to be made accessible by various
types of annotations and commentary, here summarily referred to as the
apparatus. Sections 3 and 4 provide further discussion of the components
and structure of language documentations. Section 5 concludes with a pre-
view of the remaining chapters of this book.
1. What is a language documentation?
An initial, preliminary answer to this question is: a language documentation
is a "lasting, multipurpose record of a language". This answer, of
course, is not quite satisfactory since it immediately raises the question of
what we mean by "lasting", "multipurpose", and "record of a language". In
the following, these constituents of the definition are taken up in reverse
order, beginning with "record of a language".
At first sight, a further definition of "record of a language" may look
like a bigger problem than it actually is since it involves the highly complex
and controversial issue of defining "a language". The main problem
with defining "a language" consists in the fact that the word "language" refers
to a number of different, though interrelated phenomena. The problems in
defining it vary considerably, depending on which phenomenon is focused
upon. That is, different problems surface when the task is to define lan-
guage as opposed to dialect, or language as a field of scientific enquiry, or
language as a cognitive faculty of humans, and so on. Unless we want to
postpone working on language documentations until the probably never
arriving day when all the conceptual problems of defining language in all
of its different senses are resolved and a theoretically well-balanced delimi-
tation of a language for the purposes of language documentations is pos-
sible, we need a pragmatic approach in dealing with this problem.
The basic tenet of such a pragmatic approach is implied by the qualifiers
"multipurpose" and "lasting" in the definition above: The net should be cast as
widely as possible. That is, a language documentation should strive to in-
clude as many and as varied records as practically feasible, covering all
aspects of the set of interrelated phenomena commonly called a language.
Ideally, then, a language documentation would cover all registers and varie-
ties, social or local; it would contain evidence for language as a social prac-
tice as well as a cognitive faculty; it would include specimens of spoken
and written language; and so on.
A language documentation broadly conceived along these lines could
serve a large variety of different uses in, for example, language planning
decisions, preparing educational materials, or analyzing a set of problems
in syntactic theory. Users of such a multipurpose documentation would
include the speech community itself, national and international agencies
concerned with education and language planning, as well as researchers in
various disciplines (linguistics, anthropology, oral history, etc.). In fact, the
qualifier "lasting" adds a long-term perspective which goes beyond current
issues and concerns. The goal is not a short-term record for a specific pur-
pose or interest group, but a record for generations and user groups whose
identity is still unknown and who may want to explore questions not yet
raised at the time when the language documentation was compiled.
Obviously, this pragmatic explication of "lasting, multipurpose record
of a language" rests on the assumption that it is possible and useful to compile
a database for a very broadly defined subject matter (a language)
without being guided by a specific theoretical or practical problem
which could be resolved on the basis of this database. With regard to its use
in scientific inquiries, the validity of this assumption is shown by the suc-
cess of all those social and historical disciplines working with data not spe-
cifically produced for research purposes. Thus, for example, cave dwellers
in the Stone Age did not discard shellfish, animal bones, fragments of tools,
and the like within the cave with the purpose in mind of documenting their
presence and aspects of their diet and culture. But archeologists today use
this haphazardly discarded waste as the primary data for determining the
length and type of human occupation found in a given location. Similarly,
inscriptions on stones, bones, or clay tablets were not produced in order to
provide a record of linguistic structures and practices, but they have suc-
cessfully been used to explore the structural properties of languages such as
Hittite or Sumerian, which had already been extinct for millennia before
their modern linguistic analysis began.
However, it is also well known that historical remains and records tend
to be deficient in some ways with regard to modern purposes. Stone in-
scriptions and other historic documents with linguistic content, for exam-
ple, never provide a comprehensive record of the linguistic structures and
practices in use in the community at the time when these documents were
written. Thus, given that the Hittite records discovered to date mostly per-
tain to matters of government, law, trade, and religion, it remains unknown
how Hittite adolescents chatted with each other or whether it was possible
to have the verb in first position in subordinate clauses.
The experience with historical remains and records thus is ambivalent:
On the one hand, it clearly shows that they may serve as the database for
exploring issues they were not intended for. On the other hand, they show
that haphazardly compiled databases hardly ever contain all the information
one needs to answer all the questions of current interest. Based on this ob-
servation, the basic idea of a language documentation as developed here
can be stated as follows: The goal is to create a record of a language in the
sense of a comprehensive corpus of primary data which leaves nothing to
be desired by later generations wanting to explore whatever aspect of the
language they are interested in (what exactly is meant by primary data
here is further discussed in Section 3.1.1 below).
Put in this way, the task of compiling a language documentation is
enormous, and there is no principled upper limit for it. Obviously, every
specific documentation project will have to limit its scope and set specific
targets. Guidelines and suggestions as to how to go about setting such limits
and targets are further discussed below and in the remaining chapters of
this book. But to begin with, the fundamental importance of taking a prag-
matic stance in all matters of language documentation needs to be empha-
sized once again. There are major practical constraints on the usefulness of
targets and delimitations for language documentations which are exclu-
sively based on theoretical considerations regarding the nature of language
and speech communities. In most if not all documentation settings, the
range of items that can be documented will be determined to a significant
degree by factors that are specific to the given setting, most importantly,
the availability of speakers who are willing and able to participate in the
documentation effort. In fact, recent experiences make it clear that encour-
aging native speakers to take an active part in determining the contents of a
documentation significantly increases the productivity of a documentation
project. Consequently, a theoretical framework for language documentation
should provide room for the active participation of native speakers. While
the input of native speakers and other factors specific to a given setting is
not completely unpredictable, it clearly limits the level of detail of a general
framework for language documentation which can be usefully explored in
purely theoretical terms.
This assessment, however, should not be construed as denying the rele-
vance of theorizing language documentations. Not everything in a docu-
mentation is fully determined by the specifics of a given documentation
situation. Speakers and speech communities usually do not have a fully
worked-out plan for what to document. Rather, the specifics of a documen-
tation are usually established interactively by communities and research
teams. On the part of the research team, this presupposes a theoretically
grounded set of basic goals and targets one wants to achieve.
Furthermore, without theoretical grounding, language documentation is
in danger of producing "data graveyards", i.e. large heaps of data with
little or no use to anyone. While language documentation is based on the
idea that it is possible and useful to dissociate the compilation of linguistic
primary data from any particular theoretical or practical project based on
this data, language documentation is not a theory-free or anti-theoretical
enterprise. Its theoretical concerns pertain to the methods used in recording,
processing, and preserving linguistic primary data, as well as to the question
of how it can be ensured that primary data collections are indeed of use for a
broad range of theoretical and applied purposes.
Among other things, documentation theory has to provide guidelines for
determining targets in specific documentation projects. It also has to develop
principled and intersubjective means for evaluating the quality of a given
documentation regardless of the specific circumstances of its compilation.
A further major concern pertains to the interface between primary data and
analysis in a broad range of disciplines. Based on a detailed investigation
and evaluation of basic analytical procedures in these disciplines, it has to
be determined which type and format of primary data is required for a par-
ticular analytical procedure so that it can be ensured that the appropriate
type of data is included in a comprehensive documentation.
The present book provides an introduction to basic practical and theo-
retical issues in language documentation. It presents specific suggestions
for the structure and contents of language documentations as well as the
methodologies to be used in compiling them. To begin with, it will be useful
briefly to address the question of what language documentations are good
for. That is, why is it a useful enterprise to create lasting, multipurpose re-
cords of a language?
2. What is a language documentation good for?
From a linguistic point of view, there are essentially three reasons for engag-
ing in language documentation, all of them having to do with consolidating
and enlarging the empirical basis of a number of disciplines, in particular
those branches of linguistics and related disciplines which heavily draw on
data of little-known speech communities (e.g. descriptive linguistics, lin-
guistic typology, cognitive anthropology, etc.). These are language endan-
germent, the economy of research resources, and accountability.
Certainly the major reason why linguists have recently started to engage
with the idea of multipurpose documentations is the fact that a substantial
number of the languages still spoken today are threatened by extinction (see
Grenoble and Whaley 1998; Hagège 2000; Crystal 2000; or Bradley and
Bradley 2002 for further discussion and references regarding language en-
dangerment). In the case of an extinct language, it is obviously impossible to
check data with native speakers or to collect additional data sets. Creating
lasting multipurpose documentations is thus seen as one major linguistic
response to the challenge of the dramatically increased level of language
endangerment observable in our times. In this regard, language documenta-
tions are not only seen as data repositories for scientific inquiries, but also
as important resources for supporting language maintenance.
Creating language documentations which are properly archived and
made easily accessible to interested researchers is also in the interest of
research economy. If someone worked on a minority language in the
Philippines 50 years ago and someone else wanted to continue this work
now, it would obviously be most useful if this new project could build on
the complete set of primary data collected at the time and not just on a
grammar sketch and perhaps a few texts published by the earlier project.
Similarly, even if a given project on a little-known language is geared towards
a very specific purpose (say, the conceptualization of space), it would be in
the interest of research economy (and accountability) if this project were to
feed all the primary data collected in the project work into an open archive
and not to limit itself to publishing the analytical results plus possibly a
small sample of primary data illustrating their basic materials.
While the set of primary data fed into an archive in these examples
would surely fail to constitute a comprehensive record of a language, it
could very well be of use for purposes other than the one motivating the
original project (data from matching tasks developed to investigate the lin-
guistic encoding of space, for example, are also quite useful for the analysis
of intonation, for conversation analytic purposes, for grammatical analysis,
and so on). More importantly, if it were common practice to feed complete
sets of primary data into open archives (which do not necessarily have to
form a physical unit), comprehensive documentations for quite a number of
little-known languages could grow over time, which in turn would strength-
en the empirical basis of all disciplines working on and with such lan-
guages and cultures. That is, while much of the discussion in this chapter
and book is concerned with projects specifically targeted at creating sub-
stantial language documentations, the basic idea of creating lasting, multi-
purpose documentations which are openly archived is not necessarily tied
to such projects. It is perfectly possible and desirable to create such documentations
in a step-by-step fashion by compiling and integrating the pri-
mary data sets collected in a number of different projects over an extended
period of time. In fact, it is highly likely that in most instances, really com-
prehensive documentations can only be created in this additive way.
Finally, establishing open archives for primary data is also in the interest
of making analyses accountable. Many claims and analyses related to lan-
guages and speech communities for which no documentation is available
remain unverifiable as long as substantial parts of the primary data on which
the analyses are based remain inaccessible to further scrutiny. Accountability
here is intended to include all kinds of practical checks and methodological
tests with regard to the empirical basis of an analysis or theory, including
replicability and falsifiability. The documentation format developed here
encourages, and also provides practical guidelines for, the open and widely
accessible archiving of all primary data collected for little-known lan-
guages, regardless of their vitality.
3. A basic format for language documentations
This section presents a basic format for language documentations and then
highlights some features which distinguish this format from related enterprises.
3.1. The basic format
3.1.1. Primary data
Continuing the argument developed in the preceding sections, it should be
clear that a language documentation, conceived of as a lasting, multipur-
pose record of a language, should contain a large set of primary data which
provide evidence for the language(s) used at a given time in a given com-
munity (in all of the different senses of language). Of major importance in
this regard are specimens of observable linguistic behavior, i.e. examples
of how the people actually communicate with each other. This includes all
kinds of communicative activities in a speech community, from everyday
small talk to elaborate rituals, from parents baby-talking to their newborn
infants to political disputes between village elders.
It is impossible to record all communicative events in a given speech
community, not only for obvious practical, but also for theoretical and ethical
reasons. Most importantly, such a record would imply a totalitarian set-up
with video cameras and microphones everywhere and the speakers unable to
control what of their behavior is recorded and what not. A major theoretical
problem pertains to the fact that there is no principled way for determining
a temporal boundary for such a recording (all communicative events in one
day? two weeks? one year? a century?).
Consequently, there is a need to sample the kinds of communicative
events to be documented. Once again, we can distinguish between a prag-
matic guideline and theoretically grounded targets. The pragmatic guideline
simply says that one should record as many and as broad a range as possible
of communicative events which commonly occur in the speech community.
The theoretically grounded sampling procedure will be determined to a
significant degree by the purposes and goals of the particular project. The
rather broad and unspecific goal of a lasting, multipurpose record of a lan-
guage envisioned here implies that, as much as possible, a sufficiently large
number of examples for every type of communicative event found in a
given speech community is collected. This in turn raises the highly complex
issue of how the typology of communicative events in a given speech com-
munity can be uncovered. Within sociolinguistics, the framework known as
the ethnography of communication provides a starting point for dealing
with this issue. Chapter 5 provides a brief introduction to major concepts
relevant here. Chapter 8 lists a range of important topics and parameters.
Besides observable linguistic behavior, is there anything else that needs
to be documented in order to provide for a lasting, multipurpose record of a
language? Or can all relevant information be extracted from a comprehen-
sive corpus of recordings of communicative events? One aspect of a lan-
guage that is not, or at least not easily, accessible by analyzing observable
linguistic behavior is the tacit knowledge speakers have about their lan-
guage. This is also known as metalinguistic knowledge and refers to the
ability of native speakers to provide interpretations and systematizations for
linguistic units and events. For example, speakers know that a given word
is a taboo word, that speech event X usually has to be followed by speech
event Y, or that putting a given sequence of elements in a different order is
awkward or simply impossible. Similarly, metalinguistic knowledge as
understood here also includes all kinds of linguistically based taxonomies,
such as kinship systems, folk taxonomies for plants, animals, musical in-
struments and styles, and other artifacts, expressions for numbers and
measures, but also morphological paradigms.
The documentation of metalinguistic knowledge, while not involving
principled theoretical or ethical problems, is also not a straightforward task
because much of it is not directly accessible. To be sure, in some instances
there are conventional speech events involving the display of metalinguistic
knowledge, such as reciting a genealogy or lengthy mythological narratives
which sketch a cognitive map of the landscape. In many societies, there are
also a number of well established and much discussed topics where speakers
engage in metalinguistic discussions regarding the differences between different
varieties ("in village X they say da but we say de"; "young people
cannot pronounce our peculiar /k/-sound correctly anymore", etc.). Furthermore,
transcripts prepared by native speakers without direct interference by
a linguist often provide interesting evidence regarding morpheme, word,
and sentence boundaries (see Chapters 3 and 10 for further discussion). But
very often documenting metalinguistic knowledge will involve the use of a
broad array of elicitation strategies, guided by current theories about different
kinds of metalinguistic knowledge and their structure. One very important
type of elicited evidence consists of monolingual definitions of word meanings
provided by native speakers. See Chapters 3 and 6 for further discussion
and exemplification.
The documentation of metalinguistic knowledge as understood here in-
cludes much of the basic information that is needed for writing descriptive
grammars and dictionaries. In particular, it includes all kinds of elicited
data regarding the grammaticality or acceptability of phonological or mor-
phosyntactic structures and the meaning, use, and relatedness of lexical
items. However, it should be clearly understood that documentation here
means that the elicitation process itself is documented in its entirety, in-
cluding the questions asked or the stimuli presented by the researcher as
well as the reaction by the native speaker(s). That is, documentation per-
tains to the level of primary data which provide evidence for metalinguistic
knowledge, i.e. what native speakers can actually articulate regarding their
linguistic practices or their recordable reactions in experiments designed to
probe metalinguistic knowledge.
A grammatical rule as stated in a grammar
or an entry in a published dictionary are not primary data in this sense, even
though some linguists may believe that they are part of a native speaker's
(unconscious) metalinguistic knowledge. In this view, grammatical rules
and dictionary entries are analytical formats for metalinguistic knowledge.
Whether and to what extent these have a place in a language documentation
is an issue we will take up in Section 4.2.
It is also worth noting that the documentation of observable linguistic
behavior and metalinguistic knowledge are similar in that they basically
consist of records of communicative events. In the case of observable lin-
guistic behavior, the communicative event involves the interaction of native
speakers among themselves, while in the case of metalinguistic knowledge
it involves the interaction between native speakers and documenters. There
is a superficial difference with regard to the preferred documentation for-
mat in that it is now standard practice to make (video) recordings of ob-
servable linguistic behavior, while for the elicitation of metalinguistic knowl-
edge it is still more common simply to take written notes. In principle,
(video-)recording would also be the better (i.e. more reliable and compre-
hensive) documentation format for elicited metalinguistic knowledge, but
there may often be practical reasons to stay with paper and pencil (among
other things, native speakers may be more comfortable discussing
metalinguistic knowledge without being constantly recorded). But, to repeat,
regardless of the recording method, records of observable linguistic behavior and
metalinguistic knowledge both contain primary data documenting linguistic
interactions in which native speakers participate.
In the following, we will use the label "corpus of primary data" as a
shorthand for "corpus of recordings of observable linguistic behavior and
metalinguistic knowledge" for this component of a language documentation.
Throughout this book it is assumed that this corpus is stored and made
available in digital form.
To date, there is very little practical experience with regard to structuring
and maintaining such digital corpora. Consequently, no widely-used and
well-tested structure exists for them. Within the DoBeS program, it is a
widespread practice to operate with two basic components in structuring
primary data: records of individual communicative events and a lexical
database (this obviously follows a widespread practice in linguistic field-
work where apart from transcripts of recordings and fieldnotes the compila-
tion of a lexical database is a standard procedure).
Records of individual communicative events are called "sessions"
(alternative terms would be "document", "text", or "resource bundle"). In the
manual for the IMDI Browser, a session is defined as "a meaningful unit of
analysis, usually [...] a piece of data having the same overall content, the
same set of participants, and the same location and time, e.g., one
elicitation session on topic X, or one folktale, or one 'matching game', or
one conversation between several speakers". It could also be the recording of a
two-day ceremony. Sessions are typically allocated to different sets defined
according to parameters such as medium (written vs. spoken), genre (mono-
logue, dialogue, historical, chatting, etc.), naturalness (spontaneous, staged,
elicited, etc.), and so on. It is too early to tell whether some of the various
corpus structures currently being used are preferable to others.
There are two reasons why a lexical database appears to be a useful
format for organizing primary data. On the one hand, there is a need to
bring together all the information available for a given item so that one can
make sure that the meaning and formal properties of the item are well
understood. On the other hand, and perhaps more importantly, a list of lexical
items is a very useful resource when working on the transcription and trans-
lation of recordings. One of the most widely used computational tools in
descriptive linguistics, the program Toolbox (formerly Shoebox), allows for
the semi-automatic compilation of a lexical database when working through
a transcript, and the existence of this program is certainly one reason why
the compilation of a lexical database currently is almost an automatic pro-
cedure when working with recordings. However, as with all other aspects
of organizing a digital corpus of primary data, it remains to be seen and
tested further whether this is indeed a necessary and useful procedure.
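The idea of compiling a list of lexical items while working through transcripts can be illustrated with a minimal sketch. The Python below is not a description of how Toolbox works internally; the function name build_wordlist, the session names, and the word forms are invented for illustration, and splitting on word characters is only a crude stand-in for tokenization in an actual orthography.

```python
import re
from collections import defaultdict


def build_wordlist(transcripts):
    """Map each word form to its frequency and the sessions it occurs in.

    `transcripts` is a dict of session name -> transcribed text
    (a hypothetical input format, assumed for this sketch).
    """
    entries = defaultdict(lambda: {"count": 0, "sessions": set()})
    for session_name, text in transcripts.items():
        # Naive tokenization: lowercase, then take runs of word characters.
        for form in re.findall(r"\w+", text.lower()):
            entries[form]["count"] += 1
            entries[form]["sessions"].add(session_name)
    return dict(entries)


# Invented placeholder data, not taken from any real language.
transcripts = {
    "folktale_01": "da kuna da",
    "elicitation_03": "de kuna",
}
wordlist = build_wordlist(transcripts)

for form in sorted(wordlist):
    entry = wordlist[form]
    print(form, entry["count"], sorted(entry["sessions"]))
```

Each entry points back to the sessions in which the form occurs, so the information available for an item can be brought together from across the corpus. A real lexical database would also need glosses, morphological analysis, and manual checking of each entry.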
3.1.2. Apparatus
Inasmuch as linguistic and metalinguistic interactions cover the range of
basic interactional possibilities,
a documentation which contains a com-
prehensive set of primary data for both types of interactions is logically
complete with regard to the level of primary data. However, it is well
known that a large corpus of primary data is of little use unless it is pre-
sented in a format which ensures accessibility for parties other than the
ones participating in its compilation. To be accessible to a broad range of
users, including the speech community, the primary data need to be accom-
panied by information of various kinds, which following philological
tradition could be called the apparatus. The precise extent and format of
the apparatus is a matter of debate, with one exception: the uncontroversial
need for metadata.
Metadata are required on two levels. First, the documentation as a whole
needs metadata regarding the project(s) during which the data were com-
piled, including information on the project team(s), and the object of docu-
mentation (which variety? spoken where? number and type of records; etc.).
Second, each session (=segment of primary data) has to be accompanied
by information of the following kind:
- a name of the session which uniquely identifies it within the overall
  corpus;
- when and where was the data recorded?;
- who is recorded and who else was present at the time?;
- who made the recording and what kind of recording equipment was used?;
- an indication of the quality of the data according to various parameters
  (recording environment and equipment, speaker competence, level of
  detail of further annotation);
- who is allowed to access the data contained in this session?;
- a brief characterization of the content of the session (what topic is being
  talked about? what kind of communicative event [narrative, conversation,
  song, etc.] is being documented?);
- links between different files which together constitute the session, e.g. a
  media file (audio or video) and a file containing a transcription,
  translation, and various types of commentary relevant for interpreting the
  recording contained in the media file (on which see further below).
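The session-level information listed above can be pictured as a simple record. The following Python sketch is only an illustration: the class SessionMetadata, its field names, the helper functions, and all values are assumptions made here, and none of them is taken from the IMDI or OLAC standards.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SessionMetadata:
    name: str               # must uniquely identify the session in the corpus
    recorded_when: str      # when the data was recorded
    recorded_where: str     # where the data was recorded
    speakers: list          # who is recorded
    others_present: list    # who else was present at the time
    recorded_by: str        # who made the recording
    equipment: str          # what kind of recording equipment was used
    quality: str            # environment, equipment, speaker competence, ...
    access: str             # who is allowed to access the data
    content: str            # topic and kind of communicative event
    files: dict = field(default_factory=dict)  # role -> file name


def duplicate_names(sessions):
    """Return, in sorted order, every session name used more than once."""
    counts = Counter(s.name for s in sessions)
    return sorted(name for name, n in counts.items() if n > 1)


def make_session(name):
    """Build a record with invented placeholder values, for demonstration."""
    return SessionMetadata(
        name=name, recorded_when="2005-03-14", recorded_where="village X",
        speakers=["speaker A"], others_present=["speaker B"],
        recorded_by="documenter C", equipment="audio recorder",
        quality="quiet room; fluent speaker; transcription only",
        access="project team", content="folktale (narrative)",
        files={"media": name + ".wav", "annotation": name + ".txt"},
    )


corpus = [make_session("folktale_01"), make_session("folktale_02"),
          make_session("folktale_01")]
print(duplicate_names(corpus))  # reports the name that was used twice
```

The uniqueness check reflects the first requirement in the list; the files mapping corresponds to the links between the media file and the file holding transcription and translation.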
The metadata on both levels have two interrelated functions. On the one
hand, they facilitate access to a documentation or a specific record within a
documentation by providing key access information in a standardized for-
mat (what, where, when, etc.). In this function, they are similar to a cata-
logue in a library and we can thus speak of a cataloguing function.
On the
other hand, they have an organizational function in that they define the
structure of the corpus which, in particular in the case of documentations in
digital format, in turn provides the basis for various procedures such as
searching, copying, or filtering within a single documentation or across a
set of documentations. Obviously, a metadata standard which targets the
organizational function has to be richer and more elaborate than one which
targets the cataloguing function. The former is actually a corpus manage-
ment tool, which defines digital structures and supports various computa-
tional procedures, rather than just a standard for organizing a catalogue.
Currently there exist two metadata standards which in fact complement
each other in that they target these different functions. The OLAC standard
targets exclusively the cataloguing function and provides an easy and fast
access to a large number of diverse repositories of primary data on a
worldwide scale (in both digital and non-digital formats). The IMDI stan-
dard, which incorporates all the information included in the OLAC standard
and hence is compatible with it, is actually a corpus management tool
which primarily targets digitally archived language documentations. Further
discussion of metadata concepts and standards is found in Chapters 4 and
Apart from metadata, there is in most instances also a need for further
information accompanying each recording as well as the documentation as
a whole in order to make the corpus of primary data useful to users who do
not know the language being documented. On the level of individual ses-
sions, such additional information is called here an annotation.
Thus, in
the case of audio or video recordings of communicative events, it is obvi-
ously useful to provide at least a transcription and a translation so that users
not familiar with the language are able to understand what is going on in
the recording.
However, the exact extent and format of the annotations that should be
included in each session is a matter of debate. It is common to distinguish
between minimal and more elaborate annotation schemes. A widely as-
sumed minimal annotation scheme consists of just a transcription and a free
translation which should accompany all, or at least a substantial number of,
primary data segments. More elaborate annotation schemes include various
levels of interlinear glossing, grammatical as well as ethnographical
commentary, and extensive cross-referencing between the various sessions and
resources compiled in a given documentation. See further Chapters 8 and 9.
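The contrast between minimal and more elaborate annotation schemes can be sketched as a small data structure. This is a hedged illustration only: the class AnnotatedSegment, its fields, the function annotation_level, and the example strings are hypothetical and do not correspond to any established annotation format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AnnotatedSegment:
    transcription: str                 # required in the minimal scheme
    translation: str                   # free translation, also required
    gloss: Optional[str] = None        # interlinear glossing (elaborate scheme)
    commentary: list = field(default_factory=list)  # grammatical/ethnographic


def annotation_level(segment):
    """Classify a segment as 'incomplete', 'minimal', or 'elaborate'."""
    if not (segment.transcription and segment.translation):
        return "incomplete"
    if segment.gloss or segment.commentary:
        return "elaborate"
    return "minimal"


# Invented placeholder segments, not data from a real language.
bare = AnnotatedSegment("kuna da", "")
minimal = AnnotatedSegment("kuna da", "placeholder free translation")
elaborate = AnnotatedSegment(
    "kuna da", "placeholder free translation",
    gloss="placeholder interlinear gloss",
    commentary=["note on the speech situation"],
)

for seg in (bare, minimal, elaborate):
    print(annotation_level(seg))
```

A check of this kind could be used to report how much of a corpus has reached the minimal level, which is the level the text suggests should accompany all, or at least a substantial number of, primary data segments.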
On the level of the overall documentation, information accompanying
the primary data set other than metadata is, for lack of a well-established
term, subsumed here under the heading "general access resources"
(alternatively, it could also simply be called "annotation"). Such "general"
(in the sense of: relevant for the documentation as a whole) access resources
would include:
- a general introduction which provides background information on the
  speech community and language (language name(s), affiliation, major
  varieties, etc.), the fieldwork setting(s), the methods used in recording
  primary data, an overview of the contents, structure, and scope of the
  primary data corpus and its quality;
- brief sketches of major ethnographic and grammatical features being
  documented;
- an explication of the various conventions that are being used
  (orthography, glossing abbreviations, other abbreviations);
- indices for languages/varieties, key analytic concepts, etc.;
- links and references to other resources (books and articles previously
  published on the variety or community being documented; other
  projects relating to the community or its neighbors, etc.).
For further discussion of some aspects of relevance here, see Chapters 8
and 12.
Table 1 provides a schematic overview of the components of the lan-
guage documentation format sketched in this section.
Table 1. Basic format of a language documentation

Primary data
Apparatus
    Per session
        Metadata
            - time and location of recording
            - recording team
            - recording equipment
            - content descriptors
        Annotations
            - further linguistic and ethnographic glossing and commentary
    For documentation as a whole
        Metadata
            - location of documented [...]
            - project team(s) contributing to documentation
            - participants in documentation
        General access resources
            - orthographical conventions
            - ethnographic sketch
            - sketch grammar
            - glossing conventions
            - links to other resources
3.2. What's new?
Language documentation in the way depicted in Table 1 is not a totally new
enterprise. The compilation of annotated collections of written historical
documents and culturally important speech events (legends, epic poems,
and the like) was the major concern of philologists in the nineteenth cen-
tury. Linguistic and anthropological fieldwork in the Boasian tradition has
also always put major emphasis on the recording of speech events. Within
linguistic anthropology, recording and interpreting oral literature is a major
task. All of these traditions have had a major influence on documentary
linguistics as developed in this book.
Nevertheless, the idea of a language documentation as sketched above is
new for mainstream linguistics, and even compared to these earlier ap-
proaches, it is new with regard to the following important features:
- Focus on primary data: The main goal of a language documentation is
  to make primary data available for a broad group of users. Unlike in the
  philological tradition, there is no restriction to culturally or historically
  important documents, however such importance may be defined.
- Explicit concern for accountability: The focus on primary data implies
  that considerable care is given to the issue of making it possible to
  evaluate the quality of the data. This in turn implies that the field
  situation is made transparent and that all documents are accompanied by
  metadata which detail the recording circumstances as well as the further
  steps undertaken in processing a particular document.
- Concern for long-term storage and preservation of primary data: This
  involves two aspects. On the one hand, metadata are crucial for users of
  a documentation to locate and evaluate a given document, as just
  mentioned. On the other hand, long-term storage is essentially a matter of
  technology, and while compilers of language documentations do not
  have to be able to handle all the technology themselves, they need to
  have a basic understanding of the core issues involved so that they avoid
  basic mistakes in recording and processing primary data. Among other
  things, the quality of the recording is of utmost importance for
  long-term storage and hence needs explicit attention. See further
  Chapters 4, 13, and 14.
- Work in interdisciplinary teams: Work on a truly comprehensive
  language documentation needs expertise in a multitude of disciplines in
  addition to the basic linguistic expertise required in transcription and
  translation. Such disciplines include anthropology, ethnomusicology, oral
  history and literature, as well as all the major subdisciplines of
  linguistics (socio- and psycholinguistics, phonetics, discourse analysis,
  corpus linguistics, etc.). There are probably no individuals who are
  experts in all of these fields, and few who have acquired significant
  expertise in a substantial number of them. Hence, good documentation work
  usually requires a team of researchers with different backgrounds and
  areas of expertise.
- Close cooperation with and direct involvement of speech community:
  The documentation format sketched above strongly encourages the active
  involvement of (members of) the speech community in two ways. On
  the one hand, as mentioned above, native speakers are among the main
  players in determining the overall targets and outcomes of a
  documentation project. On the other hand, a documentation project involves
  a significant number of activities which can be carried out with little or
  no academic training. For example, the recording of communicative events
  can be done by native speakers who know how to handle the recording
  equipment (which can be learned in a very short time), and it is often
  preferable that they do such recordings on their own because they know
  where and when particular events happen, and their presence is
  frequently felt to be less obtrusive. Similarly, given some training and
  regular supervision, the recording of metalinguistic knowledge and also
  the transcription and translation of recordings can be carried out by
  native speakers all by themselves. See further Chapter 3.
3.3. Limitations
As with most other scientific enterprises, the language documentation for-
mat developed here is not without problems and limitations. Some of the
theoretical and practical problems have already been mentioned in the pre-
ceding discussion, and it will suffice here to emphasize the fact that the
documentation format in Table 1 is based on a number of hypotheses which
may well be proven wrong or unworkable in practical terms (see further
Section 4 below). In addition to theoretical and practical problems, there
are also ethical problems and limitations which are related to the fact that
even the most circumspectly planned documentation project has the poten-
tial to profoundly change the social structure of the society being docu-
mented. This may pertain to a number of different levels, only two of
which are mentioned here (see Wilkins 1992, 2000; Himmelmann 1998;
and Grinevald 2003: 60–62 for further discussion).
On a somewhat superficial level, there are usually a few, often not more
than one or two native speakers who are very actively involved in the pro-
ject work. Through their work in the project, their social and economic
status may change in a way that otherwise may have been impossible. This
in turn may lead to (usually minor) disturbances in the wider community,
such as inciting the envy or anger of relatives and neighbors. It is also not
unknown that affiliation with an externally funded and administered project
is used as an instrument in political controversies and competitions within
the speech community.
On a more profound level, in non-literate societies the documentation of
historical, cultural, and religious knowledge generally introduces a new way
for accessing such knowledge and thereby may change the whole psycho-
social fabric of the society (Ong 1982). This is particularly true of societies
where much of the social fabric depends on highly selective access to cultural
and historical knowledge, transmission of such knowledge thus involving
different levels of secrecy (see Brandt [1980, 1981] for a pertinent example).
That is, in some instances a documentation project may contribute to the
demise of the very linguistic and cultural practices it proposes to document.
In these instances, it would appear to be preferable not to document, but
rather to support language maintenance in other ways, if necessary and
possible.
Note that in general, language documentation and language maintenance
efforts are not opposed to each other but go hand in hand. That is, it is an
integral part of the documentation framework elaborated in this book that it
considers it an essential task of language documentation projects to support
language maintenance efforts wherever such support is needed and wel-
comed by the community being documented. More specifically, the docu-
mentation should contain primary data which can be used in the creation of
linguistic resources to support language maintenance, and the documenta-
tion team should plan to dedicate a part of its resources to mobilizing the
data compiled in the project for maintenance purposes. Chapter 15 elabo-
rates some of the issues involved here.
4. Alternative formats for language documentations
The format for language documentations sketched in the preceding section
is certainly not the only possible format. In fact, within structural linguistics
there is a well-established format for language documentations consisting
primarily of a grammar and a dictionary. In this section, I will first briefly
present some arguments as to why this well-established format is strictly
speaking a format for language description and not for language documen-
tation proper, and thus is not a viable alternative to the basic documentation
format of Table 1. In Section 4.2, we will then turn to the question of
whether it makes sense to integrate the grammar-dictionary format with the
basic documentation format of Table 1 and thus make fully worked-out
grammars and dictionaries essential components of language documentations.
It should be clearly understood that this section is merely intended to draw
attention to this important topic at the core of documentation theory. It
barely scratches the surface of the many complex issues involved here. For
more discussion, see Labov (1975, 1996), Greenbaum (1984), Pawley
(1985, 1986, 1993), Lehmann (1989, 2001, 2004b), Mosel (1987, 2006),
Himmelmann (1996, 1998), Schütze (1996), Keller (2000), Ameka et al.
(2006), among others.
4.1. The grammar-dictionary format
The grammar-dictionary format of language description targets the language
system. That is, it is based on the notion of a language as an abstract
system of rules and oppositions which underlies the observable linguistic
behavior. In this view, documenting a language essentially involves compiling
a grammar (=set of rules for producing utterances) and a dictionary (=a
list of conventional form-meaning pairings used in producing these utter-
ances). To this core of the documentation, a number of texts are often
added, either in the form of a text collection or in the appendix to the gram-
mar, which have the function of extended examples for how the system
works in context. These texts are usually taken from the corpus of primary
data on which the system description is based, but they do not actually pro-
vide access to these primary data because they are edited in various ways.
Providing direct access to the complete corpus of primary data is typically
not part of this format.
The compilation of grammars and (to a lesser extent) dictionaries is a
well-established practice in structural linguistics, with many fine specimens
having been produced in the last century. But even the best structuralist
grammars and dictionaries have been lacking with regard to the goal of
presenting a lasting, multipurpose record of a language. Major problems
with regard to this goal include the following points:
a. Many communicative practices found in a given speech community
remain undocumented and unreconstructable. That is, provided with a
grammar and a dictionary it is still impossible to know how the language
is (or was) actually spoken. For example, it is impossible to derive
from a grammar and a dictionary what everyday conversational routines
look like (how does one say "hello, good morning"?) or how one
linguistically interacts when building a house or negotiating a marriage.
b. In line with the structuralist conception of the language system, gram-
mars and dictionaries contain abstractions based on a variety of analyti-
cal procedures. With the data contained in grammars and dictionaries,
most aspects of the analyses underlying the abstractions are not verifi-
able or replicable. There is no way of knowing whether fundamental
mistakes have been made unless the primary data on which the analyses
build are made available in toto as well.
c. Grammars usually only contain statements on grammatical topics which
are known and reasonably well understood at the time of writing the
grammar. Thus, for example, grammars written before the advent of
modern syntactic theories generally do not contain any statements re-
garding control phenomena in complex sentences. Many topics of cur-
rent concern such as information structure (topic, focus) or the syntax
and semantics of adverbials have often been omitted from descriptive
grammars due to the lack of an adequate descriptive framework. As
pointed out in particular by Andrew Pawley (1985, 1993, and elsewhere),
there is a large variety of linguistic structures often subsumed under the
heading of speech formulas which do not really fit the structuralist idea
of a clean divide between grammar and dictionary and thus more often
than not are not adequately documented in these formats.
d. Grammars and (to a lesser extent) dictionaries provide little that is of
direct use to non-linguists, including the speech community, educators,
and researchers in other disciplines (history, anthropology, etc.).
These points of critique mostly pertain to the fact that structuralist language
descriptions are reductionist with regard to the primary data on which they
are based and do not provide access to them. Or, to put it in a slightly dif-
ferent and more general perspective, they document a language only in one
of the many senses of language, i.e. language as an abstract system of
rules and oppositions. Inasmuch as structuralist language descriptions are
intended to achieve just that, the above critique is, with the possible ex-
ception of point (b), not fair in that it targets goals for which these descrip-
tions were not intended.
In this regard, it should be emphasized that the above points in no way
question the usefulness and relevance of descriptive grammars and diction-
aries with regard to their main purpose, i.e. to provide a description and
documentation of a language system. While there is always room for im-
provement (compare points (b) and (c) above), there is no doubt about the
fact that grammars and dictionaries are essentially successful in delivering
system descriptions. What is more, the above points also do not imply that
grammars and dictionaries do not have a role to play in language documen-
tations, as further discussed in the next section. The major thrust of the
critical observations above is that a description of the language system as
found in grammars and dictionaries by itself is not good enough as a lasting
record of a language, even if accompanied by a text collection. And it is
probably fair to say that the way primary data have been handled in the
grammar-dictionary format is now widely seen as not adequate and thus in
need of improvement.
From this assessment, however, it does not necessarily follow that the
basic format of Table 1 is the only imaginable format for lasting, multipur-
pose records of a language. Instead, it may reasonably be asked, why not
combine the strong sides of the two formats discussed so far and propose
that language documentations consist of the combination of a large corpus
of annotated primary data as well as a full descriptive grammar and a
comprehensive dictionary? This is the question to be addressed in the next
section.
4.2. An extended format for language documentations
Assuming that the structuralist notion of a language as a system of rules
and oppositions is a viable and useful notion of a language, though not
necessarily the only useful and viable one for documentary purposes, and
assuming further that a descriptive grammar and a dictionary provide ade-
quate representations of this system, it would seem to follow that a truly
comprehensive language documentation does not simply consist of a large
corpus of annotated primary data as sketched in Section 3 but instead
should also include a comprehensive grammar and dictionary. Along the
same lines, one may ask why the apparatus in Table 1 should only contain a
sketch grammar and not a fully worked-out comprehensive grammar, thus
replacing the format in Table 1 with the one in Table 2.
Table 2. Extended format for a language documentation

Primary data
Apparatus
    Per session
        Annotations
            - further linguistic and ethnographic glossing and commentary
    For documentation as a whole
        General access resources
            - orthographical conventions
            - glossing conventions
            - links to other resources
        Descriptive analysis (shaded area)
            - descriptive grammar
The difference between the basic format for language documentations in
Table 1 and the extended format depicted in Table 2 pertains to the addition
of fully worked out descriptive analyses on various levels (as indicated by
the shaded area in Table 2), replacing the corresponding sketch formats
(sketch grammar, ethnographic sketch) under general access resources in
the basic format. Whether this is in fact a fundamental difference or rather a
gradual difference in emphasis, is a matter for further debate. In actual
practice, the difference may not be as relevant as it may appear at first sight,
as we will see at the end of this section. Still, in the interest of making clear
what is involved here, it will be useful to highlight the differences between
the two formats and to indicate some of the problems that are created by
incorporating comprehensive descriptive formats in the extended documen-
tary format. There are at least two types of such problems, one relating to
theoretical issues, the other to research economy.
The theoretical problem pertains to the fact that it is not at all clear how
exactly the descriptive grammar (or the ethnography or the dictionary)
should look that is to be regarded as an essential part of a language docu-
mentation. As is well known, for much studied languages such as English,
Latin, Chinese, Arabic, Tagalog, Quechua, or Fijian, there exist not only
different types of grammars (pedagogical, historical, descriptive) but also
different descriptive grammars, each having its particular emphasis and
way of presenting the structure of the language system. This simply reflects
the fact that at least according to the current state of knowledge, there is not
just exactly one descriptive grammar which correctly and comprehensively
captures the system of a language. Instead, any given descriptive grammar
is a more or less successful attempt to capture the system of a language
(variety), rarely if ever comprehensive, and usually also including at least
some contested, if not clearly wrong, analyses.
As a consequence of this state of affairs, the following problem arises
with regard to the extended format for language documentations in Table 2.
Either one has to specify a particular type of descriptive grammar as the
one which is the most suitable one for the purposes of language documenta-
tions and thus is able to provide a reasonably precise definition of this part
of a documentation. Alternatively, one allows for a multitude of descriptive
grammars to be included in a documentation, thus declaring it a desirable
goal to include a number of different analyses of the language system as part
of the overall documentation of a language. The latter option clearly raises
the issue of practical feasibility, which leads us to the second problem men-
tioned above, i.e. the essentially pragmatic problem of research economy.
Practical feasibility also is an issue if just one analysis of the grammati-
cal system is assumed to be an essential part of a language documentation,
for the following reason. It is a well-known fact that it is possible to base
elaborate descriptive analyses exclusively on a corpus of texts (either texts
written by native speakers or transcripts of communicative events) and
most good descriptive grammars are based to a large degree on a corpus of
(mostly narrative) texts. A large corpus of texts in fact provides for the pos-
sibility of writing a number of interestingly different descriptive grammars,
targeting different components of the language system and their interrela-
tion. Consequently, one could argue that even if one accepts the claim that a
comprehensive documentation should also document the language system,
there is no need to include a fully worked-out descriptive grammar in a
language documentation. The information needed to write such a grammar
is already contained in the corpus and the resources needed to extract this
information and to write it up in the conventional format of a descriptive
grammar are not properly part of the documentation efforts. In this view,
resources allocated to documentation should not be wasted on writing a
grammar but are better spent on enlarging the corpus of primary data, the
quantity or quality of annotations, or on the mobilization of the data (mo-
bilization is further discussed in Chapter 15).
The major counterargument against this position would be the claim that
actually producing a descriptive grammar is a necessary part of a language
documentation because otherwise, essential aspects of the language system
would be left undocumented. The evaluation of this claim rests on the ques-
tion of whether there is some kind of important evidence for grammatical
structure which, as a matter of principle, cannot be extracted from a suffi-
ciently large and varied corpus of primary data as sketched in Section 3
above. As far as I am aware, there is especially one type of evidence of this
kind, i.e. negative evidence. Obviously, illicit structures cannot be attested
even in the largest and most comprehensive corpora.
However, the lack of explicit negative evidence in a corpus of texts does
not per se necessitate the inclusion of a descriptive grammar in a language
documentation. On the one hand, with regard to the usual way of obtaining
negative evidence (i.e. asking one or two speakers whether examples x, y, z
are okay), it is doubtful whether this really makes a difference in quality
compared to evidence provided by the fact that the structure in question is
not attested in a large corpus. Elicited evidence is only superior here if it is
very carefully elicited, paying adequate attention to the sample of speakers
interviewed, potential biases in presenting the material, and the like. On the
other hand, and more importantly, the basic documentation format of Table
1 consists not only of a corpus of more or less natural communicative
events but also of documents recording metalinguistic knowledge. Metalin-
guistic knowledge includes negative evidence for grammatical structuring,
as already mentioned above.
Obviously, gathering negative evidence on grammatical matters presup-
poses that the researcher asks the right questions, which in turn presupposes
grammatical analysis. In this regard, it bears emphasizing that documenta-
tion does not exclude analysis. Quite the opposite: analysis is essential. What
the documentary approach implies, however, is that the analyses which are
carried out while compiling a documentation do not necessarily have to be
presented in the format of a descriptive grammar. Instead, analyses can (or
should) be included in a documentation through (scattered) annotations on
negative evidence, the inclusion of experiments generating important evi-
dence for problems of grammatical or semantic analysis, and so on (see
further Chapters 8 and 9).
The major reason for choosing a distributed grammatical annotation
format instead of the established descriptive grammar format is one of time
economy. The writing of a descriptive grammar involves to a substantial
degree matters of formulation (among other things, the search for the most
suitable terminology) and organization (for example, chapter structure or
the choice of the best examples for a given regularity; see Mosel 2006 for
further discussion and exemplification). These are very time-consuming
activities which in some instances may enhance the analysis of the lan-
guage system, but in general do not contribute essential new information on
it. Thus, with regard to the economy of research resources, it may be more
productive to spend more time on expanding the corpus of primary data
rather than to use it for writing a descriptive grammar.
In short, then, the difference between the basic and the extended formats
as conceived of here is one between different formats or styles for the
inclusion of analytical insights in a documentation. In the basic format,
analyses are included in the form of scattered annotations and cross-
references between sessions (and, of course, indirectly also by the fact that
for topics for which little or no data can be found in the recordings of
communicative events, elicited primary data are included). In the extended
format, analyses are presented as such in full, i.e. as descriptive statements
about the language system, usually accompanied by (links to) relevant examples.
In actual practice, there will be many instances where this apparently
clear difference will become blurred. For example, when the number and
types of communicative events that can be recorded in a given community
are severely limited, it may be more useful to work on full, and fully explicit,
descriptions of aspects of the grammatical system not represented in the
texts, rather than recording more texts of the same kind with the same
speaker. Furthermore, on a much more mundane level, there are (individu-
ally widely diverging) limits as to the time and energy that can be produc-
tively spent on the not always thrilling routine work involved in documen-
tation (filling in metadata, checking translations and glossing, etc.), and it
would be a counterproductive and rather ill-conceived idea generally to
restrict work with a speech community to pure documentation to the ex-
clusion of all fully explicit (=publishable) analytic work. It is thus unlikely
that linguists undertaking language documentations will stick to the basic
format in its purest form and refrain from working on aspects of a fully
explicit descriptive analysis while compiling the annotated corpus of primary
data. It should, then, also not come as a surprise that many researchers,
including some of the contributors to this volume, tend to ignore the
difference between the two formats and to remain implicit as to what exactly
they have in mind when referring to grammatical analyses and dictionaries.
Most language documentations that have been compiled in recent years
are actually hybrids with regard to the two formats. They tend to include
many scattered analytical observations as well as substantial fully worked-
out descriptive statements of some aspects of the language system (rarely
comprehensive grammars). It remains to be seen whether this practice is
actually viable in the long term or whether there are clear advantages attached
to adhering to either the basic or the extended format as discussed in
this section.
5. The structure of this book
The following chapters provide in-depth discussions and suggestions for
various issues arising when working on and with language documentations.
While the authors have slightly different views of what a language docu-
mentation is (or should be) and clearly differ with regard to their major
topics of interest and theoretical preferences, they share a major concern for
the maintenance of linguistic diversity, including the quality, processing,
and accessible preservation of linguistic primary data, which in some way
or other all these chapters are about.
The focus of each chapter is on a topic which is rarely dealt with within
descriptive linguistics (and mainstream linguistics in general), reflecting the
fact that issues relating to the collection and processing of primary data
have been widely neglected within the discipline until very recently. For
each topic, both theoretical and practical issues are discussed, although the
chapters differ quite significantly as to how much space they allot to either,
in accordance with the topic being dealt with.
Apart from the present introduction, there are roughly four parts to this
book which, however, are closely linked to, and overlap with, each other.
Chapters 2 to 4 deal with general (i.e. not specifically linguistic) ethical
and practical issues which have to be considered and reconsidered from the
earliest planning stage of a documentation project through to its completion.
The guiding questions here are: How to interact with speech communities
and individual speakers; and how to capture, store, and process relevant
data. These issues are interrelated, in that data capture and processing is not
just a technological issue, but also has to pay attention to sensitivities and
interests of the speech community and the individual speakers contributing
data. Chapter 3 includes suggestions for getting started with the actual lin-
guistic documentation work in the field.
The next eight chapters (Chapters 5 to 12) pertain to the recording and
processing of primary linguistic data from an anthropological and linguistic
point of view. The first three of these chapters (Chapters 5 to 7, but also a
considerable part of Chapter 8) are primarily concerned with the issue of
how and what to document, given the goal of creating a lasting and multi-
functional record of a language. Chapter 5 provides an introduction to a
cultural and ethnographic understanding of language. This is essential for
the success of a documentation project, not only with regard to the neces-
sity of being able to identify the types of communicative events that should
be recorded, but also for being able to successfully interact within a speech
community which has a different set of norms of interaction. In the latter
regard, Chapter 5 complements and expands Chapters 2 and 3.
Chapter 6 addresses the issue of how to access and represent meta-
linguistic knowledge, focusing primarily on lexical knowledge. Chapter 7
briefly discusses the kinds of data needed for prosodic analysis, while
Chapter 8 reports on the demands of anthropologists for language docu-
mentations, which complements the discussion of this topic in Chapter 5.
Chapter 8 also addresses the issue of ethnographically relevant annota-
tion and commentary and thus forms a group with the next four chapters
(Chapters 9 to 12) all of which are concerned with the part of a documenta-
tion called apparatus in Table 1. That is, they deal with the processing of
primary data necessary for them to become useful and accessible to a broad
range of users. While Chapters 8 and 9 provide an overview of the basic
structure and various practical aspects of ethnographic and linguistic anno-
tation and commentary, respectively, the following two chapters address
some more specific issues with regard to the written representation of re-
corded communicative events. Chapter 10 is concerned with one major
aspect of transcription, namely, the need to segment the continuous flow of
spoken language into smaller units, in particular words and intonation
units. Issues relating to the development of a practical orthography which
can be used for the written representation of the recordings, for educational
materials, etc., and which is acceptable and accessible to the speech com-
munity are discussed in Chapter 11. The final chapter in this part of the
book, Chapter 12, discusses the structure and format of the sketch grammar
which is part of the overall apparatus of the documentation, intended to
facilitate access to the primary data themselves as well as the grammatical
information to be found in the sessions and the lexical database.
The last part of the book, consisting of the final three chapters, relates to the
long-term perspectives of a documentation, in particular, archiving issues
and its use in language maintenance. Apart from an obvious focus on tech-
nological issues, the main concern of Chapter 13 on Archiving challenges
is a critical review of the different interests and goals of the three major
groups involved in the archiving process: the donators (the people handing
material to the archive), the archivists (the people running and maintaining
the archive), and the users of archival sources. Chapter 14 takes up one
particularly critical issue in long-term preservation, i.e. the changing stan-
dards in character and text structure encoding which very easily render
digitally-stored information uninterpretable. Finally, Chapter 15 focuses on
speech communities as potential users and argues that there is a need for
elaborate and creative concepts for mobilizing primary data, i.e. creating
language resources from archival data which are of interest and use to a
given community.
There are a number of important topics which actually should also be
dealt with in a book such as the present one but which unfortunately and for
reasons beyond the control of the editors could not be included at this point.
In particular, the following three topics are also of critical importance to
language documentation (see the book's website for additional and up-to-date
information on these and other topics).
One major aspect of linguistic interactions which has to be attended to
in documentations are so-called paralinguistic features, in particular ges-
ture. The recent textbook on gesture by Kendon (2004) provides a thor-
ough general introduction to this topic. See also Section 2.5 in Chapter 9
for a brief note on paralinguistics.
There is no chapter on the basics of producing high-quality audio and
video recordings. While this topic in part involves a lot of technological
aspects which change rather rapidly and thus would in any event not
have been included in this book, there is a need to be aware of what defines
good recordings. In addition to the book's website, see the Language
Archiving Newsletter and the DoBeS and ELDP websites for
relevant pointers and links.
Apart from the kind of mobilization of primary data for language main-
tenance purposes discussed in Chapter 15, there are also more traditional,
but equally important contributions that a language documentation can
make to language maintenance efforts. These include, in particular, the
development of teaching materials in the documented variety. See von
Gleich (2005) for a brief discussion and references.
The book is also heavily biased towards the more narrowly linguistic ap-
proaches to language. Documentary work that aims at a truly comprehensive
record of a language also has to engage with ethnobotany, musicology,
human geography, oral history, and so on. We hope that it will be possible
before too long to compile a further introductory volume where the core
issues and methodologies of these and related disciplines are presented from
the point of view of enhancing language (and culture) documentations.
Even though the focus is on linguistic approaches to language, it should
be clearly understood that even for this domain the ability to engage in lan-
guage documentation projects cannot be gained by mastering only the topics
and techniques presented here. Ideally, training in language documentation
includes a training in the basics of a broad range of linguistic subdisciplines
and neighboring disciplines. Training in descriptive and anthropological
linguistics is indispensable.
The latter two topics are not dealt with here because good textbooks for
them are readily available. As for descriptive linguistics, the classic text-
books by Hockett (1958) and Gleason (1961) still provide an excellent in-
troduction which, however, should be complemented by typologically
grounded surveys of major categories and structures as, for example, in the
second edition of Shopen's Language Typology and Syntactic Description
or in Kroeger (2005). As for anthropological linguistics, Duranti (1997)
introduces the most important concepts and issues, which could be com-
plemented with the more in-depth discussion of the ethnography of com-
munication by Saville-Troike (2003). Finally, the contributions in Newman
and Ratliff (2001) combine descriptive and ethnolinguistic topics and in-
sights and complement the discussion of linguistic fieldwork in Chapters 2
and 3 of this volume.
In conclusion, it may be worthwhile to emphasize the fact that docu-
mentary linguistics is an emerging field where many things are still in flux.
Most importantly perhaps, large multimedia corpora on lesser-known languages
are very new and largely unexplored entities. It is quite possible
that new techniques for working with such corpora will emerge before too
long, requiring major adjustments to the format for language documenta-
tions discussed in this chapter and book. But rather than a shortcoming, this
should be seen as one of the exciting aspects of language documentation.
Apart from being a useful introduction to language documentation, providing
theoretical grounding as well as practical advice, this book should make it
clear that language documentation is an important, engaging and rewarding
enterprise with many repercussions for linguistics and other language-
related disciplines and projects.
I wish to thank my co-editors and Eva Schultze-Berndt for critical discus-
sion of many of the issues touched upon here, as well as helpful comments
on the draft version of this chapter.
Notes
1. With regard to the latter point, compare the following quote from Luraghi
(1990: 128 FN1) which nicely illustrates the problems arising when data types
are missing in a given corpus: "As to the position of the verb, the most important
difference [between main and subordinate clauses, NPH] lies in the absence
of VSO sentences in subordinate clauses. It can of course be objected that
this may be due simply to the shortage of sources, since VSO sentences are on
the whole very infrequent. However, in the light of comparative data from
other Indo-European languages, this objection could perhaps be rejected."
2. The major limitation here are restrictions on access to recordings imposed by
speakers or communities which, of course, should be observed.
3. "Experiment" here is to be taken in a broad sense, including, for example, the
testing of the acceptability of invented examples.
4. IMDI = ISLE Metadata Initiative. The manual can be downloaded at http://
5. Note that this does not necessarily imply that all the information for a lexical
item has to be gathered in a single location (i.e. an entry in the database), as it
is currently done by most researchers. Alternatively, the lexical database could
consist simply of links to all the sessions where the item in question occurs.
This could include a session where the item is elicited as part of the elicitation
of a word list or semantic field, a session where the item has been recorded in a
list of items or a carrier phrase in order to document characteristic sound pat-
terns, and a session where it occurs as part of a procedural text.
6. Please refer to the appendix for further information on this program.
7. Note that linguistic interaction here includes interactions with native speakers
of other varieties inasmuch as they are a common occurrence in the speech
community which is being documented.
8. The following list takes an audio or video recording as its main example. Of
course, the same type of metadata is needed for primary data gathered in a dif-
ferent way such as written fieldnotes or photos.
9. Note that the term cataloguing is used here in a somewhat broader sense than
in Chapter 4 where it is used to refer to one particular subtype of metadata.
10. Strictly speaking, annotations could also be called metadata since the term
metadata in general refers to all kinds of data about data. However, within the
context of language documentations it is useful to distinguish between different
types of metadata (in this broad sense), and it is now a widely-used practice to
use the term metadata in the context of language documentations exclusively
for data types which have a cataloguing or organizational function and to use
annotation (or commentary) for other types of information accompanying
segments of primary data.
11. The structuralist idea of language as an abstract system has been articulated in
a variety of oppositions including the well-known Saussurean distinction of
langue vs. langage vs. parole and the Chomskyan distinction of competence
vs. performance. For the present argument, the details of how the abstract lan-
guage system is conceived of do not matter and thus are ignored.
12. With regard to falsifiability (point (b)), not providing access to the primary
data is indeed a major problem for the scientific status of these descriptions.
However, the basic assumption here appears to have been that whoever wanted
to replicate and possibly falsify a descriptive analysis on the basis of material
other than the one made available in examples and texts could compile their
own set of primary data. This assumption is no longer viable in the case of en-
dangered languages and, as already pointed out in Section 2, it is hence not by
chance that a close connection exists between language endangerment and the
recent increased concern for the preservation of primary data in linguistics and
related disciplines.
13. The part called descriptive analysis in the rightmost column could also be
added in other ways to the overall format, for example as an additional column
of its own, on a par with primary data and apparatus. While there are theo-
retical issues associated with these alternative overall organizations, these do
not play a role for the argument in this section and hence can be safely ignored.
14. Essentially the same points made here and in the following with regard to de-
scriptive grammars could also be made with regard to conventional dictionaries
and ethnographic monographs (see Chapter 6 for a brief discussion of different
types of dictionaries, which is also relevant here). Including these two other
main analytical formats in the discussion would, however, unnecessarily com-
plicate the exposition. Hence, dictionaries and ethnographies are not further
discussed in this section. The choice of descriptive grammars as the main
example is simply due to the fact that it is the format the author is most
familiar with.
15. Very occasionally, though, especially in the interaction between parents and
children, unacceptable or highly marked structures might be attested in
admonishments of the form: "Don't say X, say Y."
Chapter 2
Ethics and practicalities
of cooperative fieldwork and analysis
Arienne M. Dwyer
This chapter examines central ethical, legal, and practical responsibilities of
linguists and ethnographers in fieldwork-based projects. These issues span
all research phases, from planning to fieldwork to dissemination. We focus
on the process of language documentation, beginning with a discussion of
common ethical questions associated with fieldwork: When is documenta-
tion appropriate in a particular community, and who benefits from it? Which
power structures are involved, both in and out of the field? Section 1 ex-
plores key concepts of participant relations, rights, and responsibilities in
fieldwork in the context of ethical decision-making. It introduces a set of
guiding principles and examines some potential pitfalls. Section 2 discusses
the legal rights issues of data ownership (intellectual property rights and
copyright) and data access. Such information aids planning before field-
work and especially the archiving phase.
Sections 3 and 4 cover the more concrete practical aspects of the field-
work situation: developing a relationship with a speech community and
organizing and running a project. We survey what may be termed the five
Cs critical to planning and executing a project: criteria (for choosing a
field site), contacts, cold calls, community, and compensation. Finally, since
even the best-planned projects encounter logistical and interpersonal chal-
lenges, we present several generic case studies and some possible methods
of resolving such disputes.
Such ethical and logistical planning is essential to successful commu-
nity-centered knowledge mobilization, from which documentation products
useful for both academics and community members are produced in an
environment of reciprocity. It is the linguist's responsibility to focus on
process (Rice 2005: 9) as much as the end goals.
1. Ethics
1.1. Research as mediation
Ethical behavior is often assumed to flow intuitively from the noble goals
of scientific research. Most fieldworkers consider themselves well-inten-
tioned, rational people. But have all participating individuals and groups
been considered in these research goals? Have their ethical standards been
Fieldwork methodology has in the last decades progressed from a typi-
cally non-cooperative model (research on a community) to a cooperative
model which in its strongest form explicitly empowers speech communities
(research on, for, and with a community) (Cameron et al. 1992: 22-24).
Assumptions about what is ethical for a particular field situation are best
avoided, especially assumptions on the part of the researcher about what
participants want. The researcher should also have a grasp of the legal
implications (local, national, and international) of data ownership. An
understanding of ethical and legal responsibilities also facilitates the building
of trust and thus a successful relationship with a community research
team. Finally, making ethical and legal premises explicit helps to anticipate
and avoid problems. A field researcher mediates between speakers, their
communities, and the fieldworker's own community, which includes an
institution, a funding body, and possibly an archive. Inevitably, all participants
in a language documentation will face ethical dilemmas, in which no
course of action seems quite satisfactory. There may be no right decision,
only [one] more right than the alternatives (Hill, Glaser and Harden
1995: 19).
Distilled to its essence, the ethics of field research entails indigenous
people and field researchers mediating each other's cultural imperatives.
This contextualization of ethical principles can only occur through produc-
tive mutual negotiation at the local level. The ethical principles presented
here may seem both imperious and overly generic, given that in this
chapter broad-brush principles are often preceded by the cajoling imperative
"should" or the bossy "must". But these are suggestions awaiting contextualization
in a particular research situation. And this mediation of ethical
principles by all participants forms the nucleus of any research project.
1.2. Normative ethics
The ethical decisions made during fieldwork belong to the domain of pro-
fessional ethics. Since many field research networks also create codes of
conduct, we are also concerned here with normative ethics. Normative
practices attempt to prescribe best-practice standards for field situations.
A research team might make the normative decision to adhere to a de-
tailed set of ethical principles determined in advance, asking is our aim
just to evaluate the resolution of past ethical dilemmas in the field by con-
sensus? Normative guidelines generally follow a deductive or an inductive
approach. Some researchers review such a list of field experiences and at-
tempt to achieve consensus on future ethical research behavior.
Another less normative approach might simply be to observe and note
the ethical dilemmas that appear. This descriptive list of relevant field di-
lemmas and how they were resolved could serve as a reference for future
field researchers. An example of a less normative approach is the "do no
harm" credo discussed below.
The dangers of excessive normativity are well known: colonial subjugation,
religious or cultural conversion-induced linguicide, and business profit
are all examples of normative frameworks which tend to be destructive.
Such frameworks are assumed by their proponents to be universally
held and universally beneficial.
1.2.1. Documenting endangered languages as a normative framework
Claiming that languages should be documented before they disappear is
also a normative act, and it is a framework in which not everyone believes.
But most researchers strongly support the documentation of endangered
languages, arguing that a decline in linguistic diversity constitutes a decline
in specific forms of knowledge and expression. Speakers of endangered
languages also often support such a normative framework, since language
is a central part of culture and of ethnic identity. Should a language be
documented when its speakers would prefer it to disappear? How should
community priorities and external western-scientific priorities be weighed?
Many would argue that documentation should make the language available
to future generations; most would also argue that both sets of priorities
should be accommodated, to the extent possible.
1.2.2. Balancing priorities
Since field linguistic situations are so diverse, one-size-fits-all codes of
conduct are impractical. Codes of conduct are voluntary and often largely
unenforceable, but good guidelines help ensure good working relationships
and a positive research outcome. For the sake of methodological transpar-
ency, and for smooth communications between all parties, some norms are
always part of the field experience.
Most research teams choose a pragmatic approach, making use of
explicit ethical guidelines as well as drawing on observations from specific
field experiences. No matter what form is chosen, research teams would do
well to make explicit the ethical norms of their particular project.
1.2.3. Normative ethics in language documentation
Individual teams should establish a code of ethical norms specific to their
particular area for a given research project. This code would encompass
detailed guidelines on consultation and negotiation between indigenous
people and researchers for all phases of the research, including planning
and dissemination.
Since such voluntary normative approaches have proven useful, the sci-
entific community can aim at establishing a two-tiered, flexible ethical code
for linguistic field research: a generic code of putatively universal ethical
norms, and, as above, a specific individual code for research on an ethnic
group in a particular area, created by individual researchers.
At present, linguists lack a generic code of conduct. Ideally, field linguists
will work with the country's linguists and social scientists to devise
this generic code. This code would be specific for field linguistics but could
be modelled on existing well-articulated guidelines (such as the Australian
Institute of Aboriginal and Torres Strait Islander Studies' Guidelines for
ethical research in indigenous studies [AIATSIS 2000], the African Studies
Association's Guidelines for ethical conduct in research and projects in
Africa [African Studies Association n.d.], and the American Anthropological
Association's Code of ethics [AAA 1998]). Though the above are designed
as regional codes, they are actually generic enough to be potentially
applicable to any world region.
A generic statement on ethical principles should address all phases of re-
search: planning, fieldwork, analysis, archiving, and end products. Planning
ethically for each phase entails assessing the roles played by participants
and the potential benefits and detriments of research; it also ideally includes
local participants' participation at every phase. In the planning phase,
researchers should identify all the potential participants (see Section 1.3
below), including sponsoring institutions, and estimate remuneration for local
participants. During fieldwork, the researchers establish and maintain rela-
tionships, and negotiate contracts or protocols for obtaining data. It is at
this crucial phase that the researchers must obtain informed consent (see
Sections 1.5 and 2.2.1 below). The analysis phase includes such normative
ethical decisions as the number of minimally adequate levels of annotation.
Annotation decisions are questions of ethics, as what annotation is included
will determine the accessibility of the materials to particular audiences.
During the archiving phase, the researcher must carry through the wishes of
consultants in terms of anonymity and recognition by making speakers
anonymous; decisions must be taken on user access to the materials (com-
munity, scientific researchers, general public) and which materials are to be
In the longer term, such codes of conduct could be developed for spe-
cific regions (countries or ethnolinguistic areas), based on a comparison of
individual codes of conduct from the same area. This would result in a third
tier of guidelines, a regional code. Though regional codes are the least
critical of the three types of guidelines, such a code would outline certain
region- or country-specific practices spanning a number of ethnic groups
for a given area, e.g. archival practices for material from a consultant who
has passed away since the data collection.
1.3. Players
The practical application of ethical principles entails the specification of
ethical and legal relationships between all participants in the documentation
process. These relationships should be made explicit and clearly differentiated.
First, consultants (speakers/singers) are part of a certain sociocultural
context in a certain country (see Figure 1). The sociocultural context
consists not only of the speaker community itself but also of its relationships
within local society. Then, the interaction between researcher(s) and
consultant(s) occurs within a regional and national context, which includes
governments, officials, subject experts, and eventually users of the analyzed
data. Speaker-consultants are part of both linguistic and administrative com-
munities; language communities are usually part of larger ethnolinguistic or
ethnoreligious regions. These regions, in turn, may be contiguous with or
reach across provincial or national boundaries.
The roles and perspectives of participants are gradient and dynamically
created. We can use "insider/outsider" as shorthand to describe two
extremes of how a researcher situates himself or herself with regard to the
research situation, as well as how other participants view that researcher.
The researcher might be an insider (i.e. accepted as a member of that com-
munity) or an outsider (from a distant community, whether in that country
or in another). These roles are gradient rather than absolute, since a foreign
researcher and a native speaker from a distant community may both be
considered outsiders to the community under investigation. A local
researcher often assumes multiple insider/outsider roles: often a researcher
is part of the ethnolinguistic group, but not or no longer
from the particular community. In this situation, that researcher is both an
insider and an outsider. The distinction may be relevant for research
planning, as working with a person from the actual community under
investigation often facilitates research.
Furthermore, researchers' institutional connections play an important
role in determining both the direction and scope of the research. Every
institution has its own agenda. If a researcher is funded by a university in that
nation's capital, for example, in some cases he/she might be expected to
produce a study that enhances that country's ethnic policy. A researcher
from overseas might, in contrast, be subtly pressured by the home univer-
sity or the funding agency to quickly obtain a lot of data and produce publi-
cations, while overlooking the need for reciprocity with the speech com-
munity. Creating research products useful to communities is an issue which
will become more and more central to the ethical practice of the research
enterprise, though currently grant funding is mostly limited to products for
a scientific audience.
Institutional affiliations almost invariably insinuate themselves into the
power relationships between players. Though outsiders may be regarded
with more suspicion than insiders, the affiliations of outsiders generally are
seen as prestigious. This prestige is usually enhanced by the economic means
that funding gives the researcher.
Then in this web of relations there is the archive, in which the researcher
deposits his or her materials. Though requirements of the granting agency
vary, each has specific guidelines for data depositing and use. Finally, the
archive disseminates data to users.
That these players (individual fieldworkers, communities, research consortia,
funding agencies, archives and users) may all be located in different
countries has legal implications for the storage, ownership, transfer, and
publication of the data (see Section 2 below). But more important to the
success or failure of a given research collaboration are the shifting and
highly contextual nets of power and belonging (insider/outsider) between
these players. A research project on any scale would do well to evaluate
both these legal and social relationships in the planning stage.
Figure 1. Participants in linguistic fieldwork (adapted from Hi 2001, Wittenburg
1.4. Ethical principles
Heritage can never be alienated, surrendered or sold, except for conditional
use. Sharing therefore creates a relationship between the givers and receiv-
ers of knowledge. The givers retain the authority to ensure that knowledge
is used properly and the receivers continue to recognize and repay the gift.
(Daes 1993: 9)
We can outline the following five fundamental ethical principles for lan-
guage documentation:
Principle 1: Do no harm (including unintentional harm)
Though inarguable, this maxim requires individuals to specify what "harm"
means in the specific local context. Since research is a kind of prying, pro-
tecting privacy largely concerns deciding which information to protect from
public view. Harm to privacy may come from revealing information that
discredits a person (Thomas and Marquart 1987: 90).
There are, of course, many kinds of inadvertent harm. For example,
publicizing one person's name might result in embarrassment, whereas not
publicizing another's name may be viewed as a slight. Moreover, the people
with whom an outsider-researcher associates could be stigmatized by the
community for giving away cultural or even national security secrets, for
example, which might lead to trouble with community leaders or police.
Also, since many researcher-consultant exchanges involve compensation,
unintentional harm can be caused by arousing financial or material envy in
the indigenous community.
Part of fairness is being attentive to relative compensation: what one
person acquires in material or political gains as a result of participation may
cause envy or ill will in others in the community. Such attentiveness
requires not only researching the appropriate form of compensation
(e.g. money, goods, recognition) and the appropriate amount, but also
knowing project participants' status in and relationship with the
community (see Section 3.5).
Gifts or payments of goods or money, where culturally appropriate,
compensate for both the expertise of another individual and the inconven-
ience caused him or her. Even where no overt compensation changes hands,
the core participants create a dynamic of reciprocity, whereby the gift of
language knowledge is reciprocated by the researcher in some way, e.g. by
compiling a community course book. After all, the term "compensation"
literally means "hanging together". Underlying this equilibrium is the second
principle that we might simply articulate as:
Principle 2: Reciprocity and equity
The research relationship must be consultative, continuously negotiated,
and respectful. Accommodate community input into your research goals,
or, better yet, plan the research collaboratively with the indigenous com-
munity. Re-negotiation of methodologies and goals is a normal part of this
process. Part of the culture of respect is acknowledging that one's
viewpoints may not be universally held. The researcher should also respect both
the indigenous knowledge system under study and the confidence
and trust of individual participants.
One area of normative ethics that modern researchers generally think of
right away is the idea of "giving something back" to the community. This
notion is not altruistic, but rather reflects the consideration that when re-
searchers enter a community, they disturb it at least temporarily, and also
take data away. Even with compensation, research behavior is nearly al-
ways a lopsided proposition, with clear benefits accorded more to the re-
searcher than the community. Thus, many researchers in recent years have
come to feel strongly that they should additionally compensate communi-
ties with scientific products or even economic development aid. Therefore,
our generic code also includes:
Principle 3: Do some good (for the community as well as for science)
What constitutes a generous act of "giving back" varies greatly depending
on community needs. Such acts are more abstract than mere compensation
for a consultant's time; they are also never 1:1, in the sense that a
researcher can never repay a community for the rich but nonetheless
snapshot-like view of the culture obtained during a particular field research
experience.
The most common examples of giving back include preparing peda-
gogical and cultural materials useful to the community, such as promulgat-
ing an orthography, developing textbooks and primers, making audio CDs,
VCDs and documentary film, and creating picture books on material culture,
e.g. embroidery or architecture.
Principle 4: Obtain informed consent before initiating research
It is critical for the researcher to establish an agreement with data producers
(speakers, singers and/or a community) to record, archive and disseminate
these data. Researchers are ethically obligated to inform data producers of
all possible uses of the data so as to implement the "do no harm" principle
above. Permission should be recorded in a culturally appropriate form:
written, video or audio-taped. A detailed discussion of the issues and
procedures in informed consent is found below in Section 2.2.1.
Such mandatory contracts certainly encourage researchers to document
permissions. However, in some local situations, unrecorded oral contracts
may be most conducive to mutual trust, though they usually do not fulfil
the legal requirements of IRBs (Institutional Review Boards).
Principle 5: Archive and disseminate your data and results
Researchers must avoid being buried with their unpublished field notes and
recordings. Within bounds of informed consent, those working with endan-
gered-language communities have an obligation to appropriately store and
publish data and analyses. Even in imperfect form, ordered, shared data are
more useful than no data; disseminating or at least properly archiving col-
lected data is far more respectful to a speaker community than piling it in
the back of a closet. Hence, many field researchers now believe that
best-practice archiving (cf. EMELD 2000–2005) and dissemination (in any
format) should be a requirement of fieldwork.
Such principles sketch out the bare minimum in ethical linguistic fieldwork
practice. For more elaborated documents, see AIATSIS (2000) and the Af-
rican Studies Association (n.d.).
1.5. Potential problems: some examples
1.5.1. The observer's paradox and covert research
The requirement of obtaining informed consent rules out covert research,
i.e. recording without speakers' knowledge. The deception inherent in covert
research renders it taboo for many who do fieldwork. Yet many social scien-
tists routinely pretend to be ordinary citizens in order to obtain a naturalistic
view of their research subjects: they, for example, join a group that believes
in UFOs, work desk jobs for the sensationalist newspaper Bild Zeitung, or
staff a Wal-Mart store to reveal the group or corporate practices (Wallraff
1977, Ehrenreich 2002). Such fieldworkers and journalists will vocifer-
ously defend their enterprise.
In anthropological and linguistic fieldwork, a researcher's presence changes
the phenomena under observation, often making conversation less
spontaneous (the so-called "observer's paradox" [Labov 1971: 171]). Most field
workers simply attempt to minimize the intrusiveness of their presence by, for
example, using a small recording device, or by having native-speaker
insiders conduct the field research. These methods have provided adequate data
and have been seen as ethically sound by the majority of field linguists and
community researchers.
However, since the observer is always intrusive to some extent, some
language researchers have decided to make surreptitious recordings. This
issue is so controversial among language researchers and language activists
that it is usually dismissed out of hand. But such practices do exist, and
therefore merit some discussion here. Covert recording has been reviewed
by Allen (1997) and defended by Larmouth et al. (1992), who examined
U.S. state and federal laws. Harvey (1992) argues that occasional surrepti-
tious recording simply constitutes a greater degree of non-disclosure in a
research environment where all researchers inevitably withhold some in-
formation from native speaker-consultants. (For example, a researcher may
ask a consultant to converse freely when she is really only interested in the
relative clauses produced.) When not based on clearly-delineated ethical
principles, though, this rationalization for covert research is untenable.
When might covert research be acceptable for some linguists, then? One
technique which appears to satisfy both the need for spontaneity and in-
formed consent is the following: (1) recordists and speakers already have a
trusting working relationship; (2) the researcher surreptitiously records
spontaneous speech of said speakers, if and only if (3) the subject of the
speech is estimated to be non-sensitive, and (4) the speakers are immedi-
ately afterwards given the option of informed consent, i.e. they listen to the
recording to decide whether or not it should be erased or kept.
Community members and outside researchers together must develop a
policy on covert recording for every research project. If covert research is
allowed, then the terms should be specified. One model is the American
Sociological Association's statement (1997: sect. 12.05).
Nonetheless, the ethics of covert research are far from clear-cut. Thomas
and Marquart (1987: 11–12) argue that ethics codes and academic goals are
often completely contradictory. They suggest that rather than rationalizing
behavior, academic researchers should instead squarely face each ethical
dilemma as a matter of honor: "The operative question should not be 'Does
behaviour violate the ASA ethical code,' but instead 'Did the researcher, in
this given situation, act honourably?'" Most important, however, is whether
or not local people accept post-facto consent to surreptitious recordings
as ethical. If there is any doubt, it is best to avoid covert recording entirely.
1.5.2. Change in permissions
Sometimes a speaker who has given permission for material to be used in
research and/or publicly disseminated later wants it removed. The re-
searcher or archivist faces the dilemma of whether or not to remove the
material, even though archiving was one of the original goals of that
recording session. It is best to be explicit about the consultant's future rights
to the recording at the time of recording.
1.5.3. When a previously uninvolved party becomes involved
A linguist wants to contribute a legacy recording to an archive, but then a
grandson of the speaker objects, saying that the rights to the recording now
belong to him. If an archive does not have an explicit policy, then the
two parties must attempt to mediate such situations, based on the original
agreement and on the cultural norms of the speaker community.
1.5.4. Ensuring accessibility
What good is an electronic archive to native speaker communities,
especially if they lack Internet access? In addition to "giving back" tangible
research products such as primers, the researcher should find ways to get
offline electronic data to the communities. A researcher could even con-
sider establishing WiFi (wireless) networks, if appropriate.
1.5.5. Management of the resources
When material is in an archive or a private collection, the question arises as
to who represents the annotated data: the community, the researcher, or the
archivist? Since it is inevitably some combination of these actors, it is wise
to specify decision-making power in advance for the concerned parties.
When one party, for example, wants to close the resource to the public, it is
best to have protocols for making ultimate decisions.
2. Rights
2.1. Scope
Participants in linguistic fieldwork are subject to at least three separate ju-
ridical realms: (1) the laws of the country in which data recording takes
place; (2) the laws of the researcher's country; and (3) international law.
Additionally, researchers may be subject to a regional transnational law,
such as EU law for the DoBeS archive in the Netherlands. Within each of
these realms, the distinction between intellectual property rights, copyright,
and access is useful. Note that these issues are moot unless these rights are
exercised (e.g., through a claim of ownership of material in an archive).
Even then, there is little legal precedent testing protocols on rights and
access to linguistic resources, and there will be little until language archives
accumulate several decades of experience with data rights.
2.2. Intellectual Property Rights (IPR)
Intellectual property rights concern the individual, group, local, and
national ownership of so-called "creations of the mind", e.g. books, musical
performances, films and even folklore. The western notion of property
rights may well have no indigenous conceptual counterpart. Nonetheless, a
number of documents on indigenous knowledge and property rights have
successfully attempted to respectfully address indigenous issues. These
include Hansen and VanFleet (2003); AILLA (n.d.a) IPR; and for New
Zealand, Sullivan (2002).
2.2.1. Informed consent
At a recording's origin (i.e. at taping), it is necessary to obtain the informed
consent of all parties. Informed consent is a negotiation between researcher
and data producer/consultant of all future uses of the material: who will
access the data, where will the data be housed, in what form will it be
stored, and who will make future decisions over its use. Informed consent
does not simply entail the researcher informing the consultant of the use to
which he/she intends to put the data. Of course, linguistic and anthropological
goals often overlap with but differ from community goals, so part of the
consent process entails community members convincing outsider linguists
of practical data uses, and vice-versa.
Though informed consent has both ethical and juridical dimensions,
academic institutions in certain countries have emphasized the legal aspects
of such contracts. Many field researchers today, particularly those in North
America and Australia, find that any of their projects involving direct work
with people are subject to an obligatory institutional screening process.
Though such informed consent contracts are a positive development, uni-
versities need to establish a generic and more flexible consent template for
linguistic and social science research in non-clinical settings under different
cultural circumstances. For now, each researcher must tailor his/her own
contract with his/her own Institutional Review Board.
There are three major types of consent documentation: written, verbal,
and third-party.
Written consent
The advantage of having so-called "Human Subject Consent" forms is
that both parties have a written record of their agreement. The disadvan-
tages, though, are legion among linguists: they require the anonymity of
consultants (which is often inappropriate) and the written forms may
breed mistrust. Therefore, field researchers often resort to verbal consent.
Verbal consent
Verbal contracts should be recorded with audio or video devices if at all
possible. Though western societies are insistent that written contracts
are the only really binding forms of agreement, in many contexts a verbal
contract can be equally or more powerful and binding than a written one.
A spoken agreement requires at least two parties physically present, it
requires eye contact, and it carries with it all the intertwining obligations
and respect of a personal relationship between two people bound to-
gether in a social network. For a written agreement, by contrast, both
parties need not be present nor have or maintain any sort of personal re-
lationship. And this is why many people (e.g. indigenous peoples of the
Americas) find oral contracts more binding than written ones: written
ones can be torn up and forgotten, but not ones sealed by physical contact.
Furthermore, in a society with varying degrees of literacy, the written
contract may wisely be viewed with suspicion, as it has often been the
medium used historically by colonial powers to wrest property and land
from indigenous peoples.
It has been difficult in the past to convince IRBs of the appropriate-
ness of oral contracts in certain contexts. Even now, a researcher must
make a case to these boards, who by definition represent the legalistic
and writing-centered aspect of academic culture. However, today most
IRBs accept oral contracts as legitimate.
Third-party consent
The last type of consent entails making use of an intermediary such as a
village leader to negotiate a contract between participants. The consent
contract may be written or verbal, but using an intermediary may be the
best way to quickly establish a modicum of trust between parties, and to
facilitate communication between the research world and the
community's world.
Issues requiring our attention with regard to consent include attending to
sufficient explanation, that is, ensuring that one's goals are explained
clearly in a culturally appropriate manner. Additionally, participants should
anticipate as many future uses of the data as possible.
2.2.2. Some laws governing consent
It is not feasible to survey the consent laws of dozens of countries
here. Even when such laws exist on the books, however, they are too
loosely defined to protect speakers and singers. Under U.S. law, for
example, though the basic law is intended to protect data producers, certain
details allow for an unacceptable degree of leeway. A person may generally
record, film, broadcast or amplify any conversation where all the parties to
it consent. Yet the consent of data producers is presumed without asking,
as long as the recording device is in plain view.
Such flexibility, though pragmatically appealing, leaves open the possibility
of unethical behavior.
U.S. federal publications do recommend (but do not require) obtaining con-
sent individually from all parties recorded. We can only second that rec-
ommendation here: Permission should always be obtained except where
truly impractical, e.g. in a crowd situation with dozens of spontaneous per-
2.2.3. World Intellectual Property Organization (WIPO)
The primary concern of the World Intellectual Property Organization is to
protect the commercial value of intellectual property. When the data pro-
ducer has a solid legal contract recognized by commercial institutions (e.g.
as a recording artist would have with a recording company), then the WIPO
generally protects both the data producer and the data recordist/mediator.
When, however, the data producer–data mediator relationship is not part of
a commercial enterprise (such as that of endangered language researchers
and native speaker-consultants), the WIPO basically serves to open up lan-
guage materials to potential commercial exploitation.
There are various proposals by the World Intellectual Property Organization
for new sui generis rights in databases, folklore, and life forms. These in-
dependent rights essentially specify that rights can be bought and sold; thus
a film company or a pharmaceutical enterprise could even buy rights to a
certain body of folklore. Once purchased, an utilization, even by members
of the community where the expression has been developed and main-
tained, requires authorization if it is made outside such a context and with
gainful intent (WIPO 1998: 7; WIPO 1998–1999: 33). Critics see this as a
potential for tyranny by the governments who would be authorized to en-
force these ownership rights.
Enforcing such rights also has enormous practical barriers. The fact
that ethnic groups do not exactly coincide with national boundaries will
make it hard to figure out which government would get to authorize activi-
ties and collect the tariffs for which body of folklore. For instance, would a
Chicago polka band need [to] get clearance from and pay royalties to the
Polish government? (Liberman 2000/2001).
Even if intellectual property rights are not a pressing legal issue in a
given country or society, they are generally still an underlying ethical issue.
These western, business-oriented notions must in one way or the other be
squared with indigenous knowledge systems so that intellectual property
rights as conceived by WIPO and other organizations do not go against the
interests of indigenous peoples.
2.3. Copyright
The preponderance of resources on ethics and rights deal with copyright as
a financial issue. Copyright refers to the ownership and distribution of a
particular work: who owns what aspects of the result, and whether it is le-
gitimate to distribute or publish the result. As a form of property, copyright
can be inherited, given away, or sold.
The focus of copyright law is monetary: if a copyright is violated, the
originator of the material will lose profits due her/him. This pecuniary focus
is irrelevant for language documentation projects, since they are generally
money-losing propositions, yet the inappropriateness of copyright laws does
not prevent documentation projects from being subject to those laws.
Copyright law applies where the copying of the work is being done, not
where the work copied was created. So if a theater piece or a story was per-
formed in Latin America but written down or reproduced in Canada, it
would be subject to Canadian copyright law.
There are a number of common misconceptions about copyright law, for
example:
- The publisher automatically owns the copyright. (This is not necessarily
  so.)
- The language community owns the copyright for traditional material. (In
  Western law, this is not so, though it could be given to a legal persona.)
- Owning the copyright to the collection means owning the copyright to
  the parts. (Not so, since editing is an act in its own right, creating a
  unique work.)
- The speaker owns the rights to a recorded text. (Translations are
  derivative works which are separately owned, but their publication still
  requires the speaker's permission; cf. Whalen/SALSA 2001.)
In a collaborative effort, deciding who owns rights can get complicated. In
some projects, one native speaker may collect and do a rough transcription
and translation of the data, another regularizes it, another person does a
translation into another language, and a fourth and fifth may add morpho-
logical annotation. Under such circumstances it is best to note each person
involved in the process.
In some countries, copyright law distinguishes being paid for doing part
of a work from being paid to do an entire work. In the United States, paid
employment for part of a work is known as "works made for hire". In this
case, the employer and not the employee is considered to be the author
(U.S. Copyright Office 2004). If this route is taken and the project is subject
to U.S. law, then sub-contractors who do part of the work should be made
aware in writing of these restrictions right at the beginning of the project.
Note that the concept of "works made for hire" may be different or even
non-existent in the copyright law of other countries.
- Make liberal assumptions about what copyrights may exist;
- Make copyright arrangements from the beginning of the project:
  - Be explicit about what is work for hire;
  - In other cases, explicitly assign copyrights in writing, where possible to
    a single entity.
Copyright law is not a very good conceptual fit to the purposes of language
documentation, but we must use it as we can. Some have recommended
non-exclusive licenses for appropriate research and educational use
in different language documentation situations (Whalen/SALSA 2001).
Fortunately, excellent resources are available on copyright, e.g. the
National Library of Australia (n.d.), the U.S. Copyright Office (2004), and
Nimmer (1998).
2.4. Moral rights – non-economic rights
Independent of an originator's copyright (economic rights) there are
non-economic, so-called "moral rights" to a given work. The Berne Convention,
which was established to protect artistic works, states in part: "Even after
the transfer of the said rights, the author shall have the right to claim
authorship of the work and to object to any distortion, mutilation or other
modification of, or other derogatory action in relation to the said work,
which would be prejudicial to his honor or reputation." (WIPO International
Bureau 1886–1979: Article 6bis(1), emphasis added). This convention
ensures at least theoretically that a data originator (e.g. storyteller, speaker,
singer) will always have some legal rights to his or her work. Whether or
not these rights can be exercised over the work in the absence of economic
rights remains a largely untested question, at least for language data origi-
nators. Until the legal strength of moral rights is evaluated empirically,
the interests of both communities and researchers are usually best protected
by ensuring that the economic rights are secured by the most appropriate
parties. Data originators and analysts or one of the two are often the most
appropriate choice; another possibility would be a data archive.
2.5. Access
During fieldwork, it may seem far from the concern of researchers to inten-
sively ponder the uses of a data set in future years and decades, but the time
to ask speakers'/singers' permission for access is precisely at the moment of
recording, when researchers are still in the field.
Concerns about the privacy or, conversely, the recognition of data con-
tributors apply not only to these speakers and singers, but also to all other
people mentioned in the recording. (Thus, if a person talks about her
sister's wedding and uses her sister's name, then the sister should be involved
in decisions of access.) Furthermore, access concerns apply also to all
researchers and helpers on site, including e.g. local researchers and
facilitators.
Disputed questions of access very often create ethical issues. One such
example is when villagers allow full access including crediting recordings
to their name, but local coordinators, possessing an overview of social is-
sues, suggest anonymity for political reasons. Generally, it is best to err on
the side of caution and make the names anonymous.
An archive mediates between its collections and the public. The concept
most central to this mediation is graded access, which allows different
degrees of accessibility for different materials and users. The best currently
available reference point is AILLA's (n.d.b) graded access system. Types of graded
access generally include:
Fully open;
Partially open: speaker-based/materials-based/user-based;
    Speaker-based: e.g. texts from Speakers 1–20 are open, those from Speakers 21–25 not;
    Materials-based: e.g. taboo or secret material is closed; general material is open;
    User-based: e.g. only open to researchers, not commercial firms;
Fully closed.
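The graded access types listed above can be illustrated with a minimal sketch. It is not AILLA's actual system: the names AccessLevel, Recording, PartialAccessRules, and may_access are hypothetical, and the sketch assumes that, for partially open material, the speaker-based, materials-based, and user-based conditions must all be satisfied at once. An archive could equally apply them separately.

```python
from dataclasses import dataclass
from enum import Enum


class AccessLevel(Enum):
    FULLY_OPEN = "fully open"
    PARTIALLY_OPEN = "partially open"
    FULLY_CLOSED = "fully closed"


@dataclass(frozen=True)
class Recording:
    speaker_id: int
    level: AccessLevel
    restricted_material: bool = False  # materials-based: e.g. taboo or secret


@dataclass(frozen=True)
class PartialAccessRules:
    open_speakers: frozenset       # speaker-based: speakers whose texts are open
    allowed_user_types: frozenset  # user-based: e.g. researchers only


def may_access(recording, user_type, rules):
    """Return True if a user of the given type may access the recording."""
    if recording.level is AccessLevel.FULLY_OPEN:
        return True
    if recording.level is AccessLevel.FULLY_CLOSED:
        return False
    # Partially open: assumption for this sketch, all three conditions must hold.
    return (
        recording.speaker_id in rules.open_speakers
        and not recording.restricted_material
        and user_type in rules.allowed_user_types
    )


# Example mirroring the list: Speakers 1-20 open, 21-25 not; researchers only.
rules = PartialAccessRules(
    open_speakers=frozenset(range(1, 21)),
    allowed_user_types=frozenset({"researcher"}),
)
```

Under these assumptions, a partially open text from Speaker 5 would be available to a researcher but not to a commercial firm, and a text from Speaker 22 would be closed to both.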
Most researchers are creating digital repositories, even if these are often ad
hoc. These data must be accessible to the native community. Whether the
data are deposited in an established archive or on an office shelf, it often
falls to the researcher to make relevant material available in a format that the
native community can use, which is often not internet-based (see Section
3.5 below).
2.6. Legal requirements for research
In addition to the legal requirements of the researcher-consultant relation-
ship (informed consent) and of the collected and annotated data (e.g. copy-
right and access), project planning must include obtaining legal permission
for personal logistics, the most important of which are:
Appropriate visas (e.g. tourist/student/research/visiting scholar)
Residence permits
Health exams (for longer-term foreign residents many countries require
testing for chronic illnesses such as HIV or tuberculosis)
Research permits (national and/or local): permission may in some
countries or locales require employing certain people not of the
researcher's choice, e.g. bureaucrats, known local authorities, and/or
2.7. In sum: ethics and rights
For planning fieldwork, and especially for archiving and disseminating data,
it is very useful to be informed about the relevant national and international
treaties on data ownership, even if they may not seem to affect a research
project.
The ethical requirements of fieldwork-based investigation are complex,
as they demand that the researcher both attend to a respectful and reciprocal
relationship with the language community and produce a documentation
meeting the standards of the academic community and the funding agency.
The latter requires ensuring quality (observational adequacy) as well as
quantity (working with reasonable efficiency and having adequate cover-
age) (Krauss 2005); the former entails a duty to consult, to share benefits as
well as the management and control of data (Castellano 2005).
3. Practicalities I: How to find a community and develop a cooperative working relationship
Two factors are crucial for successful outcomes in linguistic fieldwork: a
good relationship between researchers and indigenous partners, and a well-
organized work plan based on knowledge sharing and mutually negotiated
goals. The more researchers understand both the local culture and their
indigenous partners' goals, and the more indigenous consultants understand
the researchers' goals, the more nuanced the research results (cf. Chapter 3).
When a researcher lacks a previous working or personal relationship
with a specific community of speakers, he or she must identify one,
establish contact, and build a cooperative working relationship with that
community. Even for a researcher with prior connections, protocols and participant
roles must be negotiated cooperatively for each new project. Both kinds of
researchers undergo a process of establishing the "five Cs": criteria,
contacts, (avoiding) cold calls, community, and compensation. Much of this
section is designed to be employed as a checklist in advance of field research.
3.1. Criteria
Four criteria generally dictate a researcher's initial decision about research
location and variety: With which communities and language variety do I
Linguistic diversity and/or conservativeness
If you have the freedom to choose the language variety you will work
on, your linguistic criteria for deciding may be typological (language x
is unusual or typologically interesting in some way) and/or historical
(the variety preserves an earlier stage of the language very well).
Political expediency
Certain places may be open or closed to your research team for reasons
of regional or national security. Local authorities may prefer that you go
to only certain places, for reasons of personal safety or turf.
Logistical expediency
It may only be practical to combine work in a limited number of
regions, if one is working in remote or inaccessible places. This logistical
limitation may require the linguist to redefine the theoretical or scien-
tific goals of the project.
Interpersonal expediency
Certain language varieties may already be dominated by a national re-
searcher of great stature, who would resent the competition you repre-
sent (see political expediency above). Conversely, certain villages have
no such reservations, but they either lack a sufficient number of consult-
ants who are able to produce the phenomenon under investigation, or the
local research talent on your team knows more people somewhere else.
3.2. Contacts
Native speaker-consultants
Of all your contacts, consultants are the most important, and are best
found via introductions from intermediaries. Creating the conditions for
introductions requires patience, as establishing a consultant-researcher
relationship is usually only possible after a period of trust-building with
Native speakers are all potential teachers to the outsider-researcher
and crucial to any research project. Rather than zeroing in on a single
consultant for the entire research, most projects benefit from a pool of
consultants, so as to avoid inadvertently producing a study of, for example,
one person's peculiar idiolect, or a study of male language.
Working with a number of consultants allows the researcher to draw on each
consultant's strengths, and also to correlate sociolinguistic parameters
such as sex, age, place of origin and languages spoken with linguistic
Scholars based in the country or region in question are often a crucial
aid to jump-starting our research. We often rely on their prior work,
even if only in a related field, e.g. local history. Discussions with these
scholars can give you the lay of the land, and may yield valuable contacts.
As these relationships are also based on equitable exchange, it is important
that the outsider-investigator offers something genuinely useful
to such scholars, e.g. copies of publications, offers of academic collabora-
tion, and/or volunteering to send hard-to-find books from overseas. It may
or may not be appropriate to include some academics in your project.
Although most bureaucrats in any country seem to have been put on this
earth to hamper research, some can be surprisingly helpful. Brace your-
self for the worst while maintaining a pleasant and undemanding demean-
or. On the occasion when they are helpful, one is pleasantly surprised.
Officials are of course often crucial in obtaining research permission;
and they may provide valuable (or dreadful) introductions. In some
cases it may be better to keep them abreast of project developments only
in the vaguest way, for often these contacts are very political, and
could hamper the project or even endanger consultants, depending on
local conditions.
Local people (non-native speakers)
Other local people outside of the target language group often provide an
etic-emic perspective (outsider-insider) on the group you are actually
investigating. They can constitute an important control group for a so-
ciolinguistic or language-contact investigation.
A long-term view of contacts
It is not an exaggeration to suggest that, if you are an outsider-researcher,
you should plan to continue returning to communities for several
decades if you really want successful and mutually satisfying research
results. Even though in many cases these iterative visits are impractical,
maintaining contact is desirable.
From the view of western academia, repeated field research in the
same community is unfortunately not yet encouraged; in fact, many aca-
demics are under pressure to do precisely the opposite, undertaking
many different projects for typological comparison or for demonstrating
scholarly breadth. Yet depth (the thorough understanding of a particular
language family or area and an ability to speak and think in its
languages) is often sacrificed for breadth.
Recent developments, led by endangered-language linguists and
anthropologists, indicate a trend toward depth and breadth. The key is to
work cooperatively with speaker communities and with other scholars.
In this way, one can undertake diverse projects and continue to work
with previous communities.
3.3. (Avoiding) cold calls
If a researcher has no connections to the community, region, or even country,
her work is very difficult: people will understandably mistrust her, and
she'll spend a lot of time explaining what she's doing and attempting to
build trust among some members of a community. Basically, successful
initial fieldwork planning is about avoiding this situation, by being
introduced by an individual or individuals and building trust, however
tenuously, with a community.
That facilitating person should be as local as possible; a villager is usually
more trusted than one from the nearest town, and a town resident is usually
better than a person from the regional capital, and a person from the
regional capital is usually better than one from the national capital: the more
local the person is, the more reliable she is perceived to be.
Of course, the issue of prestige sometimes skews this hierarchy, so that
sometimes an outsider with the right credentials has a surprising amount of
access into a society. (For example, in a society in which the local authori-
ties are detested, someone from the distant capital or even from overseas
may be seen as more trustworthy.) However, an outsider having connec-
tions is no substitute for local knowledge. Only a villager can identify where
the men who know the origin story live, which of them have the teeth to
articulate dentals, where the medicinal herbs grow, and who is not speaking
to whom.
3.4. Community: cooperative work between consultants and researchers
3.4.1. Lone-ranger linguistics vs. research teams
Lone-ranger linguistics
What I term "lone-ranger linguistics" (with a nod to America's colonial
past) represents the old go-it-alone model of linguistic research: go in,
get the data, get out, publish. It had its advantages: no negotiation was
necessary, and it seemed that the one researcher alone was capable of
wonders. Its disadvantages, however, are chiefly that it is inefficient and
tends to promote ill-will. It is an inefficient use of time, money, and other
resources for an outsider to travel long distances for short periods and
learn a language poorly; it promotes ill-will by giving the researcher no
incentive to treat contacts in an egalitarian manner, to maintain
relationships, or to reciprocate the community's generosity.
Community-researcher teams
Cooperative arrangements between community members and outside re-
searchers have a number of convincing advantages: they are enormously
efficient in terms of human and economic resources, matching local
skills to local tasks and transferring technology; they provide linguistic
and ethnographic field methodology training in loco; they tend to produce
huge quantities of data; and the observer's paradox (at least that
of an outside observer) is not so strong, since it is generally community
members themselves who are conducting the fieldwork. There are some
disadvantages to cooperative arrangements of this sort: they are logisti-
cally challenging, as greater numbers of people are involved, hence
more intercultural mediation; a longer training period is required; and
the data produced usually require more regularization before analysis.
3.4.2. Developing a mutual learner-teacher relationship
The linguist should ideally first acquire the mindset: "I am here to learn;
can you teach me?" In return, he should make clear what skills, equipment,
and/or resources he has to offer, for example, technology, an orthography,
or help with grant applications. Many excellent works have been devoted to
developing and maintaining the relationship between researcher-learner and
community member-consultant-teacher; see e.g. McCarty, Watahomigie,
and Yamamoto (1999), Hinton et al. (2002), Grinevald (2003: 57–60), and
Chapter 3.
3.4.3. Organization of a community research team
Developing a smooth and mutually agreeable workflow entails the coopera-
tive organization of some kind of community research team, the organization
of the researcher's own tasks, and regular mutual consultation and exchange.
This collaboration often entails the following steps:
Assemble trusted local colleagues:
If a researcher lacks local contacts, she should probably first introduce
herself to the community, either directly via a pilot research project or
indirectly by working in a nearby town (e.g. as an English teacher or
development volunteer);
Propose a research plan;
Get their feedback and suggestions on the research plan:
Ideally, before even applying for funding, the researcher should plan the
project and budget with input from local colleagues;
Narrow the scope consultatively:
In each research locale, a researcher should work together with his local
team person and village elders, if appropriate, to focus the research plan,
For an overall documentation, make an emic list of all the discourse
genres that local people feel are important to document;
For a project on a specific topic, make a list of all potential inter-
For a sociolinguistic survey, plan with and train the researchers, and
obtain necessary permissions, as well as notifying the villagers via a
trusted leader that the research will be carried out;
Archive materials locally and remotely (e.g. at the researcher's
university and in the local partner's town);
Work with small, stable, offline software;
Work with computer programs with which your local partners are comfortable;
Keep checking in with team members:
Regular consultations by the researcher or local manager are crucial
both for logistical and technical support as well as to keep the momen-
tum going;
Make sure the local researchers see interim and final products:
If it is feasible, show them not just the texts and translations they have
worked on, but also a complete session consisting of a recording with
time-linked annotation. If appropriate demonstration
equipment is lacking, sharing data printouts, photos, sketches or even
fieldnotes is important in maintaining a relationship of reciprocity.
3.5. Compensation
Common practices include:
For consultant time and expertise: money or gifts?
A local contact person in the pilot stages is invaluable for advice about
what kind of compensation is appropriate. If it is monetary compensa-
tion, should it be time- (per hour) or piece-based (per text)? The same
compensation for the same work is recommended for every participant.
If compensation is given in the form of gifts, popular items include
foods, candy, tea, or cloth. Note that some presents such as tobacco or
liquor will only benefit one part of a family, and may, in some situa-
tions, delight one family member while angering another.
Common-courtesy compensation: media
Audio and visual media of all types are among the nicest ways to give
something back to a consultant or a community. Some common exam-
ples include:
Audio and video recordings copied onto more accessible formats
(cassette, CD, VCD);
Written material printed in a format useful to the community, e.g.
texts in a practical orthography (without excessive linguistic or com-
putational markup);
Photos, sketches, and maps reproduced in pamphlet, album, or book form.
For communities
At present, most researchers present native speaker consultants with
small tokens of cooperative work, such as photographs and copies of re-
cordings. In the future, documentary activity may well be coupled with
or followed by providing primers, texts, and dictionaries to the community.
Given that both academic funding and linguists' time are extremely
limited, these products may best be created by research partners (e.g.
pedagogy specialists) funded by nonacademic sources (e.g. economic
development funding). Though such product development at present
remains beyond the scope and funding of a scientific project, if the lin-
guist is still able to catalyze this work, the community will benefit
4. Practicalities II: Common problems and some solutions
4.1. Money, gifts, and other obligations
What constitutes respectful and commensurate compensation will vary
widely from region to region, but some form of compensation is obligatory.
If community members have played a major role in creating the compensa-
tion structure, and if that structure is transparent, then the chances of diffi-
culty will be minimized. Even so, the material and/or interpersonal advan-
tages conferred by project work can still create tensions between research-
ers and community members, or between community members themselves.
4.1.1. Between outsider-researchers and consultants/community members
Scenario #1:
One common ethical dilemma resulting from ignoring participants' community
roles is dealing with the outrage of an uncompensated community
leader upon discovering that a young, non-prominent member received
remuneration for project work. Similar cases of envy may arise in a com-
munity when people hear what a consultant got paid or given, while the
clearly unqualified son of the village head wants that much too. If the re-
searcher does not pay the son, the village head may well withdraw permis-
sion for the researcher to do the sociolinguistic survey. (Solution: Be prag-
matic. If a researcher must, the son can be paid or given something, but
hopefully prevented from harming the project.)
Scenario #2:
One of your local team members is certain that she is not getting her
share of the budget, and furthermore is convinced that the outsider-
researcher is making thousands of Euros every day on this project. (Possi-
ble solution: If there is enough trust between you, share the project budget
with the team member and explain allocations. If this is not possible, re-
view and reach an agreement with her over adequate compensation.)
Often no amount of discussion can ever totally subdue the suspicion that
the P.I. is horribly wealthy (which in comparison with local people at least
is often true), and also making a fortune off the project. In situations of
mutual trust, an open budget may be appropriate. In other situations, a fully
open budget might exacerbate perceptions of inequity. Core indigenous
research partners should in any case be central to budget and compensation
planning, and should have a clear idea of the scope of the project. The out-
sider-researcher can go a long way to dispelling perceptions of inequity
(real or imagined) by modelling parsimonious conduct, i.e. by living inex-
pensively as much as possible. Care with expenditures (but not stinginess)
can also help. Also, he should avoid answering questions about how much
recording equipment costs, as it really is shockingly expensive. Instead, he
can just say, "Oh, pretty much" or "Yeah, it's a good tape recorder."
4.1.2. Between researchers and their funding agencies
Researchers who wish to produce lasting and useful products for communi-
ties are in a bit of a bind. On the one hand, they are universally grateful for
the academic research funding they receive. On the other hand, scientific
funding agencies are not in the business of technology or pedagogical mate-
rials transfer to the community; their primary goal is to support the analytical
by-products of research at an international standard, such as books, articles,
analytical databases, and of course annotated data with associated metadata.
The production and transfer of materials to a community, from the point of
view of a funding agency, is "not quite science" and a Pandora's box of
endless expenses.
In the longer term, as ethical documenters we must do a better job of
convincing both academic and development funding agencies that linguistic
fieldwork (unlike much of natural science research, to which these funding
agencies are oriented) entails a long-term commitment (however superficial)
to the communities, and thus the production of at least minimal materials
for the communities is essential to doing fieldwork. Scientific funding
agencies will justifiably argue that they are not in the business of economic
development, but with endangered languages these issues simply cannot be
separated; economic impoverishment so often goes hand in hand with lan-
guage endangerment. Diversifying funding sources from non-governmental
development organizations may well be a workable future solution.
4.1.3. Between the outsider-researchers and communities
The compensation discussed above (photos, tapes, and gifts or contract
payments in the short term, a dictionary and/or grammar in the longer term)
is fully adequate. However, such compensation may still seem lacking,
given the time lag in producing reference works and their possible irrele-
vance for the parts of the community not involved in language maintenance
or revitalization. Some PIs, therefore, may be motivated to apply for eco-
nomic development funding. Such funding exponentially increases the long-
term contributions of a research collaboration to a community, for under
ideal circumstances scientific research has thus contributed to both cultural
and economic development.
4.2. Organization
Though an entire chapter could be written on project organization, we will
restrict ourselves to two brief remarks on management. The first is time
management. Building a cooperative work team is much more time-
consuming (but also more rewarding) than working alone. Allow three
times as much time as you estimate for a project of any size. Secondly, a
linguistic research project entails both data and personnel management.
While under older colonialist models, outsider-researchers would typically
be responsible for both, the experience of diverse cooperative research pro-
jects has shown that the more local partners manage both data and person-
nel, the more likely it is that these community members consider them-
selves genuine shareholders in the project. And if local partners consider it
their own project, then it has a much greater chance of being self-sustaining
and self-perpetuating after the external funding runs out. Thus, if appropri-
ate to the local situation, make sure that local team members with a talent
for organization are actually managing the project; make sure that they
have mirror archives of any annotated data archived elsewhere.
5. Conclusions
There is an inherent contradiction, namely that we have predefined the
issues in a non-aboriginal context. The concepts of intellectual property
and heritage resources arise out of a way of viewing the world that either
excludes or is antithetical to that of many First Nations and therefore pre-
cludes a real understanding of aboriginal culture and society.
(Marsden 2004, by permission)
Clearly, a grasp of the legal requirements for both the researcher-consultant
relationship (informed consent) and for the data produced and analyzed
(e.g. copyright and access) is important for any project. Such requirements
are complex since they involve a web of participants subject to laws often
of more than one country. But it is the attentiveness to ethical issues which
can determine a project's success. If the researcher is an outsider, the real
challenge lies in learning and mediating between at least two ethical
systems: that of the researcher, and that of the community. Only with an
understanding of both systems (and this applies equally to outsider-
and insider-community members) can ethical and honorable
behavior be determined and evaluated.
Chapter 3
Fieldwork and community language work
Ulrike Mosel
Linguistic fieldwork, especially language documentation, relies heavily on
the working relationship between the professional linguist and the indigenous
language workers, a challenging relationship because, except for their
interest in the community language, both parties do not share much common
ground in terms of background and aims. This chapter will first outline the
differences between the linguist's and the community's approach to language
documentation and then describe the kind of input the linguist can give into
the community's linguistic training and language work. Drawing from
experiences in the Primary Education Materials Project in Samoa (1997–2000)
and the Language Documentation Project of Teop in Bougainville, Papua
New Guinea (2000–2005), the chapter will deal with individual
apprenticeship and teamwork and conclude with a short section on workshops.
1. Research aims and personal motivations
If we take a close look at why researchers and indigenous people engage in
linguistic fieldwork, we can distinguish between research aims and personal
motivations. In most general terms, the linguist's research aim is to
contribute to our scientific knowledge of the world's languages or to
linguistic theory, while the local language workers' aim is to do something for
the maintenance and development of their language and culture. Thus
linguists and local language workers research the same language, but take
different perspectives. While the linguists ask what makes this language
interesting for general linguistics, historical linguistics, linguistic typology,
or linguistic anthropology, the native speakers may ask what their
language and culture contain that they want future generations to learn, or at
least to remember. As a consequence, academic field researchers focus their
attention on otherness, on what makes this language unique in comparison
to already researched languages, whereas the community members see their
language in relation to the dominant official language and their neighbors
Beyond intellectual curiosity, linguists are also motivated by academic
career prospects, just as the indigenous people are concerned with their
status within the community and earning money as fieldworkers. The lin-
guists must meet the expectations of their funding institution and deliver
the work they had planned in their application for funding. In many cases,
this will be a PhD thesis with a focus on theory or some specialized inves-
tigation, rather than a dictionary for the speech community or a language
documentation. In contrast, the objectives are less clearly defined for the
indigenous people. Frequently, a dictionary ranks highest on their list of
priorities, followed by educational reading materials, or translations of texts
that are important for the community (e.g. religious texts).
Table 1. Linguists' and local language workers' perspectives on fieldwork projects

             Linguists                       Local language workers
Aims         academic                        educational, cultural
Perspective  focus on otherness              focus on identity
Motivation   intellectual curiosity,         intellectual curiosity,
             academic career advancement     status, money
Products     PhD thesis,                     dictionary, reading materials,
             specialized investigation       translations
These different viewpoints, which are summarized in Table 1, can give rise
to conflicts. If linguists make a strong commitment to the community's
interests, they (or their supervisors) may feel that the academically relevant
aspects of the fieldwork are not receiving sufficient priority. Neglecting the
community's interests, on the other hand, may lead to feelings of guilt
towards the language community, who are being exploited with no real
benefit in return (see also Chapter 2). The sections below try to show that true
cooperation, in which each party recognizes the other's interests, can lead
to fruitful results (see also Mithun 2001). But before discussing in detail
how such a cooperation can work, I'll briefly outline further differences
between linguists' and local language workers' interests when collaborating
in compiling a language documentation.
2. The two perspectives of language documentation
Assuming that both parties have agreed on producing a documentation of
the language, comprising recordings with transcriptions and translations, a
dictionary and a grammar, they still do not share much common ground.
On the contrary, their views differ with respect to the most relevant issues:
the choice of speech genres to be recorded, the content of the recordings,
the choice of orthography, the format of the texts resulting from the record-
ings, and the content and format of the dictionary and the grammar.
2.1. Speech genres
From the linguist's point of view, a language documentation ideally
consists of a large variety of speech genres ranging from ritual language and
formal speeches, to casual gossip (see Chapter 1). Local language workers
take a different point of view. Gossip, for instance, generally not only ap-
pears unsuitable for school materials, but also socially inappropriate, and
the knowledge of the ritual language may be restricted to the holders of
chiefly titles. In order to avoid the impression of being intrusive, the
linguist should be sensitive to the people's attitudes, and be content with what
they are prepared to offer. For a detailed discussion on rules of conduct, see
Chapter 2.
2.2. Content of recordings
The same applies to the content of the recordings. The ideas of the linguist
or anthropologist may not meet with the approval of the local language
workers. An additional potential complication lies in the fact that local lan-
guage workers may well disagree among themselves. While some people
would like to preserve the old legends because they are no longer transmit-
ted to the younger generations, others may argue that they belong to the
dark ages, and are unsuitable for children's education. The researcher
should try to avoid becoming involved in debates on such matters of
principle (their outcome might be counterproductive) but simply try to convey
the message that the community's oral literature will be lost forever unless
it is recorded now, and that the community may well later regret its loss. A
list of ethnographically interesting topics that might be suggested to the
speech community is discussed in Chapter 8.
2.3. The format
Speaking and writing are conceptually different activities, and so is a lan-
guage in its spoken and written form. For the scientific documentation of a
language it would suffice to render all recordings utterance by utterance in
a phonetic transcription with a translation, and the metadata that explain all
relevant circumstances of the recording. This is, however, not necessarily
what indigenous speech communities want.
For language maintenance measures and educational purposes, tran-
scriptions are not regarded as suitable because they usually contain hesita-
tion phenomena, speech and factual errors, repetitions, etc. They need to be
edited. But these edited versions differ in many respects from oral literature
in written form. In fact, they represent a quite different kind of language to
the oral narrative in respect to its physical nature, its conceptualization, its
discourse structure, its phraseology, its grammar, and its lexicon. Conse-
quently, such educational materials may introduce a new form of the lan-
guage (or at least type of text) into the community, hence arguably changing
the language and the culture of language use. For this reason, it might be
argued that their value for language maintenance and the preservation of
cultural identity is doubtful, as the written form of the language will be
heavily influenced by the dominant language and culture (Foley 2003).
Undeniably, the written language developed for educational purposes will
be different from the spoken one, but the real question is whether one
should deny the community's desire to have reading materials in their
language. Surely if the community expresses such a wish, it is the linguist's
obligation to provide all the assistance she can. Language documentation
and language maintenance do not mean preserving the language untouched
like a fossil in a museum. In fact, language purism can be most harmful to
endangered languages (Florey 2004). In creating an authentic literature that
can be rooted in oral traditions (though this is not a prerequisite), the lin-
guist can encourage and assist the people to find their own ways of devel-
oping new modes of expression, rather than taking the written dominant
language as a model (see below the section on editing texts).
Such somewhat artificially created text editions are in fact innovative
communicative events and may lead to a change of the language's structures
(for a brief account of such changes, see Raible 1994). However, they
are not useless for future linguistic research. Provided that the linguists do
their job well, they reflect the native editors' linguistic competence and the
expressive potential of the language, and thus are a genuine object of
linguistic research. Therefore, these edited versions deserve the linguist's
attention and should also be archived and accompanied by metadata,
translations, and comments on their language and content.
2.4. Orthography
While linguists as second language learners often prefer a phonological
orthography that allows them to correctly pronounce words they do not
know, native readers often want a more morphologically-based orthography
that just allows them to quickly recognize the words in silent reading. Or-
thographical issues are often of marginal interest for linguists, but they are
very important to the speech community (see Chapter 11).
2.5. Dictionary
Dictionary making is the area where the linguists and the community have
the most divergent interests (Hinton and Weigel 2002). As an instrument of
language maintenance and as a resource for keeping the cultural heritage in
memory, the community's dictionary will contain more encyclopedic
information than the linguist's dictionary, and thus also meet the interests of
ethnographers (see Chapter 8). Furthermore, the linguist's dictionary
contains grammatical information such as the indication of parts of speech,
details on pronunciation, inflection, and derivation that are irrelevant for
the speech community as long as the language is vital. As nobody can pre-
dict how long a language keeps its vitality, the community should accept
the presence of this kind of information. It should, however, be presented in
a way that does not impede the accessibility of the dictionary by native
speakers (see Chapter 6).
3. Setting up the project team
In fieldwork manuals, you can find sections like "Selecting an informant"
(Vaux and Cooper 1999: 7), or a list of qualities an ideal informant
should have (cf. Kibrik 1977: 54–56). But most of the time, linguists
cannot select the local language workers any more than the language
community can select a linguist from outside. Rather, the linguist will work
with people who were chosen by others or who offered themselves to work
on the project. Of course, the researchers can ask their intermediaries, their
hosts, or some institution like the local school or church to help them find
someone with particular qualities such as being literate, bilingual, and in-
terested in language work (see Section 3 in Chapter 2). But they do not
know the people's selection criteria. Not everywhere in the world is the
appointment of people for certain tasks exclusively guided by their qualifi-
cations. As much as their knowledge, experience, and skills, their social
standing and relationships play a decisive role. When I lived in a Samoan
village, for instance, it was only socially appropriate for me to work with
members of the extended family which adopted me.
As the fieldworker is a guest in the community, she is not in a position
to hire and fire (McLaughlin and Sall 2001: 195). Even if a local language
worker really fails to live up to expectations due to laziness, unreliability,
or whatever, she cannot just dismiss her/him because the consequences for
this person, for the fieldworker's relations within the community, and
eventually for the project are unpredictable. In order to avoid any disruption, it
would be wise to first consult the intermediary or some respected person in
the community in case such a problem arises.
Leaving social and political motivations aside, the language worker others
choose is the person they consider the most suitable. If he or she does not
meet the fieldworker's expectations, this means that she either could not
communicate her expectations well or that her expectations were unsuitable.
As long as someone has a genuine interest in his or her language, is
cooperative and can afford some time to work for the project, he or she will
be capable of doing some job in the project (Grinevald 2003: 67 f.). As
Dimmendaal (2001: 63) puts it, "It is a truism but worth repeating that
different informants have different talents. Some are truly excellent at
explaining semantic subtleties, while others have deep intuitions about the
sound structure of their language."
While the fieldworker is prepared for her tasks (she is a trained linguist
and has designed the research plan), her local counterparts mostly start
their work unprepared. They do not know what kind of activities linguistic
fieldwork involves and what kind of work they may be good at. In order to
avoid disappointment and frustration, some time needs to be allocated for
identifying their strengths and weaknesses, and most important, they them-
selves need some time to overcome shyness and insecurity and discover
their own talents and interests. If someone does not feel comfortable with
his or her job, the fieldworker might find him or her a different one. In my
experience, the main tasks that can be distributed among different people are:
- helping the linguist to learn the language;
- recording, transcribing, and translating;
- editorial work;
- helping the linguist to understand and translate the recordings;
- dictionary work.
4. Learning and teaching
Fieldwork is a mutual learning and teaching process for all people involved.
The researcher will learn the language and a great deal about the culture
from his local counterparts and, at the same time, teach them linguistic
methods and the organization of language work. But in contrast to the re-
searcher, the local language workers face a situation that is completely new
to them with respect to:
- the subject matter, namely, the indigenous language that has never been
  taught before as a second language;
- their role as a teacher of an adult second language learner (see Ch. 5);
- the fact that their student comes from a foreign and often dominant
  culture;
- the fact that they do not share the same culture of learning with their
  student.
When the researcher asks a native speaker to become her teacher, he or she
will probably answer, "I don't know how to teach my language." Teaching
one's native language to adult learners does not belong to any speaker's
natural linguistic competence, but is a skill that requires training and expe-
rience. In the fieldwork situation, the local teachers will develop this skill
through the cooperation with the linguist when she helps them to become
aware of the structures of their language and the various areas and methods
of research (see below Section 5).
In order to achieve fruitful teamwork, the researchers must be aware of
the possible difference between their own and the indigenous people's teaching
and learning practices. German people, for instance, teach practical and
intellectual skills by explaining in detail how you do this and that, and why
you do it; they may even add what would happen if you do it differently, or
elaborate on alternative ways of doing it. But there are other ways. One day
when I was working in Samoa, I met a German medical student who was
doing his practical year in the maternity ward of the national hospital. He
told me in near desperation, "They don't explain anything. They just want
me to watch. Just watching, how can I learn anything?" In fact, this is
precisely what Samoans and many other people expect, and are expert at:
learning by observation.
Such different attitudes and practices can lead to misunderstandings. If,
for example, you explain how to use a tape recorder and continue talking
while showing how to insert the batteries, switch on the microphone, and
press the recording button, your counterpart might have the impression that
you regard him or her as stupid: Too much talk can be interpreted as pa-
tronizing. Accordingly, the learner is expected not to bother the teacher
with questions, but be a silent observer (see Duranti [1997: 104 ff.] and
Chapter 5).
In many fieldwork situations, the indigenous teachers will be pleased
when the linguist learns to actively speak the language and they may be
disappointed when she does not make the effort to learn phrases and
paradigms by heart. But this is not necessarily so. There are speech
communities who consider it inappropriate or even intrusive when an outsider
tries to speak their language or a particular variety of the language (see also
Chapter 5). My Samoan family, for example, did not want me to speak col-
loquial Samoan.
Many linguists no longer see the production of annotated recordings,
grammars, and dictionaries as the only goal of linguistic fieldwork. Instead,
they regard it as their responsibility to train and mentor the indigenous lan-
guage workers to enable them to work on the documentation themselves
and thus consider themselves genuine shareholders in the project (Dwyer
in Chapter 2, also see Grinevald 2003). So, what do the local language
workers need to learn in order to eventually become independent of
researchers from outside? The answer is, much of what a student of
linguistics also has learned at school or at university, namely:
- handling technical tools (recorders); organizing notebooks, folders, files,
  etc. (see also Chapter 4);
- understanding the basic theoretical concepts of phonology, grammar,
  and lexicography (see further Section 5);
- making recordings, transcriptions, and translations and editing the
  transcriptions (see further Section 6);
- organizing the work flow (see Section 7).
5. Getting started: elicitation
In the very beginning of fieldwork, the researcher has to rely on elicitation.
Elicitation means getting linguistic data from native speakers by asking
questions. Accordingly, some older fieldwork manuals give advice on what
kind of questions to ask or not to ask, how to make the interview interesting
and keep the informant attentive, etc. In this manner, such manuals quite
automatically assign a passive role to the native speaker.
If we regard fieldwork as a mutual teaching-learning event, this approach
is no longer acceptable. Rather, we have to develop methods that involve
the speaker as an active partner who eventually becomes an independent
language documenter him- or herself. In the remainder of this section, we
will briefly outline how in the initial fieldwork phase the data collection
can be combined with training in basic linguistics. Section 6 describes how
the linguist and the local language workers can cooperate in building up a
corpus of annotated recordings and edited texts.
5.1. Wordlists
In the very first sessions of fieldwork, you need to compile wordlists to
investigate the phonological system and create a working orthography, or
understand an existing orthography. Traditional fieldwork manuals recom-
mend compiling wordlists by asking bilingual native speakers for the trans-
lation of wordlists in the lingua franca into their native language. Some also
provide the translation terms for such wordlists (Kibrik 1977: 103–124;
Vaux and Cooper 1999: 44–49). This method is questionable on both
linguistic and psychological grounds. The native speakers might feel
embarrassed when asked for the translation of a word they do not understand or
even worse, a word that they cannot translate because they have forgotten
the indigenous equivalent, or because there is a taboo about it. An alterna-
tive method works like this:
- explain what you need the wordlists for: this is not just for studying
  phonology and orthography; the first wordlist of about 180 words will
  also serve as the starting point to build short clauses;
- discuss what semantic fields might be suitable to start with, and perhaps
  suggest food and cooking;
- ask the native speaker to teach you words of this particular semantic
  field by dividing it into subcategories, e.g.:
    fruit and vegetables, edible animals
Thus you ask:
- tell me the names of fruit and vegetables you grow and eat
  (apples, spinach, beans, potatoes ...);
- what do you do when you make a dish with potatoes?
  (wash, peel, boil, fry ...);
- what kind of things do you use?
  (knife, spoon, tongs, pot ...).
When eliciting words expressing activities like wash, cut, boil, roast, etc.,
it is often useful to ask for commands because imperatives are in many
languages the most simple verb forms. In order to get the simplest forms
and avoid complex polite expressions which may be crucial in certain so-
cieties, establish a scenario where the mother asks her daughter to wash the
vegetables, boil the water, etc.
This method of active eliciting will not only help you to learn the first
words and short sentences, but also make the native speaker aware of the
notion of semantic fields and different word classes, e.g. verbs and nouns.
5.2. Phonology
180 words are not enough to study the phonology of a language. But no-
body is expected to do a more or less complete study of the phonology be-
fore investigating morphology or syntax. Rare sounds, sound combinations,
or tonal patterns that are overlooked in a preliminary phonological analysis
will certainly show up in the course of later analysis, and then the phonology
can be revised accordingly. In a fieldwork methods course I taught with a
speaker of Acoli, a tonal language from Uganda, most of us had difficulties
hearing the tonal patterns. Instead of spending numerous frustrating ses-
sions on phonology, we started with syntax before we had worked out the
phonology in detail. This gave us time to familiarize ourselves with other
features of the language, while at the same time, our teacher became in-
creasingly aware of the tone system of his own language by observing our
errors, thus putting him in a better position to identify and correct our
pronunciation mistakes. A German proverb says, "You learn by your
mistakes." In fieldwork, your teacher learns through your mistakes and you
will profit from this yourself.
Once you have come across two or three minimal pairs, you can try to
explain to your teacher what a minimal pair is. Avoid any linguistic terms and
work in a playful way; you may even invent games for the children, such as
finding words that nearly sound the same or finding words that rhyme, e.g.
the Teop words [bon] 'day', [bo:n] 'mangrove', and [vasu] 'stone', [tasu]
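For the linguist's own bookkeeping, candidate minimal pairs can also be pulled out of a growing wordlist mechanically. The following is only a minimal sketch under stated assumptions: each word has already been segmented by hand into a tuple of sounds, a long vowel such as "o:" counts as a single segment, and the function names are hypothetical. The four Teop forms are the ones quoted above.

```python
from itertools import combinations


def differs_in_one_segment(a, b):
    """True if two segment sequences have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def minimal_pairs(wordlist):
    """Return all pairs of words whose segment sequences differ in exactly one segment.

    wordlist maps a written form to its (hand-made) segmentation.
    """
    return [
        (w1, w2)
        for (w1, s1), (w2, s2) in combinations(wordlist.items(), 2)
        if differs_in_one_segment(s1, s2)
    ]


# Segmentations are an assumption of this sketch, not an analysis of Teop.
wordlist = {
    "bon": ("b", "o", "n"),
    "bo:n": ("b", "o:", "n"),
    "vasu": ("v", "a", "s", "u"),
    "tasu": ("t", "a", "s", "u"),
}

for pair in minimal_pairs(wordlist):
    print(pair)
```

Such a list only proposes candidates; whether a pair is a genuine minimal pair still has to be checked with the teacher.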
5.3. Short clauses
The next step is to ask the native speaker to build short clauses from the
wordlist. If English were the language to be researched and food prepara-
tion chosen as the semantic field, the list would probably contain the words
water, fish, boil, cook, and fry and the teacher would produce clauses such
as boil the water, cook the fish, fry the fish. When the linguist tries out other
combinations like *cook the water, the indigenous teacher will correct her
and, at the same time, become aware of the notion of collocation. In addi-
tion, the existence of functional words (e.g. articles) and the first rules of
word order can be learned from such short clauses. Put differently, while
the linguists learn the first rules of the grammar of this particular language,
the native speakers have their first lesson in grammatical analysis. This
would also include morphology when, for instance, the nouns inflect for
case and the verbs for gender in the imperative. Similar to phonology, the
more the indigenous teacher becomes aware of the grammatical structure of
his or her language and of collocation rules as in the case of boil, cook, and
fry, the easier it is for him or her to identify mistakes and thus become a better teacher.
6. Creating a corpus of recordings with transcriptions and translations
The documentation of a language should contain recordings of a large vari-
ety of naturally spoken language. But in the beginning, such recordings
would be much too difficult to transcribe and analyze. Short simple stories
are more suitable for both the linguist as a language learner and the
indigenous language teacher who is being introduced to the techniques of recording,
transcribing, and translating. If the speech community has a tradition of
telling stories to children, these stories may be a good starting point be-
cause their content, sentence structure, and vocabulary will be relatively
easy to understand.
Before starting with the recordings for a corpus, the linguist needs to
discuss the contents of the recordings (see above Section 2.2), and explain
the various tasks and the workflow. Once the teacher knows how to handle
the recorder, he or she can ask other people for such stories and can record
them without the outsider linguist being present. I practice this method
wherever possible because my mere presence creates an unnatural situation
that might influence the way the people talk. At worst, speakers may even
use a kind of foreigner talk (albeit unconsciously) to make sure that I under-
stand them. Or they might speak what they think is the purest or best lan-
guage, even though nobody speaks this way. Furthermore, people just might
feel uncomfortable in the presence of a foreign visitor. Because the record-
ing of people can be felt as intrusive, many linguists and anthropologists
have agreed on certain rules of conduct as further discussed in Chapter 2.
6.1. Recordings
Before advising the local language workers how to operate a recorder, it
will be useful to think about the sequence of steps to be done, e.g. insert the
battery into the recorder or the camera and the microphone, connect the
microphone, etc., and to stick to this sequence whenever you show them
how to do recordings. Explain how to hold the microphone (not too close to
the mouth) and that one should avoid noisy places for the recording. Practice
with them and let them practice with others so that they gain confidence. If
they are not used to dealing with modern technology, they will need some
time to lose their fear of doing anything wrong or breaking the equipment.
6.2. Transcriptions
If the local language workers are literate in any language, they can be asked
to make transcriptions. Even if their spelling is inconsistent or neglects
important distinctions (ones that linguists might consider indispensable),
their transcriptions will be a great help. The most important thing to teach
them is to transcribe what the speaker actually says and not to correct speech
errors and other mistakes, although such editing is certainly legitimate in
later stages of data collection and analysis (see Section 2.3).
In order to allow for a genuine participation of the speech community in
the documentation project, it is imperative that all recordings are tran-
scribed in a practical orthography readily accessible to literate but not lin-
guistically-trained native speakers. For specialists interested in phonetics
and phonology, only a selected corpus needs to be rendered in a phonetic
transcription. The more time spent on narrow phonetic transcriptions, the
smaller and the less useful the corpus of annotated recordings will be for the
speech community and for researchers who are not interested in phonetics
and phonology. For a detailed discussion on transcription and orthography
development, see Chapters 9–11.
The local language workers may be afraid of spelling mistakes in their
work. But as long as the orthography has not been standardized, there is no
such thing as a right or a wrong spelling, and they should be encouraged to
follow their intuitions, which may be relevant for the analysis of the
phoneme system (Duranti 1997: 170–172). As discussions on spelling
problems and standardization can be quite emotional and are often guided by
sociopolitical issues, they should be postponed to a later stage when the
linguist is more familiar with the speech community and the local language
workers have gained more experience in writing their language.
However, for the database of the project, especially for the lexicon, a
consistent working orthography that distinguishes between norms and vari-
ants is a prerequisite, but this does not necessarily imply that the local tran-
scribers have to learn and use it. Later, when the speech community decides
on their own norms, the working orthography can be adjusted to their stan-
dard orthography.
6.3. Translations
The purpose of the translation determines whether a free and idiomatic or a
more literal and, hence, non-idiomatic translation is given preference. For
the linguistic analysis, the latter is more suitable, but bilingual members of
the speech community and readers who are more interested in the content
than the linguistic form will certainly prefer the idiomatic one (see further
Chapters 8 and 9). For our Teop project, we solved this conflict by having
the idiomatic translation next to the transcription and giving a literal trans-
lation in the footnotes wherever we thought it necessary for a better under-
standing of linguistic structures.
It might be difficult to find people in the community who feel confident
enough to do translations on their own, but if you do, employ them even if their
knowledge of the target language is not perfect and their translations cannot
be directly used for the documentation. Any differences between their
translation and yours can provide useful indications that some of your inter-
pretations are misguided and need to be revised. Before they start, explain
to them that this will be only a raw translation and that they need not be
worried about making mistakes at this stage. If they do not know the trans-
lation equivalent of an expression, or if there is none in the target language,
they can use the original expression and explain its meaning in brackets or
in a footnote. To clearly show how the translation relates to the original, it
is advisable to number the utterances in the transcription and ask the trans-
lator to do the translation utterance by utterance using the same numbers in
his or her translation. Otherwise, there is the danger that he or she might be
inclined to retell the recording, rather than translate it.
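Where the numbered transcription and translation are eventually typed up, the utterance numbers make it easy to check that nothing has been skipped or retold. The sketch below is an illustration only: it assumes one utterance per line, each beginning with its number (optionally followed by "." or ")"), and the function names and placeholder texts are hypothetical.

```python
import re

# A line such as "12. some text" or "12) some text": number, optional punctuation, text.
NUMBERED = re.compile(r"^\s*(\d+)[.)]?\s+(.*\S)\s*$")


def parse_numbered(text):
    """Map utterance numbers to their text; lines without a leading number are ignored."""
    units = {}
    for line in text.splitlines():
        match = NUMBERED.match(line)
        if match:
            units[int(match.group(1))] = match.group(2)
    return units


def align(transcription, translation):
    """Pair utterances by number.

    Returns (pairs, extra): pairs holds (number, original, translation or None)
    for every numbered utterance of the transcription; extra lists numbers that
    occur only in the translation.
    """
    source = parse_numbered(transcription)
    target = parse_numbered(translation)
    pairs = [(n, source[n], target.get(n)) for n in sorted(source)]
    extra = sorted(set(target) - set(source))
    return pairs, extra


# Placeholder texts, not real data.
transcription = "1. first utterance\n2. second utterance\n3. third utterance"
translation = "1. first translation\n3. third translation"

pairs, extra = align(transcription, translation)
for number, original, translated in pairs:
    print(number, original, "=>", translated if translated else "(no translation yet)")
```

An utterance that comes back without a counterpart is a prompt to ask the translator, not necessarily an error: two utterances may have been merged deliberately.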
6.4. Editorial work
Since transcriptions are, as mentioned above, not a pleasant read, the local
fieldworkers may want to edit them. To prevent them from modeling
their editorial work in syntax, style, phraseology, or discourse structure on
the written dominant language (see Foley 2003), the following guidelines
may be helpful:
- as an editor, always respect the speaker's way of saying things;
- never change words and phrases for stylistic reasons, but only where the
  speaker makes an obvious mistake;
- do not change the sentence structure; do not, for instance, replace
  coordinate clauses by subordinate clauses;
- do not change direct speech into indirect speech or vice versa;
- add information only where absolutely necessary for understanding; for
  instance, when the speaker refers to things no longer known to the
  younger generation;
- do not shorten the text.
7. Work flow and time management
Efficient work presupposes a well-organized work flow and good time
management. It is impossible to plan everything in advance, because one does
not know the talents and interests of the local language workers, and they
themselves do not know them before having had some practice in linguistic
work. Therefore, it is advisable to start with only two or at most three
people and allocate some time for the development of a work routine. Later
on, more people can join the team.
The researcher and local language workers should always have a clear
idea of what kind of work needs to be done and when it needs to be done
and, therefore, jointly organize their work along the following lines:
- identify what kind of activities are required to produce a piece of
  documentation work;
- discuss who will do what;
- make a work plan by putting the various activities into a certain order
  and allocating a certain time for each;
- try to stick to the work plan; finish one thing before you do the next;
- evaluate the work plan and revise it.
As the organizer of the documentation, you will only be successful if you
divide your project into small and easily manageable subprojects, and al-
ways try to finish one before you start with the next. On no account should
the transcription, the translation, and the description of the circumstances of
the recordings be postponed until later, because the recordings might be so
context-bound that they are hardly understandable once the details of this
context have been forgotten. Duranti (1994: 31) reports on his
experiences in Samoa: "I found that even people in the same village would
misinterpret utterances when removed from their immediate context and the
fact of speaking the same language or living in the same community was no
guarantee of the accuracy of transcription and interpretation."
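If the project keeps even a simple electronic list of its recordings, the rule that metadata, transcription, and translation must not be postponed can be checked routinely. The following is a minimal sketch, not a description of any existing tool: the class, the step names, and the recording labels are all hypothetical.

```python
from dataclasses import dataclass

# The three pieces of work that, per the text, should not be postponed.
REQUIRED_STEPS = ("metadata", "transcription", "translation")


@dataclass
class Recording:
    label: str                     # hypothetical identifier of a recording
    done: frozenset = frozenset()  # names of the steps already completed


def pending(recordings):
    """Map each recording label to the required steps still missing (if any)."""
    report = {}
    for rec in recordings:
        missing = [step for step in REQUIRED_STEPS if step not in rec.done]
        if missing:
            report[rec.label] = missing
    return report


recordings = [
    Recording("story-01", frozenset({"metadata", "transcription", "translation"})),
    Recording("story-02", frozenset({"metadata"})),
    Recording("story-03"),
]

for label, missing in pending(recordings).items():
    print(label, "still needs:", ", ".join(missing))
```

Under handwriting-only conditions the same check works as a paper table with one row per recording and one column per step.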
Furthermore, with each transcription and translation you will discover
exciting features of the language, and you and the other team members will
become more and more motivated when you see how the drafts are com-
pleted one after the other. There are areas and circumstances where you can-
not use a computer and have to revert to handwriting or a manual typewriter.
If, however, the field situation allows you to use a computer, you should
also have a printer in order to give your co-workers printouts to read.
One problem with time management is that local language workers may
hesitate to conclude a piece of work. There is always something that can be
improved so that they may insist on continuous revisions. They also might
be afraid of criticism from other members of the community. And criticism
will definitely come. Here strict deadlines help. When I worked with a team
of Samoan teachers on the Samoan monolingual dictionary for school
children (Mosel and So'o 2000), I very much appreciated the strict deadline set
by the funding agency, the Australian Agency for International
Development. Meeting the deadline obliged us to make compromises and refrain
from perfectionism. One of the mistakes we discovered soon after
publication was the definition of koale 'coal', which translates into English as
"Coal is a black or dark-brown mineral found in the ground. It is used for making
fire as well as for the production of the drink Coca-Cola" (Mosel and So'o
2000: 150). But having a dictionary containing such a mistake is certainly
better than a half-finished manuscript that will never be published.
When I was working on the Teop language in Bougainville in 2004, which
was my fourth fieldtrip to the area, we established the following work flow;
note that all work had to be done in handwriting:
- recordings on MDs (Enoch, Shalom, Ulrike);
- writing down the metadata of the recordings and copying the MDs on
  cassette tapes (Ulrike);
- transcribing the cassette tapes (Enoch, Joyce, Shalom);
- checking the transcriptions and rewriting them in legible handwriting
  using a consistent practical orthography (Ulrike);
- discussing the transcription with the transcribers (Ulrike with Enoch,
  Joyce, and Shalom);
- going through the revised transcription while again listening to the tape,
  trying to understand the recording, noting down new words with the
  explanations of a native speaker (Ulrike with Siimaa and Joyce);
- translating the transcriptions into English (Ulrike with Siimaa and
  Joyce);
- giving the original transcriptions back to the transcribers Joyce and
  Enoch for editing;
- checking the edited versions, discussing and revising them (Ulrike with
  Enoch and Joyce respectively, often in the presence of Siimaa);
- giving the revised edited versions to the translator (Naphtaly);
- checking and discussing the translation (Ulrike with Naphtaly, often in
  the presence of Siimaa).
At the same time, Siimaa and Vaabero worked on example sentences and
monolingual definitions for the dictionary, while two graphic artists,
Neville and Rodney, made illustrations for the legends and the dictionary.
8. Workshops
In so-called Third World countries, workshops are frequently conducted by
foreign aid agencies and non-governmental organizations in order to dissem-
inate information, skills, or new technologies. The community might there-
fore expect you to run a workshop. However, before you enthusiastically
agree, carefully consider the following issues:
1. What is the purpose and the envisaged outcome of the workshop?
2. How much money do you have at your disposal? How much do you
have to calculate for transport, food, and accommodation for each
participant per day?
3. How many participants can be invited on the basis of this calculation,
and for how many days?
4. Who decides on the selection of participants? What are the selection
criteria?
5. Who will help with the organization (i.e. invite the participants; organize
food, stationery, and accommodation)?
6. Who decides on the agenda?
7. Who writes a report?
8. What kind of rituals are to be observed (e.g. opening ceremony, farewell)?
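Questions 2 and 3 amount to a simple calculation that is worth doing before any invitations go out. The sketch below assumes, for illustration only, that every participant costs the same fixed amount per day and that all figures are whole currency units; the function name and the numbers are hypothetical.

```python
def max_participants(budget, daily_cost_per_participant, days):
    """Largest number of participants the budget covers for a workshop of the given length."""
    if daily_cost_per_participant <= 0 or days <= 0:
        raise ValueError("daily cost and number of days must be positive")
    return budget // (daily_cost_per_participant * days)


# Hypothetical figures: transport 10, food 15, accommodation 25 per participant per day.
daily_cost = 10 + 15 + 25
budget = 3000

for days in (1, 2, 3):
    print(days, "day(s):", max_participants(budget, daily_cost, days), "participants")
```

In practice transport is often a one-off cost per participant and some costs (venue, stationery, ceremonies) do not depend on the number of participants, so the figure obtained this way is an upper bound rather than a plan.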
The less you are involved with organizational matters the better, because
that gives you more time to concentrate on the content. On the other hand,
not being involved may marginalize your professional input and be coun-
terproductive to the original goals of the workshop.
There are several kinds of workshop that are useful in the context of
language documentation projects, for example:
1. introductory workshops;
2. workshops on the standardization of the orthography;
3. workshops for the training of community language workers;
4. workshops for the training of school teachers.
Content, structure, and logistics of workshops are so much dependent on
the sociocultural context and the resources in terms of money, time, and
manpower, that only some very general points can be discussed here.
At the beginning of the project, a half-day or at most a full-day workshop
can be useful to introduce the researchers, to inform the community
about the project and what language documentation is all about, and to dis-
cuss the expectations and wishes of the community as well as those of the
researchers. This workshop can also help to recruit local language workers
and may be visited by many people.
The second type of workshop is of a very different nature and needs to
be planned with utmost care. As already mentioned above, orthography is a
sensitive, often a political, issue as the written form of a language is literally
seen as representative of the language and a symbol of cultural identity.
Practical issues like learnability or linguistic adequacy often play an inferior
role, especially when there are already two or more competing orthographies
in the community; more important in decision making are the societal
standing of the people involved, and perhaps rivalries between various
groups in the community (see also Section 2 in Chapter 11). To avoid con-
flicts and disruption as far as possible, it is advisable to keep the number of
participants small and leave their selection to the elders of the community.
If you have the funds, the equipment, and a team of three or more lin-
guists, you can also run longer workshops or a series of workshops in
which members of the community are trained in the linguistic and technical
skills needed for language documentation and revitalization. A detailed de-
scription of workshops for community language workers is found in Florey
The fourth type of workshop is better conducted by school inspectors or
senior teachers so that the linguist's role may only be to assist in the
production of workshop materials and make suggestions for how they can be
Working with a team of native speakers in the community is a most fascinating
enterprise, intellectually, socially, and personally. Each day you discover
interesting linguistic phenomena and learn more about the people's
culture. Nowhere else can you find people showing so much enthusiasm for
linguistic work. During your university studies, you may often have been in
doubt whether you are doing the right thing, especially when relatives and
friends keep asking what linguistics might be good for. But once you have
started with language documentation work, you know the answer.
I am grateful to all my Samoan and Teop colleagues with whom I had the
pleasure to gain experience in community language work. Special thanks
go to Ainslie Soo, Fosa Siliko, and Agafili Tuitolovaa, who were my
counterparts in the Primary Education Materials Project in Samoa (1997–2000),
to Ruth Saovana Spriggs, who introduced me to her people in Bougainville
and works on the documentation of the Teop language, and to the
team of the Teop language workers. In particular, I would like to express
my gratitude to my host and teacher Siimaa Rigamu of Hiovabon in Bou-
1. For a critical overview of elicitation techniques, see Himmelmann (1998: 186 ff.).
Data and language documentation
Peter K. Austin
The role of data in language documentation is rather different from the way
that data is traditionally treated in language description. For description, the
main concern is the production of grammars and dictionaries whose pri-
mary audience are linguists (Himmelmann 1998; Woodbury 2003). In these
products language data serves essentially as exemplification and support for
the linguist's analysis. It is typically presented as individual example
sentences, often without source attribution, and often edited to remove
irrelevant material. There may also be a sample text or two in an appendix to
the grammar. Language documentation, on the other hand, places data at
the center of its concerns. Woodbury (2003: 39) proposes that
direct representation of naturally occurring discourse is the primary project,
while description and analysis are contingent, emergent byproducts which
grow alongside primary documentation but are always changeable and para-
sitic on it.
For language documentation then, data collection, representation and diffu-
sion is the main research goal with grammars, dictionaries, and text collec-
tions as secondary, dependent products that annotate and comment on the
documentary corpus. The audience for language documentation is also very
wide, encompassing not only linguists and researchers from other areas
such as anthropology, musicology, or oral history, but also members of the
speech community whose language is being documented, as well as other
interested people. A significant concern for documentation is archiving, to
ensure that materials are in a format for long-term preservation and future
use, and that information about intellectual property rights and protocols
for access and use are recorded and represented along with the data itself.
Important also is mobilization of materials (cf. Chapter 15), i.e. generation
of resources in support of language maintenance and/or learning, especially
where the documented languages are endangered and in need of support.
Woodbury (2003: 46–47) argues that a good documentation corpus should be:
1. diverse – containing samples of language use across a range of genres
and socio-cultural contexts, including elicited data;
2. large – given the storage and manipulation capabilities of modern
information and communications technology (ICT), a digital corpus can be
extensive and incorporate both media and text;
3. ongoing, distributed, and opportunistic – data can be added to the corpus
from whatever sources are available and be expanded when new
materials become available;
4. transparent – the corpus should be structured in such a way as to be
useable by people other than the researcher(s) who compiled it,
including future researchers;
5. preservable and portable – prepared in a way that enables it to be
archived for long-term preservation and not restricted to use in particular
ICT environments;
6. ethical – collected and analyzed with due attention to ethical principles
(see Chapter 2) and recording all relevant protocols for access and use.
This means the corpus must be stored digitally and ideally collected digitally.
In this chapter we outline the major processes involved in collecting and
representing language data in a documentation framework, briefly discuss
the tools that are available to assist with this work, and illustrate some of
the products that documentary linguists have developed to present the re-
sults of their research. Further technical details about data structures and
encoding, tools, archiving, and outputs can be found in other chapters in
this volume (see Chapters 13, 14, 15).
It is important to emphasize that language documentation is a develop-
ing field that has emerged only recently and that is undergoing rapid change
in terms of both theory and practice. It can be anticipated that much of what
is presented in this chapter will be subject to change and development in
coming years.
Language documentation begins with the development of a project to work
with a speech community on a language and can be seen as progressing
through a series of stages, some of which are carried out in parallel. In the
following we discuss the processes that involve data collection, processing,
and storage. These can be identified as follows:
1. recording – of media (audio, video, image) and text;
2. capture – moving analogue materials to the digital domain;
3. analysis – transcription, translation, annotation, and notation of metadata;
4. archiving – creating archival objects, and assigning access and usage
5. mobilization – publication, and distribution of the materials in various
Note that at the time when a documentation project is being developed each
of these processes should be considered and relevant procedures included
in the project planning. In particular, archiving and mobilization must be
considered from the beginning of the project and not left to the end of the
project or as an afterthought (see further below).
A crucial aspect that must be kept in mind at all stages is backup.
It is prudent for any project, and especially one involving digital ICT, to de-
velop a regular and effective regime of backing up the project data, ideally on
a range of different media (e.g. CD-ROM, DVD, flash memory, external hard
disk). Backups should be incremental and intended for full recovery, should
disaster strike. One widely agreed mantra is LOCKSS: lots of copies keep stuff
safe. Remember, it is highly likely that you will lose data at some point in
your project work; however, a good backup regime will ensure that such loss
can be minimized.
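A backup regime of this kind can be partly automated. The following is a minimal sketch in Python, not a recommended tool: it assumes that the project data sits in one directory and that each backup medium is mounted as a directory. It copies only new or changed files (an incremental backup) and re-reads each copy to confirm that it matches the original. The function names `sha256_of` and `backup` are invented for this illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source_dir: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy new or changed files to every backup location and verify each copy.

    Returns the list of copies written during this run.
    """
    written = []
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        checksum = sha256_of(source)
        for backup_dir in backup_dirs:
            target = backup_dir / source.relative_to(source_dir)
            # Incremental: skip files whose backup copy is already identical.
            if target.exists() and sha256_of(target) == checksum:
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            # Verify the copy so that a faulty medium is noticed at once.
            if sha256_of(target) != checksum:
                raise IOError(f"backup copy does not match original: {target}")
            written.append(target)
    return written
```

Calling `backup(Path("project"), [Path("/media/disk1"), Path("/media/disk2")])` (both paths are placeholders) would keep two copies on separate media, in the spirit of LOCKSS. The sketch never deletes anything from the backups, so files removed from the project directory remain recoverable.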
2. Documentation processes – recording, metadata creation, and capture
A good documentation corpus will include audio and/or video materials,
ideally recorded in authentic settings and under good conditions. When
recording outdoors, if possible attempt to minimize noise from animals,
traffic, machinery and electrical equipment, wind and the environment, and
non-linguistic activities (e.g. children playing in the vicinity). When record-
ing indoors, it is important to keep away from machines and electrical
equipment, hard walls (that reflect sound), and windows. For video, it is
necessary to consider light conditions, use artificial lighting and reflectors
as appropriate, and to learn some basic filming techniques, ideally from an
ethnographic filmmaker or relevant textbook.
Note that we are often unaware of, and filter out, much of the noise and
movement around us; however, it will appear on your recordings, sometimes
over the top of the intended documentary data. There are four ways to
check on and reduce unwanted noise:
1. monitor the recording through closed headphones as you make it;
2. use a good quality external microphone and never rely on the micro-
phone built into the equipment, especially for video cameras;
3. cover the microphone with a wind shield and place it as close to the
speaker's mouth as possible, using a boom or shotgun microphone if
4. reduce all unnecessary movement and sound such as shuffling papers,
audience members moving, etc.
It is imperative to use good quality equipment (the best you can afford with
the project resources available) including good microphones, lighting,
headphones, and consumables (tapes, discs, batteries). It is also important
to divide up duties and individual researchers should not attempt to do all
the recording tasks. It is better to employ and train assistants, ideally inter-
ested members of the language community, to help with microphones, re-
corders, cameras, lights, and interaction with the people being recorded.
The choice of recording equipment (DAT, minidisk, solid state, DVD,
analogue tape) may be a compromise between quality/cost and convenience
and needs to be carefully considered, taking into account such factors as the
local climate (DAT recorders are notably unstable in tropical climates, for
example), access to electrical power, and portability. Two basic principles,
however, are: never record in a compressed format such as MP3, and never
record direct to computer hard-disk, as such techniques risk irrecoverable
data loss (on sound file formats, see below and Chapter 13). There is good
advice about audio and video recording available in textbooks such as Ladefoged
(2003) and on the internet (see especially David Nathan's fact sheet
on microphones at
Video recordings have a number of advantages: they are immediate, rich in
authenticity, multi-dimensional in context, of great interest to communities,
and can be produced independently by members of the community without
the researcher in situ. They present several problems, however, including
being more difficult to produce, harder to process (transcribe, annotate;
see below), difficult to access without a time-aligned transcription, difficult
to transfer and store (raw video requires large amounts of storage space),
and difficult to preserve in the long term (since there are as yet no univer-
sally-agreed standards for digital video). There may also be complexities
having to do with prohibitions in some communities against viewing the
images of dead people appearing in video recordings (necessitating delicate
treatment in terms of access and use restrictions). Note that in some com-
munities making video recordings is not possible for cultural reasons.
Audio recordings are less difficult to produce than video and are rela-
tively simpler to manipulate, store, and curate. Audio is also more familiar
as a medium and has been in general use by linguists for more than 50
years. Several audio processing software tools exist (see below), and ar-
chiving is less problematic than video. Conversely, audio recordings con-
tain less information than video, are difficult to access without time-aligned
transcription, and changing formats (both carriers and data formats) make
obsolescence a major problem, e.g. locating equipment to play the media
on. This is especially true of legacy sound recordings (wax cylinders, wire,
reel-to-reel tape) but will become increasingly the case for digital media,
including DAT recordings and probably minidisk as new machines are in-
troduced by manufacturers and older equipment and carriers are no longer
available for purchase.
Before starting fieldwork
It is important to test all your equipment, including cables, connectors, and
adaptors before you leave for fieldwork. Remember that one missing cable or
connector can prejudice an expensive fieldtrip so prepare your equipment be-
fore you leave for the field and get professional advice as necessary. Make
sample recordings under a range of conditions and check their quality. Trans-
fer the recordings to your computer and be sure you know how to use the rele-
vant processing software and how to burn CD-ROM or DVD backup copies of
the data. Check the data on your backups on another computer to make sure
that your writer and software are working properly. If in doubt, seek advice.
While making the audio and video recordings it can be useful to take field-
notes, including rough transcriptions, translations, relevant recording meta-
data, diagrams, drawings, and notes that can serve as aide memoire for later
writing up or checking. Fieldnotes should be written in ball-point pen (not
pencil and not washable ink!) on good quality paper (ideally in a bound
notebook) using one side of the page only. As soon as possible after the
recording session fieldnotes should be checked and elaborated, and trans-
ferred to a digital form. It is amazing how rapidly one forgets what abbre-
viated notes made while recording and interviewing mean.
Digital text has a number of advantages: it is compact, stable, easy to
store, access, and index, and can express hypertextual relationships (links).
There are a large number of tools available to process text data (text
editors, word processors, databases, browsers, etc.), and well-established
literacy traditions and knowledge of written text in many communities.
However, it is less rich than audio and video as there is always loss of
information when reducing language to writing. Text needs to be con-
nected to richer recordings of speech events through time-aligned transcrip-
tions and hyperlinks (see examples below and elsewhere in this volume).
However, written documentation outputs in the form of books are highly
valued in many language communities and, for those where ICT resources
are not available or limited, will be the ideal form of product from a docu-
mentation project.
Labelling and metadata
Whatever the recording medium, it is important to rigorously label everything,
including tapes, disks, CDs, containers, fieldnote books (number all the
pages!) immediately, consistently, and uniquely (e.g. using date and sequence
number). Write this information with an indelible marker on the object itself,
since disks and tapes can become separated from their covers. It is also im-
perative that a proper record of metadata (data about the recorded data, see be-
low), such as speaker name, recording location, dialect, etc., is made at the
same time as the recordings are labelled. You can do this in a notebook or as a
computer file (create a structured file using a spreadsheet, database or Word
table, whatever is most convenient).
2.2. Metadata creation
Metadata is data about data, i.e. structured information about events, re-
cordings, and data files. It is usually represented as text (but not always,
e.g., it could be a spoken introduction track on a video or audio recording),
but it is a different type of media because it is collected and used differently
from other types. Typically metadata is collected and stored according to
some formal specification. Metadata is needed for proper description of the
data and to enable it to be found and used (see Bird and Simons 2003).
There are two main competing international standards for linguistic meta-
data, that promoted by the Open Language Archives Community (OLAC)
and that promoted by the ISLE Metadata Initiative (IMDI), the former be-
ing less detailed than the latter. The choice of metadata format should be
made in consultation with the archives where the researcher intends to de-
posit the documentary materials (see Chapter 13).
There are several types of metadata:
1. Cataloguing – information useful to identify and locate data, e.g.
language code, file ID number, recorder, speaker, place of recording, date
of recording, etc.
2. Descriptive – information about the kind of data found in a file, e.g. an
abstract or summary of file contents, information about the knowledge
domain represented.
3. Structural – for files that are organized in a particular way, a specification
of the file structure, e.g. that a certain text file is a bilingual dictionary.
4. Technical – information about the kind of software needed to view a
document, details of file format, and preservation data.
5. Administrative – background information such as a work log (indicating
when the files were last saved or backed up), records of intellectual
property rights, moral rights, and any access and distribution restrictions
imposed by researcher and/or community.
Note that information can be metadata for more than one purpose, depend-
ing on its nature and use, e.g. the identity of the speaker in an audio record-
ing could be relevant for cataloguing purposes and/or also for determining
access restrictions.
Table 1 provides an example of the different types of metadata associ-
ated with a computer file.
Table 1. Different types of metadata associated with a computer file

Cataloguing     Title: Sasak.dic; Collector: Peter K Austin; Speakers:
                Yon Mahyuni, Lalu Hasbollah; Language code: SAS
Descriptive     Trilingual Sasak-Indonesian-English dictionary, linked to
                finderlists, morpheme forms link to Sasak text collection
Structural      Dictionary entries with headword, part of speech, gloss in
                Bahasa Indonesia and English, cross-references for semantic
                relations; SIL FOSF record format
Technical       Shoebox 5.0 ASCII text file
Administrative  Open access to all; Last edited version dated 2004-06-25;
                backup 2004-06-20 on DVD 012
Some linguistically-relevant descriptive metadata that you may wish to use
are: speaker (name, gender, age, place of birth, languages spoken, dialect,
education level), recorder (name, experience), date of recording, location of
recording, duration of recording, type (genre) of materials recorded, tran-
scriber (especially if different from the recorder), date of transcription, loca-
tion of transcription, location of all digital files, media and text (and location
of archive copies).
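To keep such metadata in a structured and portable form, a simple table stored as plain text is often enough. The following Python sketch is one possible arrangement, not the OLAC or IMDI format: the class `SessionMetadata`, its field names, and the function `to_csv` are all assumptions made for this illustration, covering a subset of the descriptive categories listed above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SessionMetadata:
    """Descriptive metadata for one recording (field names are illustrative)."""
    file_id: str
    speaker: str
    recorder: str
    date_recorded: str          # ISO 8601 date, e.g. 2004-06-25
    location: str
    duration_minutes: float
    genre: str
    transcriber: str = ""       # left empty until the recording is transcribed
    file_location: str = ""     # where the digital file is stored


def to_csv(records: list[SessionMetadata]) -> str:
    """Serialize metadata records as CSV text with a header row."""
    buffer = io.StringIO()
    names = [f.name for f in fields(SessionMetadata)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()
```

Because the result is plain comma-separated text, it can be opened in a spreadsheet, read by other programs, and later mapped onto whichever metadata standard the receiving archive requires.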
2.3. Capturing
Capture refers to the encoding and transfer of an analogue recording (as on
a cassette or reel-to-reel tape) or text written on paper to the digital domain
as a computer file. In many cases, modern ICT means that audio and video
recordings are born digital and can be transferred to computers without a
separate capture process, unless transcoding is involved (see Chapter 13).
When using digital capture software it is important to make sure you use
appropriate settings. It is also advisable to transfer fieldnotes from note-
books to computer files, ideally as soon as possible after recording so you
do not forget notes, abbreviations, and comments. As for recording, it is
imperative to name your computer files consistently and clearly, making
sure that you do not rely on directory structure to disambiguate file
names; e.g. if you have a file called fieldnotes1.doc in one directory
(folder) (for year 2004 research, say) and another also called fieldnotes1.doc
in another directory (e.g. for your 2005 notes) then any loss of
directory information will result in confusion between these files. Different
naming schemes can be used, but clarity and transparency are the goal; see
Johnson (2004) for some suggestions. It is also essential to record the relevant
metadata for the data files you create as you make them, ideally in a
structured way such as a relational table using standard terminology.
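Whether a collection of files depends on directory structure in this way can be checked mechanically. The short Python sketch below assumes only that the project files live under one root directory; the function name `find_name_clashes` is invented here. It reports every file name that occurs in more than one directory, ignoring differences of upper and lower case.

```python
from collections import defaultdict
from pathlib import Path


def find_name_clashes(root: Path) -> dict[str, list[Path]]:
    """Map each file name found in more than one directory to its locations."""
    locations = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Compare names case-insensitively, since some file systems
            # do not distinguish Fieldnotes1.doc from fieldnotes1.doc.
            locations[path.name.lower()].append(path)
    return {name: paths for name, paths in locations.items() if len(paths) > 1}
```

For the situation described above, the result would contain a single entry for fieldnotes1.doc listing both the 2004 and the 2005 copy, each of which should then be renamed so that the name alone identifies the file.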
3.1. Linguistic processing
Processing the documentary materials is a very different operation from
recording and capture, and operates on a very different time scale. Thus
each minute of audio can take hours to process in terms of transcription and
annotation (depending on familiarity with the language and the richness of
the annotation), while video is even more labor-intensive and requires
much more time to process. Video may require cutting and converting to
create manageable chunks and file sizes (this is done with computer software).
There are several tools that are useful for transcription and annotation
(see below).
Linguistic analysis, that is transcription, translation, and annotation,
requires decisions about representation, i.e. the levels and types of units.
This should make sense within the researcher's chosen framework (theory)
and needs to be made clear in the structural metadata that accompanies the
relevant files.
There are good reasons for aiming at a certain degree of standardization
when processing the materials, including transparency, portability, and ease
of sharing and access (Bird and Simons 2003). Phonetic transcription
should follow the conventions of the International Phonetic Association
(IPA), and phonemic transcription should be IPA or a regionally-recog-
nized standard. Grammatical annotation tags (i.e. the abbreviated labels for
e.g. part of speech categories) should follow general linguistic practice, e.g.
the recommendations of EUROTYP or E-MELD (including its GOLD on-
tology), with a list of relevant abbreviations and symbols provided as meta-
data (for further discussion, see Chapter 9 and Leech and Wilson 1996).
For processed data we need to distinguish between the following:
1. Character encoding – how characters are represented, e.g. Windows/
ANSI, Unicode, UTF-8, Big5, JISC.
2. Data encoding – how meaningful structures in the data are marked, e.g.
extensible markup language (XML), Shoebox/Toolbox standard markers,
MS Word table.
3. File encoding – how the data is packaged into a digital file, e.g. plain
text, MS Word, PDF, Excel spreadsheet.
4. Physical storage medium – the physical form used to store the file, e.g.
CD-ROM, minidisk, DAT, hard disk, flash memory stick.
As an example, certain documentary materials might be encoded as a hard
disk file in plain text Unicode Toolbox format (for further discussion and
examples, see Chapter 14).
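The difference between character encoding and the other levels becomes concrete when a legacy text file has to be converted to Unicode. The Python sketch below assumes, purely for illustration, that the legacy file uses the Windows code page 1252 (one common meaning of "Windows/ANSI"); the actual encoding of a given file must be established before conversion. The function name `transcode_to_utf8` is hypothetical.

```python
from pathlib import Path


def transcode_to_utf8(source: Path, target: Path,
                      source_encoding: str = "cp1252") -> None:
    """Rewrite a legacy-encoded text file as UTF-8 without altering its content.

    The default source encoding is an assumption; decoding with the wrong
    encoding may succeed silently yet yield wrong characters, so the
    result should always be checked by eye.
    """
    # Work on bytes so that line endings are preserved exactly.
    text = source.read_bytes().decode(source_encoding)
    target.write_bytes(text.encode("utf-8"))
```

Only the character encoding changes in this operation: the data encoding (for instance Toolbox field markers) and the file encoding (plain text) remain what they were.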
When we consider file encoding it is useful to distinguish between pro-
prietary formats and non-proprietary formats. A proprietary format is one
whose structure is determined and owned by the maker of the software that
stores it, e.g. MS Word, Excel, Access, FileMaker Pro, or Sony ATRAC
(the audio format on minidisk). As such, this means that the data is not di-
rectly accessible, and the format is subject to change (so that attempting to
open a file stored in one version of the software with a later version may
not always work see Chapter 14 for examples). As a result, proprietary
formats are not ideal for long-term storage (i.e. the encoding is not portable
and reusable). Non-proprietary formats, e.g. Unicode plain text, or WAV
audio, are open and transferable between hardware and software.
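Because WAV is an open format, the technical properties of a recording can be read with freely available general-purpose tools. As a minimal sketch, the standard Python module `wave` opens uncompressed (PCM) WAV files and raises `wave.Error` for files it cannot interpret; the function name `describe_wav` and the keys of the returned dictionary are choices made for this example.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic technical metadata for an uncompressed WAV file."""
    with wave.open(path, "rb") as audio:
        rate = audio.getframerate()
        return {
            "channels": audio.getnchannels(),
            "sample_rate_hz": rate,
            "bit_depth": audio.getsampwidth() * 8,   # bytes per sample -> bits
            "duration_s": audio.getnframes() / rate,
        }
```

The values returned are exactly the kind of technical metadata that an archive is likely to ask for, and a failure to open the file is an early warning that it is not in the expected format.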
When processing the data it can be useful to distinguish three kinds of
contexts, each requiring different data formats (see also Johnson 2004):
1. working context – the way the data is stored for on-going research work
of annotation and analysis;
2. archiving context – how the materials are to be stored for long-term
preservation (see below);
3. presentation context – the form of the data for distribution and publication.
Researchers need to develop ways to flow data between contexts, typically
by exporting the data into some structured format that the software used for
other contexts can read (see Thieberger 2004 for some examples). Thus, a
common working format for text annotation is Shoebox/Toolbox; this can
be exported into rich text format (RTF) to be read by MS Word in order to
produce presentation format PDF documents for printing and distribution.
Table 2 gives examples of the different format types for the three contexts.
Table 2. Data formats in different contexts

        Working            Archiving       Presentation
Text    Word, XLS, FMpro,
Video   MPEG2              MPEG2, MPEG4    QuickTime, AVI, WMV
As an illustration, Figure 1 is a screen shot which shows Shoebox format
working context data for the Australian Aboriginal Guwamu language.
The window on the top left is lexical information, on the lower left is elicited
sentence data with morpheme-by-morpheme glossing annotation and
free translation, on the top right is descriptive metadata about the people
involved in the project, and on the bottom right metadata about abbrevia-
tions used in the lexical and sentence annotations. Note that the metadata is
hypertextually linked to the data in the two left-hand windows, while the
lexical root is hypertextually linked from the morpheme field in the sen-
tence window, and the sentence number links from the example field in the
A possible presentation form of the illustrated lexical entry is the following:
bawurra n
male red kangaroo, Note: used as a generic
term for kangaroos, cf. gula, gumbarr,
dhugandu, [SAW, WW], e.g. Gu206, Gu255
Figure 1. Working with Shoebox
Note that in the presentation format, typography (e.g. italics, bolding, font
type, indentation) and dictionary literacy conventions are employed to par-
tially represent the data structure (see Nichols and Sprouse 2003 for other
examples). The sentence example can be presented as follows:
ngaya   banbalguya     nhunga  yilunha   bawurra
ngaya   banba-lgu-ya   nhunga  yilu-nha  bawurra
1sgnom  spear-fut-1sg  3sgacc  this-acc  k.o.kangaroo
pro     vtr-suff-suff  pro     dem-suff  n
'I will spear this red kangaroo' [SAW, WW, Np12As004]
Linguists' conventions (such as the Leipzig Glossing Rules) have been
established for annotated text so that, as in the given example, horizontal
and vertical alignment on the page represents relationships between
different types of data.
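The alignment itself is easy to derive once the tiers are stored as separate, parallel sequences. The following Python sketch pads each word-level column to the width of its widest item; it is an illustration of the principle rather than of how Shoebox or any other tool produces its output, and the function name `align_tiers` is invented here.

```python
def align_tiers(tiers: list[list[str]]) -> list[str]:
    """Pad each column to its widest item so that parallel tiers line up."""
    columns = len(tiers[0])
    if any(len(tier) != columns for tier in tiers):
        raise ValueError("all tiers must contain the same number of items")
    widths = [max(len(tier[i]) for tier in tiers) for i in range(columns)]
    return [
        "  ".join(item.ljust(width) for item, width in zip(tier, widths)).rstrip()
        for tier in tiers
    ]


# The four word-aligned tiers of the example sentence given above.
tiers = [
    "ngaya banbalguya nhunga yilunha bawurra".split(),
    "ngaya banba-lgu-ya nhunga yilu-nha bawurra".split(),
    "1sgnom spear-fut-1sg 3sgacc this-acc k.o.kangaroo".split(),
    "pro vtr-suff-suff pro dem-suff n".split(),
]
print("\n".join(align_tiers(tiers)))
```

The check on tier length matters in practice: a mismatch between the number of words and the number of glosses is a common annotation error, and it is better to be told about it than to have the columns silently drift.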
The data structures encoded in these Shoebox files are relatively complex (see
the diagram in the Appendix below, and Austin 2005) but the links between
the data fields are lost in the process of export to RTF and presentation on the
printed page. Note that the links could be captured in an HTML file, however,
and thus be available to be viewed with a web browser. We discuss archival
formats for these examples below.
3.2. Tools for linguistic analysis and processing
There are a range of computational resources that facilitate creating, view-
ing, querying, or otherwise using language data. They include application
programs, components, fonts, style sheets, and document type definitions
(DTD). Application programs can be classified into two types:
1. general purpose software for which the user must design the data struc-
tures and can write application programs to manipulate the data and
carry out various tasks. Examples are MS Word and Excel, and FileMaker
Pro. Such software is powerful and flexible; however, it stores
data in a proprietary format which is not optimal for long-term storage
and access;
2. specific purpose software which is designed to be used for particular
tasks. Examples of such software in common use by language documenters
include: Transcriber and EXMARaLDA (EXtensible MARkup
Language for Discourse Annotation; see Schmidt 2004) for time-aligned
audio annotations, Shoebox/Toolbox for text and lexicon
annotations, Praat for speech analysis and annotation, ELAN for audio
and video annotation, and IMDI Browser for cataloguing and admini-
stration metadata.
Some of the specific purpose software is discussed and illustrated else-
where in this volume.
In addition to the tools mentioned above, there also exist converter programs
for transferring data between encoding formats, such as those developed at
MPI-Nijmegen for uniting Transcriber and Shoebox encoded files, and con-
verting them to XML for use with ELAN. Further information about available
programs and computational resources can be found at the E-MELD School
of Best Practice website and in the list of resources at the back of this volume.
3.3. Archiving
Digital archiving involves the preparation of the recorded/captured data,
metadata, and processed analysis so that the information it contains is
maximally informative and explicitly expressed, encoded for long-term ac-
cessibility and safely stored with a reputable organization that can guarantee
long-term curation. A number of digital language and music archives exist;
the DELAMAN network created in 2003 links many of them (see resources
list). Digital archiving offers opportunities to store data for communities to
use, other scholars to access, and for preservation for future generations of
community members, the general public, and researchers. Note that not all
recorded data has to be archived (e.g. unprocessed video files) but we
should aim to make our materials archivable, that is, richly structured docu-
mentations maximizing the possibilities of the digital medium. Archiving
must be included as a process in our language documentation project plans,
and it is advisable to seek assistance with planning for archiving from an
archivist at the beginning of project conception.
Note that archiving is not publication (only those materials prepared for
distribution will be published by the archive), nor is it backup (the archive
will generally not accept backup copies of files alone but will expect the
data and metadata to be explicitly described, often by requesting that de-
posit forms be completed for each archival object). Archives also com-
monly have systems in place to manage protocols for intellectual property
rights, and for specification of access and usage rights, e.g. that a certain
archival object is only available to members of the speaker community. The
depositor should establish these by discussion and negotiation with the
owners, and describe them via metadata and deposit protocols. Data sensi-
tivity is not a reason to not archive; it is better to deposit data in an archive
with restrictions than not deposit at all. Researchers should also make
preparations for assigning their rights into the future by including information
in their will and ensuring that their executors understand how to assign
them on their death.
The preferred format for archiving text materials is eXtensible Markup
Language (XML), a document description language used to encode the
content of structured documents (see Sperberg-McQueen and Burnard
2002). XML is a subset of SGML (standard generalized markup language)
and is used to explicitly describe a domain of knowledge through markup
tags enclosed in angle brackets (see Chapter 14 with the example of a play
structure implicit in a published document). Each part of a structured document
is described within a defined and logical structure (stored in XML
schemas or DTDs, document type definitions). XML is a good archival
format because XML documents explicitly represent data structure, and are
directly readable by humans even if computer software to display the
documents is not available.
XML documents are typically created by export from working context
materials, rather than being directly written by the researcher, because the
process of writing well-structured XML tends to be tedious and error prone
(various XML editors exist and these can be used to create documents, to
check markup tag syntax [well formedness], to create DTDs, and to ensure
that a document complies with a schema or DTD). XML-encoded documents
can be transformed into various archival and presentation formats by
XSLT, extensible stylesheet language transformations. Thus, an XSLT
stylesheet could create a concordance of an annotated text collection, or HTML files
for web publication. Archivists can provide advice on possible transforma-
tions of XML documents.
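As a rough illustration of the kind of transformation just described, the following sketch turns a small lexicon into an HTML finderlist. It uses Python's standard xml.etree.ElementTree module rather than XSLT, a substitution made here only to keep the example self-contained. The element names (lexicon, entry, form, gloss) and the placeholder values are assumptions for illustration, not an archival schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# Placeholder lexicon: element names and values are illustrative assumptions,
# not an archival schema and not real language data.
LEXICON_XML = """<lexicon>
  <entry id="1"><form>form-one</form><gloss>second gloss</gloss></entry>
  <entry id="2"><form>form-two</form><gloss>first gloss</gloss></entry>
</lexicon>"""


def lexicon_to_html(xml_text):
    """Return an HTML list of 'gloss: form' lines sorted by gloss (a simple finderlist)."""
    root = ET.fromstring(xml_text)
    pairs = sorted(
        (entry.findtext("gloss", default=""), entry.findtext("form", default=""))
        for entry in root.findall("entry")
    )
    items = "\n".join(
        f"  <li>{escape(gloss)}: <i>{escape(form)}</i></li>" for gloss, form in pairs
    )
    return f"<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    print(lexicon_to_html(LEXICON_XML))
```

The same source file could feed other outputs (a printed dictionary, a concordance) by swapping the rendering function, which is the practical benefit of keeping structure explicit in the archived XML.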
The following are two examples of XML encoding. First, consider the
structure of a typical bilingual lexicon (such as seen in the Guwamu example
presented above):
1. lexicons contain entries;
2. the attributes of entries are: form, category, subcategory, language,
meaning specification (and any other additional information such as
notes, speaker, recorder, sense relations, sentence examples);
3. meaning specification can be gloss (for morpheme-by-morpheme gloss-
ing and finderlist production) and definition;
4. cross-references to other lexical entries have a sequential order chosen
by the lexicographer;
5. cross-references to sentence examples also have a specified sequential
order.
Table 3 shows the Guwamu sample entry discussed above in XML form,
which would be a possible archival representation.
Table 3. Example of an XML structure (lexicon entry)
<?xml version="1.0" encoding="ISO-8859-1"?>
<entry id="161">
<def>male red kangaroo</def>
<note>used as a generic term for kangaroos</note>
<cf n="1">gula</cf>
<cf n="2">gumbarr</cf>
<cf n="3">dhugandu</cf>
<eg n="1">Gu206</eg>
<eg n="2">Gu255</eg>
</entry>
If we view this data using XML-aware software such as an XML editor or a
web browser such as Mozilla Firefox or the current version of MS Internet
Explorer, the hierarchical relationships between the data entities are dis-
played as in Figure 2.
Figure 2. XML structure display (lexicon entry)
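A comparable view of the hierarchy can be produced without a dedicated editor. The sketch below, using Python's standard xml.etree.ElementTree module, first checks that a fragment is well-formed and then prints an indented outline of it. The fragment reuses the elements of Table 3; quoting the id attribute and closing the entry element are assumptions about the intended markup, made so that the fragment parses.

```python
import xml.etree.ElementTree as ET

# Elements as in Table 3; the quoted id and the closing </entry> are assumptions.
ENTRY_XML = """<entry id="161">
<def>male red kangaroo</def>
<note>used as a generic term for kangaroos</note>
<cf n="1">gula</cf>
<cf n="2">gumbarr</cf>
<cf n="3">dhugandu</cf>
<eg n="1">Gu206</eg>
<eg n="2">Gu255</eg>
</entry>"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def outline(element, depth=0):
    """Return one indented line per element: tag, attributes, and text content."""
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if attrs:
        line += f" [{attrs}]"
    if text:
        line += f": {text}"
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    if is_well_formed(ENTRY_XML):
        print("\n".join(outline(ET.fromstring(ENTRY_XML))))
```

Well-formedness is only the first check; whether the entry also conforms to a particular schema or DTD would need a validating tool, which this sketch does not attempt.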
For an annotated corpus we can set up a structure where:
1. the corpus contains sentences;
2. sentence properties are: sentence number, sentence form, sentence gloss,
speaker, recorder, sentence source reference, grammatical notes;
3. sentences contain words in sequential order;
4. word properties are: word form, word gloss;
5. words contain morphemes in sequential order;
6. morpheme properties are morpheme form, morpheme gloss, morpheme
category, morpheme subcategory.
Table 4 shows an XML representation of the Guwamu sentence shown
above. Note that the XML representation makes explicit the sequential order
of words in the sentence, and the relationships between elements, e.g. word
forms and their constituent morphemes, which are purely implicit in the typical
working format (Shoebox) and presentation format (printed example), which
rely on horizontal and vertical alignment on the page or screen to signal the
relationships.
Table 4. Example of an XML structure (Guwamu sentence)
<?xml version="1.0" encoding="ISO-8859-1"?>
<sform>ngaya banbalguya nhunga yilunha bawurra</sform>
<ft>I will spear this red kangaroo</ft>
<nt>pronoun co-occurrence with demonstrative and noun;
demonstrative inflected for accusative case</nt>
<word seq="1">
<morpheme id="053" seq="1">
<word seq="2">
<wgloss>will spear</wgloss>
<morpheme id="088" seq="1">
<morpheme id="012" seq="2">
<morpheme id="028" seq="3">
<word seq="3">
<morpheme id="092" seq="1">
<word seq="4">
<morpheme id="009" seq="1">
<morpheme id="024" seq="2">
<word seq="5">
<morpheme id="161" seq="1">
Again, we can view this representation using XML-aware software and see
its hierarchical structure; firstly in terms of a sentence made up of a se-
quence of words as in Figure 3.
Figure 3. XML structure display (Guwamu sentence, sentence level)
Now, if we view the information about words in the sentence in detail as in
Figure 4 we see that they consist of one or more morphemes in sequence
(notice that the triangle icon on the left margin changes from horizontal to
vertical as we move down the hierarchy).
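The sentence, word, and morpheme hierarchy described here can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. The tag names sform, ft, wgloss, word, and morpheme and the id and seq attributes follow Table 4; the enclosing sentence element, the helper names, and the use of empty morpheme elements are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# One (word gloss or None, [morpheme ids]) pair per word, in sentence order.
# Ids and the single word gloss are those visible in Table 4.
WORDS = [
    (None, ["053"]),
    ("will spear", ["088", "012", "028"]),
    (None, ["092"]),
    (None, ["009", "024"]),
    (None, ["161"]),
]


def build_sentence(sform, ft, words):
    """Build a sentence element whose words and morphemes carry explicit seq numbers."""
    sentence = ET.Element("sentence")  # root element name is an assumption
    ET.SubElement(sentence, "sform").text = sform
    ET.SubElement(sentence, "ft").text = ft
    for w_seq, (wgloss, morpheme_ids) in enumerate(words, start=1):
        word = ET.SubElement(sentence, "word", seq=str(w_seq))
        if wgloss is not None:
            ET.SubElement(word, "wgloss").text = wgloss
        for m_seq, m_id in enumerate(morpheme_ids, start=1):
            ET.SubElement(word, "morpheme", id=m_id, seq=str(m_seq))
    return sentence


def morphemes_in_order(sentence):
    """Return (word seq, morpheme seq, morpheme id) triples in reading order."""
    triples = []
    for word in sorted(sentence.findall("word"), key=lambda w: int(w.get("seq"))):
        for m in sorted(word.findall("morpheme"), key=lambda m: int(m.get("seq"))):
            triples.append((int(word.get("seq")), int(m.get("seq")), m.get("id")))
    return triples


if __name__ == "__main__":
    s = build_sentence(
        "ngaya banbalguya nhunga yilunha bawurra",
        "I will spear this red kangaroo",
        WORDS,
    )
    print(ET.tostring(s, encoding="unicode"))
    print(morphemes_in_order(s))
```

Because order is stored in the seq attributes rather than in the layout, the reading order can be recovered even if the elements are rearranged in the file.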
Note that the information stored in the XML representation is extremely com-
pact but is still readable by humans and the structure can be recovered, even if
the software to display the data is missing; this is why XML is a good archival
format. For more information on archival encoding, see the Text Encoding
Initiative or the resource websites listed at the end of this
book. There are numerous introductory textbooks for XML, though none of
them explicitly deals with language documentation issues.
Figure 4. XML structure display (Guwamu sentence, word level)
3.3.2. Archiving sound and video
The formats for real-time media are subject to rapid technological change,
and one of the major concerns of archives is to attend to refreshing files
(forward migration) so that they remain readable by current equipment.
For video, there are two internationally agreed compressed formats,
namely MPEG2 and MPEG4; however, there is no agreement about raw
formats, which in any case are extremely difficult to store due to the very
large file size. For audio recordings, archives generally use uncompressed
CD-quality audio (44.1 kHz, 16 bit) encoded as WAV files; some archives
also use 48 kHz and/or BWF (broadcast wave format), where metadata is
bundled together with the audio. Note that MP3, RealAudio, or Windows
Media Player formats are all compressed in a way that loses information;
they are useful for working and presentation (e.g. for publication, on web
sites) but not suitable for archiving.
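Before deposit, it can be useful to confirm that a recording actually has the properties described above. The following sketch uses Python's standard wave module to report a file's sample rate and bit depth, taking CD quality to mean 44,100 samples per second at 16 bits. The function names and the set of accepted rates are assumptions for illustration; an individual archive's requirements may differ and should be checked with the archive.

```python
import io
import wave

ACCEPTED_RATES = {44100, 48000}  # assumed acceptable sample rates, in Hz
ACCEPTED_SAMPLE_WIDTH = 2        # bytes per sample, i.e. 16 bit


def check_wav(source):
    """Return (ok, details) for a path or binary file object holding WAV data."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        compressed = wav.getcomptype() != "NONE"
    ok = rate in ACCEPTED_RATES and width == ACCEPTED_SAMPLE_WIDTH and not compressed
    return ok, {"rate_hz": rate, "bits": width * 8, "compressed": compressed}


def make_silent_wav(rate, width=2, seconds=1):
    """Create a short mono WAV of silence in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (rate * width * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_wav(make_silent_wav(44100)))
    print(check_wav(make_silent_wav(22050)))
```

In practice check_wav would be called with the path of a real recording. Note that the wave module reads only uncompressed PCM data and raises an error on other encodings, so a failure to open a file is itself a sign that the file is not in the expected format.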
More on sound archiving
There are a large number of well-equipped sound archives around the world,
ranging from regional, to national, to international coverage. Some, such as the
Austrian National Sound Archive have been established for a long time and
have extensive experience with material in older legacy formats. The International
Association of Sound Archives (IASA) publishes a great deal of valuable and
up-to-date advice about archiving issues, and the Language Archives Newsletter
focuses on archiving for linguistic research.
One of the ways that the presentation, publication, and distribution of rich
language documentations can be achieved currently is via multimedia
which links media, annotations (time-aligned transcriptions, analysis and
translations, hyperlinks) and metadata. One such format is linked files
(including HTML, MP3 sound clips, QuickTime, etc.) distributed via the
world wide web, but bandwidth can be a problem for publication of media
files: even small movies of a few minutes in a compressed format can be
megabytes in size and take a long time to download via slow connections
(the use of video streaming software can partially overcome this limitation).
There is also SMIL (Synchronized Multimedia Integration Language)
which is an application of XML to encode mixed media, text and image
information in a presentation form.
For highly complex, richly annotated, and linked media, we currently
need to use multimedia platforms such as Macromedia Director, delivered
on CD-ROM or DVD as a publication format (see Chapter 15). Unfortu-
nately, the future of these formats and the carriers is unclear and how we
can archive multimedia for the future is also currently problematic. One
current major need is good multimedia players and ways for users to interact
with the rich documentations; it is necessary to model and design interfaces
and access formats for various audiences. An example of such a format
is the Spoken Karaim CD, described by Csató and Nathan (2003b),
which presents video and audio recordings with accompanying transcriptions,
translations, glosses, lexicon, and cultural information, all of which
are linked and interactive. The interface enables users to explore their own
pathways through the corpus and to search, collect items of interest, back-
track, and interact with the corpus. It has a simple attractive interface that
enables maximum interactivity without forcing the user to digest too much
information, and has been used for Karaim language support in education,
language maintenance, and revitalization (Nathan and Csató, forthc.).
Figure 5 is a screenshot from a CD-ROM of conversational documentary
materials in the Sasak language of eastern Indonesia (Austin, Jukes,
and Nathan 2000) which is based on the Karaim model. The top-left window
shows images of the consultants who worked on the corpus, and below
it a Sasak lexicon arranged alphabetically (clicking on an entry in the lexi-
con reveals full details of the individual item in the top left window in place
of the images), and on the top right is the Sasak transcription of the conver-
sation (colors indicate the two speakers, their voices can be heard in the left
and right channels respectively of the associated time-aligned digital stereo
recording). Below the transcription is a small central window displaying
morpheme-by-morpheme analysis and gloss for a selected item in the text,
and below that, a display of the free translation in English of the speaker
turns (again color-coded). In the bottom left of the display there is a
search facility which the user can employ to find occurrences of morphemes
or glosses of interest throughout the corpus, and in the top left is a set of
buttons that produce pronominal inflected forms of verbs (via a morphological
generator) when the user moves them over a selected lexical entry
in the top left window (see Chapter 15 and Nathan 2000b for further details
about the morphological generator developed for the Spoken Karaim CD).
Figure 5. Screenshot from a CD-ROM presenting Sasak conversational materials
Language documentation is an emerging field that involves recording,
analysis, annotation, archiving, and publication of rich and complex data.
By properly structuring the data representations and planning methods to
flow data between different formats and contexts, you can work produc-
tively with your materials, as well as publish and distribute them for others
and archive your resources to preserve them for the future. It is important
that all these aspects of a documentation project be incorporated in its planning.
Most of the material presented here has been road tested in lectures at
Frankfurt University, Uppsala University, the School of Oriental and African
Studies, and the DoBeS summer school; I am grateful for comments and
feedback from audiences on these occasions. A proportion of this chapter
derives from information on language documentation and guidelines for
grant applicants co-written by David Nathan and myself and published on
the Hans Rausing Endangered Languages website. I am grateful to David Nathan for
permission to incorporate this material into the present chapter, and for his
detailed comments on an earlier draft which picked up a number of errors
and infelicities. Thanks also to Jost Gippert, Nikolaus Himmelmann, Robert
Munro, and Peter Wittenburg for suggestions for improvement of earlier
presentations. Any remaining errors are solely mine.
Appendix: Guwamu data structures
and cultures it can be useful to look at textbooks on anthropology and ethno-
graphy, such as Brewer 2000, Wolcott 2004.
2. There are a range of video editing programs, including commercially available
software such as Adobe Premiere or freeware such as VirtualDub.
3. The Guwamu data was collected by the late Stephen A. Wurm in 1955 at Goo-
dooga in Queensland from the late Willy Willis and made available to me for
study in 1980. The annotations and glossing are based on Wurm's translations
and my analysis of the materials.
4. The Shoebox/Toolbox tool automatically creates the appearance of vertical
alignment in its interlinear text function, though it actually stores spaces in the
data files to do so. Note that it does not store the relationships between the
aligned information and rather relies on the user's implicit knowledge to
interpret these.
5. The chosen example is deliberately simple in order to present the main con-
cepts here; in practice lexical entries may have much more complex structures
and relationships.
6. A number of commercial and freeware editors are available; cf. the list attached
to this volume. The screenshots below show views within the ElfData XML
editor.
7. A simple concatenative item-and-arrangement morphological model is adopted
here for purposes of illustration (this is the model assumed by the Shoebox
software); other morphological models could be used and represented in XML.
For further discussion of the structure of interlinear text and a proposal for rep-
resenting it in XML using the annotated graph formalism (Bird and Liberman
2001) see Bow, Hughes, and Bird 2003; and Hughes, Bird, and Bow 2003.
Chapter 5
The ethnography of language and
language documentation
Jane H. Hill
Documentary linguistics takes up a vision of the integration of the study of
language structure, language use, and the culture of language. Documentary
linguistics demands integration. If we are to succeed in sensitive documen-
tation, which by definition requires the deep involvement of communities,
we must incorporate a cultural and ethnographic understanding of language
into the very foundations of our research. Indeed, documentary linguistics,
because of practical necessity, may have a better chance of sustaining such
an integrated project than did its predecessors.
This chapter focuses on three requirements for the integration of the
study of the culture of language into documentary linguistics that have an
immediate practical relevance for this new discipline. The first is to move
forward with the foundational idea from Hymes' (1971) formulation of the
ethnography of speaking, as the study of the way that language structures
and uses are diversely and locally organized in the cultures of local speech
communities. Documentary linguists need to be ethnographers, because
they venture into communities that may have very different forms of lan-
guage use from those of the communities in which they were socialized as
human beings or trained as scholars.
The second requirement is to attend to the cultural foundations of elici-
tation and second language learning specifically. Documentary linguists
undertake to inhabit a very peculiar role, that of adult second language
learner in communities that almost never encounter such a creature. Simi-
larly, their consultants enter into relationships that are without precedent in
their communities. Together, they constitute so-called communities of prac-
tice, local micro-societies that are very likely to produce emergent forms of
language and interaction that evolve very rapidly. Recent work on communities
of practice, specifically learning communities, provides very useful
theoretical foundations for understanding what is likely to go on in these
most dynamic of local systems, where goals and routines are negotiated at
the level of distinct individuals.
The last requirement is attention to language ideology. One of the reasons
history speeds up at the margins is that oppression and marginalization
(and minority and indigenous language communities are almost by definition
oppressed and marginalized) produce a special intensification of language-ideological
projects. These can silence the voices of speakers, render untenable
the presence of a researcher, or impede the distribution and implementation
of the products of research, even within the community. Recent
advances in our understanding of the semiotics of language ideologies pro-
vide very useful tools for documentary linguists, who must be able not only
to identify and work among clashing ideological discourses, but assist
communities with what Nora and Richard Dauenhauer (1998) have called
"ideological clarification", to bring these discourses into line with what a
community truly desires for endangered-language resources.
1. The ethnography of language:
Relativity and the organization of diversity
Most linguists attend almost exclusively to what Michael Silverstein (1979
and elsewhere) calls denotational text. We can state the formal properties
of declarative vs. interrogative vs. imperative sentences, for instance, with-
out really paying much attention to the well-known fact that both assertions
and questions can function as commands, or that commands can be made
only under certain social conditions. But documentary linguistics on lan-
guages that are no longer taken for granted, where every construction car-
ries a heavy political burden, really does not permit us the luxury of this
particular reduction. We can find practical help in some of the foundational
principles of the ethnography-of-language tradition.
The first of these principles is that speech communities will differ not
only in manifesting different kinds of language structures, but in manifesting
different patterns of use. An ethnography of the distribution of registers,
speech-act types, and the like across the contextual landscape is critical to
linguistic documentation. For instance, certain kinds of syntactic constructions
may occur only in certain registers, so that even basic elicitation strategies
will require ethnographic preparation. Hymes' well-known SPEAKING
heuristic provides a rule of thumb to help us notice patterns of usage. The
acronym SPEAKING abbreviates some of the major components of the
speech situation: Setting/Scene, Participants, Ends, Act Sequence, Key,
Instrumentalities, Norms, Genre (Hymes 1971; Saville-Troike 2003 offers a
more comprehensive compilation of analytic units in the ethnography of
communication). We need such heuristics, because patterns of usage are
not always noticeable or easily interpretable. While we encounter some
patterns as weird and jarring, others are so easily naturalized that they be-
come invisible before we ever notice them. I have two rules I share with my
own students: The first is to always assume that a difference is meaningful,
not natural. The second is never to assume that a difference is due to inade-
quacy on the part of speakers. Indeed, for the ethnographer, the feeling that
your interlocutors are rude, or stupid, or crazy, is an extremely useful signal
that you have probably bumped into a very interesting difference.
Let me give an example of a mistake of my own, where I assumed that a
difference was natural instead of meaningful. When I was working in cen-
tral Mexico and would visit my Nahuatl-speaking friends in their homes,
they would greet me with a peculiar intonation contour that starts in falsetto
and terminates in creaky voice. Women do a particularly exaggerated ver-
sion of this squeak-creak contour. I simply did not pick up on this as the
highly formal politeness that it was. Why? I think the reason is that most
people in this population are physically rather small. It is not uncommon
for older women especially to be less than 150 centimeters tall, and I often
felt like Gulliver among the Lilliputians. The falsetto voice of the squeak-
creak contour seemed a perfectly reasonable sound to emanate from these
tiny little women, and I never stopped to think that in fact on other occa-
sions they spoke in perfectly normal voices. I had been in and out of the
field in Tlaxcala for four or five years when the Mayanist linguist Louanna
Furbee asked me at a conference party if Nahuatl speakers used the same
polite falsetto that she had heard among the Tojolobales, a Mayan community
of the Mexican state of Chiapas. I had the sort of experience that cartoonists
represent by showing a lightbulb going on in the balloon above the
character's head; suddenly I could hear my friends saying, "Coma:lehtzi:n!
Ximopano:ltitzi:no! Ximotla:li:tzi:no!" and realized that what I had been
hearing was not a natural index of how small they were, but a highly meaningful
message expressing social distance and hierarchical order. They
meant not just "Comadrita! Come in! Sit down!" They also meant, "We are
greatly honored by your presence." Fortunately my failure to understand
exactly what they were doing did not, I think, have much impact on my
work. But other cases of naturalization might have precisely such consequences.
It is for this reason that one of the ethnographic arts is to "make
strange", always to ask, "Why did that just happen? How might it have
been different? Does it mean what I think it means? Can I find evidence in
favor or to the contrary?" Staying for months on end in the hypothesis-testing
mode of "making strange", rather than simply "being there", is exhausting,
and we will always slip, but training in this ethnographic attitude
and how to sustain it is essential for documentary linguists. And the rule of
thumb "Assume difference is meaningful, not natural" is very helpful.
In contrast to differences in usage that are easily naturalized, some differences
in usage are highly salient and even startling. These are the kinds
of differences that are categorized under "cross-cultural miscommunication",
and that lead people from one community to conclude that those in another
are uncivilized or stupid. I want to give an example that will not only
show how such differences are some of the most interesting for the ethnographer,
but also show how deeply embodied in speaker habitus the differential
patterns of language use are, and how departures from them will
seem almost physically uncomfortable. One extremely annoying feature of
my fieldwork in Mexico was working with people who treated appointments
(compromisos) as less than fixed. When I tried to make appointments
for interviews, people would smile happily and tell me to come "a
una buena hora" (literally, "at a good hour", which turns out to mean
"early"), and assure me that "primero Dios" ("if God wills it"), they would be
pleased to be available to help me. About 60% of the time people in fact
kept such appointments. But on more than a few occasions I arrived for the
appointment only to learn that the intended interviewee was far away on
some errand that could have been easily predicted, such as a pilgrimage to a
saint's festival that was fixed on the annual calendar or attendance at a
market that occurred on the same day every week without fail. I knew better
than to think of them as rude or insincere, and began to think about why this
happened. Eventually I developed an account of it in terms of the theory of
types of face from politeness theory (Lakoff 1973; Brown and Levinson
1987), which was very helpful in understanding other communicative prob-
lems as well. Put briefly, these communities were heavily biased toward
attention to so-called "positive face", everybody's right to feel wanted and
liked. In local terms, to make a social commitment that you could not keep
was a fairly minor white lie, while to say "No" to someone's face, even
very politely and with elaborate excuses, was a major threat, a threat to
positive face. The threat to my negative face (the right to the autonomy that
would permit me to avoid inconvenience) was practically irrelevant. I
would be annoyed when I found myself 50 kilometers from my home base
in front of a house compound that was deserted and locked up tight, but
nobody would be there to notice. In fact, I learned that in general threats to
negative face hardly counted at all in the Mexicano communities. I learned
as well that when there was any possibility of a "No" in a matter where an
insincerely uttered "Yes" would create inconvenience of a kind intolerable
even for these people, intermediaries were sent to pose the question. So
I did have a reasonable understanding about what was going on, and even
published an article on the local culture of politeness (Hill 1980). I didn't
make the mistake of thinking of local people as rude and inconsiderate. But
now comes the tough part: I found it practically impossible to tell the little
white lies about keeping appointments that everybody else used. If someone
said, "Next week, let's go and visit the church at Ocotlan, my daughter
needs a ritual cleansing and you can take us there in your pick-up truck,"
and I knew that next week I was expected in Mexico City at a professional
conference, I would carefully (politely, in my terms; incredibly rudely in
theirs) explain that I had a previous engagement but might be able to visit
the Virgin of Ocotlan another time. I knew the "Primero Dios" routine perfectly
well, understood its deep cultural foundations, and simply could not
do it. In my cultural calculus, which I could not seem to set aside, the threat
to negative face (the idea that someone might be inconvenienced if I didn't
show up) was truly dire, while saying "No" politely to someone's face was
a very minor matter. Although I attempted the "Primero Dios" routine occasionally
when I thought the matter at hand was a fairly light one, I suspect
that I acquired a reputation as a rather rude, stuck-up, and negative person,
but I simply couldn't help it. The American linguist Doris Bartholomew,
who worked for 40 years with Otomi speakers in a part of Mexico near my
own field site, told me that she finally learned to accomplish this particular
flavor of social lie with a straight face, but that it pained her every time.
The lesson of this case is that diversity in usage is not merely colorful, or
interesting, but that it can be very, very hard to live with, even for a person
with extensive anthropological training.
A second foundational presumption of the ethnography of language is,
of course, that speech communities are not linguistically homogeneous, but
are organizations of diversity. The idea of the speech community as an
organization of diversity is a very useful one for students of minority
languages who encounter communities that are at the very least bilingual.
Especially important, of course, is the distribution of the linguistic resources
of the minority language versus the other language or languages across the
repertoire of possible speech events and acts, across genres, across the kinds
of speakers and addressees, across channels, across affective keys, and the
like.
This organization of diversity has very practical consequences for our
work. Again, we can note the problem of naturalization of difference. I
never really learned Nahuatl very well when I was working in Tlaxcala, the
reason being that hardly anybody ever spoke it to me until I had been re-
turning to the communities off and on for almost a decade. This seemed
reasonable; I speak halfway decent Spanish, and so do they, so it was just
easier for everybody to use that language and that was how I initially
thought about what was going on. But in fact this was much more than just
a matter of least effort. People spoke Spanish to any stranger or outsider,
no matter what their native language might be. It was quite astonishing to
go to a public market and hear obviously indigenous sellers speaking heav-
ily accented and even ungrammatical Spanish to equally obviously indige-
nous buyers throughout all the stages of the bargaining process until the
very end of the event, when the deal was clinched and a few words of Na-
huatl would be exchanged to express the solidarity that came in the moment
of a successful transaction.
The sociolinguistic conventions that distributed Nahuatl and Spanish
across the local contextual landscape would have had the most profound
effect on my fieldwork had I been documenting grammar rather than lan-
guage shift, since they would have made it very difficult for me to hear
certain kinds of constructions or access certain lexical domains. I think it
has been shown that gaining a speaking competence in a language under
investigation is a prerequisite to truly sensitive description and analysis.
But it was very difficult to do that in the Nahuatl communities. I did try,
but without much success. I had the opportunity once to talk to a local vet-
erinarian who had learned to speak Nahuatl, not only to facilitate his work,
but because he was deeply interested in the language and its history. He
discovered, however, that people did not respond well to him when he
spoke it to them. He said, "When I speak it, they don't respect me." He had
unwittingly run afoul of a convention of metaphorical switching that in-
volves the use of Spanish even by Nahuatl speakers when they discuss
technical topics, and, unfortunately, also of linguistic insecurity associated
with Nahuatl, the idea that people who speak it are not as good as people
who speak Spanish. If his interlocutors were relative strangers, he was
probably even insulting them by suggesting that they did not know Spanish.
Finding contexts for speaking the language in such circumstances requires
the most careful analysis of how the various languages in a community are
deployed, so that the face and reputation of all interlocutors are properly
attended to. Indeed, any community may have certain kinds of speech
events in which outsiders simply cannot successfully participate. For this
reason, and also because it is both ethical and sensible to build local capac-
ity, it is generally preferable to train local native speakers in recording
techniques and have them do most of the basic recording themselves.
2. Documenting languages in a community of practice
The kinds of diversity in patterns of usage studied by ethnographers of lan-
guage have often been treated as relatively stable in communities. But
documentary linguists must also attend to contexts in which new conven-
tions and forms of diversity can emerge very fast: the contexts of elicitation
and adult second-language acquisition that are at the center of their work.
Linguists who do field work have understood for many years that elicita-
tion is a collaborative process that requires mutual adaptation on the part of
researcher and consultant. Early attention to the problem of what happens
in elicitation and in the kind of adult second-language learning that docu-
mentary linguists undertake focused mainly on problems that would
emerge from different patterns about matters like asking questions. Charles
Briggs' Learning How to Ask (1986), in which he argues that the acquisition of
new information must be embedded in local social understandings of who
is permitted to ask what kinds of questions to whom, is a classic discussion
of this issue. Some anthropologists, including Briggs himself, have found
that the best way to work is to undertake what is locally understood as an
apprentice role. I don't think this approach is a solution to the problems
faced by documentary linguists. Communities may have well-established
institutions for apprenticeship in wood-carving or divination. They will
certainly have very well-established patterns for first-language socializa-
tion. But it is highly unlikely that they will have well-established patterns
for adult second-language learning or elicitation. And certain local patterns
for adult learning may be quite inappropriate to the documentary linguist's
task. A very good example is the routine of adult acquisition of ceremonial
orations and creation accounts among the Tohono O'odham of Arizona
described by Ruth Underhill (1946). A man (it was always a man) who
wished to learn a particular oration would approach someone who knew it
and present a very important gift, consistent with the significance of the
target text: blankets, a rifle, a horse. If the source accepted the gift, he
would then recite the oration: once. The job of the apprentice was to listen
with the most intense focus, to try to master as much of the oration as pos-
sible from this single recitation. Because if he needed to hear it again, an-
other expensive gift would be required. This particular method really would
not work for most documentary linguistics; in fact, it has been tried. The
linguist Bill Graves described in his dissertation (Graves 1988) encountering
a Pima speaker, an immensely knowledgeable elder who had been very
highly recommended by everyone, who chose to organize his role as lin-
guistic consultant along the lines of the traditional model for learning that
Underhill had described. Graves had to arrive early, because if he was even
five minutes late for an appointment Mr. Brown would refuse to talk to
him. Graves had to listen with the most extreme care, because Mr. Brown
spoke very quietly, did not like repeating things, and refused to explain
things. Mr. Brown would occasionally rise abruptly and terminate a meeting
if he was annoyed. Finally, Mr. Brown required cash up front at every
meeting. After a summer of this sort of thing, Graves reluctantly concluded
that Mr. Brown was a bit too traditional and sought a consultant who was
willing to compromise.
The absence of established routines for adult second-language learning
and linguistic elicitation in most minority-language communities makes it
obvious that elicitation will produce some kind of new system that emerges
in collaboration. New theory in learning how to learn shows that such
emergent systems are always produced in learning communities, even in
ones that seem well-established and stable. Learning communities belong
to the category of social organizations that have come to be called
"communities of practice." Eckert and McConnell-Ginet (1992: 464) provided a
founding definition of this entity: "A community of practice is an aggregate
of people who come together around mutual engagement in an endeavor ...
practices emerge in the course of this mutual endeavor." Meyerhoff (2002)
has usefully summarized the theory of communities of practice, which have
become an important unit of analysis in recent variationist sociolinguistics.
The key elements of Eckert and McConnell-Ginet's definition are "mutual
engagement," which may be harmonious or conflictual, and the "endeavor,"
which Meyerhoff defines as a "jointly negotiated enterprise," which
must be reasonably specific. Finally, a community of practice will develop
a shared repertoire of normative practices and interactional resources that
are the cumulative result of internal negotiations (Meyerhoff 2002: 528).
These subcomponents are in dialectical relationship: mutual engagement
both makes possible, and is made possible by, the negotiation of a joint
enterprise, and normative practices are negotiated and in turn facilitate ne-
gotiation and mutuality. The communities of practice in which documen-
tary linguists work are, then, different from the speech communities of
the classic ethnography of language. They may be constituted only for par-
ticular purposes, they may be ephemeral, and they can form and reform,
being salient at certain times and places and irrelevant in others. Further-
more, single individuals may belong to several of these, and their practices
and routines may overlap to some degree.
Wenger (1998) found that successful communities of practice exhibit
certain properties that are highly relevant to the documentary linguistic
enterprise. These include
1. rapid propagation of innovation;
2. jargon and shortcuts to communication;
3. the development of a certain very local insider perspective on the world;
4. a repertoire of insider resources and identifying markers such as jokes,
stories, and specific tools and representations.
Specifically linguistic variables such as phonological elements, lexical
items, and routinized phrases are a very important part of the emerging
normative order within communities of practice. That is, linguistic
resources evolve within communities of practice and may be quite specific to
them.
The problem for the documentary linguist is to be aware of these emer-
gent properties, and to try to remain conscious not only of her own role in
such emergence, but of what consultants are doing as well. To think
through thoroughly the implications of the evolving theory of the commu-
nity of practice for the documentary project lies beyond the scope of this
chapter. But I will advance a couple of simple and suggestive examples
from my fieldwork with Cupeño, undertaken more than 40 years ago when
not even the tiniest ray of social-constructionist light had yet penetrated my
American structuralist training. I spent nearly all of my time working with
a single consultant, Roscinda Nolasquez, who was then in her mid-sixties,
about the age I am as I write this. I thought of her as very old. We spent
hundreds of hours together, and became very intimate, a classic community
of practice of two, in which marginal members occasionally participated for
brief periods.
My first example of an emergent property within our community is the
fact that my fieldnotes, to my extreme embarrassment today, are very messy
and often do not have glosses, in spite of the fact that I had some training in
field methods. This is an excellent example of a rapidly innovated
shortcut. In 1962 I was immersed in the language and had no trouble
understanding anything in the notes, and really didn't need to systematically
gloss everything, and could use ellipses for predictable (to me, then) parts
of utterances. And of course this was also fine with Roscinda Nolasquez,
who was very quick-witted and did not enjoy waiting while I carefully
wrote things down and glossed them. We had developed a sort of rapid
work rhythm and my sloppy note-taking was one of its dimensions. And I
note that I'm not the only person who ever did this. Shortly before his
untimely death in 2001, Ken Hale turned over his field notes on Mountain
Pima from the late 1950s and early 1960s to my graduate student, Luis Bar-
ragan, who works on the language. Luis was very moved when Ken offered
him the notes, and awed when he discovered that only six pages into the
notes Hale, who of course was famous as a linguistic savant, stopped writing
glosses. I assure you that my glosses for Cupeño are fairly dense for many
more pages than six, but after two or three weeks of work they became
scantier and scantier. This is exactly what we would expect from findings
about communities of practice, where shortcuts emerge very rapidly, but of
course what it means is that my notes (and Ken Hale's) are now very
difficult to use. I was so immersed in my local formation of community in the
summers of 1962 and 1963 that I did not think about how, forty years down
the line, there would be nobody alive to check the odd form that I really am
not sure about any more. So one of the lessons is that documentary linguists
really do need to keep in mind, in the face of the profound force of local
social construction in the linguist-consultant relationship, that they belong
to a larger community with its own needs.
And of course consultants are contributing to the emerging structuration
of the community of practice and its products. To discuss one of these con-
tributions by Roscinda Nolasquez, I need to give you some background on
Cupeño demonstratives. Cupeño has three demonstratives: ii, a clear
proximal, axwesh, a clear distal, and a mystery demonstrative et. In writing
my reference grammar (Hill 2005) over the last few years, I had to figure
out what on earth the mystery demonstrative meant. What I determined was
that et and axwesh are contrasted as distal-proximal and distal-obviative.
Part of the evidence was that only axwesh appeared in narrative, except for
passages of reported speech, in which et could appear. The other bit of evi-
dence was that et was absolutely ubiquitous in elicited sentences, where
axwesh never appeared. For instance, in one section of field notes I was
investigating which noun stems would accept locative suffixes directly, and
which required relational noun constructions. I figured a fly could sit on
just about anything, and put a fly in all sorts of absurd places (on the
basket, on the acorns, on the string, on the berries, on the cow, etc.) in
sentences for Roscinda to translate. She always translated English "a fly" as
et kual "that fly": the distal-proximal (virtual) fly to which we were both
paying attention. The combination of the presence of et in elicitation and in
reported speech in narrative suggested that its function was distal, but
within the zone of attention of discourse participants. On the other hand,
axwesh meant distal, but not available to discourse participants. Hence, et
kual, the mutually-imagined fly of the context of elicitation, but axwesh
isily "that coyote," a character of the mythic time who appears in narrative.
With my new-found understanding of the demonstratives, I am now able
to more fully understand Roscinda Nolasquez's goals, and why she was
willing to spend so much time with me. At the time I had completely natu-
ralized the idea that an American Indian community should include only a
few elderly speakers of a heritage language. As far as I could tell there was
almost no interest in the language; Roscinda never mentioned any regrets
about being one of the last speakers, and handled most of her life in Eng-
lish. Indeed, she positively avoided talking to a couple of other women of
her age who were speakers, because she didn't like them. She called what she
did with me "teaching." But, looking at my notes forty years later, I could
see that she was trying to accomplish much more: She was documenting,
recording an archive, although she never said as much. And the distribution
of the demonstratives became one of the key pieces of evidence for this.
Roscinda really liked best of all to record stories and histories. After a
couple of months of work, she said that she wanted to tell about how the
Cupeño had moved from their original homeland at Kupa, Pal Atingve, to
their reservation at Pala. This is a dreadful story, of legal machinations by
greedy Whites and a desperate battle by the Cupeño to keep their lands,
which included valuable hot and cold springs in an arid region of San Di-
ego County in southern California. Roscinda was nine years old in 1903
when she and all her relatives were packed into wagons and moved out of
their beautiful village with its sturdy adobe houses and inviting pools of hot
and cold water and moved to Pala, to live in tents in the flea-ridden willow
thickets along the San Luis Rey river designated as their place of exile. She
told the story of the removal on three separate days. On the first day, she
narrated almost entirely from her own point of view, using almost no repor-
tative evidentials. When she resumed again on the second day, she began
by labelling her talk as aalxi "reciting history." In this section and in the
third section, the reportative evidential appears frequently, even where she
is describing scenes in which she played a role (such as the rescue of her
pet cats). On the first day, narrating as a sort of conversational account of a
personal experience, she uses the base eve-, the inflectional base of et, al-
most exclusively for the locatives. That is, even though the places being
referred to are not in the immediate discourse context, she refers to them
in the voice of an interlocutor in dialogue with the listener (in this case, me,
Jane Hill), who has been initiated into the world of the narrative and is
taken to share her point of view. But in the second and third telling, the base
eve- is entirely absent, and all references to place are with the base axwa-,
a-, the locative bases of the obviative demonstrative axwesh. That is, in her
second and third telling, Roscinda Nolasquez speaks in the voice of an
historian; she animates a tradition, rather than engaging directly with me
as her interlocutor. And it is clear that her descendants recognized what she
was doing. One of the ways that Cupeño have always used their oral
tradition is to borrow lines from it to make songs. And singers today have taken
lines from my recordings of Roscinda Nolasquez's account of the removal.
When I returned to the community a year and a half ago, I was treated to a
performance of men singing to rattles, and was very moved to encounter a
beautiful new song, composed for the 2003 centennial of the removal, that
used a line that appears in her telling: Petaamay chemixani chemtewa$h
Kupangax "We lost everything from Kupa."
In summary, the moral here is that what Roscinda Nolasquez took to be
the mutual goal of the community of practice that we formed in the sum-
mers of 1962 and 1963, to document her language and its traditions, shaped
even very fine details of her speech. In elicitation, where the sentences
would have no historic significance, her demonstrative was et. In reciting
texts where the sentences would have historic significance, she used obvia-
tive axwesh. So the notion of the community of practice teaches us that the
ethnography of language in documentary linguistics must take as its site for
study not only the organization of diversity in the speech community, but
also organization and patterning that is emergent, including emergent in the
context of elicitation and language learning itself.
3. Language ideology and documentary linguistics
The last set of ideas to be presented here involves how we can attend to the
very fast-moving dynamics of language ideology in endangered language
communities. Something of the significance of language ideologies has
been recognized for a very long time. For instance, the early ethnographers
and linguists working in indigenous North America discovered that accounts
of the creation were fully performed only in the winter, and so they could
not be elicited in the summertime; indeed, people thought it was dangerous
to do so.
But the early ethnographers thought of this kind of ideologically-driven
pattern as simply one more stable difference between them and their con-
sultants. Today we are finding, though, that these ideological systems can
evolve and spread in communities with astonishing rapidity. I will discuss
an example that unfortunately I had to observe at immediate second hand:
the contretemps around the publication of the Hopi Dictionary, for which
my husband Kenneth C. Hill was project director. The Hopi, who live in
northeastern Arizona, are the western-most of the Puebloan societies. Paul
Kroskrity (1998) has shown how in the Puebloan communities of the U.S.
Southwest, all indigenous language tends to be ideologically assimilated to
the prototype of ritual language, the language of the kivas. Kiva knowledge
is not shared with people who have not been initiated into the relevant ritual
societies, and many of the pueblos have decided that their language is
strictly for insiders. Indeed, one Hopi linguist, briefly employed at the Uni-
versity of Arizona about 30 years ago, refused to teach the language to non-
Hopi students. A second point is important in understanding the dictionary
controversy: During the period when public ceremonies are underway, the
Hopi villages construct a sort of anti-market economy that extends the
practice of the kiva to the entire village: nothing is sold, everything that one
might need is given as a gift.
This was the background ideological context in which my husband
worked for more than a decade with colleagues Emory Sekaquaptewa,
Ekkehart Malotki, Mary Black, and others to compile the great dictionary
of Third Mesa Hopi (Hopi Dictionary Project 1998). During the period of
the research only the most minor difficulties appeared; all tribal officials
were involved and participating. They all knew that the project was the
brainchild and dream of a senior Hopi, Emory Sekaquaptewa. The diction-
ary research group was extremely careful of Hopi ritual sensitivities, and a
committee of Hopi elders made sure that the dictionary would not contain
anything that would be in violation of ritual prohibitions. Arrangements
were made to distribute dictionaries free to schools and at a greatly reduced
price to Hopis, and all royalties were to be paid to the Hopi Foundation, a
non-profit foundation dedicated to Hopi education. However, when the
publication date of the dictionary neared, the University of Arizona Press
proudly published a handsome full-color brochure as an announcement of
this major work, in which a price of $80.00 for the volume was mentioned.
This announcement finally made public and unavoidable what everyone
had managed to keep in the background: that the dictionary, which had
been largely funded by money from the U.S. government's National Endowment
for the Humanities, would be available to non-Hopis, and that it would
be sold. This precipitated a difficult year during which the Hopi Director of
Cultural Affairs, Leigh Kuwanwisiwma, supported by many other Hopis,
argued that the dictionary should not be published at all because the Hopi
language should not be bought and sold, and certainly not for the benefit of
non-Hopis. Eventually the political faction that supported the dictionary
prevailed and it was published, but this result was by no means guaranteed
(Hill 2002 discusses this episode).
Recent theoretical work on linguistic ideologies can help us to under-
stand this sort of episode, and perhaps to work better and more compre-
hendingly with community members who support documentation of their
heritage language in dictionaries and development projects like language
classes. Susan Gal and Judith Irvine (1995) showed that language ideologies
nearly always invoke three major semiotic principles. These are
"iconization," "recursiveness," and "erasure." In iconization, elements of
language are shaped to match elements in the world, and by erasure, any
dimension of language that does not conform is ignored. By recursiveness,
iconization operates throughout the system, bringing elements at
every level into line. Michael Silverstein (1996, 2003) has pointed out the
operation of what he calls the "dialectic of indexicality," by which
indexicality is reshaped as reference. Miyako Inoue (2004) has shown how
certain kinds of social circumstances (episodes of rapid political economic
change, in which identities are being rapidly restructured) heighten the
rapidity and strength of these processes.
Using these theoretical tools, we can say something about the Hopi case,
in which a language and an associated way of life that had always been
taken for granted becomes the object of the most acute attention and
reflection. Such attention and reflection, and the iconization principle, yield an
exaggerated purism. In the Hopi case, by iconization the Hopi community
itself is assimilated to the prototype of the kiva, and the language is assimi-
lated to the language of the kiva. The words of the language become like
kiva objects, which should never be seen by non-initiates. Just as ritual
practice and ritual talk that occurs in the kiva is never shared with outsid-
ers, the language should not be shared with outsiders. Just as the kiva and
even public ritual is a site where nothing is bought and sold, and everything
is generously shared, no price can be put on the language, so it cannot ap-
pear in artifacts that bear a price. In this case we can see the dialectic of
indexicality: the language, which indexes Hopi identity, must be shaped so
that it refers perfectly to that identity: it must be ritually normalized, just as
the identity itself becomes the identity of a ritual participant. Thus a Hopi
word in an $80.00 dictionary published by a White institution truly makes
no sense; it is, in the words of the anthropologist Mary Douglas (1966),
"matter out of place," a form of pollution, and incites profound reactions in
those who are offended.
Anyone who works in indigenous North America, where communities
are only a few generations removed from a true genocide and continue to
confront severe economic marginalization as well as racism, will be able to
recount many examples like the case of the Hopi dictionary. The logic of
language ideology outlined above predicts that documentary linguists will
encounter similar episodes in communities that thus far have been reasona-
bly receptive to documentary projects. The theory also predicts the general
shape that such ideological projects are likely to take: they will assimilate
the resources of language to some image of purity and essence, ritually vali-
dated, and will attempt to remove the language forever from history. Need-
less to say, such ideological projects happen everywhere. However, the com-
munity of speakers of Norwegian, or French, or German is robust enough to
support the occasional outburst of purism without catastrophic results. In-
deed, purism can be a positive asset if the community has the resources to
do something about it; the examples of Israeli Hebrew and Catalan come to
mind. But small minority-language and indigenous communities may not
have such resources, and the state of the language may not give such com-
munities time to work through such episodes and achieve positive and dura-
ble syntheses. So research specifically on such episodes, and how to handle
and understand them, should be a part of our work. Leanne Hinton's work
on vernacular orthographies (Hinton 2003), a focus of ideological construc-
tion that has stymied language development in some American Indian com-
munities for decades, seems to me a perfect example of the combination of
theoretical penetration and practical recommendation that we require.
4. Conclusion
Training in documentary linguistics is very demanding, requiring as it does
expertise in linguistics, in anthropology, in recording technologies and data
management, and in a myriad other ancillary sub-fields. What I hope to
have made clear, though, is that its anthropological component needs to
include training not only in the foundations of ethnographic practice (in
"making strange," and in learning to notice and manage sites of
miscommunication) but also in such arcana as the emergent formation of norms
within a community of practice, and in the semiotics of ideology formation.
The problem for us is to make these insights as straightforward for our stu-
dents as is their training in phonology, morphology, and syntax. I hope that
we will succeed in doing this. Just as recent advances in linguistic typology
have immensely facilitated the recognition of the linguistic structures that we
encounter in field work, advances in the study of cultural processes can help
us organize our work and function more successfully, both as linguists and
as friends, colleagues, and advocates for minority-language communities.
1. Boas' (1911a) great programmatic statement in the Introduction to the
Handbook of North American Indian Languages was followed by scattered
work by Boas, Sapir, Whorf, and a few others on cultural dimensions of lan-
guage use. But this work is barely integrated with their extensive work on the
description and documentation of grammar. In the 1960s Dell Hymes, John
Gumperz, and their colleagues tried to reopen the Boasian project, proposing
what Hymes called an "ethnography of speaking," a sociolinguistics that
took grammar and phonology to be simply one dimension of a pragmatics, one
way that speakers actually use the material stuff of language. The diverse lines
of work that Hymes enumerated as the foundations of a unified discipline exist
today in over a dozen fragmented subspecialties with only occasional commu-
nication between them. Furthermore, very few people who emerged from the
ethnography of speaking tradition, even those who have worked on indigenous
and other minority linguistic communities, have made substantial contributions
to linguistic description and documentation. Although it is a bit early to tell,
the European pragmatics movement exhibits the same kinds of tendencies
toward subspecialization, and its adherents, as far as I can tell, do not seem to
be much involved in documentation of language organization at levels other
than that of rhetoric and discourse.
Chapter 6
Documenting lexical knowledge
John B. Haviland
Lexicography, the practice of documenting the meanings and uses of
words (literally by writing them down), is, through its products, per-
haps the most familiar branch of linguistics to the general public. It is also
an ancient and much theorized activity. In the Boasian trilogy for language
description of grammar, wordlist, and text, it is surely the dictionary whose
compilation is most daunting. The process begins with a learner's first
encounters with a language, and it ends, seemingly, never. Worse, it is an
endeavor fraught with doubt, centrally about when enough is enough, both for
the whole (when one should assume that the basic or most common words
of a linguistic variety have been captured and characterized) but also for
any single putative dictionary entry, given the apparent endless variety of
nuance and scope for words and forms, not to mention the idiosyncrasies of
compound or derived expressions. Moreover, despite bounteous speculation,
from many disparate linguistic traditions, on what metasemantic devices
one might employ to capture meanings, despite multiple models and exam-
ples of the results of dictionary-making, and despite ample experience, for
most of us, in the ordinary business of explaining the meanings of words,
doubt is likely to assail us on every single effort: have we said enough?
have we forgotten something? did we get even this single word right?
This chapter introduces techniques and concepts relevant to producing a
lexical database as part of a language documentation project. I concentrate
on a series of doubt-producing obstacles for the field lexicographer, with
some suggestions about how at least to address, if not to overcome them.
My coverage is deliberately partial. I draw heavily on my own fieldwork in
Mexico and Australia, to consider three general issues. First, I review
familiar morals about the nature of word meaning: concepts from linguistic
philosophy that are easy to forget in the heat of the lexicographic moment.
Second, I consider semantic metalanguages proposed to deal with different
kinds of meaningful elements, from functional to lexical and from roots
to stems. Third, and most centrally, I review techniques for systematically
extracting lexical knowledge. I largely ignore several related and important
topics: lexical variation and how to represent it (see Chapter 5), ideological
issues inescapably involved in promulgating any dictionary (see again
Chapter 5, and the discussions in Frawley et al. 2002), and wider issues in
lexical semantic theory (about sense relations, problems of extension vs.
intension, etc.), which underlie all lexicographic practice but are beyond the
present scope. I begin with a highly selective review of published materials
on lexical knowledge, especially as relevant to documenting endangered
languages.
1. Lexicography and its products
In addition to a large theoretical literature on meaning, there is a practical
tradition of dictionary-making that has spawned handbooks and histories,
as well as essays on the lexicographer's craft. These rarely provide solace
for the field worker.
The lexicon, in modern linguistics, has come to mean a repository for
otherwise anarchic facts, an inventory of arbitrary pairings of pronuncia-
tions with bundles of features. It is where language stores its idiosyncrasies
and irregularities. What systematicity there is to the lexicon so conceived
derives from feature systems themselves, taken to represent syntactic and
semantic patterning underlying surface lexical forms. Studying such pat-
terning is the usual province of lexical semantics, which catalogues various
relations between the senses of members of different subsets of lexical
forms (Cruse 1986), systematic properties of surface word classes or parts
of speech, facts of argument structure, diathesis, and the like. The main
contribution to linguistic theory of much empirical lexicography has been
in elucidating semantic and syntactic interrelationships at the level of the
surface word (Levin 1993).
Field linguistics, once the province of anthropological linguists, gave rise
to much of the underlying conceptual apparatus of lexical semantics. Early
theories pursued an analogy between phonological features and the com-
ponents of meaning in structured sets of folk terminology, from kinship
to ethnobotany, from pronoun systems to verbal typologies. The classic
studies of ethnoscience investigated culturally elaborated lexical systems,
particularly in natural domains like ethnobotany. Further empirical inspi-
ration for semantic theorizing came, for example, from the languages of
Aboriginal Australians, celebrated for their linguistic acuity and creative
genius. Dyirbal verb semantics and the properties of special Dyirbal
mother-in-law vocabulary for affinal avoidance led Dixon (1971) to pos-
tulate a fundamental difference between semantically basic or nuclear
words, requiring some sort of decomposition into sublexical meaningful
dimensions, and non-nuclear words which could be defined in terms of the
nuclear words plus other devices of the grammar. Verbal play in ritual lan-
guage games learned by Warlpiri and Lardil initiates suggested that Abo-
riginal ethnolinguists had developed sophisticated semantic analyses of
ordinary vocabulary (Hale 1971, 1982).
The classic reference manual on lexicography is Zgusta (1971). Of special
interest to the field lexicographer is Frawley et al. (2002), a collection of
essays by practicing lexicographers working on American Indian languages,
which also considers problems in creating a lexicographic practice
in communities without one. These range over theoretical issues in lexical
semantics (the nature of definition, the range of lexical knowledge that
speakers possess or a dictionary might include, and the interplay between
diachronic and synchronic lexical facts), to questions of representational
form, to sociopolitical issues in dictionary making (for whom is a dictionary
compiled and for what purposes; or, what kinds of sociolinguistic categories,
such as specialized speech genres or gender- or class-specific lexical forms,
are to be distinguished). These works go well beyond the limited
selection of topics addressed here.
The field linguist need not be a semanticist, except for practical pur-
poses, and lexicography in the service of documentation needs to strike a
balance between opposing desiderata. For example, in what sense is
completeness (however that might be defined for an endangered language)
something to strive for? What about the mix of theoretically versus
practically motivated metalanguages for representing lexical information? In the
field one should avail oneself of all possible tricks: bilingual dictionaries,
for example, can often start with existing word lists, in either the source or
the target language, and there is no reason to stand behind strict methodo-
logical principles or purism in generating lexemes for incorporation into a
lexical database.
Different lexicographic products reflect different starting points and
goals for compilers of lexical databases. Zgusta (1971) dedicates separate
chapters to the distinct issues involved in compiling polylingual (usually
bilingual) versus monolingual dictionaries. The contrast, and the choice of
which languages to include in a multilingual dictionary, raise obvious ques-
tions. For what sort of use is a lexical database produced? What knowledge
on the part of the user is presupposed in its design? Why did its compiler
produce it in the first place? Let me review several different kinds of field
dictionaries, related to my own research in Mexico and Australia. Especially
useful to me have been the introductions to two Tzotzil dictionaries by
Robert M. Laughlin (1975, 1988), one modern and the other based on a
sixteenth-century work.
In what I call the Colonial tradition, collecting vocabularies was always
a vocation of imperialists, often an accidental byproduct of exploration and
conquest. Explorers collected flora and fauna, and often they also collected
words. Somewhat less innocent were the wordlists created explicitly to aid
in conversion, conquest, and control. The friars' dictionaries of Indian languages
of the New World, or vernacular vocabularies destined for colonial
bureaucrats in Africa and India, represented unabashedly instrumental
documentation, often of languages whose eventual endangerment was a
byproduct of colonial expansion in the first place. Such wordlists were
plainly not made for the speakers of the languages so documented.
The missionary tradition continues to produce many field dictionaries,
and reading them gives some flavor of the purposes and populations served
by this particular lexicographic practice. In Chiapas, Mexico, the Summer
Institute of Linguistics (a Protestant Bible-translating organization) has
published many dictionaries of Indian languages from the region (Delgaty
and Ruiz [1978] for Tzotzil, Aulie and Aulie [1978] for Chol, to mention
just two), and they are widely used even by speakers who do not share the
religious beliefs of the translators. Such dictionaries are subtly infused with
cultural metacomment and religious ideology.
Here, for example, is a translation of the entry in Aulie and Aulie (1978)
for the Chol word ajaw, reflex of a root which means 'lord, master, God'
in other Mayan languages. According to the Aulies, the Chol word means
"espíritu malo de la tierra", and they go on to comment:
They call it lak tat 'our father'. It is believed that a person can make a pact
with it. Such a person can make requests of the spirit for or against another.
The person who establishes such relations with the ajaw is called a sacristán.
If a man or woman offends the sacristán, the latter appeals to the
spirit to curse the other, and in a short time the other person will die.
Here both the lexicographer's voice and its underlying ideological accent
are plainly on display. Thus, for the Aulies there is no apparent dissonance
between their proposed gloss, 'evil spirit of the earth', and the alternate
locution 'our father' (with a first-person plural inclusive prefix). Furthermore,
the "they" of the comment is clearly someone other than the dictionary
writers (though perhaps not different from the dictionary users). Note
finally an interesting voicing contrast. Although the possibility of making
a pact with ajaw is cited as something "believed" (presumably by "them"),
the consequences of the appeal on the part of the hypothetical sacristán (the
term itself a Spanish loan introduced into Chol during the Catholic conversion
of Chol speakers following the Conquest) are given a different epistemological
status: "in a short time the other person will die." The dictionary
thus incorporates different, perhaps mutually contradictory stances towards
Chol beliefs and practices into the lexical entries themselves.
Slightly different is the ethnolinguistic lexicographic tradition, whose
immediate origins are in ethnographic research. Sticking again to highland
Chiapas, Laughlin's exhaustive dictionary of contemporary Zinacantec
Tzotzil (1975) has the form of a traditional bilingual dictionary. The first
section gives extensive glosses (in English) of Tzotzil words, both derived
and simple, and arranged under their putative underlying roots. There follows
an English index to the Tzotzil section. Laughlin's dictionary has over
35,000 Tzotzil to English entries, making it one of the largest dictionaries
of an indigenous language of the Americas. However, it is a bilingual dic-
tionary in Tzotzil and English, limiting its direct use to the handful of peo-
ple who speak those two languages.
It is also a defiantly dialect-bound
(and even gender-bound) dictionary, documenting the way middle-aged
men spoke during the 1960s and 1970s in just the single municipality of
Zinacantán, arguably a minority variant of what has since become a dominant
Indian language in highland Chiapas with a much larger number of
speakers from other dialects. Thus, the choice of language variety in the dic-
tionary reflects accidents of the background research rather than principled
lexicographic or sociolinguistic design. Moreover, grouping entries by a
theoretical underlying root (a form which does not occur in speech, having
only psychological rather than surface reality), and stripping words of all
affixes (i.e. lemmatizing them) makes locating a word in this dictionary
something of an analytical challenge, again, a reflection of the intellectual
priorities of its producers, but with possibly inconvenient consequences for
many potential Tzotzil-speaking users.
A different variant of the ethnolinguistic wordlist, from Australia, illustrates
another aspect of the field lexicographer's dilemma. Many linguists
have documented Australian Aboriginal languages with very few remaining
speakers, often not fully fluent. My own work on the now defunct Barrow
Point language (see Haviland 1998) is a minor example. In such cases,
wordlists reflect serendipitous opportunity more than systematic planning,
and coverage is spotty, based on happenstance and luck. Nonetheless, even
haphazardly assembled lists of words may be significant when political
processes (for example, native title claims to traditional Aboriginal territory)
use linguistic evidence to establish links between land and Aboriginal
culture and society (Henderson and Nash 2002). Everything from a
place name to a plant name may turn out to have unsuspected relevance.
Thus the issue of coverage is less a matter of scientific completeness than
an ideological issue of clear political import, another matter to which I re-
turn fleetingly at the end of the chapter.
There is also a pedagogical tradition in dictionary making, source of the
most common dictionaries: those used by students to look up unfamiliar
words, or by tourists to translate menus. Here the question of dimension is
telling. Dictionaries of Mexican Spanish (for example, Lara Ramos 1986)
are explicitly graded by size: a small version meant for schoolchildren with
several thousand basic words, a larger intermediate version with more,
and so on. All celebrate Mexican Spanish, the most widely spoken variety
of the language, but one relegated to a subsidiary status by the language
academy of the colonial home country. The lexicon chosen and the facts of
usage are drawn from a huge corpus of Mexican textual material, from let-
ters, to newspaper articles, to popular songs. In Chiapas, the government
has similarly commissioned a variety of diccionarios de bolsa or "pocket
dictionaries" for the Indian languages of the state. These, along with a series
of grammatical sketches, are meant as both pedagogical tools and political
trophies, evidence of government concern for Indians in the wake of the
Zapatista uprising of 1994. Of a similar design but with an opposite ideo-
logical thrust are the illustrated school primers, or basic wordlists, designed
as literacy aids by Zapatista community schools which resist all govern-
ment aid and standardized school materials.
2. Referential indeterminacy and other pitfalls of fieldwork
What sorts of creatures are the meanings of words we wish to set down in
a lexical database? It is hard to escape the weight of many centuries of
Western philosophizing on the subject (although there are useful antidotes
in J. L. Austin's early essay "The meaning of a word" in Austin 1961).
Following Frege (1892) it is customary to begin with the notion that words
(characteristically nouns) can typically be used by speakers to pick out entities
in the world (the words' "referents") by virtue of their "sense" or
"denotation", independent of any instance of their use for referring or predicating
about a specific state of affairs. Words, on this view, are a kind of
instruction from speaker to hearer, grounded in some shared understanding
of the meanings of expressions, and typically designed to achieve com-
mon reference.
Even with apparently simple cases, of course, the conundrums of refer-
ence as a theory of meaning immediately surface. Suppose someone wants
to refer to me as I am lecturing. Consider the following expressions she
might use:
(1) Expressions referring to the same referent
a. That guy (with a pointing gesture)
b. The linguistics professor from Oregon.
c. The tall guy with a black moustache at the front of the room.
d. The Mexican with a black moustache at the front of the room.
The speaker's instructions, if successful (that is, if they induce the interlocutor
to pick me out as the person to whom she refers), rely on quite different
sorts of relations to the "meanings" of the words she uses. The first
relies on some sort of categorial understanding of what we can use "guy" to
refer to, combined with two direct indexical devices, the deictic "that" and
the pointing gesture. At the other extreme, (b) picks out a presupposably
identifiable individual from the intersection of sets of denotata generated
compositionally from the constituent words (along perhaps with presuppositions
of existence and uniqueness built into the definite article "the"). Expression
(c) combines such a compositional strategy with some implied
deixis (calculating which room and where its front is), and (d) paradoxically
is likely to succeed as well as (c) despite the fact that, though I live and
teach in Mexico and possibly even "look Mexican", I am not a Mexican at all;
therefore, the "meanings" of the constituent words cannot add up to a true
So reference, although it is where we start in field linguistics, cannot be
where we want to end up. Quine's famous "gavagai" example (Quine 1960),
in which a hypothetical and ontologically challenged linguist, in a parodied
setting of monolingual fieldwork, hears the word gavagai in the presence of
rabbits, but cannot decide whether the word means 'rabbit' or 'rabbit part'
or 'rabbit essence', etc., underscores the profound referential indeterminacy
of linguistic behavior. Perhaps more to the point is Zgusta's analogy
(Zgusta 1971: 25–26) with trying to discover the meanings of traffic signs
(in a system like the European one), but only on the basis of observing the
regularities in drivers' behavior. Perhaps, speculates Zgusta, one could in
time decipher the meanings of, say, the red, yellow, and green signals of a
traffic light by direct observation; but the meaning of a great capital H on
a rectangular shield (which means in many countries that there is a hospital
not far away) would be much harder to divine, since such signs stand in
many different kinds of locations and a uniform effect on the behaviour of
other drivers is hardly observable.
Here is a less fanciful example from the annals of real field lexicography.
In 1770, Lt. James Cook and his crew collected wordlists from the Guugu
Yimithirr language, spoken near what is now called Cooktown, in north-
eastern Australia. (One word was gangurru, the name for a particular spe-
cies of what the world now calls kangaroos). Collating the shared entries of
different observers, one can see precisely that referential indeterminacy of
the gavagai variety plagued these early lexicographers. Thus, under the
gloss "branch (with buds or stalk)", the ship's illustrator Parkinson has
maiye, Banks the botanist writes maye butai (adding the annotation "with
leaves") or mayi bambier. Based on the modern language, I assume that
these expressions are based on the word mayi 'edible plant' (so not just
any old branch is involved) and more specifically mayi bambiir 'the (edible)
fruit of the mangrove species called bambiir'. The other name Banks
records is plainly the expression mayi buday which is really an entire sen-
tence that means the edible part has been eaten or someone ate the
Cook's journal entry shows he was painfully aware of such Quinean
problems of lexical elicitation.
the list of words I have given could be got by no other manner than by
signs enquiring of them what in their Language signified such a thing, a
method obnoxious to many mistakes: for instance a man holds in his hand a
stone and asks the name of [it]: the Indian may return him for answer either
the real name of a stone, one of the properties of it as hardness, roughness,
smoothness &c, one of its uses or the name peculiar to some particular spe-
cies of stone, which name the enquirer immediately sets down as that of a
stone. (Cook's journal, see Cook 1955)
Part of the problem, clearly, is in a primitive model of both reference and
ostension: what you can pick out by pointing, or what you can show the
A very different model of exemplification is advocated by J. L. Austin
in "A plea for excuses" (Austin 1961). Faced with a pair of expressions
(famously, in Austin's case, the apparently similar "by mistake" vs. "by accident")
one elucidates the difference in their meanings by constructing a careful
example of when you would use the first expression but not the second, and
vice versa. In such a method one points not at things but at contexts of use.
Contexts themselves can be crucial in accessing lexical knowledge. In
trying to recover words from the native Barrow Point language of the late
Roger Hart, he and I worked largely through Guugu Yimithirr, a second
language for both of us (see Haviland 1998). We would often search
(sometimes quite naively) for the Barrow Point equivalent of a Guugu
Yimithirr word. Even looking for the names of plant or animal species,
however, we were often stymied, partly because the flora and fauna of Bar-
row Point were frequently different from those of Cape Bedford, more than
a hundred kilometers to the south, but partly because the environment in
general was just wrong. Roger had learned his tribal language before he
was removed from his family around the age of six. I first heard him speak
the language without hesitation, however, sixty years later. After a long
trek back overland, he and I stumbled out onto the beach where he had been
born. The country he had not seen for sixty years, its trees, rocks, and ani-
mals, seemed to speak to him in his childhood tongue, and he was only
there able to respond fluently.
Reference (or more precisely those aspects of linguistic expressions
that render them useful for achieving reference), though the staple of most
modern formal semantics, is of course an inadequate basis for understanding
"meaning" in an ordinary sense. The traditional notion of "connotation",
for example, is based on the intuition that different words can in some
sense "refer to the same thing" without, thereby, having the same "meaning".
This is not the same as Frege's classic distinction between sense
(what an expression means) and reference (what it just happens to refer to,
as a function of what it means), where two different expressions, with different
senses, can happen to refer to the same individual. Zgusta's somewhat
quaint example is the lexical triad "decease", "die", "peg out" (the last
in my own dialect of English would be something like "check out" or perhaps
"go belly up"). Zgusta (1971: 39–40) cites Armenian as a language
which has exact counterparts (vaxanvel, mernel, satkel) for these English
literally 'hear (or feel, or understand) in the heart'. The Tzotzil phrase requires
morpho-syntactic completion: the transitive verb -a'i needs both a
syntactic subject (the one who presumably "treats" some matter) and object
(the matter treated). Moreover, the word -olonton 'heart' also requires an
obligatory possessor, which (judging by the modern language) must be
coreferential with the subject of the verb, thus 'x hears with his/her OWN
heart', not with someone else's. These morphosyntactic restrictions are
not obvious from the original usage. Nor is it clear that the expression is
limited to the sort of referential context suggested by the English (or original
Spanish) gloss: it seems instead simply to suggest careful consideration of
anything, whether a negocio 'matter, business' or something less specific
or concrete. Without access to fully fluent native speakers it is impossible
to supply more lexical detail. More problematic, and perhaps more relevant
to documenting an endangered language, is the case of an archaic word, or
one in limited use in a speech community. Again, Colonial Tzotzil provides
an instructive example. The ritual language of modern Tzotzil uses the expression
tza-uk, evidently formed from a (non-attested) nominal root tza
plus an irrealis or subjunctive suffix -uk. Laughlin (1975) suggests as a
meaning for tzauk 'take heed', a translation suggested by knowledgeable
modern speakers. However, somewhat arbitrarily it seems, in the modern
dictionary he lists the word under the root tzak 'catch, grab'. Only the discovery
of the Colonial dictionary (Laughlin 1988) revealed an archaic root
tza which has entirely fallen out of existence in Zinacantec Tzotzil except
for its surviving ritual use. The Colonial lexicographers recorded it with the
meanings 'cleverness, cognizance, craftsmanship, guess, industriousness,
intelligence, opinion, prudence, skill, speculation, talent, thought', but no
evidence is provided by modern usage.
Perhaps the oldest chestnut of anthropological linguistics is denotational
diversity in lexical mappings of reality, captured in the slogan that different
words imply different worlds. One classic domain is ethno-anatomy,
the lexical (and thus, perhaps, conceptual?) slicing up of the body
into discrete parts. Whereas English speakers distinguish hands from
arms, Russian and Tzotzil speakers do not. Tzotzil has the single root
which can mean either 'hand' or 'arm'. Worse, it can also mean
'branch', 'sleeve', 'crossbar (of a cross)', 'front leg (of a cat)', and so on.
Tzotzil ni' 'nose' denotes not only noses, but any relatively sharp-pointed
protrusion, or the thin end of almost any sort of object, not necessarily a
face or a head. So why privilege a body part gloss like 'hand' or 'nose'?
Perhaps a non-anatomical model is involved in such partonomies.
Another possibility is that a basic meaning is extended in various ways
into a chain or continuum of derived meanings without well defined end-
points. Cruse (1986) argues that terms like mouth in English participate in
sense spectra where each derived or metaphorical meaning leads to
(2) sense spectrum (Cruse 1986: 71 ff.)
John keeps opening and shutting his mouth like a fish.
This parasite attaches itself to the mouths of fishes, sea-squirts, etc.
The mouth of the sea-squirt resembles that of a bottle.
The mouth of the cave resembles that of a bottle.
The mouth of the enormous cave was also that of the underground
The kinds of meaningful elements one chooses for a lexical database are
also inextricably linked to the whole of one's categorial analysis for a language:
what parts of speech are postulated, and what sorts of semantic
profiles are associated with them. The standard formal semantic starting
point that nouns will map onto things (i.e. sets), adjectives to properties
(i.e. subsets), and verbs to events or states of affairs (predicates over n-
tuples of entities), quickly disintegrates in the face of the diverse sorts of
semantic conflation (Talmy 1985) routinely observed in lexical items. A
standard example is climb in English, whose FrameNet definition is: "to
move vertically usually upwards, usually with effort." That is, the verb
suggests, in the default case, vertical movement upward, combined with the
sort of effort Fillmore called "clambering". Either of these conflated elements
(upward motion, or effort) can be suspended, but not both without
semantic oddness.
(3) Conflation in climb (Fillmore 1982)
The snake climbed (up) the tree.
The monkey climbed (up/down) the tree.
?The snake climbed down the tree.
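The pattern in (3) can be restated as a small check. This is only an illustrative sketch: the class `MotionEvent`, its two boolean fields, and the function `climb_is_natural` are hypothetical names introduced here, and reducing the judgments to two yes/no features is a simplifying assumption, not Fillmore's own formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionEvent:
    description: str
    upward: bool      # is the motion directed upward?
    clambering: bool  # does the mover use effortful, limb-based clambering?


def climb_is_natural(event: MotionEvent) -> bool:
    """'climb' sounds natural if at least one conflated element is present."""
    return event.upward or event.clambering


EXAMPLES = [
    MotionEvent("The snake climbed up the tree", upward=True, clambering=False),
    MotionEvent("The monkey climbed up the tree", upward=True, clambering=True),
    MotionEvent("The monkey climbed down the tree", upward=False, clambering=True),
    MotionEvent("The snake climbed down the tree", upward=False, clambering=False),
]

for event in EXAMPLES:
    mark = "" if climb_is_natural(event) else "?"
    print(f"{mark}{event.description}")
```

Only the last example, where both elements are suspended, is flagged as odd.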
Another commonplace of anthropological linguistics is that languages con-
flate semantic domains in unexpected ways, perhaps most characteristically
in verbs. For example, the following Tzotzil positional predicates all might
receive a similar English gloss 'stuck'.
(4) Tzotzil words for 'stuck'
Kakal    'stuck (between two surfaces)'
Chikil   'stuck (into a narrow or tight crevice)'
Katzal   'stuck (in a jaw-like orifice)'
Xojol    'stuck (inside an enclosing hole)'
Tzapal   'stuck (a pointed thing anchored in a surface)'
As the detailed glosses show, however, each word specifies different configurations,
kinds of attachment, and different shapes, in both figure and
ground. The exact conflation, I believe, involves such factors as the following,
taking the root tzap as an illustration.
(5) Conflation in tzap
a. the end of the Figure is inside the Ground;
b. the Ground need not have a y-ut 'inside' (or perhaps it must not be
so structured, conceived of instead as a mere surface);
c. the Figure has a pointed end (in Tzotzil, s-ni' 'nose');
d. typically the Figure is stuck into the Ground pointed end-first,
i.e., attached somehow, and self-supporting; and
e. typically it is vertically oriented.
Linguists have posited various classifications of semantic types, in different
root classes, and the field lexicographer should borrow shamelessly from
such typologies: from frames, to verb types (Dixon 1972), to verb classes
based on patterns of diathesis (Levin 1993), and so on.
The multiplicity of "language games" (something that cannot long remain
hidden from a serious field linguist) further complicates the traditional
referential view of lexical meaning. We use words to refer; but also
for many other things. Here is part of Wittgenstein's list:
Giving orders, and obeying them
Describing the appearance of an object, or giving its measurements
Constructing an object from a description (a drawing)
Reporting an event
Speculating about an event
Forming and testing a hypothesis
Presenting the results of an experiment in tables and diagrams
Making up a story; and reading it
Play-acting
Singing catches
Guessing riddles
Making a joke; and telling it
Solving a problem in practical arithmetic
Translating from one language into another
Asking, thanking, cursing, greeting, praying.
(Wittgenstein 1958: sect. 23)
Cruse (1986: 270 ff.) reminds us of the differences between what he calls
semantic modes, as in the contrast between the following two utterances.
(6) Semantic modes
I just felt a sudden sharp pain.
Ouch!
If semantics is only about reference and predication, then it will be difficult
to capture the meaning of "ouch!" semantically, because the word involves
neither reference nor predication. Instead, it will be important to understand
such things as interjections (see Kockelman 2003) in terms of very different
semiotic modes: indexing speaker stance, interlocutors' relationship to
speaker, putative bodily and affective states, expected responses, and so on.
That words like "ouch" are hard to model in terms of denotata does not relieve
us of the lexicographer's responsibility of recording them and explaining
how they work, a problem to which I return below.
A broader and more appropriate conception of meaning derives from
one of the well-known trichotomies of ways that signs can signify or stand
for other things, due to C. S. Peirce (1932). The three semiotic modes are
based on very different principles, although they generally co-mingle in
most signs, linguistic or otherwise. Peirce pointed out that some signs stand
for other things because of a resemblance between the sign vehicle and the
thing signified; thus a photograph of a person can stand for that person
(for example, in a directory or catalogue). The sign bears an "iconic" resemblance
to what it signifies, although the nature of the "resemblance"
can vary tremendously (consider diagrams, drawings, silhouettes, graphs,
for example, or conventionalized but nonetheless onomatopoetic words
whose sounds suggest their meanings: "moo" or "caw" or "cackle", perhaps).
There can also be an indexical relationship between sign and signified,
such that physical, spatial, or direct causal relationships exist between the
sign vehicle and what it signifies. A footprint, for example, may not re-
semble the person who made it (although it may, of course, resemble his
or her foot), but it stands as an index of the person by virtue of the fact
that it took the person's foot to make the mark (hence, indicating, for example,
that that person has been at a certain place). In language, "ouch!"
stands for (indeed, displays) sudden pain precisely because we imagine that
the pain itself somehow (involuntarily?) produces the utterance. In a similar
way, we know what person "I" or "you" refers to by observing the contextual
relationship between the sign (the word) and the person uttering it or
to whom it is uttered. Such words, then, rely on an indexical relationship
(in a context) to convey their meanings. Finally, there are signs whose sig-
nificance is essentially unmotivated by either resemblance or context: these
are Peircean "symbols", which rely on a conventional relationship between
signifier and signified (Saussure's "arbitrariness of the linguistic sign"), in
which cat means 'cat' only because that is what a particular linguistic tradition
has legislated.
Figure 1 shows a sign which transparently combines all three Peircean
semiotic modalities: the iconic resemblance between the drawing and a
(stylized) smoking cigarette; the conventional meaning (at least in much of
the Western world) of the shaded circle with the diagonal bar as a prohibi-
tion; and finally, the location of the sign itself, whose physical position
signals indexically exactly where smoking is prohibited.
Figure 1. A semiotically trichotomous sign
An adequate description of the meaning of linguistic elements must capture
all three modes of signification, although the major lexicographic traditions
limit themselves largely to conventional or symbolic meaning, almost ex-
clusively in referential terms.
3. Metalanguages for meanings and units of lexical knowledge
A second major set of issues for lexical databases is how to represent the
meanings of lexical items, and how to delimit such items in the first place.
Bilingual definitional equivalents are often manifestly inadequate, for the
reasons that have always worried translators: mismatches in grammatical
class, inexactness or lack of equivalence between target and source lan-
guage terms, incompatible ranges of meaning, infinite regress or vicious
circles, and so forth. Much depends on the available metalanguages.
My colleague Matt Pearson, trying to illustrate the interdependence of
different expressive modalities in language, challenges beginning linguistics
students as follows: "Can you define spiral without using your hands?"
(You might try it yourself before reading on.)
To repeat, everything depends on the available metalanguages. Even a
novice mathematician can respond by giving a formula for a 3-dimensional
graph, i.e., by defining a series of values for the (x,y,z) axes. Here are some
sample formulas.
(7) spiral
(cos(t), sin(t), t) [for a spring-like spiral]
(c·t·cos(t), c·t·sin(t), c·t) (where c is some constant) [for a cone-like one]
Just to see how these formulas work, on the following page are two graphs
of their results, plotted by my statistician colleague Albyn Jones.
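For readers who would rather compute than plot, the two formulas in (7) can be evaluated directly. The following is a minimal sketch; the function names `spring_spiral`, `cone_spiral` and `sample`, the default constant c = 1 and the sampling range are illustrative assumptions and are not taken from the plotted figures.

```python
import math


def spring_spiral(t):
    """Point on the spring-like spiral (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)


def cone_spiral(t, c=1.0):
    """Point on the cone-like spiral (c t cos t, c t sin t, c t)."""
    return (c * t * math.cos(t), c * t * math.sin(t), c * t)


def sample(curve, t_max=math.pi, steps=6):
    """Evaluate a curve at steps + 1 evenly spaced values of t in [0, t_max]."""
    return [curve(t_max * i / steps) for i in range(steps + 1)]


for x, y, z in sample(spring_spiral):
    # The distance from the vertical axis stays constant as the height grows.
    print(f"spring: radius={math.hypot(x, y):.3f}  height={z:.3f}")

for x, y, z in sample(cone_spiral):
    # The distance from the vertical axis grows in step with the height.
    print(f"cone:   radius={math.hypot(x, y):.3f}  height={z:.3f}")
```

The constant radius of the first curve and the growing radius of the second are what make one "spring-like" and the other "cone-like".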
The beauty of the mathematical metalanguage involved is its precision,
parsimony, and presumed universality.
The drawback is its potential ar-
cane incomprehensibility.
Moreover, though the formulas may describe
quite precisely a class of geometric forms, and perhaps even would help
define "spiral", we might still need recourse to some further (though perhaps
equally general) non-mathematical devices to capture the meaning of
the word in expressions like "Prices are spiraling out of control", or "We
must control the insane spiral of nuclear proliferation".
Figure 2. (cos(6t), sin(6t), t) for t in (0, pi)
Figure 3. (t cos(t), t sin(t), t) for t in the same range
One difficulty with presuming a language-independent semantic metalanguage
(aside from prejudging the semiosis of words and limiting it to
referential information, a worry of the previous section) is that it may do
violence to the conceptual organization of particular languages. Here is the
emic-etic dichotomy of classical anthropological linguistics: do we give
priority to language-specific organization of forms and meanings, or to descriptive
categories derived from language-external conceptualizations? An
early and instructive demonstration of the dilemma is Conklin's treatment
of Hanunoo pronouns.
(8) Hanunoo pronouns (Conklin 1962)
kuh   'I'                  1s
muh   'you'                2s
yah   's/he'               3s
tah   'we two'             1du
tam   'we all'             1pl INCL
yuh   'you all'            2pl
dah   'they'               3pl
mih   'we (but not you)'   1pl EXCL
If we adopt the standard pronominal metalanguage, kuh will be glossed as
"first person singular" or tam as "first person plural inclusive". The metalanguage
thus involves a person component (with possible values 1, 2, or
3), a number component (with possible values, for Hanunoo, of singular,
dual, or plural), and an inclusivity component (with possible values inclu-
sive or exclusive, and perhaps an unmarked value) which is defective in
that it can by definition apply only to non-singular first person pronouns.
Using such meaning components it should be possible to distinguish between
11 and 13 different pronominal forms (three different persons, with three
different numbers, and an inclusive/exclusive distinction on all non-singular
first-person forms). The paradigm has only eight pronouns, however. Worse,
the primitive terms in the descriptive metalanguage (the number and person
categories, plus the terms inclusive and exclusive) themselves total eight,
suggesting that there is little to recommend this particular metalanguage
over just using the raw Hanunoo terms themselves as primitive or un-
definable elements.
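The arithmetic behind that range can be checked by enumeration. In the sketch below, the function name `pronoun_slots` and the tuple encoding are hypothetical. The lower figure assumes a two-way inclusive/exclusive contrast on non-singular first-person forms, and the higher figure assumes that an unmarked value is also available there.

```python
from itertools import product

PERSONS = (1, 2, 3)
NUMBERS = ("singular", "dual", "plural")


def pronoun_slots(inclusivity_values):
    """List every (person, number, inclusivity) combination the metalanguage allows."""
    slots = []
    for person, number in product(PERSONS, NUMBERS):
        if person == 1 and number != "singular":
            # Inclusivity applies only to non-singular first-person forms.
            slots.extend((person, number, value) for value in inclusivity_values)
        else:
            slots.append((person, number, None))
    return slots


two_way = pronoun_slots(("inclusive", "exclusive"))
three_way = pronoun_slots(("inclusive", "exclusive", "unmarked"))
print(len(two_way), len(three_way))

# Primitive terms: three persons, three numbers, plus "inclusive" and "exclusive".
primitives = len(PERSONS) + len(NUMBERS) + 2
print(primitives)
```

The metalanguage thus offers 11 or 13 slots for a paradigm of eight forms, while itself requiring eight primitive terms.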
Conklin observed that a better analysis is possible, taking as metrics of
evaluation efficiency (so that exactly three binary distinctions should be
able to distinguish eight [= 2³] terms), and faithfulness to the native Hanunoo
logic. His proposed three binary features are Speaker, Hearer, and
Minimal, giving a table like Table 1, whose aesthetic symmetry inspires
hope that one is discovering rather than imposing the underlying system.
Table 1. Hanunoo pronouns
                                       Speaker  Hearer  Minimal
kuh   'I'                  1s             +       -        +
muh   'you'                2s             -       +        +
yah   's/he'               3s             -       -        +
tah   'we two'             1du            +       +        +
tam   'we all'             1pl INCL       +       +        -
yuh   'you all'            2pl            -       +        -
dah   'they'               3pl            -       -        -
mih   'we (but not you)'   1pl EXCL       +       -        -
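The efficiency claim can be made concrete by treating each pronoun as a triple of binary feature values. This is a sketch only: the value assignments below are inferred from the glosses and from the number of plus marks each row of Table 1 carries, so they should be read as an assumption to check against Conklin (1962), and the names `PRONOUNS` and `pronoun` are hypothetical.

```python
from itertools import product

FEATURES = ("speaker", "hearer", "minimal")

# (speaker, hearer, minimal) -> pronoun
PRONOUNS = {
    (True, False, True): "kuh",    # 'I'
    (False, True, True): "muh",    # 'you'
    (False, False, True): "yah",   # 's/he'
    (True, True, True): "tah",     # 'we two' (speaker and hearer only)
    (True, True, False): "tam",    # 'we all' (inclusive)
    (False, True, False): "yuh",   # 'you all'
    (False, False, False): "dah",  # 'they'
    (True, False, False): "mih",   # 'we (but not you)'
}


def pronoun(speaker: bool, hearer: bool, minimal: bool) -> str:
    """Look up the form picked out by one setting of the three features."""
    return PRONOUNS[(speaker, hearer, minimal)]


# Every one of the 2 ** 3 feature combinations is used, and none twice.
all_combinations = set(product((True, False), repeat=len(FEATURES)))
print(set(PRONOUNS) == all_combinations, len(set(PRONOUNS.values())))
print(pronoun(speaker=True, hearer=True, minimal=True))
```

Three features exhaust the eight forms with no empty or doubly filled cells, which is the symmetry the table displays.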
Another useful descriptive paradigm widely applied to (and in fact driven
by) lexicographic practice is the frame-semantics approach associated
with Charles Fillmore (see, for example, Fillmore and Atkins 1992). Individual
words, on this view, project wider, structured "frames": configurations
of elements and actions, some of which receive explicit grammatical
realization and some of which remain implicit in the frame. Families of
words then share frames. For example, the FrameNet description of the
"Commerce-buy" frame (which might be instantiated by such verbs as
"buy", "lease", or "rent") is:
These are words describing a basic commercial transaction involving a
buyer and a seller exchanging money and goods, taking the perspective of
the buyer. The words vary individually in the patterns of frame element re-
alization they allow. For example, the typical pattern for the verb BUY:
BUYER buys GOODS from SELLER for MONEY. Abby bought a car
from Robin for $5,000.
Clearly, frames themselves can be interrelated. Compare the description for
the Giving frame, which the Commerce frame above inherits:
A Donor transfers a Theme from a Donor to a Recipient.
This frame in-
cludes only actions that are initiated by the Donor (the one that starts out
owning the Theme). Sentences (even metaphorical ones) must meet the fol-
lowing entailments: the Donor first has possession of the Theme. Following
the transfer the Donor no longer has the Theme and the Recipient does.
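
One way to hold such frame descriptions in a lexical database is sketched below in Python. This is an illustration under stated assumptions, not Framenet's own data format: the class Frame, the function realize, and the pattern string are hypothetical names introduced here, and no mapping between the elements of the two frames is asserted. The sketch records the frame elements named above, the inheritance link between the two frames, and the typical realization pattern for buy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """A frame: a name, its frame elements, and the frame it inherits from."""
    name: str
    elements: Tuple[str, ...]
    parent: Optional["Frame"] = None

    def lineage(self) -> List[str]:
        """Names of this frame and of every frame it inherits from."""
        names, frame = [], self
        while frame is not None:
            names.append(frame.name)
            frame = frame.parent
        return names


GIVING = Frame("Giving", ("Donor", "Theme", "Recipient"))
COMMERCE_BUY = Frame(
    "Commerce-buy", ("Buyer", "Goods", "Seller", "Money"), parent=GIVING
)

# Typical pattern of frame element realization for the verb buy.
BUY_PATTERN = "{Buyer} buys {Goods} from {Seller} for {Money}"


def realize(frame: Frame, pattern: str, fillers: Dict[str, str]) -> str:
    """Fill a realization pattern, rejecting fillers the frame does not define."""
    unknown = set(fillers) - set(frame.elements)
    if unknown:
        raise ValueError(f"not elements of {frame.name}: {sorted(unknown)}")
    return pattern.format(**fillers)


sentence = realize(
    COMMERCE_BUY,
    BUY_PATTERN,
    {"Buyer": "Abby", "Goods": "a car", "Seller": "Robin", "Money": "$5,000"},
)
```

Storing the pattern separately from the frame reflects the point that words sharing a frame differ individually in the realization patterns they allow.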
148 John B. Haviland
In some ways related as a metasemantic device is the approach, most
explicitly developed in Levin (1993), that uses various syntactic
diagnostics (such as patterns of diathesis) to partition lexical sets into
families or
classes. Testing various diagnostic syntactic behaviors against their occur-
rence with specific verbs partitions the verbs into classes which can, ac-
cording to this logic, be expected to display commonalities of meaning. For
example, Levin proposes the following constructions as relevant tests to
discover semantic classes among transitive verbs.
(9) Diathesis diagnostics
MIDDLE: The bread cuts easily.
CONATIVE: Carla hit at the door.
BODY-PART POSSESSOR ASCENSION: Terry touched Bill on the shoulder.
Applied to specific verbs (each of which may have a variety of hyponyms,
thus forming meaning families), these tests reveal different syntactic classes
corresponding to putative meaning families. The meaning families can, in
turn, be used to group individual lexical items, and the groupings are thus
justified not simply on notional but also on syntactic grounds.
(10) Diathesis diagnostics applied to different verbs (from Levin 1993: 6)

              touch    hit    cut    break
     MIDDLE   No       No     Yes    Yes
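
The partitioning logic can be sketched in a few lines of Python. The MIDDLE values below are those printed in (10); the CONATIVE and BODY-PART POSSESSOR ASCENSION values are not in the table as printed here and are filled in from Levin's published discussion as an assumption that should be checked against the original. The function name classes_by_profile is likewise only illustrative. Verbs that give the same answers on every diagnostic fall into the same class.

```python
from collections import defaultdict

DIAGNOSTICS = ("MIDDLE", "CONATIVE", "BODY-PART POSSESSOR ASCENSION")

# Does the verb occur in the construction? MIDDLE values are from (10);
# the other two rows are an assumption recalled from Levin (1993).
ACCEPTS = {
    "touch": {"MIDDLE": False, "CONATIVE": False,
              "BODY-PART POSSESSOR ASCENSION": True},
    "hit":   {"MIDDLE": False, "CONATIVE": True,
              "BODY-PART POSSESSOR ASCENSION": True},
    "cut":   {"MIDDLE": True, "CONATIVE": True,
              "BODY-PART POSSESSOR ASCENSION": True},
    "break": {"MIDDLE": True, "CONATIVE": False,
              "BODY-PART POSSESSOR ASCENSION": False},
}


def classes_by_profile(table):
    """Group verbs that behave identically on every diagnostic."""
    groups = defaultdict(list)
    for verb, results in table.items():
        profile = tuple(results[d] for d in DIAGNOSTICS)
        groups[profile].append(verb)
    return dict(groups)


classes = classes_by_profile(ACCEPTS)
```

With these values each of the four verbs has its own profile, so each would head a separate class to which further verbs with the same behavior could be added.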
4. Systematic extraction of lexical databases
After one has documented the basic structures of a grammar, and collected
an ample corpus of texts, how does one supplement elicited examples and
textually situated tokens of use to achieve a systematic compilation of
lexical knowledge? Interlinear glossing of a large corpus can be used
mechanically to generate a structured word list, whose analytical
perspicacity is in direct proportion to the compiler's care and consistency
in morphological and semantic tagging during the glossing procedure.
Various computational tools aid lexical extraction from text corpora: not
only dedicated linguistic database tools like SIL's Shoebox/Toolbox, but
also both general and specialized concordance tools (written, for example,
as Unix shell scripts, or with programming languages like Perl or Icon).
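
As a rough illustration of such mechanical extraction, the following Python sketch builds a word list from glossed records. It is not a description of how Shoebox/Toolbox works: the record layout, the record labels, and the function names are assumptions made for the example, and the toy records simply reuse the literal glosses given for the Guugu Yimithirr expressions in (11) later in this chapter.

```python
from collections import defaultdict

# Toy glossed records: (record label, word forms, word-by-word glosses).
# The labels are hypothetical; forms and glosses come from example (11).
CORPUS = [
    ("11b", ["miil", "warnggu"], ["eye", "sleep"]),
    ("11c", ["miil", "nhin-gal"], ["eye", "sit"]),
    ("11i", ["miil", "biinii"], ["eye", "die"]),
]


def build_wordlist(corpus):
    """Collect, for every form, the glosses it received and where it occurs."""
    wordlist = defaultdict(lambda: {"glosses": set(), "records": []})
    for label, forms, glosses in corpus:
        if len(forms) != len(glosses):
            raise ValueError(f"misaligned glossing in record {label}")
        for form, gloss in zip(forms, glosses):
            wordlist[form]["glosses"].add(gloss)
            wordlist[form]["records"].append(label)
    return dict(wordlist)


def needs_review(wordlist):
    """Forms glossed in more than one way: polysemy, or inconsistent tagging."""
    return sorted(f for f, entry in wordlist.items() if len(entry["glosses"]) > 1)


wordlist = build_wordlist(CORPUS)
```

The needs_review step is where the compiler's consistency matters: a form flagged there may be genuinely polysemous, or may merely have been glossed differently on different days.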
Other computer techniques can also aid in eliciting lexemes in a
language, taking advantage of regular phonological patterns. A well-known
example is Terry Kaufman's method for generating an exhaustive list of
potential "roots" in Mayan languages, based on the observation that the
root canon in Mayan is CVC or some simple variant thereof. Table 2 shows
a short Icon program that begins with all the consonants and vowels in the
Mayan language Tseltal and produces a complete list of all permutations of
the form CV(:)(j)C. The program produces 8820 potential roots (21 initial
consonants × 10 vowel symbols × 2 medial options × 21 final consonants).
(The first of those beginning with b are shown in Table 3.) Each of these
can be exhaustively (and exhaustingly) tested with native speakers to see
which forms actually produce recognizable lexical items; many speakers of
Mayan languages and others with similarly straightforward phonotactics
have, over the years, been subjected to such a mind-numbing task.
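Table 2. Tseltal "root salad," in the Icon programming language
procedure main()
   # Capitals stand for glottalized or ejective consonants and long vowels.
   C := "`bcCjkKlmnpPrstTwxyzZ"
   V := "aAeEiIoOuU"
   M := "0j"                      # "0" marks the absence of medial j
   every (c1 := !C) do {
      every (v1 := !V) do {
         every (m1 := !M) do {
            every (c2 := !C) do {
               root := c1 || v1 || m1 || c2
               write(root)        # assumed: the printed listing breaks off here
            }
         }
      }
   }
end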
Table 3. The first possible Tseltal roots beginning with b
ba' bab bach bach' baj bak bak' bal bam ban bap bap' bar bas bat
bat' baw bax bay bats bats' baj bajb bajch bajch' bajj bajk bajk' bajl
bajm bajn bajp bajp' bajr bajs bajt bajt' bajw bajx bajy bajts bajts'
baa baab baach etc.
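
For readers without an Icon interpreter, the same enumeration can be sketched in Python. The spelling conversions below are not given in Table 2; they are an assumption inferred from the opening forms of Table 3 and from the note on the program's symbols (capitals for glottalized consonants and long vowels, 0 for the absence of medial j), and the names spell and potential_roots are illustrative only.

```python
from itertools import product

CONSONANTS = "`bcCjkKlmnpPrstTwxyzZ"
VOWELS = "aAeEiIoOuU"
MEDIALS = "0j"  # "0" marks the absence of medial j

# Assumed practical spellings, inferred from the first forms in Table 3.
ORTHOGRAPHY = {
    "`": "'", "c": "ch", "C": "ch'", "K": "k'", "P": "p'",
    "T": "t'", "z": "ts", "Z": "ts'", "0": "",
}


def spell(symbol: str) -> str:
    """Convert one program symbol to its assumed practical spelling."""
    if symbol in ORTHOGRAPHY:
        return ORTHOGRAPHY[symbol]
    if symbol in "AEIOU":  # long vowels are written double
        return symbol.lower() * 2
    return symbol


def potential_roots():
    """Yield every C V (j) C combination, in the order of the nested loops."""
    for c1, v, m, c2 in product(CONSONANTS, VOWELS, MEDIALS, CONSONANTS):
        yield "".join(spell(s) for s in (c1, v, m, c2))


roots = list(potential_roots())
b_roots = [r for r in roots if r.startswith("b")]
```

itertools.product varies its last argument fastest, so the forms come out in the same order as the four nested every loops of Table 2, and the first forms for b match the start of Table 3.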
Mechanically generated wordlists will inevitably reveal areas requiring
further lexicographic work (phrasal lexical units, syntagmatically defined
paradigms, functional vs. lexical elements, or particles, for example),
and they ordinarily also expose to view especially elaborated lexical
domains worthy of deeper exploration. Such domains may, on the other hand,
emerge not from obvious gaps or hypertrophy in lexical sets revealed in
text collections or elicited wordlists, but from clues in the communicative
practices of a speech community itself: aesthetic judgments about
"beautiful" or "eloquent" (if not "ugly" or "awkward") speech, for example,
especially marked and evaluated kinds of talk, or specialized speech genres
or performances, on the one hand; and, on the other, cultural
preoccupations with associated lexical expression: elaborated vocabularies
for professions, activities, or other kinds of interests, or insistence on
getting the "right word" or on "proper" and "accurate" expression.
Most methods for lexical elicitation are, for better or for worse,
extensional and referential; that is, they are based on presenting exemplars
of things or situations in the world to native speakers and asking for appro-
priate linguistic expressions which can be used to refer to or to characterize
them. Such a method is perhaps inescapable for first-level lexical documen-
tation, but it leaves largely unanswered difficult questions about the inten-
sions of words: what they actually mean, what meaning distinctions they
encode, what sorts of meaning relationships they enter into with other words
and expressions, rather than simply what states of affairs they can be used
truthfully to refer to. Such elicitation techniques are also often helpless to
capture such non-referential aspects of meaning as politeness registers,
specialized uses and contexts, and the like. Such issues can and perhaps
must be ignored for the first stages of building lexical databases in lan-
guage documentation, but they cannot be ignored forever.
Here is a single example from my own fieldwork on Guugu Yimithirr.
I quickly learned that the everyday Guugu Yimithirr word nambal meant
'stone' but was also extended to mean 'money'. My primary teacher (and
social father) in the community, who sometimes had occasion to borrow
money from me, often instead used (or whispered) another word to me when
he wanted to refer to money: wambugan. However, wambugan is really a
polite equivalent for the ordinary word nambal in the respectful vocabulary,
obligatory in speech with avoided affines and referred to in the published
literature as "Brother-in-law language" (Haviland 1979). Its denotative
range is in fact somewhat broader than that of nambal: it includes stones
(including specially named grinding stones, quartz, etc., which are not
normally called nambal) AND money. Crucially it is an over-polite word,
no longer used in modern Hopevale with avoided affines nor, indeed, widely
known beyond a few old men, and with them still carrying a euphemistic
tone of respect. Both factors combine to make wambugan a perfect code
word for an embarrassing task like asking one's courtesy son and pupil for
a loan.
Ignoring such difficulties for the moment, let us consider techniques for
supplementing the lexical information haphazardly collected through
mechanical reversal of text corpora. The trick, obviously, is systematic but
controlled elicitation, by presenting or simulating aspects of external
reality so as to stimulate native speakers into using words and expressions
to represent as yet unencountered states of affairs. Somewhat artificially
I have divided sample methods according to what aspects of reality they
purport to simulate: natural facts, socio-cultural institutions, and (in
the final sections) pragmatic facts of (inter)action and ideological
constructions on language and society.
4.1. Nature
The tradition in anthropological linguistics, variously labeled "ethnographic
semantics" or "ethnoscience," purports to display culturally specific
knowledge about the natural world by detailing the semantics of lexical
domains related to the corresponding natural phenomena: Hanunoo medicinal
plants, Tseltal categories of firewood, ethnobotany or ethnozoology; parts
of houses or bodies, taxonomies of disease, local technology, and so on. A
classic example of the genre is Berlin's (1968) detailed study of Tseltal
numeral classifiers, a compendium of the several hundred classifiers once
obligatory in Tenejapa Tseltal numeral expressions. Numeral classifiers
specify countable units of different kinds of substance, often on the basis
of shape. The notable feature of Berlin's study, for our purposes, is his
use of carefully elaborated photographs both as stimuli (i.e. to elicit
Tseltal numeral expressions from speakers) and as a vehicle for
metasemantic representation: the photos accompany and illustrate his verbal
characterization of the Tseltal forms so elicited. (Berlin also used
Kaufman's mechanical procedure to generate potential numeral classifier
roots, as described earlier.) To give an idea of both the semantic
specificity of the Tseltal forms and the nature of the photographic stimuli,
here are two sample pictures from Berlin's study. (Note that in Figure 5,
illustrating the classifier hiht, the
Following Talmy's typological deconstruction of motion verbs (Talmy
1985), and using a variety of elicitation kits involving photographs,
drawings, videos, and cartoons, field researchers have explored in detail
linguistic systems of spatial adpositions, directionals, motion verbs and
other auxiliaries, and what have been called spatial "frames of reference"
(Levinson 2003).
For a slightly different sort of example, just as Tzotzil speakers use a
highly elaborated set of semantically specific positional roots, it is clear in
practice that certain families of verbs grouped by rough notional meaning
categories (Dixon 1991) incorporate distinctions, often unfamiliar to speak-
ers of other languages, that require careful lexicographic delimitation.
Zgusta (1971: 89 ff.) provides a rich discussion of such families of verbs,
what he calls chains of "near synonyms," citing as an example multiple
Chinese words for 'carry'. There are many monolexemic Tzotzil transitive
verbs which might most naturally be translated into English as 'carry',
although it is not clear that anything justifies grouping the words together
other than this fact about English translations. Thus, for example,
kuch   to carry (a largish burden) on the back, usually with the aid of a
pet    to carry or hold in the arms, in front of the body (e.g. a baby)
lik    to carry by holding a handle from which the burden dangles (e.g.
       a pail)
kach   to carry by gripping between two surfaces, normally in the jaws
       (e.g. a dog with a bone)
jop    to carry cupped in the hands or some other concave surface (e.g.
       an apron)
tom    to hold or carry in the hand, usually a longish thing gripped in
       the hand but extending above or beyond it (e.g. a torch, a rifle)
mich   to carry squeezed, usually between the fingers or fist
There is, incidentally, as far as I know no more general Tzotzil 'carry'
verb that could be used to cover all of these cases.
Another such Tzotzil verb family is that of 'insert' (Haviland 1994)
where, as in the case of 'carry' verbs, the distinguishing criteria involve
the shapes of inserted object and container, the types of contact or
containment involved, the tightness of fit, the orientations of container
and inserted object, etc. Both to elicit and to illustrate such distinctions
I have made small films of different kinds of inserting actions, performed
with familiar objects, which speakers can view and discuss: what is the best
way to describe what they see? are there other ways to describe it? and so
on.
It is hard to know in advance what areas of vocabulary will enjoy lexical
hypertrophy in an undocumented language. The advantage of the elicitation
tools developed by the Max Planck Institute for Psycholinguistics
(Nijmegen) and elsewhere is that they can be used to invite speakers to
exploit their full repertoire of expressive resources by describing
standardized stimuli. Children's cartoons such as the Maus series from
German television are both entertaining and useful for investigating
domains of motion, for example. Of course the sense in which speakers of
different languages, with different sorts of cultural backgrounds and life
experiences, will see these stimuli as "the same" is problematic and, in
fact, a central issue to be
investigated in linguistic fieldwork.
4.2. Socio-cultural reality
Of obvious interest for language documentation are lexical domains that
encapsulate central aspects of society. Linguistic anthropology again pro-
vides the classic example: kinship terminologies, once a central part of
comparative ethnography, are for speakers of many endangered languages
an area of intense personal and conceptual concern (see also Chapter 8). In
societies where the central social categories are defined by family relation-
ships, whether genealogically or otherwise construed, the terminology de-
noting such categories is essential to any characterization of social life. The
asymmetry in Tzotzil sibling terminology, for example, seems suggestive
about family relationships. For a male Ego, Tzotzil distinguishes older and
younger brothers (bankil, itzin) from older and younger sisters (vix, ixlel).
For a female Ego, however, the gender distinction is neutralized between
younger brothers and sisters. Thus, a female speaker distinguishes older
brother and older sister (xibnel, vix) and lumps together younger siblings of
both genders (muk). Furthermore, note that the distinction between gender
of Ego is neutralized precisely in the case of the term for older sister, vix
for both men and women speakers (see Figure 6). These asymmetries sug-
gest that the relationship between an older sister and her younger siblings
of either gender is specially marked terminologically and conceptually. A
plausible explanation is the expectation in many Tzotzil speaking commu-
nities that an older sister has special mother-like responsibilities for the care
of her muk or younger siblings, regardless of their gender. This special care
is terminologically matched by a reciprocal terminological projection for
younger siblings that their vix or older sister is a kind of substitute mother.
Figure 6. Tzotzil sibling terms
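
The asymmetries described above become easy to see when the paradigm is written out as a lookup table. The following Python sketch is only an illustration: the three-way key (gender of Ego, relative age, gender of sibling) and the function names are assumptions made here, and the terms are spelled as they are printed in the text.

```python
# (gender of Ego, relative age of sibling, gender of sibling) -> Tzotzil term
SIBLING_TERMS = {
    ("male", "older", "male"): "bankil",
    ("male", "younger", "male"): "itzin",
    ("male", "older", "female"): "vix",
    ("male", "younger", "female"): "ixlel",
    ("female", "older", "male"): "xibnel",
    ("female", "older", "female"): "vix",
    ("female", "younger", "male"): "muk",
    ("female", "younger", "female"): "muk",
}


def sibling_term(ego: str, age: str, sibling: str) -> str:
    """Look up the term Ego uses for a sibling of the given age and gender."""
    return SIBLING_TERMS[(ego, age, sibling)]


def neutralizations(table):
    """Return the terms that cover more than one cell of the paradigm."""
    cells = {}
    for cell, term in table.items():
        cells.setdefault(term, []).append(cell)
    return {term: covered for term, covered in cells.items() if len(covered) > 1}


shared = neutralizations(SIBLING_TERMS)
```

Only vix and muk cover more than one cell: vix neutralizes the gender of Ego, and muk neutralizes the gender of a woman's younger siblings, which are exactly the two terms that define the older sister relationship from either side.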
As the classic debates show, however, kinship algebras and diagrams
conceal a central problem in documenting lexical knowledge, one already
mentioned above: the tension between so-called "etic" metalanguages and
"emic" categories. In any given language, one can justifiably question
whether putatively universal descriptive terms for characterizing a particu-
lar kin relationship (in terms, say, of gender, generation, and kin-line, or
with allegedly primitive relational terms like F[ather], M[other], H[usband],
W[ife], or with algebraic symbols like +, , , ) do justice either to the
meaning of a particular natural language term or to a specific relationship
between two individuals. Indeed, in societies which display a clear obses-
sion with kinship and kinship terminologies (for example, in the Australian
Aboriginal communities where I have worked), a central area of dispute
and conceptual wrangling is often exactly how to give the proper lexical
label to a relationship, or how to explain what a particular unambiguously
named relationship entails. My main Guugu Yimithirr teacher, for example,
would often point out a kinsman walking past and say, "You should call
that man X; because his father was your W; but then again, he turned
around and married your Y, so what does that make him? your Z?" A
genealogical relationship between two individuals does not uniquely
determine what the relevant kin term might be, since that, in turn, may
respond to considerably more complex factors about what aspects of the
relationship are most important. In modern Zinacantán, in some cases a
ritual relationship of compadrazgo or fictive-mutual-parenthood (between
the parents and the godparents of a newly baptized child, for example) may
actually take precedence over an immediate genealogical relationship:
brothers may become compadres and cease to refer to each other with sibling
terms.
For purposes of systematic documentation, this domain again illustrates
the tension between a corpus of examples and systematic eliciting. No
single network of actual social/genealogical relationships and the corre-
sponding terminological distinctions can hope to capture the systematicity
of the overall terminological-conceptual complex. At the same time, no
extensional metalanguage (such as the genealogical primitives of kinship
algebra) will be sufficient to guarantee that all socially significant variables
emerge from mechanical elicitation. An adequate lexical database must
combine both kinds of information.
4.3. Pragmatic reality
Methods for enriching a lexical database to include the use of indexical
linguistic units inextricably bound to context are somewhat harder to find
in recent literature. All linguistic behavior is, of course, tied to context and
linked with action, but some of the most intractable lexical items frequently
have inherent links to their indexical surrounds, pronouns and other
deictics being the most obvious examples, since even their referents (whom
they pick out) must be computed by reference to the contexts of their use.
Studies of such lexical domains suggest that the only practical approach to
the description of such parts of the lexicon is a kind of exhaustive observa-
tional fieldwork. Thus, Hanks (1990) gives detailed analysis of the system
of demonstratives in Yucatec Maya based on extensive fieldwork in which
he recorded, in detail, situated occurrences of spontaneous deictic usage,
inducing from the corpus and from the linguistic forms the theoretical
components of an adequate account of deictic practice.
Another exemplary domain is that of exclamations and interjections.
Kockelman's extended treatment of interjections in Q'eqchi' (Kockelman
2003) involved a field methodology much like that of Hanks. He
systematically recorded the circumstances when utterances categorized as
interjections occurred in a Q'eqchi'-speaking community in Guatemala. On
the basis of such a corpus, he elaborated a theory of interjections which
goes well beyond the received model of their "expressive" nature (part of
an ancient tradition in Western linguistic thought, dating back to the Latin
grammarians), to consider the multiple and bi-directional indexical
properties of these expressions: exhibiting emotional and affective stances,
explicitly inviting reciprocal exhibits from interlocutors, drawing
interlocutors' attention to circumstances, requesting actions, and so on.
Such studies suggest that there are few shortcuts to an adequate account of
what such pragmatically charged linguistic elements mean, and that extensive
ethnographic fieldwork is thus an essential part of field lexicography.
The same can be said of more prosaic vocabulary, from ordinary body
part terms to specially marked polite and impolite registers, such as joking
and cursing speech. I have already mentioned the residual lexical complexi-
ties produced by changed use of Guugu Yimithirr respectful or brother-in-
law vocabulary, and such complexities are only multiplied when several
more or less well regimented speech registers are in active use in a speech
community. Classic anthropological descriptions of such phenomena attest
to the subtlety and nuance communicated by strategic choice between al-
ternate lexical forms in societies from Aboriginal Australia and Samoa to
Bali (Duranti 1992; Errington 1985; Geertz 1960), or between address
terms and personal pronouns from Europe to Japan (Brown and Gilman
1960). Laughlin (1975) proposes a series of labels to distinguish in
Zinacantec Tzotzil such things as ritual speech, joking speech, male and
female speech, baby talk, polite speech, scolding, denunciatory speech,
archaic [words], etc. Whether or not a field lexicographer can give a com-
plete account of such facts for an entire lexical database, it is important to
be aware of the sorts of metalinguistic speech categories that might be rele-
vant in a given speech community.
For self-evident reasons, systematic investigation of such genres (for
example, tabooed speech) may be hard for inexperienced fieldworkers.
Similarly difficult are whole systems of linguistic tropes which sometimes
dominate parts of a language's expressive resources. Again, the only remedy
seems to be wide-ranging and systematic ethnographic attention. Here are
two examples from my own fieldwork. As I learned Guugu Yimithirr, I
noticed that many expressions dealing with human propensities and inner
states were transparently metaphors, based on a small set of words which
seemed simply to name parts of the body. Whether or not, as anthropologists
have sometimes suggested, these expressions represent an implicit
theory of the anatomical distribution of emotions and mental faculties (as
we might argue, for example, with English expressions like "hard-headed"
or "hard-hearted"), or instead are simply opaque culturally conventionalized
idioms (as we might argue for "green thumb" or "lily-livered"), it was clear
that Guugu Yimithirr had a semi-productive system for generating diverse
expressions based on body-part tropes. (11) gives an example based on
the Guugu Yimithirr word miil 'eye'. The only way I could document the
system was to keep my ears open (as it were) for relevant expressions in
conversation, and to try systematically to force new combinations of
body-part words with adjectives and verbs, usually yielding only guffaws
instead of new lexemes.
(11) Guugu Yimithirr expressions based on miil 'eye'
miilgu          = (lit., eye + EMPHATIC suffix) 'awake'
miil warnggu    = (lit., eye sleep) 'sleepy'
miil nhin-gal   = (lit., eye sit) 'watch out, keep an eye out'
miil biyal      = (lit., eye sinew) 'staring all the time'
miil ngamba     = (lit., eye careless) 'unobservant, shutting one's eyes
                  to something'
miil waarril    = (lit., eye fly) 'feel faint, go crazy, faint, get drunk'
miil bagal      = (lit., eye poke) 'deceive, trick, become jealous'
miil bathibay   = (lit., eye bone) 'sharp-eyed, always staring'
miil biinii     = (lit., eye die) 'go blind'
miil gulnggul   = (lit., eye heavy) 'sleepy'
miilgu nhin-gal = (lit., eye-EMPHATIC sit) 'stay awake'
Consider, too, the language of Tzotzil ritual (Gossen 1974, 1985; Haviland
1987, 1996, 2000). In contexts from prayer and song to formal denuncia-
tion, Tzotzil speakers abandon ordinary lexicon and grammar in favor of a
highly structured speech style that involves parallel lines which differ in
only a single word or phrase. These parallel lines are interpreted in terms of
a standard "stereoscopic" image (Fox 1977) invoked by the paired
expressions. Thus, to refer to the body one can use different doublets,
depending on the context. One is highly literal, using pat, xokon 'back,
side' as a metonym for the whole. Another is considerably more opaque, and
suggests an image of humility, as in the following extract from a curing
prayer, where the doublet lumal, achelal 'earth, mud' (both in possessed
form) refers to the patient's body or self.
(12) From a Zinacantec curing prayer
     ja me ta jmala lalumale
     'I am waiting for your earth.'
     ta jmala lavachelale
     'I am waiting for your mud.'
A further example is the doublet in Zinacantec ritual speech to refer to
liquor: xi`obil, skexobil, literally 'cause for fear, cause for shame'. Such
expressions share properties with euphemism, always a problematic
phenomenon for lexicography that requires careful ethnographic fieldwork.
Systematic elicitation reveals little about the overall system of imagery in
ritual language, although it is an essential part of the language's
expressive power. Laughlin's (1975) dictionary of modern Zinacantec Tzotzil
annotates and illustrates words that participate in parallel constructions
under the rubric "ritual speech." In my own work, I have relied on
exhaustive recording and transcription of prayer and other genres that
employ parallelism to expand on the list of doublets.
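
Once such genres have been transcribed, candidate doublets can be proposed mechanically for later checking with speakers. The Python sketch below is a hypothetical aid, not the procedure used in the work just described: it assumes that paired lines are adjacent in the transcript and that, once any extra opening words of the longer line are set aside, they differ in exactly one word. It returns the inflected forms as transcribed, not the underlying stems.

```python
def doublet_candidate(line_a: str, line_b: str):
    """Return the single differing word pair of two parallel lines, or None.

    Lines are aligned from the end, so extra opening words in the longer
    line are ignored (an assumption of this sketch).
    """
    a, b = line_a.split(), line_b.split()
    n = min(len(a), len(b))
    if n == 0:
        return None
    differing = [(x, y) for x, y in zip(a[-n:], b[-n:]) if x != y]
    return differing[0] if len(differing) == 1 else None


def scan(lines):
    """Collect doublet candidates from every pair of adjacent lines."""
    found = []
    for first, second in zip(lines, lines[1:]):
        pair = doublet_candidate(first, second)
        if pair is not None:
            found.append(pair)
    return found


# The two lines of the curing prayer in (12).
PRAYER = ["ja me ta jmala lalumale", "ta jmala lavachelale"]
candidates = scan(PRAYER)
```

Applied to (12), the sketch proposes the pair lalumale, lavachelale; recognizing these as possessed forms of lumal and achelal, and deciding what the image means, remains ethnographic work.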
5. Conclusion
When does documentation of the lexicon end? While the lexicon is a re-
pository for the exceptional and the chaotic in language, it is also a site of
considerable regularity and productivity. Nonetheless, field lexicographers
like Laughlin express doubts about how well structured or widely shared
lexical knowledge is across a speech community; Laughlin bases his
skepticism on elicitation with both Zinacantec peasants and Washington D.C.
university students. Notoriously difficult even for well-studied languages
is distinguishing between "literal" and figurative or tropic uses of words:
older Tzotzil speakers describe airplanes as xulem kok, literally (as we
say) 'buzzard fire', or telephones as chojon takin 'wire of metal', enduring
the giggles of younger speakers (who simply use a Spanish loan instead).
Even more difficult is distinguishing obscure polysemy from simple (but
formally unpalatable) homonymy. Laughlin's Tzotzil dictionary posits two
homonymous roots, jav(2), a positional root meaning 'belly (or face) up',
and jav(1), a transitive verb root meaning 'to chop in half', because the
two meanings seem divergent enough to warrant separate entries. However,
Zinacantec folk etymology conjures a succinct image that connects the
senses: when you split, say, a log in two (using a verb based on jav(1)), the
two halves fall 'belly up' (jav(2)). This is thus a case of covert polysemy,
or perhaps of underlying monosemy of a single root with different
grammatical costumes. Such phenomena may remain intractable throughout a
lexical documentation project.
Similarly, how much ought the lexicographer to include of what might
be labeled erroneous usage: malapropisms, puns, or nonce creations?
Zgusta (1971: 56–57) distinguishes "systemic" from "occasional" uses of
words. An author may use 'bondage' occasionally to mean 'marriage', without
thereby changing the systemic meaning of either term. Zinacantec men,
during several weeks of ribald gossip sessions in 1970, coined what was at
the time a highly creative Tzotzil sexual euphemism using a loan inyeksyon
from Spanish inyección, at a time when hypodermic injections were still a
relatively novel foreign introduction. Some of these men still jokingly use
the term almost 40 years later. The word is not in Laughlin's Tzotzil
dictionary, but perhaps it should be.
Finally, questions already mentioned about aims and audience (for
whom is a lexical database produced? to what ends will it be put?)
complicate decisions about what words must be documented and how. The
problems are especially vexed when a lexical database may serve as the
basis for standardization or stabilization, especially in the form of a
published dictionary. When people can use a dictionary to look up a word, to
see how it is spelled, and to read a definition, the speech community's
authority over proper usage is irrevocably altered. How much belongs in
the lexical database of a language documentation project is thus never sim-
ply a matter of completeness or coverage but also involves ideological
decisions that may have far-reaching effects on the future of a language.
Building a lexical database is an expected part of any documentation
project, perhaps the final, most demanding analytical task of all. It can be
aided by mechanical techniques applied to textual corpora and by familiarity
with the great lexicographic traditions, which have already grappled with
most of the problems a fieldworker is likely to encounter: lexical units, the
nature of meaning, the vagaries of usage, and, finally, ideologies of lan-
guage and social life. The end product is essential, but producing it relies
on both drudgery and ethnographic inspiration, on systematic elicitation
and serendipitous discovery. One inevitably (re)discovers that enough is
never enough, and that calling a halt by declaring the database closed is
simply an arbitrary rest stop on a very long journey.
This chapter, loosely based on the lecture presented at the DoBeS summer
school in Frankfurt, September 2004, owes a considerable debt to
experiences in lexicography shared with my Tzotzil and Guugu Yimithirr
teachers, to comments by Nikolaus Himmelmann and Jost Gippert, and to the
hospitality of Elena, Renato, and Lisetta Collavin during its final drafting.
1. Especially with reference to dictionaries for literate European traditions, both
Landau (1984) and Svensén (1993) are useful surveys. See also the
multiple-volume handbook edited by Hausmann et al. (1990–1991).
2. Although languages like Nahuatl enjoy their own centuries-old dictionary
traditions (Canger 2002; Amith 2002).
3. A Tzotzil-Spanish version is currently (2005) in press, to be published by the
Centro de Investigaciones y Estudios Superiores en Antropología Social in
Mexico. As Tzotzil speakers increasingly cross the border into the United
States, the number of Tzotzil-English bilinguals will, of course, only grow.
4. See Haviland (1974). Nick Evans (2002) remarks on misunderstandings of
Aboriginal expressions, even in English, in hearings before the Australian
Land Tribunal shows how such misunderstandings can have serious legal
5. See the notion of rules of use in Silverstein (1976).
6. Jost Gippert reports that Georgian native speakers confirm that mqavs is
applied to anything mobile, such as cars, bicycles, airplanes, or the like.
7. In Berlin's works the older spelling Tzeltal is used.
8. The symbol A denotes a hypothetical vowel that alternates between a and o in
derived stems.
9. See and Section 3 below.
10. English is interestingly different in its elaborations, as can be seen by the en-
tries in the Framenet being_attached frame which include: affixed, anchored,
attached, bolted, bound, chained, fastened, fused, glued, handcuffed, lashed,
manacled, moored, nailed, pasted, pinned, plastered, riveted, sewn, shackled,
stapled, stuck, taped, tethered, tied, welded. In English the central variable
seems to be the kind of material creating the attachment.
11. There are proposals from linguistics itself about a Natural Semantic Metalan-
guage through which definitions of complex notions can be framed in terms
of simpler, allegedly universal (hence natural) semantic primes. See http://, where one can
find a bibliography of the many publications of Anna Wierzbicka.
12. Faced with Pearson's challenge, Reed College senior Chris Haulk promptly
came up with, "oh, you mean wrap a string around a cylinder; versus, wrap a
string around a cone" (Albyn Jones, personal communication, March 1, 2005),
proving that mathematicians can be lexicographers, too.
13. Note that Donor here is a single entity, defined in Framenet as The person
that begins in possession of the Theme and causes it to be in the possession of
the Recipient.
14. Visit
15. The program symbolizes glottalized or ejective consonants and long vowels as
capital letters, and a 0 is used to signal the absence of medial j.
16. See the descriptions of various stimulus kits developed by the Language and
Cognition Group at the Max Planck Institute for Psycholinguistics at http://
17. See Levinson et al. (2003) for an unashamedly extensional, comparative ap-
18. A short video used to elicit descriptions for Tzotzil inserting actions is
available on the book's website.
19. Samples of the sort of cartoon I have found useful for such tasks are available at in streaming video format.
20. The expression is not confined to English; both Italian pollice verde (according
to Elena Collavin) and German grüner Daumen (according to Nikolaus
Himmelmann) have exactly the same metaphorical and literal meanings as 'green
thumb', i.e., someone good at gardening. Similarly, Italian senza fegato
'without a liver' suggests a meaning similar to 'lily-livered'.
21. I ignore basic syntactic issues here: for example, in the expression miil waarril
the word miil 'eye' is the syntactic subject of waarril 'fly'. In miil bagal, 'eye'
is the syntactic object of bagal 'poke'.
22. In the Tzotzil of nearby Larráinzar, the equivalent ritual doublet is at once
humble and literal: achelal, takopal 'mud, body'.
23. See Zgusta's discussion of polysemy (1971: esp. 77 ff.); also Evans and
Wilkins (2000, 2001), Evans (1992).
24. See Jane Hill's discussion of the Hopi dictionary project in Chapter 5.
Chapter 7
Prosody in language documentation
Nikolaus P. Himmelmann
Prosodic aspects of a linguistic message such as intonation and lexical ac-
cent are essential elements of its formal make-up. To date, the basics of
analyzing prosodic features have not yet become an integral part of linguis-
tic fieldwork training, and, accordingly, a reasonably detailed and compre-
hensive documentation and description of prosodic features is not yet part
of standard linguistic fieldwork practices. This chapter is specifically con-
cerned with the documentation of prosodic features, i.e. with the question
of what kind of data a language documentation has to contain so that a
thorough analysis of prosodic features is possible. In order to be able to
productively apply the suggestions discussed in this chapter, a basic under-
standing of the core units and procedures of prosodic analysis is necessary.
For a more comprehensive introduction to basic prosodic fieldwork focus-
ing on issues of analysis and description, see Himmelmann and Ladd
Given that a language documentation includes a large corpus of record-
ings of communicative events of different types, it may well be questioned
whether there is any need to pay special attention to prosody when compiling
it. Provided that the recordings are of a reasonable quality, there can be
no doubt that such a corpus can be used for prosodic analyses even when no
particular attention was paid to prosodic features at the time of compiling
the corpus. However, there are essentially three reasons why some special
attention for prosodic features is necessary when compiling a corpus of
primary data so that it becomes really useful for prosodic purposes:
1. Prosodic phenomena are highly variable and susceptible to contextual
influences. This makes it difficult to recognize basic distinctive patterns.
Prosodic pattern recognition is much facilitated by having the same
utterance produced by a number of different speakers (or at least by having
multiple versions of the same utterance). See further Section 2.
2. Words produced in isolation are minimal utterances showing both lexi-
cal and utterance-level (post-lexical) features. Hence, the widespread
practice of recording words in isolation when recording a wordlist is of
limited use for prosodic purposes. See further Section 3.
3. Acoustic and auditory data (i.e. recordings of spontaneous and elicited
utterances) do not provide direct evidence with regard to the perception
of native speakers, i.e. what native speakers actually perceive as rele-
vant prosodic contrasts (conversational material may provide indirect
evidence, though; see further below). The most straightforward way to
obtain perception data is to run perception experiments, as further dis-
cussed in Section 5.
Before these points are further elaborated, Section 1 provides a bit more
detail on what exactly the term prosody is intended to refer to here. Further-
more, when discussing points (1) and (2), it will be repeatedly suggested
that elicitation may provide useful materials to complement the data found
in recordings of spontaneous speech. However, eliciting prosodic data is
not an easy task, as discussed in Section 4.
1. Prosodic phenomena
Table 1 lists the major prosodic phenomena according to the different do-
mains in which they are manifest, i.e. the recordable sound wave (acoustic),
the perceptual impression (auditory), and as a component of the language
system (phonological category). The rightmost column lists the most
widely attested functions which may be conveyed by prosodic features (but
of course can also be conveyed by non-prosodic means).
Table 1. Prosodic phenomena according to domain (partially recoverable)
Columns: Acoustic | Auditory | Phonological category | Function
Auditory: voice quality (creaky, etc.)
Phonological category: (lexical) accent; levels in prosodic hierarchy (syllable, foot, ...)
Function: delimiting units; lexical units; speaker attitude; sentence modality; interactional tasks
In discussing prosody, it is important to keep the different domains distinct
and to be aware of the fact that there is no unambiguous mapping relation
between features in different domains. To take just pitch as an example,
regular correspondences exist between changes of fundamental frequency
(F0) observed in the acoustic signal, changes in pitch perceived by the hu-
man ear, and tonal or intonational distinctions. But these correspondences
do not consist of simple and direct mapping relations between the domains.
Thus, there are changes in fundamental frequency which are generally not
perceived as such by the human ear. These are known as microprosodic
perturbations and include phenomena such as the lowering of F0 regularly
induced by voiced consonants.
Furthermore, while it is true that tonal and
intonational categories are primarily marked by changes in pitch, other
auditory parameters such as length, loudness, and voice quality often also
play a role in the marking of these categories.
In the present chapter, the above distinctions and the corresponding ter-
minology will be observed rather strictly. Many of the terms are widely used
in the literature in the sense they are used here, but it may be worth pointing
out that the strict distinction also applies to the terms (lexical) accent and
stress, which are used in many different and often somewhat confusing ways
in the literature. Both terms refer to the phenomenon that a given syllable is
in some sense more prominent than neighboring ones, but lexical accent
here designates this property with reference to the phonological structure of
lexical items (i.e. as a phonological category), while stress refers to an
auditory impression (which may or may not have clear acoustic or phono-
logical correlates). In this usage, then, lexical accents can be realized in
different ways, including stress or a fixed change in pitch (so-called melodic
or pitch accent as found, for example, in Japanese; cf. Beckman [1986]
and Gussenhoven [2004] for further discussion).
There is no space and need here to discuss in detail all the prosodic phe-
nomena and functions listed in Table 1. The main purpose of this table is to
give an extensional definition of the range of phenomena referred to with
the term prosody in this chapter. A detailed introduction to the phonetics
(both acoustic and auditory) of prosodic features can be found in Laver
(1994: 431–546; see also Ladefoged 2003: 75–103). The major phonological
categories are discussed in Ladd (1996), Cruttenden (1997), Hirst
and di Cristo (1998), Hyman (2001), Yip (2002), Gussenhoven (2004), and
Jun (2005), among others. These works also provide useful information
regarding the crosslinguistic variability of prosodic features.
The discussion in this chapter in principle applies to all the prosodic
features listed in Table 1. However, intonation and accent will usually be
mentioned as the main examples and often be singled out for extra com-
ment because these are the two categories that have been most widely ne-
glected in linguistic fieldwork, as opposed to tone, for example, which is a
standard topic in linguistic fieldwork.
2. The need to work with several speakers
Linguistic fieldwork often involves the close cooperation with just one or
possibly two native speakers who are the main contributors or informants
in the sense that a) they provide most of the elicited information on the lan-
guage (texts are often recorded with a broader range of speakers); and b) all
data provided by other speakers is processed and checked with them. This
procedure is based on the fact that with regard to core grammatical features
the information provided by different speakers tends not to differ (or to
differ only minimally). Thus, for example, if one speaker states that the
definite article has to precede the noun and cannot be postposed, this will
in all likelihood be confirmed by all other speakers in the community.
While this set-up works reasonably well for the most basic structural
features of a language, it becomes more and more problematic when more
variable and complex linguistic features are being investigated. The phonet-
ics of prosodic features are highly variable and depend on a complex set of
factors, including speaker variables and context. There are very few, if any,
absolute values. What is high with regard to pitch for one speaker, may be
low for another; what is loud in one context, is just normal in another; and
so on. Furthermore, the perception of prosodic features tends to be heavily
influenced by the investigator's own native prosodic system, which further
distorts the data and complicates the analysis. In the early stages of an in-
vestigation of the prosody of a language, it thus tends to be extremely diffi-
cult to recognize a basic pattern in the recorded data. This problem is par-
ticularly pressing in the case of intonation, which for this reason serves as
the major example in this section, but it may also occur with regard to lexi-
cal accent or tone.
The easiest way to solve the pattern-recognition problem is to have sev-
eral speakers do the same thing, i.e. to produce the same utterance in the
same context with the same intention. Figure 1 illustrates the problem and
the suggested solution. It shows fundamental frequency tracings of the
segment (was für große) Ohren du hast '(what big) ears you have', taken
from the recordings of the folktale Little Red Riding Hood by five German
speakers. All speakers produce a rise on the initial accented syllable Oh
and then a continuous fall until the end of the utterance. Note how variable
the initial rise is (shaded area of left-hand column). For speaker JH it is
quite long, starts steep but then becomes flatter, while for speaker NF it is
steep and short. Speaker JN's rise is very minor indeed and it could be
argued that there is no rise at all in this syllable. Nevertheless, as the five
speakers are doing the same thing, i.e. producing the same utterance in the
same context (of reading the story aloud) with the same intention (of
expressing surprise at the radical changes in the grandmother's appearance), it
is also legitimate to assume that the different rise-falls in F0 seen in these
tracings are in fact realizations of the same category, i.e. the nuclear fall of
Northern Standard German (symbolized with H*+L in ToBI notation). Or,
viewed from the point of view of someone trying to detect a basic pattern,
the fact that one may reasonably assume that the five performances of the
utterances are the same on the level of the language system allows one to
recognize a common pattern, rise on the accented syllable plus continuous
fall until the end of the intonation unit.
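One way to make such a common pattern easier to see across speakers is to put the F0 tracings on a comparable scale before overlaying them. The following is a minimal sketch, not the procedure used for Figure 1: it assumes each tracing is available as a list of F0 values in Hz with 0 marking unvoiced frames, converts each contour to semitones relative to the speaker's own mean, and resamples it to a fixed number of points. All function names are hypothetical, and dropping unvoiced frames before resampling is a simplification.

```python
import math

def geometric_mean(values):
    """Mean on a logarithmic scale, appropriate for frequencies in Hz."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def to_semitones(f0_hz, ref_hz):
    """Convert F0 values in Hz to semitones relative to a reference frequency."""
    return [12 * math.log2(f / ref_hz) for f in f0_hz]

def resample(contour, n_points):
    """Linearly interpolate a contour onto n_points (>= 2) equally spaced steps."""
    if len(contour) == 1:
        return [contour[0]] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(contour) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out

def normalize_contour(f0_hz, n_points=20):
    """Speaker- and time-normalize one tracing.

    Assumption: a value of 0 marks an unvoiced frame and is dropped.
    """
    voiced = [f for f in f0_hz if f > 0]
    ref = geometric_mean(voiced)
    return resample(to_semitones(voiced, ref), n_points)

def mean_contour(contours):
    """Point-by-point average of several normalized contours."""
    return [sum(col) / len(col) for col in zip(*contours)]
```

With this normalization, a rise-fall produced in a low register and the same rise-fall produced an octave higher yield the same contour, so that differences which remain are differences of shape (for example, a long and shallow rise versus a short and steep one).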
Figure 1. Multiple performances of the same utterance, Ohren du hast; left
column: original traces, right column: retracings (from Grabe 1998: 245,
Appendix C)
Doing the same thing here importantly involves three aspects. First, the
utterances have to be segmentally identical (or at least very similar), be-
cause different segments have different microprosodic effects and it is not a
straightforward task to filter out these effects in an attempt to recognize a
basic pattern. Second, the utterances have to convey the same meaning and,
most importantly, they have to be performed with the intention of achieving
the same illocutionary act. As is well known, segmentally identical utter-
ances can be used to ask a question, give a command, make an ironic com-
ment, express surprise, etc. All of these different functions affect the pro-
sodic packaging and hence have to be controlled for when searching for
prosodically identical utterances. Third, the utterances have to be produced
in identical (or very similar) circumstances, e.g. as casual remarks between
adolescents, in a working environment between people of different status, etc.
With regard to the number of same utterances needed for a detailed
prosodic analysis, there are the following rough guidelines. The absolute
minimum for recognizing a pattern with some degree of reliability is three
instances, because with only two versions of the same utterance it is diffi-
cult, if not impossible, to decide what is distinctive and what coincidental
with regard to those aspects where they diverge. A good start with a de-
tailed analysis can be made with four versions of the same utterance, ideally
two by male speakers and two by female speakers. With eight different
versions, statistical analyses become more viable and useful. With 10–12
speakers, the sample size approaches that which is found in much work on
well-documented languages such as English or Japanese.
There is no principled upper limit for the sample size and, depending on
the phenomena being investigated, larger samples may become necessary
which also take into account variables such as age, register, and local dia-
lect. But to repeat, in the typical field setting of a hitherto undocumented
language spoken by a small number of speakers, samples of four to ten ver-
sions of the same utterance will provide a good basis for a detailed prosodic
analysis and will thus greatly improve the database for prosodic research.
Note also that, while preferable, it is not absolutely necessary that the
different versions are produced by different speakers. They could also have
been produced by the same speaker(s) on different occasions. Importantly,
more or less immediate repetitions of the same utterance (such as when
asking the speaker to repeat something she just said or to say something
twice) usually do not produce multiple versions of the same utterance in the
intended sense, because repetition usually has some impact on prosody.
It should be obvious that even in a very large corpus of recordings of
more or less spontaneous speech it will be difficult to find a set of four to
ten versions of the same utterance in the intended sense. There may be hun-
dreds or even thousands of utterances one may reasonably safely identify as
polar questions (e.g. Is he coming tonight?). But how many of these will be
segmentally identical or at least very similar? Furthermore, the circum-
stances in which the question is asked may not be really comparable. All of
which makes it difficult to determine those aspects in the prosodic packag-
ing that are related to categorical distinctions. To be sure, in the case of
polar questions, it may be possible to determine these aspects with a rea-
sonable degree of certainty on the basis of a sufficiently large sample from
spontaneous speech. But it is more cumbersome to do this only on the basis
of such a sample and it may become more and more difficult to do it when
investigating more complex issues. In particular, when investigating prob-
lems in the prosodic packaging of information structure (focus, contrast,
deaccenting, etc.), the number of variables to be controlled and accounted
for may become so high that all results remain speculative.
Ideally, then, a comprehensive language documentation should contain
sets of different versions of the same utterance, each set representing a dif-
ferent major function where prosody may be of relevance (i.e. one set for
polar questions, one for all-new utterances, one for polite commands, and
so on). While such sets may happen to occur in a sufficiently large corpus
of spontaneous recordings without paying particular attention to the topic
of prosodic analysis, there are three ways to ensure that they are in fact
represented in the documentation.
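Whether a corpus already contains such sets can be checked mechanically once recordings carry minimal metadata. The sketch below is only an illustration under stated assumptions: the metadata keys (function, utterance, speaker, session) are hypothetical, and the thresholds simply restate the rough guidelines given above (three versions as the absolute minimum, four as a good start). Versions by the same speaker in the same session are counted once, since immediate repetitions do not count as separate versions in the intended sense.

```python
from collections import defaultdict

MIN_VERSIONS = 3  # "absolute minimum" in the guidelines above
GOOD_START = 4    # "a good start with a detailed analysis"

def coverage(recordings):
    """Count usable versions per (function, utterance) set.

    recordings: iterable of dicts with the hypothetical keys
    'function', 'utterance', 'speaker', and 'session'.
    """
    sets = defaultdict(set)
    for r in recordings:
        # One (speaker, session) pair counts as one version.
        sets[(r["function"], r["utterance"])].add((r["speaker"], r["session"]))
    return {key: len(versions) for key, versions in sets.items()}

def gaps(recordings, functions, threshold=GOOD_START):
    """Return the functions whose best-covered set is below the threshold."""
    best = {f: 0 for f in functions}
    for (function, _utterance), n in coverage(recordings).items():
        if function in best:
            best[function] = max(best[function], n)
    return sorted(f for f, n in best.items() if n < threshold)
```

Calling gaps with the list of functions a project wants to cover (polar questions, all-new utterances, polite commands, and so on) returns those for which further material still has to be collected.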
First, work with prompting tools such as video clips, a picture story, or
matching games where one speaker instructs another to identify an object
among a set of similar objects or to find a path through an imaginary
landscape (the so-called map task) will produce similar, if not truly identical,
utterances. Particularly useful are games where speakers engage in different
types of speech acts (e.g. asking a question, giving directions, confirming
a suggested solution), provided that the structure of the game forces
speakers to talk about the same world (i.e. to use the same lexical items)
so that the utterances become reasonably similar with regard to their seg-
mental make-up.
The second method to produce relevant data sets is to try direct elicita-
tion by asking speakers to produce utterances or, more precisely, mini-
discourses prepared in advance. The major problem here is how to present
the target utterances in such a way that the prosody is not influenced by the
prompt. We will look at the prompting problem more closely in Section 4.
Here are a few examples of the kind of sentences one may want to try to
elicit with an indication of the prosodic function they target given in square
brackets:
(1) Has X arrived? No, I haven't seen him/her/them yet.
[polar question-answer pair]
(2) (In the market:) What are you looking for? (I am looking for)
vegetables. [question word question-answer pair]
(3) Have a seat, please! [polite command]
(4) (Group of people standing at road side, obviously agitated.
What happened?):
A bus turned over!
or: The dog killed a pig! [all-new utterances]
(5) I like the blue shirt, not the red one. [contrastive focus]
(6) Have you ever eaten a black snake? No, I don't eat snakes.
(7) (Surprise:) How big you are already! [speaker attitude]
This list of examples is not complete and should be expanded and adapted
as required by the project setting and make-up. However, since eliciting
such examples will usually not be an easy task and not something which
native speakers will be very eager to do, one should plan to spend consider-
able time on drafting the right set of examples and to test all of them with
one close collaborator before approaching a larger number of speakers for a
One consideration in drafting the examples is segment structure. Exam-
ples should include as few fricatives as possible and in general should
avoid voiceless consonants of all manners of articulation. The ideal example
in fact consists only of vowels and nasals, which of course is an
ideal that will hardly ever be attainable when attempting to construct exam-
ples which make sense and are culturally appropriate. Having semantically
and pragmatically well-formed and culturally appropriate utterances will in
general be the more important concern since otherwise the elicitation will
not work at all.
The third way of getting comparable data sets for prosodic analysis is to
make sure that the corpus of recordings contains a sufficient number of
utterances using stylized intonation. A typical example of an everyday use
of stylized intonation is a calling or vocative contour (Ladd 1996: 88,
136 f.). There may be different calling contours, for example, one for calling
someone (Peter!), one for market cries, one used by street-vendors for
advertising fish, and so on. In many languages, listing items (e.g. they had
lots of cows, goats, chicken, and dogs) also involves a special, somewhat
stylized intonation (listing intonation; see also next section). Otherwise,
stylized intonation is a common feature of many forms of ritual speech, in
particular of the so-called chanted speech.
For purposes of prosodic analysis, the main advantage of stylized
intonation contours consists in the fact that, by their very nature,
intonational contrasts are more stable and more marked than in non-stylized
contours. Consequently, patterns are generally much more easily recognizable. In fact,
while native speakers often do not have very clear intuitions about non-
stylized intonation patterns, they often know about stylized contours and
can readily imitate them.
Obviously, patterns used in stylized intonations differ from those used
in non-stylized ones and similarly, it may be the case that intonation pat-
terns in elicited examples differ quite clearly from those found in spontane-
ous speech (compare the phenomenon of reading intonation found in
many European languages). In this regard, it should be clearly understood
that elicited and stylized data sets have the function of allowing one to get
started on prosodic, and specifically intonational, analysis. They enable the
investigator to get a basic idea of what kind of contrasts are being made in
the language and thus to develop hypotheses that have to be tested with the
spontaneous material. A comprehensive prosodic analysis of course has to
be able to account for the full range of phenomena found in a corpus of
spontaneous recordings.
3. Recording words
It is a widespread practice in linguistics to record lists of elicited words in
order to be able to check one's transcriptions and to document the basic
sound structure of lexical items. The format usually used in such recordings
is first to give the translation equivalent of the word in the contact language
being used (or the number of the word in a word list) which is then fol-
lowed by the word in the documented language, often repeated once or
twice. In this way, words are recorded in isolation, which is often under-
stood to mean in their most basic form, free from any contaminating
contextual influences. This, however, is a misconception, since uttering an
isolated word always constitutes a minimal utterance, which is of particular
import for prosody. Importantly, words in isolation do not only display
whatever lexical prosodic features they might have (lexical tone or accent)
but also features of (usually declarative) utterance prosody. This may ap-
pear to be a rather trivial point, but even in the specialist literature this dis-
tinction has not been made consistently until fairly recently.
As an example, compare Figures 2 and 3. Figure 2 shows the waveform
and F0 tracing for a single Waima'a word, kaluha 'cloud', recorded in
isolation. Figure 3 shows the waveform and F0 tracing for a short Waima'a
utterance, kii baa ini 'there are people fighting' (lit. 'people hit each other';
an all-new response to a what's-going-on type of question). Note that the
F0 tracing is essentially identical in both figures: it starts out flat at
mid-range, rises and begins to fall again on the penultimate syllable, and
continues to fall on the last syllable. Hence, the question arises whether the rise
on the penultimate syllable in kaluha is part of the lexical make-up of this
item, reflecting at least in part a regular lexical accent on the penultimate
syllable. Alternatively, this rise-fall on the last two syllables (which can be
observed for practically all Waima'a lexical items uttered in isolation) is
due exclusively to the fact that uttering a Waima'a word in isolation also
involves the utterance-level features of a standard Waima'a declarative
utterance. (At the time of writing this chapter, I believe that the latter option
is correct, but this needs further research and testing. For current purposes,
it is not relevant which of the two options turns out to be correct. The point
to be clearly understood is that words in isolation always and by necessity
display features of utterance-level intonation.)
Figure 2. Waveform and F0 for Waima'a word in isolation (kaluha 'cloud')
Figure 3. Waveform and F0 for Waima'a short utterance (kii baa ini 'there are
people fighting')
In order to be able clearly to separate lexical and post-lexical (utterance-
level) prosodic features, it is now a common practice in research on prosody
(but also in many segmental phonetic studies) not to record words in isola-
tion even when lexical features are the primary concern. Instead, the ideal
is to record the target word(s) in different positions in a carrier phrase, as in
the following English examples:
(8) The target word America in different positions in a carrier phrase
a. America is a word I know. [initial position]
b. I said America once. [phrase-internal position]
c. She said America. [final position]
As seen in these examples, the different position will usually involve dif-
ferent information structural implications, which may, but do not have to,
correlate with post-lexical prosodic distinctions. Furthermore, since the
purpose of these recordings is to compare characteristics of different lexical
items, the carrier phrase usually involves very general items, in particular
verbs such as say, hear, or know (a word) which in principle are
compatible with all lexical items.
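Because the same frames are reused for every target word, the prompt list for a recording session can be generated rather than typed by hand. The sketch below uses the English frames from (8) purely for illustration; the dictionary and function names are hypothetical, and for an actual project the frames would have to be replaced by carrier phrases that are natural in the documented language.

```python
# Carrier frames taken from example (8); "{word}" marks the target slot.
CARRIER_FRAMES = {
    "initial": "{word} is a word I know.",
    "phrase-internal": "I said {word} once.",
    "final": "She said {word}.",
}

def carrier_prompts(words, frames=CARRIER_FRAMES):
    """Return (word, position, prompt) triples, one per word and position."""
    return [
        (word, position, frame.format(word=word))
        for word in words
        for position, frame in frames.items()
    ]

if __name__ == "__main__":
    for word, position, prompt in carrier_prompts(["America", "Africa"]):
        print(f"{word:10} {position:16} {prompt}")
```

Keeping the position label with each prompt makes it straightforward to compare later the recordings of one word across positions.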
Figures 4 and 5 illustrate the effect of carrier phrase position with
another example from Waima'a. Here, the target word aboo 'grandparent,
old/respected person' occurs at the end of a carrier phrase (ne ehe aboo
'she said aboo') and at the beginning of another one (aboo aku de nau
'[the word] aboo I don't know').
Note how the change in position correlates
with a clear change in pitch (rise-fall on boo in final position, late rise on
boo in initial position). But note also what remains constant in both posi-
tions. Most importantly, in both instances boo is roughly twice as long as
the initial syllable a. Consequently, it may be hypothesized that boo con-
tains a long vowel as part of its lexical make-up and that the fact that this
syllable is long in both recordings is not due to an utterance-level effect.
Figure 4. Waima'a carrier phrase with final target word (ne ehe aboo 's/he said
aboo')
Figure 5. Waima'a carrier phrase with initial target word (aboo aku de nau 'aboo I
don't know')
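The reasoning applied to aboo, namely that a property which stays constant across carrier-phrase positions is a candidate lexical feature, can be stated as a simple comparison of duration ratios. The following sketch assumes that syllable durations have already been measured for each position; the function names and the tolerance of 25 percent are hypothetical choices, not values taken from the Waima'a data.

```python
def duration_ratio(syllable_durations, target, reference):
    """Ratio of the target syllable's duration to the reference syllable's.

    syllable_durations: dict mapping syllable labels to durations in seconds.
    """
    return syllable_durations[target] / syllable_durations[reference]

def stable_across_positions(ratios, tolerance=0.25):
    """True if every ratio lies within +/- tolerance (proportional) of the mean.

    The tolerance is an arbitrary illustrative threshold.
    """
    mean = sum(ratios) / len(ratios)
    return all(abs(r - mean) <= tolerance * mean for r in ratios)
```

If the ratio of boo to a is close to 2 in both final and initial position, stable_across_positions returns True, which supports (but does not prove) the hypothesis that the length is lexical rather than an utterance-level effect.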
If working with carrier phrases proves to be too cumbersome or does not
work for some other reason (see next section), one may try to record words
in mini-lists of three to four items, alternating the position of the words
contained in the list, as in (9).
(9) Mini-lists with alternating orders
a. America, Africa, Antarctica
b. Africa, Antarctica, America
c. Antarctica, America, Africa
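The orders in (9) are the cyclic rotations of the list, so that every word occurs once in each position. A minimal sketch for generating them for any mini-list (the function name is hypothetical):

```python
def rotations(items):
    """Return all cyclic rotations of a list.

    Each item occurs exactly once in every list position across the result.
    """
    return [items[i:] + items[:i] for i in range(len(items))]

if __name__ == "__main__":
    for order in rotations(["America", "Africa", "Antarctica"]):
        print(", ".join(order))
```

For the three words in (9) this yields exactly the orders (9a) to (9c).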
While not as useful as recordings in carrier phrases, such mini-lists often
allow one to make at least a distinction between final and non-final utter-
ance prosody, provided that the speakers actually use listing intonation and
do not simply produce three isolated utterances in rapid sequence. As in the
carrier-phrase example, prosodic features which remain the same across
different positions in the list can be hypothesized to be lexical rather than
post-lexical.
4. The prompting problem
Most of the procedures discussed in the preceding two sections involve the
elicitation of prosodic data by asking speakers to produce various kinds of
utterances or mini-discourses. While elicitation quite generally may involve
problems with regard to the naturalness and reliability of the data thus ob-
tained, elicitation of prosodic data is particularly prone to major distortions
since prosodic features are highly susceptible to contextual influences.
Thus, there is little use in presenting the items to be recorded simply by
having native speakers repeat what the researcher or one of her local co-
workers says. In almost all circumstances, this will produce highly distorted
utterances which will largely imitate features of the presented utterance or
display the prosodic characteristics of a repeated utterance.
The most widely-used procedure in prosodic research on languages with
a well-established writing tradition is to have speakers read the target utter-
ances. This procedure, while not directly influencing the prosody by pro-
viding a model for imitation, may encounter a number of other problems.
Most importantly, the reading tasks require that the speakers actually enact
the intended utterance type. Obviously, there is little use in having someone
read a question or a surprised exclamation in a rather flat, non-engaged
monotonous voice. Not all speakers are able or willing to engage in
such a performance. Successful reading prompts also presuppose that the
speakers are reasonably fluent in reading the language. This will often not
be the case even in those communities where speakers are literate in a
dominant language but not used to seeing their own language written (read-
ing in such circumstances will be slow and in a word-by-word style). An-
other complication may arise from the fact that reading intonation differs
significantly from conversational intonation.
In non-literate societies, written prompts obviously will not work at all.
The main alternative here is to try various kinds of role-playing or the ex-
perimental tasks involving video clips, etc., already mentioned above in
Section 2. Role-plays may work when carefully prepared with a local team
member. They involve speakers pretending to be in a given situation and
reacting with an appropriate short utterance rehearsed in advance. Thus, for
example, one may ask a pair of speakers to pretend meeting one another in
the market, one asking 'what's happening there?' and the other responding
with the target utterance 'people are fighting'. In the best of circumstances,
the speakers engaged in this role-play will actually engage in a short con-
versation, continuing this imagined question-answer pair with a short se-
quence of further utterances. It will often not be possible to make them use
exactly the target utterance prepared in advance, but minor variations in its
segmental make-up will usually not cause major problems for comparabil-
ity. The more realistic the role-playing is, the better the quality of the pro-
sodic data produced in this way will be.
In preparing role-plays and experimental tasks it should be kept in mind
that these will in all likelihood be very strange kinds of communicative
events for native speakers who are not familiar with the basic idea of role-
playing, experiments, or interviews. Thus, one has to be prepared to face
quite a few obstacles when trying to collect data in this way. Continuous
laughing or giggling because of the unusualness and unnaturalness of the
situation is one very common problem. Speakers may also change the
speech act, i.e. rather than responding with a statement (He has gone to the
market) they may produce a command (Go to the market!). Furthermore,
it is not uncommon that speakers who are asked to retell a short action se-
quence in a video clip comment on the kind of dress people are wearing or
the color of the sky visible in the clip instead of engaging with the given
task. Considerable time and ingenuity in developing appropriate prompts
may thus be required in order to make experimental tasks work or to de-
velop useful forms of role-playing in a given community. But this effort
will be well spent because the data generated in this way will be very useful
not only for prosodic analyses, but often also for other types of analysis.
5. Perception experiments
The procedures presented so far in this chapter all focus on production data,
i.e. sets of utterances which can be analyzed acoustically and auditorily.
But production data do not provide any basis for determining which com-
ponents of the complex signal are actually perceived as prosodically dis-
tinctive by native speakers. It is well known from research on European
languages that not all the distinctive information available in the acoustic
data is perceived as such by native speakers. Consequently, there is a need
for data to answer questions such as: Is this clearly observable prominence
(e.g. a change in pitch direction, or increased loudness or duration) actually
perceived by native speakers? Is it perceived at the location where it is ob-
served in the signal? Which of the major phenomena observed for lexical
accents is perceived as distinctive: pitch, duration, length, or vowel quality?
The most straightforward way to answer such questions is to run perception
experiments. In such experiments, the prosodic parameters observed in a
set of utterances are modified and sets of modified utterances (or sets of
modified and unmodified utterances) are then evaluated by native speakers.
For example, loudness on a lexically accented syllable could be reduced
and then it could be tested whether the syllable is still perceived as promi-
nent. Or, the final rise in a question utterance could be reduced or shifted to
an earlier syllable and then tests could be run to determine whether the ut-
terance is still perceived as a question.
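The manipulation described above can be sketched numerically. The following is a minimal illustration in Python, assuming an F0 contour is available as a plain list of values in Hz; the function name, the contour values, and the scaling factors are all invented for illustration. Actual resynthesis of audio stimuli would be carried out in a speech analysis tool such as those named in this section; the sketch only computes target F0 values.

```python
from typing import List


def scale_final_rise(f0_hz: List[float], rise_start: int, factor: float) -> List[float]:
    """Return a copy of an F0 contour whose final rise is scaled by `factor`.

    Values before `rise_start` are left untouched. Each later value keeps only
    `factor` times its excursion above the F0 value found at `rise_start`.
    """
    if not 0 <= rise_start < len(f0_hz):
        raise ValueError("rise_start must index into the contour")
    if factor < 0:
        raise ValueError("factor must be non-negative")
    baseline = f0_hz[rise_start]
    head = f0_hz[:rise_start]
    tail = [baseline + factor * (value - baseline) for value in f0_hz[rise_start:]]
    return head + tail


# Invented contour (Hz), one value per analysis frame; not measured data.
question = [180.0, 175.0, 170.0, 168.0, 190.0, 220.0, 250.0]

# One unmodified stimulus (factor 1.0) and two with a reduced final rise.
stimuli = {factor: scale_final_rise(question, 3, factor) for factor in (1.0, 0.5, 0.0)}
```

A set of this kind, containing the unmodified contour alongside graded modifications, corresponds to the "sets of modified and unmodified utterances" that native speakers would then be asked to evaluate.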
It is not an easy task to prepare and run perception experiments of this
type and to date, very few perception experiments have been reported for
languages outside Europe and Japan. In some ways, the easiest part is the
preparation of stimuli since speech analysis tools such as EMU, PRAAT,
Wave Surfer, or Speech Analyzer allow for a relatively easy and straight-
forward modification of pitch and other prosodic parameters in digitized
utterances. The more difficult part is to find a way to run the tests,
especially in societies which have little or no experience with experiments.
That is, perception experiments are also faced with the prompting prob-
lem. Problems here may already arise because speakers may refuse to put
on a headset (which is the best way of ensuring that they can listen care-
fully to the stimuli). But the main challenge consists in defining a manage-
able task. It will usually not be possible to ask directly for the identification
of prosodic properties with questions such as: Where is the major promi-
nence? Is X higher than Y? etc. Instead, what may work are tasks which
involve some kind of comparison and ranking of two different items, asking
questions such as: Which of these two items is more natural/more appropri-
ate/more often heard? Which item would you use when speaking in public?
or the like. Otherwise, general comments on the stimuli (such as: this
sounds rather odd or foreign; that's how people down south speak; etc.)
may also provide important clues, although they will make for a very het-
erogeneous and difficult to quantify dataset.
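Responses to a comparison task of the kind suggested above are comparatively easy to quantify. The following Python sketch assumes each judgment is recorded as a triple of two stimulus labels and the label the listener preferred; the labels and the data are invented, and the share of wins is only one of several possible summary measures.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (stimulus_a, stimulus_b, preferred); all labels are hypothetical.
Judgment = Tuple[str, str, str]


def preference_shares(judgments: Iterable[Judgment]) -> Dict[str, float]:
    """For each stimulus, return the share of its comparisons that it won."""
    wins: Counter = Counter()
    appearances: Counter = Counter()
    for a, b, preferred in judgments:
        if preferred not in (a, b):
            raise ValueError("preferred item must be one of the pair")
        appearances[a] += 1
        appearances[b] += 1
        wins[preferred] += 1
    return {stimulus: wins[stimulus] / appearances[stimulus] for stimulus in appearances}


# Invented answers to "Which of these two items is more natural?"
judgments = [
    ("original", "reduced_rise", "original"),
    ("original", "reduced_rise", "original"),
    ("reduced_rise", "original", "reduced_rise"),
    ("original", "shifted_rise", "original"),
]
shares = preference_shares(judgments)
```

Free comments on the stimuli cannot be tallied in this way, which is one reason why they make for a dataset that is harder to quantify.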
In this regard, it should be noted that non-experimental, conversational
data may sometimes also provide important clues as to which prosodic fea-
tures are perceived as relevant in a given speech community. A somewhat
trivial, but nonetheless relevant example is the fact that a conversational
corpus allows one to collect a set of examples of utterances which are
treated by the interactants as questions and to compare these to utterances
which are prosodically similar but are not taken up as questions by the lis-
tener. More complex are examples where a misplaced emphasis or wrong
intonation contour produces a misunderstanding, leading to a repair se-
quence. See the contributions in Couper-Kuhlen and Selting (1996) and
Couper-Kuhlen and Ford (2004) for relevant observations and examples.
I would like to thank Bruce Birch and Bob Ladd for extensive discussion of
the issues and ideas presented in this chapter. They, of course, are not re-
sponsible for the use I have made of their input. Thanks also to Jan Strunk
for help with plotting the figures.
Very special thanks to Maurício C.A. Belo, my Waima'a collaborator,
who has patiently suffered through various trial runs of the procedures dis-
cussed in this chapter as well as some other procedures which have been
found not to be productive. Further information and full acknowledgements
for the Waima'a project can be found at
pageDobes1/ SubpagesTeams/SubpageWaimaa/Frameset.htm.
This work was made possible by a research professorship funded by the
Volkswagen foundation and I am most grateful for this very generous support.
1. Features defining a good recording are listed in Section 2.1 of Chapter 4.
2. Examples of what can be done and what cannot be done in terms of pro-
sodic analysis on the basis of a corpus of recordings alone are King's (1994)
and Bishop's (2002) theses on the intonation of Dyirbal and Bininj Gun-wok,
respectively. King's thesis is exclusively based on tape recordings of narrative
and procedural texts made by R.M.W. Dixon in the 1960s and 70s. On the ba-
sis of this material, King is able to make a proposal for some key features of
Dyirbal intonation. However, at various points she has to take note of the fact
that the available genres (mostly narrative) severely limit the scope of her
analysis. Furthermore, she notes that much of her analysis remains speculative
as long as it is not possible to test whether the prosodic distinctions she estab-
lishes on the basis of acoustic data alone are actually also perceived as signifi-
cant distinctions by Dyirbal speakers. For perception, see also Section 5 below.
3. Spectral characteristics here refers to those aspects of the formant structure
of speech sounds which reflect prosodic features, e.g. the energy distribution
across the frequency spectrum, which may be an acoustic correlate of stress.
4. Figure 3 below includes a very clear illustration of this effect in that the /b/ of
baa causes a noticeable dip in the F0 contour. Laver (1994: 452–456) pro-
vides a fuller discussion of microprosodic perturbations.
5. The tracings are given in two versions, the right-hand column presenting the
original F0 extractions, the left-hand column a somewhat smoothed version of
these. See Grabe (1998: Chapter 2) for further information on the procedures
used in collecting and processing the data. This thesis is available at http://
6. The precise details of the analysis are of no concern here. Note that Grabe
(1998: Chapter 3, Section 2) makes the proposal that the nuclear fall in North-
ern Standard German allows for two major alternative realizations, one with a
clear rise on the accented syllable and one where pitch is more or less level in
the accented syllable (as with JN in Figure 1). The distinction between these
two (phonetic) realizations of the same phonological category is argued to be
7. Further references and links for prompting tools can be found in Chapter 6 and
on the book's website.
8. Examples (4)–(6) all target distinctions of information structure, a rather com-
plex topic which cannot be adequately dealt with here. See Lambrecht (1994)
and Jacobs (2001) for a thorough discussion of some of the basic distinctions
and issues involved, Ladd (1996) for the role prosody may play in marking
information structure, Drubig (2003) for a typological survey, and Dimroth
(2002) for an elicitation task targeting information structure.
9. Bruce (1977) is widely considered the first modern work where the distinction
is fully and consistently applied.
10. As mentioned above, the initial dip in Figure 3 is a microperturbation caused
by the /b/ in baa. The utterance in Figure 2 is by a male speaker, the one in
Figure 3 by a female speaker and therefore overall considerably higher.
Wavefiles containing the utterances of Figures 2–5 are available at the book's
11. The speaker, of course, knows the word aboo, but putting it in initial position
and not using a negation (i.e. using the equivalent of aboo I know as a
prompt) was not felt to be appropriate.
12. Most recent work in this field has been done by researchers associated with the
Phonetics Laboratory, Universiteit Leiden Centre for Linguistics, mostly on
languages of Indonesia, in particular Malay. See Ebing (1997), Odé (1997,
2002), van Zanten et al. (2003), and Stoel (2005: 108–208) for examples and
references. These works also provide detailed discussion as to how prosodic
experiments can be devised and administered. There is also a fair amount of
work being done on the perception of prosodic differences between Russian
dialects by a group of researchers associated with the Bochum Linguistic Lab
Chapter 8
Ethnography in language documentation
Bruna Franchetto
Ethnographical information is a crucial component of any language docu-
mentation. If the wider goal is not simply to collect texts and a lexical data-
base, but also to present and preserve the cultural heritage of the speech
community, then ethnographical information must be linked to the linguistic
data and its annotation and analysis. However, the integration of linguistic
tronic archive is not an easy task.
The main question to be addressed here is: What does an ethnographer
look for when she or he consults a language documentation as conceived of
in this book? In other words, which kind of information may be irrelevant
for a linguist but highly relevant for an ethnographer? In addressing this
question, I will have little to say about how to annotate ethnographical
information in technical terms. Instead, my main concern will be to make
explicit the requirements of a demanding user of a language documentation,
the ethnographer or anthropologist.
There are two main sources on which the discussion in this chapter is
based. On the one hand, I have interviewed anthropologists working in Bra-
zil and their responses have been condensed into the key topics mentioned
in Section 2. On the other hand, I heavily draw on three years of experience
in the project "Linguistic, historical and ethnographical documentation of
the Upper Xingu Carib language or Kuikuro" (hereafter referred to as the
"Kuikuro Project"), which was funded as part of the DoBeS initiative. I
shall use this experience in Section 3 to illustrate one of the possible ways
of managing ethnographic information in a language documentation (unless
otherwise mentioned, all the examples and illustrations provided here come
from this project). So although what follows undoubtedly will provoke
more general questions and ideas, it should be remembered that it reflects a
specific experience. As a background for the discussion in Sections 2 and 3,
Section 1 provides a few general observations regarding the role of lan-
1. A note on language and ethnography
As Bronislaw Malinowski (1935) emphasizes, we must not forget that lan-
guage is the primary tool used by ethnographers who obtain much of the in-
formation comprising their knowledge of the "other" through the discourse
of their "natives" (later called "informants" and nowadays referred to as
"consultants"). However, the way this linguistic input is dealt with will of
course differ as researchers with different theoretical backgrounds usually
look for different things. For example, the prominence given to the notion
of codes in structuralist theory means that linguistic data are subject to a
particular form of scrutiny to provide evidence for basic structural patterns
underlying language, culture, and society. For culturalists, on the other hand,
direct observation, involvement, and interpretative work all form essential
aspects of the ethnographic process.
Ethnographers aim to recognize genres and registers of speech, describe
the contexts of speech events, and identify apparently significant terms and
expressions. The latter may become key native categories to be widely ex-
plored in their analyses and their endeavors to explain cosmologies, social
structures, ritual events, transcriptions and transformations between human
and non-human worlds. As a result of their endeavors, ethnographers pro-
duce another form of discourse in their own language and that shared by
their readers and listeners (the famous ethnographic narrative), allowing
their public to share knowledge about or produced by the other. In doing so,
ethnographic discourse faces the double task of introducing its audiences to
a particular universe without losing its comparative horizon, turning the
exotic familiar and the familiar exotic, following Claude Lévi-Strauss's
theoretical and methodological agenda. Indeed, opening up the particular to
comparison is an aim shared by the ethnographer and the linguist.
This entire process involves successive phases of transcription and trans-
lation. Transcription is a painstaking task which should aim to represent
melodic and rhythmic units as closely as possible (see also Chapter 10 for
further notes on transcribing spoken discourse). In the transcripts that fol-
low (as well as in the overall Kuikuro documentation), I have tried to apply
the ideas of ethnographers specialized in verbal art forms, such as Dell
Hymes (1977, 1992), Joel Sherzer (1990) and Dennis Tedlock (1983).
Translation is a favorite theme of anthropology in general. Anthropology
teaches us about the possibilities and risks of translation, continually em-
phasizing the importance of translation work and the skill and sensibility
required to achieve a good translation: "good" in the sense of trustworthy,
as Malinowski put it; "good" in the sense of competent, something only
made possible by allying linguistic and ethnographic knowledge; "good",
too, in the sense of respecting the meanings carried by the source lan-
The work involved in translation is most delicate. Chanted words fre-
quently derive from special registers, the famous "words of the ancestors".
Faced with their esoteric quality, many ethnographers have declared this
kind of language unintelligible. Here linguists can contribute through their
capacity to disclose the meanings of phrases and terms used in these formu-
laic and sui generis languages. Translation work also typically faces diffi-
culties in turning extremely dense and elliptical metaphors into something
at least minimally comprehensible.
Although translation is not approached here as a separate topic, it under-
lies everything else in this chapter. Translation must be understood in the
widest possible sense, ranging from kinds of transcription and annotation
that allow the basic characteristics of verbal performances to be recovered,
especially the most elaborate examples in terms of form, rhythm, register,
vocabulary and meanings, to translation properly speaking, working from a
source-language to a target-language. A vast literature exists on translation,
mostly found in the areas of literary criticism and poetic theory, which
would be interesting to investigate further in order to understand the prob-
lems and reach of translation at all levels. Useful starting points for explor-
ing this literature are the books edited by Swann (1992) and by Rubel and
Rosman (2003), as well as Bringhurst (1999).
2. Some topics an ethnographer is likely to look for
Assuming that the core of a language documentation consists of a collec-
tion of texts (i.e. annotated recordings of speech events) and a lexical
database, there are somewhat different, but also largely overlapping topics
an anthropologist may look for in each of these components. Before we
take a closer look at these, it will be useful to note that most ethnographers
have little interest in information on linguistic structures per se. That is,
ethnographers, with a few exceptions, do not read grammars. Linguistic
structure only becomes interesting when it can be linked directly to culture
and history. Thus, for example, etymologies are one of the favorite "lin-
guistic" exercises of ethnographers, and it is probably fair to say that, not
infrequently, such etymologies are amateurish at best. Here providing an
indication of morphological boundaries for lexical items, backed-up by a
clear descriptive exposition of the basic morphology, will help to avoid
amateur etymologies. More complex examples of when linguistic structure
becomes highly relevant to anthropological concerns will be found in Sec-
2.1. Consulting a lexical database
Like most other users of a lexical database, ethnographers will profit from
the amount of detail provided in definitions and the care given to the word-
ing of translations (for further discussion of the issues raised in this section,
see also Chapter 6). Whenever possible, one should try to distinguish basic
and derived uses when explicating the full range of a term's meanings. For
example, translating Kuikuro oto simply as 'owner' fails to capture the
dense web of its uses: these are only attainable by collating all occurrences
of the expression X-oto, 'owner of X', which include festivals, community
structures, forms of knowledge, objects, kinship, etc. Oto therefore desig-
nates a very particular relationship of control between a person and a cul-
turally relevant object, and the complete set of contexts in which it occurs
allows the anthropologist to consider the nature of this relationship and at-
tain the level of abstraction needed to attempt to define it independently of
any one of its specific occurrences. Another example is tolo, which means
'bird', 'pet', 'a song genre', and 'my lover'. This list of meanings immedi-
ately raises the question of which is the basic sense and how the derived
senses are linked to it? Various kinds of evidence suggest that the basic
meaning is that of 'bird', the prototypical pet of the Upper Xingu. The rela-
tion between the pet bird and its owner is one between a fledgling and the
person who lured and caught the young bird in order to 'familiarize' it. As
Fausto (1999; see also Erikson 1987) notes, the relation between a pet and
its owner is found throughout Amazonia and defines various thematic
domains, such as shamanism, ritual, warfare, capture, hunting and so on. In
the Upper Xingu, a lover is thus equivalent to a pet bird; the prototypical
tolo songs are messages from a lover to her/his beloved one.
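The distinction between basic and derived senses, and the collation of all X-oto expressions, can be pictured as a small data structure. The following Python sketch is only one possible arrangement and does not reproduce the Kuikuro Project's actual database format. The class and function names are assumptions, the domain labels are invented, and the forms X1-oto and X2-oto are placeholders rather than Kuikuro words. Only the glosses of tolo and the choice of 'bird' as its basic sense are taken from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    gloss: str
    basic: bool = False  # True for the sense analyzed as basic
    thematic_domains: List[str] = field(default_factory=list)


@dataclass
class LexicalEntry:
    form: str
    senses: List[Sense]

    def basic_sense(self) -> Optional[Sense]:
        """Return the first sense flagged as basic, or None if none is flagged."""
        for sense in self.senses:
            if sense.basic:
                return sense
        return None

    def derived_senses(self) -> List[Sense]:
        """Return all senses not flagged as basic, in their listed order."""
        return [sense for sense in self.senses if not sense.basic]


def collate_by_head(entries: List[LexicalEntry], head: str) -> List[LexicalEntry]:
    """Collect every entry whose form ends in '-<head>', e.g. all X-oto forms."""
    suffix = "-" + head
    return [entry for entry in entries if entry.form.endswith(suffix)]


# Glosses follow the text; the domain labels are invented for illustration.
tolo = LexicalEntry("tolo", [
    Sense("bird", basic=True, thematic_domains=["ANIMALS"]),
    Sense("pet", thematic_domains=["ANIMALS"]),
    Sense("a song genre", thematic_domains=["VERBAL ART"]),
    Sense("my lover", thematic_domains=["KINSHIP AND AFFECTION"]),
])

# Placeholder forms standing in for real X-oto expressions.
lexicon = [
    tolo,
    LexicalEntry("X1-oto", [Sense("owner of X1")]),
    LexicalEntry("X2-oto", [Sense("owner of X2")]),
]
oto_forms = [entry.form for entry in collate_by_head(lexicon, "oto")]
```

Keeping the basic sense as an explicit flag rather than relying on list order makes the analytic decision visible to a later user of the database, who may wish to question it.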
The search for ethnographically relevant information is facilitated by the
definition of thematic domains in addition to the semantic domains used by
many linguists and lexicographers. Here semantic domain refers to a set of
features which define very general and inclusive fields of meaning, are of-
ten relevant to grammatical marking and which are associated with a large
number of lexical entries. The categories used may be created by the re-
searcher or form part of native classifications. Examples include features
Although some semantic domains may contain information useful for
ethnographic purposes, the more narrowly defined thematic domains will
be of greater interest in this regard. But note that the difference between the
two types of semantic annotation is gradual at best, and there are some
overlaps, as will be clear from the following brief review of major thematic
fields, some of which (e.g. body parts or kin terms) are also often found in
KINSHIP terminology is a key area of ethnographic inquiry. These terms,
on the one hand, denote positions in a genealogical structure, but they are
also inherently relational terms, associated with multiple denotata. The de-
termination of kin relationships is influenced by many variables, such as
genealogical distance or proximity, the calculations made through a third
relation mediating between ego and the individual being addressed or re-
ferred to, as well as contextual and momentary variables, such as factional
disputes, broken marriages, extraconjugal affairs, and so on. A systematic
analysis of kinship terminology must include a precise indication of the
positions covered by each term in a genealogical structure, using the vo-
cabulary or abbreviations currently used in anthropology. An extended ex-
ample, representing the Kuikuro consanguineal kinship terminology, can be
In addition to kinship terms proper, related general and specific terms
are also part of this thematic field. For example: Is there a general term for
'kin'? Possibly no equivalent of general terms such as 'kin' is found in the
language under study, but we may find collective terms in generation
(ego's generation), such as terms for male relatives of the same generation
(i.e. a cover term for 'brothers' and 'cousins'), and this is a relevant source
of ethnographical information.
BODY PARTS: Here the existence of alternative terms for the same body
part may prove to be an interesting source of information. In Kuikuro, lines
on the palms of hands are also called katuga etoho 'used for (the) mangaba
to come', and the upper central region of the forehead and the
thigh can be referred to with katuga agitoho 'used to throw (the) mangaba
(resin ball)'; both designations refer to an ancient and abandoned ritual
MATERIAL CULTURE or ARTEFACTS: The terminology relating to the
building and structure of the traditional house may prove relevant, for ex-
ample, if some of its parts are named after human body parts, as well as
being useful from a comparative perspective. Here we can observe the fer-
tility of symbolic analyses of the house in an Amazonian context in the
work of Hugh-Jones (1995) as well as Bourdieu's (1970) classic on the
Kabyle house in northern Africa.
verbs denoting actions and events in the agricultural economy, often ex-
tractable from origin myths on cultivated plants. In the Amazonian context,
the lexicon relating to types of swidden agriculture and phases of cultiva-
tion enables the extraction of important data on the organization of agricul-
tural work as well as comparative observations. For example, comparison
can then be made of the use of a swidden field over time among an Amazo-
nian forest group such as the Parakanã (Tupi-Guarani) and an Upper Xingu
group such as the Kuikuro, who live in an area of transition between forest
and savannah. In Amazonia, different patterns of mobility and distinct con-
ceptions of alimentary diets, based on plants or animals, are associated with
a greater or lesser diversity of species of cultivated plants as well as a
greater or lesser investment in agricultural production, reflected in distinct
kinds of cultural ethos (Heckenberger 1998; Fausto 2001; Hugh-Jones
1995; Descola 1998).
Many more thematic fields could be named here, but they will vary with
the specific cultural or geographical area. SHAMANISM is a key thematic
field for many societies, in particular Amerindian people. Ideally, the lexi-
con would include all the terms designating supernatural beings or entities,
explicating them individually and as a whole, and associating them with
etiology, the classification and denomination of illnesses, cures, rituals,
masks, and so on, as will be further illustrated in Section 3.5 below.
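The requirement stated under KINSHIP above, that each term be listed with the genealogical positions it covers, lends itself to a simple tabular representation. The following Python sketch uses a widely used set of kin-type abbreviations in anthropology (F, M, B, Z, S, D, H, W). The labels TERM_A and TERM_B and the positions assigned to them are placeholders, not Kuikuro data, and the function names are assumptions made for illustration.

```python
from typing import Dict, List

# Single-letter primitives of a common kin-type notation.
KIN_PRIMITIVES: Dict[str, str] = {
    "F": "father", "M": "mother", "B": "brother", "Z": "sister",
    "S": "son", "D": "daughter", "H": "husband", "W": "wife",
}


def expand_kin_type(abbreviation: str) -> str:
    """Spell out a kin-type string, e.g. 'FBS' -> "father's brother's son"."""
    if not abbreviation:
        raise ValueError("empty kin-type abbreviation")
    words = []
    for letter in abbreviation:
        if letter not in KIN_PRIMITIVES:
            raise ValueError(f"unknown kin-type letter: {letter!r}")
        words.append(KIN_PRIMITIVES[letter])
    return "'s ".join(words)


# Placeholder terms with the genealogical positions each one covers.
TERM_POSITIONS: Dict[str, List[str]] = {
    "TERM_A": ["B", "FBS", "FZS", "MBS", "MZS"],  # cover term: 'brothers' and 'cousins'
    "TERM_B": ["F", "FB"],
}


def describe(term: str) -> List[str]:
    """List the positions covered by a term in spelled-out form."""
    return [expand_kin_type(position) for position in TERM_POSITIONS[term]]


def terms_for_position(position: str) -> List[str]:
    """Return, in alphabetical order, every term that covers a given position."""
    return sorted(term for term, positions in TERM_POSITIONS.items()
                  if position in positions)
```

A table of this kind captures only the genealogical side of a terminology. The contextual and momentary variables mentioned above (factional disputes, broken marriages, and so on) would have to be documented separately, for instance in annotations on the sessions in which a term is used.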
2.2. Consulting texts
In a language documentation as conceived of here, "texts" comprise anno-
tated sessions. Usually, these are audio or video recordings of elicited or
spontaneous performances of verbal genres, such as narratives, conversa-
tions, ritual discourses, which have been transcribed, translated, analyzed,
When consulting a corpus of sessions, which are the most relevant
ones from the viewpoint of ethnographers, especially those averse to what
they call the "butterfly collector syndrome"? There is no straightforward
and easy answer to such a question, among other things because of the
large variation that can be observed across different cultural and geographic
areas. But the following suggestions may provide a basic idea of the range
of topics of interest to an ethnographer:
STANDARD TOPICS IN ETHNOGRAPHY for which material can probably
be collected in all cultures are body, conception, pregnancy, soul,
ghosts, birth, female and male reclusion, first menses and menstruation.
RITUAL WAILING and other verbal-musical genres. It should be noted,
however, that recording events such as ritual wailing, as well as other
songs and shamanic cures, may be prohibited. This applies to the Kui-
kuro project, for example.
GREETINGS as a verbal genre with its specific formulas. This is also
what the naive consumer usually wants to see/read/learn.
ONOMASTICS, i.e. the system of attributing and transmitting personal
names. This needs to be documented through censuses, village maps,
TOPONYMS: In the best of circumstances, the documentation would
include a map of the territory with the toponyms in the indigenous lan-
guage, where possible translated and analyzed morphologically and
semantically, noting their associations with mythical and historical
MALE and FEMALE SPEECH in sessions which deal with topics affected
by gender distinctions, such as the division of labor, sexual relations,
jealousy, love affairs, marriage, menstruation, conception theories, etc.
NATIVE METALINGUISTIC DISCOURSE: What do speakers have to say
about their own language and other languages with which they are in
contact? See further Section 3.1.
TURN-TAKING RULES in different kinds of conversation; for example,
those applying to interactions in domestic spaces versus those used in
Data on LANGUAGE ACQUISITION as seen in interactions between chil-
dren of different ages and between children and adults of different ages
possessing different relations to the child.
ourselves, in the form of narratives and other materials on "white peo-
ple" (or other types of outsiders coming to the community for reasons
such as research, trade, or politics). With regard to the interaction with a
documentation team, this could include written and spoken materials
that allow an understanding of the processes involved in translating be-
tween the universe of the foreigners and the indigenous universe. Of
particular interest would be the translation of foreign texts, such as leg-
islative documents and health manuals, into the native language, ena-
bling the analysis of loanwords and their use, or the creation of terms to
designate "new" objects. Furthermore, a documentation should include
sessions containing verbal interactions between native speakers and for-
eigners in other languages than the one(s) being documented, especially
the dominant language (national or regional), reflecting the full range of
This list of topics, though far from comprehensive, is already considerable
and attempting to cover them in full during fieldwork will be both imprac-
tical and unrealistic given the time and resource limits imposed on most
language documentation projects, even more so when dealing with endan-
gered languages, and above all endangered speakers. But an awareness of
these key topics may at least allow the non-anthropological researcher to
identify and collect culturally important data whenever possible.
Apart from topic areas, there are some other considerations in compiling
a documentary corpus which may be of equal if not greater importance.
Thus, it is important to be aware of the fact that mythic narratives are eth-
nographic works in themselves. Special attention should be given to those
narratives that are useful for comparative purposes. In the Amerindian uni-
verse, relevant mythical narratives include those on the genesis or origin of
the world and the different classes of beings, the origin of sex/gender dis-
tinctions, the origin of death (or short life), the origin of "white people",
and the origin of language (the absence of the latter type of myth should be
noted and commented upon, as well as noting where cues to a "native phi-
losophy of language" can be found). Comparative observations on different
styles are also important, such as, for example, the differences between the
short and dense narratives of the Parakanã and the long, rhetorical and for-
mal narratives, filled with repetitions, of the Kuikuro.
Another point to be noted is that the documentation of rituals is prob-
lematic since the more performative the event to be captured, the less
adequate purely linguistic data becomes; on the other hand, the less per-
formative the ritual, the more relevant linguistic data proves. In this regard,
an observation concerning video documentation will be in order. Today,
video is widely used as a means of capturing elements which (whether or
not analyzed by the documenting team) can provide important data for
other researchers. However, one should not overestimate the power of
video. Vision, like any other form of perception, is always partial, and sim-
ple filming lacks a basic element of good ethnography: participant observa-
tion over an extended period, guided by specific training that enables perti-
nent questions to be formulated at any given instance. In addition, visual
documentation to date very often involves amateurish products of dubious
In summary, ethnographers, as other researchers, are interested in sets
of information that enable the formulation of questions and hypotheses as
well as the corroboration of the latter. Productive questions, however, cannot
be simply derived from documentary material without the prior definition
of issues based on actual ethnographic field research, or without compara-
tive aims and objects. Nothing substitutes for ethnographic field research.
For this reason, the ideal scenario is to work in an interdisciplinary team.
Although the Kuikuro Project is nowadays multidisciplinary, this structure
evolved over time. At the beginning of the project, we had already been
working for a number of years in close cooperation with an ethno-archaeol-
ogist conducting research in the Upper Xingu region and, more specifically,
the Kuikuro territory. An anthropologist formally joined the team only in
2002. Although this has complexified and slowed down the documentary
work well beyond initial forecasts, the experience has been and continues
to be extremely productive and positive. Collected and recorded data can
now be more comprehensively contextualized, deepening the knowledge of
the language and the richness of its constructions and meanings. Reflecting
on the relationship between language and culture has become a much less
trivial operation. And, last but not least, the involvement of the Kuikuro
themselves in the documentation process (the fact that they are today much
more subjects-actors-guides than consultants-objects) is due in part to the
interest generated by the "good" questions posed by a good ethnographer.
3. Exploring a language documentation from an ethnographic point of view
In this section, I will discuss a few examples from the Kuikuro documenta-
tion in order to exemplify ways in which an ethnographer may look through
documents (i.e. sessions) in a language documentation, thereby also making
it clear which kinds of resources will be of particular use in this regard.
Repeatedly throughout this section we will have occasion to note that in
digitally stored documentations, the perhaps major resource is a network of
links between sessions and other resources included in the documentation,
i.e. to make full use of the hypertext possibilities inherent in the digital de-
sign. An intelligent network of links between narratives, lexica, images,
and analytic studies will help the user to navigate through a culture's twist-
ing networks of meanings.
As a starting point, I shall look at language identity, trying to understand
what the Kuikuro mean when they say that the word tisakis 'our (exclu-
sive) words/language' may be used as a synonym for tisghtu 'our (ex-
clusive) way of being' or, as they would say today, 'our culture'. This
naturally leads into a discussion of different speaking styles, and we will
take a closer look at one of the more formalized or ritualized speaking
styles, the chief's speech, in Section 3.2. Ritualized speech forms often
abound with references to the past in a number of ways, one of which will
be exemplified in a bit more detail in Section 3.3. Another major character-
istic of ritual speech found in many communities throughout the world is
parallelism in its linguistic and rhetorical structure. Section 3.4 provides a
very brief example. Section 3.5 concludes this exploration by drawing to-
gether the different facets mentioned along the way in exemplifying one
way of resolving typical translation problems in one key thematic field of
Amerindian ethnography, i.e. shamanism.
The current section heavily draws on two resources which the reader
should have at hand during the exploration in order to be able to make full
use of the discussion. On the one hand, Appendix 2 provides an overview
of the structure of the Kuikuro documentation at the time of writing. On the
other hand, the website of this book provides access to the primary data
discussed in this section in the form of audio and video clips.
3.1. Language and identity
Based on my experience with the Upper Xingu people in Brazil, I shall
highlight here one essential point: Language is a diacritical marker of
individual and collective social and political identity (Franchetto 2001).
The Upper Xingu is one of the few multilingual systems without a common
lingua franca still in existence in the South American tropical lowlands.
These systems seem to have been more numerous and complex in the past, that
is, until the disastrous effects of European conquest took their toll (until
the 18th century).
Chapter 8 Ethnography in language documentation 193
The Upper Xingu is home to groups speaking genetically distinct languages,
sharing the same basic cultural traits and interacting within a dense
network of ritual, trade, and matrimonial exchanges (see Map 1). The careful
observance of these linguistic differences is a crucial factor in
maintaining and reproducing the global system. It is unsurprising,
therefore, that the Upper Xingu peoples possess a rich set of metalinguistic
notions. In fact, they enjoy speaking about the music of languages, just as
they like to compare different languages and put a lot of effort into the
work of translation. Not by chance, dictionaries (vocabularies) attract
particular attention. They claim dictionaries rather than grammars are the
best way to learn a new language.
Map 1. Local groups and villages in the Upper Xingu region
The Kuikuro, speakers of a language from the Carib family, contrast
themselves with those peoples speaking languages from the Arawak family as
'those who speak in the throat' versus 'those who speak with the tip of the
teeth'. This is an accurate description of the articulatory characteristics
of the languages under comparison: the preponderance of dorsal and uvular
articulations in the Upper Xingu Carib languages and the preponderance of
dental and pre-palatal articulations in Arawak languages.
But such socially functional linguistic differences also involve the
distinction between dialectal variants of the same language. Kuikuro is one
variant of the Upper Xingu Carib language; the other variant is spoken by
the Nahukw, Kalapalo, and Matipu. The factors differentiating these variants
are not so much lexical and morphological elements, but primarily prosodic
structures or distinct rhythmic patterns.
Kuikuro builds moraic trochees from right to left. The main accent of the
word is generally on the penultimate syllable, but it shifts to the ultimate
syllable of the word when it is related as an argument to a head. This
allows us to identify prosodically phrasal constituents such as the verb
with its internal argument, the postposition with its complement, or, more
generally, the relation between a head and its dependent. High pitch, vowel
and consonant lengthening, and intensity are the parameters that
characterize the main accent. All converge on the same syllable.
Consequently, speakers of the Kuikuro language say their language is spoken
straight, direct, in a line.
The Kalapalo dissociate tonal pitch from intensity (loudness). Tonal pitch
generally occurs on the antepenultimate syllable and intensity on the
penultimate. In addition, the language's phonology, unlike Kuikuro, does not
read syntactic constituents; rather the simple word (and not the phrase) is
the domain of accent. Speakers of these variants are therefore said to speak
in jumps, waves, and curves. In keeping with their culture, the metaphor
preferred by Upper Xingu peoples is musical in kind: tisakis angunda 'our
(exclusive) words are dancing'.
To get a better idea of this basic difference between the two dialect
variants, let us look at and listen to two women in two video segments (as
mentioned above, these are available at the book's website). In video
segment A: KUIKURO [HONEY], the woman speaks Kuikuro. The segment displays
her first utterances when describing a practice lost and forgotten by
younger people: the ritual of harvesting and distributing native honey.
Paying attention to the melody of the speech, you will discover that it
results from the
Video segment B: KALAPALO [TUKUTI] displays a speaker of the Kalapalo
variant where melody and rhythm clearly differ from the preceding segment.
Segment B is from the beginning of a description of the power of tukuti
kueg, the Hyper Humming Bird, whose image or representation the speaker is
holding in her hand. This is the supernatural being who caused her a severe
illness, and her cure meant that her husband became an owner of the Hugag
ritual. We will return to this segment further below.
What I would like to emphasize here is that the linguist's particular skill
lies in documenting and describing the variants of a language: in our sample
case, distinctions in rhythm which represent complementary oppositions at
the socio-political level. Here, metrical phonology combines with native
metalinguistic concepts, providing us with data crucial to the understanding
of a social and cultural system.
In a language documentation, such information should be presented in the
metadata through which a session is accessed. Obviously, users of a
documentation will profit even more if this aspect is also dealt with in an
analytically elaborated form, i.e. in the form of phonological and
comparative studies, which should be cross-linked to the relevant sessions.
3.2. Ways of speaking (genres)
In the previous section we saw the importance of musicality in understanding
the sociolinguistics, and more generally the sociopolitics, of the Upper
Xingu people. Musicality interconnects three domains: (i) the study of
rhythm in phonology; (ii) the speakers' metalinguistic awareness and
categories; and (iii) the identification of speech genres (speeches) since
these are in part identifiable by differences in rhythm and melody, as we
shall see.
In the Upper Xingu, the identifiable verbal speech genres are distributed in
a continuum whose poles are formed at one end by everyday speech, dominated
by the metric pattern proper to the dialect variant (as exemplified above),
and at the other end by songs where the rhythm of the everyday language is
subject to, or transfigured by, another metric pattern, another beat,
another rhythm. Along this continuum between prosaic speech and song we find
genres where a patterned, fixed rhythm transfigures the prosaic musicality
into another style, a psalm-like succession of monotonal lines. This is the
case with the chanted speech characterizing anet itaginhu, the chief's
speech, a verbal performance which marks the apex of the large inter-tribal
rituals in the Upper Xingu. Here, local identities and the global society
are celebrated simultaneously, giving center ground to the history of the
birth of the groups along with their chiefs. These rituals are a celebration
of local and regional history through the memory of the great chiefs and
their descendents (Franchetto 1993, 2000).
Imagine that we are watching and listening to a small segment of the anet
itaginhu performed in the 2002 dry season in the run-up to the egits
(kwaryp) ritual celebrated in the Yawalapiti (Arawak) village. Video segment
C: CHIEF'S SPEECH shows chief Tahukula welcoming three Yawalapiti messengers
who have just arrived in the village to invite the Kuikuro to the festival.
At the beginning of the clip, the chief is still inside his house with his
brother, waiting for the moment to come out into the middle of the village
and walk to the front of the men's house, where the three messengers to be
officially greeted are waiting, sitting in the sun.
The chanted speech style anet itaginhu is made up of a sequence of
discourses. The chief summons other Kuikuro chiefs in order for one or more
of them to accept the task of leading the Kuikuro to the Yawalapiti village.
Finally, Tahukula crouches down with the chief (or chiefs) who accepted
being the leader of the Kuikuro for this event in front of the grave of the
dead chiefs situated in front of the men's house, and in front of the
messengers, thereby confirming acceptance of the invitation. Afterwards,
chief and messengers recite another part of the chief's speech in unison.
Everything is said by anet itaginhu.
The transcription and translation of a small section of the chief's speech
is reproduced below. The anet itaginhu comprises a sequence of altogether
six main speeches. There is the speech to celebrate the arrival of the
messengers from another village, the speech to make the messengers sit on
the stools placed in front of the men's house located in the middle of the
village, and so forth. Each speech marks a particular phase of the ritual
for welcoming those who come from outside. The sixth speech is the apical
discourse, the one in which the great chiefs of the past, the founders of
the Kuikuro in this case, parade in sequence, each one the central persona
of a unit of the speech, here called a block, which is composed of various
lines or verses. The parallelistic structure characterizes both the verses
making up a block and the relationship between various blocks. In the apical
speech, the effect produced is like wandering through a portrait gallery
filled with the great ancestors, whose sequence consubstantiates the
existence of the current chiefs and the Kuikuro as a whole. The following is
the block relating to Amatuag, one of the founder chiefs:
(1) Transcript of video segment C: CHIEF'S SPEECH
etsuhehetseli etsuhehetsegakengingoku
(ancient words) messenger
it's a mistake for you to come here, messenger
ahtha kukuge thigmbkila ngingoku
aht-ha kukuge t- hi -g -mbkila ngingoku
NEG-AFF our/people RFL-grand/son-REL-PASTNEG messenger
our people have no more descendents, messenger
there are no more descendents of Amatuag, messenger
angolo atai hle wke
true when ADV distant/past
and that, by contrast, was the time of the true ones (chiefs)
ngele higmbg kaenga atsakuhotag ngingoku
ngele hi -g -mbg kae-nga atsaku-ho-tag ngingoku
AN grand/son-REL-PAST LOC-ALL run -HYP-CONT messenger
you should run towards the descendents of this one (Amatuag), messenger
isagingo geleha atsakugake ngingoku
is-agingo gele-ha atsaku-gake ngingoku
3-same yet-AFF run -IMP messenger
just like him still, run, messenger!
nago imala geleha atsakugake ngingoku
nago ima-l -a gele-ha atsaku-gake ngingoku
AN trail-REL-like yet -AFF run -IMP messenger
as if it were along their path, run, messenger!
aneto imala geleha atsakugake ngingoku
aneto ima-l -a gele-ha atsaku-gake ngingoku
chiefs trail-REL-like yet -AFF run -IMP messenger
as if it were still along the path of the chiefs, run, messenger!
We are dealing here not with 'words' (aki) but with 'speech' or 'talk'
(itaginhu). Despite the appearance of being a monological genre, for the
Kuikuro the chief's speech in its highly ritualized performance is a
conversation. The speech is conceived of as an interaction or a dialogue or,
more than this, a conversation with a polyphony of voices. The chiefs
squatted in front of the messengers very often perform the formulas of anet
itaginhu simultaneously; on other occasions, messengers and host chiefs
speak at the same time, each grouping in their own language.
There is no space here to explore the ethnographic significance of the anet
itaginhu in detail. It will suffice to note that it contains a condensed set
of meanings, values, and attitudes which help illuminate politics, chiefdom,
and social morphology. The chief links the past to the present, representing
and maintaining the unity of his local group in relation to other groups,
thereby allowing his own to open up to others. Although being a chief is a
condition transmitted by bloodlines of inheritance, this condition has to be
continually constructed by the full exercise of chiefdom, by certain
qualities and, last but not least, by knowing the anet itaginhu speeches and
how
It can also be observed that, apart from almost untranslatable terms and
dense metaphors typical for this genre, the segment contains, as in anet
itaginhu as a whole, the self-derogatory posture typical for interactions
between affines and indicative of hierarchical relations characterized by a
specific verbal and behavioral etiquette.
Note also the occurrence of the particle wke, which ends the line dividing
this unit of discourse into two parts. The other lines end with the word
ngingoku 'messenger', a term from the special lexicon of the chief's speech.
Wke means 'past', the truth value and authority of a speech above all
suspicion, as a metaphor of a collectivity and the chiefdom. More precisely,
wke is a marker of an epistemic modality, as further discussed in
3.3. History
The previous section examined some of the general information necessary to
begin to understand the chief's speech, which should be incorporated in the
metadata directly linked to the session containing the full recording of the
event. In a more complete annotation, one would have to investigate the
social and political meaning of the chief's speech within the intertribal
system, its function within a specific ritual, the speech genre which
characterizes it, the status and roles of actors and their interaction.
Additionally, it will be useful to include a careful network of links to
other sessions of different genres, the lexicon and non-linguistic
components such as images, iconography, genealogies, studies, and so on in
order to allow for a full exploration of its functions and meanings. For
example, in the case of the chief's speech, links with sessions containing
historical and personal narratives, providing access to collective and
individual memories, are crucial,
The historical oral tradition contains narratives where an elder called
Hopese tells how his grandfather swapped his name with the German
ethnographer Karl von den Steinen, called Kalusi by the Kuikuro, at the end
of the 19th century (Steinen 1940, 1942). These narratives take us back to
the time when the Kuikuro group was formed and the differentiation of the
dialectal variants began. It was the time of the founder chief Amatuag, a
persona from the chief's speech mentioned in the segment transcribed above.
The following is a transcription and translation of the beginning of this
story, from the session Kalusi (see audio segment KALUSI). This segment also
shows how linguistic and cultural commentary can be linked directly to the
line it is most relevant to, with further links included in some of these
commentaries.
(2) Transcript of audio segment Kalusi
\trs ising wke ingila Intag Intag il ande Intag
\te A long time ago he came from Intag, Intag is
over there,
\ntl Observe the second position particle wke, which
means distant past combined with the epistemic
value of true statement from collective memory
and the authority of someone (the speaker) who
received the story through the line of his parents
and grand-parents; his grand-father was the one who
saw Kalusi/Steinen (faithfully reported first hand
\ntc Intag name of an old Nahukw village. Kalusi
came through the Nahukw villages situated along
the Curisevo river until the beginning of the 20th
century. The names of the villages mentioned in
this session correspond to those villages existing
at the time of Steinen. SEE MAP STEINEN.
\trs Kuhikugu imnhige Lahatua imnhige
\te in the direction of Kuhikugu, in the direction of
\ntc Kuhikugu the first Kuikuro village founded after
a number of families departed from the Oti villages
complex, thought to be inhabited by the Uagiht
Kuikuro village already settled in the twentieth
century and inhabited up to the 1950s. SEE MAPS
\trs isithg Kalusi etshgha
\te it was him who arrived, it was Kalusi who arrived
\ntc Kalusi is now introduced as the main protagonist of
the story. The name Kalusi derives from Karl, in
Portuguese Carlos, adapted to the phonological
structure of the upper Xingu Carib languages, where
there are no consonantal clusters, and to their
syllabic structure (CV).
\trs Maginatu hekeha ingithg
\te Maginatu brought him
\trs Maginatu akatsange ingitinhi wke Kuhikugunaha
\te a long time ago Maginatu brought him to Kuhikugu,
to Kuhikugu
\trs Maginatu Tugumai ekisei Maginatui
\te Maginatu, he was a Trumai, Maginatu was
\ntc In the Hopese version of the encounter, von den
Steinen was brought to the village of Kuhikugu by a
Trumai Indian called Maginatu.
The particle wke, an epistemic modality marker we already encountered in the
chief's speech, is present here in the first and penultimate lines (and
underlined). The description of so-called epistemic modalities is a theme of
great interest to ethnographers. Epistemic modality markers convey
information about the relationship between the speaker and his or her
statements and the interlocutors. They include evidentials, hear-say
particles, and other modalizers of a statement's truth value. An extensive
literature exists on this topic. Hence, it is important to comment in the
session annotations on the presence and meanings of these elements in the
language.
Many of these particles can be found in Hopese's narrative, centered on the
figure of Karl von den Steinen. The most interesting ones are those that
indicate the speaker's attitude in relation to the contents of his
recollections, thus indexing a sub-genre of narratives which we can define
as historical. These marks distinguish historical narratives from the
narratives we call mythic, which tell of the origins of cultural goods and
are located at the beginning of time, when humans or non-humans, or
quasi-humans, lived and communicated with each other. The epistemic markers
help us in the work of distinguishing narrative registers such as the
historical and mythical, and kinds of memory, important topics in today's
ethnological debates.
Along with deictics and particles carrying an aspectual value, these small
words are called tisakis enkgutoho by the Kuikuro, a beautiful metaphor
meaning roughly 'made for our words to beach safely'. They are predicative
anchors, actualizing the statement and closing its living meaning.
Links between components, sub-components and their contents help to deepen
the ethnographic information. Continuing with our example, historical
narratives can be connected to historical and archaeological studies. In the
Kuikuro case, one of the results of the research work undertaken by Michael
Heckenberger, the ethno-archaeologist collaborating with the Kuikuro
project, is the reconstruction of the pre-historic villages, i.e. those
preceding the first historical record written by Karl von den Steinen
(Heckenberger 2005). We were thus able to reconstruct the village of
Kuhikugu, the first settlement built by the Kuikuro group, still in
existence at the time of von den Steinen, when the ethnographer met the
grandfather of the elder Hopese (see Map 2). The pre-historical villages
were much larger and more complex than contemporary ones, linked in a more
impressive way than today to a network of primary and satellite villages,
connected by large 50-metre-wide pathways. Above all, each archaeological
site is associated with historical and mythic narratives that allow a
geo-historical map to be interconnected with a cosmological map.
Consequently, native oral history, ritual performances of verbal art forms,
the history written by outsiders and
archaeological research are combined to delineate a history in which in-
3.4. Parallelism
Like any oral performance, the narrative about Karl von den Steinen told by
Hopese contains many repetitions. The high incidence of repetitions
especially applies to a culture based on primary orality. Rather than being
mere repetitions, these are usually parallelistic constructions. This
parallelism (lexical and grammatical in kind) is a defining characteristic
of verbal art genres, although, as I mentioned, it can already be found in
sketchy and elementary form in prosaic and informal discourse. We can see
the interweaving between parallelism and versification in the chief's
speech, at the extreme end of chanted speech.
Map 2. Map of prehistoric site of Kuhikugu showing locations of the Kuikuro
village occupied over the past 150 years (denoted by closed circles). Black
dots represent collection units. (From Heckenberger 1998: 638)
In traditional narrative, the ability to construct micro and macro
parallelisms defines the skill of a recognized akinh oto 'owner/master of
narrative'. The resources provided by the grammar are the object of a
conscious manipulation deployed to produce beautiful speech (att itaginhu).
In Kuikuro, for example, the play of alternations between transitivity and
intransitivity (or causativity and anti-causativity) is marvelously
exploited by experienced narrators (Franchetto 2003). Let us examine just
one example here, taken from the (historical) narrative on the origin of the
Kuikuro people.
(3) Segment from Session Kukopogipg: IV, d, 142-143
[tsiu] otohinhakeng leha
[tsiu] ot- ohinhake -ng leha
[id] 3/DETR- maniocswidden/cut-PNCT CMPL
tutuhi itu ohinhakeng leha iheke [tsiu]
tu- tuhi itu ohinhake -ng leha i- heke [tsiu]
RFL-manioc manioc swidden/cut-PNCT CMPL 3-ERG [id]
[tsiu] he cleared the manioc swidden place
he cleared the place for his own manioc swidden field [tsiu]
In this example, the scene of the chief clearing the first swidden field for
manioc in the new village is seen through the concomitant and complementary
perspectives of an intransitive and a transitive action (compare lines 1
and 2).
Exploring the Kuikuro metalanguage again, we discover that there is a term
designating synonymy and, obviously, the parallelistic relationship between
expressions such as the one in our example. This is the term otohongo which
means 'the same other' or 'the other same', a term used in many other
(non-linguistic) domains as well, such as the differentiation of species,
kinship relations (siblings) and local groups.
As we emphasized above, distinct semantic and thematic domains can be
interrelated, in this instance on the basis of formal traits pertaining to
different genres of verbal art. Documentation of the latter is in fact
particularly relevant in contemporary ethnology as part of an ongoing
discussion about ethnopoetics and the problems of translation.
3.5. Thematic fields and untranslatable terms
Ethnographers can search the Kuikuro database for key words linked to texts,
lexical entries and other components relating to thematic fields, which were
briefly discussed in Section 2.2 above. As well as enabling the
understanding of a specific culture, the topics coded in thematic fields are
especially important for comparison. The ethnography thus produced
contributes to anthropological theory, essentially a comparative science.
One of the key themes in the Kuikuro documentation is shamanism. This topic
connects cosmology, rituals, social morphology, conceptions of sickness,
death of the body, and incorporeal principles (different kinds of soul,
shadow, breath, the invisible arrows of the witch, etc.), curing conceptions
and practices, politics, and prestige. The shaman has been defined as a
translator, mediating human and non-human worlds, a master of
transformations (see, for example, Carneiro da Cunha 1999, which provides
further references). Here we briefly illustrate how this very complex topic
can be approached via links between relevant sections of the documentation.
In a session already mentioned above, Tapualu, a Kalapalo woman, holds in
her hand the representation of the powerful and feared tukuti kueg, the
Hyper Being (spirit-animal) of the Humming Bird. She explains who or what
tukuti kueg is. She is showing the cause of her illness: the Hyper Being is
associated with the pequi tree (Caryocar brasiliense) and the pequi origin
myth. Tukuti kueg encountered her while she was collecting pequi fruits and
the Being struck her, causing her to feel terrible pains. Returning home,
she spent weeks in her hammock in delirium, dreaming and shouting. In a
related session, Samuag, Tapualu's husband, recounts what happened,
recollecting the myth and offering explanations (see video segment D:
ITSEKE-TUKUTI1). Then, the shamans rush to diagnose the cause of the illness
and cure the victim. In another session, one of the Kuikuro shamans talks
about the woman's illness and the process of diagnosis and cure (video
segment E: ITSEKE-TUKUTI2). Tukuti kueg was tamed through the hugag ritual
which the woman's husband then owned for a number of years (video segment F:
HUGAG).
All rituals, or better, all ritual complexes, connect worlds, but they are
also the core motors of productive cycles, the circulation of goods, the
system of exchanges, and the maintenance of the local supra-domestic unit,
the village. Rituals engender social roles, actualize relationships of
kinship and alliance, and confer prestige. A ritual is a festival, dance,
and song; it is beauty, it restores well-being, it is joy and health. Ritual
is transformation.
Every session linked to the key word 'shamanism' will obviously allow
certain lexical entries to be built up more carefully, such as kueg (roughly
translated here as 'hyper'), an operator categorizing every supernatural
entity, or itseke. Every kueg being is itseke (translatable in a highly
equivocal fashion as 'spirit'). These are terms whose meaning cannot really
be grasped without referring to the entire cosmological and shamanistic
complex. How to attribute glosses, translations, and definitions to these
almost untranslatable terms? The shortcut translation or glossing of core
cultural categories is at once an unavoidable task and a frustrating one
(Franchetto 2002). Our attempt in the Kuikuro lexicon is far from
satisfactory, even though we have strived to include native definitions
wherever possible. Consider the entry for itseke:
\lx itseke
\entyp root
\lc itseke
\ph [itsEkE]
\ps N
\ge hyper-being
\xkk tinegetinhha ugei itsekeinha
\te I am afraid of the hyper-beings
\xkk itseke ingilha kupehe kukapngu igakaho
\te we see the hyper-beings before we die
\xkk kagamuke kagineng itseke heke
\te the hyper-being frightened the child
\defkk itseke ekisei kukengeni, kugehngha ekisei, inhalha
ingili; itseke kukilha ngiko heke kukengeni heke;
itseke ekisei kukotombani kukgnuhata.
\defe Itseke is that which eats us, it is not a person, it
cannot be seen; we say that itseke is something which
eats us; itseke is that which hurts (otomba-) us with
invisible arrows when we are sick. Itseke is a super-
natural being, a spirit, a 'beast;' it dwells in the
forests, rivers and lakes; it causes illness and
death; only shamans and the sick can see them.
\cf kueg, otomba
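The backslash-coded layout of this entry lends itself to simple machine processing. What follows is only a minimal sketch in Python, not the project's actual tooling: it assumes that every field starts with a backslash marker at the beginning of a line and that unmarked lines continue the preceding field. The function names parse_record and pair_examples are hypothetical, and the sample record is an abridged copy of the entry above.

```python
def parse_record(text):
    """Split one backslash-coded record into (marker, value) pairs.

    Assumption: a field starts with a backslash marker; any line without a
    marker is a continuation of the previous field.
    """
    fields = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            fields.append([marker, value.strip()])
        elif fields:
            # Wrapped line: append it to the field opened most recently.
            fields[-1][1] += " " + line
    return [(marker, value) for marker, value in fields]


def pair_examples(fields):
    """Pair each xkk example with the te translation that follows it."""
    pairs = []
    pending = None
    for marker, value in fields:
        if marker == "xkk":
            pending = value
        elif marker == "te" and pending is not None:
            pairs.append((pending, value))
            pending = None
    return pairs


# Abridged copy of the itseke entry; a raw string keeps the backslashes literal.
ENTRY = r"""
\lx itseke
\ps N
\ge hyper-being
\xkk kagamuke kagineng itseke heke
\te the hyper-being frightened the child
\defe Itseke is that which eats us, it is not a person, it
    cannot be seen
\cf kueg, otomba
"""

fields = parse_record(ENTRY)
examples = pair_examples(fields)
```

Keeping the fields as an ordered list rather than a dictionary preserves repeated markers such as \xkk and \te, which a lexical entry of this kind uses once per example.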
To give an appropriate meaning explication for words such as itseke or
akunga 'souls' is obviously a very demanding task. But for these words it is
at least possible and useful to assume that speakers share a single concept
which can be approached by combining different metalinguistic explications
with a large number of textual occurrences. However, there are other
cultural categories, extremely salient and apparently empty, where even this
assumption does not hold and thus any single, unifying gloss or definition
is misleading on a very basic level. This is the case with the notion of
kugihe, which in first approximation we may gloss with 'witchcraft
(substance)'. This term lies at the center of beliefs concerning causality,
illness, death, curing, and individual capacities. People cannot say what
kugihe is, but they can talk about the effect kugihe has and the social
relations that surround kugihe. Its exact meaning seems to remain ineffable
to speakers.
It would thus be a mistake to think that all categories are represented with
a definition and that definitions are shared within the speech community;
this is the case with many non-observational categories. As Boyer (1990: 37)
says: "A vocabulary of a natural language is not a uniform landscape." Not
everything comprises a signifier with its conceptual counterpart and terms
such as kugihe are not common shared categories. These terms should be
especially marked when occurring in texts or in the lexical database (see
also Chapter 6). If the lexicon forms a functional part of the text
interlinearization, as in Shoebox, the use of an oversimplified and,
strictly speaking, wrong gloss is unavoidable. Description, native
definitions, comments as well as links defining a network of explanatory,
narrative and performative pieces can, albeit partially, make up for the
ethnographic poverty of our documentation tools.
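One way to picture such a network of links is as a small graph whose nodes are sessions, lexical entries, and key words. The sketch below is purely illustrative and makes no claim about how the Kuikuro documentation is actually stored: the class LinkNetwork, the identifier prefixes (keyword:, session:, lexicon:), and the particular links between the sessions and the entries are assumptions introduced for the example; only the session labels and the terms itseke and kueg come from the discussion above.

```python
from collections import defaultdict, deque


class LinkNetwork:
    """Undirected links between documentation resources (hypothetical model)."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        """Record a link that can be followed in both directions."""
        self._links[a].add(b)
        self._links[b].add(a)

    def reachable(self, start, max_steps):
        """Return {resource: steps} for everything within max_steps links."""
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distance[node] == max_steps:
                continue  # do not follow links beyond the requested depth
            for neighbour in sorted(self._links.get(node, ())):
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        del distance[start]
        return distance


network = LinkNetwork()
# Sessions tagged with the key word (session labels taken from the text).
for session in ("ITSEKE-TUKUTI1", "ITSEKE-TUKUTI2", "HUGAG"):
    network.link("keyword:shamanism", "session:" + session)
# Assumed links from a session to a lexical entry and between entries.
network.link("session:ITSEKE-TUKUTI1", "lexicon:itseke")
network.link("lexicon:itseke", "lexicon:kueg")

one_step = network.reachable("keyword:shamanism", 1)
two_steps = network.reachable("keyword:shamanism", 2)
```

Starting from the key word, one step yields the tagged sessions and a second step reaches the lexical entry linked to one of them, which is the kind of path a user might follow from a thematic field to a term that resists a simple gloss.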
4. Conclusion
The purpose of this chapter was not to answer the question of how to
annotate ethnographical information in a language documentation in technical
terms. This would be an impossible task, not only for practical reasons but
also because of the ever shifting and evolving research interests in the
field of anthropology. Instead, I have attempted to give an idea of what an
ethnographer might look for in a language documentation and how she or he
would make use of it. I suggested that, where relevant and necessary,
metadata attached to sessions could provide more detailed and sensitive
ethnographical information, i.e. contain a kind of compacted, theme-specific
ethnography. Obviously, the inclusion of an ethnographic sketch in a
language documentation will also be of major assistance in accessing the
documentation from an anthropological point of view. While a well worked-out
sketch and even session-specific mini ethnographies may well be beyond the
expertise of researchers who lack training in anthropology, a systematic
collection of amateur observations will still be of some use, in particular
if it includes pointers to possibly relevant sessions as well as a frank
assessment of the quality of the translation of, and commentary on,
mythological and other ethnographically relevant material.
But even where the expertise for including a full-fledged ethnography is
available, I think that the digital format provides for perhaps an even
better way to deal with the complex data needed for anthropological
research. This involves designing digital architectures with multiple and
multidirectional links between different sessions and qualitatively
different kinds of information such as lexica, analytical papers, photos,
and so on. We can thus design paths which intelligent users can follow in
order to construct their own possible ethnographies or their own possible
narratives on the ways of being and thinking of the people whose language,
words, and talk are crystallized in the documentation.
I thank Eduardo B. Viveiros de Castro and Carlos Fausto for their
suggestions.
1. Mangaba is the fruit of a plant (Hancornia speciosa) typically found in
savannah regions. The resin extracted from it went into making a small ball
used in an intra- and inter-tribal ritual game in the Upper Xingu.
2. In this chapter, all the words and utterances in the Upper Xingu Carib
language (in both variants, Kuikuro and Kalapalo) are transcribed in an
orthography developed as a result of literacy programs. The communities
decided on an orthography which is not strictly phonemic in that it
represents some sub-phonemic units as well. The conventions for
correspondences between phonemes/phones and graphemes which do not have
their IPA values are the following: /i/
<>, uvular tap <g>, /N/ <ng>, // <nh>, /<ts>, /
3. There is a multitude of chiefly roles in an Upper Xingu village: the
'owner of the village', the 'owner of the plaza', the 'owner of the house',
the 'owner of the main trail', the 'owner of the trail to the water'. Each
one is considered anet (chief) by inheritance and the label of his status
defines some kind of dominance or control, not just symbolic, of one of the
elements of the village's social and ritual spaces. Thus, the 'owner of the
middle' is the person who controls the center of the village, a male public
and ritual space par excellence; the 'owner of the main path' controls the
arrival and departure of the messengers who come to invite others to the
inter-tribal festivals that take place periodically in the Upper Xingu
villages; the 'owner of the house' represents a domestic group, normally an
active male adult with children and sometimes grandchildren. 'Owner' or
'master' is a rough translation of the term oto, whose
4. The abbreviations for interlinear glosses are the following: ADV =
adversative; AFF = affirmative; ALL = allative; AN = anaphoric; CONT =
continuative (aspect); HYP = hypothetic (mood); IMP = imperative (mood);
LOC = locative; PASTNEG = negative past; REL = relational; RFL = reflexive.
5. This is a direct extract from our Shoebox file where each line is
preceded by a code: \trs = orthographic transcription; \te = English
translation; \ntl = linguistic notes; \ntc = cultural notes.
6. Map not included here.
7. Maps not included here.
8. Compare, for example, Chafe and Nichols 1986; Basso 1987, 1988, 1995;
Silverstein 1993. See also Franchetto 2005.
9. On parallelism see, among other authors: Jakobson 1960, 1966, 1968, 1973;
Lord 1985; Zumthor 1983; Tedlock 1983; Fox 1998; Finnegan 1992; Hymes
1992; Sherzer 1990; Urban 1991; Monod-Becquelin 1987.
10. The abbreviations for interlinear glosses are the following: CMPL
completive (aspectual particle); DETR detransitivizer; ERG ergative; id
ideophone; PNCT punctual (aspect); RFL reflexive.
11. The example is taken directly from the Kuikuro lexical database in
Shoebox where the following line codes are used: \lx lexeme (main entry);
\entyp entry type; \lc citation form; \ph phonetic transcription; \ps
part-of-speech; \ge English gloss; \xkk example in Kuikuro; \te English
translation of the example; \defkk original definition in Kuikuro; \defe
English translation of the Kuikuro definition; \cf cross-references.
Chapter 8 Ethnography in language documentation 209
Appendix 1: Kuikuro terms for consanguineal basic kin types
The tables below, extracted from the ethnographical component of the Kuikuro
documentation, show the multiplicity of denotata of each term (Tables 1 and 2).
Table 1. Kuikuro consanguineal kin terms (male ego)
Term            Denotata            English gloss
ngaupg          FF, MF              grandfather
ngits           MM, FM              grandmother
u               F, FB, FFB          father
ama, ata, isi   M, MZ               mother
ijogu           MB                  maternal uncle
etsi, ipg       FZ                  paternal aunt
hisug           B, FBS, MZS         brother
hinhano         eB, FBeS, MZeS      older brother
his             yB, FByS, MZyS      younger brother
ingdzu          Z, FBD, MZD         sister
h               MBCh, FZCh          cousin
mugu            S, BS               son
indis           D, BD               daughter
hatu            ZS                  nephew
hati            ZD                  niece
hig             SS, SD, DS, DD      grandson/-daughter
* The tables make use of the commonly used abbreviations for kin relations:
F = father, FF = father's father, M = mother, Z = sister, B = brother,
S = son, D = daughter, Ch = child, e = elder, y = younger, etc.
Table 2. Kuikuro consanguineal kin terms (female ego)
Term            Denotata            English gloss
ngaupg          FF, MF              grandfather
ngits           MM, FM              grandmother
u               F, FB, FFB          father
ama, ata, isi   M, MZ, MMZ          mother
sogu            MB                  maternal uncle
etsi, ipg       FZ                  paternal aunt
hisug           B, FBS, MZS         brother
has             eZ, FBeD, MZeD      older sister
ikene           yZ, FByD, MZyD      younger sister
his             B, FBS, MZS         brother
h               MBCh, FZCh          cousin
mukugu          S, ZS               son
indis           D, ZD               daughter
hatu            BS                  nephew
hati            BD                  niece
hig             SS, SD, DS, DD      grandson/-daughter
The lexical entry of a kinship term in the lexical database should minimally
be associated with the specification of the denotata and the sex of the
speaker, as in the following examples:
\lx u \gle father \den F, FB, FFB
\lx ingdzu \gle sister \den Z, FBD, MZD
<m.s. (= man speaking)>
\lx has \gle older sister \den eZ, FBeD, MZeD
<w.s. (= woman speaking)>
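The minimal specification just described can be made concrete with a short
sketch. The Python below is only an illustration and not part of the Kuikuro
documentation tooling: it assumes that each entry sits on a single line in
the form shown above (markers \lx, \gle, \den), and the function name
parse_kin_entry and its speaker_sex argument are hypothetical.

```python
import re

# One field = backslash marker, whitespace, then text up to the next marker.
FIELD = re.compile(r"\\(\w+)\s+(.*?)(?=\s+\\\w+\s|$)")

def parse_kin_entry(line, speaker_sex=None):
    """Split a one-line kinship entry into its fields (hypothetical helper)."""
    entry = {marker: value.strip() for marker, value in FIELD.findall(line)}
    # The denotata field is a comma-separated list of kin types.
    entry["den"] = [k.strip() for k in entry.get("den", "").split(",") if k.strip()]
    entry["speaker_sex"] = speaker_sex  # e.g. "m.s." or "w.s."
    return entry

entry = parse_kin_entry(r"\lx has \gle older sister \den eZ, FBeD, MZeD", "w.s.")
print(entry["lx"], entry["gle"], entry["den"], entry["speaker_sex"])
```

Keeping the denotata as a list rather than a string makes it straightforward
to ask, for instance, which terms a woman uses for FBeD.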
Appendix 2
Chapter 9
Linguistic annotation
Eva Schultze-Berndt
This chapter is concerned with linguistic annotation of a documented com-
municative event, i.e. the annotation of its linguistic aspects which is, at
the same time, the type of annotation that is likely to be produced by, and
to be of interest to, linguists. Following Bird and Liberman (2001), the term
annotation will be used here as a cover term for all types of information
(including transcriptions) that can be related to the recording of a commu-
nicative event, or that may represent aspects of a communicative event for
which no recording exists. Apart from linguistic annotation, there is also a
type of annotation relating to the cultural norms and practices of the speech
community that form the background of a given communicative event. This
type of annotation is discussed in Chapter 8. It goes without saying that
especially in the area of semantics and translation (see Section 3), linguistic
and ethnographic commentary overlap.
Linguistic annotation can also be distinguished from metadata or
header information comprising information about the language used,
the time and place of the recordings, the participants including the docu-
menter, access rights, and so on. Metadata are further discussed in Chapters
1, 4 and 13, and will not be considered any further here.
1. Basic assumptions
Let us first consider the significance of linguistic annotation for the enter-
prise of language documentation, in the sense in which this expression is
being used throughout this volume. It should be obvious that what is docu-
mented is not a language but a selection of communicative events, where
the communicating parties consider themselves as sharing a code or language.
For simplicity's sake and without an implication of homogeneity,
people sharing a language will henceforth be referred to as a speech
community. The main motives for selecting certain communicative events for
documentation include:
a) their accessibility to the documenter(s), which is of course the
condition for documentation,
b) their representativeness of communicative events conducted in the
speech community, i.e. communicative events that are likely to take
place even in the absence of any person documenting them, referred to as
"observed communicative events" by Himmelmann (1998),
c) their representativeness of the structural possibilities of the language in
question; this is the reason for including what Himmelmann (1998)
terms "staged communicative events" and elicited utterances, elicited
precisely for the purpose of elucidating some aspect of the structure of
the language.
It is self-evident that the task of documenting a communicative event does
not stop at simply recording it (by producing, e.g., an audio- or video-
recording). Especially in the case of languages only spoken by a small
group of people, such a recording would not be interpretable by the
majority of people with a potential interest in the language, e.g. linguists,
anthropologists, historians, or the general public. In the case of endangered
languages, the recording would possibly not even be interpretable to the
descendants of the speakers themselves. Therefore, a recording has to be
accompanied by further information, in a format that is accessible to a
wider, possibly non-specialist, audience.
For simplicity's sake, I have assumed above, and will assume in most of
what follows, that the communicative event in question was spoken rather
than written, that it has been captured in audio or video format, and that the
annotation can indeed be related to segments of that recording. The seg-
mentation of the recording session into smaller units such as turns, sen-
tences, clauses, or intonation units as minimal units for the purposes of
annotation, is presupposed here. Segmentation, by no means a trivial issue,
is discussed in detail in Chapter 10. It is recommended practice that the
basis for segmentation is made explicit in the documentation and that, in
the written transcription, intermediate units (often referred to as intonation
units) are each represented on a separate line.
It is important to remember that even an audio- or video-recording is
just a representation of the original communicative event, albeit an iconic
(or analog) representation that preserves a great deal, but by no means all,
aspects of the original communicative situation (cf. Duranti 1997: 114;
Lehmann 2004b: 182, 205). Even a video recording preserves only auditory
and visual information (restricted by the camera angle), but not, for exam-
ple, the smell or temperature to which the original participants were sub-
jected. Nevertheless, within the context of a language documentation, such
audio- or video-recordings can be regarded as the primary data which form
the basis for further annotation. Representations, e.g., of a pitch contour or
amplitude, as produced by acoustic analysis, can be considered as further,
derived iconic representations; these are not part of linguistic annotation
proper since they can be derived at any time if the original recording is pre-
served, and will not be discussed any further in this chapter.
This chapter deals with three main levels of linguistic annotation. The first
level, discussed under the heading of transcription (Section 2), comprises
various types of symbolic representations of the formal or significans side
of the linguistic expressions used in a communicative event (cf. Lehmann
2004: 205-206). The second level, termed translation here (Section 3),
comprises any type of annotation that attempts to capture, in terms of one
or more metalanguages, the significatum side (i.e. the meaning and func-
tion) of the communicative event. The third level, dealt with in Section 4
(grammatical annotation), comprises all annotation related to structural
aspects of complex signs. In two further sections, I consider two further
types of annotation that can in principle relate to any of the three levels
mentioned above. The first of these can be termed the level of meta-com-
mentaries, i.e. commentaries on aspects of the annotation, for example, on
its reliability (Section 5). The second type is cross-referencing (Section 6),
i.e. the linking of representations of different communicative events.
It is by no means a trivial task to derive annotations comprising the
different types of representations just mentioned from the raw data, as has
already been pointed out, for the case of transcriptions at least, by Ochs
(1979) in her seminal article on "Transcription as Theory". On the one
hand, the representations reduce the information present in the recording,
e.g. in the case of a written representation of a speech event. On the other
hand, representations also enrich it, in that they incorporate an analysis of
different aspects of the code underlying the communicative event, e.g. a
phonological analysis (in the case of a phonological transcription), a
semantic analysis (however preliminary) in the case of glossing and
translation, and a grammatical analysis in the case of grammatical annotation. The
interaction of annotation as one aspect of documentation and linguistic
description and analysis will therefore be a constant theme throughout this
chapter (see also Chapters 1 and 12).
For each of the main types of annotation and their subtypes, I will pro-
vide some evaluation of their potential usefulness (for different users) in
language documentation, as well as pointing out existing and possibly
competing conventions. Illustrations will come partly from my own
annotated corpora of Jaminjung and Ngaliwurru, two closely related varieties
belonging to the Northern Australian Mirndi family, one of the Non-Pama-
Nyungan language groups. I will however refrain from recommending a
single annotation format, since the aims and means of each documentation
project will be different. Generally, in line with the scope of this volume,
only those annotation formats are considered here which appear to be suit-
able in the actual context of the documentation of a lesser-used language.
The issues arising in this context are clearly different from annotation is-
sues for corpora of major languages, intended e.g. for research on speech
recognition, speech synthesis, or discourse analysis, such as those distrib-
uted by the Linguistic Data Consortium (LDC). Only spoken language is
considered here; issues of the transcription of sign languages are beyond
the scope of this chapter and beyond my expertise, although much of what
is said below about translation and grammatical annotation and other types
of commentary will be equally applicable to the documentation of signed
languages.
In language documentation, each project will have to find a balance
between completeness of annotation on the one hand, and on the other
hand, the time and effort involved in producing annotations, which is easily
underestimated. The estimates for the time needed for the full annotation of
one minute of recording vary between 1 hour and 150 hours. The differ-
ences in these estimates are essentially due to the level of detail and scrutiny
that is applied to the annotation. The estimates at the higher end typically
come from phoneticians who have in mind a very detailed, segment-by-
segment annotation which requires listening to the recording over and over
again. It will be useful to keep these figures in mind when deciding on the
basics of the annotation scheme to be used within a documentation project.
Taking only the level of transcription as an example, there is not much use
in providing a large amount of very sloppy and superficial transcripts where
lots of segments are missing or wrongly transcribed and which, even with
the recording at hand, are difficult to interpret. On the other hand, the more
features one includes in a transcript, the longer the transcription process
takes, the more mistakes can be made and the less of the recorded materials
gets transcribed. Practicability will thus be a recurring theme in the discus-
sion of different types of annotation below. It is assumed that certain types
of annotation (e.g. a detailed prosodic annotation or grammatical annotation)
will usually only be undertaken by someone with a certain analytical goal
in mind. For this reason, the recommendations given here differ from any
recommendations made with maximal explicitness and consistency in mind,
such as those outlined by Lieb and Drude (2000).
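The trade-off between completeness and effort can be made tangible with a
little arithmetic. The following Python sketch is an illustration only: the
two rates (1 and 150 hours of work per minute of recording) are the extremes
cited above, whereas the corpus size of 600 minutes and the budget of 1500
working hours are hypothetical figures chosen for the example.

```python
def annotation_hours(recording_minutes, hours_per_minute):
    """Total working hours needed at a given annotation rate."""
    return recording_minutes * hours_per_minute

def annotatable_minutes(budget_hours, hours_per_minute):
    """Minutes of recording that fit into a fixed budget of working hours."""
    return budget_hours / hours_per_minute

corpus_minutes = 600   # hypothetical: ten hours of recordings
budget_hours = 1500    # hypothetical: roughly one person-year of work

for rate in (1, 150):  # the two extremes cited in the text
    needed = annotation_hours(corpus_minutes, rate)
    feasible = annotatable_minutes(budget_hours, rate)
    print(f"{rate:>3} h/min: {needed} h for the whole corpus, "
          f"{feasible:.0f} min annotatable within the budget")
```

At the lower rate the whole hypothetical corpus fits into the budget; at the
higher rate only ten minutes of it do, which is why the level of detail has
to be decided before annotation begins.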
Readability and relevance for potential users should also be considered in
deciding on an annotation format, since from the perspective of a user, too,
an annotation burdened with too much detail (even if there are technical
solutions for displaying only selected aspects of the annotation) can be
cumbersome rather than helpful (an impressive demonstration of the effect
of increasing detail in a transcription on readability is provided by Duranti
[1997: 122-161]).
Another important point to remember is that language documentation
is an inherently ongoing process and that annotations may be produced or
corrected multiple times by one or multiple authors (Holton 2003: 6; cf.
also Edwards 2001: 322). It is thus quite possible that, for example, an an-
notation consisting of a transcription in a practical orthography and a trans-
lation will be supplemented, many years later, with a prosodic annotation by
a research project on prosody, and with grammatical annotation by some-
one working on a reference grammar of the language.
A few more notes on some basic assumptions are in order here. First, I
assume that the linguistic annotation will be produced in machine-readable
format or is at least convertible into machine-readable format. As a conse-
quence, only symbolic types of annotation are considered, ruling out, for
example, iconic representations of fundamental frequency (pitch contours;
see Section 2.4 on prosodic annotation).
A further assumption made here is that the annotation itself will be in
multi-tier or interlinear format (Edwards 2001: 327). This means that anno-
tation of different types is displayed in different fields or tiers (e.g.
phonetic transcription, orthographic transcription, interlinear gloss, transla-
tion) which represent different aspects of the same section of speech. The
tiers themselves obviously have to be labelled according to the type of an-
notation they represent. An illustration of the use of labelled tiers is pro-
vided in example (1). The conventions employed here and in subsequent
examples follow the format employed by Shoebox/Toolbox, one of the
most widely used databases for linguistic analysis. The type of annotation
is indicated by a label consisting of a backslash and a few letters chosen for
their mnemonic value, separated by a space from the actual annotation. For
example, the label \ref stands for the reference ID which serves to uniquely
identify each line of the transcript, in this example, incorporating informa-
tion about the year of the recording, the tape number, the section of the
tape, and the line of the transcript.
The label \sp precedes the initials of the
speaker (see further Section 2.6), the label \orth stands for an orthographic
transcription, and the label \ft marks the free translation (all labels used in
this chapter are included in the list of abbreviations at the end of this
chapter).
(1) Illustration of multi-tiered annotation (Jaminjung example)
\ref 1999_A03_01.034
\sp IP
\orth malarabiya dibard ganunyngungam, bangawu
\ft the frog now is jumping away from the two, look
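Because each tier is introduced by a backslash label, a record such as (1)
is easy to process automatically. The sketch below, in Python, is an
illustration under stated assumptions rather than a description of how
Shoebox/Toolbox itself works: the function names are hypothetical, and the
pattern for the reference ID simply assumes the layout
year_tape_section.line that is described above for this example.

```python
import re

RECORD = r"""\ref 1999_A03_01.034
\sp IP
\orth malarabiya dibard ganunyngungam, bangawu
\ft the frog now is jumping away from the two, look"""

def parse_record(text):
    """Map each tier label to its annotation (one tier per line)."""
    tiers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # ignore anything that is not a labelled tier
        label, _, value = line[1:].partition(" ")
        tiers[label] = value.strip()
    return tiers

# Assumed layout of the reference ID: year, tape, section of tape, line.
REF_PATTERN = re.compile(
    r"(?P<year>\d{4})_(?P<tape>[A-Z]\d+)_(?P<section>\d+)\.(?P<line>\d+)"
)

def parse_ref(ref):
    """Decompose a reference ID of the assumed form into its parts."""
    match = REF_PATTERN.fullmatch(ref)
    if match is None:
        raise ValueError(f"unexpected reference ID: {ref!r}")
    return {
        "year": int(match["year"]),
        "tape": match["tape"],
        "section": int(match["section"]),
        "line": int(match["line"]),
    }

tiers = parse_record(RECORD)
print(tiers["sp"], parse_ref(tiers["ref"]))
```

Other projects will structure their reference IDs differently, so the
pattern would have to be adapted to the conventions actually in use.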
Again for simplicity's sake, I will further assume, throughout most of the
chapter, that all types of annotation in a multi-tiered format are aligned with
(i.e. refer to) the same segment of audio- or video-data (and, in fact, may be
linked to that segment via time codes; cf. also Edwards 2001: 328). Cases
of non-alignment will be discussed in the sections dealing with overlap
(Section 2.6) and with the alignment of free translation and contextual
commentary (Sections 3.2 and 3.3). In addition, I will argue that annotation
may also take the form of cross-reference between data sets (Section 6).
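The assumption that all tiers of a record refer to the same stretch of the
recording can be pictured as a simple data structure. The following Python
sketch is merely one possible representation: the class and function names
are hypothetical, and the time codes (in seconds) and the second reference
ID are invented for the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One stretch of a recording together with all tiers aligned to it."""
    start: float  # seconds from the beginning of the recording (invented)
    end: float
    tiers: dict = field(default_factory=dict)

def segments_at(segments, time):
    """Return every segment whose time span contains the given time point."""
    return [s for s in segments if s.start <= time < s.end]

segments = [
    Segment(12.4, 15.1, {
        "ref": "1999_A03_01.034",
        "sp": "IP",
        "ft": "the frog now is jumping away from the two, look",
    }),
    Segment(15.1, 16.0, {"ref": "1999_A03_01.035", "sp": "IP"}),  # hypothetical
]

for seg in segments_at(segments, 13.0):
    print(seg.tiers["ref"], seg.tiers.get("ft", ""))
```

Since the lookup returns a list, segments of different speakers whose time
spans overlap would both be found for the same time point.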
Annotation proper has to be distinguished from markup (cf. Edwards
2001: 322), the standardized representation of the structure and format of a
text for the purpose of exchange of digitally encoded text. The current stan-
dard for markup is XML. In this chapter, I am mainly concerned with the
content and structure of the annotation and not with aspects of markup or
technical implementation. Aspects of technical implementation other than
markup include:
- the linking of corresponding elements on different tiers, e.g., via indices
  or time codes (cf. also Edwards 2001: 328);
- flexibility in displaying the full annotation or hiding parts of the
  annotation irrelevant to the task or output at hand;
- conversion into other formats, including printable output;
- the use of characters (the current standard being Unicode compatible
There exists an ever-growing body of literature on those technical aspects
(cf., e.g., Bird and Liberman 2001; Bow, Hughes, and Bird 2003; see also
Chapters 4, 13, and 14 for discussion and references). For an overview of
software employed in various projects as well as current encoding standards,
see also Edwards (2001: 337-338, 342-343). Suggestions for annotations
including prosodic and paralinguistic annotations in an XML-compatible
format have been developed by the Text Encoding Initiative (TEI). For the
latest version of the TEI recommendations, see TEI Consortium (2005)
(especially Ch. 10, Transcriptions of Speech).
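Although markup is not the concern of this chapter, the relation between
labelled tiers and an XML representation can be sketched briefly. The
Python example below uses the standard library module xml.etree.ElementTree;
the element and attribute names (segment, tier, type, start, end) are
hypothetical and do not follow the TEI recommendations or any other
published schema, and the time codes are invented.

```python
import xml.etree.ElementTree as ET

def record_to_xml(tiers, start=None, end=None):
    """Serialize one annotated segment; element names are hypothetical."""
    segment = ET.Element("segment")
    if start is not None and end is not None:
        segment.set("start", f"{start:.2f}")
        segment.set("end", f"{end:.2f}")
    for label, value in tiers.items():
        tier = ET.SubElement(segment, "tier", {"type": label})
        tier.text = value
    return ET.tostring(segment, encoding="unicode")

xml_text = record_to_xml(
    {
        "ref": "1999_A03_01.034",
        "sp": "IP",
        "ft": "the frog now is jumping away from the two, look",
    },
    start=12.4,  # invented time codes, in seconds
    end=15.1,
)
print(xml_text)
```

A project aiming at exchange or archiving would replace these ad hoc names
with those of the schema it has adopted.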
2. Transcription
The label transcription is used here to refer to any symbolic representation
of the significans side of documented speech events. As has already been
indicated above, no transcript can be regarded as a direct, unbiased
representation of a communicative event; it is by necessity filtered and
influenced by the annotator's decisions, usually according to his or her
theoretical goals and definitions (Ochs 1979: 44; Edwards 2001: 321).
Types of transcription that will be considered below are orthographic,
phonemic, and phonetic transcriptions of segmental information, and
transcription of prosody and of paralinguistic and non-linguistic
phenomena. To these is added a section of a more general nature dealing
with the representation of multi-speaker and multilingual discourse.
The issue of transcription does not arise for genuine cases of written
communicative events that may be included in the documentation, such as
newspaper articles, letters, or graffiti in the documented language. Written
communicative events usually employ an orthographic representation
(which may or may not be standardized; in the latter case, a rendition in
standardized orthography could be added to the documentation). In terms of
annotation other than transcription, written communicative events can be
treated just like spoken communicative events.
In the process of language documentation, it will quite often happen that
a spoken communicative event is not recorded, but written down at the time
of speaking or immediately afterwards, e.g. overheard utterances or
elicitations that were not deemed interesting enough to be recorded. The
transcription of unrecorded utterances will usually be in the same format that
is chosen for the transcription of recorded utterances, e.g. a phonetic
transcription in the early stages of the documentation process, or an
orthographic or phonemic transcription, possibly in addition to the phonetic one,
or even a rudimentary rendition of the most salient prosodic features of the
utterance (see Section 2.4). However, except in the rare case of annotators
with a very good phonetic memory, transcriptions of unrecorded utter-
ances will include less information than transcriptions of recorded utter-
ances, and have to be considered less trustworthy.
Where a recording is available, it is recommended that the transcription,
whether on the orthographic, phonemic, or phonetic level,
should represent as faithfully as possible what is being said. This includes
so-called filled pauses, false starts and self-repair (see (18) for an example),
and repetitions. It is also recommended that a stretch of speech which is not
transcribed because it is not intelligible to the transcriber is marked in the
transcript; a common convention is to use the letter x for each
unintelligible syllable. If a publication of some of the data is intended, speakers
understandably often prefer an edited version which does not include
such features, but which comes closer to a version in written rather than
oral style (see, e.g., Mosel 2004b). If at all possible (i.e. acceptable to the
speech community), such an edited version should not replace the original
transcript, but either be added in the form of a further transcription tier or
(especially in the case of heavy editing) be treated as a separate communi-
cative event, linked to the original by cross-referencing (see Section 6).
Likewise, independent transcriptions by native speakers, especially those
with little training in linguistic conventions, could be treated as primary
data and linked to a standardized version of the transcript. For further
discussion of these points, see also Chapter 10.
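Marking unintelligible stretches consistently has the side effect that they
can be located and counted automatically. The following Python sketch
assumes the convention mentioned above (one letter x per unintelligible
syllable, written as separate tokens such as x or xx); the function name and
the sample string, which uses the placeholder "word" instead of real data,
are hypothetical.

```python
import re

# A token consisting only of the letter x, e.g. "x" or "xxx".
UNINTELLIGIBLE = re.compile(r"\bx+\b")

def unintelligible_syllables(transcript):
    """Count the syllables marked as unintelligible in a transcript line."""
    return sum(len(m.group()) for m in UNINTELLIGIBLE.finditer(transcript))

line = "word xx word word x"  # placeholder tokens, not real data
print(unintelligible_syllables(line))
```

The sketch would miscount in a language whose orthography contains genuine
words consisting only of the letter x; in that case a different marker would
have to be chosen.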
2.1. Orthographic transcription
If an orthography for the language under investigation is already estab-
lished and accepted by the speech community, it is virtually an obligation
for a documentary linguist to provide an orthographic transcription as part
of the annotation, since this greatly adds to the accessibility of the docu-
mentation for the members of the speech community themselves. This is
why orthographic transcription is discussed first in this section.
In the case where no established orthography exists, or where an existing
orthography is not acceptable to the current speech community for one rea-
son or another, the documenter(s) will often be involved in devising a new
orthography. The principles, decisions, and potential problems involved in
this process are discussed in Chapter 11.
documenters have in effect devised a preliminary orthography that may
well constitute the basis for the development of an accepted orthography
later on. In actual practice, this will often be adapted from orthographies
used by other speech communities in the region, or by linguists in descrip-
tions of neighboring languages.
2.2. Phonemic transcription
A phonemic transcription is one that represents only the distinctive sounds
and possibly tones of a language, i.e. those that potentially make a differ-
ence in the meaning of a word or morpheme. Therefore, the use of a pho-
nemic transcription presupposes at least a preliminary phonological analysis
of the language (and the phonemic transcription may have to be revised
repeatedly in line with revisions of the phonological analysis). Procedures
for working out the distinctive sound features of a language (for example,
by establishing minimal pairs) are stated in all good introductory textbooks
on phonology and will not be repeated here (for an example of the distinc-
tion between the phonetic and the phonemic level, see (4) in Section 2.3).
The symbols used in a phonemic transcription are often based on one of the
conventions for phonetic transcription discussed in Section 2.3.
A phonemic transcription (just like an orthographic transcription),
moreover, includes word boundaries (indicated by spaces). In a strictly
phonemic transcription, these would indeed have to be phonological words rather
than grammatical words. In principle, the recognition of phonological words
presupposes a phonotactic and (partial) prosodic analysis. Although word
boundaries are not easily recognized in connected speech, in the actual
practice of linguistic fieldwork the integrity of a lexical word is fairly easily
established in most cases; words are those units that can be uttered and
often also translated in isolation by native speakers. The analysis and hence
representation of clitics and function words can create notorious problems,
though (see Chapter 10).
A phonemic or orthographic representation should also be used in creat-
ing a lexical entry for each morpheme in the lexical database. In this way,
the process of morpheme-by-morpheme glossing can be automated (see
further Section 4.1).
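How a consistent phonemic or orthographic representation makes automatic
glossing possible can be shown with a deliberately simplified sketch. The
Python below is not the procedure used by any particular program: it assumes
that the words in the transcription have already been segmented with
hyphens, and the two lexicon entries (based on the German form schöne,
'beautiful (F)') as well as all names are chosen for illustration only.

```python
# Toy lexical database: morpheme form -> gloss (illustrative entries only).
LEXICON = {"schön": "beautiful", "e": "F"}

def gloss_word(word, lexicon, unknown="???"):
    """Gloss a hyphen-segmented word by looking up each morpheme."""
    return "-".join(lexicon.get(m, unknown) for m in word.split("-"))

def gloss_line(line, lexicon):
    """Gloss every word of a transcription line."""
    return " ".join(gloss_word(w, lexicon) for w in line.split())

print(gloss_line("schön-e", LEXICON))
```

The lookup only succeeds if every occurrence of a morpheme is spelled exactly
as in its lexical entry, which is the practical reason for keeping the
representation consistent; homophonous morphemes would in addition require a
choice by the annotator.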
2.3. Phonetic transcription
We now turn to the question of whether to include a phonetic transcription
in the annotations used in language documentation. A phonetic transcription
is one that attempts to represent the articulatory characteristics of perceived
segments as well as possibly some suprasegmental characteristics on the
lexical level such as word stress and tone (for other suprasegmental charac-
teristics, see Section 2.4), without embodying a decision as to which of
these characteristics are distinctive (as in a phonemic transcription).
The most widely employed standard for segmental and tonal phonetic
transcription is the IPA alphabet (devised by the International Phonetic
Association), which is based on the Roman alphabet but includes many
special symbols. Americanists have been using a somewhat different pho-
netic alphabet involving diacritics such as those employed in standard ortho-
graphies of several European languages. A good overview of the phonetic
symbols used in both traditions is provided by Pullum and Ladusaw (1996).
Although with the advent of Unicode, the use of special phonetic fonts
has become less of a problem for the exchange and archiving of data, the
process of phonetic transcription using a standard keyboard may still prove
cumbersome. The SAMPA system (Speech Assessment Methods Phonetic
Alphabet) has been devised to overcome this problem, as it relies solely on
characters available on a standard keyboard, e.g. by making use of capital
letters and digits. For an overview of this system, see Wells et al. (1992),
Wells (1997), and the online description provided by Wells (2004). As an
example, consider the following phonetic transcription of two German
words in both the IPA and SAMPA systems, following Gibbon (1995) and
Wells (2004).
(3) Phonetic transcription using IPA and SAMPA symbols (German)
\phonet_ipa pʏŋkt.lɪç
\phonet_sampa pYNkt.lIC
\orth pünktlich
\ft punctual
\phonet_ipa ʃøː.nə
\phonet_sampa S2:.n@
\orth schöne
\ft beautiful (F)
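Since SAMPA is a systematic re-encoding of IPA symbols, the conversion
between the two can be automated. The Python sketch below covers only the
symbols needed for the two German words in (3) and maps them character by
character; the names used are hypothetical, and a full converter would need
a complete table and would have to treat multi-character SAMPA symbols as
units.

```python
# Partial SAMPA-to-IPA table: only the symbols occurring in example (3).
SAMPA_TO_IPA = {
    "Y": "ʏ",  # near-close near-front rounded vowel
    "N": "ŋ",  # velar nasal
    "I": "ɪ",  # near-close near-front unrounded vowel
    "C": "ç",  # voiceless palatal fricative
    "S": "ʃ",  # voiceless postalveolar fricative
    "2": "ø",  # close-mid front rounded vowel
    ":": "ː",  # length mark
    "@": "ə",  # schwa
}

def sampa_to_ipa(text):
    """Convert a SAMPA string to IPA, leaving unlisted characters as is."""
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in text)

for word in ("pYNkt.lIC", "S2:.n@"):
    print(word, "->", sampa_to_ipa(word))
```

Characters that are identical in both systems, such as p, k, t, l, n and the
syllable boundary, simply pass through unchanged.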
Some training in the basics of phonetics and phonetic transcription can be
considered essential for anybody undertaking a language documentation.
Since a phonetic transcription can be undertaken without prior phonological
analysis, it is often the type of transcription used in the initial stages of lin-
guistic fieldwork. However, as all but the most phonetically gifted field-
workers will probably confirm, these initial transcriptions are likely to be
unreliable and should not be included in the annotations, or used as the sole
basis for a phonemic or orthographic transcription, without subsequent
Once a phonological analysis has been undertaken, it is strictly speaking
not necessary to include a phonetic transcription if the original recording is
provided together with the annotation. However, there are several good
reasons for providing a phonetic transcription for at least part of the corpus.
Depending on the place of a phonetic transcription in the language docu-
mentation project at hand, the phonetic transcription employed will be
broad or narrow. These terms really describe a continuum with a pho-
nemic transcription at the broad end, and a phonetic transcription includ-
ing as much detail as possible on the narrow end. A fairly broad phonetic
transcription of at least part of the text corpus can be used to provide in-
formation on allophones, i.e. the realization of phonemes in different pho-
nological environments. The distribution of voiced and voiceless stops in
Jaminjung may serve as a simple example. Voicing is not distinctive in
Jaminjung, since voiceless and voiced stops are in complementary
distribution: As in many other Australian languages, stops are always voiceless
word-finally, but always voiced word-initially and medially. In the phone-
mic transcription illustrated in (4), only the symbols for voiced stops are
employed. In an allophonic phonetic transcription, the last /g/ would have
to be rendered by the symbol for the voiceless velar stop, [k]. Similarly, the
second /u/ in the phonemic transcription is replaced, in the broad phonetic
transcription, with the symbol for the centralized allophone which occurs in
unstressed syllables, [].
(4) Illustration of phonemic and broad phonetic transcription
(Jaminjung example)
\phonem gugug
\phonet 'gugk
\ft in the water (water-LOCATIVE)
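The stop allophony just described is regular enough to be stated as a rule
that derives part of the broad phonetic form from the phonemic one. The
Python sketch below is a simplification under stated assumptions: it handles
only the three stops written b, d, g as single letters, ignores stops
written with digraphs, and does not model the vowel allophony or the stress
mark shown in (4); the names are hypothetical.

```python
# Voiced stop symbol -> voiceless counterpart (single-letter stops only).
DEVOICE = {"b": "p", "d": "t", "g": "k"}

def devoice_final_stop(word):
    """Render a word-final stop as voiceless; all other stops stay voiced."""
    if word and word[-1] in DEVOICE:
        return word[:-1] + DEVOICE[word[-1]]
    return word

print(devoice_final_stop("gugug"))
```

Applied to the phonemic form in (4), the rule changes only the last stop,
leaving the word-initial and word-medial stops voiced.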
Allophonic realizations like those illustrated above should be described in a
grammatical sketch accompanying the language documentation (see Chapter
12). An allophonic transcription is therefore not absolutely necessary, al-
though it can provide users of the documentation (provided they have the
appropriate training) with a quick illustration of the basic allophonic
principles.
For some language documenters, a phonetic analysis of the language
will be one of their research goals. In this case, a narrow phonetic transcrip-
tion of some parts of the textual corpus will prove crucial, but will have to
be supplemented with carefully elicited materials for instrumental analysis
of articulatory and acoustic characteristics of the speech sounds. Maddieson
(2001) and Ladefoged (2003) are good introductory texts to phonetic analy-
sis in fieldwork conditions. An overview of the sound systems likely to be
encountered in the world's languages and their phonetic characteristics is
provided by Ladefoged and Maddieson (1996).
Another possible use for a phonetic transcription tier is a faithful rendi-
tion of variation in pronunciation which may turn out to have relevance for
the description of sociolects or dialects, allegro forms (fast speech forms),
or forms otherwise noteworthy or deviating in pronunciation from forms
used in careful speech. For example, in the common German allegro form
given in (5), the nasals are assimilated in their place of articulation to the
following and previous consonants, respectively, and the reduced vowel of
the last syllable is replaced by a syllabic nasal.
(5) Differentiating allegro forms and standard forms in phonetic and pho-
nemic tier (German)
\phonet aŋgebm̩
\phonem angebən
\ft indicate(INF) (or: boast(INF))
There is a good reason for both representing the actual pronunciation, e.g.
in the phonetic tier, and the standard or careful speech form in the ortho-
graphic or phonemic tier as in (5), as the latter greatly facilitates searches
for this word form. If a phonetic representation is used, it makes sense to be
consistent in the level of detail (i.e. consistently use a narrower or broader
transcription, cf. Rischel [1987: 62–65]) and to indicate this in the general
explication of the transcription conventions.
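The search advantage of keeping a standard form alongside the actual pronunciation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names parse_record and find_by_phonemic_form are hypothetical, the tier values are typed in by hand as one possible rendering of (5), and a real corpus tool would read many records from a file.

```python
def parse_record(text):
    """Split a backslash-marked record into a {marker: value} dictionary."""
    tiers = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # ignore anything that is not a tier line
        marker, _, value = line[1:].partition(" ")
        tiers[marker] = value.strip()
    return tiers


def find_by_phonemic_form(records, form):
    """Return the records whose phonemic tier contains form as a whole word."""
    return [r for r in records if form in r.get("phonem", "").split()]


# Hand-typed rendering of example (5); the exact symbols are an assumption.
record = parse_record(r"""
\phonet aŋgebm̩
\phonem angebən
\ft indicate(INF) (or: boast(INF))
""")

# The search uses the careful-speech form, whatever the actual pronunciation was.
hits = find_by_phonemic_form([record], "angebən")
```

Because the query is matched against the phonemic tier only, every allegro variant of the word is found with a single search string, while the phonetic tier remains available for inspection in each hit.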
226 Eva Schultze-Berndt
2.4. Prosodic annotation
By prosodic transcription we mean the representation of non-lexical suprasegmental
characteristics of the speech signal (as opposed to lexical characteristics
such as word stress and lexical tone). Suprasegmental information that might
be represented in a transcription includes the following characteristics
(following Llisterri 1996):
- pitch movements, pitch direction or pitch contour, both local and global,
  some of them indicating prosodic boundaries;
- accent at phrase level;
- lengthening (beyond lengthening that is distinctive on a segmental level);
- pauses and pause length.
Whereas an orthographic or phonemic transcription is essential for any lan-
guage documentation, and there are good reasons to include a (broad) seg-
mental phonetic transcription with at least a part of the annotation, the rele-
vance of a prosodic transcription seems less obvious. To be sure, prosodic
information is often crucial for the analysis of the phrase structure as well
as the information structure of spoken language (as there is no punctuation
in spoken language!). However, prosodic transcription is very time-con-
suming and it is more difficult (and much less common) to undertake a
prosodic analysis of a language than to arrive at a segmental phonemic
analysis, and hence to produce a prosodic transcription capturing only the
distinctive aspects. Moreover, there is no standard transcription system for
prosody, even on the phonetic level, comparable to the IPA system for
segmental phonetic transcription. It is therefore to be expected that people
involved in annotation and documentation will only add a prosodic transcription
if one of their goals is a prosodic analysis.
Many of the transcription systems for prosody that have been developed
in modern linguistics are not compatible with the demands of machine-
readable annotation. A few of those that are compatible will be introduced
very briefly below. One important issue that one has to address in the case
of prosodic transcription is whether this will be superimposed on a segmen-
tal transcription in one of the formats described above (e.g. the orthographic
or the phonetic transcription), or whether suprasegmental characteristics
will be annotated in a tier that does not include information on segmental
characteristics. The latter option facilitates searches for prosodic patterns,
but necessitates some link between the units on the segmental and the su-
prasegmental tier (e.g. via time codes).
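How a link via time codes between a segmental and a separate suprasegmental tier might work can be sketched as follows. The Unit class, the function linked_units, and all time codes and labels are invented for illustration; they do not reproduce the data model of any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    start: float  # start time in seconds
    end: float    # end time in seconds
    value: str    # annotation text on this tier


def linked_units(unit, other_tier):
    """Return the units of other_tier whose time span overlaps that of unit."""
    return [u for u in other_tier if u.start < unit.end and unit.start < u.end]


# Invented time codes and labels, for illustration only.
segmental = [
    Unit(0.0, 1.4, "wir albern im korb"),
    Unit(2.2, 3.0, "necken uns"),
]
prosodic = [
    Unit(0.3, 0.5, "accent:AL"),
    Unit(1.0, 1.4, "accent:KORB"),
    Unit(2.2, 2.5, "accent:NEK"),
]

for unit in segmental:
    print(unit.value, [p.value for p in linked_units(unit, prosodic)])
```

The prosodic tier can then be searched on its own, and each hit can be traced back to the segmental unit it overlaps with.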
Chapter 9 Linguistic annotation 227
The various prosodic transcription conventions employed in linguistic dis-
course analysis are all examples of prosodic annotation superimposed on
the segmental (usually orthographic) annotation. The annotation systems
described in DuBois et al. (1993), Ochs, Schegloff, and Thompson (1996),
Selting et al. (1998), and Couper-Kuhlen (2001), and also those employed
in the CHAT conventions (see MacWhinney 1991 and the CHAT website)
and the TEI conventions, all belong to this type. Many of them share fea-
tures such as:
- the use of capital letters or diacritics for accented syllables;
- the use of punctuation marks for boundary intonation, e.g. period (.) for
  falling intonation and question mark (?) for rising intonation;
- the use of arrows for salient changes of pitch.
An advantage of the discourse analysis formats is that they have been de-
veloped for an annotation on the phonetic level which can be undertaken
prior to decisions regarding the prosodic analysis. Moreover, just as with
segmental phonetic annotation, the transcription can be more or less de-
tailed (i.e. broader or narrower). An example of a fairly broad prosodic
transcription in this tradition is provided in (6). Phrasal accent is represented
by capitalizing the accented syllable; the semicolon indicates non-final
boundary intonation (slightly falling or level), the slash and backslash, rising
and falling boundary intonation, respectively, and the equals sign, interlac-
ing of intonation units without a pause. This type of prosodic annotation
(indicating only phrasal accent and boundary intonation) is relatively easy
to produce and can be very helpful for an assessment of the syntactic struc-
ture of the units in question. Pause measurements are also provided in this
example, but since these are very time-consuming, this practice is not nec-
essarily recommended for a general-purpose annotation.
(6) Prosodic transcription in the discourse-analytic tradition
(German example)
\pros wir ALbern im KORB; (0.8)
\pros NEKken uns; (4.1)
\pros SCHERzen / (=)
\pros dass wir uns hinAUSschmeissen ; (=)
\pros gegenseitig \
\ft we laugh around in the basket, tease each other, joking that
we will throw each other out
\cc account of a balloon ride
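Since the conventions used in (6) are superimposed on the orthographic line, a later user who wants to search for prosodic features has to extract them again. The following sketch shows one way this might be done for the specific symbols of (6) only. The function name analyse and the regular expressions are assumptions made for this illustration; in particular, an accented syllable is taken to be a run of at least two capital letters, so single-letter accented syllables would be missed.

```python
import re

ACCENT = re.compile(r"[A-ZÄÖÜ]{2,}")        # capitalized (accented) syllable
PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")  # pause length in seconds, e.g. (0.8)
BOUNDARY = {";": "non-final", "/": "rising", "\\": "falling"}


def analyse(pros_line):
    """Extract accents, boundary intonation, pause and latching from a \\pros line."""
    marks = [ch for ch in pros_line if ch in BOUNDARY]
    pause = PAUSE.search(pros_line)
    return {
        "accents": ACCENT.findall(pros_line),
        "boundary": BOUNDARY[marks[-1]] if marks else None,
        "pause": float(pause.group(1)) if pause else None,
        "latched": "(=)" in pros_line,  # interlacing without a pause
    }


first = analyse("wir ALbern im KORB; (0.8)")
third = analyse("SCHERzen / (=)")
```

Keeping the extracted features in a structure of this kind makes it possible to search, for instance, for all intonation units with rising boundary intonation without altering the transcript itself.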
Another transcription system that is explicitly designed with crosslinguistic
applicability in mind (hence a system on the phonetic level) is INTSINT
(INternational Transcription System for INTonation; see, e.g., Hirst and Di
Cristo 1998; Hirst, Di Cristo, and Espesser 2000). In this system, absolute
pitch with respect to the frequency range of the speaker can be indicated, as
well as relative pitch at a turning point in the intonation contour and iterative
relative pitch (upstep and downstep); the symbols used are either capital
letters or different arrow symbols (Hirst and Di Cristo 1998: 15). However,
neither word-level stress nor phrasal accent nor lengthening is
explicitly marked. The advantage of this system is that the prosodic contour
can be transcribed on a separate tier from the segmental transcript.
A system of prosodic annotation which is popular in prosodic research
is called ToBI (Tones and Break Indices), following from the work of Pierrehumbert
(1980) and subsequent revisions (see, e.g., Silverman et al. 1992). This system
relies on the decomposition of prosodic contours into tones of two pitch levels,
high (H) and low (L), which can be linked to
stressed syllables and intonational phrase boundaries. The main problem
with this system from the point of view of language documentation is
that it presupposes a phonological analysis of the prosodic system in ques-
tion. Prosodic annotation in ToBI style can therefore only be undertaken by
annotators who are seriously concerned with the prosody of the language in
question.
2.5. Transcription of paralinguistic and non-linguistic aspects of the interaction
In Sections 2.1 to 2.4, we have been concerned exclusively with the tran-
scription of spoken language in the narrow sense, i.e. the linguistic compo-
nent of speech events. As anybody with any experience with the transcrip-
tion of natural (rather than read) speech knows, speech events have other
features which are usually not captured by writing systems (even modified
ones such as the IPA notation).
Following the classic paper by Trager (1958), non-linguistic aspects of
speech events can be divided into paralanguage on the one hand, compris-
ing voice quality and vocal events such as coughing, whistling, laughing, or
the so-called filled pauses, and non-vocal or kinesic events on the other
hand. Non-vocal events, in turn, can be divided into speech-accompanying
gestures and any other events that may occur during or in conjunction with
a speech event, such as the slamming of a door, which may or may not
have a communicative impact. Shifts or changes in vocal quality (e.g.
whispering or shouting) or speech tempo are referred to as paralinguistic
features since they cannot be separated from the linguistic features of the
communicative event.
For some time, linguists involved in discourse analysis (including con-
versation analysis) have been aware of the importance of paralinguistic and
non-linguistic aspects of communicative events, and have, accordingly,
developed conventions for transcribing these. Just as for the transcription of
prosody, many earlier systems are not compatible with the demands of digi-
tal processing (cf., e.g., Ehlich and Rehbein 1979; Halwachs 1994). Cur-
rently emerging standards tend to be based on transcription conventions
where the transcription of paralinguistic and non-linguistic features is super-
imposed on a segmental transcription. Some examples of relatively recent,
and fairly similar, suggestions resulting from this tradition can be found in
Selting et al. (1998) and in the Appendix of Ochs, Schegloff, and Thompson
(1996: 461–465), as well as in the conventions employed by CHAT and
those recommended by the TEI (TEI Consortium 2005: esp. Ch. 10.1).
For the purposes of most language documentation projects, it will prove
too time-consuming to produce a detailed transcription of non-linguistic
and paralinguistic aspects of all documented speech events. However, some
of these aspects can be transcribed relatively easily and can greatly facili-
tate the understanding of the interaction. These include hesitations and
filled pauses (e.g. uhm), laughter (which can be represented by L), and sig-
nificant changes of vocal quality, such as whispering. Non-linguistic events
can often be considered part of the contextual information and may be de-
scribed in the tier devoted to the contextual commentary (see Section 3.3).
While many paralinguistic and non-linguistic vocal events can be transcribed
relatively easily, the transcription of gesture, although often a very
important part of the interaction, is difficult and extremely time-consuming,
and no standard transcription conventions exist. Obviously, the possibility
of annotating gesture also depends on the availability of video recordings.
For the purposes of a language documentation project not specifically
devoted to the annotation of gesture, it is nevertheless recommended that
gestures (mainly pointing gestures) accompanying deictic expressions are
annotated and treated as contextual information (e.g. speaker points to the
top of the tree); these can be noted during the event by an observant field-
worker even in the absence of a video recording.
2.6. Transcription of multi-speaker and multilingual discourse
So far, the examples of annotation given in this chapter were of a mono-
logical nature, i.e. they involved only one speaker. Naturally occurring
communicative events, however, are rarely monologues, but rather involve
at least two participants. It is fairly obvious that any annotation has to indi-
cate changes of speaker (also termed turns). In transcripts of interactions
in discourse analysis frameworks, each turn starts on a new line and begins
with a representation of the speaker, e.g. by capital letters or initials, as
illustrated in (7).
(7) Representation of multi-speaker discourse in the discourse-analytic
tradition (DuBois et al. 1993: 49)
A: now that we have the [side door] fixed,
B:                      [That's kind of]
A: he could.
B: Yeah,
C: Yeah.
As also seen in this example, it is common to indicate overlapping speech,
which frequently occurs in multi-speaker discourse, by enclosing overlapping
segments in square brackets and arranging them in parallel with each
other. This works reasonably well in print, but is not easily transferred into
a machine-readable format. Furthermore, consistent marking of overlap can
be a very time-consuming and difficult affair (see DuBois et al. 1993:
50–52, for examples and discussion). For the purposes of providing a base
transcript in a language documentation, one may well leave this task to a
later user who is actually interested in analyzing the structure of conversa-
tional exchanges.
In a multi-tiered annotation format, speaker information will appear in a
separate tier rather than being included with the transcript, as illustrated in
version (a) of example (8), and in (9) below. Alternatively, different labels
can be employed for transcript tiers of different speakers as shown in ver-
sion (b) of example (8); this is the solution implemented in the CHAT and
ELAN annotation conventions.
(8) Representation of multi-speaker discourse in a multi-tiered format
(adapted from (7))
a. \sp A
   \orth now that we have the [side door] fixed,
   \sp B
   \orth [That's kind of]
b. \orth_A now that we have the [side door] fixed,
   \orth_B [That's kind of]
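The two versions in (8) carry the same information, so one can be derived mechanically from the other. The following sketch converts version (a) into version (b); the function name merge_speaker_into_marker is hypothetical, and the code assumes that every transcript line is preceded by a \sp line, as in (8a).

```python
def merge_speaker_into_marker(lines):
    r"""Turn pairs of '\sp X' and '\orth ...' lines into '\orth_X ...' lines."""
    out, speaker = [], None
    for line in lines:
        marker, _, value = line.partition(" ")
        if marker == "\\sp":
            speaker = value.strip()  # remember who is speaking
        elif speaker is not None:
            out.append(f"{marker}_{speaker} {value}")
        else:
            out.append(line)  # no speaker known yet: leave the line unchanged
    return out


version_a = [
    r"\sp A",
    r"\orth now that we have the [side door] fixed,",
    r"\sp B",
    r"\orth [That's kind of]",
]
version_b = merge_speaker_into_marker(version_a)
```

Since the speaker label is appended to whatever marker follows, the same function would also relabel further tiers (such as a free translation tier) that belong to the same speaker.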
Presenting the utterances of different speakers on consecutive lines is the
most widely used, but not the only option of representing multiparty dis-
course. Alternatively, one could also arrange the utterances of different
speakers in different parallel columns (see Ochs 1979 for an example and
discussion), or present them like different voices in a musical score, i.e. in
blocks of parallel lines running across the full width of the page (see Ehlich
1993 for exemplification). The latter option is actually the one implemented
in time-linking software such as ELAN which provides the possibility to
link a segment of a transcript to the corresponding segment in the original
recording. In ELAN, participants are distinguished by different labels not
only for the transcript tiers, but also for all other annotation tiers that are
aligned with the transcript tier. The advantage of this type of notation is
that overlaps are easier to represent. The disadvantage is that in multi-party
interactions, the transcript becomes rather difficult to read.
The interaction with a researcher who is not a member of the speech
community can be treated as a special type of multi-speaker discourse. This
implies that the researcher's part of the interaction also be documented (cf.
Samarin 1966: 125), even if this is done in a more cursory fashion. Documenting
the researcher's questions and comments may help to uncover
misunderstandings and mistakes in the translation later on.
An even more complicated annotation format is needed in the case (which
is the rule rather than the exception for speakers of endangered languages)
that speech events tend to be multilingual rather than
monolingual. Reserving one tier in a multi-tiered annotation format for the
language name will be sufficient if there is no code-switching within units.
In the latter case, however, some indication in the transcript itself is required
(leaving aside the notorious problem of deciding between code-switching
and borrowing in this case). In example (9), the dominant language
(or matrix language) for each intonation unit is indicated in a separate
tier; the languages involved are the Australian languages Ngarinyman (the
dominant language for speaker ER), Jaminjung (the dominant language
for speaker DB), and Kriol, an English-lexified creole which is the lingua
franca of the area and often features in utterance-internal code-switching. In
this example, Kriol insertions, being the unmarked case, are indicated by
angular brackets without any further marking (as in lines (9b) and (9c)),
whereas insertions in another language, as in line (9d), are marked by addi-
tional characters (here Ng for Ngarinyman).
(9) Example of a multi-speaker and multilingual discourse
a. \sp ER
\lg Ngarinyman
\mo yanarnin=barnalu gani::ny,
\it come:PST=1PL.EXCL ??
\ft we came here
\cc account of work on cattle station when speakers were young
b. \sp ER
\lg Ngarinyman
\mo <wilbarra>-yawung, mangarri-yawung \
\it wheelbarrow-PROPR
\ft with a wheelbarrow, with food
c. \sp DB
\lg Jaminjung
\mo <wilbarra> ya gan-anthama!
\it wheelbarrow ?? 3SG.A:3SG.P-bring.IMPF
\ft she used to bring a wheelbarrow
d. \sp ER
\lg Kriol
\mo ya, gatta wilbarra wi bin pushim, <Ng mangarri>,
\it yes with wheelbarrow we AUX.PST push:TR
\ft yes, we pushed food with a wheelbarrow
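The bracketing convention used in (9) lends itself to automatic retrieval of code-switched material. The sketch below assumes exactly the convention described above: unmarked angular brackets for Kriol insertions and the additional code Ng for Ngarinyman. The names INSERTION, LANGUAGES and insertions are hypothetical, and a project with more languages would need a longer list of codes.

```python
import re

# An optional language code (here only "Ng"), then the inserted material,
# enclosed in angular brackets.
INSERTION = re.compile(r"<(?:(Ng)\s+)?([^<>]+)>")
LANGUAGES = {None: "Kriol", "Ng": "Ngarinyman"}


def insertions(mo_line):
    """Return (language, form) pairs for all bracketed insertions in a line."""
    return [(LANGUAGES[m.group(1)], m.group(2)) for m in INSERTION.finditer(mo_line)]


line_c = insertions("<wilbarra> ya gan-anthama!")
line_d = insertions("ya, gatta wilbarra wi bin pushim, <Ng mangarri>,")
```

Combined with the \lg tier, which gives the dominant language of the unit, such a list makes it possible to find all units in which one language is inserted into another.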
3. Translation
A free translation of the transcribed speech events into a widely accessible
language is essential in the documentation of a less widely known language.
This is one feature that distinguishes language documentation as envisaged
in this volume from the compilation of corpora for widely spoken languages
such as English or Japanese, for which often no translation is made.
The first problem to be addressed in this context is the choice of the
language(s) to be translated into (Section 3.1). Different styles of translation
are discussed in Section 3.2, while in Section 3.3 it is argued that informa-
tion on the non-linguistic context of the utterance should not be incorpo-
rated into the translation, but provided in a separate tier as contextual com-
mentary. Morpheme-by-morpheme glosses (interlinear glosses), while
obviously involving the process of translation, also involve morphological
analysis and are intimately linked to other types of grammatical annotation;
they are therefore treated together with these, in Section 4.1.
3.1. Metalanguage(s) used in glossing and translating
One major decision to be made in the process of translation in language
documentation is the choice of the metalanguage(s) (or target languages)
for the translation, keeping in mind the aim of making the documentation
accessible to a varied group of users. Possibilities for the choice of a target
language include the following:
- The second/dominant language(s) for speakers of the documented language
  (typically, but not always, also a regional lingua franca or an official
  state language);
- A language of official status in the country where the language documentation
  is undertaken, which could be one of the national language(s)
  or the language primarily used in education, e.g. Hindi in large parts of
  India, Indonesian in Sulawesi, Turkish in Turkey, and often a colonial
  language such as English in Nigeria or Spanish in Guatemala;
- A standard language in case of the documentation of nonstandard varieties
  or dialects of a language for which a written standard exists;
- The native or dominant language of the person undertaking the translation,
  e.g. Spanish in the case of a Mexican researcher with Spanish as
  the first language;
- The language of academic affiliation of the person undertaking the
  translation, e.g. French if the person in question undertakes language
  documentation as part of obtaining a degree at a French university;
- An academic lingua franca or world language.
It is of course quite possible to combine translations into more than one
language, although the cost in terms of the additional time involved in
annotation is immediately obvious. Criteria for deciding between the dif-
ferent possibilities include, obviously, the abilities of the person undertaking
the translation and/or the possibility of employing additional translators. A
further essential criterion is the accessibility to members of the speech
community and, importantly, their descendants who may not speak the
documented language anymore. The most sensible (though somewhat
ironic) choice in this case is a translation into the language that is most
likely to be the target of language shift, generally the dominant regional
language or an official language of the country in question. Often the insti-
tution funding the research will have requirements for the language of
translation. If the funding comes from a regional institution, this is likely to
be an official language of the country where the documentation is under-
taken; for academic institutions outside this country, it is more likely to be
the language of education used in that institution. Today it seems to be assumed
by most academic advocates of language documentation that English
should be at least one of the metalanguages employed, not only for the
translation but also for the other descriptive components of a language
documentation, with the aim of making the documentation accessible to the
international academic community.
3.2. Free translation
Translation is a skill (many will say, an art) which, if undertaken to profes-
sional standard, usually requires a lot of training, and is fraught with meth-
odological problems. It seems highly unrealistic to burden documenters or
annotators with the expectation that they ought to provide translations that
meet the standards of professional literary or scientific translation. This is
all the more so as the translation is often undertaken by someone who is not
a member of the speech community and, moreover, is only just beginning
to learn the language to be documented and to understand its structure as
well as its cultural background. In addition, often a documenting linguist
will translate into a language which is not his or her native language (e.g.
English, Spanish, or Indonesian). Therefore, all users and potential users of
language documentations should be discouraged in the strongest possible
terms from using the free translations which are provided as part of the
annotation as more than a clue to the meaning and analysis of the docu-
mented utterances.
Apart from the choice of language, one choice to be made in translating is
the choice between a free translation and a more literal translation, although
the boundaries are gradual and nothing much hinges on a consistent
decision in this respect. A literal translation remains closer to the source
language and is therefore more helpful in the understanding of the structure
of the language, and less likely to be misleading. A free translation is idio-
matic in the target language and therefore more readable especially for
people fluent in this language. It may also be richer in that it incorporates
the pragmatic effect of the original utterance, and in this respect, the trans-
lator has of course to be careful in order not to give a misleading impres-
sion of a pragmatic effect.
Of course it is possible to provide both a free and a literal translation,
either in different labelled tiers or by adding, for example, the literal trans-
lation in brackets to the free translation. The first possibility is illustrated in
(10); the free translation is labelled \ft and the literal translation \lit. This
example illustrates the difficulty of translating the complex predicate consisting
of the non-verbal element dibard 'jump' and the inflecting verb
-(ng)unga 'leave' in Jaminjung. Note also that if an interlinear (morpheme-by-morpheme)
translation is provided (see further Section 4.1), as in the
line labelled \it in the following example, this in itself already provides an
extremely literal kind of translation.
(10) Interlinear, free, and literal translation (Jaminjung example)
\orth malarabiya dibard ganunyngungam, bangawu
\mo malara=biya dibard ganuny-ngunga-m, ba-ngawu
\it frog=SEQ jump 3SG.A:3DU.P-leave-PRS IMP.SG-see
\ft the frog now is jumping away from the two, look!
\lit the frog now is jump-leaving the two, look!
If a free rather than a literal translation is chosen, a common practice is to
provide a translation for larger units of segmentation such as paragraphs, as
illustrated in (11), instead of translating each intonation unit. This, how-
ever, is only recommended if an interlinear translation is also provided,
since otherwise it becomes too difficult to relate the translation to the transcript.
(11) Free translation relating to more than one intonation unit (Jaminjung
example)
a. \orth a: ya:, ngiyinthuni barrajjung ngayiny
\mo a: ya:, ngiyinthu-ni barrajjung ngayiny
\it INTERJ INTERJ DEM-LOC further animal
b. \orth ganunyma jarndang
\mo ganuny-ma jarndang
\it 3SG.A:3DU.P-hit.PST go.down.completely?
c. \orth gugubina
\mo gugu-bina
\it water-ALL
d. \orth wiribmijjung
\mo wirib-mij-jung
\ft ah yeah, this animal then pushed the two all the way down
into the water, (the boy) together with the dog.
Example (11) above, again from a Jaminjung retelling of the Frog Story,
also illustrates two further issues in translating. The first is that a free trans-
lation, especially when the translation is that of a whole paragraph, tends to
assume the stylistic features of written as opposed to spoken language. This
is not a major issue if the translator is aware of it and if the translation is
regarded as an aid for the interpretation of the original utterance by later
users, not as a faithful rendition of the original. In special cases however,
e.g. when translating ritual speech events or verbal art, the translator may
well strive to represent aspects of the original discourse structure (for dis-
cussion of this issue, see e.g. Sammons and Sherzer 2000).
The second issue is that of adding information not present in the original,
illustrated by the addition of the noun phrase 'the boy' in brackets in the free
translation of (11), the omission of which would result in an ungrammatical
sentence in English. In Jaminjung, on the other hand, the information about
the referent is only indicated by the third person dual object prefix in line
(11b) and the comitative case in line (11d), together with the preceding
context. It is recommended that additional information of this kind is
marked by brackets or some other means, since this greatly helps later users
of the documentation to assess immediately where the translation deviates
from the original.
In addition to providing a more idiomatic as well as a more literal translation
where appropriate, I have found it good practice to include the literal,
rather than edited, version of any translation into a contact language provided
by native speakers. (Alternatively, this can be done by cross-referencing,
see Section 6, if such translations are documented as communicative
events in their own right.) In example (12) below, the translation into Kriol,
labelled \ot, provides a much closer rendition of the Jaminjung utterance than
the free English translation because it is basically a calque of the former:
first, the causal interrogative expression nganthan-nyunga 'what-ORIG' is
translated literally as 'what from' (the Origin case, apart from acquiring a
causal function, also functions as a marker of origin, as in 'the man from
Bulla'). Second, the lexeme mangarra is translated as taka (< Engl. tucker);
both Jaminjung mangarra and Kriol taka are generic terms used for any edible
plant or food made from this plant. Thus, an original translation can often
provide important cues to the structure of the original utterance.
(12) Original translation by a native speaker (Jaminjung example with
Kriol translation)
\mo nganthan-nyunga nganth-unga-m mangarra?
\it what-ORIG 2SG.A:3SG.P-leave-PRS
\ft why are you leaving your food (rather than eating it up)?
\ot wat from yu livim taka
3.3. Contextual commentary
During the process of translation for the purpose of annotating recorded
speech events, the annotator should remember to add contextual informa-
tion where it is crucial for an interpretation of the utterance by anybody
who was not present during the original speech event. Relevant information
of this kind may pertain to the entity, event, or stimulus referred to by the
speaker, to the addressee and the intended pragmatic effect of the utterance,
or to an action of the speaker or other participants accompanying the
speech event. This information may overlap with, complement or partly
replace a transcription of non-linguistic aspects of the interaction (see Sec-
tion 2.5) and also overlap with ethnographic commentary, discussed in
Chapter 8. Contextual information can consist of a prose description of the
context, but also of links to photographs of some aspect of the speech situa-
tion (e.g. an artefact under discussion), or of stimuli used in elicitation.
Providing contextual information is particularly important when an ut-
terance is not embedded in a longer text which would aid its interpretation.
In example (13), again from Jaminjung, the tier labelled \cc provides the
contextual information without which the utterance, even with the translation,
could hardly be interpreted. In the case of transcribing an unrecorded,
overheard utterance such as (13), it is important to immediately
note as much detail as possible about the circumstances of the communicative
event, since there is no recording to assist in the recovery of such information.
(13) Contextual information about the event referred to
(Jaminjung example)
\mo juwurlab ga-rna-ya ngayin
\it swell.up 3SG-burn-PRS meat
\ft the meat is swelling up because of the heat
\cc tinned meat on the fire rising out of the can
Rather than in an additional tier, contextual information could be included
with the free translation (see Section 3.2), e.g. (in the case of example (13))
'the (tinned) meat is swelling up (i.e. rising out of the can) because of the
heat (on the fire)'. While this saves space, it makes the translation less
readable and obscures its relationship to the original utterance. It is there-
fore recommended to provide contextual information in a separate tier.
As in the case of the free translation, a contextual commentary will often
relate to more than one line in the transcript (i.e. to more than one intona-
tion unit). This can be represented in a straightforward manner if each tier
is linked to a segment of a recording via time-codes; another method is to
explicitly link a contextual commentary with the reference numbers (see
example (1)) of several units.
4. Grammatical annotation
4.1. Interlinear glossing
It has become standard practice in the linguistic literature to provide data
from languages other than the most widely known languages in a three-
tiered format: a (phonemic or orthographic) representation is combined
with morpheme-by-morpheme glosses, commonly referred to as interlinear
glosses, and a free translation. In an annotated corpus, it is also recom-
mended practice to include interlinear glosses for all or at least part of the
transcriptions. Done manually, interlinear glossing is very time-consuming;
if, however, the text database is linked to a dictionary database listing indi-
vidual morphemes, glossing can be done largely automatically by diction-
ary lookup, as implemented by the CLAN and Shoebox/Toolbox software.
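The principle of glossing by dictionary lookup can be illustrated with a very small sketch. It is emphatically not a description of how CLAN or Shoebox/Toolbox work internally: the lexicon is a hand-made dictionary containing only the morphemes of one Jaminjung example from this chapter, the function name gloss_line is hypothetical, and homophonous morphemes, which real tools must let the annotator disambiguate, are ignored.

```python
import re

# Hand-made toy lexicon: morpheme form -> gloss.
LEXICON = {
    "malara": "frog", "biya": "SEQ", "dibard": "jump",
    "ganuny": "3SG.A:3DU.P", "ngunga": "leave", "m": "PRS",
    "ba": "IMP.SG", "ngawu": "see",
}


def gloss_line(mo_line, lexicon, unknown="??"):
    """Gloss a morpheme-break line, copying spaces, hyphens and equals signs."""
    out = []
    for part in re.split(r"(\s+|[-=])", mo_line):
        if part == "" or part.isspace() or part in ("-", "="):
            out.append(part)  # boundary symbols are carried over unchanged
        else:
            # Punctuation is ignored for lookup; unknown forms are flagged.
            out.append(lexicon.get(part.strip(",.!?"), unknown))
    return "".join(out)


glosses = gloss_line("malara=biya dibard ganuny-ngunga-m, ba-ngawu", LEXICON)
```

Forms missing from the lexicon are flagged rather than silently skipped, so that the annotator can add them to the dictionary and rerun the lookup.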
Interlinear glossing involves the addition of two additional tiers. The first
is derived from the phonemic or orthographic transcription tier, but with the
addition of morpheme and clitic breaks which are standardly indicated by a
hyphen and an equals sign, respectively; the second tier contains the mor-
pheme-by-morpheme glosses. Some of the conventions employed in inter-
linear glossing (at least among linguists) are illustrated in the tiers labelled
\mo and \it in (14), repeated from (10) above.
(14) Illustration of interlinear glossing (Jaminjung example)
\orth malarabiya dibard ganunyngungam, bangawu
\mo malara=biya dibard ganuny-ngunga-m, ba-ngawu
\it frog=SEQ jump 3SG.A:3DU.P-leave-PRS IMP.SG-see
\ft the frog now is jumping away leaving the two, look!
The most important conventions include:
- The use of corresponding boundary symbols (space, hyphen, equals
  sign) in both the morpheme break tier and the gloss tier;
- The use of lower case for glosses of lexical morphemes and of upper
  case (or rather, small capitals) for glosses of grammatical morphemes;
- The use of dots to separate the grammatical components of portmanteau
  morphemes in fusional languages (e.g. IMP.SG as the gloss for the
  single prefix ba- above), and of colons to separate glosses where a
  segmentation in the morpheme tier is possible in principle but is not
  applied for convenience, or because the exact position of the morpheme
  boundary is unclear (e.g. 3SG.A:3DU.P in example (14); here the prefix
  ganuny- could be further segmented as gan-uny-, but since the boundary
  is not always clear with other transitive pronominal prefixes, I have
  chosen to gloss them generally in the format illustrated here);
- The consistent use of a single gloss as translation equivalent of any
  given morpheme, even though this may not be the closest translation
  equivalent in the free translation (for example, the verb -ngawu- is
  glossed as 'see' throughout my annotated corpus even though the closest
  translation equivalent in examples such as (14) is 'look'). This not
  only avoids arbitrary decisions regarding the polysemy of a given
  morpheme, but also greatly facilitates automatic searches.
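The first and last conventions are what make glossing by dictionary lookup feasible. The following is only a minimal sketch of the idea, not the procedure used by CLAN or Shoebox/Toolbox: the lexicon is a hypothetical Python dictionary filled with the forms from example (14), and the function name gloss_line is invented for illustration.

```python
import re

# Hypothetical mini-lexicon: exactly one gloss per morpheme,
# using the forms and glosses of example (14).
LEXICON = {
    "malara": "frog",
    "biya": "SEQ",
    "dibard": "jump",
    "ganuny": "3SG.A:3DU.P",
    "ngunga": "leave",
    "m": "PRS",
    "ba": "IMP.SG",
    "ngawu": "see",
}

# Boundary symbols: space (word), hyphen (affix), equals sign (clitic).
BOUNDARY = re.compile(r"([ \-=])")

def gloss_line(mo_tier, lexicon, unknown="???"):
    """Gloss a morpheme-break tier by lookup, copying each boundary symbol."""
    out = []
    for piece in BOUNDARY.split(mo_tier):
        if BOUNDARY.fullmatch(piece):
            out.append(piece)            # boundary symbol is kept unchanged
        elif piece:
            form = piece.strip(",.!?")   # drop punctuation attached to the form
            out.append(lexicon.get(form, unknown))
    return "".join(out)

mo = "malara=biya dibard ganuny-ngunga-m, ba-ngawu"
print(gloss_line(mo, LEXICON))
# frog=SEQ jump 3SG.A:3DU.P-leave-PRS IMP.SG-see
```

Because every boundary symbol is copied from the morpheme tier, the two tiers correspond by construction, and morphemes missing from the lexicon show up as ??? for manual treatment.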
For a more detailed discussion of these conventions, the reader is
referred to the pioneering paper on interlinear glossing by Lehmann
(1983), the revised version (Lehmann 2005) as well as the versions
published in König et al. (1994) and Bickel, Comrie, and Haspelmath
(2004). These recommendations also include abbreviations for common
grammatical morphemes.
While the adherence to such standards facilitates the use of a documenta-
tion for linguists, it is more important that an explanation of all abbrevia-
tions used in the glossing is included with the documentation. Ideally, also,
the function of all grammatical morphemes will be discussed in the sketch
grammar accompanying the documentation (see Chapter 12).
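Whether every abbreviation used in the glosses is actually explained can be checked mechanically. The sketch below assumes a hypothetical abbreviation list held in a Python dictionary, and it treats person numbers such as the 3 in 3SG as needing no entry of their own; both are assumptions made for illustration only.

```python
import re

# Hypothetical abbreviation list accompanying the documentation.
ABBREVIATIONS = {
    "SEQ": "sequential",
    "SG": "singular",
    "DU": "dual",
    "A": "agent",
    "P": "patient",
    "PRS": "present",
    "IMP": "imperative",
}

def grammatical_labels(it_tier):
    """Return the set of upper-case labels used in a gloss tier."""
    labels = set()
    # Split on word, affix, clitic, portmanteau (dot) and colon boundaries.
    for part in re.split(r"[ \-=.:]", it_tier):
        part = re.sub(r"^[0-9]+", "", part)  # assumption: 3SG is listed as SG
        if part and part.isupper():
            labels.add(part)
    return labels

def unexplained(it_tier, abbreviations):
    """List labels that occur in the glosses but not in the abbreviation list."""
    return sorted(grammatical_labels(it_tier) - set(abbreviations))

print(unexplained("frog=SEQ jump 3SG.A:3DU.P-leave-PRS IMP.SG-see", ABBREVIATIONS))
# []
print(unexplained("owl-ABL peep 3SG.A:3SG.P-put-PRS", ABBREVIATIONS))
# ['ABL']
```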
Interlinear glossing presupposes that a morphological analysis and some
degree of semantic analysis of the language has already been undertaken,
since the indication of morpheme breaks involves a decision on what the
smallest meaning-bearing units are, and the glosses provided for the
grammatical and lexical morphemes, even if they are considered
preliminary, involve some degree of grammatical and lexical semantic
analysis, respectively. The principles of morphological segmentation are outlined in all
textbooks on morphology, see e.g. Matthews (1991) or Haspelmath (2002),
and will not be repeated here, with one exception: A problem frequently
arises in the morphological segmentation of languages where morpheme
boundaries tend to be blurred by morphophonemic processes. Apart from
the use of colons as illustrated above, it is possible and often practiced in
these cases to include the underlying forms of the morphemes in question
in the morpheme tier, and use these as the basis for glossing, as illustrated
in (15).
(15) The representation of underlying forms in the morphological tier
(Tagalog example, Nikolaus Himmelmann, p.c.)
\orth mamulot nung manga bunga
\mo maN-pulot non=ng mang bunga
\it AV-pick_up DIST.GEN=LK PL flower
\ft (their means of living was) to pick fruit,
One disadvantage of interlinear glossing as recommended by linguists is
that it is often difficult to read for non-linguists. In some cases, the
annotator may therefore opt for not glossing some grammatical morphemes,
or for using the closest translation equivalent in the metalanguage for
any grammatical morpheme where this is possible (for example 'me' instead
of 1SG.ACC, 'for' instead of BEN (benefactive), or 'now' instead of SEQ
as in example (14)). If employed consistently, glosses of this kind may
still be converted by global change into standard linguistic interlinear
glossing. A more radical departure from the principles of interlinear
glossing is the glossing of whole word forms instead of morphemes,
bordering on a very literal translation (see Section 3.2). If glosses of
this kind are expected to be of value to some potential users of the
documentation, it is probably best to add them as a separate tier. This
is illustrated in (16):
(16) Illustration of non-linguistic interlinear glossing
(Kwakw'ala example, from Boas 1911b: 554)
\orth l:'lai G:xdn
\mo la:-'la-i G:xdn
\it2 then_it_is_said Gixden
\orth dx'ut':lis la:q.
\mo dx-u:t'a-gi-i:s la:-q
\it jump-out_of_enclosed_space-MOTION-beach go-3.OBJ
\it2 jumped_out_of_woods_on_beach to_them
\ft Then Gixden jumped out of the woods.
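The conversion "by global change" of reader-friendly glosses into standard ones, mentioned above, can be sketched as follows. The mapping contains only the three pairs named in the text; the function name standardize is hypothetical. The sketch is safe only under the condition stated in the text, namely that the friendly glosses are used consistently and never double as glosses of lexical morphemes.

```python
import re

# Reader-friendly glosses and their standard equivalents (pairs from the text).
FRIENDLY_TO_STANDARD = {"me": "1SG.ACC", "for": "BEN", "now": "SEQ"}

# Boundary symbols: space, hyphen, equals sign.
BOUNDARY = re.compile(r"([ \-=])")

def standardize(it_tier, mapping):
    """Replace each whole gloss found in the mapping; leave everything else."""
    pieces = BOUNDARY.split(it_tier)
    return "".join(mapping.get(piece, piece) for piece in pieces)

print(standardize("frog=now jump 3SG.A:3DU.P-leave-PRS IMP.SG-see",
                  FRIENDLY_TO_STANDARD))
# frog=SEQ jump 3SG.A:3DU.P-leave-PRS IMP.SG-see
```

Only whole glosses between boundary symbols are replaced, so a lexical gloss such as know is not affected by the entry for now.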
4.2. Grammatical tagging
The grammatical information provided by interlinear glossing is obviously
limited: it does not show grammatical analysis of constituency or depend-
ency for structures beyond word level. While the coding of this kind of
information is often an important feature of published corpora of widely
spoken languages, in the practice of language documentation it is only
rarely attempted, first because of its time-consuming nature, second, be-
cause a grammatical analysis will only be developing in the course of the
annotation. Some possibilities of adding grammatical information to the
annotation are nevertheless mentioned here and in the following section on
grammatical notes (Section 4.3). Any grammatical regularities that can be
observed early on in the documentation process, such as (for many
languages) word order, should be included in the grammatical sketch (see
Chapter 12).
The type of grammatical information that is most often provided
in corpora of less widely spoken languages is that on the part-of-speech
membership of individual morphemes or of word forms (as illustrated in
the tiers labelled \ps_mo and \ps_w, respectively, in (17)). This is often
referred to as morphosyntactic tagging in the corpus linguistics literature.
(17) Part-of-speech tagging on morpheme and word level
(Jaminjung example)
\mo thanthu=biya wajgany wirib-ni..
\it that=SEQ sugarbag dog-ERG
\ps_mo dem=clitic n n-case
\ps_w dem=clitic n n
\ft that honey, the dog
\mo mu-mirrang gani-ngayi-m=ngarndi
\it FS-look.up 3SG.A:3SG.P-see-PRS=FOC
\ps_mo pv bpron-vtr-tense=clitic
\ps_w pv vinfl_tr
\ft is looking up at it
The tier indicating the part of speech category can be used to search for
patterns of distribution and can therefore assist in grammatical analysis.
Technically speaking, part-of-speech assignment, at least on a
morpheme-by-morpheme basis, can easily be done automatically in
conjunction with automatic interlinear glossing, e.g. by the
Shoebox/Toolbox software. It
should always be borne in mind, however, that the assignment of parts of
speech to lexical items in a language which has not been well described is
by no means a trivial task and should not rely on semantic criteria (see, e.g.,
Schachter 1985; Sasse 1993; Broschart 1997; and references there). Unless
the language under consideration has straightforward criteria for word-class
assignment (usually morphological criteria, e.g. clearly different inflec-
tional paradigms for the major parts of speech such as nouns and verbs), it
is perhaps advisable not to add part-of-speech tagging until a later stage
in the documentation process.
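How a part-of-speech tier supports searches for patterns of distribution can be sketched with the data of example (17). The record layout (one Python dictionary per annotation unit, keyed by tier label) and the function name find_sequence are assumptions made for this illustration; they do not reflect any particular tool.

```python
# Records as dictionaries keyed by tier label (data from example (17)).
RECORDS = [
    {"mo": "thanthu=biya wajgany wirib-ni..", "ps_w": "dem=clitic n n"},
    {"mo": "mu-mirrang gani-ngayi-m=ngarndi", "ps_w": "pv vinfl_tr"},
]

def find_sequence(records, pattern, tier="ps_w"):
    """Yield (record index, word forms) where `pattern` occurs as adjacent tags."""
    for index, record in enumerate(records):
        tags = record[tier].split()
        words = record["mo"].split()
        if len(tags) != len(words):
            continue  # tiers are out of step; skip rather than guess
        for start in range(len(tags) - len(pattern) + 1):
            if tags[start:start + len(pattern)] == pattern:
                yield index, words[start:start + len(pattern)]

# All preverbs immediately followed by a transitive inflecting verb:
print(list(find_sequence(RECORDS, ["pv", "vinfl_tr"])))
# [(1, ['mu-mirrang', 'gani-ngayi-m=ngarndi'])]
```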
A next possible step in grammatical annotation is the coding of
constituency in the form of reduced tree diagrams (e.g. by bracketing).
An illustration is provided in (18), where NP stands for noun phrase and
CP for complex predicate, consisting of a preverb and an inflecting verb
(there is no evidence for a verb phrase level including a noun phrase in
Jaminjung).
(18) Grammatical tagging of constituency (Jaminjung example)
\mo thanthu=biya wajgany wirib-ni..
\it that=SEQ sugarbag dog-ERG
\gr [dem=clitic n]NP [n]NP
\ft that honey, the dog
\mo mu-mirrang gani-ngayi-m=ngarndi
\it FS-look.up 3SG.A:3SG.P-see-PRS=FOC
\gr [pv vinfl_tr]CP
\ft is looking up at it
As pointed out above, this type of grammatical annotation presupposes a
good understanding of the grammar, as well as the adherence to a particular
model of constituency. Note also that such annotation is rather difficult
to revise after a change in either the grammatical analysis or the model
adopted, unlike grammatical glosses or part-of-speech tags, which can be
changed in a (semi-)automatic fashion. It is therefore not necessarily
recommended for the purpose of language documentation and certainly
should not be undertaken in the early stages of a documentation project.
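To show what a bracketed \gr tier encodes, the sketch below reads the flat bracketing of example (18) and pairs each labelled constituent with the word forms it covers. It assumes one level of bracketing only and that every word belongs to some bracket; nested structures would need a proper parser. The function name constituents is hypothetical.

```python
import re

# One flat level of bracketing, as in the \gr tier of (18): [tags]LABEL
CONSTITUENT = re.compile(r"\[([^\[\]]+)\]([A-Za-z]+)")

def constituents(gr_tier, mo_tier):
    """Pair each bracketed constituent with the word forms it spans."""
    words = mo_tier.split()
    result = []
    position = 0
    for tags, label in CONSTITUENT.findall(gr_tier):
        length = len(tags.split())       # number of words inside the bracket
        result.append((label, words[position:position + length]))
        position += length
    return result

print(constituents("[dem=clitic n]NP [n]NP", "thanthu=biya wajgany wirib-ni.."))
# [('NP', ['thanthu=biya', 'wajgany']), ('NP', ['wirib-ni..'])]
```

The dependence on a fixed label set and a fixed bracketing scheme is visible even in this small sketch, which illustrates why such annotation is harder to revise than glosses or tags.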
4.3. Grammatical notes
While a consistent annotation of grammatical structure will prove impracti-
cal for many if not most documentation projects, the annotator may well
wish to highlight particularly good or relevant (or indeed, problematic)
examples of certain constructions by adding keywords or even a more full-
fledged commentary on the structure in question (see also Section 3.2 of
Chapter 12). If keywords are used, it is advisable to apply these consis-
tently (i.e. to employ a controlled vocabulary) in order to facilitate later
searches; ideally, the items in the list will also be commented on in the
sketch grammar, or at least in a glossary accompanying the documentation.
Grammatical notes of this nature greatly aid the production of a sketch
grammar and/or a comprehensive reference grammar, either by the original
annotators or by later users of the documentation.
In the examples below, the tier labelled \grn contains grammatical notes
of the nature discussed above. In (19), the description 'case marking:
ablative agent' is intended to alert the user to the (rare) phenomenon of
agent marking with the ablative (rather than ergative) case. A user who
had to rely on a search for the ablative (e.g. by looking for the gloss
ABL) would have to go through a set of results in which at least 95% of
the examples show the ablative in its more common function of indicating
a spatial source.
(19) Use of grammatical descriptors (Jaminjung example)
\mo mugmug-ngunyi ngayirr gan-arra-m
\it owl-ABL peep 3SG.A:3SG.P-put-PRS
\ft the owl is looking down at him
\grn case marking: ablative agent
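The difference between searching for a gloss and searching for a keyword from a controlled vocabulary can be sketched as follows. Only the first record is taken from example (19); the other two records, the vocabulary itself, and all function names are invented placeholders for illustration and are not Jaminjung data.

```python
import re

# Hypothetical controlled vocabulary for the \grn tier.
CONTROLLED_VOCABULARY = {"case marking: ablative agent"}

RECORDS = [
    # From example (19):
    {"it": "owl-ABL peep 3SG.A:3SG.P-put-PRS",
     "grn": "case marking: ablative agent"},
    # Invented placeholder: ablative as spatial source, no note.
    {"it": "place-ABL come", "grn": ""},
    # Invented placeholder: a note that is not in the vocabulary (misspelled).
    {"it": "person-ABL hit", "grn": "ablativ agent"},
]

def with_gloss(records, label):
    """Indices of records whose gloss tier contains the label."""
    return [i for i, r in enumerate(records)
            if label in re.split(r"[ \-=.:]", r["it"])]

def with_note(records, keyword):
    """Indices of records whose \\grn tier equals the keyword."""
    return [i for i, r in enumerate(records) if r.get("grn") == keyword]

def invalid_notes(records, vocabulary):
    """Indices of records whose non-empty note is not in the vocabulary."""
    return [i for i, r in enumerate(records)
            if r.get("grn") and r["grn"] not in vocabulary]

print(with_gloss(RECORDS, "ABL"))                          # [0, 1, 2]
print(with_note(RECORDS, "case marking: ablative agent"))  # [0]
print(invalid_notes(RECORDS, CONTROLLED_VOCABULARY))       # [2]
```

The last function shows why a controlled vocabulary matters: a free-form or misspelled note is silently missed by the keyword search unless it is flagged first.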
The use of a grammatical notes tier can be extended to semantic notes,
i.e. highlighting examples that are of particular relevance for the semantic
description and lexicographic treatment of a given lexical item (see also
Chapter 6). In some cases, like that illustrated in (20), this may border on
ethnographic commentary, as discussed in Chapter 8.
(20) Use of grammatical descriptors for semantic description (Waimaa
example, Waimaa DoBeS team)
\mo tou hile thunu la udo-wai gai/
\it PTL again bake at rain maybe
\ft (let me know) when you again make a sacrifice for (calling)
\grn /thunu/ 'bake' is also widely used for ceremonies and festivities
of all kinds, including making a sacrifice or having a
party. Malay bakar 'bake' is used in the same way in local
A tier dedicated to grammatical notes can further be used to document
grammaticality judgments elicited by means of variations of the utterance
in question, e.g. when the fieldworker deliberately changes the word order,
case inflection, or other aspects of an attested utterance in order to ascertain
whether this will or will not be accepted by native speakers. For example,
in (21) the descriptor tier indicates that I have inquired about the
possibility of using the verb -inama 'do with foot' in the context of
closing a car door with one's foot (described using a different verb in
the attested example) but that this was not accepted by the speaker whose
initials are given in parentheses.
(21) Use of grammatical descriptors for grammaticality judgments
(Jaminjung example)
\mo jubard gan-arra-m wirlga-ni
\it shut 3SG.A:3SG.P-put-PRS foot-ERG/INSTR
\ft she shuts it with her foot
\cc car door
\grn verb: * -inama 'do with foot' (JM)
5. Metacommentaries (notes and questions)
In the actual practice of annotating a recorded speech event, the annotator
will often wish to add notes or metacommentaries on some aspects of the
annotation. Often, these will appear in the form of questions, e.g. when a
certain lexeme is expected on the basis of the translation and the context
but can only be imperfectly recognized in the acoustic signal, or when the
annotator is unsure of the contextual relevance of the utterance. Such ques-
tions may or may not be resolved in later stages of the annotation process.
Their inclusion in the annotation greatly helps the annotator(s) to system-
atically check for open questions at a later stage. If the problems cannot be
solved, the existence of a note to this effect also helps later users of the
documentation to interpret the annotation. In the most systematic annota-
tion format imaginable, one would probably employ a separate metacom-
mentary tier accompanying every single annotation tier. In actual practice
though, a single tier for such metacommentaries will be sufficient and more
practicable, since the target of the commentary is usually clear. In
example (22), both the note in the metacommentary tier (labelled \qu) and
the question marks in the interlinear gloss (\it) and translation (\ft)
tiers point to an uncertainty in the transcription of the verb: the
expected imperfective form of the verb would be ganngarnanyi, but the
transcribed form is gannginyi.
(22) Use of metacommentary tier (Jaminjung example)
\mo thanthiya=biya gan-nginyi=yirrag
\it DEM=SEQ 3SG.A:1.P-give?:IMPF?=1PL.EXCL.OBL
\ft that one she gave to me (?)
\qu ganngarnanyi??
Example (23) illustrates the use of a metacommentary tier for noting
metalinguistic commentaries of speakers on an utterance. Strictly
speaking, this would not be necessary if the whole discussion had been
recorded and transcribed (see also Section 6).
(23) Metalinguistic information as metacommentary (Jaminjung example)
\mo ning nga-jga-ny nganju
\it 1SG-go-PST tendon
\ft I tore my tendon
\qu some dispute as to whether ning or bag 'break' was correct;
MW said ning spontaneously but eventually agreed to bag
Notes may also include any commentary on an aspect of the recording that
is not systematically incorporated into the annotation, for example when
prosodic information (see Section 2.4) is not generally transcribed but the
annotator wishes to indicate that a particular word was spoken with extra
high pitch.
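Checking systematically for open questions at a later stage amounts to collecting every unit that has a metacommentary tier. The sketch below parses backslash-coded records like those in (19), (22) and (23) into Python dictionaries and lists the ones with a \qu tier. The parser is a simplified assumption written for this illustration (one tier per line, continuation lines appended) and not a full reader for any particular file format.

```python
def parse_record(text):
    """Turn lines of the form '\\label content' into a {label: content} dict."""
    record = {}
    label = None
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("\\"):
            label, _, content = line[1:].partition(" ")
            record[label] = content.strip()
        elif label is not None and line:
            record[label] += " " + line   # continuation of the previous tier
    return record

def open_questions(records):
    """Return (index, note) for every record that has a \\qu tier."""
    return [(i, r["qu"]) for i, r in enumerate(records) if r.get("qu")]

RAW = [
    r"""
\mo thanthiya=biya gan-nginyi=yirrag
\ft that one she gave to me (?)
\qu ganngarnanyi??
""",
    r"""
\mo mugmug-ngunyi ngayirr gan-arra-m
\ft the owl is looking down at him
""",
    r"""
\mo ning nga-jga-ny nganju
\ft I tore my tendon
\qu some dispute as to whether ning or bag 'break' was correct;
    MW said ning spontaneously but eventually agreed to bag
""",
]

records = [parse_record(raw) for raw in RAW]
for index, note in open_questions(records):
    print(index, note)
```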
6. Cross-referencing
One further type of annotation that can greatly enhance the value of a
documentation is the use of cross-referencing. Cross-referencing can be
employed to indicate the relationship between an original utterance and a
metalinguistic comment related to this utterance, as may arise when a re-
cording is played back to native speakers for clarification. This is illustrated
in (24); the utterance in (24b) is the paraphrase given by a different speaker
during playback of the recording of utterance (24a). Cross-referencing is
achieved here by including the unique reference number of each utterance
in the tier labelled \cf of the corresponding utterance (see also the extensive
illustration of cross-referencing in Chapter 9).
(24) Cross-referencing to the paraphrase of an utterance
(Jaminjung examples)
a. \ref 99_v01_06_756
\sp VP
\mo burnduma-ny=biya jirrama maja=yirram=in=ung
\it 2DU:come-PST=SEQ two thus=two=ERG=CLITIC
\ft the two (crocodiles) came now, both of them like that
\cf 99_FN_433
