Judging Interpretations: Thomas A. Schwandt
Judging Interpretations
Thomas A. Schwandt
NEW DIRECTIONS FOR EVALUATION, no. 114, Summer 2007 © Wiley Periodicals, Inc.
Published online in Wiley InterScience (www.interscience.wiley.com) • DOI: 10.1002/ev.223
12 ENDURING ISSUES IN EVALUATION
and the like that he or she has learned as ways of living and grasping the
world (as expressed by Joseph Rouse, 1987). Third, a consequence of these
two assertions is the notion that if interpretations are always made in a con-
text or background of shared (social) beliefs and practices, it follows that
interpretations are, in an important sense, infused with political and ethical
implications related to matters of power and authority. In other words,
interpretation is not simply an individual cognitive act but a social and
political practice. Clearly, these central principles of a philosophy of inter-
pretivism stand in sharp contrast to what is, more or less, a standard
epistemological account of establishing the objectivity and truthfulness of
claims that we make about the world. On that account, a claim is consid-
ered objective and true to the extent that it is free of any biasing influence
of context or background beliefs and accurately mirrors the way the world
really is.
It is against this backdrop (and fairly fearlessly entering into this
complicated epistemological matter) that Egon Guba and Yvonna Lincoln
offered their thinking on the question of appropriate criteria for judging
evaluations as interpretations. In the chapter that appears on the following
pages (Guba and Lincoln, 1986) and in the two very influential books that
serve as bookends to it (Guba and Lincoln, 1985, 1989), they built an
argument for the way those committed to the interpretive practice of
evaluation could profitably address the difficult problem of demonstrating
the credibility of their interpretations. To their credit, as is apparent in the
following chapters as well as in the aforementioned books, they did not offer
their way of thinking as the last word but rather as an invitation to further
debate and consideration. For those of us who have, in the past twenty
years, subsequently wrestled with the problem of the nature and justifica-
tion of interpretations, their work has remained a touchstone for both
disagreement on the part of some scholars and elaboration and extension
on the part of others in many fields of study. They should be happy that the
invitation they issued has been accepted.
What they describe in the chapter are two approaches to thinking about
the problem of justifying interpretations. One way they characterize as that
of employing trustworthiness criteria, and they describe these criteria as
analogs to “scientific” understandings of conventional notions of internal
validity (credibility), external validity (transferability), reliability (depend-
ability), and objectivity (neutrality). The second way, they argue, is funda-
mentally different, and more aligned with assumptions about interpretations
as socially constructed undertakings with significant implications for the
ways in which we inevitably use those interpretations to continue to go
on with one another (as Wittgenstein might have said)—that is, in making
sense of or understanding one another and subsequently acting with
confidence on those understandings. Thus, they offered a new (and some-
times difficult) language of authenticity criteria—fairness, ontological
authenticity, educative authenticity, and catalytic authenticity.
anthropology, and so on) are an extension of the ways we support the truth-
fulness, honesty, correctness, and actionability of our interpretations in
everyday life. To successfully defend our interpretations we appeal to crite-
ria of both trustworthiness and authenticity. Guba and Lincoln name these
ways in shorthand expressions befitting our ways of thinking about social
scientific practices like evaluation; yet we should not be misled: more importantly, they
have invited us to think more carefully about what judging the credibility
of interpretations actually entails in both our everyday lives and our
professional lives as interpreters of human actions.
References
Bohman, J. F., Hiley, D. R., and Shusterman, R. “Introduction: The Interpretive Turn.”
In D. R. Hiley, J. F. Bohman, and R. Shusterman (eds.), The Interpretive Turn: Philosophy,
Science, Culture (pp. 1–16). Ithaca, N.Y.: Cornell University Press, 1991.
Guba, E. G., and Lincoln, Y. S. Naturalistic Inquiry. Thousand Oaks, Calif.: Sage, 1985.
Guba, E. G., and Lincoln, Y. S. “But Is It Rigorous? Trustworthiness and Authenticity in
Naturalistic Evaluation.” In D. Williams (ed.), Naturalistic Evaluation. New Directions
for Evaluation, no. 30. San Francisco: Jossey-Bass, 1986.
Guba, E. G., and Lincoln, Y. S. Fourth Generation Evaluation. Thousand Oaks, Calif.:
Sage, 1989.
Rouse, J. Knowledge and Power. Ithaca, N.Y.: Cornell University Press, 1987.
Until very recently, program evaluation has been conducted almost exclusively
under the assumptions of the conventional, scientific inquiry paradigm using
(ideally) experimentally based methodologies and methods. Under such
assumptions, a central concern for evaluation, which has been considered a
variant of research and therefore subject to the same rules, has been how to
maintain maximum rigor while departing from laboratory control to work in
the “real” world.
The real-world conditions of social action programs have led to increas-
ing relaxation of the rules of rigor, even to the extent of devising studies looser
than quasi-experiments. Discussions of threats to rigor thus abound in report sections explaining
how, when, and under what conditions the evaluation was conducted so
that the extent of departure from desired levels of rigor might be judged.
Maintaining true experimental or even quasi-experimental designs, meeting
the requirements of internal and external validity, devising valid and reliable
instrumentation, probabilistically and representatively selecting subjects and
assigning them randomly to treatments, and other requirements of sound
procedure have often been impossible to meet in the world of schools
and social action. Design problems aside, the ethics of treatment given and
treatment withheld poses formidable problems in a litigious society (Lincoln
and Guba, 1985b).
We are indebted to Judy Meloy, graduate student at Indiana University, who scoured the
literature for references to fairness and who developed a working paper on which many
of our ideas depend.
Guba and Lincoln, in press). Only a brief reminder about the axioms that
undergird naturalistic and responsive evaluations is given here.
The axiom concerned with the nature of reality asserts that there is no
single reality on which inquiry may converge, but rather there are multiple
realities that are socially constructed, and that, when known more fully, tend
to produce diverging inquiry. These multiple and constructed realities
cannot be studied in pieces (as variables, for example), but only holistically,
since the pieces are interrelated in such a way as to influence all other
pieces. Moreover, the pieces are themselves sharply influenced by the nature
of the immediate context.
The axiom concerned with the nature of “truth” statements demands
that inquirers abandon the assumption that enduring, context-free truth
statements—generalizations—can and should be sought. Rather, it asserts
that all human behavior is time- and context-bound; this boundedness
suggests that inquiry is incapable of producing nomothetic knowledge but
instead only idiographic “working hypotheses” that relate to a given and
specific context. Applications may be possible in other contexts, but they
require a detailed comparison of the receiving contexts with the “thick
description” it is the naturalistic inquirer’s obligation to provide for the
sending context.
The axiom concerned with the explanation of action asserts, contrary
to the conventional assumption of causality, that action is explainable only
in terms of multiple interacting factors, events, and processes that give shape
to it and are part of it. The best an inquirer can do, naturalists assert, is to
establish plausible inferences about the patterns and webs of such shaping
in any given evaluation. Naturalists utilize the field study in part because it
is the only way in which phenomena can be studied holistically and in situ in
those natural contexts that shape them and are shaped by them.
The axiom concerned with the nature of the inquirer-respondent
relationship rejects the notion that an inquirer can maintain an objective
distance from the phenomena (including human behavior) being studied, sug-
gesting instead that the relationship is one of mutual and simultaneous influ-
ence. The interactive nature of the relationship is prized, since it is only because
of this feature that inquirers and respondents may fruitfully learn together. The
relationship between researcher and respondent, when properly established, is
one of respectful negotiation, joint control, and reciprocal learning.
The axiom concerned with the role of values in inquiry asserts that far
from being value-free, inquiry is value-bound in a number of ways. These
include the values of the inquirer (especially evident in evaluation, for exam-
ple, in the description and judgment of the merit or worth of an evaluand),
the choice of inquiry paradigm (whether conventional or naturalistic, for
example), the choice of a substantive theory to guide an inquiry (for exam-
ple, different kinds of data will be collected and different interpretations
made in an evaluation of a new reading series, depending on whether the eval-
uator follows a skills or a psycholinguistic reading theory), and contextual
values (the values inhering in the context, and which, in evaluation, make a
remarkable difference in how evaluation findings may be accepted and used).
In addition, each of these four value sources will interact with all the others
to produce value resonance or dissonance. To give one example, it would be
equally absurd to evaluate a skills-oriented reading series naturalistically as
it would to evaluate a psycholinguistic series conventionally because of the
essential mismatch in assumptions underlying the reading theories and
the inquiry paradigms.
It is at once clear, as Morgan (1983) has convincingly shown, that the
criteria for judging an inquiry themselves stem from the underlying para-
digm. Criteria developed from conventional axioms and rationally quite
appropriate to conventional studies may be quite inappropriate and even
irrelevant to naturalistic studies (and vice versa). When the naturalistic
axioms just outlined were proposed, there followed a demand for develop-
ing rigorous criteria uniquely suited to the naturalistic approach. Two
approaches for dealing with these issues have been followed.
Parallel Criteria of Trustworthiness. The first response (Guba, 1981;
Lincoln and Guba, 1985a) was to devise criteria that parallel those of the
conventional paradigm: internal validity, external validity, reliability, and
objectivity. Given a dearth of knowledge about how to apply rigor in the
naturalistic paradigm, using the conventional criteria as analogs or
metaphoric counterparts was a possible and useful place to begin. Further-
more, developing such criteria built on the two-hundred-year experience of
positivist social science.
These criteria are intended to respond to four basic questions (roughly,
those concerned with truth value, applicability, consistency, and neutrality),
and they can also be answered within naturalism’s bounds, albeit in different
terms. Thus, we have suggested credibility as an analog to internal validity,
transferability as an analog to external validity, dependability as an analog to
reliability, and confirmability as an analog to objectivity. We shall refer
to these criteria as criteria of trustworthiness (itself a parallel to the term rigor).
Techniques appropriate either to increase the probability that these
criteria can be met or to actually test the extent to which they have been
met have been reasonably well explicated, most recently in Lincoln and
Guba (1985a). They include:
For credibility:
For transferability:
actual belief system that undergirds a position on any given issue is not
always an easy task, but exploration of values when clear conflict is evident
should be part of the data-gathering and data-analysis processes (especially
during, for instance, the content analysis of individual interviews).
The second step in achieving the fairness criterion is the negotiation of
recommendations and subsequent action, carried out with stakeholding
groups or their representatives at the conclusion of the data-gathering,
analysis, and interpretation stages of the evaluation effort. These three stages
are in any event simultaneous and interactive within the naturalistic para-
digm. Negotiation has as its basis constant collaboration in the evaluative
effort by all stakeholders; this involvement is continuous, fully informed
(in the consensual sense), and operates between true peers. The agenda for
this negotiation (the logical and inescapable conclusion of a true collabo-
rative evaluation process), having been determined and bounded by all
stakeholding groups, must be deliberated and resolved according to rules of
fairness. Among the rules that can be specified, the following seem to be the
absolute minimum.
that are more ontologically authentic. It is also essential that they come to
appreciate (apprehend, discern, understand)—not necessarily like or agree
with—the constructions that are made by others and to understand how those
constructions are rooted in the different value systems of those others. In this
process, it is not inconceivable that accommodations, whether political, strate-
gic, value-based, or even just pragmatic, can be forged. But whether or not that
happens is not at issue here; what the criterion of educative validity implies
is increased understanding of (including possibly a sharing, or sympathy
with) the whats and whys of various expressed constructions. Each stakeholder
in the situation should have the opportunity to become educated about others
of different persuasions (values and constructions), and hence to appreciate
how different opinions, judgments, and actions are evoked. And among those
stakeholders will be the evaluator, not only in the sense that he or she will
emerge with “findings,” recommendations, and an agenda for negotiation that
are professionally interesting and fair but also that he or she will develop a
more sophisticated and complex construction (an emic-etic blending) of both
personal and professional (disciplinary-substantive) kinds.
How one knows whether or not educative authenticity has been reached
by stakeholders is unclear. Indeed, in large-scale, multisite evaluations, it may
not be possible for all—or even for more than a few—stakeholders to achieve
more sophisticated constructions. But the techniques for ensuring that stake-
holders do so even in small-scale evaluations are as yet undeveloped. At a
minimum, however, the evaluator’s responsibility ought to extend to ensuring
that those persons who have been identified during the course of
the evaluation as gatekeepers to various constituencies and stakeholding
audiences have the opportunity to be “educated” in the variety of
perspectives and value systems that exist in a given context.
By virtue of the gatekeeping roles that they already occupy, gatekeepers
have influence over and access to members of stakeholding audiences.
As such, they can act to increase the sophistication of their respective
constituencies. The evaluator ought at least to make certain that those from
whom he or she originally sought entrance are offered the chance to
enhance their own understandings of the groups they represent. Various
avenues for reporting (slide shows, filmstrips, oral narratives, and the like)
should be explored for their profitability in increasing the consciousness
of stakeholders, but at a minimum the stakeholders’ representatives and
gatekeepers should be involved in the educative process.
Catalytic Authentication. Reaching new constructions, achieving under-
standings that are enriching, and achieving fairness are still not enough.
Inquiry, and evaluations in particular, must also facilitate and stimulate
action. This form of authentication is sometimes known as feedback-action
validity. It is a criterion that might be applied to conventional inquiries and
evaluations as well, although if it were, virtually all positivist social action
inquiries and evaluations would fail on it. The call for getting “theory into
action”; the preoccupation in recent decades with “dissemination” at the
Summary
All five of these authenticity criteria clearly require more detailed explica-
tion. Strategies or techniques for meeting and ensuring them largely remain
to be devised. Nevertheless, they represent an attempt to meet a number of
criticisms and problems associated with evaluation in general and natural-
istic evaluation in particular. First, they address issues that have pervaded
evaluation for two decades. As attempts to meet these enduring problems,
they appear to be as useful as anything that has heretofore been suggested
(in any formal or public sense).
Second, they are responsive to the demand that naturalistic inquiry
or evaluation not rely simply on parallel technical criteria for ensuring
reliability. While the set of additional authenticity criteria might not be the
complete set, it does represent what might grow from naturalistic inquiry
were one to ignore (or pretend not to know about) criteria based on the
conventional paradigm. In that sense, authenticity criteria are part of an
inductive, grounded, and creative process that springs from immersion in
naturalistic ontology, epistemology, and methodology (and the concomitant
attempts to put those axioms and procedures into practice).
Third, and finally, the criteria are suggestive of the ways in which new
criteria might be developed; that is, they are addressed largely to ethical and
ideological problems, problems that increasingly concern those involved in
social action and in the schooling process. In that sense, they are confluent
with an increasing awareness of the ideology-boundedness of public life and
the enculturation processes that serve to empower some social groups
and classes and to impoverish others. Thus, while at first appearing to be
radical, they are nevertheless becoming mainstream. An invitation to join
the fray is most cheerfully extended to all comers.
References
Guba, E. G. “Criteria for Assessing the Trustworthiness of Naturalistic Inquiries.”
Educational Communication and Technology Journal, 1981, 29, 75–91.
Guba, E. G., and Lincoln, Y. S. “Do Inquiry Paradigms Imply Inquiry Methodologies?” In
D. L. Fetterman (Ed.), The Silent Scientific Revolution. Beverly Hills, Calif.: Sage, in press.
Guba, E. G., and Lincoln, Y. S. Effective Evaluation: Improving the Usefulness of Evaluation
Results Through Responsive and Naturalistic Approaches. San Francisco: Jossey-Bass, 1981.
Guba, E. G., and Lincoln, Y. S. “The Countenances of Fourth Generation Evaluation:
Description, Judgment, and Negotiation.” Paper presented at Evaluation Network
annual meeting, Toronto, Canada, 1985.
House, E. R. “Justice in Evaluation.” In G. V. Glass (Ed.), Evaluation Studies Review
Annual, no. 1. Beverly Hills, Calif.: Sage, 1976.
Lehne, R. The Quest for Justice: The Politics of School Finance Reform. New York:
Longman, 1978.
Lincoln, Y. S., and Guba, E. G. Naturalistic Inquiry. Beverly Hills, Calif.: Sage, 1985a.
Lincoln, Y. S., and Guba, E. G. “Ethics and Naturalistic Inquiry.” Unpublished manu-
script, University of Kansas, 1985b.
Morgan, G. Beyond Method: Strategies for Social Research. Beverly Hills, Calif.: Sage, 1983.
Strike, K. Educational Policy and the Just Society. Champaign: University of Illinois
Press, 1982.