Artigo Kintsch
English texts were constructed from propositional bases. One set of 16-
word sentences was obtained from semantic bases containing from 4 to 9
propositions. For another set of sentences and paragraphs, number of words
and number of propositions covaried. Subjects read the texts at their own
rate and recalled them immediately. For the 16-word sentences, subjects
needed 1.5 sec additional reading time to process each proposition. For
longer texts, this value increased. In another experimental condition reading
time was controlled by the experimenter. The analysis of both the text
and the recall protocols in terms of number of propositions lent support to
the notion that propositions are a basic unit of memory for text. However,
evidence was also obtained that while the total number of propositions upon
which a text was based proved to be an effective psychological variable, not
all propositions were equally difficult to remember: superordinate
propositions were recalled better than propositions which were structurally
subordinate.
¹This research was supported by a Grant from the National Institute of Mental
Health, MH-15872. The research findings reported in this paper were presented at the
Meeting of the Psychonomic Society, St. Louis, 1972. Reprint requests should be sent
to: Dr. Walter Kintsch, Department of Psychology, University of Colorado, Boulder,
Colorado 80302.
Copyright © 1973 by Academic Press, Inc.
All rights of reproduction in any form reserved.
258 KINTSCH AND KEENAN
variables of interest for the present study. Other subjects, however, were
given only a limited reading time. Their recall was compared with that
of unrestricted subjects, and a processing model that captures some of
the most salient features of both sets of data was developed.
METHOD
Subjects. In the free reading time condition, 29 undergraduates from
the University of Colorado served as subjects. They were fulfilling part
of a course requirement. For the restricted reading time condition, 44
students were each paid $2.00 for their participation.
Material. Two sets of materials were constructed for this experiment.
Set A consisted of 10 sentences which were 16-17 words long counting
punctuation, 1616 otherwise. The sentences were not related to each
other, most of them dealing with topics from classical history. This choice
of topics was made in an attempt to hold the relative familiarity of the
text at a minimum, while avoiding problems of vocabulary concomitant
with equally unfamiliar but more technical material.
Although word length in these sentences was fairly strictly controlled,
the number of propositions upon which each sentence was based varied
between four and nine. Two sample sentences, together with the proposi-
tions from which they were constructed and the hierarchical relationships
which exist among these propositions, are shown in Table 1. The analysis
into propositions was made according to the theory described in Kintsch
TABLE 1
VIII. Cleopatra's downfall lay in her foolish trust in the fickle political
figures of the Roman world.
1 (BECAUSE, α, β)
2 (FELL DOWN, CLEOPATRA) = α
3 (TRUST, CLEOPATRA, FIGURES) = β
4 (FOOLISH, TRUST)
5 (FICKLE, FIGURES)
6 (POLITICAL, FIGURES)
7 (PART OF, FIGURES, WORLD)
8 (ROMAN, WORLD)
Hierarchy: 1 → 2; 1 → 3; 3 → 4; 3 → 5 → 6; 3 → 7 → 8
RESULTS
Sentence Set A. The subjects’ recall protocols for the 10 sentences in
Set A were scored for propositional recall. For each protocol it was
determined which of the propositions were recalled. Paraphrases of the
original wording were accepted as correct, as long as the propositional
meaning was accurately expressed. If a subject made an error in a super-
ordinate proposition which then reappeared in a subordinate proposition,
which was otherwise correct, the subordinate proposition was accepted
as correctly recalled, while the superordinate proposition itself was
scored as incorrect. For example, suppose a subject recalled Sentence I
as Romulus took the Sabine cities by force. Proposition 1 was scored as
incorrect because of the substitution of cities for women, but Proposition
4 was scored as correct. Although the subject had made an error in the
FIG. 1. Mean reading time for the sentences of Set A as a function of the number
of propositions in the base structure of the sentences as presented, and as a function
of the number of propositions recalled by the subjects, together with least square
lines. The standard error for the points shown ranges from .75 to 2.10 sec, since
points are based on differing numbers of observations, and averages 1.15 sec.
PROPOSITIONAL BASE AND READING 263
FIG. 2. Mean number of propositions presented and recalled for the sentences of
Set A. The predictions for the free reading data are shown by the broken line (for
explanation see text). The standard error for free reading is .22 propositions and
.17 propositions for the 5 sec reading time.
limited reading time. The free reading data are partly implied by the
results already discussed: Equations (1) and (2) can be combined to
obtain a relationship between the number of propositions presented and
recalled. This relationship is shown as the broken line in Fig. 2. Obvi-
ously, it describes the actual data quite well, although its slope (.64)
somewhat underestimates the least squares value (.69). Number of
propositions presented and mean number recalled correlate r = .91 in the
data shown. This high value is not quite matched by the data from the
restricted reading condition where an r = .74 was obtained. When reading
time was limited, recall was not as good as when reading time was
free, and this difference was greatest for the most difficult sentences, that
is, those based upon a large number of propositions. However, the same
kind of relationship between propositions presented and propositions
recalled appears to hold for both conditions.
When reading was self-paced, the subjects recalled 86% of the proposi-
tions correctly. Given the design of the present experiment, one cannot,
of course, tell the cause of errors in this experiment: it may be a failure
of processing, or forgetting, or a combination of both. Therefore, only
correct responses will be discussed here. The percentage of propositions
recalled of those actually presented was independent of either the total
number of propositions in each sentence, the total number of terms
appearing in the propositions for each sentence, or the reading time for
that sentence. However, which propositions were recalled was by no
means random. The hierarchical relationships among the propositions in
each sentence were a powerful determinant of recall. There are two
obvious ways to quantify these hierarchies. One is to consider the rank
of each proposition in a sentence, with the most superordinate proposi-
tion assigned rank 0, the immediately subordinate propositions rank 1,
etc. Thus, in Sentence VIII of Table 1, Proposition 1 would have
rank 0; Propositions 2 and 3 would have rank 1; Propositions 4, 5, and
7 would have rank 2; and Propositions 6 and 8 would be assigned rank 3.
The likelihood of an error was computed for all propositions as a function
of rank thus defined and is shown in Fig. 3. Of course, propositions of
high rank must necessarily come from sentences with many propositions.
In order to avoid possible selection effects, the propositions of each
sentence were divided into two classes: low and high rank. When rank
was Vincentized in this way, the decrease in the likelihood of a correct
recall was still correlated with rank: recall was 86% for the superordinate
propositions and 74% for the subordinate propositions.
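
The rank assignment described above can be sketched in a few lines. This is only an illustration, not the authors' scoring procedure: the parent-to-child edges are an assumed reading of the hierarchy for Sentence VIII in Table 1, and the names `CHILDREN` and `proposition_ranks` are hypothetical.

```python
from collections import deque

# Assumed parent -> children edges for Sentence VIII (read from Table 1).
CHILDREN = {1: [2, 3], 3: [4, 5, 7], 5: [6], 7: [8]}


def proposition_ranks(children, root=1):
    """Assign rank 0 to the most superordinate proposition and
    rank + 1 to each immediately subordinate proposition."""
    ranks = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            ranks[child] = ranks[node] + 1
            queue.append(child)
    return ranks


if __name__ == "__main__":
    for prop, rank in sorted(proposition_ranks(CHILDREN).items()):
        print(f"Proposition {prop}: rank {rank}")
```

Under these assumed edges the sketch reproduces the ranks stated in the text: Proposition 1 at rank 0, Propositions 2 and 3 at rank 1, Propositions 4, 5, and 7 at rank 2, and Propositions 6 and 8 at rank 3.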
A second way of quantifying propositional hierarchies is to count the
number of descendants for each proposition. For example, in Sentence
VIII, Proposition 1 has 7 descendants, Proposition 3 has 5 descendants,
[Figure: vertical axis scaled 60 to 100, horizontal axis RANK]
FIG. 4. Mean reading times for the 20 sentences of Set B as a function of the num-
ber of propositions recalled. The data for each sentence are shown as Vincent curves.
processed. Note that it is not necessary that all of the a sec for general
analysis precede the processing of separate propositions.
Now let us assume that the processing times for each proposition are
exponentially distributed with rate 1/b = λ. This assumption is chosen
merely because it is the simplest one known for workable reaction time
models. The total processing time for Sentence i would then be composed
of a constant plus a gamma-distributed component with parameters λ and P_i.
However, since the subjects may, and do, stop reading at any time, the
observable reading time distributions are truncated in unknown ways.
This problem could be avoided by considering only the reading times
for perfectly recalled sentences, but in the present data, the frequency
of such events is much too small to permit an analysis of reading time
distributions.
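
A small simulation may help make the untruncated version of this assumption concrete. It is a sketch under stated assumptions: the intercept `a = 6.0` sec and the per-proposition mean of 1.5 sec are illustrative values only, the function names are hypothetical, and the truncation caused by subjects stopping early is deliberately not modeled.

```python
import random


def simulate_reading_time(n_props, a, rate, rng):
    """One simulated reading time: a constant a plus the sum of n_props
    exponential processing times (a gamma-distributed component)."""
    return a + sum(rng.expovariate(rate) for _ in range(n_props))


def mean_reading_time(n_props, a, rate, n_trials=20000, seed=0):
    """Monte Carlo estimate of the mean reading time; theory gives a + n_props / rate."""
    rng = random.Random(seed)
    total = sum(simulate_reading_time(n_props, a, rate, rng) for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    a, rate = 6.0, 1 / 1.5  # illustrative values, not fitted parameters
    for n_props in (4, 6, 8):
        print(n_props, round(mean_reading_time(n_props, a, rate), 2), a + n_props / rate)
```

The simulated means should lie close to a + P/λ, that is, a straight line in the number of propositions, which is the form of relationship the model takes over from the reading-time data.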
A more practical approach may be taken by noting that if a subject
reads a sentence based upon P propositions for time (t - a) with rate λ,
the number x of propositions that he will have processed during this
time is a Poisson variable with parameter λ(t - a). But by Eq. (3) and
the definition of λ, λ(t - a) = P, so that

\[ \Pr(x; P) = \frac{e^{-P} P^{x}}{x!} \quad \text{if } x < P, \]
\[ \Pr(x = P; P) = \sum_{z=P}^{\infty} \frac{e^{-P} P^{z}}{z!}. \]

\[ \Pr_i(z; \tau(5 - c)) = \frac{e^{-\tau(5 - c)} \, ((5 - c)\tau)^{z}}{z!} \quad \text{if } z < P_i \tag{5a} \]
\[ \Pr_i(z = P_i; \tau(5 - c)) = \sum_{z=P_i}^{\infty} \frac{e^{-\tau(5 - c)} \, ((5 - c)\tau)^{z}}{z!} \tag{5b} \]
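
The Poisson recall probabilities can be computed directly. The following is a minimal sketch, assuming the Poisson form described in the text, with all probability mass at or beyond the number of propositions in the sentence lumped into the "complete recall" outcome. The function names are hypothetical; the parameter value 4.2 is the estimate of τ(5 - c) quoted in the text for the 5 sec condition.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def recall_distribution(n_props, mean):
    """Probabilities of processing 0, 1, ..., n_props propositions.
    The last entry collects the whole upper tail (complete recall)."""
    probs = [poisson_pmf(x, mean) for x in range(n_props)]
    probs.append(1.0 - sum(probs))
    return probs


if __name__ == "__main__":
    for n_props in range(4, 10):
        complete = recall_distribution(n_props, 4.2)[-1]
        print(f"P = {n_props}: Pr(complete recall) = {complete:.3f}")
```

With the parameter held fixed, the probability of complete recall falls as the number of propositions rises, which is the qualitative pattern the model predicts for the restricted reading condition.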
[Figure: horizontal axis NUMBER OF PROPOSITIONS, 0 to 9]
FIG. 7. Predicted (lines) and observed (circles and crosses) recall probabilities
for sentence Set A, 5 sec reading time. The partial recall data are averaged over all
sentences. The complete recall data are based upon either one or two sentences and
have been smoothed by taking running triplet means.
Note, however, that the prediction (and observation) that the likelihood
of complete recall decreases as the number of underlying propositions
increases in a sentence agrees loosely with results reported by Perfetti
(1969). Perfetti found that sentences with a high lexical density, a
statistic correlated with though not identical with number of proposi-
tions, were recalled more poorly than sentences with low lexical density.
There are many other procedural and scoring differences between Per-
fetti’s study and the present experiment, but the consistency of the results
should not be overlooked.
From τ(5 - c) = 4.2 one can obtain somewhat speculative estimates
of c and τ, if one is willing to assume that the relationship between slope
and intercept which was observed in Fig. 1 holds true even if reading
time is restricted. Thus, if the intercept is about 4 times the slope, τ
becomes 1.64, which compares with λ = .66 for the free-reading data. In
other words, while the subjects took about 1.5 sec per proposition if
reading time was unlimited, only .61 sec per proposition was needed
when each sentence was exposed for 5 sec.
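
The arithmetic behind these estimates can be spelled out as follows. This is a worked sketch rather than part of the original derivation; it assumes, as the text does, that the intercept c is four times the slope, and it takes the slope in the restricted condition to be 1/τ sec per proposition.

```latex
\begin{align*}
c &= 4 \cdot \frac{1}{\tau} && \text{(assumed: intercept is 4 times the slope)} \\
\tau(5 - c) &= 5\tau - 4 = 4.2 \\
\tau &= \frac{8.2}{5} = 1.64 \\
\frac{1}{\tau} &\approx 0.61 \text{ sec per proposition}, \qquad
\frac{1}{\lambda} = \frac{1}{.66} \approx 1.5 \text{ sec per proposition}
\end{align*}
```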
Clearly, no information processing model will be satisfactory unless
differences between propositions are taken into account. However, this
limitation, which is inherent not only in the model but in the whole
design of the present study, should not detract from the positive accom-
plishments: number of propositions, although by no means the whole
story, has proven to be a useful independent variable for the analysis of
both reading time and recall data. One of the next steps will be to
explicitly account for at least some differences among propositions. In-
deed, although this experiment was not designed for this purpose, some
of the subsidiary results discussed above have already provided interest-
ing clues about some characteristics of propositions that are probably
important. It has been noted that, in general, superordinate propositions
were recalled better than subordinate propositions. One way to in-
corporate this finding into a model would be to describe a noticing or
processing order for propositions which depends upon their hierarchical
structure. However, rather than elaborate the model at present, it
may be a better research strategy to design experimental studies to
explore these problems further. There are, of course, other aspects of
propositions not related to their hierarchical structure that were found
to be important determinants in recall. Propositions appearing in the
surface structure as modifiers were very poorly recalled, while proposi-
tions which contained a proper name were extremely well recalled. In
general, it must be remembered that although number of propositions
had a large effect upon reading times, it could account for only 21%
of their variance. That leaves quite a bit of room for other factors,