The author has thought about working memory, not always by that name, since 1969 and has conducted research on its infant and child development since the same year that the seminal work of Baddeley and Hitch (1974) was published. The present article assesses how the field of working memory development has been influenced since those years by major theoretical perspectives: empiricism (along with behaviorism), nativism (along with modularity), cognitivism (along with constructivism), and dynamic systems theory. The field has not fully discussed the point that these theoretical perspectives have helped to shape different kinds of proposed working memory systems, which in turn have deeply influenced what is researched and how it is researched. Here I discuss that mapping of theoretical viewpoints onto assumptions about working memory and trace the influence of this mapping on the field of working memory development. I illustrate where these influences have led in my own developmental research program over the years.

Keywords: Working memory development; Empiricism; Neo-Piagetian theory; Memory decay; Working memory capacity

What is now termed working memory has interested me since I read sundry things: a technical summary of research on dreams in 1969, just after high school; Hebb (1949) and a little of William James’ work, in college; and other cognitive and developmental research. By working memory, I mean the small amount of information held in mind and used in cognitive tasks. Definitions of working memory vary; nine of them in the research literature were noted by Cowan (2017a). As in some of those definitions, I will sometimes refer to short-term retention to mean the same as working memory. Here I share observations of what may have shaped research on working memory development and its role in cognition for the past half century, in particular three overarching views: empiricism, related to behaviorism; nativism, related to modularity; and cognitivism, related to constructivism. I contrast the three views and related concepts to discern their distinctive features (cf. Gibson, 1969; Gibson and Pick, 2000) and implications for research, and also discuss the role of dynamic systems theory.

The term working memory can be seen as a product of the emerging cognitive revolution in the 1950s, starting with the use of this term to refer to temporarily accessible sets of critical information needed to allow computers to solve geometric proofs (Newell & Simon, 1956). The term was soon afterward used in a similar manner for the information needed for human planning and problem-solving (Miller, Galanter, & Pribram, 1960) before it was widely used to imply a multicomponent system for short-term retention during the processing of the information just stored (after Baddeley & Hitch, 1974).

There have been debates about whether the stark distinction between empiricism and nativism is defunct (Spencer et al., 2009) or inevitably important for research (Spelke & Kinzler, 2009). For the area of working memory development, it seems useful to consider how such basic influences have shaped research and how they can be integrated into an adequate modern view, along with cognitivist, constructivist, and dynamic systems perspectives that heavily rely on both kinds of influences.

Although most of the research on working memory has used abstract stimuli to reduce the effects of knowledge and highlight basic processes, the field of working memory development can hardly be considered to be removed from the rest of cognitive development. There are strong correlations between working memory capacity and cognitive abilities in children, including those with and those without learning challenges (e.g., Cowan et al., 2017; Geary, Hoard, Byrd-Craven, & DeSoto, 2004; Gray et al., 2017; Slattery, Ryan, Fortune, & McAvinue, 2021; for reviews see Cowan, 2010, 2014). We need to know more to understand why these correlations exist. There have also been some developmental studies in which memory for everyday objects was used (e.g., Forsberg, Guitard, Adams, Pattanakul and Cowan, 2021b) and studies of the development of memory for spoken sentences (e.g., Gilchrist, Cowan, &
Naveh-Benjamin, 2009). These show age trends that are generally comparable with the studies using more abstract materials, though knowledge clearly also contributes to performance. Several studies have examined working memory for instructions given in schools, highlighting the importance of working memory in predicting children’s scholastic success (Holmes et al., 2014; Jaroslawska, Gathercole, Logie, & Holmes, 2016). In children, the ability to remember what needs to be done at the correct time, or prospective memory, also depends on working memory availability (Cheie, MacLeod, Miclea, & Visu-Petra, 2017). At least in the adult literature, as well, arrays of information can be compressed and simplified, taking up less space, to the extent that individuals can use knowledge and formulate rules to find patterns in the stimuli (e.g., Brady & Alvarez, 2015; Chekaf, Cowan, & Mathy, 2016; Hollingworth, 2004; Jiang, Olson, & Chun, 2000), and more developmental work of that sort is needed to understand the practical limits of working memory in everyday activities. In sum, in many ways improving our theoretical understanding of working memory development has considerable importance for practical issues of education and child neuropsychology.

Below, I describe how multiple theoretical views can influence research in one area, working memory development, and then document the main influences that appear to have occurred, related to each view. My own research program is used to illustrate the evolution of the field over the years, and owes the most to the cognitivist view.

1. An overview: multiple influences on research on working memory development

There are several reasons why the theoretical stances to be examined in detail can enlighten our understanding of working memory development. Throughout the long history of working memory research, there have been strong influences of different schools of thought on understanding memory over the short term, which we now call working memory. These influences include the idea of the empiricists that there is no special working memory structure, only general rules of memory; the idea of nativists that there are automatically-operating, inherited neural structures or modules that produce a simple taxonomy of verbal versus non-verbal stimuli; and the idea of cognitivists that working memory integrates some temporary information with long-term knowledge and does so strategically, in ways that depend on what the participant can manage using attention. One can find myriad examples of adult and developmental working memory studies over the years that appear to be influenced by these theoretical starting points. Understanding more about the relevant intellectual traditions should make us better able to interpret the findings and point to a good future path of research on working memory development.

Table 1 summarizes the tenets and potential implications of seven views under consideration, to be explained in turn later. Fig. 1 is a snapshot illustrating what issues may be at stake, considering three major views for the sake of simplicity. From each theoretical view shown in the figure (left column), there is an expected set of mechanisms of most interest in describing how working memory operates (second column). For each view, there is also a kind of evidence that can be seen as somewhat problematic (third column). Each view leads to a different conception of how working memory is likely to improve during childhood (fourth column) and it is the developmental data regarding these expectations (fifth column) that are highlighted here. The point of this exercise is not to pit the theories against one another. Rather, the perceived urgency of certain experimental manipulations has been influenced by the overarching theories, and it is useful to see this to understand where the field of working memory has gone over the years. The influences go back to before the term working memory came into general use after the seminal article of Baddeley and Hitch (1974).

Table 1
Tenets of the theoretical views under consideration and implications for working memory development.

Theoretical View | Main Tenet of the Theoretical View | Potential Implications for Working Memory Development
Scientific Empiricism | Observing behavior is the sole basis of psychological science. | It is convenient to start with the notion that behavior depends largely on the history of stimulation, not on internal neural differences between individuals.
Behaviorism | The measured relations between stimuli and responses are scientific; mental concepts are not discussed. | General principles of learning and memory can explain remembering in the short term. Developmental improvement of the ability to remember recent events could stem from the learning history of the child.
Nativism | The key factor determining behavior is the genetic factor. | Mechanisms of working memory are likely to arise from innate structures in the brain, with a built-in schedule of maturation.
Modularity | Individuals are endowed with separate mechanisms that have evolved to carry out different mental processes. | The development of working memory is likely to take place according to a relatively immutable developmental schedule. There would be interesting questions about the variability of maturation of these modules between individuals.
Cognitivism | Behavior and other sources of evidence can be combined to yield inferences about internal mental processes. | The most helpful understanding of working memory and its development should come from a description in terms of internal mental processes like activation, storage, attention, mnemonic rehearsal strategies, and the use of self-knowledge to adjust strategies.
Constructivism (of the cognitivist variety) | Ideas in the mind are not solely external to the person and then absorbed; they are largely constructed by the person as thought develops. | As a child matures, the development of ideas improves not only because of increasing knowledge, but also because an improved working memory capacity allows more complex ideas to be grasped and created from its component parts.
Dynamic Systems and Neuroconstructivism | Neural principles can be modeled to show how mental processes can emerge from neural activity, often in a self-organizing manner. | Working memory can develop because increasing knowledge allows neural processing of the input to retain more separate items; improvements in neural processing efficiency also contribute.

Fig. 1. A schematic illustration of the issues at stake and some key developmental evidence. The first column shows three theoretical views; the second column, implications of each view for the structure of working memory; the third column, evidence that presents a difficulty for each type of view; the fourth column, implications of each view for the nature of developmental change; and the fifth column, key evidence about that type of developmental change. (Column headings in the figure: View; Working Memory; Difficult Evidence; Development; Some Key Developmental Results.)

No one denies that people and most animals can experience an event and remember various aspects of that event a few seconds later. However, the way in which this memory performance is interpreted differs a great deal between theorists, according to their interests and beliefs about the mind. The research questions that are thought most worth pursuing also differ according to beliefs. Suppose researchers are presented with the finding that, as children progress through the ages 4-15 years, the length of a series of items that they can repeat, or memory span, increases steadily (which is true, e.g., Gathercole, Pickering, Ambridge, & Wearing, 2004). An empiricist may believe that the most important changes during this age range will be the learning history, and they will want to know the learning history for the items. The emphasis on learning, often to the exclusion of the internal structure and function of the brain, is not logically required by the empiricist view but, when
one focuses on behavior, it would be convenient if there were simple laws relating stimuli to responses, so the effort has been in searching for those simple laws. There may not be many individuals who adhere strictly to the empiricist view, or any one view, but one can point to an effort to find out the role of learning history in terms of the familiarity of items used in working memory tests (e.g., Melby-Lervåg & Hulme, 2010).

A nativist, on the other hand, would focus on whether the basic working memory mechanisms in the brain are inherited, and may have questions stemming from that proposition. Working memory should be operational at birth (e.g., Reznick, Morrow, Goldman, & Snyder, 2004), but how does its improvement with age correspond to biological growth of the brain (e.g., Thomason et al., 2009)? Are there specific modules inherited to carry out different kinds of working memory tasks? Do individuals differ in the quality of what they inherit (e.g., Friedman et al., 2008), and does that difference explain working memory differences between children at any particular maturational level? These types of considerations illustrate that the field of working memory should not be viewed as isolated from more general theoretical initiatives, in this case nativism.

A cognitivist believes that one can talk about working memory in terms of the inferred internal components, a conception of working memory consistent with Baddeley and Hitch (1974). The cognitivist perspective can be compared to the more functionalist use of the term working memory (focusing on what the outcome is, in a somewhat more behaviorist manner) by Miller et al. (1960); their use of the term only referred more vaguely to whatever processes hold or retrieve the information necessary for the task of planning behavior, without inferences about what those mechanisms are. To understand those internal components, the cognitivist emphasis is on what processes change with age, based on a combination of both learning and biological maturation, and on trying to disentangle those processes (Cowan, 2016).

One belief, especially with the constructivist wing of cognitivism, is that the processes taking place within the individual, both obligatory and voluntary, are important in forming and retaining ideas. Whereas it may be nativist to propose that a working-memory component of the brain simply increases its holding capacity with age in childhood, it is more cognitivist and constructivist to consider that, with age, there are increasing strategies such as grouping items together, chunking them to form larger units, mentally refreshing representations with attention, and verbally rehearsing the items. These strategies open the conversation to possibilities of the improved use of attention and executive function with age to achieve the most working memory storage and to use it most effectively for other tasks.

A researcher of dynamic systems theory (Table 1) might, however, check whether there are simpler means to use knowledge together with neural processes with which results can be emergent without having to propose higher-level cognitive processes. There can be an interplay of cognitive behavioral and neural modeling results, creating a tension in which increasingly complex behaviors are documented and in which the dynamic systems approach can be used to see if the proposed cognitive constructions can be pared back while still predicting the observed behaviors. For a summary of one such research program, see Spencer (2020).

2. Theoretical views and implications for working memory developments

The views summarized in Table 1 are mostly grouped into pairs of related views: empiricism and behaviorism, nativism and modularity, and cognitivism and constructivism. The last view, dynamic systems theory, also is related to neuroconstructivism (Westermann, Thomas, & Karmiloff-Smith, 2011) that is not based on abstract symbols like other types of constructivism. The majority of the author’s work that will be discussed comes from a cognitivist and constructivist viewpoint. However, the other views provide an important context for the progression of ideas over the years. This is not to say that the influence of these views on the field was always intentional, but the influence seems clear.

2.1. Working memory development and empiricism: in search of general learning principles

Empiricism is the philosophical notion that our sensory experience is the basis of all knowledge (OxfordLanguages, 2021a). Empiricism as a scientific philosophy in psychology refers to the special importance of observed behavioral evidence. Proponents of behaviorism accepted these ideas and took them to mean that we should study the relation between stimuli and responses, looking for orderly relationships in behavior without trying to divine the nature of processes hidden in the mind and brain. Under the influence of this type of theoretical view,
considerable research was carried out involving remembering information recently presented (e.g., in the field of verbal learning), as well as developmental norms such as those embedded within tests of intelligence. This research strives to determine the most general principles that can account for behavior. Thus, behaviorism led to a verbal learning theory, which was based on the hope that one set of laws can explain learning regardless of the species or the time between encoding and retrieval. This theory is at odds with the notion that working memory depends on some specialized mechanisms, as opposed to one general set of learning mechanisms.

Even though behaviorism is essentially a methodology, it has carried with it an implicit theory of the mind. For example, regarding various psychological research endeavors, Watson (1913, p. 170) stated that “There is no reason why appeal should ever be made to consciousness… Or why introspective data should ever be sought during the experimentation, or published in the results. In experimental pedagogy especially one can see the desirability of keeping all of the results on a purely objective plane. If this is done, work there on the human being will be comparable directly with the work upon animals.” The hope was that one could find a set of laws of behavior that would be generally applicable across species. One can see that implicit theory as being appealing if learning is empiricist, based on simple associations built from sensations to eventually yield more complex behaviors. If the learning mechanism is supposed to be one in which sensory neurons become associated with effector neurons to produce behavior then, with that assumption, the resolution to study just the laws relating stimuli to responses would reasonably follow.

2.1.1. Influence of empiricism and behaviorism in adult research on working memory: against decay and capacity limits

The notion of one set of principles for all learning is opposed to the idea that there are certain qualities of working memory that set it apart from long-term memory. In working memory, the information could be lost because information decays rapidly over time, or because there is a mechanism with a limited capacity. In the empiricist initiatives within verbal learning theory, there were efforts to show that decay and capacity limits are not needed at all.

The issue of decay. An outgrowth of empiricism and behaviorism in adult research was the field of verbal learning, summarized, for example, by Kausler (1974) and Postman (1975) and, in child development, by Keppel (1964) and Goulet (1968). The issue is that when rapid loss of information over time is observed, it could be the result of retroactive interference from subsequent stimuli or thoughts, not rapid decay. In order to prevent rehearsal, researchers often use distracting tasks during the retention interval after a set of stimuli and before recall; but the distracting stimuli might not only tie up attention, but also replace the items to be recalled in currently accessible memory, an example of retroactive interference.

Many rules of learning in animals were found to apply to verbal learning in humans. A fundamental rule of special importance here is that, if one learns an association A-B and later learns an association A-C, the second learning episode will encounter interference from the first, termed proactive interference, that is, interference of the earlier stimuli on memory for the similar, later stimuli. Moreover, the interference is time-dependent, such that A-C is at first predominant but, as time elapses, A-B re-emerges and competes with A-C (the definition of proactive interference). As proactive interference emerges, retroactive interference caused by the A-B association on the retrieval of A-C subsides. That is, if one receives A-B and then A-C, one will at first remember A-C best but, after a delay, there will be a recovery of A-B at the expense of A-C. Thus, contrary to decay, some learning was seen to re-emerge over time (McGeoch, 1932).

Decay across seconds? A challenge to the verbal learning tradition came from research on the rapid decay of information, both in humans (Brown, 1958; Peterson & Peterson, 1959) and in animals (Mishkin & Delacour, 1975; Olton, Collison, & Werz, 1977). The idea was that information from stimuli does not inevitably remain accessible until there is interference with it. Instead, there is said to be a process in which temporarily-available representations of stimuli, or short-term memory traces, undergo a decay process. After a certain amount of decay, the information may still be in a long-term memory store but is no longer readily accessible without a retrieval process that could fail. This distinction between short- and long-term memory was also proposed by Broadbent (1958) in a seminal book published the same year that he became Director of the Applied Psychology Unit in Cambridge, UK. Behaviorism was still in fashion, and that shows in the vocabulary Broadbent used, but the diagram of an information flow that included separate short- and long-term stores, which appeared in a footnote in his book, shows him actually as an early cognitivist.

Proponents of the verbal learning tradition fought back against the notion of so much structure in the mind. As a key example, consider the method of Peterson and Peterson (1959) and a response by Keppel and Underwood (1962). On every trial of the Peterson and Peterson procedure, a trigram of letters, such as CHJ, was to be remembered while the participant counted backward from a given number by threes until receiving a signal to stop counting backward and recall the trigram. It was assumed in the behaviorist approach that letters and numbers are different enough not to interfere with each other much. Yet, there was a severe loss of information about the trigram over 18 s of counting backward. However, Keppel and Underwood (1962) argued against the need for a distinction between short- and long-term memory faculties in this work by Peterson and Peterson, on the basis of an alternative interpretation of what was going on. Rather than the trigram being lost due to short-term memory decay, they proposed that the information suffered more proactive interference from the trigrams in previous trials as the retention interval filled with counting backward increased. They found that, in the first several trials, before there was much proactive interference, there was little forgetting even if the retention interval was long. Only in subsequent trials did the loss of memory as a function of the retention interval emerge.

Baddeley and Scott (1971) countered the verbal learning type of explanation by running many participants for only one trial each, varying the retention interval across participants. They found a loss of information across at least a few seconds, briefer than what Peterson and Peterson (1959) observed, when the list length was longer to avoid ceiling effects.

Even if one accepts the Keppel and Underwood (1962) result, there are two interpretations possible, essentially empiricist versus nativist interpretations. The Keppel and Underwood (empiricist) interpretation is that the longer counting-backward intervals allowed time for proactive interference from previous trials to take place in a unitary memory. In an alternative account, though, there are separate stores for short- and long-term retention, both used at the same time but with proactive interference being a more important factor in the longer store (cf. Cowan, Johnson, & Saults, 2005). This account combines the empiricist-inspired mechanism of proactive interference with the nativist-inspired mechanism of a separate short-term store. When the counting-backward interval is short, then short-term storage (working memory) can be used. When the interval is long, short-term storage has decayed and only long-term memorization can be used. In that situation, proactive interference on later trials will tend to be strong, as Keppel and Underwood thought.

The idea from verbal learning that the passage of time allowed proactive interference was later formalized in the notion of temporal distinctiveness (e.g., Bjork & Whitten, 1974; Crowder, 1976; Glenberg & Swanson, 1986). The idea is that items are represented in memory in a temporally-organized stream, metaphorically like a row of telephone poles (e.g., with poles clustered into groups of three to represent the items in a trigram in Peterson & Peterson, 1959). As the retention interval elapses, it becomes more difficult to tell which item occurred in
N. Cowan Cognition 224 (2022) 105075
which location, like walking away from the series of poles so that they begin to merge perceptually. That merging would be the basis of proactive interference. Cowan, Saults, and Nugent (1997) tried to assess temporal distinctiveness in a two-tone comparison task by manipulating not only the time between the tones to be compared, but also the times between trials that affect distinctiveness. With the ratio of times held constant, there was still forgetting over 12 s as a function of the time between tones to be compared. Ricker, Spiegel, and Cowan (2014) did something similar for memory of arrays of unfamiliar characters or English letters and found very little effect of distinctiveness, with a much larger effect of loss over 6 s from decay. Subsequent work suggests that decay occurs primarily when items are presented with insufficient time for thorough working memory consolidation, which depends on the materials (Ricker & Cowan, 2014; Ricker & Hardman, 2017; Ricker, Nieuwenstein, Bayliss, & Barrouillet, 2018), and not much when encoding is good (e.g., Oberauer & Lewandowsky, 2008).

Decay across milliseconds? The issue of decay can be examined on at least two different time scales. Cowan (1984, 1988) reviewed evidence suggesting that in every modality (but validated especially for vision and hearing), there are two phases of sensory storage: a vivid mental afterimage lasting only a fraction of a second, as perceptual processing of the stimulus is ongoing, and a longer, more processed memory that lasts on the order of seconds. A good example of both types of stores appears in the study by Phillips (1974). On each trial, an irregular checkerboard-like pattern was presented, varying in the number of cells (complexity). It was to be compared to a second pattern, after a variable interval, that differed from the first in having one more or one fewer black square. If the interval was very short (~100 ms) and there was no change in the location of the patterns and no masking pattern, performance was excellent regardless of the complexity. At longer inter-pattern time intervals, performance dropped as a function of time to an extent that depended on the pattern complexity and persisted despite a change in location or the presence of a mask. It appears that the short-lived representation is more literal, and the longer representation is more abstract and immune to sensory interference. With a different procedure, Kallman and Massaro (1979) demonstrated two phases of auditory sensory memory in a tone-comparison task. A masking sound following the first tone could interfere with both phases of sensory memory, whereas a masking sound following the second tone could only interfere with the first phase of sensory memory of that tone, allowing an intact longer sensory store of the first tone to facilitate the comparison process after the second tone was perceived. Unlike a unitary memory view, sensory memory seems tiered.

Summary of decay issues. From an empiricist/behaviorist base, the verbal learning outlook survived in the adult research for many years (e.g., McGeoch, 1932; Melton, 1963, both formerly at the University of Missouri) and is still ongoing, with proponents who do not believe that there are separate short- and long-term storage processes (e.g., Crowder, 1982; Nairne, 2002). In situations in which stimuli can be encoded well, there is evidence against decay from studies in which the duration of the recall period is manipulated (e.g., Oberauer & Lewandowsky, 2008). When encoding time or capability is quite limited, though, as in the case of simultaneous arrays or non-categorized tones to be compared, there appears to be loss over time that requires something more than a single process for all of learning and memory (e.g., Ricker et al., 2018). This finding does not exclude the reality of principles put forward by verbal learning researchers, such as proactive and retroactive interference, but it suggests that we need further divisions of the memory system to understand all the evidence. There must be something in the mind and brain other than a single, all-encompassing neural network for memory that operates according to just one set of rules for

The issue of capacity limits. Working memory, by definition, can hold a limited amount. The limit logically can occur because the representations of items in working memory decay or because the number of items that can be represented in working memory at once is limited, a capacity limit. A capacity or a decay mechanism must be found if one wishes to claim that there is such a thing as a working memory separate in its process from long-term memory, i.e., to disprove the learning theory view. Miller (1956) famously suggested a capacity limit, specifically that working memory is limited to about seven chunks, or integrated items. A chunk has strong associations between elements within it and weaker associations with items outside of it. For example, a known acronym like USA can be a chunk, as can a known word (as opposed to its phonemes).

In a seminal chapter initiating the field of working memory and a related article, Baddeley and Hitch (1974) and Baddeley, Thomson, and Buchanan (1975) changed the emphasis away from capacity limits to concentrate on decay limits. Baddeley et al. showed that people could retain about as many verbal items in a list as they could recite in about 2 s, and the notion was that covert rehearsal allows each speech element to be renewed before it decays in about 2 s, like a juggler repeatedly throwing balls into the air. I, however, thought that both mechanisms were viable (Cowan, 1988).

The issue of knowledge. There are aspects of working memory that still suggest it can be reasonably interpreted without a dual process or dual storage mechanism. Craik and Lockhart (1972) found that when stimuli are processed more deeply, they are remembered longer; that is the levels-of-processing argument. For example, a printed word’s font is considered most shallow, the phonological representation is deeper, and the semantic representation is still deeper. (For a recent summary see Craik, 2020.) An argument can be made that the phenomena we call short-term memory involve series of stimuli that do not include good sets of cues for retrieval, which (although Craik does not claim this) may be compatible with an empiricist, learning-theory-based, unified-memory approach.

There is no question that knowledge effects are important. For example, series of nonsense syllables are more difficult to retain than series of known words (e.g., Saint-Aubin & Poirier, 2000). The question is just whether the apparent duration and capacity of storage can be predicted from knowledge available, and that theory has not been worked out. Evidence against a unified-memory approach comes from neuropsychological syndromes in which an individual has damage to the hippocampus, resulting in loss of long-term learning without a noticeable change in short-term memory or interactions. The best-known case of this is H.M., whose hippocampi were removed to treat epilepsy.

Recently, MacKay (2019) suggested that there were important deficits in H.M.’s sentence comprehension that were not due to memory for the sentence, which remained in front of him during the test. If processing is impaired, then the absence of new long-term learning by H.M. could be attributed to him not forming deep representations. In this light, the support for separate short- and long-term memory storage from neuropsychology may be uncertain, even after all these years.

2.1.2. Influence of empiricism and behaviorism in developmental research on working memory

Development and the issue of decay. Decay has been espoused in some leading theories of working memory throughout the years (e.g., Baddeley, 1986; Barrouillet, Portrat, & Camos, 2011). Accordingly, I had hoped to examine whether short-term memory decay can be shown to change in its rate during development, as a potential basis of cognitive maturation that would affect thinking. There have been studies sug
memory at any time scale, as the unitary memory theorists from an gesting that what changes with development is the rate of carrying out
empiricist approach seem to believe (e.g., Chater & Brown, 1999; activities that could counteract decay: covert verbal rehearsal of words
Nairne, 2002). (Hulme & Tordoff, 1989; Kail & Park, 1994) or attention-based
refreshment of information (Camos & Barrouillet, 2018). These theories relied on the existence of decay but typically assumed that the decay rate itself stayed constant throughout development.
Decay across seconds. Cowan, Nugent, Elliott, and Saults (2000) examined working memory decay by presenting spoken digit lists to children in Grade 2 (7-9 years), Grade 5 (10-12 years), and college students, in a condition in which participants could not use attention during the presentation, only afterward during retrieval. Spoken lists of the participant’s span length were presented at irregular intervals while they engaged in a silent game in which pictures with rhyming names were to be selected. The game was occasionally interrupted 1, 5, or 10 s after the most recent spoken list, the task then being to recall the list. Overall, there was no age difference in the rates of forgetting of list items across 10 s, with an impressive closeness in the results across ages. There was, however, a pronounced, selective deficit of the youngest group compared to the older children and adults in the performance on the most recently presented word. This finding suggested to Cowan et al. that the uninterrupted auditory memory of the last item was more persistent in the older children. An alternative possibility is that attention is directed to a fading auditory memory of the last item more efficiently in older participants. Against this interpretation, though, another method showed that the memory for the precise pitch of a single tone was lost more quickly in younger children (Keller & Cowan, 1994). Regardless, a difference across age groups of this sort would not be expected according to a simple learning theory in the empiricist tradition.
Decay across milliseconds. James (1890, p. 462) wrote that newborns experience a “blooming, buzzing confusion” but that view was later criticized by infant researchers who documented the capabilities of infants. Nevertheless, there is evidence suggesting that there is some truth to the notion of confusion. If each sensation has a brief afterimage, then it is possible for sensations to blur together if the afterimage of one sensation persists while a second sensation is occurring. Something like sensations running together has been found in vision in the form of flicker fusion (e.g., Regal, 1981) and in audition in the form of perceptual fusion between brief noises (Plomp, 1964). The duration of an auditory afterimage can be assessed by masking a brief sound with a subsequent sound, which can result in the inability to identify the first sound because the second one interrupts the extraction of information from the afterimage of the first. In adults, this backward recognition masking reaches an asymptotic level when the time between the onset of the brief target to be identified and the brief mask reaches about 250 ms (Massaro, 1975). Given converging information from other procedures (Efron, 1970a, 1970b, 1970c; Plomp, 1964), Cowan (1984) assumed that the end of masking is related to the point at which there has been too much decay of the afterimage for image extraction to continue.
Cowan, Suomi, and Morse (1982) tested the duration of backward recognition masking in 8- to 9-week-old infants using a modified non-nutritive sucking procedure. Infants heard a steady stream of masked vowel sounds coming from one tape-recorded channel but, when they sucked on the pacifier, the soundtrack switched to a second recorded channel with a change in the masked vowel. Sucking allowed infants access to a change in the sound (except in a no-change control condition). This access elicited an increased rate of sucking on the pacifier when the onset-to-onset interval between the target and mask was 400 ms, but much less so when the interval was only 250 ms. These results suggest that infants may have acoustic persistence for longer than adults, which may be useful for the extraction of information at a slower pace. It also might result in the experience of a smearing of sounds in a stream such as speech, compared to what adults experience, i.e., something of a blooming, buzzing confusion after all. There is supporting evidence from another infant procedure in audition (Morrongiello & Trehub, 1987; Trehub, Schneider, & Henderson, 1995) and, in vision, developing temporal acuity of flicker fusion (Regal, 1981). These findings seem in keeping with the importance of maturation of physiology and against the notion of all organisms being roughly comparable; it points to some age differences other than simple learning.
Longer periods of time beyond minutes. There has been some beautiful work on the temporal qualities of memory in infants using behaviorist principles and procedures, such as using reminders and demonstrating developmental increases in the stability of memory over time (Rovee-Collier & Cuevas, 2008). Perhaps because this work was carried out from a tradition quite different from that of other researchers of working memory, and more like the animal research from a learning theory tradition, there has not been very much integration of this body of research with the working memory literature. Developmental researchers seem to have emphasized long-term learning principles or short-term memory principles, but rarely the integration of the two (although see recent work on dynamic systems theory, e.g., Spencer, 2020).

Development and the issue of capacity. Aside from the possibility of decay, capacity limits are the other way that there could be a separate working memory system. Cowan, Nugent, Elliott, Ponomarev, and Saults (1999) showed that capacity for spoken lists of digits increased during the elementary school years, using the aforementioned, silent rhyming task to prevent the use of mnemonic strategies during presentation of lists. Each participant heard lists of digits intermittently, with list lengths adjusted to equal the participant’s attended span or to be 1, 2, or 3 items shorter than that. The silent rhyming game was interrupted intermittently, at which point the participant was to recall the most recent list, which had ended 1 s previously. The number of items recalled from unattended lists in these circumstances was approximately constant across list lengths but differed by age, at about 3.5 items in college students, about 3 items in fourth-grade children, and about 2.5 items in first-grade children. The constancy across list lengths was taken to indicate that, with mnemonic strategies removed, only a core working memory capacity was indexed. Similar capacity estimates later emerged from studies of visual array memory development (e.g., Cowan et al., 2005; Cowan, Fristoe, Elliott, Brunner, & Saults, 2006; Riggs, McTaggart, Simpson, & Freeman, 2006; Riggs, Simpson, & Potts, 2011). These seemingly small differences are proportionally large (e.g., from 2.5 to 3.5 is a 40% increase), with major implications for cognitive processing.
Capacity versus filtering. It remained possible that children may develop a better ability to carry out mnemonic processing that creates these apparent capacity limits. One type of mnemonic processing is attention to the most relevant items, excluding irrelevant information. Cowan, Morey, AuBuchon, Zwilling, and Gilchrist (2010) addressed this possibility with arrays of colored objects of two different shapes to be remembered for subsequent probe item recognition mixed together on each trial. In some trial blocks, there was a much higher frequency of testing one shape compared to the other (e.g., the colored circles tested on 80% of the trials, with the colored triangles tested on the other 20%). When the array size was small (with only 2 items of each shape), participants in all age groups filled working memory primarily with the more-often-tested shape, according to the recognition data. However, the youngest, 7-year-old children remembered far fewer colors of either shape. Cowan et al. (2011) replicated this finding with a much slower, one-at-a-time presentation of the objects, so the age difference was not an encoding speed limitation. Younger children’s working memory was not more cluttered with less-relevant items, except when memory was overloaded (Cowan et al., 2010) by presenting 3 items of each shape.
Capacity versus speed as the causal factor. Children’s speed of mnemonic processing increases with age (Camos & Barrouillet, 2011; Cowan et al., 1998; Gaillard, Barrouillet, Jarrold, & Camos, 2011; Hulme & Tordoff, 1989; Kail, Lervåg, & Hulme, 2016). Mnemonic processing speeds of some sorts theoretically could govern the increase in capacity across ages if faster rehearsal or attention-based refreshing allows more items to avoid decay. Possibly, increasing processing speed as a basis of development could fit a unitary memory theory like the verbal learning theories, by applying to all processes equally. However,
it is also possible that speeds are the result, rather than the cause, of capacity differences across age groups. If more items can be retained in the focus of attention at once, for example, then it may be possible to refresh more items in parallel, increasing the observed speed of refreshment (Lemaire, Pageot, Plancher, & Portrat, 2018). Cowan et al. (2006) showed that second-grade children could be instructed to speed up their repetition of lists to equal the speed adults naturally used; yet, working memory did not increase at all as a result. This finding suggests that a basic capacity difference may underlie working memory development, not a speed difference.
Development and the issue of knowledge. The empiricist approach might suggest that developmental changes in working memory could be attributed to learning. As children learn, they know more items that are used in working memory tasks and can form better patterns from them, resulting in fewer, larger chunks and more extensive inter-item associations. But can this learning account for all of the development of working memory? Cowan, Ricker, Clark, Hinrichs, and Glass (2015) addressed this issue by presenting arrays of English letters or unfamiliar characters to elementary school children at various ages, and to college students (method, Fig. 2; results, Fig. 3A). A few first-grade children did not know their letters well enough to be successful at even the smallest array set sizes. With those few excluded, English letters were remembered much better than unfamiliar characters at all ages. Using separate cross-age-standardized scores for the two types of materials, however, a steep developmental growth curve looked almost identical for letters and unfamiliar characters, indicating that the developmental increase in knowledge alone cannot explain working memory development in this situation.
In a convergent, very different method, Gilchrist et al. (2009) examined verbatim recall of lists of unrelated, simple spoken sentences. Working memory capacity could be examined by the number of sentences that were at least partly recalled. Knowledge could be examined by the degree of completion of sentences that were at least partly recalled. Across age groups, the rate of completion of the at-least-partly-recalled, short sentences was about 80%, so knowledge was equated by the simplicity of the sentences; yet children in second grade recalled fewer sentences than older participants, indicating a lower capacity.

Empiricism and working memory development: A summary. In sum, I have looked for developmental effects that are either consistent or inconsistent with an empiricist view and find that the view has merit but is not in itself complete. Age differences in working memory capacity and sensory decay cannot be totally explained by increasing knowledge or learning principles. Thus, the complete model of development appears to require learning principles supplemented by a working memory system with additional limits. In favor of that description, as I will explain further under the constructivist view, the embedded processes model of Cowan (1988, 1995, 2019) considers working memory to comprise activated long-term memory elements, including rapid new learning, and a capacity-limited focus of attention embedded within the activated portion of long-term memory.

2.2. Working memory development and nativism: in favor of specialized modules

Nativism differs dramatically from empiricism in emphasizing the importance of the genes in creating the neural organization that produces responses. Nativism (with modularity) is consistent with the idea that more specialized structures exist, and with the idea that fundamental properties of cognition are inborn, albeit with the possibility of change with maturation. Although there is relatively little work directly
[Fig. 2 diagram. Sequence on each trial: Memory Set (array or list; colored shapes, letters, unfamiliar characters, tones, spoken digits), then an interim task, then Probe Item (to be judged same as one memory set item or different from all). Interim tasks: A. None (Cowan et al., 2015, with letter or character stimuli; other studies as control). B. A second memory set; instructed to attend to Set 1, 2, or both (Cowan et al., 2018). C. Item on left or right; speeded key press, same v. opposite side (Cowan et al., 2021). D. Estimate how many memory set items you know (Forsberg et al.).]
Fig. 2. A schematic diagram of the method in several studies indicating processing capabilities that develop in childhood. Studies differed in the types of materials
presented in the set to be remembered (memory set) and the nature of the interim task (A-D) presented in the retention interval between the memory set and the
probe item. The probe item was to be recognized as same as one memory item or different from all. In Cowan et al. (2018), the memory items were arrays of colored
squares, lists of spoken digits, or lists of tones; in the later studies, arrays of colored squares.
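As a rough illustration of how a capacity estimate can be derived from a probe-recognition procedure of this kind, the sketch below uses one widely used estimator for single-probe tasks, k = N × (H − FA), where N is the set size, H the hit rate, and FA the false-alarm rate. This is an assumption for illustration: the studies summarized here may have used variants of this formula, and the function names and response rates are hypothetical. The second function simply reproduces the proportional-increase arithmetic mentioned in the text (2.5 to 3.5 items).

```python
def capacity_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    """Estimate items held in working memory as k = N * (H - FA).

    hit_rate: proportion of "different" probes correctly judged different.
    false_alarm_rate: proportion of "same" probes wrongly judged different.
    (Assumed single-probe estimator; not necessarily the one used in each study.)
    """
    for name, rate in (("hit_rate", hit_rate), ("false_alarm_rate", false_alarm_rate)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
    return set_size * (hit_rate - false_alarm_rate)


def proportional_increase(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    if old == 0:
        raise ValueError("old value must be nonzero")
    return (new - old) / old


if __name__ == "__main__":
    # Hypothetical response rates, chosen only so that k lands near the
    # values of about 2.5 and 3.5 items discussed in the text.
    k_younger = capacity_k(set_size=4, hit_rate=0.80, false_alarm_rate=0.175)
    k_older = capacity_k(set_size=4, hit_rate=0.95, false_alarm_rate=0.075)
    print(f"younger k = {k_younger:.2f}, older k = {k_older:.2f}")
    print(f"proportional increase = {proportional_increase(2.5, 3.5):.0%}")
```

The point of the second calculation is that a difference of one item is large relative to a baseline of 2.5 items, which is why seemingly small age differences in k can matter for cognitive processing.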
[Fig. 3 plots. Labels recoverable from the panels: Letters, Unfamiliar, Component k Value, Shared Component, Visual Component, k Value (Set Size 4), No Interim, Same-Side Key Press.]
Fig. 3. Selected results of four studies corresponding to the different types of interim task (A-D) in Fig. 2. A, replotted from Cowan et al. (2015); B, replotted from
Cowan et al. (2018, Experiment 2); C, replotted from Cowan et al. (2021, Array Set Size 4); data posted with the article through the Open Science Framework link
were used to regroup children to create age groups comparable to the other studies. D, replotted from Forsberg, Blume and Cowan (2021a, Experiment 2, Array Set
Size 6) using the “clean data” file posted at the Open Science Framework site linked to the article. First grade corresponds to children 6-8 years old, with increasing
age 1 grade per year. Error bars are +1 standard error of the mean.
linking genes to the neural understanding of working memory and its development (though there is some, e.g., Friedman et al., 2008), there are many studies focusing on what appear to be special constructions of the brain, or modularity, for the purpose of carrying out working memory processes. There have been investigations from this viewpoint focusing on possible developmental changes in these modules.

2.2.1. Influence of nativism and modularity in adult memory research
This approach is characterized in the second row of Fig. 1. Online OxfordLanguages (2021b) defines nativism in psychology as “the theory that concepts, mental capacities, and mental structures are innate rather than acquired by learning.” The idea is usually associated with Kant’s philosophy and, in psychology, Fodor’s view of modularity (Wikipedia, 2021). Nativists are among those who have looked for individual differences in intelligence and the extent to which they are heritable, and certainly have included working memory among the abilities that are candidates for largely inherited cognitive ability (e.g., Deary, 2012; Friedman et al., 2008).

Implications for working memory. In the psychology of memory, the idea of nativism would imply that there are modules specific to memory that are inherited and that determine the quality of thinking in an organism. The question here would be, what kinds of modules, and how many, and with what developmental course? The nativists would be quite comfortable with the notion of separate short- and long-term memory modules but these might not appear to be enough. There are neurological dissociations in which an individual with a brain lesion is found to have good verbal working memory but very poor visual
working memory, or vice versa (for reviews see Hanley & Young, 2019; Shallice & Papagno, 2019), and these helped support the notion that visual and verbal working memory are separate modules (e.g., Baddeley, 1986). To this day, there are scientific arguments about whether the neurological support for working memory modules supports the existence of separate visual and verbal modules, or whether the data can be explained by mnemonic processes without separate modules for different types of storage (Buchsbaum & D’Esposito, 2019; Cowan, 2019; Logie, 2019; Majerus, 2019; Morey, 2018, 2019; Morey, Rhodes, & Cowan, 2019, 2020).
All theories acknowledge that the individual inherits some neural apparatus that is used to carry out the kinds of tasks that we term working memory tasks. However, there is a difference between the kind of theory in which specialized modules are inherited (e.g., Logie, 2016) versus a theory in which the kinds of processes that are inherited are general-purpose, problem-solving mechanisms that can be adapted across different types of tasks and materials (e.g., Cowan, 2019; Kane et al., 2004). It is the former type of theory that can be more closely identified with nativism and modularity, whereas the latter type of theory is more cognitivist and must go on to explain more about strategies used to adapt to each kind of task.

2.2.2. Influence of nativism and modularity in developmental research on working memory
From the broad field of cognitive science, the spirit of nativism was brought to developmental psychology by Chomsky (1965) with the claim that language is learned so fast, resulting in such a complex structure, that it can only be the result of an innate language acquisition device. This kind of sentiment was suitable to infant researchers, who found that if a sensitive measure like eye movements is used, there is evidence of infant capabilities far beyond what psychologists had supposed, such as the maintenance of an object in working memory when it has been hidden (object permanence), as early as 5 months of age (Baillargeon, Spelke, & Wasserman, 1985), much earlier than Piaget and colleagues had suggested based on manual responses.
As Cowan (2016) noted, there have been many infant studies of working memory and other skills, suggesting that even working memory capacity gets a very early start in life. At first glance, it appears that infants have a capacity of about 3 items (e.g., Ross-Sheehy, Oakes, & Luck, 2003; Zosh & Feigenson, 2012, 2015). This estimate seems similar to what is found in adults with procedures in which mnemonic strategies cannot be used (e.g., Cowan, 2001; Luck & Vogel, 1997; Zhang & Luck, 2008). If there were no developmental increase in basic working memory capacity, and it were about 3 items at all ages, then certainly it would have to be considered a fundamental mental capacity that is innate rather than influenced by learning. This apparent equivalence, however, is paradoxical because studies that examine working memory using the same procedure across age groups in childhood do find substantial changes with age (e.g., with change detection and related procedures, Cowan, Elliott, et al., 2005; Simmering, 2016; with a variety of span tasks, Gathercole et al., 2004). Something about the infant procedures allows infants to demonstrate a seemingly higher capacity than young children.
To help resolve this paradox, a study by Zosh and Feigenson (2012) with 18-month-olds may be relevant. The study suggests that there may still be a monotonic developmental change in working memory. In particular, it may be a change not in the number of working memory slots but in the completeness of the feature representation of objects in working memory. Zosh and Feigenson asked whether the infants’ memory of how many items were in a container included memory of all the features of those items. Toys were hidden and, in one condition, the toys were secretly swapped out for others without the infants knowing this. Infants were then allowed to search for the toys, extracting them from the container. What was measured was the duration with which infants kept searching in the container. When there were one or two toys, infants were not satisfied after extracting them and often kept on searching in the container, apparently for the not-yet-retrieved, original toys. However, when three toys were hidden and then secretly exchanged for new toys, children typically stopped searching after extracting the three new toys. The explanation was that they knew there were three toys but no longer remembered the other features of the original toys, and so they had no motivation to keep on searching. That is, there was supposedly a tradeoff between memory for details of the original toys, up to two toys, and loss of the memory for detail after a third original toy was encoded into memory. Together with the literature, one possible implication is that the developmental change from infancy to adulthood might be not in the number of items in working memory, but rather in the featural detail possible within each of about three representations in working memory. A change in featural detail sounds much less like a nativist concept than does a change in capacity per se.
Based on these infant results and adult findings of Cowan, Blume, and Saults (2013), Cowan (2017b) suggested that in subsequent child development, too, there may be a tradeoff between the number of items in working memory and the specificity of the feature representation of each item. Perhaps feature specification of items continues to increase during childhood. Given the potentially large role of learning in feature specification, the developmental story no longer sounds as uniquely nativist as it might if a working memory module (or several) simply grew in capacity with age.

2.3. Working memory development, cognitivism, and constructivism: innate readiness for the assimilation and construction of knowledge

At a certain point in the history of science, investigators realized that they had enough information to make inferences about the chemical and atomic makeup of objects (such as, for example, the sun) even though the inferred units were not directly observable. Similarly, in psychology, in the mid-1950s there was a realization that one could make inferences about the organization or makeup of the mind using a variety of techniques, ranging from the behavioral study of immediate memory (Miller, 1956) and attention (Broadbent, 1958) to the analysis of grammar (Chomsky, 1957) and the use of computers to simulate human thinking (Newell & Simon, 1956). That was the beginning of the cognitive revolution that helped establish a cognitivist view in the field. In this view, the internal manipulation of symbols is said to describe mental functioning (Teaching and Learning Resources, 2021).
Beyond cognitivism, Bruner, Goodnow, and Austin (1956) presented research suggesting that ideas and mental categories do not simply come from the stimuli; they depend on participants’ decisions as to what cues and factors to consider or ignore in forming their conclusions. This appears to be one beginning of the concept of constructivism, the idea that concepts are constructed by the individual. Thus, this view is important even in adult research, though constructivism occurs over time and therefore is well-suited to developmental study. Researchers in the field of education (Jonassen, 1991, for example) distinguished between an objectivist and a constructionist view. In objectivism, individual learners process symbols and discern reality, whereas in constructivism, individuals build symbols and determine their own reality. According to this analysis, cognitivism has an objectivist component and a constructivist component. Most likely, the reality falls in between the extremes.
The cognitivist viewpoint was not devised with development particularly in mind and its influence on developmental research has waxed and waned in the field of developmental psychology (Spencer, Perone, & Buss, 2011; Thelen & Bates, 2003). Spencer et al. (p. 261) stated, “we conducted a survey of the fourth through sixth editions of the Handbook of Child Psychology: Theoretical Models of Human Development. These editions span more than 20 years in developmental psychology (from 1983 to 2006). Although this book is just one indication of how the field is changing, our survey revealed that four theoretical viewpoints have disappeared from the Handbook over time: nativism, cognitive and information processing, symbolic approaches,
N. Cowan Cognition 224 (2022) 105075
and Piaget’s theory. Of course, scholars still actively pursue all of these perspectives.” Given that the field may have shifted away from these perspectives, it may be especially helpful to understand how they shaped some of the past and ongoing research. This is very true in my own program of work because I have conducted both adult and developmental research on working memory heavily over the years, which has allowed a great deal of cross-fertilization between the two lines of research. In general, in fact, many investigators of working memory development have had close ties to the adult research from a cognitivist perspective (e.g., the work of Bayliss, Jarrold, Gunn, & Baddeley, 2003; Camos & Barrouillet, 2018; Gathercole et al., 2004; Hitch, Towse, & Hutton, 2001; and Towse, Hitch, Horton, & Harvey, 2010), and all of these investigators have also published steadily in the adult literature. The finding that developmental handbooks do not much focus on these areas, along with my own experiences, suggests to me that this faction of the developmental field may have turned toward the working memory theorists more than it has interacted with other developmentalists, increasing the need for the present review and reconciliation of approaches.
The relevance of cognitivism and constructivism to working memory research differs. Cognitivism is related to inferences about mental activities, and opens up consideration of new storage and processing mechanisms that were not discussed in empiricism because they are not directly observable. Constructivism is relevant in two ways. First, although a nativist view also includes the possibility of internal structures, constructivism increases the amount of attention paid to the possibility that storage is flexible and regulated strategically by the individual, as opposed to an inflexible and rote storage mechanism. (Note that a separate type is neuroconstructivism, to be considered later with dynamic systems theory.) Second, if individuals do construct their own concepts, they presumably do it by holding symbols in working memory to allow these symbols to be recombined in that cauldron. Cognitivism and constructivism should add to what is contributed by empiricists and nativists. To understand behavior, their previous insights must be combined and must explain how knowledge and strategies are constructed.

2.3.1. Influence of cognitivism and constructivism in adult research on working memory
Early cognitive revolution. The work of Miller et al. (1960), Plans and the structure of behavior, involved an early mention of the term working memory. It did not get into internal mechanisms of the mind but can be considered a bridge between empiricism and constructivism. The notion was that people must hold on to information temporarily and use it to plan activities to meet a hierarchy of goals ranging from the immediate, such as popping a piece of toast into the toaster, to long-term, such as planning to purchase a home or furthering a career.
A slew of models in the 1960s were attempts to use what had been learned about behavior and computers together to formulate models of human information processing. These models had some similarities, as

inherited; what is presumably inherited includes mechanisms that allow different processing strategies to be used in a manner up to the discretion of the individual.

Baddeley and Hitch (1974) conception of working memory. The general message of the seminal Baddeley and Hitch (1974) work is that something was amiss in the elegant modal model. It seemed that multiple storage and processing mechanisms were needed, and the term working memory was applied to the multicomponent system. Materials that were very different, such as digits to be remembered and sentences to be comprehended, still interfered with each other, but not as much as if they had totally displaced each other in primary or short-term memory. In free recall of 16-word lists, the advantage of recall for the most recent several items was unaffected by a concurrent digit load, which was odd if those last several items in the word list were held in a short-term store needed for digit storage. Concurrently speaking to prevent verbal rehearsal had a large effect on recall of printed words, but little if any effect on recall of spoken words. The results suggested that what was needed was a theory in which there are separate dedicated modules for different purposes, as in a nativist view. These developed to be the phonological and visuo-spatial stores of Baddeley’s (1986) model. However, it was also suggested that general attention was needed to store and process complex items. The processing component was the central executive in Baddeley’s model, whereas the central storage component of attention was excluded by Baddeley (1986). A general storage was restored to the model as the episodic buffer (Baddeley, 2000, 2001). Attention was a central part of Cowan’s (1988, 1995) alternative conception that included storage in the focus of attention. Although the early work by Baddeley was concentrated on understanding what might be considered the more nativist, modular storage buffers, much of his more recent work has focused largely on the more cognitivist concept, the attentional aspect of immediate memory, more in keeping with the emphasis of Cowan and colleagues (for a summary, see Baddeley, Hitch, & Allen, 2021).

Embedded-processes model. Cowan’s (1988, 1995, 2019) model of information processing includes the activated portion of long-term memory embedded within the memory system and, embedded within activated memory, the focus of attention. It also includes a perceptual process giving rise to a brief, initial sensory storage process that deposits information into the memory system, whereas a longer kind of sensory storage is considered to be one kind of activated memory (along with semantic activation). There is a central executive that helps to control the focus of attention, much like its function in previous models.
The embedded processes model did not arise specifically in response to the Baddeley and Hitch (1974) or Baddeley (1986) models per se, but rather to a number of puzzling aspects of information processing stemming from the modal model. Unattended aspects of stimulation were supposedly dropped from further processing after sensory memory; yet, somehow, when these filtered-out channels of information changed in their physical properties, as when there is a sudden flicker in the lighting
pointed out for example by Murdock (1967), who described a “modal or a sudden change in the background noise that was being ignored,
model” of commonalities between the models. It included a sensory people noticed these changes (Broadbent, 1958; Cherry, 1953). More
store susceptible to decay, an attention process that can select among over, it appeared that the unattended information somehow made
sensory items, a more limited-capacity primary memory susceptible to contact with information in long-term memory. For example, people
displacement, a rehearsal process, and a secondary memory susceptible sometimes noticed when their name was presented in an unattended
to interference at the time of retrieval. Work serving as an important channel (Moray, 1959). Treisman (1960) explained this by proposing
precursor to this model is Sperling (1960), and the best-known model of that attention did not totally filter out unselected information; that in
this cohort described by Murdock is that of Atkinson and Shiffrin (1968), formation was merely attenuated, so it could still reach the long-term
with an earlier version of their work available to Murdock as a 1965 memory system and changes could, presumably, still be noticed. It
technical report. The term working memory was not used but the sen seemed to Cowan (1988), however, that a more detailed concept of what
sory and short-term stores are its relevant precursors. It is the promi might occur could be constructed, based on the notion from Sokolov
nence of the control processes, represented in these early models by (1963) that an intelligent organism builds up a neural model of the
attention and rehearsal, that might most strongly indicate that we are environment and then notices changes to that environment. This could
not just talking about some autonomous, mechanistic stores that are be represented by doing away with the sequential arrangement of stores
N. Cowan Cognition 224 (2022) 105075
and allowing incoming stimuli, once perceived, to make direct contact with long-term memory, activating the relevant features to the extent that it could be processed (Cowan, 1988, 1995, 2019). Moreover, attention could be represented as focused on a limited amount of newly-perceived changes, task-relevant information from the environment, and recently-encountered or thought-of information. In each case, the information in the focus of attention would result in the best-analyzed, most coherent and meaningful view of the world. It could include some, but not all, of the many features of long-term memory currently activated by the environmental input and the recent work of the focus of attention in eking out the meaning of stimuli and thoughts. Within this theoretical framework, working memory would be represented by the activated long-term memory information (a concept related to the activated cell assemblies proposed by Hebb, 1949) serving as a collection of currently highly-accessible thoughts, and the focus of attention (a concept related to the primary memory of James, 1890) as the central core of this working memory system.

Like Baddeley’s (1986) model, the model of Cowan (1988, 1995, 1999) was influenced by brain research. However, these models were influenced by different aspects of brain research. Baddeley had experience with neurology patients who had diminished verbal immediate memory along with preserved visual immediate memory or vice versa. Instead, I was interested in consciousness and focused on individuals with parietal and frontal damage, important for the functioning of a focus of attention and for executive operations on the material being stored. In Baddeley’s model, the reason why verbal and visual sets of information do not interfere with each other very much is that they are handled by different dedicated stores. Cowan (1988) reasoned, however, that an adequate taxonomy would also have to consider other kinds of information: acoustic (duration, pitch, etc.), orthographic, visual (color, shape, spatial location, etc.), and semantic features, as well as information from other senses. Some features crossed modalities (e.g., duration and location). It seemed likely that a given stimulus would give rise to feature activation that is not limited to one type, and the taxonomy of stores seemed possibly too inflexible and complex to propose in a modular sense. Rather than doing so, Cowan (1988) simply proposed that stimuli having similar processed features would interfere with one another within activated long-term memory. Thus, the model seems more heavily cognitivist and constructionist than the Baddeley model, which appears to have a larger nativist influence.

Even in the embedded-processes model, though, there does appear to be some nativist influence. For example, the use of features probably depends on the inherent suitability of the modality to the task so that, for example, phonological features may be well-suited for retaining a sequence, more so than unrelated meanings (Saint-Aubin & Poirier, 2000). As another example, the visual modality is better-tuned to fine spatial information, whereas the auditory modality is better-tuned to fine temporal information (e.g., Penney, 1989). These inherited aspects of cognition are acknowledged in the embedded processes model. Innate aspects of perception could cause separations by modality and code that apply to working memory. For example, acoustic advantages in serial order retention of speech (Penney, 1989) are partly retained as phonological advantages even for written items (Conrad, 1964). Auditory and phonological materials may be especially suited to sequential presentation, which may allow them to be learned and retained in working memory with relatively little investment of attention, compared to visual materials, which require more attention to be retained in working memory (Gray et al., 2017; Li & Cowan, 2021; Morey, Morey, van der Reijden, & Holweg, 2013; Penney, 1989; Vergauwe, Barrouillet, & Camos, 2010).

As a different kind of example of the possibility of an innate module contributing to perception rather than working memory per se, suppose there is a specialized face perception module. It would contribute information to working memory that is privileged by any innate face category information. In contrast, a category like house perception could not be innate and would presumably be less privileged in perception (for a discussion see Farah, Wilson, Drain, & Tanaka, 1998). If working memory looked superior for faces compared to houses, it could be because of the contribution of the face perception module, but that would not be a demonstration of modularity in working memory per se, just modularity in perception used by working memory.

Where do capacity limits come from according to an embedded processes approach? It is a complex problem but here is an example that may help. To hold a set of separate items successfully, one must (1) retain the items, (2) keep them from being confused with one another, and (3) know which object goes where (that is, remember their serial, temporal, and spatial positions). For Requirements 2 and 3, similarity between items can be harmful. Several studies of object array item recognition fit a model in which working memory is limited to about three objects, but with limited numbers of features (location, shape, color, texture, size, and so on) that can be retained for each object (Cowan et al., 2013; Hardman & Cowan, 2015; Oberauer & Eichenberger, 2013).

Brain imaging evidence supports this description of capacity limits. The focus of attention in working memory especially relates to one area, the intraparietal sulcus (Cowan, Li, Moffitt et al., 2011), which may not represent the information retained in working memory but rather the number of items being retained (Majerus et al., 2016), perhaps serving as indices pointing to other brain areas to coordinate their input. There is functional connectivity between the intraparietal sulcus and the posterior cortical regions that clearly do represent the information currently being retained (Li, Christ, & Cowan, 2014). The intraparietal sulcus is more active not simply as a function of the number of items in working memory, but as a function of the similarity between those items; Gosseries et al. (2018) found that it was more active when retrieving one direction of movement from a memory set that also included two other directions, as compared to a memory set that included just the one direction along with two colors.

In sum, I do not subscribe to visual and phonological working memory stores per se like Baddeley (1986) and Baddeley et al. (2021), given the need for many additional feature distinctions such as loudness and pitch dimensions in an acoustic store, shape and color dimensions in a visual store, direction or location separate from the modality conveying it, other senses, semantics, and so on. Baddeley (2000, 2001) dealt with these cases by appealing to a new storage unit, the episodic buffer, but this wide net cast by the new buffer does not seem able to handle them all in a principled manner. A more microscopic or network-like level of analysis may be needed, following on the activated long-term memory approach.

Cowan (1988, 1995, 1999, updated view described 2019) suggested a model of processing that was meant to coordinate results suggesting decay and those suggesting capacity limits. In this approach, working memory consists of embedded processes. A portion of long-term memory is temporarily activated, with activation limited by decay as well as interference from subsequent input. It includes features of recently seen or thought-of items, though the features are not well-integrated. Rapid, new learning takes place via the focus of attention, and this new learning can produce activated long-term memory elements used even in the current trial of a task (a point noted by Cowan, 1988 but amplified by Cowan, 2019). Embedded within activated memory, there is a focus of attention that can maintain only at most several chunks at once; unlike activated memory, it is capacity-limited. Working memory performance is based on information in the focus of attention and information that can be rapidly retrieved into the focus because it is still in activated long-term memory. Recent research suggests that information can be off-loaded through rapid memorization to free the focus of attention for other work (e.g., Cowan, Li, Glass, & Saults, 2018; Rhodes & Cowan, 2018).

Miller (1956) left some doubt about whether capacity limits are fundamental, suggesting that the limits observed in different procedures are the result of coincidence, and using the magical number seven plus or minus two as a rhetorical device to allow him to discuss several
apparently unrelated phenomena that he was describing in the same talk (Cowan, 2015; Miller, 1989). Cowan (2001) helped to renew the emphasis on capacity by reviewing many phenomena indicating that there is a core working memory capacity in adults of 3 to 5 items, often augmented by mnemonic strategies including grouping, chunking, and rehearsal (see also, Broadbent, 1975; Luck & Vogel, 1997). Some modern theories have renewed the emphasis on decay, to explain why there are effects of the amount of free time on the ability to refresh item representations with attention (e.g., Barrouillet et al., 2011; Camos & Barrouillet, 2018), though decay’s existence in these specific situations is contested (Oberauer & Lewandowsky, 2008).

In sum, an explosion of adult research has been taking place for some time and is ongoing, using behavioral and neuroscientific methods to understand the structure and mechanisms of working memory. As yet, little theoretical agreement has been reached, although the models are becoming more similar in some ways to accommodate new evidence. Much is still unknown regarding when decay occurs, how interference operates, and what limits working memory capacity.

2.3.2. Influence of cognitivism and constructivism in research on working memory development

Logically, the broadest response to both the empiricist and nativist positions is that both are important, an intermediate position. Even the staunchest empiricists acknowledge that an inherited nervous system is needed for behavior, and even the staunchest nativists realize that environmental input to the individual is needed for normal cognition to develop. Further, although it is logically possible for environmental and inherited properties to contribute independently to behavior, they often interact. Statistically, that means that the amount and direction of influence of the environment for a particular behavioral scale depends on the genetic makeup. A key example is that some children seem genetically sensitive, such that the exact environment makes a big difference in their psychological outcomes, whereas other children seem genetically robust, such that they are much less sensitive to the same details of the environment. The situation has been likened to the distinction between what it takes for orchids versus dandelions to thrive (for a review see Belsky, Bakermans-Kranenburg, & van Ijzendoorn, 2007).

The manners of relevance of cognitivism versus constructivism to working memory development differ. The cognitivist influence suggests that limits on the capacity of working memory increase with age, at each age limiting the types of concepts that can be understood. The constructivist influence, though, suggests that working memory is a cauldron for putting together concepts and symbols to construct new representations, and that a working memory limit prevents some concepts from being created pending further development. For example, a tiger is essentially a big, striped cat and those three features have to be associated in working memory for the concept to be adequately grasped as distinct from other categories. Forget to include the large size, and one can confuse a tiger with a striped house cat; forget the stripes, and one can confuse the tiger with a lion; or forget the feline category, and one can confuse the tiger with a zebra. The child must compare input to decide the limits and extensions of each category. Thus, working memory constraints could contribute to young children’s under- and overgeneralizations of concepts (e.g., calling a horse a dog). Once certain concepts are understood, as well, it is possible for the child to imagine (construct) imaginary new combinations, such as a dog with stripes. Cognitivist and constructivist approaches can be seen in Piagetian theory, neo-Piagetian theories, and in the present author’s long-standing program of research that will be reviewed.

Piagetian theory. Piaget’s (1952, 1977) theory is a basis of constructivism. Children’s concepts are said to be self-constructed. His view was influenced by his early exposure to the endeavor of intelligence testing, when there was a strong emphasis on innate individual differences (e.g., Deary, 2012; Galton, 1865). Development would come basically from the unfolding of plans built into the individual. Piaget early in his career did intelligence testing under Alfred Binet, for the purpose of placing children correctly in the French schools (Biography Newsletter, 2021; Whitman, 1980). Part of the method of intelligence testing involved finding somewhat arbitrary questions about knowledge of daily life and scoring children’s ability to answer them. However, Piaget found the testing method too rigid and asked children to explain their answers. He noticed that children’s wrong answers were not random, but involved some age-related misconceptions, such as egocentrism, anthropomorphization, and magical use of associations. Their wrong answers made sense based on what they knew and the level of complexity at which they could figure things out. He developed the idea that there are stages of knowledge, which he also traced back to infancy with observations of his own baby, formulating the idea that children develop in stages of cognitive development including sensorimotor (~0-2 years), preoperational (~2-7 years), concrete operational (~7-12 years), and formal operational (>~12 years) stages, with a biological foundation but very important environmental input, and the increasing ability to make use and make sense of that input as processing matures; that is, to construct schemes and ideas to understand the world and act upon it. The developmental stages of thought are said to be limited by certain key concepts that cannot be grasped without sufficient maturation, such as the concept that an object remains when it is not in sight (object permanence), the concept of the constancy of mass (e.g., changing the shape of a ball of clay does not alter the amount of clay), and the concept that certain operations are reversible (e.g., the clay can be returned to its original shape).

Great examples of constructivist thinking can be observed in the online video (Piaget on Piaget, 2021; see also Piattelli-Palmarini, 1980). For example, a young child trying to draw a triangle in plain sight draws a roundish shape and then adds lines where the corners should be; the drawing is taken to indicate that the child constructs a closed shape as a first pass and then tries to add the corners in a second pass. In child language, an example is a young child’s use of a self-invented form that follows a general rule, like the past tense verb goed, even though the child used the correct exceptional form, went, at an earlier developmental stage (an overregularization, e.g., Marcus et al., 1992). A nativist system like that suggested by Chomsky (1965) has inborn rules of syntax among which the final rules are to be selected on the basis of language input. In a cognitivist and constructivist system, in contrast, the individual hears language and uses general problem-solving abilities to figure out what the rules are. The latter can explain why young children make word combinations that do not fit the adult grammar (e.g., more up) given that young children’s conclusions regarding the rules of the system may not be sophisticated or informed enough to be completely adult-like (e.g., Braine, 1976; Braine, Brooks, Cowan, Samuels, & Tamis-LeMonda, 1993). So far, this constructivist interpretation did not heavily include information processing components in a modern cognitivist way that explicitly referred to working memory, but that changed with the neo-Piagetians.

Neo-Piagetian approach. The cognitivist, constructivist view is also interactionist, in that it fully considers both biology and knowledge. A particular kind of knowledge cannot be used well until the mental schemes are in place to absorb that knowledge (in an assimilation process). Moreover, if the child is in the necessary state of readiness, which depends largely upon maturation, then new information can trigger an advancement to the next stage (in an accommodation process). Given all this, one can slightly, but not greatly, hasten the advancement from stage to stage by enriching the stimulus environment.

The relevance to working memory of both cognitivism and constructivism can be seen in the work of neo-Piagetians. Their work tries to account for cognitive performance in each stage by considering the basic, largely innate and maturing information processing abilities of the individual. Pascual-Leone and colleagues (Pascual-Leone, 1970;
Pascual-Leone & Johnson, 2021; Pascual-Leone & Smith, 1969) adopted this view, suggesting that a developmental increase in mental energy results in more items in working memory, which allows more complex cognitive concepts or schemes to be thought and executed. Although mental energy is a fluid construct, at certain points it has grown large enough to allow an additional scheme to be held in mind, resulting in a stage advance. Case, Kurland, and Goldberg (1982) suggested that what accounts for the increased working memory ability is something called operational efficiency, which can be measured by word identification speeds; these speeds and working memory were linearly related across age groups. Moreover, making the stimuli nonwords for adults reduced identification speeds and memory spans for these materials, so they approximate what is found with more meaningful stimuli in younger children. That is, processing efficiency is said to determine working memory capacity, which in turn governs the complexity of concepts that an individual can understand or construct. In another neo-Piagetian approach, Halford and colleagues (Andrews, Halford, Bunch, Bowden, & Jones, 2003; Halford, Cowan, & Andrews, 2007; Halford, Wilson, & Phillips, 1998) proposed that the key factor that differentiates one cognitive level from another is the number of items that can be inter-associated in working memory. For example, learning sums can require the association of as few as three items (e.g., 2+3=5), whereas learning ratios requires four (e.g., 2/3 = 4/6).

In the neo-Piagetian view, the levels of complexity of thought follow Piaget’s prescription only under a circumscribed set of processing demands, which determine the load on working memory. If the demands are reduced, such as by using eye movements as a measure of surprised reactions instead of intentional, manual choices, one can see the rudiments of object permanence by 5 months (Baillargeon et al., 1985). Piaget seemingly thought that a stage was reached when the correct concept was expressed independent of the processing demands, but there may be no such point according to a neo-Piagetian viewpoint.

Illustrating the enormous role of processing demands, once as a teaching assistant in graduate school I slightly complexified a test of concrete operations and found that college students then failed it. Specifically, a well-known Piagetian finding in preoperational children is that they will state that a ball of clay rolled out into a sausage shape has more clay. I asked students to imagine that there were two identical balls of clay heavier than water, that one of them was rolled out into a sausage shape, and that the two pieces of clay were then totally immersed in identical beakers of water. Many students judged that the level of water would rise higher in the beaker containing the sausage-shaped clay, a preoperational answer for a problem with just one extra processing step (judging clay volume indirectly, by the rise in water volume). The misconceptions in this task seem working-memory-related and may resemble what happens in more abstract, real-world problems (for related published evidence see Çalık, Ayas, & Ebenezer, 2005). In an example provided by Goldinger, Kleider, Azuma, and Beike (2003), participants were to judge the amount of fiscal liability of a baseball stadium after injury of a spectator, given extenuating factors that were not logically relevant to the decision (Goldinger et al., 2003). Irrelevant factors influenced the liability decisions when working memory was overloaded.

Cognitivist advances in my own recent research. I have already described how my earlier work showed age differences in decay and capacity. For some years, though, I was unable to get further to determine what processes may help to account for the apparent structural changes. They could be simple maturational differences consistent with a nativist view of the change in working memory. According to a cognitivist, and constructionist, view of working memory development, however, what changes with development need not be the structures that retain and process information. Instead, the mnemonic strategies that are used to make the most use of the structures may change. Recent work leans more toward mnemonic processing differences underlying at least the apparent capacity growth. These processes include the formation and use of knowledge, the use of metacognitive ability, the control of attention in dual-task situations, and the off-loading of information to preserve attentional capability. These will be described in turn and are highlighted in Fig. 1 (bottom).

Formation and use of knowledge. Above, I summarized research showing that, with knowledge equated among age groups, developmental differences in working memory still could be observed (Cowan et al., 2015; Gilchrist et al., 2009). The cognitivist and constructivist view includes knowledge, but the use of it depends on the processes of assimilation and accommodation, which depend in turn on the child’s level of cognitive maturation. Adult findings of the use of patterns and rules to simplify or compress stimulus sets, increasing the apparent capacity (e.g., Brady & Alvarez, 2015; Chekaf et al., 2016) fit into the cognitivist and constructivist views under the assumption that individuals will differ in what patterns and rules they can find, and the assumption that the ability to find patterns relies on a type of executive function that generally improves with age (Cowan et al., 2018).

Demetriou et al. (2014) proposed a relevant theory in which children 2 years and older go through developmental cycles. In the first part of each cycle, children learn to form representations of things in the environment. In a later phase of the cycle, they integrate the new representations into a wider system of knowledge. Cowan (2021a, 2021b) was able to test predictions of this cognitivist theory in a large-scale sample of data on children 2-7 years old, who were tested to create norms for the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV). This test included two new visual working memory measures, one of which involved recognition of pictures of common objects and would be helped by better representations. In the age range of 2;6 to 3;11 years, the results suggest that better representations did help: the proportion of working memory variance shared with knowledge-based tests (Information, Receptive Vocabulary, and Picture Naming) was R² = .16 to .22. In the age range of 4;0 to 6;0 years old, however, better representations did not play a major role: the proportions of variance dropped to the range of R² = .08 to .11. This shift was specific to tasks that depended on the use of knowledge. For tests that emphasized manipulations rather than knowledge-based representations, there was no comparable drop across the age groups in shared variance between working memory and the other tests. The pattern is in accord with Demetriou et al. and a neo-Piagetian, stage-like approach involving the combination of working memory and knowledge.

Use of metacognitive ability. A key aspect of working memory performance according to the cognitive view is the use of problem-solving to figure out how to make the most of working memory retention abilities of the brain. Performance depends on deliberate strategies of the individual, which in turn depend on the individual’s understanding of his or her cognition, or metacognitive knowledge. Forsberg et al. (2021a) examined the metacognitive ability of elementary school children and adults in a working memory task (method, Fig. 2, Interim Task D; results, Fig. 3D). On every trial, an array of colored spots was followed by a retention interval of several seconds and then a probe color to be judged present in the array or absent from it. On some trials, during the retention interval, the participant was queried regarding how many items they thought they retained in mind. Whereas actual performance levels increased throughout development to a level of about three items in adults, children from the early elementary school years incorrectly indicated that they thought they held an average of about three items. This pattern (along with the infant evidence discussed previously) suggests that perhaps, in some sense, working memory capacity remains fixed at about three items at all developmental stages. If so, however, there must be a developmental improvement in the representation of the task-relevant features of the objects (Cowan, 2017b). The younger children of Forsberg et al. may have retained three items in some sense, but perhaps some of the representations of these items did not actually contain the color features needed to respond appropriately to the probe. The idea of an object file (Kahneman, Treisman, & Gibbs,
1992) could be relevant because metacognition in children may detect object files that are not complete and may even be empty, perhaps except for a location feature. This is an interesting topic for continuing research.

Control of attention in dual-task situations. The world is a lively place and working memory must navigate it. Encapsulating these aspects of the world, Cowan et al. (2021) reported on an experiment in which a visual memory load (an array of colors to be remembered for a later probe item recognition test) was followed by a retention interval that sometimes included a speeded task, pressing a button on the same side as a visual or acoustic signal or, in a more difficult situation, pressing a button on the opposite side (method, Fig. 2, Interim Task C; results, Fig. 3C). The pattern of results across development during the elementary school years, and with college students, was clear. Younger children responded as if they often stopped trying to maintain the memory load when they had to carry out the speeded task. They showed a profound effect of the speeded task on the memory task compared to a memory-task-alone condition, but little or no effect of the memory task on the speeded task. In contrast, older children and adults did try to maintain the memory load and therefore demonstrated a mild impairment of performance on both tasks, compared to when only one task was required. The age differences are consistent with the notion of younger children operating with a reactive stance, as if they invest effort in each task only as it arrives, as compared to older participants operating with a proactive stance, as if they invest effort not only on the current speeded task, but also on the probe recognition phase of the trial that will follow. There is other, converging evidence for a developmental shift from reactive to proactive stances in the childhood development of working memory (Chevalier, Martis, Curran, & Munakata, 2015; Morey, Mareva, Lelonkiewicz, & Chevalier, 2018).

Off-loading of information to free up attention. Sometimes, it is impossible to share attention well across two tasks but, according to a hypothesis of Cowan (2019) and Rhodes and Cowan (2018), another possibility is to off-load some information out of the focus of attention to be saved in activated long-term memory (with new learning about associations between items and their serial or spatial positions). Relevant to this process, Cowan et al. (2018) examined the development of the ability to combine visual information from spatial arrays and acoustic information from tone or digit series, with the two sets of stimuli presented in succession (method, Fig. 2, Interim Task B; results, Fig. 3B). On some trials, the task was to retain both sets in working memory whereas, on other trials, only one set was to be retained. The reduction in performance resulting from the requirement of retaining both sets was considered a central, attention-intensive part of capacity, whereas the portions of visual and acoustic memory that were not affected by the instructions were considered peripheral, feature-specific aspects of capacity based on activation of long-term memory elements. The capacity of feature-specific portions increased markedly with age from the early elementary school years through adulthood, whereas the central, attention-intensive portion did not increase. The implication is that, as participants mature, they do not hold more information in the focus of attention at once. Instead, they somehow learn to hold more information in a manner that does not stress the focus of attention. Cowan et al. proposed that the more mature participants do this by memorizing patterns that could be retained for the duration of the trial without very much use of attention. More research on the mechanism is warranted.

The cognitivist and constructivist approaches and working memory training. One difference between my research program and many others is that, unlike many other laboratories, I have not addressed the question of whether aspects of working memory can be trained. This has been done by others to examine whether it is possible to improve indices of brain power, intelligence, aptitude, or scholastic performance (for reviews see Diamond & Lee, 2011; Holmes & Gathercole, 2014; Melby-Lervåg & Hulme, 2013; Morrison & Chein, 2011; Sala & Gobet, 2017). This is essentially the same question of how much working memory adheres to the empiricist agenda, with experience altering processing abilities, or the nativist agenda, with experience having little or no effect.

The consensus from this research is that it is certainly possible to improve performance on a task but that, at least in typically developing children, that training will not much change performance across task components more than ordinary daily life would do. Any one kind of training does not display much in the way of “far transfer” to other task skills that were not directly trained. It may still be possible that some children with learning deficits do not get optimal training in daily life or educational settings and might benefit from working memory training. There also may be changes in brain physiology from training, which would suggest that the brain is not an indication of wholly genetic traits; we already know this from earlier studies of environmental enrichment in animals (e.g., Kempermann, Kuhn, & Gage, 1997).

I think that the finding of little or no far transfer of working memory training effects has had a dampening effect on the field of working memory, as far as the public is concerned. People have thought, if working memory cannot usefully be trained, then why study it? Of course, one reason to study it is to assess what educational materials are best suited to children of a certain age or ability level (Cowan, 2014). Also, training of working memory even without far transfer can be important for meaningful tasks that involve working memory, such as mathematics. My educated guess at present is that working memory should be trained by making sure to include problems in a topic area such as math or science with sufficiently challenging (but not overly difficult) working memory demands resembling the demands encountered in important activities.

According to the cognitivist and constructivist views, the most important environmental input would be expected to be input making the individual think, reflect, or adapt better to the environment, and it may well be that working memory training is too repetitive and sterile to stimulate much intellectual growth. What might work better is training that does more to help guide participants in the effective construction of new mnemonic strategies, and to help increase the metamemory needed to know where improvement is most possible, so as to develop strategies usable given the child’s current abilities (cf. Demetriou & Spanoudis, 2018).

Summary and prognosis for the cognitivist and constructivist influences on working memory development. It appears that the structures of working memory do not show tremendous, obvious specialization for different domains (e.g., see common functions across various tasks administered to children 4 to 15 years by Gathercole et al., 2004). They do not markedly change with development, at least during the elementary school years and possibly earlier. Thus, for example, infants and adults both show signs of a core working memory faculty that can retain only about 3 items, despite developmental change in performance using adult procedures (e.g., Cowan, Elliott, et al., 2005; Simmering, 2016). What seems to improve most with development is the active construction of mnemonic strategies and knowledge to enrich working memory encoding and to handle and process information better (e.g., see Fig. 3), changes that appear to underlie the increased ability to understand and construct concepts (Halford et al., 2007). Several suggestions for understanding the results and carrying out future research from a cognitivist standpoint will be suggested below.

Working memory and attention. If there is a single biological development related to the various advancements of working memory, along with increasing knowledge, it is the way in which attention can be used to initiate and carry out mnemonic strategies. Many have pointed out the close relationship between attention and working memory in children (e.g., Alloway, Gathercole, Kirkwood, & Elliott, 2009; Bertrand & Camos, 2015; Karatekin, Marcus, & Couperus, 2007; Magimairaj & Montgomery, 2013; Rogers, Hwang, Toplak, Weiss,
& Tannock, 2011; Siegel & Ryan, 1989), even though attention does not account for everything that working memory scores do (for example, with a separate role of working memory in reading achievement shown by Slattery et al., 2021). What was unclear previously was the degree to which the role of attention in working memory was one of storage or mnemonic processing. Several of the recent studies reviewed here suggest that a major aspect of development is the use of attention for mnemonic processing rather than storage. It is an improved use of executive control akin to what is found in other procedures examining the development of attention (e.g., for reviews see Ristic & Enns, 2015; Shore, Burack, Miller, Joseph, & Enns, 2006). In contrast, I have not seen strong evidence of developmental change in the more automatic aspects of working memory storage or in the number of items that can fit in the focus of attention at once, which is a revision of a hypothesis that I used previously to explain why the attention-intensive aspects of working memory are the ones that change most with development (Cowan, Elliott, et al., 2005). Thus, the central storage component of working memory exists but did not change in magnitude with age, whereas the ability to use attention for better encoding and memorization of the materials apparently did (Cowan et al., 2018), as Fig. 3B shows; and the ability to coordinate storage with a separate processing task showed children taking a more proactive stance with development, indicating improved ability to share attention between two tasks (Cowan et al., 2021), as Fig. 3C shows.

Forsberg et al. (2021b) recently examined the role of attention in working memory and long-term learning of pictures of common objects. The ratio between working memory capacity for objects in an array and the number of them later recognizable in a long-term memory test depended on whether the items were presented in sub- or super-capacity sets. Nevertheless, the ratios were very similar across the elementary school years and adulthood; younger children remembered fewer items in both tests but the working-memory to long-term-memory ratio did not change. As suggested by Cowan (2014), it appears that working memory underpins learning directly, so that the amount learned depended on maturation because working memory depended on maturation. However, the degree of learning depended on attention in that sub-capacity sets were better encoded into working memory and better learned in each age group.

Working memory and neo-Piagetian theory. The opportunity to apply a neo-Piagetian approach to an understanding of the role of working memory in cognition is under-utilized. For example, theory-of-mind studies (e.g., Andrews et al., 2003) could be combined with measures of working memory for the underlying premises to determine the role of memory failure in performance. We now have a good idea that what changes with development is largely the ability to manage attention more efficiently, for example by applying knowledge and noticing patterns that can be used to help ease the load (Rhodes & Cowan, 2018). This kind of attention-based theory is consistent with Case et al. (1982) and work from a neo-Piagetian perspective (e.g., Pascual-Leone & Johnson, 2021). It remains to be determined just what biological maturation may occur to allow this improved management of attention. Clearly, the frontal lobes help in managing attention and are some of the latest portions of the brain to mature (Casey, Giedd, & Thomas, 2000).

Understanding u-shaped developmental changes. There are various cases in which u-shaped developmental growth is observed (e.g., Gershkoff-Stowe & Thelen, 2004), such as the apparent indication of more working memory capacity in infancy than in the early elementary school years (Cowan, 2016) or the change from the correct word form like “went” to an invented word form like “goed.” Fundamentally, though, the theoretical expectation is that increases in knowledge and ability are monotonic across infant and child development, not u-shaped. To demonstrate monotonic development despite u-shaped behavioral change, working memory and the child’s conceptual level could be indexed through physiological measures (e.g., alpha suppression measures of attention to items: Wang, Megla, & Woodman, 2021) in a wide range of ages.

The specific uses of working memory in other cognitive tasks. More research is needed to understand just what working memory is used for. Here is one example in which the findings about the use of working memory were clear, but counter to our expectations. Adams and Cowan (2021) went beyond the usual correlational approach, examining 4- and 5-year-old children’s delayed imitation of passive-sentence descriptions of pictures (“speaking in a new way,” they were told), sometimes while under a memory load. The results were counterintuitive, indicating that children ventured more passive sentences under load than not under load. In the process, they occasionally made some silly mistakes, such as saying “The girl was watered by the flower.” The results make sense with the notion that, without a load, working memory was used to check the semantic accuracy of the repetition, with children often abandoning the assigned task of speaking in the new way for the sake of semantic accuracy. In contrast, under a working memory load, this reformulation of the idea to be sure of its semantic accuracy was not feasible, so what predominated was simply mimicking the presented sentence as well as possible, despite errors.

Summary. In sum, there are various ways in which children’s working memory improves with age, largely having to do with management of attention. The improvement of working memory triggers improved learning from the environment and the ability to handle more complex ideas. The findings underlying these statements provide invitations for further research, with many new avenues now open.

2.4. Working memory development and dynamic systems theory

Work with dynamic systems theory has explained working memory development in terms of how basic changes in neural functioning, with environmental input, lead to the developmental growth of capacity (e.g., Perone, Simmering, & Buss, 2021; Simmering, 2016; Spencer, 2020; Thelen & Smith, 1994). The approach has relied heavily on computational models to show how very basic neural processes combined with knowledge can create developmental changes, for example in working memory capacity. The strength of the approach is to show neural interactions through computations with emergent consequences for the child’s processing capabilities that might not be obvious from verbal arguments alone.

In principle, the dynamic systems approach seems friendly to cognitivism as it may be providing a different level of analysis to understand the interactions between heredity and environment more fully. It can serve as a check on behavioral investigations. In a related approach, Westermann et al. (2011) describe the new field of neuroconstructivism, saying (p. 723) that “Neuroconstructivism builds on the Piagetian view that development constitutes a progressive elaboration in the complexity of mental representations via experience-dependent processes. However, neuroconstructivism is also informed by recent theories of functional brain development, under the view that the character of cognition will be shaped by the physical system that implements it.”

The basic difficulty in developmental study is that age is not an experimental manipulation; many processes improve with age at once and it is difficult to disentangle them, as I have been trying to do. Is metamemory important in the efficient use of working memory? Do older participants remember more through their increased ability to encode patterns in the stimuli? I can investigate whether those processes occur, but the dynamic systems approach can provide input into whether there is an elegant way to conceive of the connection between them and working memory capacity development, or whether instead they seem unessential correlates of that capacity.

As Witherington and Margett (2011) discussed, there is a wide range of variability among possible dynamic systems approaches, with some of them more consistent with a view that the main task is to absorb structure from the environment and other dynamic systems approaches more consistent with a view that the main task is to understand inherited maturational changes, or to coordinate both of these things.
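
As a worked illustration of the central versus peripheral capacity logic described earlier for Cowan et al. (2018), the arithmetic can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the three-part additive model, the names `CapacityEstimates` and `decompose`, and all numerical values are hypothetical and chosen for illustration; they are not the published equations or data. The assumed model treats the drop in total items retained when both sets must be remembered as the central, attention-intensive portion, and what remains of each modality as the peripheral portion.

```python
from dataclasses import dataclass


@dataclass
class CapacityEstimates:
    """Items retained under each instruction (hypothetical values, in items)."""
    visual_alone: float    # visual items retained when only the array is to be remembered
    acoustic_alone: float  # acoustic items retained when only the series is to be remembered
    visual_both: float     # visual items retained when both sets are to be remembered
    acoustic_both: float   # acoustic items retained when both sets are to be remembered


def decompose(est: CapacityEstimates) -> dict:
    """Split capacity into one shared central part and two peripheral parts.

    Assumed additive model (an illustration, not necessarily the published one):
      visual_alone                = peripheral_visual + central
      acoustic_alone              = peripheral_acoustic + central
      visual_both + acoustic_both = peripheral_visual + peripheral_acoustic + central
    """
    # Central portion: the total loss incurred by having to retain both sets.
    central = (est.visual_alone + est.acoustic_alone) - (est.visual_both + est.acoustic_both)
    return {
        "central": central,
        "peripheral_visual": est.visual_alone - central,
        "peripheral_acoustic": est.acoustic_alone - central,
    }


# Hypothetical numbers that mimic only the qualitative pattern described in the text:
# the central portion stays constant with age while the peripheral portions grow.
younger = CapacityEstimates(visual_alone=2.0, acoustic_alone=2.0,
                            visual_both=1.5, acoustic_both=1.5)
older = CapacityEstimates(visual_alone=3.5, acoustic_alone=3.5,
                          visual_both=3.0, acoustic_both=3.0)

print(decompose(younger))
print(decompose(older))
```

With these made-up inputs, the central estimate is one item for both groups, whereas each peripheral estimate rises from 1.0 to 2.5 items, which is the kind of dissociation the text attributes to development.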
3. Concluding remarks: a long perspective

A lot has happened since the field of working memory development began. Nevertheless, there are many gems of the past that have not been fully integrated into the current-day outlook. Bolton (1892) reported on written digit spans of about 1,500 children between 8 and 15 years old, but without a clear theoretical rationale. Following that study of norms long ago, working memory research has been influenced by big theories. The important thing for us now is to be aware of the major influences and to use a disconfirming stance to allow room for the other influences. If one is examining working memory or its effects in infancy, one must allow for early learning along with neurobiology. If one is assessing what children know, or their tendency to follow directions, one must consider that failure can come from working memory or attention limitations in replying or trying to carry out directions (Holmes et al., 2014; Jaroslawska et al., 2016; Ristic & Enns, 2015; Shore et al., 2006). When we evaluate unexpectedly early skills in infancy, we need to ask whether neo-Piagetian ideas can explain the progression in infancy, with theories that preserve some aspects of Piaget’s theory while revising the timeline and task-specific processing demands. We must ask children what they think they know to assess what strategies they will judge to be helpful. We should use cognitive level and processing limits mainly to tune the learning material to the children (Cowan, 2014), not to try to push the children to where they are not. If there is nothing as practical as a good theory, as is often said, still perhaps there is nothing as useful as several contrasting theories, which have pushed research on working memory development in different directions for the past half-century.

Credit author statement

This manuscript is sole-authored and the author did all of the work.

Open science practices

The previously unposted data from Cowan et al. (2015) and means for all results replotted in Fig. 3 are posted at Cowan (2021), https://osf.io/xg6j3/. Original data for the other studies regraphed in Fig. 3 are posted with those studies.

Acknowledgements

This research was supported by NICHD grant R01-HD021338.

Baillargeon, R., Spelke, E., & Wasserman, S. (1985). Object permanence in five-month-old infants. Cognition, 20, 191–208.
Barrouillet, P., Portrat, S., & Camos, V. (2011). On the law relating processing to storage in working memory. Psychological Review, 118, 175–192.
Bayliss, D. M., Jarrold, C., Gunn, D. M., & Baddeley, A. D. (2003). The complexities of complex span: Explaining individual differences in working memory in children and adults. Journal of Experimental Psychology: General, 132, 71–92.
Belsky, J., Bakermans-Kranenburg, M. J., & van Ijzendoorn, M. H. (2007). For better and for worse: Differential susceptibility to environmental influences. Current Directions in Psychological Science, 16, 300–304.
Bertrand, R., & Camos, V. (2015). The role of attention in preschoolers’ working memory. Cognitive Development, 33, 14–27.
Biography Newsletter. (December, 2021). Downloaded from the worldwide web on 23. https://www.biography.com/scientist/jean-piaget.
Bjork, R. A., & Whitten, W. B. (1974). Recency-sensitive retrieval processes in long-term free recall. Cognitive Psychology, 6, 173–189.
Bolton, T. L. (1892). The growth of memory in school children. American Journal of Psychology, 4, 362–380.
Brady, T. F., & Alvarez, G. A. (2015). No evidence for a fixed object limit in working memory: Spatial ensemble representations inflate estimates of working memory capacity for complex objects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41, 921–929.
Braine, M. D. (1976). Children’s first word combinations. Monographs of the Society for Research in Child Development, 41, 104. https://doi.org/10.2307/1165959
Braine, M. D. S., Brooks, P. J., Cowan, N., Samuels, M. C., & Tamis-LeMonda, C. (1993). The development of categories at the semantics/syntax interface. Cognitive Development, 8, 465–494.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon Press.
Broadbent, D. E. (1975). The magic number seven after fifteen years. In A. Kennedy, & A. Wilkes (Eds.), Studies in long-term memory (pp. 3–18). Oxford, England: John Wiley & Sons.
Brown, J. (1958). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 10, 12–21.
Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. John Wiley and Sons.
Buchsbaum, B. R., & D’Esposito, M. (2019). A sensorimotor view of verbal working memory. Cortex, 112, 134–148.
Çalık, M., Ayas, A., & Ebenezer, J. V. (2005). A review of solution chemistry studies: Insights into students’ conceptions. Journal of Science Education and Technology, 14,
Camos, V., & Barrouillet, P. (2011). Developmental change in working memory strategies: From passive maintenance to active refreshing. Developmental Psychology, 47, 898–904. https://doi.org/10.1037/a0023193
Camos, V., & Barrouillet, P. (2018). Working memory in development. Routledge.
Case, R., Kurland, D. M., & Goldberg, J. (1982). Operational efficiency and the growth of short-term memory span. Journal of Experimental Child Psychology, 33, 386–404.
Casey, B. J., Giedd, J. N., & Thomas, K. M. (2000). Structural and functional brain development and its relation to cognitive development. Biological Psychology, 54(1–3), 241–257.
Chater, N., & Brown, G. D. A. (1999). Scale-invariance as a unifying psychological principle. Cognition, 69, B17–B24. https://doi.org/10.1016/S0010-0277(98)00066-
Cheie, L., MacLeod, C., Miclea, M., & Visu-Petra, L. (2017). When children forget to remember: Effects of reduced working memory availability on prospective memory performance. Memory & Cognition, 45, 651–663. https://doi.org/10.3758/s13421-
