Abstract
Objective: The aims of this study were to identify the available measures used to assess prospective memory (PM) abilities, to describe their content and to quantitatively summarize the effects of various diseases on PM.
Method: A systematic literature search of PubMed, PsycArticles and PsycInfo was conducted up to June 2019 to identify the existing prospective memory measures. The identified PM measures were classified according to the type of assessment: test batteries, single-trial procedures, questionnaires and experimental procedures. Meta-analyses compared PM scores between patients with various diseases and controls depending on the type of assessment.
Results: In total, 16 measures were identified. Most measures evaluated both event- and time-based PM tasks, were linked to functional outcomes, showed empirical evidence regarding validity and reliability and provided parallel versions. To a slightly lesser extent, measures were normed, translated into another language, provided cutoff scores for diagnostic purposes and qualitative scoring, and allowed the use of external aids during the test. Compared to healthy controls, patients had significantly lower PM scores when test batteries, single-trial procedures and experimental procedures were administered. Planned subgroup analyses indicated consistent PM impairments for patients relative to controls for three test batteries. PM complaints did not differ between patients and controls and the scores were heterogeneous across studies.
Conclusions: This work contributes to the inventory of the existing PM measures for both research and clinical purposes. We suggest some future directions based on these findings.
Introduction
Everyone forms intentions that are not executed immediately (e.g., taking medication)
but are instead scheduled for another moment or context (e.g., at 8 pm or during dinner). The
term prospective memory (PM), or realization of delayed intentions (Ellis, 1996), is used to
define memory for activities to be performed in the future (Einstein & McDaniel, 1990). It is
commonly distinguished from retrospective memory, which refers rather to the ability to
remember past information (e.g., remembering the activities we did during the last holidays). PM is classically considered to involve a prospective component to remember that something has to be done (intent) and a retrospective component to remember what has to be done and when (content) (Einstein & McDaniel, 1990, 1996).
Another distinction has been made according to the nature of the cue that triggers the retrieval of the intention. In time-based tasks, the intention execution is self-initiated by the person within a specific temporal frame (e.g., taking medication at 8 pm), whereas in event-based tasks the intention is retrieved when an external cue occurs (e.g., taking medication during dinner). These two theoretical distinctions are supported by dissociations between the two PM components (e.g., Hainselin et al., 2011; Umeda, Nagumo, & Kato, 2006), as well as between time- and event-based tasks (e.g., Yang, Zhong, Qiu, Cheng, & Wang, 2015).
According to McDaniel and Einstein (2007), a typical PM task must respect several features: 1) the action must not be fulfilled immediately, so there must be a delay between the encoding and the retrieval phases; 2) it should be embedded in another task (named the “ongoing task”) in which the PM cue represents a part of the situation; 3) the time period during which the action can be performed must be established; 4) the time required to perform the PM task must be established (e.g., a deadline for taking medication); and 5) the to-be-performed action must be formulated consciously without remaining constantly in mind. PM failures account for a substantial proportion of everyday memory problems (Crovitz & Daniel, 1984; Terry, 1988), such that PM has a significant impact on everyday functioning, including medication adherence (Mathias & Mansfield, 2005; Zogg, Woods, Sauceda, Wiebe, & Simoni, 2012). Several meta-analyses have reported a PM impairment in normal aging, especially after 70 years of age (Henry, MacLeod, Phillips, & Crawford, 2004; Ihle, Hering, Mahy, Bisiacchi, & Kliegel, 2013), and
in a wide range of clinical groups, including neurodegenerative (Ramanan & Kumar, 2013;
van den Berg, Kant, & Postma, 2012), neurodevelopmental disorders (Landsiedel, Williams,
& Abbot-Smith, 2017), neurological injuries (Wong Gonzalez, 2015) and psychiatric
syndromes (Wang et al., 2009; Zhou et al., 2017). The PM impairments in these clinical groups are consistent with a multiprocess model which argues that successful performance relies not only on relatively spontaneous retrieval supported by an associative memory system (McDaniel & Einstein, 2000; Moscovitch, 1994), but also on a strategic monitoring system (Smith, 2003; Smith & Bayen, 2004). This latter system operates for complex PM tasks, requiring attention allocation and executive control to detect relevant environmental cues and activate delayed intentions. This strategic monitoring system has been associated with a set of brain regions including lateral Brodmann area 10, Brodmann area 40, the insula and the anterior cingulate cortex (for a review, see McDaniel, Umanath, Einstein, & Waldum, 2015), regions often impaired in normal aging, patients with mild cognitive impairment, autism spectrum disorders, traumatic brain injury or psychiatric disorders. A valid assessment of PM, beyond traditional retrospective memory, would provide great benefit for patients for whom functional outcomes and daily issues are not identified by the traditional memory tasks used by clinicians, and for detecting
individuals who are at risk of developing dementia (Rabin et al., 2014; Troyer & Murphy,
2007).
However, PM assessment remains scarce in clinical practice. In their survey of 747 neuropsychologists, Rabin, Barr and Burton (2005) showed that the Wechsler Adult Intelligence Scale-Revised (Wechsler, 1987) and the Wechsler Memory Scale-Third Edition (Wechsler, 1997), two batteries without PM subtests, were ranked in first position, endorsed by 70.80% of the respondents. The single test with partial PM assessment, the Rivermead Behavioural Memory Test (Wilson, Cockburn, & Baddeley, 1985), was ranked well below these instruments. In their 10-year follow-up study, Rabin, Paolillo and Barr (2016) reported the same top 5 answers, and not a single PM assessment was mentioned as part of the respondents’ daily clinical practice. However, at the same time, many research teams developed measures to assess PM (i.e., test batteries, single-trial procedures, questionnaires and experimental procedures). Rabin et al. (2016) also reported ongoing challenges encountered by neuropsychologists; these included the lack of adequate normative data, ecological validity, reliability, diagnostic accuracy, parallel versions and translation into other languages. In the current review, we show that PM measures do not cover all of these factors, which in turn limits their usefulness in day-to-day clinical practice, as their administration also requires trained personnel. Anecdotally, the first normed ecological PM measure, the Cambridge Test of Prospective Memory (Wilson, Foley, Shiel, Watson, Hawkins, & Groot, 2005), was published and made available to clinicians only in 2005.
The goals of the current paper were (1) to identify through a systematic review the available measures to assess PM abilities by describing their content and (2) to use a meta-analytical approach to quantitatively summarize the effects of various diseases on PM performance according to the type of assessment.
Methods
To objectively assemble and screen the literature in search of PM assessment tools for the current study, we selected empirical studies that met criteria defined according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA; Gates & March, 2016).
Eligibility Criteria
Criteria for inclusion were: (1) peer-reviewed journal articles, papers presented at scientific conferences or dissertations, (2) published in the English language and (3) studies that assessed PM abilities. Studies were excluded if they were (1) primarily focused on an area other than PM, (2) PM studies with training/rehabilitation purposes, (3) focused on non-human populations, (4) single case studies and (5) review articles, systematic reviews and meta-analyses. All studies that met the inclusion criteria and none of the exclusion criteria were included in the review. To maximize the identification of existing measures to assess PM, the literature searches were not limited by the age or neurological status of individuals.
Information Sources
A systematic search of published studies was conducted by the first author using the PubMed, PsycArticles and PsycInfo databases. The last search was run on June 12, 2019.
Literature Search
The initial search was conducted in the PubMed, PsycArticles and PsycInfo databases and included the term “prospective memory” in abstracts, titles or keywords. To capture the psychometric properties of the assessment tools, we added the search filter terms proposed by Terwee, Jansma, Riphagen, and De Vet (2009). Therefore, we added the most sensitive (i.e., “valid*” and “reliab*”, with percentages of 39.70% and 37.90%) and the most specific terms (with percentages of 100.00%, 42.30% and 35.70%, respectively) in abstracts, titles or keywords to the final search query. For all articles found, titles, abstracts and keywords were screened for eligibility, and the Abstrackr machine learning tool (Wallace, Small, Brodley, Lau, & Trikalinos, 2012) was used to assist the screening process.
Data Items
Data collected from each reviewed study included the names of the tests used, country of publication (ISO 3166) with language, age range with mean and standard deviation, education range with mean and standard deviation, sample sizes with the dependent variable (i.e., the proportion of PM cues correctly responded to or the proportion of PM complaints for questionnaires) for patients and healthy controls, total duration of the test with the length of the retention interval, number of PM items, and study reference. We developed an algorithm for each study to clarify whether the identified PM measures met the key criteria derived from Rabin et al.'s (2016) study: language translation, cross-cultural adaptation, validity assessed, reliability assessed, normative data, cutoff scores for diagnostic purposes, parallel version, relationship to a functional outcome measure, and qualitative scoring. This initial algorithm was also extended to three other key variables of interest specific to the field of PM, namely event-based tasks, time-based tasks and the use of external aids during the test. Each study was assigned one or a combination of these 12 key variables depending on whether it met these criteria.
The identified PM measures were assigned to four distinct types of assessment: test batteries, single-trial procedures, questionnaires and experimental procedures. This classification relies on the distinction between objective and subjective measures, as well as the number of PM items included in the test (e.g., Kinsella, Pike, Cavuoto, & Lee, 2018). The number of studies included in the current systematic review and meta-analysis was calculated, as well as the number of PM measures identified for each category and the mean age and education (in years) of the samples. An Excel sheet was created for each assessment type to facilitate the identification of studies, as well as to reduce the risk of counting duplicates. Our three-stage analysis consisted of 1) identifying the key variables that met the criterion for each study, 2) aggregating each key variable occurrence meeting the criterion for each of the identified PM measures, and 3) calculating the percentage of criteria met for each key variable according to the number of measures identified in each assessment category.
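As an illustration only (a minimal sketch, not the authors' actual tooling), this three-stage aggregation can be expressed as a short script; the study records, measure names and criteria below are hypothetical placeholders.

from collections import defaultdict

# Hypothetical study records: each study lists the PM measure it used and
# the key variables (criteria) it met for that measure.
studies = [
    {"measure": "MeasureA", "criteria": {"validity", "reliability", "event-based"}},
    {"measure": "MeasureA", "criteria": {"normative data", "event-based"}},
    {"measure": "MeasureB", "criteria": {"validity", "time-based"}},
]

KEY_VARIABLES = ["validity", "reliability", "normative data",
                 "event-based", "time-based"]

# Stage 2: aggregate, for each measure, the key variables met at least once.
met_by_measure = defaultdict(set)
for study in studies:
    met_by_measure[study["measure"]] |= study["criteria"]

# Stage 3: percentage of measures meeting each key variable.
n_measures = len(met_by_measure)
for variable in KEY_VARIABLES:
    n_met = sum(variable in met for met in met_by_measure.values())
    print(f"{variable}: {100 * n_met / n_measures:.2f}% of measures")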
Meta-Analytic Approach
We used the random effects model for all analyses in order to provide a more realistic approach when combining data from studies with various methodologies and sample characteristics, compared to the fixed effects model (Borenstein, Hedges, Higgins, & Rothstein, 2010; Cheung & Vijayakumar, 2016).
Given the small sample sizes of some of the included studies, we calculated the bias-corrected standardized mean difference (Hedges' g), as recommended by the Cochrane Collaboration (Higgins & Green, 2011). Effect sizes were considered small, medium and large when g ≥ .20, .50 and .80, respectively (cf. Cohen, 1988).
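As a hedged illustration of this computation (a minimal sketch based on the standard Hedges' g formulas, not the authors' code; the group summary values are hypothetical):

import math

def hedges_g(m_patients, sd_patients, n_patients, m_controls, sd_controls, n_controls):
    """Bias-corrected standardized mean difference (Hedges' g) and its approximate variance."""
    # Pooled standard deviation across the two groups.
    sd_pooled = math.sqrt(((n_patients - 1) * sd_patients**2 +
                           (n_controls - 1) * sd_controls**2) /
                          (n_patients + n_controls - 2))
    d = (m_patients - m_controls) / sd_pooled          # Cohen's d
    j = 1 - 3 / (4 * (n_patients + n_controls) - 9)    # small-sample correction factor
    g = j * d
    var_d = ((n_patients + n_controls) / (n_patients * n_controls)
             + d**2 / (2 * (n_patients + n_controls)))
    return g, j**2 * var_d

# Hypothetical example: patients score lower than controls, so g is negative,
# matching the sign convention used in the Results.
g, var_g = hedges_g(12.0, 4.0, 25, 16.0, 4.5, 30)
print(round(g, 2), round(var_g, 3))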
Six studies reported data on multiple, but distinct, PM measures. The main feature of these studies is that the same participants provided data on at least two different PM tasks, including both objective (i.e., test batteries or experimental procedures) and subjective (i.e., questionnaire) measures. In this situation, one cannot treat the different outcomes as though they were independent, as this would lead to a misleading estimate of the variance of the overall effect (cf. Senn, 2009). In an effort to improve the reliability of our analyses, and given that the administration of objective and subjective measures is a traditional approach used by authors to assess PM, the Hedges' gs of individual studies were pooled into a mean effect size according to the type of assessment (i.e., test batteries, single-trial measures, questionnaires and experimental measures). Pooled effect sizes in the negative direction indicated that PM performance was lower for the patients than for the controls.
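A minimal sketch of this pooling step (illustrative only; study labels and values are hypothetical) averages, within each study, the effect sizes that belong to the same assessment type, so that each study contributes at most one effect per meta-analysis:

from collections import defaultdict

# Hypothetical per-outcome effect sizes: (study, assessment type, Hedges' g, variance).
outcomes = [
    ("Study1", "test battery", -1.10, 0.09),
    ("Study1", "questionnaire", 0.15, 0.08),
    ("Study2", "test battery", -0.80, 0.12),
    ("Study2", "test battery", -1.30, 0.10),  # two battery outcomes in the same study
]

grouped = defaultdict(list)
for study, assessment, g, var_g in outcomes:
    grouped[(study, assessment)].append((g, var_g))

# One averaged effect per study and assessment type (variances averaged as a rough
# approximation; this ignores the correlation between outcomes, which would require
# participant-level data the primary studies do not report).
study_effects = {key: (sum(g for g, _ in vals) / len(vals),
                       sum(v for _, v in vals) / len(vals))
                 for key, vals in grouped.items()}
print(study_effects)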
The homogeneity of the effect sizes between the samples was measured using the Q statistic. A significant Q index indicates that the variance of effect sizes in the population is greater than would be expected from sampling error alone. We also calculated the I2 statistic, which refers to the percentage of variation across studies that is due to heterogeneity rather than chance (Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003). An I2 value of 0% indicates no observed heterogeneity, and larger values reflect an increase in heterogeneity. Heterogeneity was assumed to be low, moderate and high when the I2 value was 25%, 50% and 75%, respectively (Higgins et al., 2003). In cases where heterogeneity estimates indicated a substantial difference between individual studies, we conducted planned subgroup analyses for all measures included in the assigned assessment category to further examine the source of the heterogeneity. Subsequent analyses were conducted only for measures that were used in at least two different studies.
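The sketch below (again illustrative rather than the authors' implementation) computes Q, I2, a DerSimonian-Laird estimate of the between-study variance and the resulting random-effects summary effect from a list of per-study Hedges' g values and variances; the input numbers are hypothetical.

import math

def random_effects(gs, variances):
    """DerSimonian-Laird random-effects pooling with Q and I2 heterogeneity indices."""
    w = [1 / v for v in variances]                      # fixed-effect (inverse-variance) weights
    fixed = sum(wi * gi for wi, gi in zip(w, gs)) / sum(w)
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, gs))
    df = len(gs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # between-study variance
    w_star = [1 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * gi for wi, gi in zip(w_star, gs)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return pooled, se, ci, q, i2, tau2

# Hypothetical effect sizes for one assessment category.
gs = [-1.8, -0.6, -1.2, -2.4]
variances = [0.10, 0.08, 0.15, 0.20]
print(random_effects(gs, variances))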
Publication bias refers to the greater likelihood of publication for studies showing statistically significant results than for studies with non-significant results. This causes a Type I publication bias error and results in a spurious effect of the parameter of interest. Publication bias is typically suggested by an asymmetrical funnel plot, with an Egger test p < .05. To overcome this bias, the influence of unpublished studies should be taken into consideration. Therefore, we used funnel plots and Egger's tests to examine whether asymmetry due to publication bias was present in the included studies, and we also applied Rosenthal's (1979) fail-safe N formula to estimate the number of additional null-result studies that would be needed to render the overall effect non-significant.
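For completeness, here is a hedged sketch of both checks (not the authors' code): the regression below is the classic Egger intercept test, and the fail-safe N follows Rosenthal's formula; all inputs are hypothetical.

import math

def egger_intercept(gs, ses):
    """Egger's regression test: regress g/se on 1/se; a non-zero intercept suggests asymmetry."""
    y = [g / s for g, s in zip(gs, ses)]
    x = [1 / s for s in ses]
    k = len(gs)
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mx ** 2 / sxx))
    t = intercept / se_intercept  # compare against a t distribution with k - 2 df
    return intercept, t

def rosenthal_failsafe_n(z_scores, alpha_z=1.645):
    """Number of additional null studies needed to bring the combined one-tailed p above .05."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / alpha_z ** 2 - k

# Hypothetical per-study effects and standard errors.
gs = [-1.8, -0.6, -1.2, -2.4]
ses = [0.32, 0.28, 0.39, 0.45]
print(egger_intercept(gs, ses))
print(rosenthal_failsafe_n([abs(g) / s for g, s in zip(gs, ses)]))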
Results
The initial literature search of the 3 databases generated a total of 326 references (63 in PubMed, 4 in PsycArticles and 259 in PsycInfo). An additional 15 studies identified in the reference lists of these articles or already known to the first author through previous readings were also considered for inclusion. From the 341 references, 50 duplicate records were excluded and 105 records were excluded based on titles and abstracts; consequently, 186 full texts of articles were retained. After reviewing the entire content of these articles and applying the exclusion criteria, the number of studies that met the inclusion criteria was 52 for the literature review and 23 for the meta-analysis.
The literature review identified a total of 16 PM measures, including:
• Five test batteries (see Table 1 for an overview of criteria met and the corresponding section below): the Rivermead Behavioural Memory Test (Wilson et al., 1985), the Cambridge Behavioural Prospective Memory Test (Kime et al., 1996), the Cambridge Test of Prospective Memory (Wilson et al., 2005), the Memory for Intentions Screening Test (Raskin, 2004) and the Royal Prince Alfred Prospective Memory Test (Radford et al., 2011).
• Three single-trial procedures (see Table 2 for an overview of criteria met and the corresponding section below): the envelope task (Huppert, Johnson, & Nickson, 2000), the prompt card task (Delprado et al., 2012) and the telephone test (Hsu, Huang, Tu, & Hua, 2014).
• Four questionnaires (see Table 3 for an overview of criteria met and the corresponding section below): the Prospective Memory Questionnaire (Hannon et al., 1990), the Prospective and Retrospective Memory Questionnaire (Smith et al., 2000), the Comprehensive Assessment of Prospective Memory (Roche et al., 2002) and the Brief Assessment of Prospective Memory (Man et al., 2011).
• Four experimental procedures (see Table 4 for an overview of criteria met and the corresponding section below): the Prospective Remembering Video Procedure (Titov & Knight, 2001), the Test Écologique de Mémoire Prospective (Potvin et al., 2011), and the Virtual Week and the Actual Week (Rendell & Craik, 2000).
Test Batteries of PM
Rivermead Behavioural Memory Test (RBMT). The RBMT (Wilson et al., 1985) includes three PM items assessing event-based tasks (e.g., remembering to ask the experimenter for the next appointment time when an alarm sounds) among the 11 sub-tests making up the battery. The original RBMT has been translated into fourteen languages (Wilson, 2009). The test was subsequently standardized for older adults (Cockburn & Smith, 1989) and adapted for both adolescents (Wilson, Forester, Bryant, & Cockburn, 1990) and young children as the Rivermead Behavioural Memory Test for Children (RBMT-C, commercially available; Wilson, Ivani-Chalian, Besag, & Bryant, 1993). The RBMT-3 is the latest commercially published version of the test and provides a general memory index that follows the basic principles of a standardized IQ score, but it does not provide a standardized PM score (Wilson et al., 2008).
The validity of the RBMT has been assessed on the basis of therapists' observations of 80 brain-damaged patients (35 hours of observation per patient; range 16-55 hours) suffering from everyday memory failures (Wilson, Cockburn, Baddeley, & Hiorns, 1989). Wilson (1991) showed that the standardized profile scores obtained on the RBMT were good predictors of functional independence (e.g., having a paid job) for patients who had experienced severe head injury, although other authors (Mathias & Mansfield, 2005; Mills et al., 1997) could not replicate this result. The limited number of items (3 PM items only), the lack of time-based PM tasks and of a long-term naturalistic task, and a ceiling effect (Mathias & Mansfield, 2005) reduce the validity of this measure. Wilson herself (2009) argues that the RBMT “is not sufficient on its own. It can highlight some of the areas that one might want to tackle in a treatment program but it does not specify with sufficient precision the nature and extent of the everyday problems in such a way that we can set appropriate goals” (p. 46). Indeed, one study reported that even older adults without cognitive impairment or functional difficulties failed PM tasks of the RBMT, especially the Appointment and Belonging sub-tests, but not the Messages sub-test.
Cambridge Behavioural Prospective Memory Test (CBPMT) and the Cambridge Test of Prospective Memory (CAMPROMPT). The CBPMT was initially used in a study of a patient with severe amnesia (Kime et al., 1996) and was adapted into an extended 40-minute version, including 4 time-based and 4 event-based PM tasks, for people with brain injury and controls in order to specifically assess the construct of PM. Despite the lack of validation and normative data, the CBPMT was sensitive enough to differentiate brain-damaged patients from healthy controls (Groot, Wilson, Evans, & Watson, 2002). The CBPMT was the first assessment to allow participants to take notes to help them remember the PM tasks. Interestingly, note takers performed better than non-note takers, regardless of brain injury status.
Wilson et al. (2005) improved the scoring of the CBPMT and created the CAMPROMPT, which provides normative data based on age and IQ. The test includes six tasks (3 time-based and 3 event-based) and requires 25 to 30 minutes for completion. Furthermore, participants are allowed to take notes during the test.
The initial validation and normative data of the CAMPROMPT were collected on 72
patients (mainly traumatic brain-injured patients and patients with degenerative neurological
conditions) and 212 healthy controls, ranging from 16 to 92 years old. Wilson et al. (2005)
found a moderate correlation of .38 between the total profile score of the Rivermead Behavioural Memory Test (RBMT) and both the CAMPROMPT total score and the event-based PM sub-scale (r = .47 for each), but not between the total profile score of the RBMT and the time-based PM sub-scale. Given the lack of time-based PM tasks in the RBMT, and the wide range of cognitive abilities it encompasses, it might not be a sufficiently specific measure of PM. The CAMPROMPT was sensitive enough to distinguish control participants from smokers (Heffernan, O’Neill, & Moss, 2010a), patients with amnestic mild cognitive impairment (Delprado et al., 2012) and young binge-drinkers (Heffernan &
O’Neill, 2012). Patients with spina bifida meningomyelocele also had poorer performance than controls (Dennis, Nelson, Jewell, & Fletcher, 2010). The authors also noted that patients took fewer notes than controls (50.00% vs 82.35%), which was inconsistent with the expectation that patients would spontaneously compensate for their difficulties by using external aids.
The Memory for Intentions Screening Test (MIST). The MIST (commercially published; Raskin, 2004) provides a comprehensive scoring system for omissions (e.g., loss of content or time) and commission errors (e.g., task substitutions). These variables have proved to be relevant in clinical research (see Woods, Twamley, Dawson, Narvaez, & Jeste, 2007, for patients with schizophrenia, and Carey et al., 2006, and Woods, Iudicello, et al., 2008a, for HIV-infected individuals). The MIST includes a total of 8 PM tasks (4 time-based and 4 event-based), with two parallel versions, norms based on 736 participants aged 18 to 94 years, and education percentiles. The MIST also includes a more ecological (optional) task in which participants have to leave a phone message for the clinician 24 hours after the testing session.
Woods et al. (2008b) later published the psychometric characteristics of the MIST collected on 67 healthy adults, ranging from 19 to 74 years old, but no clinical group was enrolled. The correlation analyses showed an acceptable split-half reliability (.70; Spearman-Brown coefficient) and an excellent inter-rater reliability (.99). However, the poor internal consistency for the eight PM tasks (Cronbach's α = .48) might be due to the particularly high level of education of the participants and the restricted range of scores observed in this sample. The authors also showed that the call-back PM task was not linked to any other MIST measures or demographic characteristics. Indeed, unlike for the other MIST items, the participants could use strategies such as taking notes, but did not receive specific advice. However, the authors neither recorded nor published data concerning the number and the type of strategies that may have been used, limiting the conclusions that could support the psychometric properties of this long-term PM task. Carey et al. (2006) showed deficits in time- and event-based tasks, as well as more failures on the 24-hour delay PM task, in HIV-infected individuals compared to controls. A Receiver Operating Characteristic (ROC) analysis highlighted a high discriminative power for the MIST (acceptable sensitivity and specificity, with an area under the curve of .83) in this population, with an acceptable specificity (.74) coefficient. The MIST also demonstrated good ecological validity via significant relationships with the Instrumental Activities of Daily Living scale (IADL; Lawton & Brody, 1969).
Although the MIST and the CAMPROMPT integrate in their design some useful and clinically relevant features, and provide normative data from large samples, their long administration time (30-40 minutes on average) is a technical limitation for their use in routine clinical practice.
Royal Prince Alfred Prospective Memory Test (RPA-ProMem). The design of the RPA-ProMem (Radford et al., 2011) is well suited to clinical constraints, since it takes less than 15 minutes to administer and does not include classical distractor tasks (i.e., “filler” tasks such as puzzles or questionnaires), thereby reducing the additional cognitive demand for patients. The RPA-ProMem also includes a PM task to be carried out outside the laboratory over a longer period (1 week after testing) compared to the 24-hour delay ecological PM task of the Memory for Intentions Screening Test (Raskin, 2004).
The validation of the test was conducted by Radford et al. (2011) with 20 patients presenting various brain disorders (ranging from 18 to 63 years old) and 20 healthy control participants. The RPA-ProMem was sensitive enough to identify patients' PM deficits compared to healthy controls. Radford et al. (2011) did not show any correlation between the RPA-ProMem and the Memory for Intentions Screening Test (MIST; Raskin, 2004) in the control group. According to the authors, this could be partly due to the fact that the MIST does not allow participants to use external aids, in contrast to the RPA-ProMem. Concluding on these elements is thus difficult, given that Radford et al. (2011) did not pay attention to the possible use of other external aids that were not part of the RPA-ProMem. Rabin et al. (2014) showed that individuals with amnestic mild cognitive impairment had worse performance than controls on the RPA-ProMem for time- and event-based PM tasks, as well as for both short- and long-term delays. Patients with subjective cognitive decline also scored lower than controls on the long-term and naturalistic subtasks. The authors also reported a strong inter-rater reliability (intraclass correlation coefficient of .97) and a good alternate-form reliability (rho = .71). Notably, the RPA-ProMem total scores were correlated with the Comprehensive Assessment of Prospective Memory (CAPM; see below) and the Instrumental Activities of Daily Living scale (IADL; Lawton & Brody, 1969), which demonstrates the ecological validity of the tool.
Beyond the interest of questionnaires for assessing functional difficulties in everyday life, the RPA-ProMem may be useful when such measures cannot be obtained from informants.
Single-Trial Procedures
The envelope task (Huppert et al., 2000), the prompt card task (Delprado et al., 2012) and the telephone test (Hsu, Huang, Tu, & Hua, 2014) have been developed to assess PM with a single item embedded in the testing session.
Envelope Task. The envelope task is a single-trial event-based PM task used by Huppert et al. (2000) in a large population-based study of older adults. Despite the lack of formal standardization, the envelope task was administered to 11,956 individuals aged 65 years and above. The clinician tells the participant that he/she will have to write a given name and address (“John Brown, 42 West Street, Bedford”) on an envelope when it is shown, to add their own initials on the back, to seal it and to return it to the clinician. The test allows the assessment of both the prospective and retrospective components of PM. For the prospective component, the participant has to remember to do something within about 5 to 10 seconds after receiving the envelope. The clinician gives a prompt if the participant does not do so within the proper time or performs only one action (i.e., just sealing the envelope or just writing the initials on the back). Responses are coded as follows: 2 (correct action without prompt), 1 (correct action with prompt) and 0 (the participant did not remember the action, even when he/she was prompted). The clinician also scores 2, 1 or 0 points for correct actions following a prompt in order to assess the retrospective memory component. These instructions are followed by a 10-minute interval in which the participant has to perform a set of cognitive tasks. Huppert and colleagues (2000) reported that the decrement in PM performance was strongly and linearly related to age. They also showed that 54% of individuals around 65 years of age successfully performed the task without prompt, compared to 19% of the elderly over 90 and 8% of individuals with probable dementia (n = 388). The envelope task is also sensitive enough to identify patients with amnestic mild cognitive impairment (Lee et al., 2016). The authors administered a single-item subjective rating scale for which the participants were asked to assess the effectiveness of their memory on a daily basis compared to individuals of the same age. Their results showed that patients performed the envelope task more poorly than controls. Their results also showed that the envelope task achieved a better level of discrimination than the subjective memory rating (area under the curve coefficients of .83 and .76, respectively). The sensitivity of the envelope task in detecting a group difference was good (91.9%), although its specificity was lower (64.3%).
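A minimal sketch of the envelope task's two-component scoring scheme described above (illustrative only; the function and argument names are hypothetical and are not taken from any published scoring manual):

def score_envelope_task(acted_without_prompt: bool,
                        acted_with_prompt: bool,
                        content_correct_without_prompt: bool,
                        content_correct_with_prompt: bool) -> dict:
    """Score the prospective and retrospective components on the 0-2 scale described above."""
    if acted_without_prompt:
        prospective = 2
    elif acted_with_prompt:
        prospective = 1
    else:
        prospective = 0
    if content_correct_without_prompt:
        retrospective = 2
    elif content_correct_with_prompt:
        retrospective = 1
    else:
        retrospective = 0
    return {"prospective": prospective, "retrospective": retrospective}

# Hypothetical participant who needed a prompt but then performed the full action.
print(score_envelope_task(False, True, False, True))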
Prompt Card Task. This single-trial event-based prompt card task starts with writing details about the next appointment on a card that the participant is supposed to give to the clinician at the end of the session. The diagnostic value of the prompt card task seems interesting, as it showed poorer PM performance for patients with amnestic mild cognitive impairment compared to healthy participants (Delprado et al., 2012). A ROC analysis was conducted on the California Verbal Learning Test-Second Edition (CVLT-II; Delis, Kramer, Kaplan, & Ober, 2000) and the three measures of PM to determine their diagnostic value. The CVLT-II was shown to have the highest discriminative power in distinguishing patients from healthy participants, with an area under the curve of .93, followed by the envelope task (.85), the prompt card task (.77) and the Cambridge Test of Prospective Memory (.76 for both the time- and event-based sub-tests). It was quite predictable that the CVLT-II would be found to be the best measure to distinguish patients from healthy participants, because a similar retrospective memory screening measure had been used prior to the investigation to diagnose patients with amnestic mild cognitive impairment. In conclusion, the envelope task seems to be a decent PM measure for identifying patients with mild cognitive impairment, and the best tool when compared with the prompt card task and the Cambridge Test of Prospective Memory.
Telephone Test. The telephone test is the only single-trial procedure that allows the assessment of time-based PM (Hsu et al., 2014). Participants are requested to remind the clinician to make a phone call to the counter 5 minutes after the instruction. Like the envelope task, the telephone test allows the measurement of both the prospective and retrospective components of PM. A prompt is given to participants if no action is triggered within the 60 seconds following the 5-minute delay. For the prospective component, 2 points are given when the participant reminds the experimenter that something needs to be done within the 60 seconds following the 5-minute delay, 1 point if the reminder is given after this delay and 0 points if the participant does not perform the expected action. For the retrospective component, 2 points are given if the content of the action is correctly recalled and 1 point if the participant does not remember the content of the action but remembers that something needs to be done with the telephone or the counter. Combining the telephone test and the envelope task scores, the authors showed poorer performance for patients with dementia compared to healthy controls and negative correlations between informant ratings on both the prospective and retrospective sub-scales of the Prospective and Retrospective Memory Questionnaire (see below) and the combined PM scores (r = -.57 and -.58). In another study, with patients with subjective cognitive decline, Hsu, Huang, Tu, and Hua (2015) found poorer performance on the telephone test for patients compared to healthy controls.
*** insert Table 2 about here ***
Questionnaires
Prospective Memory Questionnaire (PMQ). The PMQ (Hannon et al., 1990) was the first self-report PM measure available for assessing PM failures as well as the frequency of memory aid use.
The initial validation study (Hannon et al., 1990) included 361 individuals. Items assessing PM failures (e.g., item 1, “I missed appointments I had scheduled”) are rated on a 9-point Likert scale. The PMQ assesses several dimensions of PM with four sub-scales: Long-Term Episodic, Short-Term Habitual, Internally Cued and Techniques to Remember. The latest published version of the PMQ includes 52 items (Hannon, Adams, Harrington, Fries-Dias, & Gipson, 1995). The authors confirmed the initial factor structure with another factor analysis using varimax rotation in healthy younger and older adults, as well as in brain-injured patients. The internal consistency coefficient of the total PMQ was high (.92), with sub-scale coefficients of .78 and above.
Brain-injured patients and age-matched healthy older adults performed more poorly than younger adults on three PM measures, including short- and long-term ecological tasks, with an alpha coefficient of .76. Moreover, groups differed on only one dimension of the PMQ, namely the Short-Term Habitual sub-scale. Hannon et al. (1995) also reported negative relationships between scores on the short-term tasks and total scores obtained on the PMQ (r = -.17), as well as for three sub-scales of the PMQ, namely the Long-Term Episodic, Short-Term Habitual and Internally Cued sub-scales (r = -.19, -.25 and -.22). The PMQ also has good test-retest reliability, with a coefficient of .88 among the 72 participants of the sample who were retested 10 to 14 days later.
Heffernan, O'Neill and Moss (2013) used the PMQ together with a video-based procedure to assess PM (the Prospective Remembering Video Procedure; PRVP, see below) and showed no difference between smokers and controls on the questionnaire, even though poorer performance on the PRVP was reported for smokers, suggesting a lack of self-awareness of such PM deficits.
Prospective and Retrospective Memory Questionnaire (PRMQ). The PRMQ (Smith, Della Sala, Logie, & Maylor, 2000) is one of the most widely used questionnaires designed to provide self- and informant ratings of memory complaints for both prospective and retrospective failures (8 items for each) in an everyday life context. To our knowledge, the PRMQ has been translated into 5 languages (Gondo et al., 2010; Hsu & Hua, 2011; Piauilino et al., 2010; Rönnlund, Mäntylä, & Nilsson, 2008; Wong Gonzalez, 2015). Each item of the questionnaire can be categorized along three dimensions: (1) assessing prospective or retrospective episodic memory, (2) relying on self- or external cues (i.e., time- and event-based tasks) and (3) requiring a long- or short-term delay. For example, item 1 (“Do you decide to do something in a few minutes’ time and then forget to do it?”) is defined as measuring prospective, short-term and self-cued memory, while item 2 (“Do you fail to recognise a place you have visited before?”) is defined as measuring retrospective, long-term and environmentally cued memory.
The validation and standardization of the PRMQ included 551 healthy individuals aged 17 to 94 years (Crawford, Smith, Maylor, Della Sala, & Logie, 2003). The latent structure of the tool was studied using confirmatory factor analysis. The model was composed of a tripartite structure including a general memory factor (all items included) plus two orthogonal factors specific to prospective and retrospective memory, with acceptable Cronbach's alpha coefficients of .89, .84 and .80, respectively. However, the confirmatory factor analysis suggested that the classical distinction between self- and environmental cues does not explain the pattern of covariance among items. While this factorial structure was confirmed by 3 studies (Hsu & Hua, 2011; Piauilino et al., 2010; Rönnlund et al., 2008), this was not the case for the Spanish version of the PRMQ (González-Ramírez & Mendoza-González, 2011) compared with the original study (Smith et al., 2000).
PM failures were rated as more frequent than retrospective failures for both the Alzheimer's disease and healthy older adult groups (Smith et al., 2000). Moreover, PM failures of Alzheimer's disease patients were rated as more frustrating by informants than retrospective failures. For healthy older adults and their spouses, self- and informant ratings did not differ, suggesting a relative coherence of these reports. No group differences were found on the PRMQ for smokers (Heffernan et al., 2010a) or young binge-drinkers (Heffernan, Clark, Bartholomew, Ling, & Stephens, 2010b; Heffernan & O'Neill, 2012) compared to healthy control individuals, whereas their performance on the Cambridge Test of Prospective Memory was poorer. Thompson et al. (2015) reported a similar pattern of results by showing that self-reported PM failures on the PRMQ did not differ across patients with MCI, patients with dementia and controls. The authors highlighted that informants tended to report more PM failures for patients with dementia than for those presenting MCI and for control participants. Moreover, the reports of patients presenting MCI and of healthy controls were not linked to informant reports. These results suggest that informant reports represent a more valid diagnostic indicator, notably for individuals with dementia, but not for patients with a lesser degree of impairment. Other studies led to similar conclusions for patients with amnestic mild cognitive impairment (Lee et al., 2016).
Comprehensive Assessment of Prospective Memory (CAPM). This questionnaire is specifically devoted to brain-injured individuals (Roche, Fleming, & Shum, 2002). The CAPM comprises three sections assessing the frequency of PM failures (Section A, 39 items), the degree of concern about each failure (Section B, same 39 items) and the reasons for each PM failure (Section C, 15 items). What distinguishes Section A of the CAPM from other questionnaires is the nature of its two subscales, which refer to the type of daily living activity that has to be remembered. The principal components analysis conducted by Waugh (1999) indicated that Section A of the CAPM was defined by two components: (1) common memory failures referring to Instrumental Activities of Daily Living (IADL; Item 1, “Forgetting to buy an item at the grocery store”) and (2) uncommon failures referring to basic activities of daily living (BADL).
The initial validation study using the CAPM was conducted among 525 healthy individuals (Waugh, 1999). The internal consistency of these two sub-scales showed acceptable alpha coefficients of .92 and .79, respectively. In addition, the CAPM proved to be sensitive enough to discriminate age groups. Fleming et al. (2009) showed that CAPM self-report scores were correlated with neither the Cambridge Test of Prospective Memory (CAMPROMPT; Wilson et al., 2005) nor the Memory for Intentions Screening Test (MIST; Raskin, 2004). However, the informants' reports on the IADL sub-scale and the total scores of the CAPM were negatively correlated with the CAMPROMPT and the MIST. This result highlights the usefulness of the informant version of Section A of the CAPM.
The reliability and normative data of the CAPM collected on 95 healthy individuals with an age range of 15 to 60 years showed more failures for the younger adults (15–30 years) than for the healthy older adults (31–60 years) (Chau, Lee, Fleming, Roche, & Shum, 2007). This result is consistent with the age-prospective memory paradox, whereby older adults perform as well as or better than younger adults in naturalistic settings, while deficits are observed in laboratory-based PM tasks (Rendell & Thomson, 1999). The fact that the older age group was relatively young compared to other studies (e.g., Waugh, 1999) might also explain this result. Both internal consistency and test-retest reliability coefficients of the CAPM were satisfactory in this sample. Although self-ratings did not differ between brain-injured patients and controls for Section A of the CAPM, ratings from informants showed that brain-injured patients had more frequent PM failures compared to controls (i.e., patients tended to underestimate the frequency of PM failures compared to informants) (Roche et al., 2002). The authors suggested that impaired self-awareness could be a factor affecting the accuracy of patients' self-reports.
Brief Assessment of Prospective Memory (BAPM). Derived from the Comprehensive Assessment of Prospective Memory (CAPM; Waugh, 1999), the BAPM includes both the IADL and BADL sub-scales (8 items each) in a 16-item short-form test (Man et al., 2011).
The authors assessed the validity of the BAPM using three samples. The first sample was the group of 527 healthy participants included in Waugh's (1999) study, while the second and third samples were the 95 healthy participants and the 45 brain-injured patients who had participated in Fleming et al.'s (2009) study. The authors reported acceptable internal consistency and test-retest reliability for both the IADL and BADL sub-scales for all samples, with coefficients ranging between .66 and .98. As for the CAPM, the correlations between self-reports on the BAPM and the CAMPROMPT were not significant, suggesting a poor concurrent validity of the BAPM. Results also showed that BAPM scores correlated with the Sydney Psychosocial Reintegration Scale (Tate, Hodgkinson, Veerabangsa, & Maggiotto, 1999), indicating a good ecological validity.
*** insert Table 3 about here ***
Experimental Procedures
Prospective Remembering Video Procedure (PRVP). The PRVP is a video-based procedure in which participants watch a 12-minute video recorded at a shopping precinct and have to recall future intentions (e.g., remembering to buy a soccer ball) in response to event-based PM cues appearing during the movie (Titov & Knight, 2001). Each item of the PRVP assesses the recall of an intention in response to one of these event-based cues.
Titov and Knight's (2001) results supported the inter-item reliability (Cronbach's alpha of .79 for the first list and .67 for the second list), as well as the alternate-form reliability (.65). The authors also found that familiarity, assessed with a 10-point Likert scale, enhanced recall and that pre-exposure to a video of unfamiliar stimuli could attenuate this effect. Moreover, evidence for the concurrent validity of the PRVP was found by showing a relationship between participants' total scores and their performance on comparable PM tasks performed in natural settings (coefficient of .71). The PRVP was also sensitive enough to distinguish healthy control participants from young binge-drinkers (Heffernan et al., 2010b) and smokers (Heffernan et al., 2013).
Test Écologique de Mémoire Prospective (TEMP). Inspired by the PRVP (Titov & Knight, 2001), the TEMP (Potvin et al., 2011) is a 20-minute movie that displays several areas (i.e., commercial, residential and industrial) of a city. It includes 15 tasks (10 event-based and 5 time-based) simulating real activities of daily living (e.g., reserving train tickets). The TEMP provides two versions for test-retest purposes (with no significant differences between the two versions) and assesses both PM components (prospective and retrospective), the 3 main phases (encoding, storage and retrieval) and both the time- and event-based aspects of PM. The test-retest reliability of the TEMP was found to be high, with a coefficient of .93.
In the validation study, patients performed more poorly than controls for the encoding phase and when retrieving intentions in the right context (i.e., the prospective component), especially for time-based tasks (Potvin et al., 2011). Correlational analyses indicated that retrospective memory measures were linked to both the prospective and retrospective components of the TEMP, and that performance was also correlated with attentional processes and executive functions. Moreover, the authors found a correlation of -.51 between the TEMP total scores and the informants' reports on the CAPM. However, there was no significant correlation between TEMP total scores and participants' own results on the CAPM (r = .06). Finally, the significant correlation between TEMP scores and those obtained on the envelope task (r = .47) provides good evidence of the convergent validity of the TEMP.
Virtual Week. The Virtual Week (Rendell & Craik, 2000, Experiment 1) is a computerized PM task which simulates daily life activities on a virtual board game. As participants move around the board, they make decisions about daily activities and are asked to perform lifelike activities as PM tasks. The full version of the Virtual Week board game provides a PM assessment over a 1-week simulation (from Monday to Sunday) and takes approximately one hour to complete. For each virtual day, participants perform 10 tasks, including 4 regular activities (e.g., remembering to take asthma medication at breakfast and dinner), 4 irregular activities (e.g., remembering to return a book to the library at 4 pm) and 2 regular time-check tasks (to be performed at given times using a clock placed on the screen). Half of the regular and irregular activities are time-based and half are event-based PM tasks. Overall, performance on regular tasks was better than on both irregular and time-check tasks. The young participants (M = 21.30; age range = 19–24) performed better than young-old participants (M = 67.83; age range = 61–73) for the time-check and irregular tasks, and better than old-old participants (M = 78.84; age range = 75–84) for the regular, irregular and time-check tasks. To date, the Virtual Week has been translated and adapted into two other languages, including an Italian version (Mioni, Stablum, Biernacki, & Rendell, 2015) and a Polish version.
The Virtual Week has also proved to be valid and consistently sensitive to impairment in various clinical groups, including substance abuse (Leitz, Morgan, Bisby, Rendell, & Curran, 2009), schizophrenia (Henry, Rendell, Kliegel, & Altgassen, 2007), Parkinson's disease (Foster, Rose, McDaniel, & Rendell, 2013), mild cognitive impairment and dementia (Thompson et al., 2015), multiple sclerosis (Rendell, Jensen, & Henry, 2007) and traumatic brain injury (Mioni, Rendell, Henry, Cantagallo, & Stablum, 2013).
The reliability of the computerized version of the Virtual Week, the most common version, showed an acceptable Spearman-Brown split-half reliability for both young (.64) and older adults (.93) (Rose, Rendell, McDaniel, Aberle, & Kliegel, 2010), as well as in various clinical groups compared with controls, including schizophrenia (Henry et al., 2007), multiple sclerosis (Rendell et al., 2012), Parkinson's disease (Foster et al., 2013) and traumatic brain injury (Mioni, Rendell, Henry, Cantagallo, & Stablum, 2013). Overall, split-half reliability
coefficients ranged from .74 to .89 in these studies. Furthermore, the authors reported poorer PM performance on the Virtual Week for older adults compared to their younger counterparts and for individuals from clinical groups compared to healthy controls in all the previously mentioned studies. The test-retest reliability of the Virtual Week was also examined among healthy participants (Mioni, Rendell, Stablum, Gamberini, & Bisiacchi, 2014). In Experiment 1, when the same version (A) was used at both sessions, the older adults showed lower performance than their younger counterparts; a high test-retest reliability coefficient was found for the older adults (r = .80), while the young adults had a moderate test-retest coefficient (r = .61). In the second experiment, the authors created a parallel version (version B) in which they varied the content of the PM actions. The study included only an older adult sample, assigned to one of two experimental conditions (version A then B at retest, or vice versa), and showed no differences in performance between the two versions, with a moderate test-retest reliability coefficient.
Actual Week. The Actual Week is an adaptation of the Virtual Week to naturalistic settings (Rendell & Craik, 2000, Experiment 2). The target tasks are the same for all participants but differ from those of the Virtual Week because they are adapted to the situations encountered by participants in everyday life. Participants are requested to return one daily log sheet per day to the experimenter, without checking any sheet after completion. Participants are also asked to record, via a micro-recorder, the full or partial fulfilment of each task. Older adults outperformed the young adults, consistent with the well-known and intriguing pattern of age-related differences in naturalistic settings (i.e., the age-PM paradox).
Recently, Au, Vandermorris, Rendell, Craik and Troyer (2017) adapted the Actual Week (Rendell & Craik, 2000) to assess healthy older adults' (age range: 50–90 years) PM performance in naturalistic settings. The time-check tasks were removed because participants reported during the pre-test phase that these tasks were too difficult to recall and did not reflect their usual activities. In addition, participants had to remember to actually perform, each day, an irregular call-back task in which they were requested to send the experimenter a voicemail message. Participants were encouraged to use all the techniques they commonly used to remember everyday tasks and to report the time of completion of each task in the appropriate daily log sheet. Their results partly replicated those of the previous study (Rendell & Craik, 2000, Experiment 2), in that event-based tasks were better recalled than time-based tasks. However, performance on
irregular tasks was better than on regular tasks. According to the authors, this result can be explained by the procedural differences mentioned above, in particular by the novelty effect associated with the irregular tasks.
Au et al.'s (2017) results also provided evidence for the reliability of this adaptation of the Actual Week by showing a high internal consistency for both the time of completion of tasks (α = .93) and accuracy (α = .95). The test-retest reliability coefficient of the Actual Week showed that performance was stable over time (r = .76). First-day performance was correlated with performance on the remaining days, with correlation coefficients ranging from .73 to .83, suggesting that the administration of a single day was sufficient to ensure the reliability of the measure. Significant correlations were observed with measures of convergent validity (i.e., memory of strategy use and verbal episodic memory, with coefficients ranging from .27 to .46) but not with measures of divergent validity (i.e., health-promoting lifestyle behaviors and both positive and negative emotions experienced in the last week, with coefficients ranging from .01 to .18). The authors also observed that 82% of the assigned voicemail messages were fully concordant with the self-reported time of completion of tasks and the actual completion of the call-back task, which supports the ecological validity of the Actual Week.
Most of the identified measures assessed event-based tasks, while 62.50% (N = 10) assessed time-based tasks. Results also indicated that 87.50% (N = 14) of the identified PM measures were linked to functional outcomes, 68.75% (N = 11) showed empirical evidence regarding validity and 75.00% (N = 12) regarding reliability. To a slightly lesser extent, 43.75% (N = 7) provided parallel versions and 31.25% (N = 5) were normed, translated, and showed evidence for diagnostic value. Finally, 18.75% of the measures (N = 3) allowed the use of external aids, while only 12.50% (N = 2) were adapted to another culture and provided a qualitative scoring system. Tables 5–8 present the percentages of PM measures that met the criteria according to the type of assessment.
Meta-Analyses Results
Data from 3,136 different (nonoverlapping) participants (1,194 patients and 1,942 controls) were analyzed across the 22 included studies to compute summary weighted mean effects (see Table 9). The averaged mean age and education were 53.06 years (range = 18.64–80.78) and 13.21 years (range = 11.62–14.49), respectively. Participant characteristics were very diverse and included outpatients in psychiatric or substance use treatment, patients with subjective cognitive decline, and patients diagnosed with mild cognitive impairment, Alzheimer's disease, multiple sclerosis, brain injury, schizophrenia, HIV infection or spina bifida.
*** insert Table 9 about here ***
Test Batteries. Compared to controls, patients had significant impairment in PM summary scores, with effect sizes ranging from -2.93 to -0.35 across studies, SMD = -1.49; SE = 0.24; 95% CI [-1.96, -1.01], p < .001. However, as was evident in the forest plot, there was large and significant heterogeneity between studies, Q(10) = 126.35, p < .001; I2 = 92.09%, 95% CI [87.82, 94.86], suggesting the need for a more in-depth analysis of study subgroups.
We created a funnel plot and used Egger's regression intercept in order to identify the possible presence of asymmetry due to publication bias. Visual inspection of the figure (see Supplementary Figure S1) shows asymmetry, and the Egger's regression intercept suggests that there is a publication bias (p = .04). Rosenberg's fail-safe N suggests that 1,465 additional studies with null results would be required to yield a non-significant effect of PM for test batteries.
Single-Trial Procedures. The meta-analysis of studies administering single-trial procedures showed that patients had lower PM summary scores compared to controls, with effect sizes ranging from -2.33 to -1.21 across studies, SMD = -2.16; SE = 0.52; 95% CI [-3.18, -1.14], p < .001. Once again, there was very large heterogeneity between studies, Q(2) = 31.24, p < .001; I2 = 93.60%, with a 95% CI lower limit of 84.67%.
Visual inspection of the funnel plot revealed potential asymmetry (see Supplementary Figure S2), but the Egger's regression intercept was not statistically significant (p = .33), suggesting the absence of publication bias. Rosenberg's fail-safe N suggests that 268 additional studies with null results would be required to yield a non-significant overall effect for single-trial procedures.
Questionnaires. The meta-analysis of studies administering questionnaires showed that there were no significant differences in self-reported PM failures between patients and controls, with effect sizes ranging from -2.26 to 3.68 across studies, SMD = 0.18; SE = 0.45; 95% CI [-0.72, 1.06], p = .70. Heterogeneity estimates were statistically significant and very large, Q(11) = 591.62, p < .001; I2 = 98.14%, 95% CI [97.58, 98.57], also indicating the need for planned subgroup analyses.
Visual inspection of the funnel plot revealed potential asymmetry (see Supplementary Figure S3), but the Egger's regression intercept was not statistically significant (p = .38), suggesting the absence of publication bias; Rosenberg's fail-safe N was also computed to estimate the number of additional studies with null results that would be required to yield a non-significant overall effect for questionnaires.
Experimental Procedures. The meta-analysis of studies administering experimental procedures showed that patients had lower PM summary scores than controls, with effect sizes ranging from -1.44 to -0.41 across studies, SMD = -0.79; SE = 0.18; 95% CI [-1.14, -0.44], p < .001. The distribution of scores was homogeneous across the individual studies, with moderate nonsignificant variation, Q(4) = 8.48, p = .08; I2 = 52.84%, 95% CI [0.00, 82.65], suggesting that this assessment category yielded relatively consistent effects across studies.
A symmetrical funnel plot was observed (see Supplementary Figure S4) and the Egger's regression intercept showed that there was no publication bias (p = .74). Rosenberg's fail-safe N suggests that 71 additional studies with null results would be required to yield a non-significant overall effect.
Planned Subgroup Analyses. The data on the prompt card task used in Delprado et al.'s (2012) study were excluded from the analyses because it was the only study we found that met the inclusion criteria in the current review. Therefore, planned subgroup analyses were only performed for the envelope task in this category of measurement. In accordance with Richardson, Garner, and Donegan (2019), the results of the planned subgroup analyses were considered statistically significant when the p-value was less than 0.1.
Subgroup analyses showed significantly lower PM scores for patients compared to controls when the Rivermead Behavioural Memory Test (RBMT), the Cambridge Test of Prospective Memory (CAMPROMPT), the Memory for Intentions Screening Test (MIST), the Royal Prince Alfred Prospective Memory (RPA-ProMem) test and the envelope task were administered. Although the distribution of scores was homogeneous for the RBMT, the MIST, the RPA-ProMem and the Comprehensive Assessment of Prospective Memory Questionnaire, with small and moderate nonsignificant variations, this was not the case for the CAMPROMPT and the envelope task, or for the Prospective Memory Questionnaire and the Prospective and Retrospective Memory Questionnaire, for which the heterogeneity of effect sizes remained significant.
Discussion
This paper is the first attempt to systematically review the literature regarding the existing measures to assess PM and to quantitatively summarize the effects of various diseases according to the type of assessment. Fifty-two studies were included to examine the characteristics of the identified PM measures and 22 studies were retained to summarize the effect of diseases on PM. Among the 16 identified measures, we found 5 psychological test batteries (Rivermead Behavioural Memory Test, Cambridge Behavioural Prospective Memory Test, Cambridge Test of Prospective Memory, Memory for Intentions Screening Test, Royal Prince Alfred Prospective Memory Test), 3 single-trial procedures (envelope task, prompt card task and telephone test), 4 questionnaires (Prospective Memory Questionnaire, Prospective and Retrospective Memory Questionnaire, Comprehensive Assessment of Prospective Memory, Brief Assessment of Prospective Memory) and 4 experimental procedures (Prospective Remembering Video Procedure, Test Écologique de Mémoire Prospective, Virtual Week, Actual Week). These results reduce the ambiguity regarding the existing measures devoted to PM assessment in the literature. The findings of the current study also showed that the use of specific measures may be of interest to identify PM impairments in clinical groups. Previous surveys have identified opportunities and research gaps in this arena and made recommendations to integrate the assessment of PM into clinical practice (Rabin et al., 2016). We showed that more than 50% of the identified PM measures were associated with empirical evidence regarding validity and reliability and measured both the event- and time-based PM tasks. However, it
appears that some of the challenges encountered by psychologists in their clinical practice
have received relatively little attention from memory researchers. These include the lack of
normative data, test translations/adaptations, available cutoff scores for diagnostic purposes,
qualitative scoring, parallel versions for test-retest and specific instructions for use of external
aids during the test. These results suggest that, like classical neuropsychological assessment
instruments, PM measures suffer from these pitfalls, which may ultimately limit their utility
in clinical settings (Rabin et al., 2005, 2016). Together, these findings encourage researchers
to respond to these challenges to extend the clinical utility of the PM measures. We believe
that such an endeavor will answer the frequent assessment referral questions raised by clinicians.
The preliminary results from the meta-analyses indicated a trend toward lower PM performance for clinical groups compared to non-clinical groups when test batteries, single-trial procedures and experimental procedures were administered. However, the effect sizes were heterogeneous across studies, except for the experimental procedures, for which the effect sizes were moderate and homogeneous. Results also showed a null effect of group for the questionnaires. Subsequent planned subgroup analyses indicated consistent differences for three test batteries (Rivermead Behavioural Memory Test, Memory for Intentions Screening Test, Royal Prince Alfred Prospective Memory Test), with high effect sizes. This suggests that, like the experimental procedures, the Rivermead Behavioural Memory Test (RBMT; Wilson et al., 1985), the Memory for Intentions Screening Test (MIST; Raskin, 2004) and the Royal Prince Alfred Prospective Memory Test (RPA-ProMem; Radford et al., 2011) are relevant for identifying variations in PM performance in many clinical groups, especially for patients with subjective cognitive decline, mild cognitive impairment, brain damage and schizophrenia. However, it was not possible to estimate a group effect for the Cambridge Test of Prospective Memory (CAMPROMPT; Wilson et al., 2005) due to the substantial heterogeneity between studies. Questionnaires did not reveal differences in self-reported PM failures between groups. Effect sizes remained heterogeneous for studies
using the Prospective Memory Questionnaire (Hannon et al., 1990), and the Prospective and
Retrospective Memory Questionnaire (Smith et al., 2000), except for the Comprehensive
Assessment of Prospective Memory Questionnaire (Waugh, 1999). (It should be noted that we
excluded the Brief Assessment of Prospective Memory Questionnaire from the analyses due
to insufficient data for estimating effect sizes.) This result corroborates previous empirical findings revealing that PM questionnaires were not able to differentiate healthy participants from clinical groups, whereas these clinical groups showed poorer performance on objective PM measures. These results are congruent with Uttl and Kibreab's (2011) meta-analysis highlighting a lack of validity for PM self-report measures. Taken together, PM questionnaires might not be suitable for diagnosis but may have an interest for documenting subjective complaints and everyday functioning.
The duration of a complete neuropsychological assessment in most memory centers is estimated at between 90 and 120 minutes. Specific
PM measures like the CAMPROMPT and the MIST take about 30 to 40 minutes to be
administered. Therefore, the use of such tests may be complicated in routine settings, partly because they include their own set of distractor tasks. In such a situation, the use of a more flexible measure like the RPA-ProMem, which takes only 15 minutes to administer, could be an alternative solution to overcome this limitation and simplify PM assessment in
36
day-to-day clinical practice. However, further studies are needed to establish normative data
before expanding the use of RPA-ProMem in clinical practice. Although the literature review
showed that single-trial procedures are useful for assessing patients’ ability to remember to
carry out intended actions within a shorter period of time, other studies with various
clinical groups are required to ascertain their contribution to diagnosis, especially for patients
who are at risk of developing dementia. Indeed, some traditional retrospective memory
measures appear to be more effective than PM test batteries such as the CAMPROMPT, or even
single-trial procedures such as the envelope task, in identifying patients with mild cognitive
impairment (Delprado et al., 2012). This seems to corroborate the results of a previous study
that indicated that the reduced reliability of PM tasks was associated with a small number of
trials, which remains an issue for most of the measures identified in this study (see Kelemen,
Weinberg, Alford, Mulvey, & Kaeochinda, 2006). In this context, the use of experimental
procedures such as the Virtual Week may be relevant, because it makes it possible to administer
a larger number of PM trials and has proved sensitive to PM impairment in a wide range of
clinical groups. In view of these exciting lines of work, future research should also focus on the
development of revised, shortened versions of PM measures to extend their clinical applicability
in different languages and cultures. Such projects are often organized at a cross-country level
and should comply with standard international guidelines for
test development such as those provided by the International Test Commission (2010).
The current study has several limitations including a relatively small number of
studies selected for inclusion in the meta-analytical review (n = 22), a limited range of tests
administered, and variability in study populations, with a lack of potentially relevant
demographic data in some studies. Indeed, the studies included a wide range of populations
including normal aging, Alzheimer’s disease, mild cognitive impairment, multiple sclerosis,
brain injury, substance abuse, schizophrenia, HIV and spina bifida. Moreover, the features of
the identified PM measures were relatively different from each other (e.g., number of items,
retention intervals, administration time), even within the same type of assessment, so it is
difficult to recommend a single PM measure as an appropriate candidate for all clinical
populations.
Strengths of the review included the use of PRISMA guidelines to identify the existing PM
measures, which involved the establishment of criteria for the inclusion and exclusion of studies.
This review of the available PM measures should provide useful and valuable
information to guide therapists who work with patients with various neuropathologies towards
the choice of an appropriate PM assessment, taking due account of their clinical
requirements. Our work should also guide future research to ultimately extend the clinical
utility of PM measures.
Funding
This work was supported by the Regional Council of Hauts-de-France and the European
Regional Development Fund (FEDER) [REG16026]. The funders of the study had no role in
study design, data collection, data analysis, data interpretation, or writing of the report. The
corresponding author had full access to all the data in the study and had final responsibility
for the decision to submit for publication. GB drafted the protocol with MH, VQ and YG, and
GB assessed the eligibility of the studies for inclusion and extracted data. All authors
contributed to the interpretation of the findings. GB drafted the manuscript, to which all
authors contributed.
Conflict of interests
Acknowledgments
The authors thank Barbara Wilson and Peter Watson (MRC Cognition and Brain Sciences
Unit, University of Cambridge, United Kingdom) for their valuable and useful information
about the Rivermead Behavioural Memory Test. The authors would also like to thank Béatrice
Université de Caen Normandie, France) for her helpful comments on a previous version of
this manuscript.
References
Au, A., Vandermorris, S., Rendell, P. G., Craik, F. I. M., & Troyer, A. K. (2018).
Psychometric properties of the Actual Week test: a naturalistic prospective memory task.
https://doi.org/10.1080/13854046.2017.1360946
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2010). A basic
Carey, C. L., Woods, S. P., Rippeth, J. D., Heaton, R. K., Grant, I., & the HIV
https://doi.org/10.1080/13803390590949494
Chau, L. T., Lee, J. B., Fleming, J., Roche, N., & Shum, D. (2007). Reliability and normative
https://doi.org/10.1080/09602010600923926
Cockburn, J., & Smith, P. T. (1989). The Rivermead Behavioural Memory Test supplement
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. NY: Routledge
Academic.
Crawford, J., Smith, G., Maylor, E., Della Sala, S., & Logie, R. (2003). The Prospective and
https://doi.org/10.1080/09658210244000027
Crovitz, H. F., & Daniel, W. F. (1984). Measurements of everyday memory: Toward the
https://doi.org/10.3758/BF03333861
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). CVLT-II: California verbal
Delprado, J., Kinsella, G., Ong, B., Pike, K., Ames, D., Storey, E., … Rand, E. (2012).
https://doi.org/10.1017/S135561771100172X
Dennis, M., Nelson, R., Jewell, D., & Fletcher, J. M. (2010). Prospective memory in adults
https://doi.org/10.1007/s00381-010-1140-z
Einstein, G. O., & McDaniel, M. A. (1990). Normal aging and prospective memory. Journal
https://doi.org/10.1037/0278-7393.16.4.717
Einstein, & M. A. McDaniel (Eds.), Prospective memory: Theory and applications (pp.
(Eds.), Prospective Memory: Theory and Applications (pp. 1–22). Mahwah, NJ, US:
Fleming, J., Kennedy, S., Fisher, R., Gill, H., Gullo, M., & Shum, D. (2009). Validity of the
Comprehensive Assessment of Prospective Memory (CAPM) for Use With Adults With
https://doi.org/10.1375/brim.10.1.34
Foster, E. R., Rose, N. S., McDaniel, M. A., & Rendell, P. G. (2013). Prospective memory in
Parkinson disease during a virtual week: Effects of both prospective and retrospective
Gondo, Y., Renge, N., Ishioka, Y., Kurokawa, I., Ueno, D., & Rendell, P. (2010). Reliability
young and old people: A Japanese study. Japanese Psychological Research, 52(3), 175–
185. https://doi.org/10.1111/j.1468-5884.2010.00433.x
Groot, Y. C., Wilson, B. A., Evans, J., & Watson, P. (2002). Prospective memory functioning
in people with and without brain injury. Journal of the International Neuropsychological
Hainselin, M., Quinette, P., Desgranges, B., Martinaud, O., Hannequin, D., De La Sayette, V.,
… Eustache, F. (2011). Can we remember future actions yet forget the last two minutes?
4149. https://doi.org/10.1162/jocn_a_00076
Hannon, R., Adams, P., Harrington, S., Fries-Dias, C., & Gipson, M. T. (1995). Effects of
brain injury and age on prospective memory self-rating and performance. Rehabilitation
Hannon, R., Gipson, M. T., Rebmann, M., Keneipp, J., Sattler, J., Lonero, P., … Bolter, J. F.
Heffernan, T., Clark, R., Bartholomew, J., Ling, J., & Stephens, S. (2010b). Does binge
drinking in teenagers affect their everyday prospective memory? Drug and Alcohol
Heffernan, T. M., O’Neill, T. S., & Moss, M. (2013). Smoking-related prospective memory
https://doi.org/10.1017/ipm.2012.4
Heffernan, T., & O’Neill, T. (2012). Time-based prospective memory deficits associated
with binge drinking: evidence from the Cambridge Prospective Memory Test
https://doi.org/10.1016/j.drugalcdep.2011.11.014
Heffernan, T., O’Neill, T., & Moss, M. (2010a). Smoking and everyday prospective
Henry, J. D., MacLeod, M. S., Phillips, L. H., & Crawford, J. R. (2004). A Meta-Analytic
Review of Prospective Memory and Aging. Psychology and Aging, 19(1), 27–39.
https://doi.org/10.1037/0882-7974.19.1.27
Henry, J. D., Rendell, P. G., Kliegel, M., & Altgassen, M. (2007). Prospective memory in
179–185. https://doi.org/10.1016/j.schres.2007.06.003
Higgins, J., & Green, S. (2011). Cochrane Handbook for Systematic Reviews of Interventions
560. https://doi.org/10.1136/bmj.327.7414.557
Hsu, Y.-H., & Hua, M.-S. (2011). Taiwan Version of the Prospective and Retrospective
Neuropsychology, 26(3), 240–249. https://doi.org/10.1093/arclin/acr012
Hsu, Y.-H., Huang, C.-F., Tu, M.-C., & Hua, M.-S. (2014). The Clinical Utility of
https://doi.org/10.1371/journal.pone.0112210
Hsu, Y.-H., Huang, C.-F., Tu, M.-C., & Hua, M.-S. (2015). Prospective Memory in
https://doi.org/10.1097/WAD.0000000000000060
Huppert, F. A., Johnson, T., & Nickson, J. (2000). High prevalence of prospective memory
Ihle, A., Hering, A., Mahy, C. E. V, Bisiacchi, P. S., & Kliegel, M. (2013). Adult age
a meta-analysis on the role of task order specificity. Psychology and Aging, 28(3), 714–
720. https://doi.org/10.1037/a0033653
International Test Commission. (2010). Guidelines for translating and adapting tests.
Kelemen, W. L., Weinberg, W. B., Alford, H. S., Mulvey, E. K., & Kaeochinda, K. F. (2006).
Kime, S. K., Lamb, D. G., & Wilson, B. A. (1996). Use of a comprehensive programme of
external cueing to enhance procedural memory in a patient with dense amnesia. Brain
Injury, 10(1), 17–26. https://doi.org/10.1080/026990596124683
Kinsella, G. J., Pike, K. E., Cavuoto, M. G., & Lee, S. D. (2018). Mild cognitive impairment
https://doi.org/10.1080/13854046.2018.1468926
Landsiedel, J., Williams, D. M., & Abbot-Smith, K. (2017). A Meta-Analysis and Critical
Lawton, M. P., & Brody, E. M. (1969). Assessment of Older People: Self-Maintaining and
https://doi.org/10.1093/geront/9.3_Part_1.179
Lee, S., Ong, B., Pike, K. E., Mullaly, E., Rand, E., Storey, E., … Kinsella, G. J. (2016). The
149. https://doi.org/10.1080/13854046.2015.1135983
Leitz, J. R., Morgan, C. J. A., Bisby, J. A., Rendell, P. G., & Curran, H. V. (2009). Global
Man, D. W. K., Fleming, J., Hohaus, L., & Shum, D. (2011). Development of the Brief
Assessment of Prospective Memory (BAPM) for use with traumatic brain injury
https://doi.org/10.1080/09602011.2011.627270
Martin, M., Kliegel, M., & McDaniel, M. A. (2003). The involvement of executive functions
in prospective memory performance of adults. International Journal of Psychology,
Mathias, J. L., & Mansfield, K. M. (2005). Prospective and declarative memory problems
following moderate and severe traumatic brain injury. Brain Injury, 19(4), 271–282.
McDaniel, M. A., & Einstein, G. O. (2000). Strategic and automatic processes in prospective
144. https://doi.org/10.1002/acp.775
https://doi.org/10.4135/9781452225913
McDaniel, M. A., Umanath, S., Einstein, G. O., & Waldum, E. R. (2015). Dual pathways to
https://doi.org/10.3389/fnhum.2015.00392
Mills, V., Kixmiller, J. S., Gillespie, A., Allard, J., Flynn, E., Bowman, A., & Brawn, C. M.
(1997). The correspondence between the Rivermead Behavioral Memory Test and
Mioni, G., Rendell, P. G., Henry, J. D., Cantagallo, A., & Stablum, F. (2013). An
using Virtual Week. Journal of Clinical and Experimental Neuropsychology, 35(6), 617–
630. https://doi.org/10.1080/13803395.2013.804036
Mioni, G., Rendell, P. G., Stablum, F., Gamberini, L., & Bisiacchi, P. S. (2014). Test–
https://doi.org/10.1080/09602011.2014.941295
Mioni, G., Stablum, F., Biernacki, K., & Rendell, P. G. (2015). Virtual Week:
1–21. https://doi.org/10.1080/09602011.2015.1103758
Liberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., Gøtzsche, P. C., Ioannidis, J. P. A., …
Moher, D. (2009). The PRISMA statement for reporting systematic reviews and meta-
analyses of studies that evaluate health care interventions: explanation and elaboration.
https://doi.org/10.1016/j.jclinepi.2009.06.006
Process Model and Comparisons with Other Models. In S. Daniel & E. Tulving (Eds.),
Technology.
Niedźwieńska, A., Rendell, P. G., Barzykowski, K., & Leszczyńska, A. (2016). Virtual Week:
https://doi.org/10.1016/j.erap.2016.02.003
Piauilino, D. C., Bueno, O. F. A., Tufik, S., Bittencourt, L. R., Santos-Silva, R., Hachul, H.,
https://doi.org/10.1080/09658211003742672
Potvin, M.-J., Rouleau, I., Audy, J., Charbonneau, S., & Giguère, J.-F. (2011). Ecological
prospective memory assessment in patients with traumatic brain injury. Brain Injury,
25(2), 192–205. https://doi.org/10.3109/02699052.2010.541896
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical
neuropsychologists in the United States and Canada: A survey of INS, NAN, and APA
https://doi.org/10.1016/j.acn.2004.02.005
Rabin, L. A., Chi, S. Y., Wang, C., Fogel, J., Kann, S. J., & Aronov, A. (2014). Prospective
memory on a novel clinical task in older adults with mild cognitive impairment and
https://doi.org/10.1080/09602011.2014.915855
Rabin, L. A., Paolillo, E., & Barr, W. B. (2016). Stability in Test-Usage Practices of Clinical
Neuropsychologists in the United States and Canada over a 10-Year Period: A Follow-
206–230. https://doi.org/10.1093/arclin/acw007
Radford, K. A., Lah, S., Say, M. J., & Miller, L. A. (2011). Validation of a new measure of
prospective memory: The Royal Prince Alfred Prospective Memory Test. The Clinical
Ramanan, S., & Kumar, D. (2013). Prospective Memory in Parkinson’s Disease: A Meta-
https://doi.org/10.1017/S1355617713001045
Raskin, S. A. (2004). Memory for intentions screening test [abstract]. Journal of the
Rendell, P. G., & Craik, F. I. M. (2000). Virtual Week and Actual Week: Age-related
https://doi.org/10.1002/acp.770
Rendell, P. G., Henry, J. D., Phillips, L. H., de la Piedad Garcia, X., Booth, P., Phillips, P., &
https://doi.org/10.1080/13803395.2012.670388
Rendell, P. G., Jensen, F., & Henry, J. D. (2007). Prospective memory in multiple sclerosis.
https://doi.org/10.1017/S1355617707070579
Rendell, P. G., & Thomson, D. M. (1999). Aging and Prospective Memory: Differences
https://doi.org/10.1093/geronb/54B.4.P256
Richardson, M., Garner, P., & Donegan, S. (2019). Interpretation of subgroup analyses in
systematic reviews: A tutorial. Clinical Epidemiology and Global Health, 7(2), 192–198.
https://doi.org/10.1016/j.cegh.2018.05.005
memory failure in adults with traumatic brain injury. Brain Injury, 16(11), 931–945.
https://doi.org/10.1080/02699050210138581
Rönnlund, M., Mäntylä, T., & Nilsson, L. G. (2008). The Prospective and Retrospective
memory ratings, and Swedish norms. Scandinavian Journal of Psychology, 49(1), 11–18.
https://doi.org/10.1111/j.1467-9450.2007.00600.x
Rose, N. S., Rendell, P. G., McDaniel, M. A., Aberle, I., & Kliegel, M. (2010). Age and
working memory, vigilance, task regularity, and cue focality. Psychology and Aging,
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological
Senn, S. J. (2009). Overstating the evidence – double counting in meta-analysis and related
https://doi.org/10.1186/1471-2288-9-10
Smith, G., Del Sala, S., Logie, R. H., & Maylor, E. A. (2000). Prospective and retrospective
memory in normal ageing and dementia: A questionnaire study. Memory, 8(5), 311–321.
https://doi.org/10.1080/09658210050117735
https://doi.org/10.1037/0278-7393.29.3.347
Tate, R., Hodgkinson, A., Veerabangsa, A., & Maggiotto, S. (1999). Measuring Psychosocial
Recovery after Traumatic Brain Injury: Psychometric Properties of a New Scale. Journal
199912000-00003
Terry, W. S. (1988). Everyday Forgetting - Data from a Diary Study. Psychological Reports,
62(1), 299–303. https://doi.org/10.2466/pr0.1988.62.1.299
Terwee, C. B., Jansma, E. P., Riphagen, I. I., & De Vet, H. C. W. (2009). Development of a
https://doi.org/10.1007/s11136-009-9528-5
Thompson, C. L., Henry, J. D., Rendell, P. G., Withall, A., & Brodaty, H. (2015). How Valid
Are Subjective Ratings of Prospective Memory in Mild Cognitive Impairment and Early
Titov, N., & Knight, R. G. (2001). A video-based procedure for the assessment of prospective
0720(200101/02)15:1<61::AID-ACP689>3.0.CO;2-Y
Troyer, A. K., & Murphy, K. J. (2007). Memory for intentions in amnestic mild cognitive
https://doi.org/10.1017/S1355617707070452
Umeda, S., Nagumo, Y., & Kato, M. (2006). Dissociative Contributions of Medial Temporal
Uttl, B., & Kibreab, M. (2011). Self-report measures of prospective memory are reliable but
Van Den Berg, E., Kant, N., & Postma, A. (2012). Remember to Buy Milk on the Way
and Dementia. Journal of the International Neuropsychological Society, 18(04), 706–
716. https://doi.org/10.1017/S1355617712000331
Wallace, B. C., Small, K., Brodley, C. E., Lau, J., & Trikalinos, T. A. (2012). Deploying an
of the 2nd ACM SIGHIT symposium on International health informatics - IHI ’12 (p.
https://doi.org/10.1145/2110363.2110464
Walter, S. D., & Yao, X. (2007). Effect sizes can be calculated for studies reporting ranges for
852. https://doi.org/10.1016/j.jclinepi.2006.11.003
Wang, Y., Cui, J., Chan, R. C. K., Deng, Y., Shi, H., Hong, X., … Shum, D. (2009). Meta-
Waugh, N. (1999). Self-report of the young, middle-aged, young-old and old-old individuals
Wechsler, D. (1987). Wechsler Memory Scale-Revised manual. San Antonio, TX: The
Psychological Corporation.
https://doi.org/10.1080/09602019108401386
Wilson, B. A. (2009). Memory rehabilitation: Integrating theory and practice. New York, NY,
US: Guilford Press.
Wilson, B. A., Cockburn, J., & Baddeley, A. D. (1985). The Rivermead Behavioural Memory
Wilson, B. A., Cockburn, J., Baddeley, A., & Hiorns, R. (1989). The development and
validation of a test battery for detecting and monitoring everyday memory problems.
https://doi.org/10.1080/01688638908400940
Wilson, B. A., Emslie, H., Foley, J., Shiel, A., Watson, P., Hawkins, K., & Groot, Y. C.
Harcourt Assessment.
Wilson, B. A., Forester, S., Bryant, T., & Cockburn, J. (1990). Performance of 11–14 year
olds on the Rivermead Behavioural Memory Test. Clinical Psychology Forum, 30, 8–10.
Wilson, B. A., Greenfield, E., Clare, L., Baddeley, A. D., Cockburn, J., Watson, P., …
Wilson, B. A., Ivani-chalian, R., Besag, F. M. C., & Bryant, T. (1993). Adapting the
rivermead behavioural memory test for use with children aged 5 to 10 years. Journal of
https://doi.org/10.1080/01688639308402572
Wong Gonzalez, D. (2015). Prospective Memory Following Traumatic Brain Injury: A Meta-
Woods, S. P., Iudicello, J. E., Moran, L. M., Carey, C. L., Dawson, M. S., & Grant, I.
https://doi.org/10.1037/0894-4105.22.1.110
Woods, S. P., Moran, L. M., Dawson, M. S., Carey, C. L., Grant, I., & HIV Neurobehavioral
https://doi.org/10.1080/13854040701595999
Woods, S. P., Twamley, E. W., Dawson, M. S., Narvaez, J. M., & Jeste, D. V. (2007).
Deficits in cue detection and intention retrieval underlie prospective memory impairment
https://doi.org/10.1016/j.schres.2006.11.005
Yang, J., Zhong, F., Qiu, J., Cheng, H., & Wang, K. (2015). Dissociation of event-based
prospective memory and time-based prospective memory in patients with prostate cancer
Zhou, F.-C., Wang, Y.-Y., Zheng, W., Zhang, Q., Ungvari, G. S., Ng, C. H., … Zaza, S.
Zogg, J. B., Woods, S. P., Sauceda, J. A., Wiebe, J. S., & Simoni, J. M. (2012). The role of
9341-9
Tables
Table 1. Overview of the criteria met for each of the identified PM batteries and their main
characteristics
Measures; N of criteria met (max. 12); N of PM items; Retention interval; Duration (mn)
Rivermead Behavioural Memory Test; 8; 3; 20 mn; 30
Cambridge Behavioural Prospective Memory Test; 3; 8; 3, 15 and 20 mn; 40
Cambridge Test of Prospective Memory; 10; 6; 7, 13, 20 mn and 24 hours; 25–30
Memory for Intentions Screening Test; 9; 8; 2, 15 mn and 24 hours; 30–40
Royal Prince Alfred Prospective Memory Test; 6; 4; 15 mn, total duration of the session, when arrived at home and 1 week after the end of the session; 15
Table 2. Overview of the criteria met for each of the identified single-trial procedures and
their main characteristics
Measures; N of criteria met (max. 12); Retention interval
Envelope task; 3; 10 mn
Prompt card task; 2; total duration of the session
Telephone test; 2; 5 mn
Table 3. Overview of the criteria met for PM questionnaires and their main characteristics
Measures; N of criteria met (max. 10); N of PM items; Duration (mn)
Prospective Memory Questionnaire; 5; 52; 15-17
Prospective and Retrospective Memory Questionnaire; 9; 8; 3-5
Comprehensive Assessment of Prospective Memory; 5; 39; 13-15
Brief Assessment of Prospective Memory; 3; 16; 5-7
Table 4. Overview of the criteria met for each of the identified PM experimental procedures
and their main characteristics
Measures; N of criteria met (max. 12); N of PM items; Duration (mn)
Prospective Remembering Video Procedure; 5; 12; 18 and 21
Test Écologique de Mémoire Prospective; 6; 15; 20
Virtual Week; 8; 10/day(a); 60 (full version)(a)
Actual Week; 5; 10/day(a); 5 and 7-day version
(a) The number of days and items may differ between studies.
Table 5. Percentages of criteria met for PM test batteries
Criteria met (n) Percentages of criteria met Related tests
T (2) 40.00 RBMT, CAMPROMPT
CA (0) 0.00 –
Table 6. Percentages of criteria met for single-trial PM procedures
Criteria met (n) Percentages of criteria met Related tests
T (0) 0.00 –
CA (0) 0.00 –
V (0) 0.00 –
R (0) 0.00 –
N (0) 0.00 –
Table 7. Percentages of criteria met for self-reported PM questionnaires
Criteria met (n) Percentages of criteria met Related tests
T (2) 50.00 PRMQ, CAPM
QS (0) 0.00 –
Table 8. Percentages of criteria met for PM experimental procedures
Criteria met (n) Percentages of criteria met Related tests
T (1) 25.00 Virtual Week
N (0) 0.00 –
DV (0) 0.00 –
QS (0) 0.00 –
Table 9. Characteristics of the studies included in the meta-analyses
Study; Sample size; Population characteristics; Mean age in years (SD); Mean education in years (SD); PM measures administered
Carey et al., 2006; 71; HIV, healthy; 44.28 (10.13); 14.15 (2.54); MIST
Delprado et al., 2012; 168; aMCI, healthy; 74.82 (6.11); 13.13 (2.92); CAMPROMPT, Envelope task, Prompt card task
Dennis et al., 2010; 102; SBM, healthy; 31.60 (13.80); -; CAMPROMPT
Fleming et al., 2009; 72; brain injury, healthy; 30.02 (11.46); -; CAPM
Groot et al., 2002; 62; patients with various neurological conditions, healthy; 35.43 (10.97); 13.11 (2.65); CBPMT
Hannon et al., 1995; 129; brain injury, healthy; -; -; PMQ
Heffernan & O’Neill, 2012; 56; substance abuse, healthy; 24.20 (5.38); -; CAMPROMPT, PRMQ
Heffernan et al., 2010a; 40; substance abuse, healthy; 23.66 (4.92); -; CAMPROMPT, PRMQ
Heffernan et al., 2010b; 50; substance abuse, healthy; 18.64 (0.47); -; PRMQ, PRVQ
Heffernan et al., 2013; 78; substance abuse, healthy; 20.85 (2.39); -; PMQ, PRVP
Henry et al., 2007; 69; schizophrenia, healthy; 36.72 (10.54); 13.80 (2.70); Virtual Week
Lee et al., 2016; 154; aMCI, healthy; 73.63 (2.55); 13.38 (1.61); Envelope task, PRMQ
Man et al., 2011; 667; TBI, healthy; 44.98 (22.14); -; BAPM
Mathias & Mansfield, 2005; 50; TBI, healthy; 28.50 (9.85); 11.90 (1.86); RBMT
Mioni et al., 2013; 36; TBI, healthy; 31.86 (10.08); 12.11 (3.20); Virtual Week
Rabin et al., 2014; 257; naMCI, MCI, healthy; 80.78 (5.57); 14.49 (3.44); RPA-ProMem
Radford et al., 2011; 40; patients with various neurological conditions, healthy; 38.45 (15.38); 14.05 (2.49); RPA-ProMem
Rendell et al., 2012; 60; MS, healthy; 47.05 (9.86); 14.15 (2.95); Virtual Week
Smith et al., 2000; 397; AD, healthy; 73.21 (8.51); 12.65 (3.50); PRMQ
Thompson et al., 2015; 138; dementia, healthy; 78.62 (5.15); 11.62 (3.61); PRMQ, Virtual Week
Wilson et al., 1989; 294; brain injury, healthy; 43.10 (11.24)*; -; RBMT
Woods et al., 2007; 82; schizophrenia, healthy; 46.85 (10.38); 13.4 (1.78); MIST
Notes: AD = Alzheimer disease; aMCI = amnestic mild cognitive impairment; HIV = human immunodeficiency virus; MS = multiple sclerosis;
naMCI = non-amnestic mild cognitive impairment; SBM = spina bifida meningomyelocele; TBI = traumatic brain injury. BAPM = Brief
Assessment of Prospective Memory questionnaire; CAMPROMPT = Cambridge Test of Prospective Memory; CAPM = Comprehensive
Assessment of Prospective Memory questionnaire; CBPMT = Cambridge Behavioural Prospective Memory Test; MIST = Memory for Intentions
Screening Test; PMQ = Prospective Memory Questionnaire; PRMQ = Prospective and Retrospective Memory Questionnaire; PRVP =
Prospective Remembering Video Procedure; RBMT = Rivermead Behavioural Memory Test; RPA-ProMem = Royal Prince Alfred Prospective
Memory Test. * We estimated standard deviations for both patients and controls using a tabulated conversion factor (see Walter & Yao, 2007, for a
description of the statistical method used) because they were not provided in Wilson et al.’s (1989) study.
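For readers who want to apply the same kind of conversion, the short Python sketch below illustrates its general logic under the assumption of approximately normally distributed scores: a reported range is divided by the expected range of a standard normal sample of the same size. The conversion factor is approximated here by simulation rather than read from the published table, and the sample size and range in the example are hypothetical placeholders, not values taken from Wilson et al. (1989).

import numpy as np

def expected_normal_range(n, n_sims=100_000, seed=0):
    # Approximate E[max - min] for a standard normal sample of size n by simulation.
    # Walter and Yao (2007) tabulate exact values of this conversion factor.
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((n_sims, n))
    return (samples.max(axis=1) - samples.min(axis=1)).mean()

def sd_from_range(observed_range, n):
    # Estimate a standard deviation when a study reports only the range and the sample size.
    return observed_range / expected_normal_range(n)

# Hypothetical example: 50 participants whose scores span a range of 40 points.
print(round(sd_from_range(40, 50), 2))  # roughly 40 / 4.5, i.e. about 8.9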
Table 10. Planned subgroup analyses
Type of PM assessment; Tests; N of studies; N of participants; SMD; SE; p; Q; p (Q); I² (%)
Test batteries; RBMT; 2; 344; -0.64; 0.21; .002; 2.08; .14; 52.02
Single-trial measures; Envelope task; 2; 321; -1.77; 0.56; <.001; 17.57; <.001; 94.31
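As a quick check for readers, the heterogeneity percentages in Table 10 follow from the reported Q statistics through the usual formula I² = max(0, (Q - df)/Q) × 100, where df is the number of studies minus one. The minimal Python sketch below reproduces the two reported values; it is an illustrative verification, not part of the original analysis code.

def i_squared(q, n_studies):
    # Higgins' I^2: percentage of total variation across studies attributable to heterogeneity.
    df = n_studies - 1
    return max(0.0, (q - df) / q) * 100

# Values taken from Table 10 (two studies per subgroup, so df = 1).
print(round(i_squared(2.08, 2), 1))   # about 51.9, close to the reported 52.02 (rounding of Q)
print(round(i_squared(17.57, 2), 1))  # about 94.3, matching the reported 94.31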
Legends to figures
Figure 1. Flow chart depicting the study selection process through the phases of the review.