Systematic Review of The Psychometric Properties of The Boston Carpal Tunnel Questionnaire
3 authors:
Fujian Song
University of East Anglia
All content following this page was uploaded by Jose Carlos de Carvalho Leite on 01 June 2016.
Abstract
Background: The Boston Carpal Tunnel Questionnaire (BCTQ) is a disease-specific measure of
self-reported symptom severity and functional status. It is frequently used in the reporting of
outcomes from trials into interventions for carpal tunnel syndrome. We conducted a systematic
review of published studies on the psychometric properties of the BCTQ to determine the level
of evidence on the instrument's validity, reliability and responsiveness to date.
Methods: A search of the databases Medline, CINAHL, AMED and PsycINFO was conducted to
retrieve studies which investigated one or more of the psychometric properties of the BCTQ. Data
abstraction was undertaken by the first two authors.
Results: Ten studies were retrieved which met the inclusion criteria. One study evaluated face and
content validity (43 patients), eight studies assessed construct validity (932 patients), four studies
tested reliability (126 patients) and nine studies assessed responsiveness (986 patients).
Interpretability was evaluated in one study and acceptability in eight studies (978 patients).
Conclusion: The BCTQ is a standardised, patient-based outcome measure of symptom severity
and functional status in patients with carpal tunnel syndrome. The evidence base of the
psychometric properties indicates that the BCTQ is a valid, reliable, responsive and acceptable
instrument and should be included as a primary outcome measure in future CTS trials.
BMC Musculoskeletal Disorders 2006, 7:78 http://www.biomedcentral.com/1471-2474/7/78
encompass all relevant outcomes a combination of generic and specific measures needs to be employed [9].

The Boston Carpal Tunnel Questionnaire (BCTQ), also referred to as the Levine scale [10], Brigham and Women's Carpal Tunnel Questionnaire [11] and Carpal Tunnel Syndrome Instrument [12], is a patient-based outcome measure that has been developed specifically for patients with CTS. It has two distinct scales: the Symptom Severity Scale (SSS), which has 11 questions and uses a five-point rating scale, and the Functional Status Scale (FSS), containing 8 items which have to be rated for degree of difficulty on a five-point scale. Each scale generates a final score (sum of individual scores divided by number of items) which ranges from 1 to 5, with a higher score indicating greater disability. The BCTQ has been used as an outcome measure in clinical studies, and has also undergone extensive testing for validity, reliability and responsiveness. The purpose of this paper is to review and synthesise the evidence on the psychometric properties of the BCTQ published to date, and to make recommendations regarding its use in practice and research.

Methods
Search strategy and review criteria
The review considered all studies designed primarily to investigate an aspect of validity, reliability or responsiveness of the BCTQ in patients with carpal tunnel syndrome. We also considered any studies reporting on interpretability and patient acceptability of the BCTQ.

The bibliographic databases Medline (1966–2005), CINAHL (1982–2005), AMED (1985–2005) and PsycINFO (1887–2005) were searched using the following MeSH terms: carpal tunnel syndrome, outcome assessment, questionnaires, psychometrics, validity, reliability, reproducibility, responsiveness. Keyword searches were also made using 'Boston Carpal Tunnel Questionnaire' and 'carpal tunnel instrument'. Bibliographies of the articles obtained were checked to identify any studies not retrieved through the electronic databases. The search was limited to English language articles only.

The title and abstract of the articles retrieved were read and selected for inclusion if they fulfilled the following criteria: a prospective, observational study or clinical trial designed to evaluate validity, reliability, responsiveness, acceptability or interpretability of the BCTQ in patients with CTS. The full text was obtained for those articles which met the inclusion criteria.

A data extraction form was developed and used to summarise information regarding the psychometric properties assessed including design, methods, sample and main results of those studies included (see Additional file 1). Each article was independently read by the first two authors and a data extraction form completed. Any discrepancies between reviewers were discussed and agreed. The data from the studies were summarised in tables, and then qualitatively synthesized.

Psychometric properties assessed
The psychometric properties of outcome measures should be assessed by their face and content validity, construct validity, inter-tester and intra-tester reliability, responsiveness, interpretability, and acceptability and responder burden [13]. A full explanation of these concepts is beyond the scope of this paper and the reader is referred to Fitzpatrick et al [13] or Norman and Streiner [14]; however, a brief definition of these psychometric criteria in the context of patient-based questionnaires is given in Additional file 2.

Results
The search yielded 21 hits. After reading the titles and abstracts, eleven studies were excluded because they did not include the BCTQ (n = 7); the BCTQ was used as a criterion measure for other instruments (n = 1); the BCTQ was applied as a measure to ascertain the incidence or severity of CTS (n = 2); or the BCTQ was compared against other diagnostic tests designed to detect CTS (n = 1).

A total of ten studies which were primarily designed to evaluate one or several psychometric properties of the BCTQ were included [10-12,15-21]. All these studies applied the questionnaire in adults (aged 18 to 90 years old) with a diagnosis of carpal tunnel syndrome (Table 1).

Face and content validity were examined in a study which led to the original development of the BCTQ [10]; construct validity was assessed in eight studies, totalling 932 patients; responsiveness in nine studies (986 patients); test-retest reliability in four studies (126 patients); acceptability in eight studies (978 patients); and interpretability in one study (196 patients) (Table 1). A cross-sectional design was used in studies assessing face and content validity [10] and interpretability of the BCTQ [16]. The four studies which investigated reliability and the nine studies assessing responsiveness used prospective cohort data. One study [10] also used retrospective cohort data to assess responsiveness, which has not been reported here due to its limited value. The study design was observational for almost all studies, except one [19] which used data from a randomized controlled trial. Study power was considered in three out of nine studies. In both the studies by Bessette et al [16] and Katz et al [18] patients were recruited through a community-based observational study in Maine during 1992–93, and therefore it is possible that the same patients were included in both datasets.
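The scoring rule described for the two BCTQ scales (sum of the item ratings divided by the number of items, giving a value from 1 to 5) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `scale_score` and the example ratings are hypothetical, and rejecting incomplete responses is a choice made for this sketch, since the review notes that no recommendation exists on how missing items should be handled.

```python
SSS_ITEMS = 11  # Symptom Severity Scale: 11 questions (per the text)
FSS_ITEMS = 8   # Functional Status Scale: 8 items (per the text)


def scale_score(ratings, expected_items):
    """Return the scale score: sum of item ratings / number of items."""
    # Assumption for this sketch: incomplete questionnaires are rejected.
    if len(ratings) != expected_items:
        raise ValueError(f"expected {expected_items} ratings, got {len(ratings)}")
    # Each item is rated on a five-point scale, so only 1..5 is accepted.
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each item must be rated from 1 to 5")
    return sum(ratings) / expected_items


# Hypothetical responses from one patient (illustrative values, not study data).
sss_ratings = [3, 4, 2, 3, 3, 4, 2, 3, 3, 4, 2]
fss_ratings = [2, 2, 3, 1, 2, 3, 2, 1]

sss = scale_score(sss_ratings, SSS_ITEMS)
fss = scale_score(fss_ratings, FSS_ITEMS)
print(f"SSS = {sss:.2f}, FSS = {fss:.2f}")
```

Because each item is bounded by 1 and 5, the mean is bounded by the same range, with higher values indicating greater disability.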
[Table columns: Face/Content validity; Construct validity; Responsiveness; Reliability; Acceptability; Interpretability; Population; Design*; Observational (O) RCT; Study Power; No. of Patients included in the analyses]
Legend:
* P = Prospective, R = Retrospective, C = Cross-sectional
§ Workers' compensation recipients n = 113, non-recipients n = 155
Symptoms
Historical-objective severity scale Significant correlation (p < 0.001) (Mondelli et al, 2002)
Duration of symptoms Significant correlation (p < 0.001) (Mondelli et al, 2002)
Expectation of symptom relief 0.51 (0.56§) (Bessette et al, 1998)
Symptom relief 0.51 (0.56§) (Bessette et al, 1998) 0.59 (Katz et al, 1996)* 0.48 (Katz et al, 1996)*
0.31 (Katz et al, 1996)** 0.19 BCTQ (Katz et al, 1996)**
Nerve conduction studies
Median-nerve sensory conduction velocity 0.11 (Levine et al, 1993) 0.12 (Levine et al, 1993)
Electrophysiological study Significant correlation (p < 0.001) (Mondelli et al, 2002)
Clinical sensory tests
Semmes-Weinstein monofilaments 0.17 (Levine et al, 1993) 0.24 (Levine et al, 1993)
2-point discrimination 0.15 (Levine et al, 1993) 0.42 (Levine et al, 1993)
Clinical motor tests
Pinch strength 0.73 (Amadio et al, 1996) 0.47 (Levine et al, 1993) 0.60 (Levine et al, 1993)
Grip strength 0.87 (Amadio et al, 1996) 0.38 (Levine et al, 1993) 0.50 (Levine et al, 1993)
Patient-based measures of function
DASH (6 wk post-operative) 0.90 (Gay et al, 2003)
DASH (12 wk post-operative) 0.87 (Gay et al, 2003)
Legend:
† – Spearman's rank correlation coefficient and nonparametric test used due to non-normal distribution of the outcome scales.
§ – Weighted disease-specific health status measure in BCTQ (for each item subjects were asked how important relief of the specific "symptom" or improvement of the specific "function" was to the decision to have surgery).
‡ – Spearman correlation coefficient (minimum and maximum) between the BCTQ overall score and the 8 scales of the 36-item Short-form Health Survey.
* – Workers' compensation recipients
** – Workers' compensation non-recipients
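
The legend notes that Spearman's rank correlation was used where the outcome scales were not normally distributed. As a rough illustration of what that coefficient computes, the sketch below ranks both variables (ties receive the mean of their ranks) and then takes the Pearson correlation of the ranks. The function names and the example data are hypothetical and are not taken from any of the reviewed studies.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: functional status scores and grip strength (kg).
fss_scores = [1.5, 2.0, 2.5, 3.0, 4.0]
grip_kg = [30, 28, 20, 22, 10]
rho = spearman(fss_scores, grip_kg)
print(f"Spearman rho = {rho:.2f}")
```

Working on ranks makes the coefficient sensitive only to the ordering of the observations, which is why it suits skewed or ordinal scale scores.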
interventions. The use of effect sizes as responsiveness indices tended to generate slightly larger values than when the SRM was used, hence they should not be compared directly. The responsiveness of the BCTQ total score was reported in one prospective cohort study [16] (not in Table 3). This study assessed the relative responsiveness to change of generic versus disease-specific as well as unweighted versus weighted health status measures in carpal tunnel syndrome. The weighted disease-specific health status measure was obtained by asking the subjects how important relief of the specific symptom or improvement of the specific function measured by the BCTQ was to the decision to have surgery. The weighted BCTQ score (SRM = 1.56, ES = 1.99) was more responsive than the unweighted score (SRM = 1.36, ES = 1.57). The generic health status measures were less sensitive to change than the BCTQ.

Acceptability of the BCTQ
Acceptability was examined in eight studies (Table 1). The burden of completing the BCTQ was reported as minimal in two studies [10,15] based on no loss to follow-up. Greenslade et al [21] reported the mean time taken to complete the BCTQ as 5.6 minutes (± 3.5 min). Loss to follow-up or incomplete responses ranging from 1% to 10% were observed in four of the eight studies [12,16,17,21] and reached 19% in two studies [11,18]. Bessette et al [16] reported that only nine out of 231 subjects who completed the 6 month follow-up evaluation did not complete the BCTQ, giving a response rate of 96%. Greenslade et al [21] showed that two out of 312 pre- and post-operative questionnaires returned had missing information in the Symptom Severity Scale and 17 out of 312 in the Functional Status Scale, which corresponds to response rates of 99% and 95% respectively. There are no recommendations in the literature to date with regard to how missing responses should be managed or what threshold for the number of incomplete items would render the subscale data invalid.

Interpretability of the BCTQ
Interpretability was assessed in a study including 196 subjects [16]. Using the satisfaction with the outcomes of surgery as a discrete variable (unsatisfied, somewhat satisfied, and very or completely satisfied), the minimal clinically important difference (MCID) was estimated as the mean difference between the BCTQ scores before surgery and at 6 months after surgery for the unsatisfied and somewhat satisfied patients. The MCID is 0.74 for the BCTQ (total score based on the average of both subscales with scale ranges from 1 to 5), a value considered superior to generic measures, e.g. SF-36, in distinguishing clinically important differences after carpal tunnel release. The MCID for individual scales has not been reported; however, Atroshi et al [12] also presented summary statistics for each subscale according to those patients who were satisfied, somewhat satisfied and dissatisfied. The mean changes between pre- and postoperative scores on the Symptom Severity Scale for the satisfied, somewhat satisfied and dissatisfied patients were 1.6, 1.0 and 0.2 respectively, indicating that a minimum difference of 0.8 can be deemed clinically important using patient satisfaction as a criterion. For the Functional Status Scale the mean changes pre- and post-operatively were 1.0, 0.6 and 0.1 for the satisfied, somewhat satisfied and dissatisfied patients respectively, suggesting that a value of 0.5 is clinically important.

Discussion
The ten selected studies presented some strengths. The studies sampled a wide age range of participants, which is desirable considering that items of relevance for the young and the elderly are incorporated [13,22]. Sample sizes appeared to be adequate to yield stable correlations, although power calculations were reported in three studies only. The studies used prospective cohort data to assess the majority of the BCTQ psychometric properties. Prospective cohort studies are at greater risk of missing data during the follow-up phase, which in turn can lead to systematic measurement error. However, this does not appear to be a major problem in these studies, since participation rates were relatively high with losses to follow-up varying from none to 19%.

Limitations must also be acknowledged. Firstly, none of the ten studies assessed all the psychometric properties, making comparisons difficult, especially regarding face and content validity, acceptability, interpretability and reliability. Secondly, the factor structure of the BCTQ, an aspect of construct validity, has not been examined in the selected studies. Factor analysis is a method of assessing the construct validity of a questionnaire. In confirmatory factor analysis, the scores from each item in the scale would show high loadings, expressed as high Eigenvalues, on one of the predicted factors (e.g., symptom severity and functional status of the BCTQ). It has been hypothesized that the BCTQ comprises a two-factor structure consistent with symptom severity and functional status [10]. Because the constructs assessed are so distinct (symptom severity and functional status), it is likely that the BCTQ total score is less informative and helpful for clinical purposes even though four studies reported the total score. Also, Katz et al [18] found the two-factor structure consistent with the symptom severity and functional status scales in workers' compensation recipients; however, this large study did not investigate the BCTQ factor structure. Either confirming the original structure or suggesting alternative factor models may provide a better explanation of the data. Thirdly, test-retest reliability was reported in four studies and Pearson correlation coefficient was used
Study | BCTQ Functional Status Score (FS) | BCTQ Severity of Symptoms Score (SS) | Assessment interval
Amadio et al, 1996 | SRM = 1.26 | SRM = 1.75 | Before and 13.5 weeks post surgery (n = 22)
Atroshi et al, 1998 | SRM = 0.94 (0.72–1.16); ES = 0.94 | SRM = 1.7 (1.4–2.0); ES = 2.1 | Before and 13.5 weeks post surgery (n = 102)
Gay et al, 2003 | At 6 weeks after surgery SRM = 0.46; ES = 0.48. At 12 weeks after surgery SRM = 1.05; ES = 1.05 | At 6 weeks after surgery SRM = 1.67; ES = 1.74. At 12 weeks after surgery SRM = 2.01; ES = 1.96 | Before, 6 and 12 weeks post surgery (n = 34)
Greenslade et al, 2004 | SRM = 0.62 | SRM = 1.07 | Before and 13.5 weeks post surgery (n = 57)
Katz et al, 1994 | SRM = 0.96; ES = 0.86 | SRM = 2.02; ES = 2.03 | Before and 13.5 weeks post surgery (n = 43)
Katz et al, 1996* | Male WC/WCNon ES = 0.65/ES = 0.76. Female WC/WCNon ES = 1.25/ES = 1.44 | Male WC/WCNon ES = 1.43/ES = 1.33. Female WC/WCNon ES = 1.63/ES = 2.13 | Before and 27 weeks post surgery (n = 268)
Levine et al, 1993 | SRM = 0.71†. Greater satisfaction with the outcomes of surgery versus improvement of FS = 0.50 (p < 0.01) | SRM = 1.13†. Greater satisfaction with the outcomes of surgery versus improvement of SS = 0.54 (p < 0.01) | Before and 13.5 weeks after surgery (n = 26)
Mondelli et al, 2002 | ES = 1.23§ | ES = 2.33§ | Before and 27 weeks post surgery (n = 219)
Legend:
WC – Workers' compensation recipients
WCNon – Workers' compensation Nonrecipients
* ES were not reported in paper but have been calculated based on values given in tables
† responsiveness reported as ES, however calculation given is SRM (mean change/S.D change)
§responsiveness assessed as differences but ES calculated from values given in paper
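
The legend above distinguishes the two responsiveness indices reported in the table. A minimal sketch of both is given below, assuming the usual definitions: the SRM divides the mean change by the standard deviation of the change scores (as stated in the legend), while the ES divides the mean change by the standard deviation of the baseline scores. The function names, the sign convention (change = before minus after, so that improvement is positive), the use of the sample standard deviation, and the example scores are all assumptions made for illustration and are not data from the reviewed studies.

```python
from statistics import mean, stdev


def effect_size(before, after):
    """ES: mean change divided by the SD of the baseline scores."""
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(before)


def standardized_response_mean(before, after):
    """SRM: mean change divided by the SD of the change scores."""
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)


# Hypothetical pre- and post-operative scale scores for five patients.
pre_op = [3.0, 3.5, 2.5, 4.0, 3.0]
post_op = [1.5, 2.5, 1.5, 2.0, 2.0]

es = effect_size(pre_op, post_op)
srm = standardized_response_mean(pre_op, post_op)
print(f"ES = {es:.2f}, SRM = {srm:.2f}")
```

The two indices share a numerator but use different denominators, so the same data can yield noticeably different values, which is why the review cautions against comparing them directly.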
in two of them [10,20], a statistical approach recognized as inappropriate as it only measures the strength of association between scores and not agreement [23]. Fourthly, in one study [10] the analyses of responsiveness of the BCTQ compared data from two independent cohorts (one prospective and another retrospective). It is likely that the information obtained retrospectively is less accurate than the prospective one.

Validity of the BCTQ was assessed in terms of face, content and construct validity. Face and content validity were assessed through consultation with individuals with relevant expertise in order to generate the content of the questionnaire. The content of the BCTQ had been examined in one study [10], suggesting that the questionnaire items match the test objectives and the impact of carpal tunnel syndrome on patients' daily life. The construct validity of the BCTQ had been assessed in the majority of the ten studies. In the selected studies construct validity was assessed as the extent to which the items of the BCTQ 'behaved' the way that the construct it purports to measure (that is, symptom severity and functional status) should 'behave' with regard to other established measures (e.g., the Disabilities of the Arm, Shoulder, and Hand Questionnaire, pinch and grip strength, satisfaction with the outcomes of surgery). Stronger correlations were observed between the BCTQ and the other disease- and region-specific measures such as the DASH and Arthritis Impact Measurement Scale, than between the BCTQ and generic objective measures such as SF-36 and Quality of Life Questionnaire, indicating greater overlap between the former measures. The BCTQ also demonstrated construct validity when its internal consistency was examined. A high Cronbach alpha indicates homogeneity of items and supports the validity of the construct being tested [14].

Responsiveness to clinical change is another important feature of an outcome measure. The data on effect sizes and standardized response means demonstrated that the Symptom Severity and Functional Status Scales are able to detect clinically meaningful change resulting from the treatment for carpal tunnel syndrome and yielded large effect sizes over a 6 month interval. However, in two of the studies the data on responsiveness were based on a subgroup of patients reporting greater satisfaction with surgery. Responsiveness indices in these are therefore likely to be larger than in the other studies. Using a responsive outcome measure will facilitate the detection of moderate treatment effects in clinical research.

The BCTQ has shown good levels of acceptability with response rates of 90% and above and takes less than 10 minutes to complete. The interpretability has been assessed in relation to patient satisfaction with the outcomes of surgery and an overall difference of 0.74 has been designated as the minimal clinically important difference.

This review considered English language publications only; however, there are a number of published studies which have used translations of the BCTQ into other languages including Italian [24], Swedish [12], Portuguese [25] and Spanish [20], widening the applicability of the BCTQ to non-English speaking settings.

Scale development is an ongoing process which may never be complete. Properties such as the validity, reliability and responsiveness investigated to date are not fixed properties but specific to the instrument used in a given situation and with a given population. The BCTQ was developed for use in heterogeneous samples of patients of a wide age range with CTS. Further research is needed to examine the consistency of its psychometric properties, with special attention to the factor structure, among specific populations, test-retest reliability using appropriate statistical measures and defining the MCID for each subscale against appropriate external criteria.

Clinicians looking for a disease-specific measure for assessing pre- and post-operative symptom severity and functional status can be confident that the BCTQ is responsive to change, repeatable over time and that the scales measure what they purport to measure. The BCTQ is also acceptable and quick to administer and, as it relies on self-report, can be applied via postal methods.

Conclusion
In summary, the BCTQ offers a standardised patient-based outcome measure of symptom severity and functional status for which there is good evidence on validity, reliability and responsiveness and it should be recommended for inclusion in future trials on carpal tunnel interventions.

Competing interests
The author(s) declare that they have no competing interests.

Authors' contributions
JCL carried out the searching and reviewing of studies and drafted the manuscript. CJH conceived of the original idea, obtained funding and participated in the search, review and drafting of the manuscript. FS provided expert advice on systematic reviewing and contributed to the manuscript. All authors read and approved the final manuscript.