John R. Harry, PhD1, Jacob Hurwitz, MS2, Connor Agnew, MS3, Chris Bishop, PhD4
1 Department of Kinesiology & Sport Management, Texas Tech University, Lubbock, TX USA
2 Department of Kinesiology, Mississippi State University, Starkville, MS USA
3 Department of Athletics, Appalachian State University, Boone, NC USA
4 Faculty of Science and Technology, London Sport Institute, Middlesex University, London, UK
Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Correspondence Address: John R. Harry, PhD, CSCS; Dept. of Kinesiology & Sport
Management, Texas Tech University, 3204 Main Street, Lubbock, TX 79409 USA; Email:
Abstract
Sports science practitioners have demonstrated continued reliance on group-style statistical analyses that are held to critical
assumptions not achievable in smaller-sample team settings. There is justification that these
team settings are better suited for replicated single-subject analyses, but there is a dearth of
literature to guide sports science professionals seeking methods appropriate for their teams.
In this report, we summarize four methods’ ability to detect performance adaptations at the
replicated single-subject level and provide our assessment of the ideal methods. These
methods included the model statistic, smallest worthwhile change (SWC), coefficient of
variation (CV), and standard error of measurement (SEM), which were discussed alongside
step-by-step guides for how to conduct each test. To contextualize the methods’ use in
practice, real countermovement vertical jump (CMJ) test data were used from four athletes
(two females and two males) who completed five bi-weekly CMJ test sessions. Each athlete
was competing in basketball at the NCAA Division 1 level. We concluded the combined
application of the model statistic and CV methods should be preferred when seeking to
detect real and meaningful performance changes, as this combined
approach ensures that the differences between tests are A) not random and B) reflect a
worthwhile change. Ultimately, the use of simple and effective methods that are not restricted
by the assumptions of group-based analyses can help practitioners make actionable, athlete-specific decisions.
Within sports organizations, there has been a concerted effort to form collaborative teams
of practitioners and scientists who can apply scientific analysis methods. Typically, the main objective of this effort is to provide empirical
expertise to help guide actionable decisions. This has led to the development of the field
known as “sports science” (16). Despite the benefits realized through this effort and the
development of the larger field of sports science, there has been little advancement with
respect to practices centering on individual athlete evaluations in place of team (i.e., group)
average evaluations. Ultimately, practitioners and team-based sport scientists will continue to
face challenges when seeking to determine whether athletes are adapting to training or related
interventions. At face value, these challenges are due to the available literature against which
practitioners can compare their athletes' physical performances, as studies are largely limited
in design, sample, or both (1, 28, 32, 44, 59). In reality, multiple factors are to blame for the limited ability to make actionable decisions.
Our experiences indicate the limited potential to make actionable decisions is predominantly
due to three realities within the strength and conditioning community. First, in the United
States there have been very few vacancies within sports organizations that are appealing to
qualified sports scientists, which forces untrained or inexperienced practitioners and scientists
into such roles. Second, strength and conditioning coaches are often assigned to, or volunteer
for, sports scientist roles despite a lack of training in areas concerned with research
methodology, data management and analysis, and statistics. Third, formally-trained scientists
who assume roles in sport are predominantly exposed to statistical tests aimed at generalizing
a sample’s average result to the sample’s larger population (26). Obviously, we cannot
control the current state of the literature, personnel decisions, nor the availability of appealing
vacancies for adequately trained candidates in sports organizations. However, we can provide
guidance on methods suited to smaller-sample team settings to help scientists and practitioners conduct appropriate analyses and obtain actionable
test results.
Group-average analyses remain the conventional approach for determining a training
response or adaptation, and this practice is echoed in the sports science literature (4, 17, 18,
34, 50). However, replicated single-subject approaches have been used by some to detect
changes and this approach may provide the most value to sports scientists and practitioners
(11, 29, 31, 55), with particularly high value for those working with smaller squads of athletes.
A foundation of literature supporting replicated single-subject
analyses in athletes is forming (11, 29, 31), helping to create the impetus needed to move
away from group-average assessments, when necessary, without reliance on subjective visual
inspection or an arbitrary percentage change (i.e., 5% or 10% improvement) as an indicator of real change. The next logical step
to strengthen the foundation is to demonstrate methods that can be used to conduct high-
quality objective assessments and provide empirical evidence for individual athletes’
performance adaptations. The ultimate objective of the current article was to
align with the NSCA's Essentials of Sport Science textbook, which highlights the importance
of understanding whether individual athletes are adapting to
their training or related interventions and subsequently making data-driven changes (14).
The purpose of this report was to summarize methods for practitioners to explore for their
own purposes related to individual athlete assessments. We provide explanations for specific
methods alongside step-by-step guides for conducting each test for further exploration. In addition, we compare each method and provide our
assessment of the methods we consider most relevant for practitioners working with athlete populations. To contextualize how each
method can work in practice, we applied them to a small subset (n = 4) of real, longitudinal
data obtained from athletes competing in men’s or women’s basketball at the NCAA Division
1 level. Finally, we provide an editable Microsoft Excel worksheet which includes the calculations needed to apply each method (supplemental digital content).
It is beyond the scope of this report to discuss in-depth the limitations of group-level
statistical testing (e.g., t-test, ANOVA, etc.) in team-based settings, as this has been done
elsewhere (6, 8, 9, 22, 33). The key point for practitioners is that group-level testing requires
a normal distribution where the data from the sample (i.e., the athletes) are generalized prior
to analysis. As athlete samples are typically quite small, it is unlikely that a normal
distribution will occur, leading to issues related to the sample standard deviation and standard
error (13). Moreover, group-level approaches with smaller samples are limited by the
known misrepresentation of the individuals from which the group average is obtained (8).
Replicated single-subject approaches (11, 29, 31, 55) are ideal for smaller-sample settings
where the individuals are not represented by the group average. However, further examples
are needed to help practitioners move away from isolated use of group-average assessments,
when necessary, without reliance on subjective visual inspection or trend analyses for
individual assessments. Consequently, the next logical step is to demonstrate methods that
can be used to conduct high-quality assessments and provide empirical evidence for
Importantly, there are multiple methods at one’s disposal to select from that can be used to
quantitatively explore individual athlete responses. However, some of those methods might
not be ideal even though they can be used at the individual level. Some of these methods include
the Mann-Whitney U Test (9), bootstrapping (25), and multiple regression (24). Other
options include the percentage of non-overlapping data points (38), counting the number of data points above a
specific threshold (41), confidence intervals or effect sizes (52, 53), and statistical process
control (56). An advantage of these latter approaches is that they provide adequate scientific rigor
because the means and standard deviations of increasing or decreasing sequential data are treated
with consideration for their slope. This is important when comparisons are made between very
different mean values (e.g., 10 vs. 100) with similar amounts of normalized variation (i.e.,
10%). These methods have additional value when seeking to determine whether an athlete's
current test result is different from a previous series of tests, or whether the results of one
training period are different from those of a subsequent training period. However, there are
consequences of these methods. First, they can involve between-subjects metrics, notably the
group standard deviation, which can misrepresent the individuals within the team or group (8). Second, calculating rolling averages across several
test sessions minimizes the movement variations (i.e., strategies) that dictate an athlete's
performance outcomes during a test session when the test sessions are pooled together. As
described by Bates (6), all measurement outcomes are dependent upon the state of the
organism interacting with the environment at a specified moment in time. It is our opinion
that individual athlete assessments must seek to account for the uniqueness of individual
athlete strategy, as the available number of strategies changes over time due to ongoing changes
within the athlete between consecutive test sessions. Those who may be interested in procedures that involve between-
subject metrics or rolling averages are referred to previous literature (36, 57, 64).
The quantitative approaches we feel have the most potential to reveal test-to-test performance
changes include single-subject ‘significance’ testing [i.e., the model statistic], and comparing
the magnitude of change against the smallest worthwhile change [SWC], the standard error of
measurement [SEM], or the coefficient of variation [CV] (10, 29, 31, 61, 65, 66). The
usefulness of each of these methods is that all were designed for, or can be relatively simply
applied in, single-subject analyses. In addition, all are reasonably simple to understand.
Importantly, each method requires multiple testing efforts (i.e., trials) to be included for each
comparison, which discourages the common practice of including only the “best” effort or
only very few trials (42) from the test sessions. Including multiple trials is a critical
component of the single-subject methodology (6, 33) to reduce the variation in an athlete’s
data, thereby increasing statistical power (7) and providing stable performance data that
better reflect an athlete's true performance capabilities (40). The resulting outcome for the
presence of a change is therefore obtained with consideration for both the absolute change of
performance and the potential variations among trials within the test (i.e., consistency of each
individual's result), which can mask changes when not accounted for (8). Importantly,
accounting for variation means the performance result for each athlete's test session(s)
reflects the movement strategies and feedback which, ultimately, determine bodily movement and the amount of performance variation displayed.
Model Statistic
The model statistic technique is a critical difference method that can be loosely considered a
single-subject analog to significance testing, in which the absolute difference between test
session means is compared to a probabilistic critical difference (7, 9). It was designed in the early 1990s by
Bates et al. (7) and has been used to demonstrate the value of single-subject comparisons
between two conditions relative to the group-level equivalent (31). Critical values were
generated (7) for selected trial sizes (i.e., the number of trials used to calculate the test session
average) and statistical probabilities (i.e., alpha levels; α), which are provided in Table 1. The
final decision from the test therefore indicates the probability for whether the difference
between test sessions was due to random chance, using the user’s a priori choice among
10%, 5%, or 1% probability levels. A unique feature of the model statistic is that it does not
calculate interval limits as one standard deviation away from the mean, and instead
incorporates the weighted mean standard deviation (7), which can also be described as the
variation in the collective number of trials or observations used in the comparison (7, 21).
Ultimately, the critical values, and in turn the critical difference score, are analogous to a
1.96*standard deviation interval, where the critical difference is a cutoff within which 95
percent of scores fall (i.e., between ± the critical difference). When test sessions with different trial
sizes are compared, we recommend using the critical value associated with the smallest
trial size because the test is more conservative when smaller trial sizes are used (i.e., more
difficult to return a difference that is not due to chance). The main limitation to the model
statistic is that the critical value table was created using vertical ground reaction force data
obtained during the support phase of running. It may be that other types of data or other tasks would require different critical values, though this has not been established.
Traditional paired-samples t-tests and effect sizes could be used in a similar way, but we favor
the model statistic for a few reasons. First, t-tests are inappropriate for repeated test
assessments (e.g., more than two comparative tests) without corrections for familywise error,
require a minimum number of samples (i.e., trials in this case) for adequate statistical power
(7), and the observations that make up the comparative means are subject to assumptions related
to normality and independence; satisfying these and related criteria is not typically achievable in athlete testing environments. For effect
sizes, clinical data suggests a similar rate of “differences” will occur in comparison to the
model statistic (23). There are lingering problems, however, related to effect sizes. First, there
is no indication of whether a difference is due to random chance, unlike the model statistic. Second, effect sizes are limited by the need to select a
magnitude threshold for interpretation based on subjective scales (19, 37, 54), created mostly
from recreationally active samples, or based on athlete training level and not primary sport or
training stimulus (54). As reducing subjectivity and guesswork is critical for the applied
sports scientist, we feel effect sizes would not be appropriate in this type of scenario.
Comparisons between tests for an individual athlete can be performed with the model statistic
using the following procedural steps:
1. Calculate the absolute mean difference between the pre-test (X1) and post-test (X2) session means
2. Calculate the mean standard deviation (SD) across the trials from the two sessions
3. Multiply the desired critical value (i.e., test statistic) from Table 1 by the mean SD to obtain the critical difference
4. Compare the absolute mean difference to the critical difference; if the absolute mean difference is greater, the change is considered real (i.e., not due to random chance)
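To make these steps concrete, below is a minimal Python sketch of the model statistic comparison. The function names are our illustrative choices, the weighting of the mean SD by trial count is one reading of the weighted mean standard deviation described above (it reduces to a simple average with equal trial sizes), and the critical value must be supplied by the user from Table 1 (the table's values are not reproduced in the code):

```python
import math

def _mean(x):
    return sum(x) / len(x)

def _sd(x):
    """Sample standard deviation (n - 1 denominator)."""
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))

def model_statistic(pre_trials, post_trials, critical_value):
    """Model statistic comparison of two test sessions (a sketch).

    pre_trials / post_trials: lists of trial results from each session.
    critical_value: chosen from Table 1 for the trial size and alpha level;
    when trial sizes differ, use the value for the smaller trial size.
    """
    # Step 1: absolute difference between session means
    mean_diff = abs(_mean(post_trials) - _mean(pre_trials))
    # Step 2: mean SD across sessions, weighted by trial count here
    n1, n2 = len(pre_trials), len(post_trials)
    mean_sd = (n1 * _sd(pre_trials) + n2 * _sd(post_trials)) / (n1 + n2)
    # Step 3: critical difference
    critical_difference = critical_value * mean_sd
    # Step 4: the change is treated as non-random when the mean
    # difference exceeds the critical difference
    return mean_diff > critical_difference

# Example with placeholder trial data and a placeholder critical value:
# model_statistic([0.39, 0.41, 0.40], [0.55, 0.54, 0.56], critical_value=2.0)
```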
The SWC approach is a form of magnitude-based inference (64) that was pioneered as an
alternative to traditional statistical tests (37). The SWC has greater specificity to sports science data sets,
which often involve small sample (or trial) sizes, have relatively higher variations among measurements, and
for which sports scientists may be less concerned with whether a change is or is not due to
unexplainable chance (15, 47). A constant of 0.2 is used to establish the SWC threshold (see
below) for trained populations or athletes (60). This is because it aligns with the commonly
accepted, albeit subjective, definition for a “small” magnitude difference, or effect size (19).
A constant of 0.6 can be used in the SWC calculation for untrained populations or youth
athletes because large adaptations can be realized in those populations following initial or
short-term periods of training (47). While the selection of the SWC constant could be
objectively calculated (46) for a specific sample or athlete, we elected to use what we
determined from the sports science literature to be the most common SWC approach.
Although the SWC approach is typically utilized at the group level, it can be easily applied at
the replicated single-subject level. Comparisons between tests for an individual athlete can be
performed with the SWC approach using the following procedural steps, which are slightly
modified from the typical group-level application:
1. Calculate the athlete's mean performance display for the pre-test (X1) and post-test (X2) sessions
2. Calculate the mean difference between the pre-test and post-test sessions
3. Calculate the SD across all test sessions (pre- and post-test sessions; referred to here as SDGlobal)
4. Multiply SDGlobal by the chosen constant (0.2 for trained athletes; 0.6 for untrained populations or youth athletes) to obtain the SWC
5. Compare the mean difference to the SWC; if the mean difference is greater, the change is considered worthwhile
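A minimal Python sketch of these SWC steps follows; it assumes SDGlobal is computed over the pooled trials from both sessions, which is one reading of step 3 (names are illustrative):

```python
import statistics

def swc_change(pre_trials, post_trials, constant=0.2):
    """SWC comparison of two test sessions (a sketch).

    constant: 0.2 for trained athletes, 0.6 for untrained or youth athletes.
    """
    # Steps 1-2: session means and their difference
    mean_diff = statistics.mean(post_trials) - statistics.mean(pre_trials)
    # Step 3: SD across all trials from both sessions (SDGlobal), assuming
    # the pooled-trials interpretation of the text
    sd_global = statistics.stdev(pre_trials + post_trials)
    # Step 4: SWC threshold
    swc = constant * sd_global
    # Step 5: worthwhile change when the difference exceeds the SWC
    return abs(mean_diff) > swc
```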
When comparing the percent change between test sessions to the CV, the objective is to
determine whether performance changes or differences between conditions are greater than
the variation in the test (10). Therefore, this technique is not a probabilistic test, and instead
reveals whether the observed difference exceeds the ‘noise’ (as represented by the CV)
inherent in the results. Rather than using the mean difference between the pre- and post-tests,
the percent difference is typically used and compared to the CV. This may be particularly
helpful when communicating either a positive or negative change (60, 62). A benefit of this is that it may be more feasible to explain to
athletes, coaches, or other stakeholders "how" differences are determined for each
comparison. In addition, the CV approach can be applied such that the unit of measure for
the test is retained and still tells the same story as when converted to a percentage value, which
can simplify interpretation for some practitioners, stakeholders, or both. While the procedure
outlined here uses one standard deviation to calculate the CV, 1.5 or 2 standard deviations
could also be used to expand the “range of scores”, which could be useful when seeking to
modify the sensitivity of the test and account for what is quantified by each metric’s interval.
This should not be confused with similar processes to compare data from different numeric
scales (64). Rather, it is a way to control the outcome sensitivity for performance tests known
to have greater or lesser movement variability. Comparisons between tests for an individual
athlete can be performed with the CV technique using the following procedural steps:
1. Calculate the athlete's mean performance display for the pre-test (X1) and post-test (X2) sessions
2. Calculate the SD across the pre-test session's trials (SD1)
3. Calculate the CV for the pre-test session as a percentage: CV1 = (SD1/X1) × 100
4. Calculate the percent change between sessions: ((X2 − X1)/X1) × 100
5. Compare the percent change between pre-test and post-test sessions to CV1; if the percent change is greater, the change exceeds the noise in the test
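The following Python sketch implements the CV comparison under the assumption that the 'noise' is taken from the pre-test session (CV1); the sd_multiplier parameter illustrates the 1.5 or 2 SD expansion discussed above (names are ours):

```python
import statistics

def cv_change(pre_trials, post_trials, sd_multiplier=1.0):
    """CV comparison of two test sessions (a sketch).

    sd_multiplier: 1.0 by default; 1.5 or 2.0 widens the CV band to
    reduce the sensitivity of the test.
    """
    x1 = statistics.mean(pre_trials)   # pre-test mean
    x2 = statistics.mean(post_trials)  # post-test mean
    # Steps 2-3: pre-test SD and CV, expressed as a percentage
    cv1 = (sd_multiplier * statistics.stdev(pre_trials) / x1) * 100.0
    # Step 4: percent change between sessions
    pct_change = ((x2 - x1) / x1) * 100.0
    # Step 5: the change exceeds the test's noise when larger than CV1
    return abs(pct_change) > cv1
```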
For the SEM method, the objective is to compare the change in performance to the ‘noise’,
like the CV method, with data inherently retained in the original unit of measurement. This
means that the absolute mean difference (i.e., change of performance) is used in conjunction
with the precision of the test data. The SEM has been described as the intra-individual
version of the SD (13), but there appears to be less practical application of the SEM, perhaps
because it is less familiar to practitioners.
Although there are two common formulae used to calculate the SEM (2), the inability to
obtain an intra-class correlation coefficient (ICC) at the single-subject level requires that the
SEM is calculated as the square root of the mean square error (MSE), with some
modifications to align with the single-subject dataset. This process may be more complicated
to some than using the ICC. However, it could be beneficial because it avoids the
uncertainties connected to the ICC and allows for more consistency (65) from test to test.
Comparisons between tests for an individual athlete can be performed with the SEM method
using the following procedural steps:
1. Calculate the athlete's mean performance display for the current test (X1) and the grand mean across all trials from the compared sessions (XTotal)
2. Calculate the sum of squares (SS) of every trial (xi) relative to the grand mean:
SS = Σ(xi − XTotal)²
3. Calculate the MSE using the SS and degrees of freedom (df; number of trials − 1),
where number of trials equals the sum of the number of trials recorded across sessions:
MSE = SS/df
4. Calculate the SEM as the square root of the MSE:
SEM = √MSE
5. Compare the absolute mean difference between tests to the SEM; if the difference is greater, the change exceeds the precision of the test
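Below is a minimal Python sketch of the SEM comparison, assuming the SS pools every trial from the compared sessions around the grand mean, per the steps above:

```python
import math
import statistics

def sem_change(pre_trials, post_trials):
    """SEM comparison of two test sessions (a sketch)."""
    all_trials = pre_trials + post_trials
    grand_mean = statistics.mean(all_trials)            # XTotal
    # Step 2: sum of squares of every trial around the grand mean
    ss = sum((x - grand_mean) ** 2 for x in all_trials)
    # Step 3: mean square error with df = total trials - 1
    mse = ss / (len(all_trials) - 1)
    # Step 4: SEM as the square root of the MSE
    sem = math.sqrt(mse)
    # Step 5: change exceeds the test's precision when the absolute
    # mean difference is larger than the SEM
    mean_diff = abs(statistics.mean(post_trials) - statistics.mean(pre_trials))
    return mean_diff > sem
```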
To demonstrate the similarities and/or differences among methods with respect to detecting
performance changes in athletes, we used data obtained across five bi-weekly test sessions
from two female and two male NCAA Division 1 Basketball players. All were healthy,
uninjured, and active members of an NCAA Division 1 Basketball program throughout the
data collection period. Participants provided written informed consent, and data were
collected in accordance with the Declaration of Helsinki as approved by the local Institutional
Review Board. Although this report does not involve formal research methodology nor is it
an “original research” paper, it is a series of case examples using real data. Because of this,
we felt compelled to acknowledge the ethical considerations for using real data in this report.
The countermovement vertical jump (CMJ) was selected as the test activity, with ground
reaction force data obtained during testing. The CMJ was used for this report because it is
commonly used in research when seeking to understand physical ability among athlete
populations (3, 20, 28, 44, 49, 58). The CMJ was also selected because it (and related jumps)
is performed frequently during competitive play in basketball (27, 48) and strongly associated
with sport-specific qualities such as speed, strength, and agility (4, 45, 49, 51). In addition,
CMJ tests are routinely performed in laboratory and practitioner settings where multiple trials
are collected for each athlete, thereby satisfying the requirements of each method (see
below). The modified reactive strength index (RSIMOD) and vertical jump height were
included as the primary and secondary CMJ performance metrics, calculated as the ratio of
vertical jump height to time to takeoff and center of mass flight height, respectively (30). These
metrics were selected according to a recent framework produced to guide practitioners in the
selection of useful metrics to examine CMJ abilities (12). This is because RSIMOD is a valid
and reliable surrogate for athletic explosiveness (43), and RSIMOD appears influenced primarily by jump height rather than its other component part, time to takeoff (32).
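For readers less familiar with these metrics, here is an illustrative calculation of jump height and RSIMOD; the flight-time estimate of center of mass flight height is an assumption for this sketch, as force-plate calculation methods vary:

```python
def cmj_metrics(flight_time_s, time_to_takeoff_s, g=9.81):
    """Illustrative jump height and RSImod calculation (a sketch).

    flight_time_s: time between takeoff and landing (s).
    time_to_takeoff_s: time from movement onset to takeoff (s).
    """
    # Center of mass flight height from flight time (one common estimate)
    jump_height_m = g * flight_time_s ** 2 / 8.0
    # RSImod: ratio of jump height to time to takeoff
    rsi_mod = jump_height_m / time_to_takeoff_s
    return jump_height_m, rsi_mod

# Example: a 0.50 s flight and 0.80 s time to takeoff give roughly
# a 0.31 m jump height and an RSImod of about 0.38.
```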
Performance changes from session to session were determined using the model statistic,
SWC, CV, and SEM methods described previously. Table 2 provides a summary of the
cumulative increases of performance detected by each method (i.e., 4 possible changes per
method of analysis, per athlete, resulting in 16 possible changes). When detecting increases in
RSIMOD, the most sensitive methods were the SWC and CV methods, with both detecting an
increase of performance between test sessions for 38% (6 out of 16) of the total possible
comparisons (Table 2). The model statistic and SEM methods were more conservative,
detecting increases in performance between test sessions for 25% (4 out of 16) and 19% (3
out of 16) of the total possible comparisons, respectively (Table 2). For jump height, the most
sensitive method was SWC, with increases detected during 50% (8 out of 16) of the total
number of comparisons (Table 2). The next most sensitive method was the CV method,
detecting increases in jump height for 44% (7 out of 16) of the total number of comparisons
(Table 2). The model statistic method detected increases in jump height for 31% (5 out of 16)
of the number of comparisons, while the SEM method detected increases for only 6% (1 out of 16) of the total number of comparisons (Table 2).
Importantly, the four methods were largely inconsistent with respect to detecting performance
increases for the same comparisons, as the methods were consistent during only ~13% (4 out
of 32) of the total possible comparisons across the four athletes (see Figures 1-4). The reason
the SWC detected a much greater number of performance gains versus all other methods is
that the equation uses only a portion of the athlete’s variation (20% for the 0.2 constant; 60%
for the 0.6 constant). This creates a scenario in which the SWC will inherently detect a
greater number of performance changes than the other methods, and ultimately, the risk of
false “gains” is greatest. As such, the methods should not be used interchangeably, nor should
reports of change be compared between or among assessments with different methods for
detecting change.
The risk of false-positive outcomes should come into play from the perspective of
determining an actionable change. It is therefore our opinion that a test for change in team
settings includes three key components. First, the test should be objective to eliminate
guesswork. Second, the test should indicate whether the difference between two tests is random or likely to be
legitimate, to avoid erroneous interpretations. Third, the test should provide a simple way to
determine whether a difference between two tests is meaningful to the athlete, practitioner, or
related stakeholder. According to these requisite components and the benefits, limitations, or
both discussed for each method, we recommend the model statistic and CV approaches be
used in parallel. This combined use of approaches will provide practitioners with the ability
to identify actionable changes in their athletes. Further, it reduces the likelihood of
decisions being made based on the presence of potentially random differences that seem meaningful.
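Reusing the model_statistic and cv_change sketches from earlier sections, the recommended parallel use could look like the following, where a change is flagged as actionable only when both criteria are met:

```python
def actionable_change(pre_trials, post_trials, critical_value, sd_multiplier=1.0):
    """Combined model statistic + CV decision (a sketch).

    A change is actionable only when it is both non-random
    (model statistic) and larger than the test's noise (CV).
    """
    significant = model_statistic(pre_trials, post_trials, critical_value)
    meaningful = cv_change(pre_trials, post_trials, sd_multiplier)
    return significant and meaningful
```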
Table 3 shows the cumulative increases of performance based on the model statistic, CV, and
combined approach (the table is also available as an editable Microsoft Excel file, provided
with the article as supplemental digital content). The fact that the model statistic and CV
approaches did not identify athlete-specific differences with the same pattern supports the
potential use of the two methods in parallel. The reason for the different patterns of
differences between the approaches relates to the objectives for which the tests were
designed. From the sports scientist’s perspective, the model statistic indicates whether the test
results are legitimately different due to its conservativeness, while the CV indicates the
meaningfulness of the difference between the test results. As such, they are complementary.
The recommendation for using the model statistic may be obvious, as it is the only method
indicating whether the difference between tests is statistically significant, like conventional
group analyses (e.g., t-test, ANOVA). However, the recommendation for parallel use of the
CV method may be less clear because the CV, SEM, and SWC methods all incorporate the
session variation, or the athlete's consistency across trials performed within the session(s). Our
position for omitting the SWC relates to its reliance on subjectivity when determining the
threshold for importance of a change. Moreover, the SWC’s use of 20% or 60% of the
athlete's variation makes it excessively sensitive, as mentioned previously, which is not ideal
for use at the individual athlete level. In addition, the SEM method returned the same number of
differences as the SWC method, albeit with a similar yet not identical pattern of differences
(not shown in tables or figures). Thus, there may be increased risk for erroneous conclusions
using the SWC or SEM methods, in which a certain number of significant performance
changes from the model statistic would coincide with an inflated number of “meaningful”
changes versus the CV method. In turn, practitioners using SWC or SEM might be motivated
to conclude that an athlete has demonstrated a positive adaptation when there was no real (i.e., non-random) change.
One final point to note relates to the level of influence a practitioner or sports scientist places
on the results of a single or repeated performance test, as the importance of any test can vary
among practitioners, athletes, or both depending on training objectives. For context, Figures
1-4 demonstrate our individual athletes’ CMJ performance changes across five sessions (i.e.,
10 weeks) from the start of full-time training until the approximate start of the competition
season. The nature of the training intervention for each athlete, which was somewhat unique
to each individual, centered on progressive changes in training volume and intensity aimed at
increasing "explosive strength". We defined "increased explosive strength" as an increase of RSIMOD, with jump height used as a
secondary performance metric to help explain changes in RSIMOD. We used this approach
because of previous work demonstrating that RSIMOD is a valid and reliable surrogate for athletic
explosiveness (43) that is influenced primarily by jump height and not its other component part, time to takeoff (32). As such, if
jump height was not an adequate explanatory metric for RSIMOD changes, it would mean the
change was due primarily to altered times to takeoff and that would be considered in any
recommendations provided. Thus, all athletes were expected to realize CMJ performance
gains at each test session. If a change was not observed on a given test day, we would reflect
on the results and other contributing factors to overall workload, such as physical training,
on-court work, and test-day fatigue/athlete readiness. Those data will not be discussed here but
were given to the strength and conditioning staff to decide whether training or related modifications were warranted.
For female athlete 1 (Figure 1), the session-to-session changes in RSIMOD were similar to
their changes in jump height. For the purposes of this report, we will focus on the way in
which we use CMJ results to make data-driven training changes. In particular, the large
decreases in RSIMOD and jump height from test session 2 to 3 were concerning. A member of
the sports science team provided the strength and conditioning staff with recommendations,
and they decided whether training and/or related workload modifications were appropriate.
The changes they implemented were shown to be successful according to the athlete’s CMJ
results at test session 4, which were both statistically significant (model statistic) and
meaningful (CV). For female athlete 2, the CMJ test results were not concerning enough to
recommend specific changes or considerations between test sessions 1 and 5, though there
was a statistically significant (model statistic) and meaningful (CV) increase of RSIMOD at
test session 5.
Data-driven recommendations and subsequent modifications do not
always return positive changes, as shown between test sessions 4 and 5 for male athlete 1
(Figure 3). For instance, this athlete demonstrated somewhat positive results from test session
1 to 2, as RSIMOD was shown to meaningfully increase (CV) while jump height was shown to
increase significantly (model statistic) and meaningfully (CV). This performance further
improved from test session 2 to 3, as RSIMOD and jump height increased significantly (model
statistic) and meaningfully (CV). However, there was an alarming drop in
performance from test session 3 to 4, as evidenced by decreases in both RSIMOD and jump height. Although data-
driven recommendations were presented and training and/or related modifications were
prescribed, the athlete did not display statistically significant (model statistic) nor meaningful
(CV) improvements in RSIMOD at test session 5 due to substantial variation across the session
5 trials. Further to this, male athlete 2 displayed concerning results from test session 1 to 2.
Because the decrease was primarily driven by time to takeoff, given the < 5% decrease in jump height, specific
recommendations and training and/or related modifications were prescribed specific to the
athlete's display. Interestingly, the athlete's subsequent results at test session 3 did not
reveal the targeted change in RSIMOD nor jump height. This trend continued through test
session 5, suggesting the athlete was, for some reason, more resistant to the changes we were
recommending, or the strength and conditioning staff were prescribing. While detailed
exploration into those reasons is beyond the scope of this report, it did provide insight into how the athlete was, or was not, responding to the prescribed training.
A final point for this section relates to our objectives during the 10-week test period
presented, as this approach was specific to the groups of athletes and the goals established for
those athletes. While the overarching objective for this subset of athletes and most athletes in
the basketball programs is to increase “explosive strength”, the way in which that is
stimulated varies across seasons, training blocks, and athletes. This is why the specific
training programs, recommendations, and complementary data are not discussed, as they would
be of little relevance to the reader. The objective for this practical application section was to
show when test data reveal performance changes that are objective in nature and should therefore be acted upon.
Conclusion
We summarized four methods, specifically the model statistic, smallest worthwhile change
(SWC), coefficient of variation (CV), and standard error of the measurement (SEM), with
respect to detecting individual athletes’ change during performance tests, using the CMJ as
the example test. We provided support for our recommendation that the combined use of the
model statistic and CV should be preferred when seeking to objectively detect real and
meaningful performance changes. Each method was applied to a small subset of real data
obtained from four different athletes, competing in men's or women's basketball at the NCAA
Division 1 level, to contextualize how these methods can work in practice, highlighting when
we would or would not recommend or prescribe a
training-related modification.
References
1. Adams K, O’Shea JP, O’Shea KL, and Climstein M. The effect of six weeks of squat,
2. Atkinson G and Nevill AM. Statistical methods for assessing measurement error
1998.
3. Barker LA, Harry JR, and Mercer JA. Relationships Between Countermovement
Jump Ground Reaction Forces and Jump Height, Reactive Strength Index, and Jump
4. Barnes JL, Schilling BK, Falvo MJ, Weiss LW, Creasy AK, and Fry AC. Relationship
and conditioning research / National Strength & Conditioning Association 21: 1192-
1196, 2007.
5. Bates BT. Scientific basis of human movement. Journal of Physical Education &
7. Bates BT, Dufek JS, and Davis HP. The effect of trial size on statistical power.
8. Bates BT, Dufek JS, James CR, Harry JR, and Eggleston JD. The Influence of
28.
10. Bishop C, Abbott W, Brashill C, and Read P. Effects of Strength Training on Bilateral
and Unilateral Jump Performance, and the Bilateral Deficit in Premier League
of print, 2021.
11. Bishop C, Lake J, Loturco I, Papadopoulos K, Turner AN, and Read P. Interlimb
12. Bishop C, Turner A, Jordan M, Harry JR, Loturco I, Lake J, and Comfort P. A
Countermovement and Drop Jump Tests. Strength & Conditioning Journal In Press,
2021.
13. Bland JM and Altman DG. Measurement error. BMJ: British medical journal 312:
1654, 1996.
Essentials of Sport Science. DN French, L Torres Ronda, eds. Champaign, IL: Human
15. Buchheit M. Chasing the 0.2. Int J Sports Physiol Perform 11: 417-418, 2016.
16. Burwitz L, Moore PM, and Wilkinson DM. Future directions for performance‐related
93-109, 1994.
17. Castagna C and Castellini E. Vertical jump performance in Italian male and female
national team soccer players. Journal of strength and conditioning research / National
18. Cohen DD, Restrepo A, Richter C, Harry JR, Franchi MV, Restrepo C, R. P, and
20. Drinkwater EJ, Pyne DB, and McKenna MJ. Design and interpretation of
anthropometric and fitness testing of basketball players. Sports medicine 38: 565-578,
2008.
21. Dufek JS, Bates BT, Davis HP, and Malone LA. Dynamic performance assessment of
selected sport shoes on impact forces. Medicine and science in sports and exercise 23:
1062-1067, 1991.
22. Dufek JS, Bates BT, Stergiou N, and James CR. Interactive effects between group and
23. Dufek JS, Harry JR, Eggleston JD, and Hickman RA. Walking mechanics and
24. Dufek JS and Zhang S. Landing models for volleyball players: a longitudinal
evaluation. The Journal of sports medicine and physical fitness 36: 35-42, 1996.
25. Edgington ES. Randomized single-subject experiments and statistical tests. Journal of
26. Fisher RA. On the mathematical foundations of the theory of statistics, in: In: Theory
2020.
28. Harry JR, Barker LA, James CR, and Dufek JS. Performance Differences Among
Skilled Soccer Players of Different Playing Positions During Vertical Jumping and
Landing. The Journal of Strength & Conditioning Research 32(2): 304-312, 2018.
29. Harry JR, Barker LA, Tinsley GM, Krzyszkowski J, Chowning L, McMahon JJ, and
Press, 2021.
30. Harry JR, Blinch J, Barker LA, Krzyszkowski J, and Chowning L. Low pass filter
31. Harry JR, Eggleston JD, Dufek JS, and James CR. Single-Subject Analyses Reveal
1: 15-28, 2021.
32. Harry JR, Paquette MR, Schilling BK, Barker LA, James CR, and Dufek JS. Kinetic
36. Hopkins WG and Batterham AM. Error rates, decisive outcomes and publication bias
37. Hopkins WG, Marshall SW, Batterham AM, and Hanin J. Progressive statistics for
studies in sports medicine and exercise science. Medicine & Science in Sports &
38. Huberty CJ and Lowman LL. Group overlap as a basis for effect size. Educational
40. James CR, Herman JA, Dufek JS, and Bates BT. Number of trials necessary to
42. Kennedy RA and Drake D. Improving the signal-to-noise ratio when monitoring
43. Kipp K, Kiely MT, and Geiser CF. Reactive strength index modified is a valid
countermovement jump performance that distinguish good from poor jumpers. The
R, and Nakamura FY. Vertical and Horizontal Jump Tests Are Strongly Associated
46. Mann JB, Ivey PA, Mayhew JL, Schumacher RM, and Brechue WF. Relationship
between agility tests and short sprints: Reliability and smallest worthwhile difference
47. Marocolo M, Simim MAM, Bernardino A, Monteiro IR, Patterson SD, and da Mota
2019.
48. McInnes S, Carlson J, Jones C, and McKenna MJ. The physiological load imposed on
basketball players during competition. Journal of sports sciences 13: 387-397, 1995.
49. McMahon J, Lake JP, Ripley N, and Comfort P. Vertical jump testing in rugby
51. Nuzzo JL, McBride JM, Cormie P, and McCaulley GO. Relationship between
53. Parker RI and Hagan-Burke S. Useful effect size interpretations for single case
54. Rhea MR. Determining the magnitude of treatment effects in strength training
research through the use of the effect size. Journal of strength and conditioning
55. Robertson S, Bartlett JD, and Gastin PB. Red, amber, or green? Athlete monitoring in
team sport: the need for decision-support systems. International journal of sports
57. Sands WA, Kavanaugh AA, Murray SR, McNeal JR, and Jemni M. Modern
58. Sporis G, Jukic I, Ostojic SM, and Milanovic D. Fitness profiling in soccer: physical
59. Stone MH, Sands WA, and Stone ME. The downfall of sports science in the United
60. Turner A, Brazier J, Bishop C, Chavda S, Cree J, and Read P. Data analysis for
strength and conditioning coaches: Using excel to analyze reliability, differences, and
in high-performance sport. Part 1: null hypothesis significance testing and the utility
62. Turner AN, Parmar N, and Jovonoski A. Assessing group-based changes in high-
performance sport. Part 2: Effect sizes and embracing uncertainty through confidence
63. Vincent W and Weir J. Statistics in Kinesiology. Champaign, IL: Human Kinetics,
2012.
64. Ward P, Coutts AJ, Pruna R, and McCall A. Putting the “I” back in team.
65. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient
and the SEM. The Journal of Strength & Conditioning Research 19: 231-240, 2005.
66. Williams CA, Oliver JL, and Faulkner J. Seasonal monitoring of sprint and jump
67. Zatsiorsky VM, Kraemer WJ, and Fry AC. Task-Specific Strength, in: Science and
Table 1. Critical values to determine the critical difference for the model statistic technique.
Notes – Trial size: the number of trials used to calculate the performance means for the test
sessions; α: the alpha level (i.e., statistical probability criterion), or the probability that a
difference between sessions is random.
Table 2. Performance increases detected by each method across athletes.
Notes – SWC: smallest worthwhile change method; CV%: coefficient of variation method; SEM: standard error of measurement method;
Increases: number of increases detected across all four athletes (decreases excluded); % Total: percent of increases detected relative to the total
number of comparisons.
Table 3. Performance improvement assessments across five test sessions and four athletes.
Test 1 Improvement Test 2 Improvement Test 3 Improvement Test 4 Improvement Test 5 Improvement
Athlete Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both
F1 0.393 0.016 N/A N/A N/A 0.547 0.043 TRUE TRUE TRUE 0.389 0.010 FALSE FALSE FALSE 0.560 0.030 TRUE TRUE TRUE 0.504 0.113 FALSE FALSE FALSE
RSIMOD
F2 0.337 0.007 N/A N/A N/A 0.367 0.051 FALSE TRUE FALSE 0.346 0.031 FALSE FALSE FALSE 0.327 0.010 FALSE FALSE FALSE 0.342 0.005 TRUE TRUE TRUE
M1 0.504 0.028 N/A N/A N/A 0.553 0.043 FALSE TRUE FALSE 0.688 0.058 TRUE TRUE TRUE 0.595 0.021 FALSE FALSE FALSE 0.532 0.160 FALSE FALSE FALSE
M2 0.777 0.126 N/A N/A N/A 0.536 0.012 FALSE FALSE FALSE 0.524 0.093 FALSE FALSE FALSE 0.526 0.076 FALSE FALSE FALSE 0.576 0.058 FALSE FALSE FALSE
Group 0.503 0.065 N/A N/A N/A 0.501 0.040 FALSE FALSE FALSE 0.487 0.057 FALSE FALSE FALSE 0.502 0.042 FALSE FALSE FALSE 0.489 0.102 FALSE FALSE FALSE
Test 1 Improvement Test 2 Improvement Test 3 Improvement Test 4 Improvement Test 5 Improvement
Athlete Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both Mean SD MS CV Both
F1 0.248 0.001 N/A N/A N/A 0.306 0.012 TRUE TRUE TRUE 0.260 0.012 FALSE FALSE FALSE 0.301 0.012 TRUE TRUE TRUE 0.301 0.008 FALSE FALSE FALSE
Jump
Height F2 0.254 0.010 N/A N/A N/A 0.275 0.007 TRUE TRUE TRUE 0.263 0.013 FALSE FALSE FALSE 0.277 0.003 FALSE FALSE FALSE 0.265 0.012 FALSE FALSE FALSE
M1 0.319 0.010 N/A N/A N/A 0.339 0.009 TRUE TRUE TRUE 0.374 0.019 TRUE TRUE TRUE 0.355 0.003 FALSE FALSE FALSE 0.378 0.030 FALSE TRUE FALSE
M2 0.452 0.012 N/A N/A N/A 0.424 0.027 FALSE FALSE FALSE 0.405 0.029 FALSE FALSE FALSE 0.412 0.047 FALSE FALSE FALSE 0.438 0.008 FALSE FALSE FALSE
Group 0.318 0.009 N/A N/A N/A 0.336 0.016 FALSE TRUE FALSE 0.326 0.020 FALSE FALSE FALSE 0.336 0.024 FALSE FALSE FALSE 0.346 0.017 FALSE FALSE FALSE
Notes – F1: female athlete 1, F2: female athlete 2; M1: male athlete 1; M2: male athlete 2; MS: model statistic method; CV: coefficient of
variation method; Both: combined approach using model statistic and coefficient of variation methods; TRUE: change detected by associated
method; FALSE: change not detected by associated method; When TRUE is contained in green, both methods detected change.
Figure Captions
Figure 1. RSIMOD (left) and jump height (right) performance results and differences detected by the model statistic and CV methods for female
athlete 1.
Notes – Data are presented as mean ± 1 standard deviation across 3 trials for each test session; Model Statistic Change from Previous: non-
random (p < 0.05) change between adjacent test sessions; CV% Minimum Increase: threshold that must be exceeded to indicate change between
adjacent test sessions.
Figure 2. RSIMOD (left) and jump height (right) performance results and differences detected by the model statistic and CV methods for female
athlete 2.
Notes – Data are presented as mean ± 1 standard deviation across 3 trials for each test session; Model Statistic Change from Previous: non-
random (p < 0.05) change between adjacent test sessions; CV% Minimum Increase: threshold that must be exceeded to indicate change between
adjacent test sessions.
Figure 3. RSIMOD (left) and jump height (right) performance results and differences detected by the model statistic and CV methods for male
athlete 1.
Notes – Data are presented as mean ± 1 standard deviation across 3 trials for each test session; Model Statistic Change from Previous: non-
random (p < 0.05) change between adjacent test sessions; CV% Minimum Increase: threshold that must be exceeded to indicate change between
adjacent test sessions.
Figure 4. RSIMOD (left) and jump height (right) performance results and differences detected by the model statistic and CV methods for male
athlete 2.
Notes – Data are presented as mean ± 1 standard deviation across 3 trials for each test session; Model Statistic Change from Previous: non-
random (p < 0.05) change between adjacent test sessions; CV% Minimum Increase: threshold that must be exceeded to indicate change between
adjacent test sessions.