Evaluating The Usability of A Co-Designed Power Assisted Exercise Graphical User Interface For People With Stroke
Journal of NeuroEngineering and Rehabilitation (2023) 20:95
https://doi.org/10.1186/s12984-023-01207-7
Abstract
Background Digital advancement of power assisted exercise equipment will advance exercise prescription for people with stroke (PwS). This article reports on the remote usability evaluation of a co-designed graphical user interface (GUI) and denotes an example of how video-conference software can increase reach to participants in the testing of rehabilitation technologies. The aim of this study was to evaluate the usability of two sequential versions of the GUI.
Methods We adopted a mixed methods approach. Ten professional user (PU) participants (2M/8F) and 10 expert user (EU) participants (6M/4F) were recruited. Data collection included a usability observation, a 'think aloud' walkthrough, task completion, task duration and user satisfaction as indicated by the Post Study System Usability Questionnaire (PSSUQ). Identification of usability issues informed the design of version 2, which included an additional submenu. Descriptive analysis was conducted upon the usability issues and number of occurrences detected on both versions of the GUI. Inferential analysis enabled comparison of task duration and PSSUQ data between the PU and EU groups.
Results Analysis of the 'think aloud' walkthrough data enabled identification of 22 usability issues on version 1 from a total of 100 usability occurrences. Task completion for all tasks was 100%. Eight usability issues were directly addressed in the development of version 2. Two recurrent and 24 new usability issues were detected in version 2, with a total of 86 usability occurrences. Paired two-tailed t-tests on task duration data indicated a significant decrease amongst the EU group for task 1.1 on version 2 (P = 0.03). The mean PSSUQ scores for version 1 were 1.44 (EU group) and 1.63 (PU group), compared with 1.40 (EU group) and 1.41 (PU group) for version 2.
Conclusions The usability evaluation enabled identification of usability issues on version 1 of the GUI which were effectively addressed in the iteration of version 2. Testing of version 2 identified usability issues within the new submenu. Application of multiple usability evaluation methods was effective in identifying and addressing usability issues in the GUI to improve the experience of PAE for PwS. The use of video-conference software to conduct synchronous, remote usability testing is an effective alternative to face-to-face testing methods.
Keywords Assistive technology, Co-design, Graphical user interface, Power assisted exercise, Rehabilitation
technology, Stroke, Usability evaluation
*Correspondence:
Rachel Young
r.young@shu.ac.uk
Full list of author information is available at the end of the article
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
satisfaction [30]. It calls for representative users to perform representative tasks to identify the strengths and shortfalls of a device in order to bring about improvements [31]. Technologies for PwS previously evaluated through usability testing include an assistive game controller [32], a sensor feedback system for gait [26], wearable functional electrical stimulation garments [33] and a virtual reality gaming system [34]. Data collection methods which have been implemented in the testing of novel assistive technologies include user satisfaction questionnaires [35, 36], task completion [25, 26], task duration [25] and comparison between different devices [23]. Recurrent usability issues include difficulty donning and doffing [32, 33], failure to complete tasks [37] and difficulty accessing the emergency stop function [26, 33].

The importance of trust in assistive and rehabilitation technologies for PwS has been emphasised, and features which facilitate sustained successful engagement include task variety, clear communication, fatigue management and reward [35]. Usability evaluation is central to the development of acceptable and meaningful technologies which will be adopted by service providers and utilised by end users [31]. Usability testing has historically been an in-person activity where participants and researchers co-locate [38]. The Covid-19 pandemic accelerated engagement with communication technologies, and the research community has shifted from face-to-face methods of data collection to increased use of video-conferencing software [39]. The study reported in this article represents an example of how remote methods of usability testing can increase reach to users of rehabilitation technologies [38] and represents a potential solution to the challenges associated with recruitment of participants for face-to-face testing methods.

Overview of article
The study reported in this article recruited representative user groups to evaluate the usability of two sequential versions of the co-designed GUI to optimise the usability and functionality of the new technology. For the purposes of this manuscript, usability is defined as "the effectiveness, efficiency, and satisfaction with which specified users achieve specified goals in particular environments" [30]. Users in the context of this study are either PU, i.e. rehabilitation professionals or clinical exercise physiologists, or EU, i.e. PwS, including people who have prior experience of PAE equipment. The methods section defines four objectives which underpinned the study and describes the synchronous remote usability testing procedure conducted on two sequential versions of the co-designed GUI. The approaches adopted to collect and analyse quantitative and qualitative data are explained and justified. The results section reports on the findings and is organised according to the four underpinning objectives. The findings and their interpretation are explored in the discussion section and compared with previous relevant examples in the published literature.

Methods
Aim
The aim of this study was to evaluate the usability of a co-designed GUI to enable PwS and rehabilitation professionals to effectively utilise power assisted exercise equipment. The objectives were to: (1) evaluate the usability of version 1 of the GUI; (2) use the findings from version 1 to develop and evaluate a second iteration (extended version) of the GUI; (3) compare the usability of version 1 with version 2; and (4) analyse usability as experienced by EUs and PUs.

To achieve this aim, we adopted a mixed methods approach. Quantitative methods were used to examine task completion, task duration and user satisfaction using the Post Study System Usability Questionnaire (PSSUQ) [40]. Task completion is a strong indicator of the usability of digital rehabilitation technologies [41], and task duration data provide an indication of set up time, which is a key determinant in the adoption of rehabilitation technologies [24, 42]. The PSSUQ was selected to measure user satisfaction as it distinguishes between system usability, quality of information and quality of the interface [40]. 'Think aloud' was adopted as a qualitative method to gain insight into the users' experience of navigating the GUI and identify specific usability issues [21]. All usability evaluations were conducted with both EU and PU.

Version 1 of the GUI was specifically designed for the cross-cycle machine (Fig. 1) as previous user involvement indicated that this machine was the most popular [13]. It was envisaged that the GUI would be adapted to the range of machines manufactured by Shapemaster Global. Figure 2 is an image of the chest-and-legs machine, which was ranked second most popular through consensus methods [13].

The version 1 prototype GUI (Fig. 3) comprised 7 submenus, namely: (1) user login; (2) programme selection; (3) duration selection; (4) real time feedback; (5) exercise completion; (6) performance feedback; and (7) assistance alert. The real time exercise feedback phase of the programme (step 4) was defaulted to play for a 30 s duration to enable animation of the virtual effort detection display. The virtual effort was displayed on the semi-circular dial with darker shades of purple indicating increased effort. A menu bar at the bottom of the page enabled navigation to the homepage or previous page. This was positioned centrally rather than as a sidebar to account for the spatial awareness impairments which can occur following stroke [43]. Activation of the 'help' icon navigated directly
Sample size
Preliminary testing has been implemented in usability
evaluation to determine the probability of error detection
[44]. Due to resource and timescale restrictions this was
not feasible and so the probability of error detection was
estimated at 0.15. The probabilistic model of problem
discovery described by Sauro and Lewis [40] was applied
to determine sample size with a target of 95% chance of
observation. We therefore aimed for a 95% likelihood of
detecting usability problems with an estimated 15% prob-
ability of occurrence. A sample size of 19 participants
was required [40].
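Under the probabilistic model of problem discovery, the chance of observing a problem with occurrence probability p at least once among n participants is 1 − (1 − p)^n. A minimal sketch of the sample-size calculation follows (the function name is ours; the 0.15 and 0.95 figures are those stated above):

```python
import math

def sample_size_for_discovery(p_occurrence: float, target_confidence: float) -> int:
    """Smallest n such that 1 - (1 - p)^n >= target_confidence."""
    # Solve (1 - p)^n <= 1 - target for n, then round up to a whole participant.
    return math.ceil(math.log(1 - target_confidence) / math.log(1 - p_occurrence))

n = sample_size_for_discovery(p_occurrence=0.15, target_confidence=0.95)
print(n)  # 19, matching the sample size reported above
```

With p = 0.15, eighteen participants give only a 94.6% chance of detection, so nineteen is the smallest sample meeting the 95% target.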
to an 'assistance called' message intended to assure users that a team member had been alerted.

Fig. 2 Chest and legs: machine ranked second through consensus methods, in use by an EU supported by a PU

Recruitment
Convenience sampling was implemented to identify participants for the EU and PU groups. The criteria for participation across both groups were inclusive to capture a range of perspectives and user priorities. The inclusion criteria for the EU representation were: diagnosis of stroke; access to a Wi-Fi connected laptop or digital tablet; able to follow verbal instructions in English; and
able to provide informed consent. No prior experience of PAE was stipulated. People who were unable to provide informed consent due to severe cognitive impairment were excluded from participation. Participants for the EU group were identified through a local independent rehabilitation service and the service user network at Sheffield Hallam University. The inclusion criteria for the PU group were: employment relevant to rehabilitation or exercise prescription for people with long term conditions; access to a Wi-Fi connected laptop or digital tablet; able to follow verbal instructions in English; and able to provide informed consent. Participants for the PU group were identified through academic teams at the host university, independent practitioners known to the research team and service providers known to the manufacturer. Potential participants were identified by the lead author (RY) and invited to consider participation via email with an accompanying participant information sheet. The target recruitment was 10 participants per group. Consent was confirmed through completion and submission of an electronic form. Due to the virtual methods of participant recruitment and data collection enforced by the Covid-19 lockdown, detailed assessment of the type and severity of stroke related impairment was not possible.

Fig. 3 Graphical user interface version 1: this version was created to test the 'quick start' programme and help activation function

Participants
Ten EU participants (6M/4F) and ten PU participants (2M/8F) consented to participate. The mean age of the EU participants was 61.7 years (SD 10.2) and mean time since stroke was 60.9 months (SD 24.7). Fifty percent of the participants had prior experience of PAE and 40% of the participants in the EU group had contributed to prior user involvement and co-design stages of the technology project (Table 1). One participant (EU05) was unable to activate the remote-control mouse icon on Zoom. After several attempts the participant decided to withdraw from the study.

The mean age of participants in the PU group was 42.3 (SD 6.09) years and included representation from sport sciences, rehabilitation physiotherapists and industry. Fifty percent had direct experience of PAE and 60% had contributed to earlier stages of the project (Table 2). A participant in the PU group (PU5) withdrew from the study prior to test two due to work pressures.

Usability testing procedure
All tests were conducted via remote digital media by the lead author (RY). The virtual meetings were password protected and the meeting room was locked once the participant had entered the system. A short familiarisation session was scheduled to ensure that the remote technology could be accessed by each participant. The Zoom media 'remote control' function was synched with a screen share of the Adobe interface. The participants were supported through activation of the remote-control mouse icon and supported in briefly navigating through the virtual GUI to ensure that they could activate the functions and view the interface from their selected device. Test one was scheduled during each familiarisation session. The familiarisation
meeting, test one and test two were recorded directly to the lead author's device into a secure digital storage system at the host university.

Test one evaluated the usability of version 1 of the GUI and comprised three specific tasks (1.1, 1.2, 2.0) in the 'Quick Start' programme (Table 3). Participants were asked to verbalise their thoughts about navigating through the GUI using a 'think-aloud' technique [31]. Alongside the 'think aloud' data, task completion rates and task duration data were collected. Each task was completed twice. During the first attempt at each task, participants were encouraged to 'think aloud' as they navigated through the interface and identified the icons which would enable task completion. They were prompted to explain their decisions and verbally share their experience of navigating the interface. The second attempt was conducted in silence and participants were required to directly navigate through the task under timed conditions.

Test two was conducted on the same sample of participants, scheduled between four to six weeks after test one, and evaluated the usability of version 2 of the GUI. Tasks 1.1, 1.2 and 2.0 were repeated and four additional tasks (3.1, 3.2, 4.1, 4.2) were introduced to evaluate the extended 'my programme' submenu of the GUI. The purpose of repeating the test one tasks was to establish whether the changes implemented between version 1 and version 2 affected the usability of the GUI. In order to optimise consistency of testing conditions, each task was repeated twice, with the first attempt being a 'think aloud' walkthrough of the GUI and the second attempt a timed test conducted in silence (Fig. 4).

The research team were cognisant of ensuring a positive participant experience throughout all testing
Table 3 Usability testing tasks: scenario, required actions and timed measure

Task 1.1
Scenario: You want to do a 6-min workout in the 'quick start' programme
Actions: Access quick start; Select 6 min; Activate virtual exercise programme
Timed measure 1.1: Time lapsed from login to virtual exercise

Task 1.2
Scenario: You will view your results at the end of the exercise
Actions: View results
Timed measure 1.2: Time lapsed from login to opening the results menu

Task 2.0
Scenario: You want to do a 4-min workout in 'quick start.' As the machine starts to move you realise your hand is not secured to the moving component and you decide to call for help
Actions: Access quick start; Select 4 min; Activate virtual exercise; Activate 'help' icon
Timed measure 2.0: Time lapsed from login to opening the assistance called menu

Version 2 only

Task 3.1
Scenario: You want to complete a baseline assessment in the 'my programme' area. Assistance is available
Actions: Access 'my programme'; Access 'baseline assessment'; Activate virtual exercise
Timed measure 3.1: Time lapsed from login to virtual exercise

Task 3.2
Scenario: You decide that you would like to increase the target intensity during exercise and view your results on completion
Actions: Increase target intensity; View results
Timed measure 3.2: Time lapsed from login to opening the results menu

Task 4.1
Scenario: Please choose either the 'hilly' or 'steady' option in the 'my programme' area
Actions: Access 'my programme'; Select 'hilly' or 'steady'; Activate virtual exercise
Timed measure 4.1: Time lapsed from login to virtual exercise

Task 4.2
Scenario: You decide that you would like to decrease the target intensity during exercise and view results on completion
Actions: Decrease target intensity; View results
Timed measure 4.2: Time lapsed to opening the results menu
Fig. 4 Timeline to represent tasks conducted on version one and version two: the first three tasks (1.1, 1.2, 2.0) were conducted on versions one and two; the final four tasks (3.1, 3.2, 4.1, 4.2) were specific to the new submenus created within version 2
procedures. The lead author advised that the tasks were not intended to test the capabilities of the participant and that any difficulties encountered whilst completing the tasks reflected shortfalls in the design of the GUI. The lead author is an experienced neurological physiotherapist with knowledge of the communication and processing impairments which can occur following stroke. Verbal instructions and prompts were adapted according to responses from each participant and rest time was offered between each task.
Usability observation form
Test one and test two were audio-video recorded to enable identification of usability issues and to record task completion and task duration. A usability observation form was used to document all findings (Additional file 1). Cursor tracking was observed on the video footage of each virtual test; errors, hesitation or delays in navigation through the GUI were documented as a usability occurrence. The 'think aloud' data were initially summarised onto the usability observation form by the lead author. Four of the recordings alongside the respective usability observation forms were sense checked by a second member of the research team (NS). Discussion between RY and NS led to agreement that the 'think aloud' data would be transcribed verbatim onto the usability observation form to ensure the user experience was fully captured. Narrative which indicated user uncertainty, hesitation or dissatisfaction with the GUI was documented as a usability occurrence.

Participant satisfaction
The PSSUQ was selected to capture participants' experience of the GUI on completion of each test. The PSSUQ is a 16-item standardised questionnaire devised to measure users' perceived satisfaction with a software system (Tables 4, 5). The PSSUQ has concurrent validity [45], very high scale and subscale reliability, and construct validity [46]. Participants were required to complete a 7-point Likert scale with responses ranging from strongly agree (1) to strongly disagree (7) (Table 5). An overall mean score is calculated from PSSUQ responses plus individual scores for three subsections: system usefulness, information quality and interface quality (Table 4). Lower mean scores indicate higher user satisfaction [40]. Participants were issued with an on-line version of the questionnaire at the end of each test and requested to complete it and submit responses within 24 h.

Data analysis
Descriptive and inferential statistics were conducted in Excel (Microsoft) and SPSS (IBM version 28.0.0).

Usability issues
Usability occurrences recorded on the usability observation forms were collated to identify the total number of incidents detected through cursor tracking and 'think aloud' data on version 1 and version 2 of the GUI. Usability incidents which recurred across participants were clustered to develop a definitive list of usability issues. The identified usability issues were coded according to four a-priori categories developed during stages one and two of the research programme [13, 20]. The categories
Table 4 PSSUQ items by subsection

System usefulness
1. Overall, I am satisfied with how easy it was to use this system
2. It was simple to use this system
3. I was able to complete the tasks and scenarios quickly using this system
4. I felt comfortable using this system
5. It was easy to learn to use this system
6. I believe I could become productive quickly using this system

Information quality
7. The system gave error messages that clearly told me how to fix problems
8. Whenever I made a mistake using the system I could recover easily and quickly
9. The information provided with this system was clear
10. It was easy to find the information I needed
11. The information was effective in helping me complete the tasks and scenarios

Interface quality
12. The organisation of information on the system's screens was clear
13. The interface of this system was pleasant
14. I liked using the interface of this system
15. This system has all the functionalities and capabilities I expect it to have
16. Overall, I am satisfied with this system

Table 5 PSSUQ response scale
On a scale from Strongly Agree to Strongly Disagree, please rate the following statements
(Positive Statement) 1 2 3 4 5 6 7 NA
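The PSSUQ scoring described above — an overall mean plus means for the three subsections, with lower scores indicating higher satisfaction — can be sketched as follows. The function name and the example responses are ours, the item groupings mirror Table 4, and an 'NA' answer is represented as None:

```python
def pssuq_scores(responses):
    """responses: dict mapping item number (1-16) to a rating 1-7, or None for 'NA'.
    Returns the overall mean and the three subsection means; 'NA' items are skipped."""
    subsections = {
        "system_usefulness": range(1, 7),     # items 1-6
        "information_quality": range(7, 12),  # items 7-11
        "interface_quality": range(12, 17),   # items 12-16
    }

    def mean_of(items):
        rated = [responses[i] for i in items if responses.get(i) is not None]
        return sum(rated) / len(rated) if rated else None

    scores = {name: mean_of(items) for name, items in subsections.items()}
    scores["overall"] = mean_of(range(1, 17))
    return scores

# Hypothetical participant: mostly 'strongly agree' (1), one 'NA', one item rated 3.
example = {i: 1 for i in range(1, 17)}
example[7] = None  # error-message item answered 'NA'
example[10] = 3
scores = pssuq_scores(example)
print(scores["overall"])  # lower scores indicate higher satisfaction
```

Skipping 'NA' responses rather than scoring them as zero keeps the means comparable across participants who answered different numbers of items.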
Table 6 User by problem matrix (categories: Safety; Operational; Programme effectiveness; User engagement)
*Frequency of problem
**Severity of problem
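The tallying behind a user-by-problem matrix of the kind shown in Table 6 can be sketched as follows; the participant codes, problem labels and counts here are hypothetical, not the study's data:

```python
from collections import defaultdict

# One record per observed usability occurrence: (participant, usability problem).
occurrences = [
    ("EU01", "help icon not visible"),
    ("EU02", "help icon not visible"),
    ("PU01", "help icon not visible"),
    ("EU01", "start/play duplication"),
    ("PU01", "start/play duplication"),
]

# Rows = problems, columns = participants, cells = occurrence counts.
matrix = defaultdict(lambda: defaultdict(int))
for user, problem in occurrences:
    matrix[problem][user] += 1

# Frequency of a problem = number of distinct participants who encountered it.
frequency = {problem: len(users) for problem, users in matrix.items()}
print(frequency)  # {'help icon not visible': 3, 'start/play duplication': 2}
```

Summing a row gives total occurrences of a problem; counting its non-empty cells gives the frequency used, alongside severity, to prioritise fixes.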
were: (1) system safety; (2) operational efficiency; (3) programme effectiveness; and (4) user engagement.

To determine which usability issues required prioritisation, the frequency of occurrence was collated and severity was scored. Frequency was recorded on a modified user by problem matrix (Table 6) [31]. Total issue occurrence was summated to enable comparison between the user streams and incidence of problems on versions 1 and 2 of the GUI.

The problem severity scale developed by Dumas and Redish [31] was adapted to identify features which may cause risk of injury, impede programme effectiveness or reduce user engagement. Table 7 indicates the adapted categories in italics. All detected usability issues were scored to determine severity.

Descriptive analysis of the user by problem matrix was conducted to examine the pattern of usability issues across the a-priori categories and compare sequential versions of the GUI.

Two members of the research team (RY and AH) discussed each usability issue, considering the frequency and severity to determine which usability issues would be addressed in the iteration of version 2 of the GUI. Usability issues with a severity score of four were automatically addressed.

Task completion
Task completion was defined as navigation through all required submenus within the GUI to access the exercise programme, user performance or assistance request stipulated in the task descriptor. No time limit was applied. Instances in which a participant made an error but was able to self-correct and navigate to the intended menu were recorded as task completion. Task completion data were recorded and collated on the usability observation form.

Task duration
Shapiro-Wilk tests (significance 0.05) were conducted on task time to determine normal distribution. Calculation of the task duration geometric mean mitigated for the positively skewed data distribution which is a common occurrence with timed tasks [40]. One sample t-tests were conducted on the geometric means calculated for tasks 1.1 and 4.1 to determine the probability of 95% of users commencing exercise within the benchmark target of 25 s.

Two-tailed t-tests are considered robust to the positive skew associated with task duration data and log transformation is not required [40]. Two-tailed paired t-tests were conducted on the mean difference scores between version 1 and version 2 for tasks 1.1, 1.2 and 2.0 to detect any statistically significant difference in repeated task times. Independent t-tests were conducted on all task time data to detect any statistically significant difference in completion times recorded between the EU and PU groups.

User satisfaction
Shapiro-Wilk tests (significance 0.05) were conducted on task time to determine normal distribution. Total PSSUQ scores were analysed in addition to analysis of the individual sub-sections. An independent samples t-test was conducted on the difference in scores between the user streams for version 1 and version 2 of the GUI.

Results
The results are presented in alignment with the underpinning objectives of the study.

Evaluate version 1 of the GUI
The total occurrence of usability issues detected and recorded during the examination of version 1 was 100. Each incident was described and coded to the relevant a-priori category, which enabled identification of recurrent usability problems. The distribution of usability incidents across the four categories on version 1 was 24% safety, 28% operational, 22% programme effectiveness and 26% user engagement.

Twenty-two different usability issues were identified during the testing of version 1 (Table 8); a detailed listing of these can be accessed in the supplementary materials 2.0. Each problem was analysed by two members of the research team (RY, AH) and the decision regarding whether to directly address the problem in the iteration of version 2 was determined by the issue frequency, severity and feasibility of adapting the underpinning technology.

Safety
Features which could lead to the machine commencing or sustaining unintended movement were identified as a safety risk, alongside difficulties associated with requesting help. The usability tests completed on version 1 of the GUI indicated that the 'help' icon was not visible enough and the 'assistance called' text was easy to miss. Ten participants reported feeling unsure about the difference between the stop/pause/help functions visible during live exercise. To address these problems, the menu bar visible during the live exercise phase of the programme was reconfigured to display distinct icons for pause, stop and help. The icons were made slightly larger and the 'help' icon was positioned at the end of the menu bar. On the 'assistance called' page, the 'cancel' icon was relocated to the bottom of the page with the 'assistance called' text centralised (Fig. 5).
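The task-duration statistics described in the methods — log-transforming positively skewed times, taking a geometric mean, and testing against the 25 s benchmark — can be sketched as follows. The sample times below are illustrative, not the study's data, and the variable names are ours:

```python
import math
import statistics

# Illustrative Task 1.1 completion times in seconds (not the study's data).
times = [14.2, 18.5, 12.9, 22.0, 16.4, 30.1, 15.3, 19.8, 13.7, 21.2]
benchmark = 25.0  # maximum acceptable time from opening the GUI to commencing exercise

# Geometric mean via log transformation mitigates the positive skew of timed-task data.
log_times = [math.log(t) for t in times]
geo_mean = math.exp(statistics.mean(log_times))

# One-sample t statistic on the log-times against log(benchmark); a sufficiently
# negative t supports users commencing exercise within the benchmark target.
t_stat = (statistics.mean(log_times) - math.log(benchmark)) / (
    statistics.stdev(log_times) / math.sqrt(len(log_times))
)
print(round(geo_mean, 1), t_stat < 0)
```

Comparing the t statistic against the one-tailed critical value for n − 1 degrees of freedom then gives the significance test reported in the results.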
Table 7 Problem severity scale the effort feedback dial. The real time visualisation of
detected effort was identified as a priority for amend-
Level 1 Prevents task completion
May lead to user injury ment in version 2 of the GUI. The redesign introduced
May cause programme to be ineffective an expanding and contracting balloon as an alternative to
May cause user disengagement the feedback dial visualised in version 1. (Fig. 7).
Level 2 Creates significant delay or frustration
Significantly impedes programme effectiveness
User engagement
Level 3 Problems have minor effect on usability
May have minor effect on programme effectiveness Concerns regarding clarity of performance results and
May cause minor user uncertainty motivational features were categorised into this section.
Level 4: Subtle and possible enhancements/suggestions Usability testing of version 1 indicated that nine partici-
pants did not understand Watts as a performance metric.
Eight participants reported that the concept of cycling up
a ‘col de Shapemaster’ was not meaningful and two par-
Table 8 Usability issues according to category
ticipants shared that the still image was uninspiring. Ver-
Category Number of detected Number of sion 2 of the GUI displayed standalone numbers and the
usability problems usability
occurrences ‘col de Shapemaster’ concept was replaced by ‘Shapemas-
ter Island.’ (Fig. 8).
Safety 4 24
Operational 8 28
Task completion rates and task duration
Programme effectiveness 6 22
Analysis of task completion and duration enabled the
User engagement 4 26
research team to quantify the usability of the GUI in the
context of specific tasks aligned with its projected pur-
pose. During the testing of version 1, EU7 experienced
Operational difficulties with remote control connectivity causing
Usability issues which could lead to a delay in users oper- the completion times for tasks 1.1 and 1.2 to be inva-
ating the equipment or cause them to require frequent lid and not included in the descriptive analysis; task 2.0
guidance from support staff were coded within the operational category. Eight operational problems were identified on version 1; the most frequently occurring usability problem was associated with the duplication of activating the 'start/play' icons to commence exercise. Delays in identifying the 'start' icon were observed amongst nine participants. Five participants across both groups verbally reported that the repeated clicking to activate the machine could cause frustration or confusion. These issues were directly addressed in version 2 of the GUI. Instead of clicking a 'start' and then a 'play' icon to initiate exercise, activation of 'start exercise' triggered a three second countdown with no repeated clicks required. The background to the 'select duration' page was adjusted to ensure that the functional icons were distinct (Fig. 6). Six operational issues with low frequency and severity scores were not addressed (Additional file).

Programme effectiveness
The programme effectiveness category identified those problems associated with the GUI which had the potential to impede users in engaging in an optimal intensity of exercise or quality of movement. Real time feedback regarding intensity of effort was a pivotal feature of the co-designed GUI; however, usability testing of version 1 indicated that 13 of the 19 participants misinterpreted the effort feedback dial as an indication of remaining duration.

Task performance
Task completion and duration data are detailed in Table 9. The completion rate for all tasks was 100% except for Task 2.0 for EU7, which was abandoned due to failed connectivity rather than difficulty navigating through the GUI.

The benchmark duration for Task 1.1 was 25 s, which was the maximum duration from opening the GUI to commencing exercise stipulated by representative commercial operators. For this analysis, the EU and PU group data were analysed individually as the intention was for EUs to operate the GUI independently in a real-world setting.

Calculation of the geometric mean using log transformation of task duration data generated a better estimate of the central values and has less error or bias than the standard mean for small samples of usability data [40]. One tailed T-tests were conducted on the geometric means calculated from the Task 1.1 data recorded on version 1 of the GUI for the EU and PU groups to determine the probability of 95% of users achieving the benchmark target (Table 10).

User satisfaction
All participants who completed the usability test on version 1 (n = 19) submitted PSSUQ responses. Analysis of PSSUQ scores indicated high levels of user satisfaction across both user groups and favourable
Young et al. Journal of NeuroEngineering and Rehabilitation (2023) 20:95 Page 12 of 25
Fig. 5 Safety problems addressed: Stop and pause icons were added to the menu bar and the ‘assistance called’ message was centralised
comparison with PSSUQ normative data. Due to limitations associated with published normative values, inferential analysis would not have represented a meaningful comparison [38]. The 'information quality' subsection attained the lowest satisfaction scores across both user groups and this pattern is mirrored in the published normative data [40] (Table 11).

The scores submitted by the EU group were slightly lower than the PU group, indicating greater satisfaction amongst the EU group. An independent samples T-Test was conducted on the difference in scores between the two groups; no statistically significant difference in satisfaction between the user groups was detected (P = 0.296, confidence interval − 0.19 to 0.58).

Develop and evaluate an extended version 2 of the GUI
Development of version 2
Version 2 of the GUI addressed eight of the usability issues identified during the testing of version 1 and these are detailed in Table 12. Version 2 also included an extended range of programme options underpinned by an individualised baseline assessment. The intention was to develop a tailored prescription of exercise at an optimal intensity for the individual user. The 'baseline assessment' programme would be completed with supervision from an exercise or rehabilitation professional to ensure an appropriate intensity and duration of exercise (Fig. 9).

The 'my programme' menu also included the choice of either a 'steady' or 'hilly' interval programme. The target intensity was indicated by a white balloon, with detected purple effort expanding within it (step three in Fig. 10).
Fig. 6 Operational problems addressed: On version 2, activation of the ‘start exercise’ icon triggered a countdown to commencement of movement
avoiding the need for a second click on the ‘play’ icon
Fig. 7 Programme effectiveness problems addressed: The effort biofeedback was re-designed on version 2. The expanding circle replaced the dial
used on version 1
Fig. 8 User engagement problems addressed: The concept of ‘Col-de Shapemaster’ was replaced by ‘Shapemaster Island’ and watts were removed
from the metric details
Table 9 Task completion and duration data for version 1 (durations in seconds)
Participant  Task 1.1  Task 1.2  Task 2.0
EU1  12  51  31
EU2  20  63  30
EU3  19  56  32
EU4  27  67  26
EU6  33  77  38
EU7  280*  380*  Terminated
EU8  16  55  19
EU9  26  66  32
EU10  21  59  13
PU1  9  48  24
PU2  20  57  25
PU3  16  55  25
PU4  21  59  20
PU5  21  55  28
PU6  15  53  18
PU7  12  51  19
PU8  11  48  34
PU9  11  47  22
PU10  13  57  22
Range  9–33  47–77  18–38
Median duration  17.5  55.5  25
% Task completion  100%  100%  100%
*Invalid data due to connectivity

Two issues identified during the testing of version 1 recurred: identification of the 'help' icon and interpretation of the effort detection feedback. Twenty-four new usability issues were identified across the safety, operational, programme effectiveness and user experience categories (Table 13).

Safety
Usability testing on version 2 of the GUI indicated that identification of the 'help' icon remained an issue for two participants and three new usability problems were detected. Four participants reported that the new countdown feature did not allow enough time to prepare for machine movement. One PU participant was concerned that the plus and minus icons on the live exercise page could be mistaken for speed adjustment, and three participants were concerned that users would proceed without assistance during a baseline assessment.

Operational
Testing of version 2 of the GUI indicated that the operational problems observed in version 1 did not recur. However, the introduction of the extended 'my programme' area of the GUI did create five new usability
Table 12 Usability issues addressed on version 2
Category | Issue identified on version 1 | Amendment on version 2
Safety | Help button not visible enough | Help icon more centrally positioned on menu bar
Safety | Assistance called message not visible enough | Assistance called message centralised
Safety | Distinction between stop/pause/help functions not clear | Menu bar reformatted
Operational | Repeated clicks to start exercise | 'Start exercise' icon triggered a countdown to exercise
Operational | Select duration/start exercise icons not visible enough | Background and icon boundaries amended to be more distinct
Programme effectiveness | Effort detection dial misinterpreted | Effort detection displayed as an expanding balloon
User experience | Performance metrics (watts) not understood by users | Standalone numbers displayed
User experience | 'Col-de-Shapemaster' concept not meaningful | 'Shapemaster Island' concept introduced
Fig. 9 Graphical User Interface version 2 baseline assessment menu: The login submenu and programme selection were developed from version 1.
Steps 3–8 illustrate the ‘baseline assessment’ function
problems associated with the new features. The concept of a baseline assessment, intended for new users or people wishing to review their progress, created confusion amongst PUs and EUs. It was suggested that substantial explanation and support would be needed to help users navigate this programme option.
Fig. 10 Graphical User Interface Version 2 hilly exercise programme menu: The white margin outside the purple circle indicated the target effort
for the user
Table 13 Usability issues detected on version 2 by category
Category  Recurrent issues  New issues  Occurrences
Safety  1  3  10
Operational  0  5  25
Programme effectiveness  1  10  35
User experience  0  6  16
The omission of a duration selection option for the 'hilly' or 'steady' workout options was identified by five participants and has the potential to cause operational disruption if not amended in future iterations.

Programme effectiveness
Usability tests completed on version 2 indicated that the new iteration of the real time effort feedback was much clearer than version 1, with only one participant (EU9) expressing uncertainty. However, the new features introduced into the 'my programme' area generated a range of new usability issues. The most frequently occurring problem was associated with uncertainty regarding the purpose of the white circle which was intended to indicate the target intensity. The other problems were associated with the intensity selection function, absence of temporal tracking, speed selection and heart rate feedback (Additional file 2).

User experience
Six participants across the PU (3) and EU (3) groups raised concerns about the unquantified numbers on the results page, reporting that a metric was needed. Three different participants, two from the EU and one from the PU group, observed that the 'Shapemaster Island' concept was not consistently embedded across the menus of the GUI. The importance of feedback regarding symmetry was expressed by two PU participants, and two different PU participants noted that the intensity level was not included in the results page.

Usability issues with a severity score of 4 or occurrence greater than 25% are summarised in Table 14 and will be considered for amendment in the next iteration of the GUI.
Table 15 Task completion and duration data for version 2 (durations in seconds)
Participant  Task 1.1  Task 1.2  Task 2.0  Task 3.1  Task 3.2  Task 4.1  Task 4.2
EU1  12  52  21  17  56  10  48
EU2  14  64  25  34  75  47  88
EU3  10  52  23  18  65  16  46
EU4  25  72  49  31  75  19  66
EU6  34  77  45  29  71  17  58
EU7  34  79  34  15  53  34  75
EU8  16  57  20  18  58  13  51
EU9  20  89  43  29  69  15  45
EU10  16  54  16  20  54  12  46
PU1  13  49  16  21  56  12  49
PU2  11  49  23  20  63  13  50
PU3  14  54  20  20  59  12  51
PU4  15  53  26  21  58  14  52
PU5  W/D*  W/D*  W/D*  W/D*  W/D*  W/D*  W/D*
PU6  15  53  19  20  59  15  48
PU7  14  52  17  19  63  12  54
PU8  12  52  22  19  65  9  48
PU9  13  53  18  22  60  12  48
PU10  14  52  20  23  67  12  51
Range (sec)  11–34  49–89  18–49  17–34  53–75  9–47  46–88
Median duration (sec)  14  53  21.15  20  61.5  13  50.5
% Task completion  100%  100%  100%  100%  100%  100%  100%
*Participant withdrawn
Table 15 details the task completion and duration data recorded during the testing of version 2. The geometric mean using log transformation of task duration data was calculated for each user group and one tailed T-tests were conducted. The results summarised in Table 16 indicate the probability of 95% attainment of the target benchmark across both user groups.

The baseline assessment programme evaluated during Tasks 3.1 and 3.2 required a user induction or formal review which would be supervised, therefore the benchmark target duration was not applicable. However, Task 4.1 was intended to evaluate independent navigation through the GUI and the 25 s benchmark target was applicable. Analysis of user group attainment of this benchmark is detailed in Table 17.

The probability of attaining the 25 s benchmark amongst the EU group was below 95%, indicating that this programme option may have the potential to cause operational disruption due to user delay.

User satisfaction
All participants who completed the usability test on version 2 submitted PSSUQ responses; however, two data sets from the PU group were discarded due to a technical issue with the survey software. The 'information quality' subsection attained the lowest satisfaction scores amongst the EU group, whereas 'interface quality' was the aspect of lowest satisfaction amongst the PU group. Comparison with normative PSSUQ data indicated good levels of user satisfaction (Table 18).

The scores indicated by the EU group were slightly lower than the PU group, indicating greater satisfaction amongst the EU group. An independent samples T-Test was conducted on the difference in scores between the two groups; no statistically significant difference in satisfaction between the user groups was detected (P = 0.827, confidence interval − 0.30 to 0.37).

Comparison of the usability of version 1 and version 2
Direct comparison between the 'quick start' programme on version 1 and version 2 aimed to evaluate differences recorded pertaining to problem occurrence, type of usability issues detected and performance of tasks 1.1, 1.2 and 2.0. The extended menus explored on version 2 created a new user experience and therefore statistical comparison of user satisfaction as reported in the PSSUQ was not explored.

Usability issues
Five usability issues were identified on the 'quick start' submenu on version 2, compared with 22 on version 1. Two of the issues identified on version 2 were recurrent; visibility of the 'help' icon and clarity of the effort detection biofeedback. However, the frequency of problem occurrence was lower, with one participant reporting difficulty associated with interpretation of the biofeedback on version 2 compared with 13 participants during the testing of version 1. Three new usability issues were associated with the changes made between version 1 and version 2. The countdown feature was considered too short and potentially unsafe by four participants; six participants did not like the absence of performance metrics and three participants reported that the 'Shapemaster Island' theme was inconsistent.

Task performance
With the exception of the connectivity issues which affected EU7 during the testing of version 1, there was 100% task completion for tasks 1.1, 1.2 and 2.0 across
versions one and two of the GUI.

Table 17 Task 4.1 benchmark comparison
Group  Geometric mean (SD) in seconds  Benchmark in seconds  P-value  % Achievement
EU Group  19.3 (1.55)  25  P = 0.074  92.53%
PU Group  12.2 (1.16)  25  P = 0.0001  99.99%

Shapiro-Wilk (significance 0.05) tests conducted on task duration data indicated normal distribution. Log transformation of raw task duration data is not required for comparison between mean values, as two-tailed t-tests are considered robust to the positive skew associated with this type of data set [27]. Paired two tailed T-tests were performed on the mean difference between version 1 and version 2 completion times to detect any statistically significant difference in task duration between the two versions [27] (Table 19). Participants with incomplete task duration data sets (PU5, EU7) were excluded from this stage of analysis.

Duration was significantly faster for Task 1.1 on version 2 of the GUI compared to version 1 amongst the EU group (p = 0.03). A non-significant increase in duration of tasks 1.2 and 2.0 on version 2 was recorded amongst the EU group. A non-significant decrease in all task durations between versions 1 and 2 amongst the PU group was recorded.

Analyse usability as experienced by EU and PU participants
Comparison between the EU and PU groups aimed to ensure that the GUI was accessible and intuitive for use by PwS and supporting professionals. Detection of significant differences in task performance and user satisfaction would enable the team to identify features on the GUI which may require specific amendment. The occurrence of usability problems and task performance data were analysed to detect any differences between the usability as experienced by the two user groups. The distribution of problem occurrence across user groups on the two versions of the GUI is summarised in Table 20.

During the testing of version 1, 40 usability incidents were detected amongst the EU group compared with 60 incidents amongst the PU group. The PU group were more likely to encounter or identify concerns regarding the safety, operational efficiency and effectiveness of the system when compared with the EU group. On version 2, the distribution of usability incidents was 41 for the EU group, compared with 45 amongst the PU group. Aspects of the extended 'my programme' menu on version 2 were unclear to both user groups, particularly the target intensity circle and selection of programme intensity. This accounted for the high occurrence of usability issues amongst PU and EU participants in the programme effectiveness category of version 2.

Task duration was compared between the EU and PU groups to detect any statistically significant differences in usability experienced by PwS. Independent two-sided T-Tests were conducted to compare mean completion time between the EU and PU groups (Table 21).
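The benchmark attainment analyses (Tables 10, 16 and 17) rest on geometric means of log-transformed durations tested against the 25 s target with a one-tailed t-test. A minimal sketch of that style of calculation, using the EU group's valid Task 1.1 durations from Table 9 (EU7 excluded) and standard-library tools only, so the t statistic is reported rather than a p-value:

```python
import math
from statistics import mean, stdev

# EU group Task 1.1 durations in seconds (Table 9); EU7's invalid trial excluded.
durations = [12, 20, 19, 27, 33, 16, 26, 21]
benchmark = 25  # seconds, the commercial target for Task 1.1

logs = [math.log(d) for d in durations]
geo_mean = math.exp(mean(logs))  # geometric mean via log transformation

# One-tailed one-sample t statistic: is the mean log duration below log(benchmark)?
n = len(durations)
t_stat = (math.log(benchmark) - mean(logs)) / (stdev(logs) / math.sqrt(n))

print(f"geometric mean = {geo_mean:.1f} s, t = {t_stat:.2f}, df = {n - 1}")
```

The geometric mean here (about 20.8 s) sits below the arithmetic mean of 21.7 s reported in Table 21, illustrating the reduced influence of the slower outlying trials on small usability samples.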
Table 19 Paired comparison of task duration between version 1 and version 2 (mean difference (SD) in seconds, confidence interval, P-value)
EU group:
Task 1.1  − 3.37 (3.62)  (0.35–6.39)  P = 0.03*
Task 1.2  + 2.87 (8.74)  (− 10.1 to 4.41)  P = 0.38
Task 2.0  + 2.62 (11.08)  (− 11.87 to 6.62)  P = 0.52
PU group:
Task 1.1  − 0.77 (4.20)  (− 2.46 to 4.01)  P = 0.59
Task 1.2  − 1.11 (4.64)  (− 2.46 to 4.69)  P = 0.49
Task 2.0  − 3.11 (5.13)  (− 0.84 to 7.06)  P = 0.10
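The EU Task 1.1 figures in Table 19 can be reproduced from the raw durations in Tables 9 and 15. A minimal paired-t sketch (EU7 excluded owing to the invalid version 1 trial):

```python
import math
from statistics import mean, stdev

# EU Task 1.1 durations in seconds on versions 1 and 2 (Tables 9 and 15); EU7 excluded.
v1 = [12, 20, 19, 27, 33, 16, 26, 21]
v2 = [12, 14, 10, 25, 34, 16, 20, 16]

diffs = [b - a for a, b in zip(v1, v2)]  # version 2 minus version 1
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

print(f"mean difference = {mean(diffs):.2f} s (SD {stdev(diffs):.2f}), "
      f"t = {t_stat:.2f}, df = {len(diffs) - 1}")
```

This reproduces the EU Task 1.1 entry of Table 19, a mean difference of −3.37 s (SD 3.62); t of about −2.63 with 7 degrees of freedom corresponds to the reported two-tailed P = 0.03.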
Table 20 Distribution of problem occurrence across user groups and GUI versions
Group  Safety V1  Safety V2  Operational V1  Operational V2  Programme effectiveness V1  Programme effectiveness V2  User experience V1  User experience V2  Total
EU  9  4  11  10  9  18  11  9  81
PU  15  6  17  15  13  17  15  7  105
Total  24  10  28  25  22  35  26  16  186
Table 21 Comparison of task duration between professional and expert users (version 1)
Task  EU Mean (SD) (sec)  PU Mean (SD) (sec)  P-value (confidence interval)
Task 1.1  21.7 (6.67)  14.2 (4.14)  P = 0.018* (− 13.5 to − 1.53)
Task 1.2  61.7 (4.55)  53.0 (4.55)  P = 0.023* (− 16.0 to − 1.44)
Task 2.0  27.6 (2.84)  23.2 (1.58)  P = 0.204 (− 11.5 to 2.76)
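As a worked example of the independent comparisons in Table 21, the Task 1.1 column can be recomputed from the raw version 1 durations in Table 9 (EU7 and PU5 excluded for incomplete data). The sketch below uses Welch's unequal-variance form; the exact variant used by the authors is not stated, but this form is consistent with the reported result:

```python
import math
from statistics import mean, stdev

# Version 1 Task 1.1 durations in seconds (Table 9); EU7 and PU5 excluded.
eu = [12, 20, 19, 27, 33, 16, 26, 21]
pu = [9, 20, 16, 21, 15, 12, 11, 11, 13]

def welch_t(a, b):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = stdev(a) ** 2 / len(a), stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(eu, pu)
print(f"EU mean = {mean(eu):.1f} s, PU mean = {mean(pu):.1f} s, "
      f"t = {t_stat:.2f}, df = {df:.1f}")
```

The group means (21.7 s and 14.2 s) and standard deviations match Table 21, and t of about 2.75 with df of about 11.5 is consistent with the reported two-tailed P = 0.018.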
Table 22 Comparison of task duration between professional and expert users (version 2)
Task  EU Mean (SD) (sec)  PU Mean (SD) (sec)  P-value (confidence interval)
Task 1.1  18.3 (7.8)  13.4 (1.3)  P = 0.06 (− 11.51 to 1.65)
Task 1.2  64.4 (13.5)  51.8 (1.7)  P = 0.033* (− 24.1 to − 1.36)
Task 2.0  30.2 (13.1)  20.1 (3.1)  P = 0.067 (− 21.1 to 0.99)
Task 3.1  23.4 (7.19)  20.5 (1.33)  P = 0.268 (− 8.45 to 2.67)
Task 3.2  64.0 (8.9)  61.1 (3.5)  P = 0.388 (− 9.98 to 4.21)
Task 4.1  17.0 (7.4)  12.3 (1.65)  P = 0.123 (− 1.59 to 10.92)
Task 4.2  58.1 (15.2)  50.1 (2.08)  P = 0.156 (− 3.75 to 19.75)
The PU participants were significantly quicker than the EU participants to complete Task 1.2 on both versions of the GUI. Although the PU participants were quicker to complete Tasks 1.1 and 2.0, the difference was only statistically significant on Task 1.1 in version 1. The PU participants were quicker to complete Tasks 3.1, 3.2, 4.1 and 4.2 but the difference between the user groups did not reach statistical significance (Table 22).

Discussion
This study evaluated the usability of a high-fidelity prototype GUI which was co-designed to enable PwS to choose from a range of exercise programmes and view real time feedback of their exercise performance during exercise. Two sequential versions of the GUI were evaluated with two user groups using online remote media, with version 2 amended in response to usability problems detected on version 1 and extended to offer a range of programme choices. The use of a remote testing method to evaluate the usability of the new technology is reported, which denotes a solution to the challenges associated with face-to-face usability evaluation with users of rehabilitation technologies. The value of different testing approaches is also reflected upon, which will guide future research and design teams in the selection of tasks and analysis methods.

Multiple integrated methods of usability evaluation were implemented to detect usability problems and evaluate the user experience. Empirical, performance-based metrics including task completion rates and task duration were used to evaluate the usability of the GUI. In comparison, the 'think aloud' data and video footage captured qualitative insights into the users' experience and facilitated identification of specific usability issues across all of the a-priori categories. Triangulation of different usability evaluation methods increases the chance of identifying usability issues, and heuristic evaluation conducted by usability experts may further enhance methodological robustness [31]. However, examples from the literature indicate high similarity between the findings detected through heuristic evaluation and usability testing with representative end users [29].

The 'think aloud' data and usability observations were combined to create a descriptive list of categorised issues. The total number of recorded usability incidents on version 1 was 100, with 22 different usability issues identified. Eight of the 22 detected issues were prioritised according to severity and frequency and directly addressed in version 2. The total number of usability incidents on version 2 was 86, with 24 new usability issues identified. Most of these were associated with the new, extended programme menus, indicating that the amendments made to the 'quick start' menu did improve usability. This descriptive approach will enable specific usability issues to be ranked and addressed on future iterations of the interface [47]. Although the 'think aloud' data enabled insight into participants' experience of navigating the GUI, comparable usability studies have captured rich qualitative data through focus groups or interviews to gain a more in-depth understanding of the participants' perspectives on a novel technology [14, 21].

Although the amendments implemented on version 2 of the 'quick start' menu did improve its usability, the occurrence and seriousness of usability problems detected on version 2 suggests that further amendments are required before the technology is implemented. The ability to stop assisted movement quickly and call for assistance is a priority for safe use of power assisted exercise, and the EU group were slower to complete this task on version 2 compared with version 1. On reviewing version 2 it was recognised that the 'help' icon was positioned more peripherally on the menu bar. This is particularly pertinent considering impairment in spatial awareness, which can impact ability to process visual input, is widely reported amongst PwS [43].
The use of red, centralised icons has therefore been recommended to ensure rapid activation of safety functions such as 'stop' or 'quit' on devices designed for PwS [26].

Task completion and task duration data benchmarked against the commercial target indicated that the 'Quick Start' programme on both versions of the GUI would enable users to commence exercise independently and within the required timescales. Comparison of task duration between version 1 and version 2 indicated a non-significant decrease in task duration amongst the PU participants and a significant decrease for Task 1.1 amongst EU participants. This apparent improvement in usability may be attributed to the changes implemented on version 2. It is also possible that repeated exposure to the GUI may have contributed to the participants' ability to navigate through it more quickly [48].

The safety and operational usability categories exemplified the divergence which can exist between operational efficiency and safety. Adjustments implemented on version 2 did reduce the occurrence of operational and safety problems, although access to support and supervision will need to be monitored during implementation of the technology. The co-designed GUI was intended to promote user independence, although the value of a supported induction to the equipment and availability of support throughout exercise was emphasised during the co-design stages of the research programme [13]. The safety of rehabilitation technologies is service and setting specific [32, 49]. Factors which should be considered in the implementation of rehabilitation devices in stroke rehabilitation include physical space, staff capacity, user ability and technological features [49].

One of the key features for the new technology was the introduction of effort detection capability and provision of biofeedback to enable users to observe, adjust and compare their exercise performance to previous sessions. Sophisticated gamification, augmented or virtual reality technology was beyond the resource available for this early iteration of the GUI but could be potentially incorporated in the future. The effort feedback dial featured on version 1 was widely misinterpreted as an indication of remaining duration; the dial was replaced by the effort balloon on version 2 which was very quickly understood by nearly all participants. The misinterpretation was detected through the 'think aloud' data. Analysis of think aloud data in the evaluation of digital apps for use by older adults has previously enabled categorisation of usability issues according to severity and types of barrier detected [48]. This exemplifies the value of 'think aloud' data compared with usability studies which have focussed on user satisfaction and adverse events to quantify usability [36].

The baseline assessment on version 2 was intended to create an individualised prescription for each user. Baseline assessment has been previously integrated with gaming technologies for PwS to develop a programme which was adaptive to different users and responsive to their fluctuating cognitive and motor ability [32, 48]. The purpose of the paler target intensity balloon introduced on version 2 was not clear to most participants and it was suggested that this would require verbal explanation to new users of the technology. Quantification of user performance was an area of dissonance between participants during the testing of version 1 and version 2. Positive reward about performance and a system which is responsive to all levels of ability is important to sustain user engagement [49]. Achievement of an effective and sustained exercise intensity is a challenge for providers of stroke recovery services as patients typically do not sustain the level of effort required for physiological benefit [7]. Assisted exercise with real-time feedback represents a potential solution as the motorised mechanism enables movement in the presence of motor impairment [50]. Sophisticated human-in-the-loop feedback systems synchronised with detected mechanical work rate have been piloted on similar technologies to optimise user attainment of target intensity [51].

The PSSUQ data captured an impression of the user experience and indicated that reported satisfaction was high, with a non-significant increase recorded for version 2. However, the PSSUQ was not sensitive to specific usability issues and did not directly inform the amendments implemented on version 2. Comparable examples from the stroke literature have implemented modified user satisfaction questionnaires to evaluate and compare novel technologies [35]. Feingold-Polak et al. [35] reported higher user satisfaction for a robot guided exercise technology compared with a computer led system, although this difference was not statistically significant. User satisfaction scores were slightly higher amongst the EU group. Evaluation of similar assistive technologies has also reported higher satisfaction amongst expert users compared with professional users [14, 33]. It is possible that PUs underestimate the ability of EUs to navigate and operate digital interventions [33]. Service providers influence the extent to which assistive technologies are adopted and therefore addressing the viewpoints of PU representatives is important to ensure successful implementation [49].

The anticipated operators of digitised power assisted exercise equipment include leisure centres, community venues and rehabilitation services, with the target user groups comprising PwS, supported by therapy teams or exercise professionals. The use of remote testing methods enabled recruitment of participants who would have
encountered practical barriers to attendance of face-to-face usability evaluation [39]. Rehabilitation and exercise professionals were recruited alongside PwS to capture the perspectives of multiple end users. This combination was intended to optimise detection of usability issues across the a-priori categories. PU participants detected more potential issues than the EU group during the testing of version 1. Interestingly, this disparity was not identified during the testing of version 2. It is possible that the EU participants required longer to understand the usability testing process and gain confidence in identifying and articulating potential issues. PU participants focussed on operational and safety issues, whilst the EU participants commented more on the programme effectiveness and user experience. Comparable usability studies examining stroke related technologies have selected only healthy participants to avoid the potential for bias associated with motor or cognitive impairment [32]. Expert users and those with lived experience remain under-represented in the development of new technologies and systems devised to optimise rehabilitation outcomes [26, 32]. Participants with neurological impairment have critical views on assistive technologies and their perspectives should be embedded in the development and implementation of new equipment and products [33].

This study reported on stage three of a co-design and usability evaluation centred on the digital advancement of PAE equipment. Effort detection technology and a range of programme menus to guide the user through the setup process were developed and evaluated. The potential to further develop the technology was identified by research participants and the project team. Integration of heart rate sensors on the handles would enable specific monitoring of exercise intensity [52], whilst haptic or auditory signalling may improve accessibility of the technology for people with visual or perceptual impairments [53]. The real time feedback displayed on the GUI could be gamified or developed as an immersive virtual reality experience [54]. Development of a user identification system has been identified as a commercial priority and will enable data analytics, intelligent exercise prescription and connectivity with referring services [32].

This application of the medical device technology framework has integrated co-design techniques [13] with mixed method usability testing of two sequential versions of a new GUI. Due to the restrictions imposed by the COVID-19 pandemic, face to face usability testing was not possible; in order to navigate this challenge, synchronous remote testing was implemented. This study adds to the small number of examples of remote usability testing with hard to reach user groups, which offers the advantage of cost effectiveness compared with in-house usability tests [55]. Although numerous usability issues were detected and addressed, the team recommend field testing of a late stage prototype prior to commercial implementation of the new technology. As the horizon for digital, robotic and assistive technologies expands, methodological approaches to optimise their design and usability are a priority in the field of rehabilitation engineering and robotics. The medical device technology framework ensures involvement of PU and EU groups and promotes a logical and yet iterative approach. The methods reported in this article have the potential to serve as an example in the development of future technologies.

Study limitations
Data were collected during a period of national lockdown imposed by the government during the COVID-19 pandemic; the original proposal to field test the GUI was adapted through implementation of remote media to enable virtual testing.

The objectives of the study were attained insofar as two sequential versions of the GUI were developed and evaluated, capturing a diverse range of user experiences. The tasks which guided the usability testing were relevant to the proposed long-term use of the GUI and were effective in highlighting usability problems.

Several limitations are acknowledged in that the remote testing of a technology devised for venue-based exercise inevitably situated the user experience out of context. Although the sample size was calculated through application of the probabilistic model of problem discovery, this method was developed for non-clinical populations [40]. Given the complex cognitive, perceptual and motor impairments associated with stroke, a larger sample of EU participants would have reduced the likelihood of errors due to over or under representation. Measurement of the degree of motor or sensory impairment alongside cognitive and perceptual changes amongst the EU group was not conducted. The heterogeneous nature of the sample means that results cannot be viewed as conclusive for the whole stroke population. On several occasions, participants commented that the usability problems encountered would have been less likely to occur if they had been engaged with the machine in a real-world setting such as a gym environment or a rehabilitation centre. However, the remote technology did enable more effective capture of the data. Stage 3 of the Medical Device Technology framework does stipulate real field testing of prototypes and this has been previously achieved by design teams who have conducted usability trials within the home environment [21]. In addition, field testing enables identification of technical problems due to hardware issues [35].
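The probabilistic model of problem discovery noted above estimates the chance that a problem affecting a proportion p of users is observed at least once among n participants as 1 − (1 − p)^n. A minimal sketch follows; the 0.31 per-user detection rate is a commonly cited average from the usability literature, used here purely for illustration:

```python
def discovery_probability(p: float, n: int) -> float:
    """Probability that a problem affecting a proportion p of users
    is observed at least once in a sample of n participants."""
    return 1 - (1 - p) ** n

# With an average per-user detection rate of 0.31, ten participants
# surface a typical problem with high probability.
for n in (5, 10):
    print(n, round(discovery_probability(0.31, n), 3))
```

Under this illustrative rate, five participants detect a typical problem with probability of about 0.84 and ten participants with about 0.98, which is why samples of around ten per user group are common in usability testing.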
The testing procedure was dependent on reliable internet connectivity, access to a digital device and an ability to use Zoom software. This excluded individuals with limited digital access or ability from participation, which is an area of increasing concern in healthcare provision and research [56]. Although the ‘think aloud’ data allowed some exploration of the participants’ qualitative perspective, the approach to data collection and analysis was primarily empirical. Comparable usability studies have included semi-structured interviews to capture an in-depth insight into the users’ perspective and experience [33]. The same sample of participants tested version 1 and version 2 of the GUI, which enabled direct intra-subject comparison between the versions. However, it is acknowledged that this may have introduced bias, as the amendments were based on the participants’ initial feedback [31]. Introduction of new participants to version 2 would have strengthened the design of the study. The remote testing methods reported in this study have the potential to be applied to the evaluation of other user interfaces synced with rehabilitation technologies [39]. However, other widely reported barriers to adoption of rehabilitation technologies, which include donning, doffing and set-up time, require some face-to-face interaction between the participants and the research team.

Conclusions
Robust co-design and usability evaluation methods are integral to the development and implementation of new assistive technologies in stroke rehabilitation. Remote testing of two sequential versions of a co-designed GUI with two user groups enabled identification of usability issues and evaluation of user satisfaction. The changes implemented on version 2 successfully addressed serious usability problems detected on version 1. However, the extended range of programme options introduced on version 2 created new usability problems; these mostly reflected concerns regarding the therapeutic effectiveness of the technology rather than its operational efficiency or safety features. The ‘think aloud’ data combined with observation of task walkthrough performance was effective in detecting specific usability issues, whilst the task completion and duration data provided an indication of the operational readiness of the technology. The PSSUQ scores provided an overall impression of user satisfaction and enabled comparison between user groups and the two versions of the GUI.

The recruitment of EU and PU representatives enabled the research team to identify and address a range of usability problems. Diverse user perspectives were captured, which improved the usability of the GUI and generated a vision for future technology advancement. The findings from this study will facilitate the transition from a high-fidelity prototype to a market-ready version of the technology, which will enable end users of PAE to identify, monitor and progress rehabilitation goals. The next step in this process will comprise field testing of a late-stage prototype in rehabilitation settings with a new sample of PU and EU representatives. The iterative model which underpins the medical device technology framework will ensure sustained user involvement throughout implementation and evaluation of the new technology.

Abbreviations
EU Expert user
GUI Graphical user interface
PAE Power assisted exercise
PSSUQ Post Study System Usability Questionnaire
PU Professional user

Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s12984-023-01207-7.

Additional file 1. Usability observation form.
Additional file 2. User by problem matrices.

Acknowledgements
The research team would like to extend our gratitude to the participants who contributed their time towards the usability evaluation reported in this article. We would also like to express gratitude towards the Shapemaster team for their engagement in this programme of research.

Author contributions
RY led the recruitment, data collection and data analysis phases of the project. RY led the write-up of all key sections of the manuscript. KS guided the development of the project protocol and advised on ethical considerations. KS edited two iterations of the manuscript. DB guided the development of the project protocol and the inferential analysis of the quantitative data. DB edited the first iteration of the manuscript. AH designed versions one and two of the GUI and co-analysed all descriptive data. AH contributed to the write-up of the methods and results sections of the manuscript. NS contributed towards interpretation of the usability data, including identification and categorisation of usability issues. NS proofread the final version of the manuscript. CS guided the project throughout the planning, delivery and write-up stages, with specific expertise on the development and testing of rehabilitation technologies. CS edited two versions of the manuscript and proofread the submitted version.

Funding
This work was supported by Grow MedTech under Grant (POF00095).

Availability of data and materials
All datasets generated, including usability observation forms and statistical calculations, are available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate
Ethical approval was granted by the ethics committee at Sheffield Hallam University. The reference code was ER26319972. All participants signed a consent form with reference to a detailed participant information sheet. They were made aware of their right to withdraw from the study without need for explanation or any impact on future services or opportunities.
Consent for publication
The participants were made aware of the intent to publish the study at the point of consent. All data were anonymised and no images or identifying information have been included within the manuscript.

Competing interests
The work published in this manuscript is part of a programme of research examining power assisted exercise as part of the lead author’s doctoral study. An academic collaboration exists between the equipment manufacturer and Sheffield Hallam University in which machines have been provided for use in kind. There are no restrictions or clauses on publishing negative findings.

Author details
1 Department of Allied Health Professions, Advanced Wellbeing Research Centre, Sheffield Hallam University, 2 Old Hall Road, Sheffield S9 3TU, UK. 2 Faculty of Health and Education, Manchester Metropolitan University, Manchester Brooks Building, 53 Bonsall Street, Manchester M15 6GX, UK. 3 Centre for Sport Exercise and Life Sciences, Institute of Health and Well-Being, Coventry University, Coventry CV1 2DS, UK. 4 Sports Engineering Research Group, Advanced Wellbeing Research Centre, Sheffield Hallam University, 2 Old Hall Road, Sheffield S9 3TU, UK. 5 College of Health, Wellbeing and Life Sciences, Sheffield Hallam University, Collegiate Crescent Campus, Sheffield S10 2BP, UK. 6 Advanced Wellbeing Research Centre, Sheffield Hallam University, Collegiate Crescent Campus, Sheffield S10 2BP, UK.

Received: 21 May 2022 Accepted: 19 June 2023

References
1. Pogrebnoy D, Dennett A. Exercise programs delivered according to guidelines improve mobility in people with stroke: a systematic review and meta-analysis. Arch Phys Med Rehabil. 2020;101(1):154–65.
2. Regan EW, Handlery R, Stewart JC, Pearson JL, Wilcox S, Fritz S. Integrating survivors of stroke into exercise-based cardiac rehabilitation improves endurance and functional strength. J Am Heart Assoc. 2021;10(3):1–12.
3. Young RE, Broom D, Sage K, Crossland K, Smith C. Experiences of venue based exercise interventions for people with stroke in the UK: a systematic review and thematic synthesis of qualitative research. Physiotherapy. 2021;110:5–14.
4. Brouwer R, Wondergem R, Otten C, Pisters MF. Effect of aerobic training on vascular and metabolic risk factors for recurrent stroke: a meta-analysis. Disabil Rehabil. 2021;43(15):2084–91.
5. MacKay-Lyons M, Billinger SA, Eng JJ, Dromerick A, Giacomantonio N, Hafer-Macko C, Macko R, Nguyen E, Prior P, Suskin N, Tang A, Thornton M, Unsworth K. Aerobic exercise recommendations to optimize best practices in care after stroke: AEROBICS 2019 update. Phys Ther. 2020;100(1):149–56.
6. Barstow B, Thirumalai M, Mehta T, Padalabalanarayanan S, Kim Y, Motl RW. Developing a decision support system for exercise engagement among individuals with conditions causing mobility impairment: perspectives of fitness facility exercisers and adapted fitness center trainers. Technol Disabil. 2020;32(4):295–305.
7. Gothe NP, Bourbeau K. Associations between physical activity intensities and physical function in stroke survivors. Am J Phys Med Rehabil. 2020;99(8):733–8.
8. Young RE, Richards E, Darji N, Velpula S, Goddard S, Smith C, Broom D. Power-assisted exercise for people with complex neurological impairment: a feasibility study. Int J Ther Rehabil. 2018;25(6):262–71.
9. Bossink LWM, van der Putten AAJ, Waninge A, Vlaskamp C. A power-assisted exercise intervention in people with profound intellectual and multiple disabilities living in a residential facility: a pilot randomised controlled trial. Clin Rehabil. 2017;31(9):1168–78. https://doi.org/10.1177/0269215516687347.
10. Jacobson BH, Smith D, Fronterhouse J, Kline C, Boolani A. Assessment of the benefit of powered exercises for muscular endurance and functional capacity in elderly participants. J Phys Act Health. 2012;9(7):1030–5. https://doi.org/10.1123/jpah.9.7.1030.
11. Young R, Broom D, O’Brien R, Sage K, Smith C. Users’ experience of community-based power assisted exercise: a transition from NHS to third sector services. Int J Qual Stud Health Well Being. 2021;16(1):1949899.
12. Linder SM, Rosenfeldt AB, Rasanow M, Alberts JL. Forced aerobic exercise enhances motor recovery after stroke: a case report. Am J Occup Ther. 2015;69(4):1–8. https://doi.org/10.5014/ajot.2015.015636.
13. Young R, Smith C, Sage K, Broom D. Application of the nominal group technique to inform a co-design project on power assisted exercise equipment for people with stroke. Physiotherapy. 2021;113:e80–1.
14. Bauer CM, Nast I, Scheermesser M, Kuster RP, Textor D, Wenger M, Kool J, Baumgartner D. A novel assistive therapy chair to improve trunk control during neurorehabilitation: perceptions of physical therapists and patients. Appl Ergonomics. 2021;94:103390.
15. Cameirão MS, Smailagic A, Miao G, Siewiorek DP. Coaching or gaming? Implications of strategy choice for home based stroke rehabilitation. J NeuroEng Rehabil. 2016;13(1):1–15.
16. Doumas I, Everard G, Dehem S, Lejeune T. Serious games for upper limb rehabilitation after stroke: a meta-analysis. J NeuroEng Rehabil. 2021;18(1):1–16.
17. Enam N, Veerubhotla A, Ehrenberg N, Kirshblum S, Nolan KJ, Pilkar R. Augmented-reality guided treadmill training as a modality to improve functional mobility post-stroke: a proof-of-concept case series. Top Stroke Rehabil. 2021;28(8):624–30.
18. Park S, Liu C, Sánchez N, Tilson JK, Mulroy SJ, Finley JM. Using biofeedback to reduce step length asymmetry impairs dynamic balance in people poststroke. Neurorehabil Neural Repair. 2021;35(8):738–49.
19. Williamson T, Kenney L, Barker AT, Cooper G, Good T, Healey J, Heller B, Howard D, Matthews M, Prenton S, Ryan J, Smith C. Enhancing public involvement in assistive technology design research. Disabil Rehabil Assist Technol. 2015;10(3):258–65.
20. Shah SGS, Robinson I, AlShawi S. Developing medical device technologies from users’ perspectives: a theoretical framework for involving users in the development process. Int J Technol Assess Health Care. 2009;25(4):514–21.
21. Thilo FJS, Hahn S, Halfens RJG, Schols JMGA. Usability of a wearable fall detection prototype from the perspective of older people: a real field testing approach. J Clin Nurs. 2019;28(1–2):310–20. https://doi.org/10.1111/jocn.14599.
22. Thilo FJS, Bilger S, Halfens RJG, Schols JMGA, Hahn S. Involvement of the end user: exploration of older people’s needs and preferences for a wearable fall detection device—a qualitative descriptive study. Patient Prefer Adherence. 2017;11(11):11–22. https://doi.org/10.2147/PPA.S119177.
23. Sun M, Smith C, Howard D, Kenney L, Luckie H, Waring K, Taylor P, Merson E, Finn S. FES-UPP: a flexible functional electrical stimulation system to support upper limb functional activity practice. Front Neurosci. 2018;12:449.
24. Smith C, Kenney L, Howard D, Waring K, Sun M, Luckie H, Hardiker N, Cotterill S. Prediction of setup times for an advanced upper limb functional electrical stimulation system. J Rehabil Assist Technol Eng. 2018;5:1–9.
25. British Design Council. Framework for Innovation: Design Council’s evolved Double Diamond. 2019. https://www.designcouncil.org.uk/our-work/skills-learning/tools-frameworks/framework-for-innovation-design-councils-evolved-double-diamond/
26. Jie L-J, Jamin G, Smit K, Beurskens A, Braun S. Design of the user interface for “Stappy”, a sensor-feedback system to facilitate walking in people after stroke: a user-centred approach. Disabil Rehabil Assist Technol. 2020;15(8):959–67.
27. Nasr N, Leon B, Mountain G, Nijenhuis SM, Prange G, Sale P, Amirabdollahian F. The experience of living with stroke and using technology: opportunities to engage and co-design with end users. Disabil Rehabil Assist Technol. 2016;11(8):653–60. https://doi.org/10.3109/17483107.2015.1036469.
28. Jayasree-Krishnan V, Ghosh S, Palumbo A, Kapila V, Raghavan P. Developing a framework for designing and deploying technology-assisted rehabilitation after stroke: a qualitative study. Am J Phys Med Rehabil. 2021;100(8):774–9.
29. Albu M, Atack L, Srivastava I. Simulation and gaming to promote health education: results of a usability test. Health Educ J. 2015;74(2):244–54.
30. BS EN ISO 9241-11:2018. Ergonomics of human-system interaction. Usability: definitions and concepts. British Standards Institute; 2018.
31. Dumas JS, Redish JC. A practical guide to usability testing. Revised ed. Exeter: Intellect; 1999.
32. Burdea G, Kim N, Polistico K, Kadaru A, Grampurohit N, Roll D, Damiani F. Assistive game controller for artificial intelligence-enhanced telerehabilitation post-stroke. Assist Technol. 2021;33(3):117–28. https://doi.org/10.1080/10400435.2019.1593260.
33. Moineau B, Myers M, Shaheen Ali S, Popovic MR, Hitzig SL. End-user and clinician perspectives on the viability of wearable functional electrical stimulation garments after stroke and spinal cord injury. Disabil Rehabil Assist Technol. 2021;16(3):241–50.
34. Na JS, Kumar JA, Hur P, Crocher V, Motawar B, Lakshminarayanan K. Usability evaluation of low-cost virtual reality hand and arm rehabilitation games. J Rehabil Res Dev. 2016;53(3):321–33.
35. Feingold-Polak R, Barzel O, Levy-Tzedek S. A robot goes to rehab: a novel gamified system for long-term stroke rehabilitation using a socially assistive robot—methodology and usability testing. J Neuroeng Rehabil. 2021;18(1):1–122.
36. Guillén-Climent S, Garzo A, Muñoz-Alcaraz MN, Casado-Adam P, Arcas-Ruiz-Ruano J, Mejías-Ruiz M, Mayordomo-Riera F. A usability study in patients with stroke using MERLIN, a robotic system based on serious games for upper limb rehabilitation in the home setting. J NeuroEng Rehabil. 2021;18(1):1–16.
37. Mah J, Jutai JW, Finestone H, Mckee H, Carter M. Usability of a low-cost head tracking computer access method following stroke. Assist Technol. 2015;27(3):158–71.
38. Labinjo T, Ashmore R, Serrant L, Turner J. The use of Zoom videoconferencing for qualitative data generation: a reflective account of a research study. 2021.
39. Sherwin LB, Yevu-Johnson J, Matteson-Kome M, Bechtold M, Reeder B. Remote usability testing to facilitate the continuation of research. 18th World Congress of Medical and Health Informatics, MedInfo 2021: One World, One Health; Global Partnership for Digital Innovation; 2–4 October 2021. Stud Health Technol Inform. 2022;290:424–7.
40. Sauro J, Lewis J. Quantifying the user experience: practical statistics for user research. 2nd ed. Burlington: Morgan Kaufmann; 2016.
51. […] assess and influence aerobic capacity early after stroke: a proof-of-concept study. Disabil Rehabil Assist Technol. 2014;9(4):271–8.
52. Alzahrani A, Hu S, Azorin-Peris V, Barrett L, Esliger D, Hayes M, Akbare S, Achart J, Kuoch S. A multi-channel opto-electronic sensor to accurately monitor heart rate against motion artefact during exercise. Sensors. 2015;15(10):25681–702.
53. Sigrist R, Rauter G, Riener R, Wolf P. Augmented visual, auditory, haptic, and multimodal feedback in motor learning: a review. Psychon Bull Rev. 2012;20(1):21–53.
54. Mubin O, Alnajjar F, Jishtu N, Alsinglawi B, Al MA. Exoskeletons with virtual reality, augmented reality, and gamification for stroke patients’ rehabilitation: systematic review. JMIR Rehabil Assist Technol. 2019;6(2):e12010.
55. Hill JR, Brown JC, Campbell NL, Holden RJ. Usability-in-place—remote usability testing methods for homebound older adults: rapid literature review. JMIR Form Res. 2021;5(11):e26181. https://doi.org/10.2196/26181.
56. Senbekov M, Saliev T, Bukeyeva Z, Almabayeva A, Zhanaliyeva M, Aitenova N, Toishibekov Y, Fakhradiyev I. The recent progress and applications of digital technologies in healthcare: a review. Int J Telemed Appl. 2020;2020:8830200.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.